compass.pipeline.collection.persistence

Persistence for collected documents

Functions

build_collection_manifest(tech, jurisdictions)

Build the serialized collection manifest payload

load_collected_docs(collection_info, *, ...)

Load all docs for one jurisdiction from collection info

load_collection_manifest(manifest_fp, ...)

Load a collection manifest from disk

persist_documents(jurisdiction, ...[, ...])

Persist deduplicated documents for one jurisdiction

write_collection_manifest(manifest_dir, ...)

Write a collection manifest to disk

write_collection_manifest_shard(shard_dir, ...)

Write one jurisdiction collection manifest shard to disk
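
The summaries above describe a build, write, load cycle for a collection manifest. The sketch below illustrates that general pattern only and is not the COMPASS implementation: the helper names (build_manifest, write_manifest, load_manifest), the JSON format, the payload keys, and the manifest.json file name are all assumptions chosen for illustration. The real functions take additional arguments that are elided in the listing above.

```python
import json
import tempfile
from pathlib import Path


def build_manifest(tech, jurisdictions):
    """Hypothetical payload: the technology plus sorted jurisdiction names."""
    return {"tech": tech, "jurisdictions": sorted(jurisdictions)}


def write_manifest(manifest_dir, manifest, name="manifest.json"):
    """Write the payload as JSON (assumed format) and return the file path."""
    manifest_dir = Path(manifest_dir)
    manifest_dir.mkdir(parents=True, exist_ok=True)
    manifest_fp = manifest_dir / name
    manifest_fp.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest_fp


def load_manifest(manifest_fp):
    """Read back a manifest written by write_manifest."""
    return json.loads(Path(manifest_fp).read_text(encoding="utf-8"))


def round_trip_demo():
    """Return True if a manifest survives a write and load unchanged."""
    with tempfile.TemporaryDirectory() as tmp:
        payload = build_manifest("wind", ["Example County", "Another County"])
        manifest_fp = write_manifest(tmp, payload)
        return load_manifest(manifest_fp) == payload
```

Keeping the payload builder separate from the writer, as the listing does, lets the serialized content be checked without touching the file system.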