compass.pipeline.collection.steps.KnownUrlDocumentsStep
- class KnownUrlDocumentsStep
Bases: CollectionStep
Concrete Strategy for known URL document collection
Methods
collect(workflow): Collect documents from known URLs for this jurisdiction
Attributes
STEP_NAME: Identifier for step
- STEP_NAME = 'known_doc_urls'
Identifier for step
- async collect(workflow)
Collect documents from known URLs for this jurisdiction
- Parameters:
workflow (compass.pipeline.jurisdiction.SingleJurisdictionRun) – The workflow for the jurisdiction being processed, which may or may not have user-supplied known document URLs configured. If the workflow doesn’t have any user-supplied known document URLs, this function will return an empty list.
- Returns:
list – List of documents loaded from the user-supplied known URLs, with user-supplied metadata attrs added.
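The contract described above (an awaitable `collect` that returns an empty list when no known URLs are configured, and otherwise a list of documents carrying the user-supplied metadata) can be illustrated with a minimal, self-contained sketch. It does not import COMPASS and is not the library's implementation: `FakeWorkflow`, its `known_doc_urls` mapping, `SketchKnownUrlStep`, and the dict-based "documents" are all hypothetical stand-ins chosen for illustration. Only the `STEP_NAME` value and the empty-list behavior come from the reference text.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeWorkflow:
    """Hypothetical stand-in for SingleJurisdictionRun (not the real class)."""

    # Assumed shape: URL -> user-supplied metadata attrs.
    known_doc_urls: dict = field(default_factory=dict)


class SketchKnownUrlStep:
    """Illustrative mimic of the documented contract, not the COMPASS source."""

    STEP_NAME = "known_doc_urls"  # value taken from the reference above

    async def collect(self, workflow):
        # Documented behavior: no user-supplied URLs means an empty list.
        if not workflow.known_doc_urls:
            return []

        docs = []
        for url, attrs in workflow.known_doc_urls.items():
            # A real step would load the document here; this sketch only
            # records the URL and attaches the user-supplied attrs.
            docs.append({"source": url, **attrs})
        return docs


step = SketchKnownUrlStep()

# Workflow without known URLs configured.
empty = asyncio.run(step.collect(FakeWorkflow()))

# Workflow with one known URL and one metadata attr (both made up).
loaded = asyncio.run(
    step.collect(
        FakeWorkflow({"https://example.com/ordinance.pdf": {"year": 2024}})
    )
)
```

Because `collect` is a coroutine, callers either `await` it from other async code or drive it with `asyncio.run`, as the sketch does.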