compass.extraction.apply.check_for_ordinance_info

async check_for_ordinance_info(doc, model_config, heuristic, tech, ordinance_text_collector_class, permitted_use_text_collector_class=None, usage_tracker=None)

Parse a single document for ordinance information.

Parameters:
  • doc (elm.web.document.BaseDocument) – A document instance (PDF, HTML, etc.) potentially containing ordinance information. Note that if the document’s attrs already contains the "contains_ord_info" key, the document will not be processed. To force this function to process a document, remove that key from the document’s attrs.

  • model_config (compass.llm.config.LLMConfig) – Configuration describing which LLM service, splitter, and call parameters should be used for extraction.

  • heuristic (object) – Domain-specific heuristic implementing a check method to qualify text chunks for further processing.

  • tech (str) – Technology of interest (e.g. “solar”, “wind”, etc.). This is used to set up some of the document validation decision trees.

  • ordinance_text_collector_class (type) – Collector class invoked to capture ordinance text chunks.

  • permitted_use_text_collector_class (type, optional) – Collector class used to capture permitted-use districts text. When None, the permitted-use workflow is skipped.

  • usage_tracker (UsageTracker, optional) – Tracker instance used to monitor token usage during LLM calls. By default, None.
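As noted for doc, a document whose attrs already holds the "contains_ord_info" key is skipped. The following is a minimal sketch of clearing that key before calling the function again. The helper name force_reprocess is hypothetical and not part of compass, the "source" key is made up for illustration, and a SimpleNamespace stands in for a real elm.web.document.BaseDocument, modeling only its attrs mapping.

```python
from types import SimpleNamespace


def force_reprocess(doc):
    """Remove the cached flag so the document is parsed again.

    Hypothetical helper (not part of compass); it assumes only that
    ``doc.attrs`` behaves like a mutable mapping.
    """
    # pop() with a default does nothing if the key is absent
    doc.attrs.pop("contains_ord_info", None)
    return doc


# Stand-in for a BaseDocument; "source" is an illustrative key only
doc = SimpleNamespace(
    attrs={"contains_ord_info": False, "source": "example.pdf"}
)
force_reprocess(doc)
print("contains_ord_info" in doc.attrs)
```

Only the flag is removed; any other entries in attrs are left untouched.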

Returns:

elm.web.document.BaseDocument – Document that has been parsed for ordinance text. The results of the parsing are stored in the document’s attrs. In particular, the attrs will contain a "contains_ord_info" key that is set to True if ordinance info was found in the text, and False otherwise. If True, the attrs will also contain a "date" key holding the most recent date that the ordinance was enacted (or a tuple of None if not found), and an "ordinance_text" key holding the ordinance text snippet. Note that the snippet may contain other info as well, but should encapsulate all of the ordinance text.

Notes

The function updates progress bar logging as chunks are processed and sets contains_district_info when permitted_use_text_collector_class is provided.
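The attrs keys described above can be read back once parsing has finished. The sketch below is illustrative only: summarize_ordinance_result is a hypothetical helper, the SimpleNamespace object stands in for a returned BaseDocument, and the shape of the "date" value shown is an assumption, since this page does not specify it.

```python
from types import SimpleNamespace


def summarize_ordinance_result(doc):
    """Collect the documented result keys from ``doc.attrs``.

    Hypothetical helper (not part of compass).
    """
    attrs = doc.attrs
    found = bool(attrs.get("contains_ord_info", False))
    summary = {"found": found}
    if found:
        # Documented as present only when ordinance info was found
        summary["date"] = attrs.get("date")
        summary["ordinance_text"] = attrs.get("ordinance_text")
    # Documented as set when a permitted-use collector class was given
    if "contains_district_info" in attrs:
        summary["district_info"] = attrs["contains_district_info"]
    return summary


# Stand-in for a parsed document; all values are made up for illustration
parsed = SimpleNamespace(
    attrs={
        "contains_ord_info": True,
        "date": (2021, 5, 3),  # assumed shape, not confirmed by the docs
        "ordinance_text": "Setbacks shall be ...",
        "contains_district_info": False,
    }
)
summary = summarize_ordinance_result(parsed)
print(summary["found"], summary["district_info"])
```

Using get for "date" and "ordinance_text" keeps the helper tolerant of documents where those keys are missing.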