compass.extraction.apply.check_for_ordinance_info
- async check_for_ordinance_info(doc, model_config, heuristic, tech, ordinance_text_collector_class, permitted_use_text_collector_class=None, usage_tracker=None)
Parse a single document for ordinance information
- Parameters:
doc (elm.web.document.BaseDocument) – A document instance (PDF, HTML, etc.) potentially containing ordinance information. Note that if the document’s attrs has the "contains_ord_info" key, it will not be processed. To force a document to be processed by this function, remove that key from the document’s attrs.
model_config (compass.llm.config.LLMConfig) – Configuration describing which LLM service, splitter, and call parameters should be used for extraction.
heuristic (object) – Domain-specific heuristic implementing a check method to qualify text chunks for further processing.
tech (str) – Technology of interest (e.g. “solar”, “wind”, etc.). This is used to set up some document validation decision trees.
ordinance_text_collector_class (type) – Collector class invoked to capture ordinance text chunks.
permitted_use_text_collector_class (type, optional) – Collector class used to capture permitted-use districts text. When None, the permitted-use workflow is skipped.
usage_tracker (UsageTracker, optional) – Optional tracker instance to monitor token usage during LLM calls. By default, None.
- Returns:
elm.web.document.BaseDocument – Document that has been parsed for ordinance text. The results of the parsing are stored in the document’s attrs. In particular, the attrs will contain a "contains_ord_info" key that will be set to True if ordinance info was found in the text, and False otherwise. If True, the attrs will also contain a "date" key containing the most recent date that the ordinance was enacted (or a tuple of None if not found), and an "ordinance_text" key containing the ordinance text snippet. Note that the snippet may contain other info as well, but should encapsulate all of the ordinance text.
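As a minimal sketch of how a caller might work with these attrs, the snippet below uses a stand-in object rather than a real elm.web.document.BaseDocument. The names FakeDoc, summarize, and force_reprocess are hypothetical and introduced only for illustration. It assumes attrs behaves like a plain dict and that the "date" tuple is laid out as (year, month, day), which the description above does not confirm.

```python
class FakeDoc:
    """Hypothetical stand-in exposing only a dict-like ``attrs``."""

    def __init__(self, attrs=None):
        self.attrs = dict(attrs or {})


def summarize(doc):
    """Return (date, ordinance_text) if ordinance info was found, else None."""
    if not doc.attrs.get("contains_ord_info", False):
        return None
    return doc.attrs.get("date"), doc.attrs.get("ordinance_text")


def force_reprocess(doc):
    """Drop the marker key so the document is no longer treated as processed."""
    doc.attrs.pop("contains_ord_info", None)
    return doc


# Attrs shaped like the documented result; the values are made up.
parsed = FakeDoc({
    "contains_ord_info": True,
    "date": (2021, 6, 15),  # assumed (year, month, day) layout
    "ordinance_text": "Example ordinance snippet",
})
```

Calling summarize(parsed) returns the date and text pair. After force_reprocess(parsed), the "contains_ord_info" key is gone, so the document would no longer be skipped when passed back to check_for_ordinance_info.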
Notes
The function updates progress bar logging as chunks are processed and sets
contains_district_info when permitted_use_text_collector_class is provided.
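The heuristic parameter only needs to be an object with a check method. The sketch below shows one possible shape for such an object. KeywordHeuristic is a hypothetical class, not part of compass, and the assumption that check takes a single text string and returns a bool is not confirmed by the description above.

```python
class KeywordHeuristic:
    """Hypothetical heuristic: passes chunks that mention any keyword."""

    def __init__(self, keywords):
        # Store lowercase keywords so matching is case-insensitive.
        self.keywords = tuple(keyword.lower() for keyword in keywords)

    def check(self, text):
        """Return True if the chunk mentions at least one keyword (assumed signature)."""
        lowered = text.lower()
        return any(keyword in lowered for keyword in self.keywords)


heuristic = KeywordHeuristic(["wind turbine", "wind energy"])

chunks = [
    "Wind Energy systems must observe the setbacks listed here.",
    "Parking requirements for commercial lots.",
]

# Keep only the chunks the heuristic qualifies for further processing.
qualified = [chunk for chunk in chunks if heuristic.check(chunk)]
```

A cheap filter of this kind would let unrelated chunks be discarded before any LLM call is made, which is presumably the purpose of the qualification step.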