Ordinance Plugin Development Tutorial
This tutorial walks you through building a battery storage extraction plugin. The goal is not just to show what to type, but to explain how and why each piece of a COMPASS plugin works. By the end, you will have a complete, registered plugin and a mental model of the data flow that makes extraction reliable.
What you will build
A complete battery_storage plugin that can be discovered by the CLI,
can filter documents quickly, and can turn ordinance text into structured
outputs. Along the way you will see how COMPASS uses a label chain to pass
data from one stage to the next.
You will build:
A keyword heuristic for fast filtering
An LLM text collector and extractor
Decision-tree parsers for structured outputs
A registered plugin that runs with compass process
Choose the base class
Start by choosing the base class that matches how much control you need.
For ordinance technologies that look like solar or wind, start with
OrdinanceExtractionPlugin. It wires together filtering, collection,
extraction, and CSV output so you can focus on domain logic.
Reach for FilteredExtractionPlugin when filtering can remain standard,
but you need custom parsing logic. Choose BaseExtractionPlugin only when
you want to control the whole workflow yourself, including filtering and
output handling. Think of this choice as your contract with the framework:
the more you want to customize, the more responsibilities you accept.
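The trade-off between the three base classes can be pictured with a toy hierarchy. This is only a sketch: the class names below are hypothetical stand-ins, and the inheritance relationship between them is an assumption made for illustration, not a description of how COMPASS implements its base classes.

```python
# Toy illustration only; none of these classes are COMPASS APIs.
class BasePlugin:
    """Lowest level: the subclass owns filtering, parsing, and output."""

    def run(self, docs):
        kept = self.filter(docs)
        rows = [self.parse(doc) for doc in kept]
        return self.save(rows)

    def filter(self, docs):
        raise NotImplementedError

    def parse(self, doc):
        raise NotImplementedError

    def save(self, rows):
        raise NotImplementedError


class FilteredPlugin(BasePlugin):
    """Middle level: filtering is provided, parsing and output are yours."""

    KEYWORD = ""

    def filter(self, docs):
        return [doc for doc in docs if self.KEYWORD in doc.lower()]


class OrdinancePlugin(FilteredPlugin):
    """Highest level: filtering and output are provided."""

    def save(self, rows):
        return [row for row in rows if row is not None]


class BatteryPlugin(OrdinancePlugin):
    """Only the domain logic is left to write."""

    KEYWORD = "battery"

    def parse(self, doc):
        return {"text": doc}


result = BatteryPlugin().run(["Battery rules", "Wind rules"])
print(result)
```

Each step down the hierarchy hands one more method back to you, which is the "contract" described above.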
Project skeleton
Create a new extraction package. This keeps the pieces close to each other so you can reason about the full pipeline as you build it.
compass/extraction/battery_storage/
    __init__.py
    plugin.py
    ordinance.py
    parse.py
    graphs.py
How the pieces fit
The pipeline is a conversation between components. Each one writes its results to a label, and the next component reads from that label. If the labels are misaligned, the next component finds nothing under the label it reads, and the parser never sees the text you extracted.
Document.text -> collector.OUT_LABEL -> extractor.OUT_LABEL -> parser.IN_LABEL
This label chain is the most important mental model to keep while you are building and debugging. You will revisit it in the troubleshooting guide.
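To make the failure mode concrete, here is a minimal, framework-free sketch of a label chain. The Stage class, run_chain, and find_broken_links are hypothetical helpers invented for this illustration; they are not part of COMPASS, and the real framework may handle a missing label differently.

```python
# Toy model of a label chain; none of these names are COMPASS APIs.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    name: str
    in_label: str
    out_label: str
    transform: Callable[[str], object]


def run_chain(attrs, stages):
    """Run each stage, reading from in_label and writing to out_label."""
    for stage in stages:
        value = attrs.get(stage.in_label)
        if value is None:
            # A misaligned label looks like missing data, not like an error.
            continue
        attrs[stage.out_label] = stage.transform(value)
    return attrs


def find_broken_links(stages, initial_labels=("text",)):
    """Return (stage name, label) pairs whose input is never produced."""
    available = set(initial_labels)
    broken = []
    for stage in stages:
        if stage.in_label not in available:
            broken.append((stage.name, stage.in_label))
        available.add(stage.out_label)
    return broken


stages = [
    Stage("collector", "text", "relevant_text", str.strip),
    Stage("extractor", "relevant_text", "ordinance_text", str.upper),
    # Typo on purpose: "ordinance_txt" does not match "ordinance_text".
    Stage("parser", "ordinance_txt", "values", len),
]
attrs = run_chain({"text": "  setback is 50 ft  "}, stages)
print(attrs.get("values"))
print(find_broken_links(stages))
```

In this sketch the parser silently produces nothing, while a static check over the declared labels points straight at the mismatched stage. A similar check over your own IN_LABEL and OUT_LABEL values is a cheap first debugging step.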
Step 1: Heuristic
In ordinance.py, define a keyword heuristic to quickly screen out
irrelevant documents before any LLM calls. Think of this as the bouncer at
the door: it keeps obvious non-target content out so you spend model budget
only where it matters.
from compass.plugin import KeywordBasedHeuristic


class BatteryStorageHeuristic(KeywordBasedHeuristic):
    # Phrases that point to an unrelated kind of battery
    NOT_TECH_WORDS = [
        "car battery",
        "vehicle battery",
        "phone battery",
        "laptop battery",
        "battery lawsuit",
    ]
    # Single keywords that suggest the target technology
    GOOD_TECH_KEYWORDS = [
        "bess",
        "battery",
        "storage",
        "lithium",
        "capacity",
    ]
    GOOD_TECH_ACRONYMS = ["BESS", "ESS"]
    # Multi-word phrases that suggest the target technology
    GOOD_TECH_PHRASES = [
        "battery energy storage",
        "battery storage system",
        "energy storage system",
    ]
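The screening idea can be sketched without the framework. The function below is a hypothetical approximation: the scoring rule (remove look-alike phrases, then count keyword, acronym, and phrase hits against a threshold) and the min_matches parameter are assumptions for illustration, and the matching rules inside KeywordBasedHeuristic may differ.

```python
import re

# Shortened copies of the lists above, for illustration only.
NOT_TECH_WORDS = ["car battery", "phone battery"]
GOOD_TECH_KEYWORDS = ["bess", "battery", "storage"]
GOOD_TECH_ACRONYMS = ["BESS", "ESS"]
GOOD_TECH_PHRASES = ["battery energy storage", "energy storage system"]


def passes_heuristic(text, min_matches=2):
    """Return True if the text looks like it concerns battery storage."""
    lowered = text.lower()
    # Remove look-alike phrases so "car battery" does not count as "battery".
    for phrase in NOT_TECH_WORDS:
        lowered = lowered.replace(phrase, " ")
    keyword_hits = sum(kw in lowered for kw in GOOD_TECH_KEYWORDS)
    # Acronyms are matched case-sensitively, as whole words, on the raw text.
    acronym_hits = sum(
        bool(re.search(rf"\b{re.escape(acronym)}\b", text))
        for acronym in GOOD_TECH_ACRONYMS
    )
    phrase_hits = sum(phrase in lowered for phrase in GOOD_TECH_PHRASES)
    return keyword_hits + acronym_hits + phrase_hits >= min_matches


print(passes_heuristic("Replace your car battery at our shop."))
print(passes_heuristic(
    "Battery energy storage systems (BESS) shall be set back 50 feet."
))
```

Trying a few real snippets against a toy function like this is a quick way to spot keywords that are too generic before you spend any model budget.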
Step 2: Text collector
In ordinance.py, define a collector that keeps only chunks that
describe actual regulations. The collector answers a simple question:
should this chunk move forward in the pipeline, or be discarded? Be strict
here: keep only the chunks that contain real regulatory requirements.
from compass.plugin import PromptBasedTextCollector


class BatteryStorageTextCollector(PromptBasedTextCollector):
    # Read from the raw document text
    IN_LABEL = "text"
    # The extractor reads from this label
    OUT_LABEL = "battery_storage_relevant_text"

    PROMPTS = [
        {
            "prompt": (
                "Does this text discuss regulations or zoning rules for "
                "battery energy storage systems (BESS)? Return JSON with "
                "key '{key}' set to true/false. Ignore consumer batteries."
            ),
            "key": "discusses_battery_storage_regulations",
            "label": "battery storage regulation check",
        },
        {
            "prompt": (
                "Does this text contain specific requirements like "
                "setbacks, capacity limits, safety standards, or "
                "permit procedures? Return JSON with key '{key}' set to "
                "true/false"
            ),
            "key": "contains_specific_requirements",
            "label": "specific requirements check",
        },
    ]
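The keep-or-discard decision can be pictured as follows. This is a sketch under stated assumptions: fake_llm is a hypothetical stand-in for a model call, and the rule that a chunk must pass every prompt in order is an assumption for illustration, not confirmed behavior of PromptBasedTextCollector.

```python
import json
import re

# Shortened copies of the tutorial prompts; '{key}' is filled in per prompt.
PROMPTS = [
    {
        "prompt": (
            "Does this text discuss regulations for battery energy "
            "storage systems? Return JSON with key '{key}' set to "
            "true/false."
        ),
        "key": "discusses_battery_storage_regulations",
    },
    {
        "prompt": (
            "Does this text contain specific requirements? Return JSON "
            "with key '{key}' set to true/false."
        ),
        "key": "contains_specific_requirements",
    },
]


def fake_llm(prompt, chunk):
    """Hypothetical stand-in for a model call, driven by keyword checks."""
    key = re.search(r"key '([^']+)'", prompt).group(1)
    lowered = chunk.lower()
    if key.startswith("discusses"):
        answer = "battery energy storage" in lowered
    else:
        answer = "setback" in lowered or "permit" in lowered
    return json.dumps({key: answer})


def keep_chunk(chunk, prompts=PROMPTS, llm=fake_llm):
    """Keep a chunk only if every prompt comes back true (assumed rule)."""
    for spec in prompts:
        prompt = spec["prompt"].format(key=spec["key"])
        response = json.loads(llm(prompt, chunk))
        if not response.get(spec["key"], False):
            return False
    return True


print(keep_chunk("Battery energy storage systems require a 50 ft setback."))
print(keep_chunk("Battery energy storage is an emerging technology."))
```

The second chunk mentions the technology but carries no requirement, so under this rule it is discarded before the extractor ever sees it.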
Step 3: Text extractor
In ordinance.py, define an extractor that condenses the retained
chunks into a clean extraction text. This step is where you reduce a messy
set of regulatory paragraphs into a focused summary that a parser can
reason over. The output label becomes the parser input, so clarity here
directly improves downstream accuracy.
from compass.plugin import PromptBasedTextExtractor


class BatteryStorageTextExtractor(PromptBasedTextExtractor):
    # Must match BatteryStorageTextCollector.OUT_LABEL
    IN_LABEL = "battery_storage_relevant_text"

    PROMPTS = [
        {
            "prompt": (
                "Extract battery storage regulations. Include setbacks, "
                "capacity limits, height limits, safety requirements, "
                "fire suppression, screening, noise, and permits. "
                "{FORMATTING_PROMPT} {OUTPUT_PROMPT}"
            ),
            "key": "battery_storage_ordinance_text",
            "out_fn": "{jurisdiction}_battery_summary.txt",
        }
    ]
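The extractor prompt and file name both contain placeholders. The sketch below shows one way such templates could be filled with plain str.format. The placeholder text, the render helpers, and the file-name sanitizing rule are all hypothetical; the framework supplies its own values and may format the jurisdiction differently.

```python
import re

PROMPT_TEMPLATE = (
    "Extract battery storage regulations. "
    "{FORMATTING_PROMPT} {OUTPUT_PROMPT}"
)
OUT_FN_TEMPLATE = "{jurisdiction}_battery_summary.txt"

# Hypothetical placeholder text; the real wording comes from the framework.
FORMATTING_PROMPT = "Keep section headings and numbering."
OUTPUT_PROMPT = "Return only the extracted text."


def render_prompt(template):
    """Fill the two prompt placeholders with the example text above."""
    return template.format(
        FORMATTING_PROMPT=FORMATTING_PROMPT,
        OUTPUT_PROMPT=OUTPUT_PROMPT,
    )


def render_out_fn(template, jurisdiction):
    """Build a file name, replacing runs of unsafe characters with '_'."""
    safe = re.sub(r"[^A-Za-z0-9]+", "_", jurisdiction).strip("_")
    return template.format(jurisdiction=safe)


print(render_prompt(PROMPT_TEMPLATE))
print(render_out_fn(OUT_FN_TEMPLATE, "Boulder County, CO"))
```

The practical point is that literal braces in your own prompt text would be treated as placeholders by this kind of formatting, so keep them out of prompts unless you intend a substitution.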
Step 4: Parsers and decision trees
In parse.py, define parsers that run decision trees against the
extraction text. The trees live in graphs.py. This is the moment where
your plugin turns narrative text into structured fields, so it helps to
think in terms of a guided interview: the tree asks focused questions and
routes to the right follow-up based on answers.
import pandas as pd

from compass.plugin import OrdinanceParser
from compass.common import setup_async_decision_tree, run_async_tree
from compass.extraction.battery_storage.graphs import (
    setup_graph_battery_setbacks,
    setup_graph_capacity_limits,
)


class BatteryStorageParser(OrdinanceParser):
    IN_LABEL = "battery_storage_ordinance_text"
    OUT_LABEL = "battery_storage_ordinance_values"

    async def parse(self, text):
        chat_caller = self._init_chat_llm_caller(
            "You extract battery storage regulations from ordinance text"
        )
        results = {}

        # Each tree gets the same text and the same chat caller
        setback_tree = setup_async_decision_tree(
            setup_graph_battery_setbacks,
            text=text,
            chat_llm_caller=chat_caller,
        )
        results["setbacks"] = await run_async_tree(setback_tree)

        capacity_tree = setup_async_decision_tree(
            setup_graph_capacity_limits,
            text=text,
            chat_llm_caller=chat_caller,
        )
        results["capacity"] = await run_async_tree(capacity_tree)

        # Return None when neither tree produced a value
        if not any(results.values()):
            return None
        return pd.DataFrame([results])
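The "guided interview" idea can be sketched with a plain dictionary. Everything here is hypothetical: the TREE layout, run_tree, and the regex-based asker are illustration only and do not reflect how the graphs in graphs.py or the COMPASS tree runner are built.

```python
import re

# Toy decision tree: each node asks one question and names the next node.
TREE = {
    "start": {
        "question": "Does the text mention a setback for battery storage?",
        "yes": "value",
        "no": None,
    },
    "value": {
        "question": "What is the setback distance in feet?",
        "final": True,
    },
}


def run_tree(tree, ask, start="start"):
    """Walk the tree one question at a time; return the final answer or None."""
    node_name = start
    while node_name is not None:
        node = tree[node_name]
        answer = ask(node["question"])
        if node.get("final"):
            return answer
        # Route to the follow-up node that matches the answer, if any.
        node_name = node.get(answer)
    return None


def make_stub_asker(text):
    """Answer questions with a regex instead of a model (illustration only)."""

    def ask(question):
        match = re.search(r"(\d+)\s*(?:feet|ft)", text)
        if question.startswith("Does"):
            return "yes" if "setback" in text.lower() and match else "no"
        return int(match.group(1))

    return ask


print(run_tree(TREE, make_stub_asker("BESS shall have a setback of 100 feet.")))
print(run_tree(TREE, make_stub_asker("No rules here.")))
```

Asking the narrow yes/no question first means the value question is only posed when there is something to extract, which is the routing behavior the interview analogy describes.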
Step 5: Wire the plugin
In plugin.py, connect the components and register the plugin. The
registration step is what makes your class discoverable by the CLI, so do
not skip it. You are effectively telling COMPASS, “this identifier maps to
this pipeline.”
from compass.plugin import OrdinanceExtractionPlugin, register_plugin
from compass.extraction.battery_storage.ordinance import (
    BatteryStorageHeuristic,
    BatteryStorageTextCollector,
    BatteryStorageTextExtractor,
)
from compass.extraction.battery_storage.parse import BatteryStorageParser

# Keep the parser input aligned with the label the extractor writes to
BatteryStorageParser.IN_LABEL = BatteryStorageTextExtractor.OUT_LABEL


class BatteryStorageExtractor(OrdinanceExtractionPlugin):
    # Name used to select this plugin, e.g. --tech battery_storage
    IDENTIFIER = "battery_storage"

    QUESTION_TEMPLATES = [
        "battery energy storage ordinance {jurisdiction}",
        "{jurisdiction} battery storage regulations",
        "{jurisdiction} BESS zoning",
    ]
    WEBSITE_KEYWORDS = {
        "ordinance": 1000,
        "battery": 900,
        "storage": 800,
        "bess": 900,
        "energy storage": 850,
    }

    heuristic = BatteryStorageHeuristic()
    TEXT_COLLECTORS = [BatteryStorageTextCollector]
    TEXT_EXTRACTORS = [BatteryStorageTextExtractor]
    PARSERS = [BatteryStorageParser]


register_plugin(BatteryStorageExtractor)
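Conceptually, registration maps an identifier to a class so that a command-line value can be resolved later. The registry below is a hypothetical sketch of that idea; register, lookup, and the case-insensitive matching are assumptions for illustration and are not the implementation of register_plugin.

```python
# Toy registry; these helpers are not COMPASS APIs.
_REGISTRY = {}


def register(plugin_cls):
    """Store a plugin class under its IDENTIFIER."""
    identifier = getattr(plugin_cls, "IDENTIFIER", None)
    if not identifier:
        raise ValueError(f"{plugin_cls.__name__} must define IDENTIFIER")
    key = identifier.casefold()
    if key in _REGISTRY:
        raise ValueError(f"{identifier!r} is already registered")
    _REGISTRY[key] = plugin_cls
    return plugin_cls


def lookup(tech):
    """Resolve a tech name to the plugin class registered for it."""
    try:
        return _REGISTRY[tech.casefold()]
    except KeyError:
        known = sorted(_REGISTRY)
        raise KeyError(f"Unknown tech {tech!r}; registered: {known}") from None


class BatteryStoragePlugin:
    IDENTIFIER = "battery_storage"


register(BatteryStoragePlugin)
print(lookup("battery_storage").__name__)
```

In a scheme like this, a plugin whose module is never imported never reaches the registry, which is one reason a skipped registration shows up as an unknown technology rather than as a crash inside your code.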
Step 6: Run it
At this point you have a complete pipeline: a heuristic to filter, a collector to keep relevant chunks, an extractor to shape them, and a parser to produce structured outputs. Running the CLI now exercises every stage you just built.
export OPENAI_API_KEY="your-key"
compass process \
    --tech battery_storage \
    --jurisdiction "Boulder County, CO" \
    --config my_config.yaml
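If you want to repeat the run for several jurisdictions, a small script can build the same command for each one. This sketch only assembles and prints the commands, using the flags shown above; build_command is a hypothetical helper and the second jurisdiction is just an example value.

```python
import shlex


def build_command(jurisdiction, tech="battery_storage",
                  config="my_config.yaml"):
    """Return the argument list for one run, using the flags shown above."""
    return [
        "compass", "process",
        "--tech", tech,
        "--jurisdiction", jurisdiction,
        "--config", config,
    ]


# Example values only; substitute the jurisdictions you care about.
jurisdictions = ["Boulder County, CO", "Larimer County, CO"]
commands = [build_command(name) for name in jurisdictions]
for command in commands:
    # shlex.join quotes arguments that contain spaces or commas.
    print(shlex.join(command))
```

Passing each argument list to subprocess.run(command, check=True) would execute the runs; keeping the jurisdiction as a single list element avoids shell quoting mistakes with names that contain spaces.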
Where to go next
Run the pipeline against a second jurisdiction to see how your heuristic and parser choices generalize
Review the troubleshooting guidance in this tutorial to debug label chain issues quickly
Explore the solar and wind plugins in the repository for more real-world patterns
Step into Advanced Plugin Development when you need custom filtering, RAG-style parsing, or nonstandard output schemas
Each step builds on the same flow you used here: filter, collect, extract, parse, save. Move to the next level only when the current pattern no longer fits your domain.