compass.validation.location.JurisdictionWebsiteValidator
- class JurisdictionWebsiteValidator(browser_semaphore=None, file_loader_kwargs=None, **kwargs)
Bases: object
Validate whether a website is the primary jurisdiction portal
Notes
The validator stores the initialization arguments so they can be reused across many documents without reconfiguration.
- Parameters:
browser_semaphore (asyncio.Semaphore, optional) – Semaphore constraining concurrent Playwright usage. None applies no concurrency limit. Default is None.
file_loader_kwargs (dict, optional) – Keyword arguments passed to elm.web.file_loader.AsyncWebFileLoader. Default is None.
**kwargs – Additional keyword arguments cached for downstream LLM calls triggered during validation.
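As a rough illustration of what an optional semaphore like browser_semaphore does, the sketch below caps how many coroutines run a simulated page load at once. It uses only the standard library. The names limited_fetch, run_demo, and tracker are hypothetical and are not part of compass, and the sleep call merely stands in for real browser work.

```python
import asyncio


async def limited_fetch(url, semaphore, tracker):
    """Simulate a page load while honoring an optional semaphore."""

    async def _load():
        tracker["active"] += 1
        tracker["peak"] = max(tracker["peak"], tracker["active"])
        await asyncio.sleep(0.01)  # stand-in for real browser work
        tracker["active"] -= 1
        return f"text of {url}"

    if semaphore is None:
        # No semaphore: no concurrency limit is applied
        return await _load()
    async with semaphore:
        # At most `limit` coroutines are inside this block at once
        return await _load()


async def run_demo(limit):
    """Launch six simulated loads and report the peak concurrency."""
    semaphore = None if limit is None else asyncio.Semaphore(limit)
    tracker = {"active": 0, "peak": 0}
    urls = [f"https://example.com/{i}" for i in range(6)]
    await asyncio.gather(
        *(limited_fetch(url, semaphore, tracker) for url in urls)
    )
    return tracker["peak"]


peak_limited = asyncio.run(run_demo(2))
peak_unlimited = asyncio.run(run_demo(None))
```

Sharing one semaphore across several validators would presumably bound total browser usage in the same way, though how compass applies the semaphore internally is not shown here.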
Methods
check(url, jurisdiction) – Determine whether a website serves as a jurisdiction's portal
Attributes
WEB_PAGE_CHECK_SYSTEM_MESSAGE – System message for main jurisdiction website validation calls
- WEB_PAGE_CHECK_SYSTEM_MESSAGE = 'You are an expert data analyst that examines website text to determine if the website is the main website for a given jurisdiction. Only ever answer based on the information from the website itself.'
System message for main jurisdiction website validation calls
- async check(url, jurisdiction)
Determine whether a website serves as a jurisdiction’s portal
The validator first performs an inexpensive URL classification before downloading page content. Only when the URL fails the initial check does it fetch and inspect the page text using a generic LLM caller.
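The cheap-check-first flow described above can be sketched generically as follows. This is not the library's implementation: two_stage_check, quick, and full are hypothetical stand-ins, with quick playing the role of the URL classification and full playing the role of the page download and LLM evaluation.

```python
import asyncio


async def two_stage_check(url, quick_check, full_check):
    """Return True if the cheap check passes or, failing that, the costly one."""
    if not url:
        return False  # empty URLs are rejected without doing any work
    if await quick_check(url):
        return True  # short-circuit: the costly check never runs
    return await full_check(url)


calls = []  # records which checks ran, in order


async def quick(url):
    # Hypothetical stand-in for the inexpensive URL classification
    calls.append(("quick", url))
    return url.endswith(".gov")


async def full(url):
    # Hypothetical stand-in for downloading the page and asking an LLM
    calls.append(("full", url))
    return "official" in url


async def main():
    urls = [
        "",
        "https://county.gov",
        "https://official-county.org",
        "https://blog.example.com",
    ]
    return [await two_stage_check(url, quick, full) for url in urls]


results = asyncio.run(main())
```

In this sketch the costly check runs only for the two URLs that fail the quick check, which is the point of ordering the stages this way.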
- Parameters:
url (str) – URL to inspect. Empty values return False immediately.
jurisdiction (compass.utilities.location.Jurisdiction) – Target jurisdiction descriptor used to frame the validation prompts.
- Returns:
bool – True when either the URL quick check or the full page evaluation indicates the site is the official main website for the jurisdiction.
- Raises:
compass.exceptions.COMPASSError – Propagated from BaseLLMCaller if configured to raise on LLM failures.
Examples
>>> validator = JurisdictionWebsiteValidator()
>>> await validator.check("https://county.gov", jurisdiction)
True