compass.pipeline.data_classes.RuntimeSettings

class RuntimeSettings(td_kwargs=None, tpe_kwargs=None, ppe_kwargs=None, max_num_concurrent_jurisdictions=25, log_level='INFO', keep_async_logs=False)

Bases: object

Value object for runtime and execution settings.

Parameters:
  • td_kwargs (dict, optional) – Additional keyword arguments to pass to tempfile.TemporaryDirectory. The temporary directory is used to store documents which have not yet been confirmed to contain relevant information. By default, None.

  • tpe_kwargs (dict, optional) – Additional keyword arguments to pass to concurrent.futures.ThreadPoolExecutor, used for I/O-bound tasks such as logging and file writes. By default, None.

  • ppe_kwargs (dict, optional) – Additional keyword arguments to pass to concurrent.futures.ProcessPoolExecutor, used for CPU-bound tasks such as PDF loading and parsing. By default, None.

  • max_num_concurrent_jurisdictions (int, default 25) – Maximum number of jurisdictions to process concurrently. Limiting this can help manage memory usage when dealing with a large number of documents. By default, 25.

  • log_level (str, default "INFO") – Logging level for ordinance scraping and parsing (e.g., “TRACE”, “DEBUG”, “INFO”, “WARNING”, or “ERROR”). By default, "INFO".

  • keep_async_logs (bool, default False) – Option to store the full asynchronous log record to a file. This is only useful if you intend to monitor overall processing progress from a file instead of from the terminal. If True, all of the unordered records are written to an “all.log” file in the log_dir directory. By default, False.
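
The parameters above can be illustrated with a minimal standard-library sketch. It does not import compass or construct a RuntimeSettings instance. It only shows the kind of dictionaries that td_kwargs and tpe_kwargs accept, by unpacking them into tempfile.TemporaryDirectory and concurrent.futures.ThreadPoolExecutor directly (ppe_kwargs would presumably be unpacked into ProcessPoolExecutor in the same way). The asyncio.Semaphore used to cap concurrency is an assumption made for illustration, not necessarily how compass enforces max_num_concurrent_jurisdictions, and the names process, main, and run_demo are hypothetical.

```python
import asyncio
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Illustrative values only. The keys are real keyword arguments of the
# standard-library classes named in the parameter descriptions.
td_kwargs = {"prefix": "ordinance_docs_"}
tpe_kwargs = {"max_workers": 4, "thread_name_prefix": "io"}
max_num_concurrent_jurisdictions = 3


async def process(name, limiter, running, peak):
    """Hypothetical stand-in for processing one jurisdiction."""
    async with limiter:  # at most N coroutines hold the semaphore at once
        running.append(name)
        peak.append(len(running))  # record how many are active right now
        await asyncio.sleep(0.01)  # placeholder for real work
        running.remove(name)


async def main():
    # Assumption: a semaphore is one common way to bound concurrency.
    limiter = asyncio.Semaphore(max_num_concurrent_jurisdictions)
    running, peak = [], []
    names = [f"jurisdiction_{i}" for i in range(10)]
    await asyncio.gather(*(process(n, limiter, running, peak) for n in names))
    return max(peak)


def run_demo():
    # The dictionaries are unpacked straight into the stdlib constructors.
    with tempfile.TemporaryDirectory(**td_kwargs) as tmp_dir, \
            ThreadPoolExecutor(**tpe_kwargs) as tpe:
        # A trivial I/O-style task run on the thread pool.
        dir_exists = tpe.submit(os.path.isdir, tmp_dir).result()
        peak_active = asyncio.run(main())
    return os.path.basename(tmp_dir), dir_exists, peak_active
```

Calling run_demo() returns the temporary directory's base name (which starts with the configured prefix), whether the directory existed while the pool was open, and the highest number of simultaneously active tasks, which never exceeds the configured limit.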

Methods