compass.plugin.interface.BaseTextCollector

class BaseTextCollector(llm_service, usage_tracker=None, **kwargs)

Bases: BaseLLMCaller, ABC

Base class for text collectors that gather relevant text

Parameters:

llm_service (Service) – LLM service used for queries.
usage_tracker (UsageTracker, optional) – Optional tracker instance to monitor token usage during LLM calls. By default, None.
**kwargs –
Keyword arguments to be passed to the underlying service processing function (i.e. llm_service.call(**kwargs)). Should not contain the following keys:
- usage_sub_label
- messages
These arguments are provided by this caller object.
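
The restriction on reserved keys can be illustrated with a small guard. This is only a sketch: `check_llm_kwargs` is a hypothetical helper that is not part of the library, and `temperature` in the usage note is just an illustrative keyword.

```python
# Keys the caller object supplies itself, per the parameter description above.
RESERVED_KEYS = {"usage_sub_label", "messages"}


def check_llm_kwargs(**kwargs):
    """Hypothetical helper: reject kwargs that the caller object sets itself."""
    clashes = RESERVED_KEYS.intersection(kwargs)
    if clashes:
        raise ValueError(f"Reserved keyword arguments passed: {sorted(clashes)}")
    return kwargs
```

Calling `check_llm_kwargs(temperature=0)` returns the keyword arguments unchanged, while passing `messages=...` raises a `ValueError`.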

Methods

check_chunk(chunk_parser, ind)

Check if a text chunk is relevant for extraction

Attributes

`OUT_LABEL`	Identifier for text collected by this class
`relevant_text`	Combined relevant text from the individual chunks

abstract property OUT_LABEL

Identifier for text collected by this class

Type: str

abstract property relevant_text

Combined relevant text from the individual chunks

Type: str

abstractmethod async check_chunk(chunk_parser, ind)

Check if a text chunk is relevant for extraction

You should validate chunks like so:

is_correct_kind_of_text = await chunk_parser.parse_from_ind(
    ind,
    key="my_unique_validation_key",
    llm_call_callback=my_async_llm_call_function,
)

Here, key must be unique to this particular validation, since it is used to cache the validation result in the chunk parser’s memory. my_async_llm_call_function is an async function that takes a key and a text chunk and returns a boolean indicating whether the text chunk passes the validation. You can call chunk_parser.parse_from_ind as many times as you want within this method, but be sure to use a unique key for each validation.

Parameters:

chunk_parser (ParseChunksWithMemory) – Instance that contains a parse_from_ind method.
ind (int) – Index of the chunk to check.

Returns:

bool – Boolean flag indicating whether or not the text in the chunk contains information relevant to the extraction task.
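
The following is a minimal, self-contained sketch of how a collector might implement this interface. To stay runnable without the library, it does not subclass BaseTextCollector. `FakeChunkParser` is a hypothetical stand-in for ParseChunksWithMemory: its `text_chunks` attribute and its caching behavior are assumptions based on the description above. `KeywordTextCollector` and its keyword test (a placeholder for a real LLM call) are likewise illustrative only.

```python
import asyncio


class FakeChunkParser:
    """Hypothetical stand-in for ParseChunksWithMemory (not the real class)."""

    def __init__(self, text_chunks):
        self.text_chunks = text_chunks  # assumed attribute name
        self._memory = {}  # caches validation results per (index, key)

    async def parse_from_ind(self, ind, key, llm_call_callback):
        cache_key = (ind, key)
        if cache_key not in self._memory:
            text = self.text_chunks[ind]
            self._memory[cache_key] = await llm_call_callback(key, text)
        return self._memory[cache_key]


class KeywordTextCollector:
    """Sketch of a collector; a real one would subclass BaseTextCollector."""

    OUT_LABEL = "setback_text"  # illustrative label

    def __init__(self):
        self._relevant = {}  # chunk index -> chunk text
        self.llm_calls = 0  # counts how often the callback actually runs

    @property
    def relevant_text(self):
        # Join the relevant chunks in their original order.
        return "\n".join(self._relevant[i] for i in sorted(self._relevant))

    async def _mentions_setback(self, key, text_chunk):
        # Placeholder for an LLM query: a plain keyword test.
        self.llm_calls += 1
        return "setback" in text_chunk.lower()

    async def check_chunk(self, chunk_parser, ind):
        is_relevant = await chunk_parser.parse_from_ind(
            ind,
            key="mentions_setback",  # unique key for this validation
            llm_call_callback=self._mentions_setback,
        )
        if is_relevant:
            self._relevant[ind] = chunk_parser.text_chunks[ind]
        return is_relevant


async def demo():
    parser = FakeChunkParser(
        [
            "Wind turbines must observe a 500 ft setback.",
            "Office hours are 9 to 5.",
            "Setback distances apply to solar arrays too.",
        ]
    )
    collector = KeywordTextCollector()
    results = [await collector.check_chunk(parser, i) for i in range(3)]
    # Re-checking chunk 0 hits the parser cache, so the callback is not rerun.
    await collector.check_chunk(parser, 0)
    return results, collector.relevant_text, collector.llm_calls
```

Running `asyncio.run(demo())` checks each chunk once and then rechecks the first. Because the stand-in parser caches results by index and key, the second check of chunk 0 does not invoke the callback again.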