compass.plugin.interface.BaseTextCollector
- class BaseTextCollector(llm_service, usage_tracker=None, **kwargs)
Bases: BaseLLMCaller, ABC
Base class for text collectors that gather relevant text
- Parameters:
llm_service (Service) – LLM service used for queries.
usage_tracker (UsageTracker, optional) – Optional tracker instance to monitor token usage during LLM calls. By default, None.
**kwargs – Keyword arguments to be passed to the underlying service processing function (i.e. llm_service.call(**kwargs)). Should not contain the following keys:
    usage_sub_label
    messages
These arguments are provided by this caller object.
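The reserved-key rule above can be sketched as a small pre-flight check. This is only an illustration: the helper name check_caller_kwargs and the example keywords temperature and timeout are hypothetical, and nothing here says whether the library performs such a check itself.

```python
# Keys the caller object supplies itself, per the parameter description above
RESERVED_KEYS = {"usage_sub_label", "messages"}


def check_caller_kwargs(**kwargs):
    """Hypothetical helper: reject kwargs that clash with the reserved keys."""
    clashes = RESERVED_KEYS.intersection(kwargs)
    if clashes:
        raise ValueError(f"Reserved keyword arguments supplied: {sorted(clashes)}")
    return kwargs


# "temperature" and "timeout" are made-up keyword names used only for illustration
ok_kwargs = check_caller_kwargs(temperature=0, timeout=30)

try:
    check_caller_kwargs(messages=[])
    rejected = False
except ValueError:
    rejected = True
```

Running a check like this before constructing a collector would surface a clash early instead of at the first LLM call.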
Methods
check_chunk(chunk_parser, ind) – Check if a text chunk is relevant for extraction
Attributes
Identifier for text collected by this class
Combined relevant text from the individual chunks
- abstractmethod async check_chunk(chunk_parser, ind)
Check if a text chunk is relevant for extraction
You should validate chunks like so:
    is_correct_kind_of_text = await chunk_parser.parse_from_ind(
        ind,
        key="my_unique_validation_key",
        llm_call_callback=my_async_llm_call_function,
    )
Here, key must be unique to this particular validation, since it is used to cache the validation result in the chunk parser’s memory, and my_async_llm_call_function is an async function that takes a key and a text chunk and returns a boolean indicating whether the text chunk passes the validation. You can call chunk_parser.parse_from_ind as many times as you want within this method, as long as each validation uses its own unique key.
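The validation pattern described above might look like the following minimal sketch. To keep it runnable without the library, it uses a hypothetical StubChunkParser in place of ParseChunksWithMemory (the real class may cache and recall chunks differently), keyword matching in place of real LLM calls, and a plain class where a real implementation would subclass BaseTextCollector. All names other than check_chunk and parse_from_ind are assumptions.

```python
import asyncio


class StubChunkParser:
    """Hypothetical stand-in for ParseChunksWithMemory (not the real class)."""

    def __init__(self, text_chunks):
        self.text_chunks = text_chunks
        # One cache per chunk, keyed by the validation key
        self._memory = [{} for _ in text_chunks]

    async def parse_from_ind(self, ind, key, llm_call_callback):
        # Invoke the callback only if this (chunk, key) pair is not cached yet
        if key not in self._memory[ind]:
            text = self.text_chunks[ind]
            self._memory[ind][key] = await llm_call_callback(key, text)
        return self._memory[ind][key]


async def mentions_wind(key, text_chunk):
    """Stand-in for an LLM call: takes a key and a text chunk, returns a bool."""
    return "wind" in text_chunk.lower()


async def mentions_setback(key, text_chunk):
    """Second stand-in validation, registered under its own unique key."""
    return "setback" in text_chunk.lower()


class ExampleCollector:
    """Illustrative only; a real collector would subclass BaseTextCollector."""

    async def check_chunk(self, chunk_parser, ind):
        # Each validation uses a distinct key so cached results never collide
        is_wind = await chunk_parser.parse_from_ind(
            ind, key="mentions_wind", llm_call_callback=mentions_wind
        )
        if not is_wind:
            return False
        return await chunk_parser.parse_from_ind(
            ind, key="mentions_setback", llm_call_callback=mentions_setback
        )


async def main():
    chunks = [
        "Wind turbines require a 500 ft setback.",
        "Parking rules for the city center.",
    ]
    parser = StubChunkParser(chunks)
    collector = ExampleCollector()
    return [await collector.check_chunk(parser, i) for i in range(len(chunks))]


results = asyncio.run(main())
print(results)
```

Chaining the validations this way skips the second check for chunks that fail the first, which in a real setting would save an LLM call.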
- Parameters:
chunk_parser (ParseChunksWithMemory) – Instance that contains a parse_from_ind method.
ind (int) – Index of the chunk to check.
- Returns:
bool – Boolean flag indicating whether or not the text in the chunk contains information relevant to the extraction task.
See also
parse_from_ind() – Method used to parse text from a chunk with memory of prior chunk validations.