compass.utilities.parsing.merge_overlapping_texts
- merge_overlapping_texts(text_chunks, n=300)
Merge text chunks while trimming overlapping boundaries.
Overlap detection compares at most n characters at each boundary, but never more than half the length of the accumulated output. Chunks that do not overlap are concatenated with a newline separator.
- Parameters:
text_chunks (Iterable of str) – Iterable containing text chunks which may or may not contain consecutive overlapping portions.
n (int, optional) – Number of characters to check at the beginning of each chunk for overlap with the previous chunk. The value is always reduced to at most half of the length of the previous chunk. By default, 300.
- Returns:
str – Merged text assembled from the non-overlapping portions.
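The behavior described above can be illustrated with a small stand-alone sketch. This is not the compass source code: the function name `merge_overlapping_texts_sketch`, the suffix/prefix matching strategy, and the handling of empty chunks and empty input are all assumptions made for illustration. Only the `n` cap, the half-length cap, and the newline separator come from the description above.

```python
def merge_overlapping_texts_sketch(text_chunks, n=300):
    """Illustrative sketch only; NOT the compass implementation."""
    # Assumption: empty chunks are skipped and empty input yields "".
    chunks = [chunk for chunk in text_chunks if chunk]
    if not chunks:
        return ""

    merged = chunks[0]
    for chunk in chunks[1:]:
        # Compare at most n characters, capped at half the length of
        # the text accumulated so far (and at the chunk's own length).
        limit = min(n, len(merged) // 2, len(chunk))

        # Assumption: overlap is the longest prefix of the new chunk
        # that is also a suffix of the accumulated text.
        overlap = 0
        for size in range(limit, 0, -1):
            if merged.endswith(chunk[:size]):
                overlap = size
                break

        if overlap:
            # Drop the duplicated prefix before appending.
            merged += chunk[overlap:]
        else:
            # No overlap found: join with a newline separator.
            merged += "\n" + chunk
    return merged


overlapping = merge_overlapping_texts_sketch(
    ["The quick brown fox", "brown fox jumps over"]
)
disjoint = merge_overlapping_texts_sketch(["alpha beta", "gamma"])
```

In this sketch, `overlapping` holds a single sentence with the repeated "brown fox" trimmed, while `disjoint` keeps both chunks separated by a newline. Note that the sketch treats even a one-character match as an overlap; a real implementation may handle such short coincidental matches differently.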