compass.services.cpu.read_docling_web_file

async read_docling_web_file(doc_bytes, url, source_uri=None, **kwargs)

Read a web file with Docling, running the parse in a process pool.

Parameters:
  • doc_bytes (bytes) – Raw document payload forwarded to the Docling parser.

  • url (str) – Filename or URL of the file to read.

  • source_uri (str, optional) – Original remote URL for the file. If specified, this is used as the HTML base URI while url is still used as the stream name for Docling format inference. By default, None.

  • **kwargs – Additional keyword arguments passed to Docling’s export_to_markdown() method.

Returns:

elm.web.document.MDDocument – Parsed document.
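
The split between url and source_uri can be illustrated with a small standard-library sketch. It is not Docling's or COMPASS's actual implementation: the helpers stream_suffix and resolve_link are hypothetical names introduced here, and the file paths are made-up values. The sketch only shows the idea that the stream name (url) carries the extension used to guess the format, while the base URI (source_uri, when given) is what relative links in an HTML page resolve against.

```python
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlparse


def stream_suffix(url):
    """Return the lowercase file extension of a filename or URL path.

    Hypothetical helper: stands in for extension-based format inference.
    """
    return PurePosixPath(urlparse(url).path).suffix.lower()


def resolve_link(href, url, source_uri=None):
    """Resolve a link against source_uri when given, otherwise against url.

    Hypothetical helper: stands in for HTML base-URI resolution.
    """
    base = source_uri if source_uri is not None else url
    return urljoin(base, href)


# Assumed example values: a locally cached copy of a remote HTML page.
url = "cached/ordinance.html"
source_uri = "https://example.com/codes/ordinance.html"

# The stream name alone is enough to infer the format from its extension.
print(stream_suffix(url))

# Relative links resolve against the original remote location.
print(resolve_link("images/map.png", url, source_uri))
```

In this sketch, leaving source_uri as None makes the stream name double as the base, which mirrors the documented default of None for source_uri.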