compass.services.cpu.PDFLoader
- class PDFLoader(**kwargs)
Bases: ProcessPoolService
Class to load PDFs in a ProcessPoolExecutor
- Parameters:
**kwargs – Keyword-value argument pairs to pass to concurrent.futures.ProcessPoolExecutor. By default, None.
Methods
acquire_resources() – Open thread pool and temp directory
call(*args, **kwargs) – Call the service
process(fn, pdf_bytes, **kwargs) – Execute a PDF parsing function in the process pool
process_using_futures(fut, *args, **kwargs) – Process a call to the service
release_resources() – Shutdown thread pool and cleanup temp directory
Attributes
MAX_CONCURRENT_JOBS – Max number of concurrent job submissions.
Always True (limiting is handled by asyncio)
Service name used to pull the correct queue object
- async process(fn, pdf_bytes, **kwargs)
Execute a PDF parsing function in the process pool
- Parameters:
fn (callable()) – Callable executed inside the process pool. Receives pdf_bytes as the first argument.
pdf_bytes (bytes) – Raw PDF payload forwarded to fn.
**kwargs – Additional keyword arguments passed to fn.
- Returns:
Any – Result returned by fn after execution.
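The calling convention described above (fn receives pdf_bytes first, followed by the keyword arguments) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the PDFLoader implementation: run_in_pool and looks_like_pdf are hypothetical names, and the use of loop.run_in_executor is an assumption about how such a call could be made non-blocking.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def looks_like_pdf(pdf_bytes, min_size=0):
    """Hypothetical parsing function: the raw bytes arrive as the first argument."""
    return pdf_bytes.startswith(b"%PDF-") and len(pdf_bytes) >= min_size


async def run_in_pool(pool, fn, pdf_bytes, **kwargs):
    """Hypothetical helper: run fn(pdf_bytes, **kwargs) in pool and await the result."""
    loop = asyncio.get_running_loop()
    # run_in_executor forwards positional arguments only, so bind kwargs first.
    return await loop.run_in_executor(pool, partial(fn, pdf_bytes, **kwargs))


async def main():
    # max_workers is a regular ProcessPoolExecutor keyword argument.
    with ProcessPoolExecutor(max_workers=2) as pool:
        return await run_in_pool(pool, looks_like_pdf, b"%PDF-1.7 ...", min_size=5)
```

Because the work runs in a separate process, fn and its arguments must be picklable, which in practice means defining fn at module level. To try the sketch, call asyncio.run(main()) under an if __name__ == "__main__": guard.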
- MAX_CONCURRENT_JOBS = 10000
Max number of concurrent job submissions.
- acquire_resources()
Open thread pool and temp directory
- async classmethod call(*args, **kwargs)
Call the service
- Parameters:
*args – Positional arguments to be passed to the underlying service processing function.
**kwargs – Keyword arguments to be passed to the underlying service processing function.
- Returns:
object – A response object from the underlying service.
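The attribute summary mentions a queue object looked up by service name, which suggests that call hands a request to a queue and waits for a response. The sketch below illustrates that general pattern with asyncio only. It is an assumption about the design, not the library's code: call_sketch, worker_sketch, add, and demo_call are all hypothetical names.

```python
import asyncio


async def call_sketch(queue, *args, **kwargs):
    """Hypothetical caller side: enqueue a request and wait for its response."""
    fut = asyncio.get_running_loop().create_future()
    await queue.put((fut, args, kwargs))
    return await fut


async def worker_sketch(queue, process):
    """Hypothetical service side: resolve each queued future with the result."""
    while True:
        fut, args, kwargs = await queue.get()
        fut.set_result(await process(*args, **kwargs))
        queue.task_done()


async def add(a, b=0):
    """Stand-in for the underlying service processing function."""
    return a + b


async def demo_call():
    queue = asyncio.Queue()
    worker = asyncio.create_task(worker_sketch(queue, add))
    try:
        return await call_sketch(queue, 1, b=2)
    finally:
        worker.cancel()  # stop the endless worker loop once the demo is done
```

Running asyncio.run(demo_call()) drives one request through the queue and returns the worker's response.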
- async process_using_futures(fut, *args, **kwargs)
Process a call to the service
The result is communicated by updating fut.
- Parameters:
fut (asyncio.Future) – A future object that should get the result of the processing operation. If the processing function returns answer, this method should call fut.set_result(answer).
**kwargs – Keyword arguments to be passed to the underlying processing function.
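The contract described for fut can be illustrated with a short, self-contained sketch. The names process_using_futures_sketch, double, and demo are hypothetical, and forwarding exceptions through fut.set_exception is an assumption added here so that a waiting caller is not left hanging; the text above only specifies the fut.set_result(answer) step.

```python
import asyncio


async def process_using_futures_sketch(fut, process, *args, **kwargs):
    """Hypothetical helper: await process and store its outcome on fut."""
    try:
        answer = await process(*args, **kwargs)
    except Exception as error:
        # Assumption: failures are reported through the future as well.
        fut.set_exception(error)
        return
    # Documented behavior: the answer is communicated by updating fut.
    fut.set_result(answer)


async def double(value):
    """Stand-in for the underlying processing function."""
    return value * 2


async def demo():
    fut = asyncio.get_running_loop().create_future()
    await process_using_futures_sketch(fut, double, 21)
    return await fut
```

The caller never sees the return value of the processing coroutine directly. It only awaits fut, which is why the method itself returns nothing.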
- release_resources()
Shutdown thread pool and cleanup temp directory