elm.base.ClientEmbeddingsApiQueue

class ClientEmbeddingsApiQueue(client, request_jsons, ignore_error=None, rate_limit=40000.0, max_retries=10)[source]

Bases: ApiQueue

Class to manage parallel API embedding submissions using a client

Parameters:
  • client (openai.AzureOpenAI | openai.OpenAI) – OpenAI client object to use for API calls.

  • request_jsons (list) – List of API data inputs; one entry typically looks like this for chat completion:

    {"model": "gpt-3.5-turbo",
     "messages": [{"role": "system", "content": "You do this…"},
                  {"role": "user", "content": "Do this: {}"}],
     "temperature": 0.0}

  • ignore_error (None | callable) – Optional callable to parse API error string. If the callable returns True, the error will be ignored, the API call will not be tried again, and the output will be an empty string.

  • rate_limit (float) – OpenAI API rate limit (tokens / minute). Note that the gpt-3.5-turbo limit is 90k as of 4/2023, but we’re using a large factor of safety (~1/2) because we can only count the tokens on the input side and assume the output is about the same count.

  • max_retries (int) – Number of times to retry a failed API call.
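A minimal sketch of the inputs this class expects. The embedding request shape below ({"model": ..., "input": ...}) is an assumption based on the OpenAI embeddings API, and the content-filter substring in ignore_error is hypothetical; the client construction is shown commented out so the sketch stays self-contained.

```python
# Hypothetical embedding requests, one per input text (shape assumed
# from the OpenAI embeddings API, not confirmed by these docs).
texts = ["first document", "second document"]
request_jsons = [
    {"model": "text-embedding-ada-002", "input": text} for text in texts
]


def ignore_error(error_message):
    """Return True for errors that should not be retried, e.g. inputs
    rejected by a content filter (hypothetical substring check)."""
    return "content management policy" in str(error_message)


# from openai import OpenAI
# queue = ClientEmbeddingsApiQueue(OpenAI(), request_jsons,
#                                  ignore_error=ignore_error,
#                                  rate_limit=40000.0, max_retries=10)
```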

Methods

collect_jobs()

Collect asynchronous API calls and API outputs.

run()

Run all asynchronous API calls.

submit_jobs()

Submit a subset of jobs asynchronously and hold jobs in the api_jobs attribute.

Attributes

waiting_on

Get a list of async jobs that are being waited on.

async collect_jobs()

Collect asynchronous API calls and API outputs. Store outputs in the out attribute.

async run()

Run all asynchronous API calls.

Returns:

out (list) – List of API call outputs with same ordering as request_jsons input.
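Because run() is a coroutine, it must be driven by an event loop, and its output list preserves the ordering of request_jsons, so results can be zipped back onto the inputs. A minimal sketch of that pattern; the _EchoQueue stand-in (which just echoes canned outputs) replaces a real ClientEmbeddingsApiQueue so the example runs without an API key:

```python
import asyncio


async def embed_all(queue, request_jsons):
    """Drive the queue and pair each output with its request (run()
    preserves input ordering, per the docs above)."""
    out = await queue.run()
    return list(zip(request_jsons, out))


class _EchoQueue:
    """Stand-in for ClientEmbeddingsApiQueue: run() returns canned
    outputs in order instead of making real API calls."""

    def __init__(self, outputs):
        self._outputs = outputs

    async def run(self):
        return self._outputs


requests = [{"input": "a"}, {"input": "b"}]
pairs = asyncio.run(embed_all(_EchoQueue(["vec_a", "vec_b"]), requests))
print(pairs[0])  # ({'input': 'a'}, 'vec_a')
```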

submit_jobs()

Submit a subset of jobs asynchronously and hold jobs in the api_jobs attribute. Break when the rate_limit is exceeded.
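The rate limiting described above can be sketched as a token budget: take requests off the front of the list until the estimated input-token count for one minute would exceed rate_limit. The whitespace word count used here is a deliberately naive stand-in for a real tokenizer (e.g. tiktoken), and both helper names are hypothetical, not part of the class:

```python
def estimate_tokens(request_json):
    """Rough token estimate for one embedding request: whitespace word
    count (a real implementation would use an actual tokenizer)."""
    return len(str(request_json.get("input", "")).split())


def batch_until_limit(request_jsons, rate_limit=40000.0):
    """Collect requests until the estimated token total would exceed
    the per-minute rate limit (always taking at least one request)."""
    batch, tokens = [], 0
    for req in request_jsons:
        cost = estimate_tokens(req)
        if batch and tokens + cost > rate_limit:
            break
        batch.append(req)
        tokens += cost
    return batch


requests = [{"model": "text-embedding-ada-002", "input": "hello world"}] * 5
print(len(batch_until_limit(requests, rate_limit=6)))  # 3 two-token requests fit
```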

property waiting_on

Get a list of async jobs that are being waited on.