Module: Raif::Concerns::Llms::SupportsBatchInference
Extended by: ActiveSupport::Concern
Defined in: app/models/raif/concerns/llms/supports_batch_inference.rb
Overview
Mixed into LLM provider classes (e.g. Raif::Llms::Anthropic, Raif::Llms::OpenAiBase) that can submit work to the provider's Batch API in exchange for a discounted rate.
The contract is intentionally narrow: a provider implementation must know how to (1) submit a Raif::ModelCompletionBatch holding pending Raif::ModelCompletion children, (2) poll the provider for status, (3) fetch and apply per-entry results, and (4) name the Raif::ModelCompletionBatches subclass that should back its batches. Provider-side cancellation (#cancel_batch!) is opt-in.
Submission orchestration: Raif::ModelCompletionBatch#submit! enqueues Raif::PollModelCompletionBatchJob automatically; the job self-reschedules until the batch reaches a terminal status, at which point it dispatches the batch's completion handler.
Producers: in v1, Raif::Task is the only built-in producer for batch entries (via Raif::Task.build_for_batch / Raif::Task#prepare_for_batch!). The pipeline itself (poll, finalize, dispatch handler) is producer-agnostic. Other call sites that want to attach Raif::ModelCompletion records to a batch can call Raif::Llm#build_pending_model_completion directly with batch_custom_id: set to a unique-within-batch identifier; the rest of the pipeline does not care where the completion came from.
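The four-part contract above can be sketched with plain-Ruby stand-ins. Everything here (ToyProvider, ToyBatch, ToyCompletion, the field names, the hard-coded results) is illustrative; only the method names come from this concern, and the real implementations operate on persisted ActiveRecord models and a real provider API:

```ruby
ToyBatch = Struct.new(:provider_batch_id, :status, :submitted_at, :completions, keyword_init: true)
ToyCompletion = Struct.new(:batch_custom_id, :raw_response, :status, keyword_init: true)

class ToyProvider
  # (4) name the batch subclass backing this provider's batches
  def batch_class
    ToyBatch # in Raif: a Raif::ModelCompletionBatches::* STI subclass
  end

  # (1) submit the batch and record the provider's identifiers
  def submit_batch!(batch)
    batch.provider_batch_id = "batch_123" # would come from the provider API
    batch.status = "in_progress"
    batch.submitted_at = Time.now
    batch
  end

  # (2) poll the provider for the batch's current status
  def fetch_batch_status!(batch)
    batch.status = "completed" # stand-in for a real status lookup
  end

  # (3) fetch per-entry results and apply them to each child completion
  def fetch_batch_results!(batch)
    results = { "entry-1" => "hello" } # stand-in for streamed results
    batch.completions.each do |mc|
      apply_batch_result(mc, results[mc.batch_custom_id])
    end
  end

  def apply_batch_result(model_completion, raw_result)
    model_completion.raw_response = raw_result
    model_completion.status = "completed"
    model_completion
  end
end
```

In the real pipeline, steps (2) and (3) are driven by Raif::PollModelCompletionBatchJob rather than called by hand.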
Instance Method Summary
- #apply_batch_result(model_completion, raw_result) ⇒ Raif::ModelCompletion
  Applies a single per-entry batch result to its corresponding Raif::ModelCompletion.
- #batch_class ⇒ Class
  The Raif::ModelCompletionBatches::* STI subclass that holds batches submitted to this provider.
- #cancel_batch!(batch) ⇒ String
  Optional. Requests cancellation of a batch from the provider.
- #create_batch(**attrs) ⇒ Raif::ModelCompletionBatch
  Convenience: creates and persists a Raif::ModelCompletionBatch sized to this LLM.
- #fetch_batch_results!(batch) ⇒ void
  Streams the batch's results from the provider and applies them to each child Raif::ModelCompletion via #apply_batch_result.
- #fetch_batch_status!(batch) ⇒ String
  Polls the provider for the batch's current status.
- #submit_batch!(batch) ⇒ Raif::ModelCompletionBatch
  Submits a Raif::ModelCompletionBatch (with its child Raif::ModelCompletion records already built and persisted) to the provider's Batch API.
Instance Method Details
#apply_batch_result(model_completion, raw_result) ⇒ Raif::ModelCompletion
Applies a single per-entry batch result to its corresponding Raif::ModelCompletion. Implementations should populate prompt/completion tokens and any other usage data, apply the provider's batch discount to token costs, and persist the completion.
# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 91

def apply_batch_result(model_completion, raw_result)
  raise NotImplementedError, "#{self.class.name} must implement #apply_batch_result"
end
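A minimal sketch of what an override might do, per the description above. The Struct, the usage-field names, the toy per-token rate, and the 50% discount are all assumptions (real discounts and pricing vary by provider, and the real completion is an ActiveRecord model persisted with save!):

```ruby
UsageCompletion = Struct.new(:prompt_tokens, :completion_tokens, :total_cost, :status, keyword_init: true)

BATCH_DISCOUNT = 0.5  # illustrative; the actual discount depends on the provider
TOY_RATE = 0.000002   # illustrative flat per-token rate, not a real price

def apply_batch_result(model_completion, raw_result)
  usage = raw_result.fetch("usage")
  model_completion.prompt_tokens     = usage["input_tokens"]
  model_completion.completion_tokens = usage["output_tokens"]
  full_cost = (usage["input_tokens"] + usage["output_tokens"]) * TOY_RATE
  model_completion.total_cost = full_cost * BATCH_DISCOUNT # apply the batch discount
  model_completion.status = "completed" # the real model would persist here
  model_completion
end
```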
#batch_class ⇒ Class
The Raif::ModelCompletionBatches::* STI subclass that holds batches submitted to this provider.
# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 35

def batch_class
  raise NotImplementedError, "#{self.class.name} must implement #batch_class"
end
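The override is typically a one-liner returning the provider's STI subclass. The ToyProvider class name below is a stand-in (the real subclass names under Raif::ModelCompletionBatches are not shown in this doc):

```ruby
module Raif
  module ModelCompletionBatches
    class ToyProvider; end # stand-in for a real STI subclass
  end
end

def batch_class
  Raif::ModelCompletionBatches::ToyProvider
end
```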
#cancel_batch!(batch) ⇒ String
Optional. Requests cancellation of a batch from the provider.
Cancellation is typically asynchronous: the provider acknowledges with a transitional status (e.g. "canceling" / "cancelling"), and the next poll picks up the final "canceled" state. Implementations should send the cancel request, update batch.status from the response, and return the new (possibly transitional) status.
Implementations should refuse to cancel a batch that's already terminal or that hasn't been submitted yet (no provider_batch_id) by raising Raif::Errors::InvalidBatchError.
# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 109

def cancel_batch!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #cancel_batch!"
end
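The guard logic described above can be sketched as follows. The error class name comes from this doc; the Struct, the terminal-status set, and the hard-coded "canceling" response are assumptions standing in for a real provider call:

```ruby
module Raif
  module Errors
    class InvalidBatchError < StandardError; end # name from the doc above
  end
end

CancelableBatch = Struct.new(:provider_batch_id, :status, keyword_init: true)

TERMINAL_STATUSES = %w[completed failed canceled expired].freeze # assumed set

def cancel_batch!(batch)
  # refuse to cancel a batch that was never submitted to the provider
  if batch.provider_batch_id.nil?
    raise Raif::Errors::InvalidBatchError, "batch has not been submitted"
  end
  # refuse to cancel a batch that has already reached a terminal status
  if TERMINAL_STATUSES.include?(batch.status)
    raise Raif::Errors::InvalidBatchError, "batch is already terminal"
  end

  batch.status = "canceling" # would come from the provider's cancel response
end
```

The final "canceled" state is then picked up by the next poll, as noted above.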
#create_batch(**attrs) ⇒ Raif::ModelCompletionBatch
Convenience: creates and persists a Raif::ModelCompletionBatch sized to this LLM. Saves callers from having to know the provider's batch subclass or repeat the LLM model key / api_name.
All other batch attributes (creator, completion_handler_class_name, metadata, ...) are forwarded.
# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 47

def create_batch(**attrs)
  batch_class.create!(
    llm_model_key: key.to_s,
    model_api_name: api_name,
    **attrs
  )
end
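A self-contained sketch of that behavior, with plain-Ruby stubs in place of the real LLM and ActiveRecord classes (StubLlm, StubBatch, and the model key / api name values are all illustrative):

```ruby
StubBatch = Struct.new(:llm_model_key, :model_api_name, :creator, keyword_init: true) do
  def self.create!(**attrs)
    new(**attrs) # stand-in for ActiveRecord's create!
  end
end

class StubLlm
  def key
    :toy_model
  end

  def api_name
    "toy-model-2024"
  end

  def batch_class
    StubBatch
  end

  # mirrors the #create_batch shown above: fills in the LLM identity,
  # forwards everything else (creator, completion_handler_class_name, ...)
  def create_batch(**attrs)
    batch_class.create!(
      llm_model_key: key.to_s,
      model_api_name: api_name,
      **attrs
    )
  end
end

batch = StubLlm.new.create_batch(creator: "system")
# batch.llm_model_key  => "toy_model"
# batch.model_api_name => "toy-model-2024"
```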
#fetch_batch_results!(batch) ⇒ void
This method returns an undefined value.
Streams the batch's results from the provider and applies them to each child
Raif::ModelCompletion via #apply_batch_result. Each child should be transitioned
to completed! or failed! (via record_failure!) before this method returns.
# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 80

def fetch_batch_results!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #fetch_batch_results!"
end
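The per-entry routing described above, matching results to children by batch_custom_id and transitioning each to completed! or failed!, can be sketched like this. The signature is simplified to plain collections (the real method takes the batch and streams from the provider), and the result-hash keys are assumed:

```ruby
ResultCompletion = Struct.new(:batch_custom_id, :status, :error, keyword_init: true) do
  def completed!
    self.status = "completed"
  end

  def record_failure!(err)
    self.status = "failed"
    self.error = err
  end
end

def fetch_batch_results!(completions, raw_results)
  # index children by their unique-within-batch custom id
  by_id = completions.to_h { |mc| [mc.batch_custom_id, mc] }
  raw_results.each do |entry|
    mc = by_id.fetch(entry["custom_id"])
    if entry["error"]
      mc.record_failure!(entry["error"])
    else
      mc.completed! # a real override also applies usage via #apply_batch_result
    end
  end
end
```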
#fetch_batch_status!(batch) ⇒ String
Polls the provider for the batch's current status. Should update batch.status, batch.request_counts, and any provider-specific bookkeeping in provider_response.
# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 70

def fetch_batch_status!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #fetch_batch_status!"
end
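The bookkeeping described above, sketched with the provider payload passed in explicitly for illustration (the real method takes only the batch and performs the API call itself; the payload field names are assumptions, since they vary by provider):

```ruby
StatusBatch = Struct.new(:status, :request_counts, :provider_response, keyword_init: true)

def fetch_batch_status!(batch, provider_payload)
  batch.provider_response = provider_payload               # provider-specific bookkeeping
  batch.request_counts    = provider_payload["request_counts"]
  batch.status            = provider_payload["status"]     # returned as the new status
end

payload = {
  "status" => "in_progress",
  "request_counts" => { "succeeded" => 3, "errored" => 0, "processing" => 7 }
}
```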
#submit_batch!(batch) ⇒ Raif::ModelCompletionBatch
Submits a Raif::ModelCompletionBatch (with its child Raif::ModelCompletion records already built and persisted) to the provider's Batch API. Should populate provider_batch_id, provider_response, status, and submitted_at on the batch.
# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 61

def submit_batch!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #submit_batch!"
end
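The post-submit bookkeeping the description requires, sketched with the provider's API response passed in as a plain hash for illustration (the real method builds and sends the request itself; the response field names are assumptions):

```ruby
SubmitBatch = Struct.new(:provider_batch_id, :provider_response, :status, :submitted_at, keyword_init: true)

def submit_batch!(batch, api_response)
  batch.provider_batch_id = api_response["id"]     # provider's handle for later polling
  batch.provider_response = api_response
  batch.status            = api_response["status"]
  batch.submitted_at      = Time.now
  batch
end
```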