Module: Raif::Concerns::Llms::SupportsBatchInference

Extended by:
ActiveSupport::Concern
Included in:
Anthropic::BatchInference, Google::BatchInference, OpenAi::BatchInference
Defined in:
app/models/raif/concerns/llms/supports_batch_inference.rb

Overview

Mixed into LLM provider classes (e.g. Raif::Llms::Anthropic, Raif::Llms::OpenAiBase) that can submit work to the provider's Batch API in exchange for a discounted rate.

The contract is intentionally narrow: a provider implementation must know how to (1) submit a Raif::ModelCompletionBatch holding pending Raif::ModelCompletion children, (2) poll the provider for status, (3) fetch and apply per-entry results, and (4) name the Raif::ModelCompletionBatches subclass that should back its batches. Provider-side cancellation (#cancel_batch!) is opt-in.

Submission orchestration: Raif::ModelCompletionBatch#submit! enqueues Raif::PollModelCompletionBatchJob automatically; the job self-reschedules until the batch reaches a terminal status, at which point it dispatches the batch's completion handler.
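
For intuition, here is a simplified sketch of that poll-until-terminal loop. It is not the actual Raif::PollModelCompletionBatchJob implementation; the terminal status names, the Raif.llm lookup, and the method name are assumptions for illustration.

# Simplified sketch only -- NOT the real Raif::PollModelCompletionBatchJob.
# Terminal status names are assumptions for illustration.
TERMINAL_STATUSES = %w[completed failed canceled].freeze # assumed

def poll_batch(batch)
  llm = Raif.llm(batch.llm_model_key.to_sym) # assumes this lookup resolves the batch's LLM

  status = llm.fetch_batch_status!(batch)

  if TERMINAL_STATUSES.include?(status)
    # On success, pull down per-entry results and apply them to the children...
    llm.fetch_batch_results!(batch) if status == "completed"
    # ...then dispatch the batch's completion handler (the real job does this).
  else
    # Not terminal yet: the real job re-enqueues itself to poll again later.
  end
end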

Producers: in v1, Raif::Task is the only built-in producer for batch entries (via Raif::Task.build_for_batch / Raif::Task#prepare_for_batch!). The pipeline itself (poll, finalize, dispatch handler) is producer-agnostic. Other call sites that want to attach Raif::ModelCompletion records to a batch can call Raif::Llm#build_pending_model_completion directly with batch_custom_id: set to a unique-within-batch identifier; the rest of the pipeline does not care where the completion came from.
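
A rough sketch of that flow follows. The model key, the handler class name, and every keyword argument to #build_pending_model_completion other than batch_custom_id: are assumptions here, not Raif's documented API.

llm = Raif.llm(:anthropic_claude_3_5_haiku) # model key assumed for illustration

batch = llm.create_batch(completion_handler_class_name: "MyBatchHandler") # handler class is hypothetical

[{ id: 1, prompt: "Summarize doc 1" }, { id: 2, prompt: "Summarize doc 2" }].each do |item|
  llm.build_pending_model_completion(
    model_completion_batch: batch,          # association kwarg assumed
    batch_custom_id: "item-#{item[:id]}",   # must be unique within the batch
    messages: [{ role: "user", content: item[:prompt] }] # message shape assumed
  )
end

batch.submit! # enqueues Raif::PollModelCompletionBatchJob; polling takes over from here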

Instance Method Summary

  • #apply_batch_result(model_completion, raw_result) ⇒ Raif::ModelCompletion
  • #batch_class ⇒ Class
  • #cancel_batch!(batch) ⇒ String
  • #create_batch(**attrs) ⇒ Raif::ModelCompletionBatch
  • #fetch_batch_results!(batch) ⇒ void
  • #fetch_batch_status!(batch) ⇒ String
  • #submit_batch!(batch) ⇒ Raif::ModelCompletionBatch

Instance Method Details

#apply_batch_result(model_completion, raw_result) ⇒ Raif::ModelCompletion

Applies a single per-entry batch result to its corresponding Raif::ModelCompletion. Implementations should populate prompt/completion tokens and any other usage data, apply the provider's batch discount to token costs, and persist the completion.

Parameters:

  • model_completion (Raif::ModelCompletion)
  • raw_result (Hash)

    provider-specific per-entry result payload

Returns:

  • (Raif::ModelCompletion)

    the updated, persisted completion

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 91

def apply_batch_result(model_completion, raw_result)
  raise NotImplementedError, "#{self.class.name} must implement #apply_batch_result"
end
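
A hedged sketch of what an implementation might look like. The payload shape (an OpenAI-style usage hash), the cost attribute, the helper method, and the 0.5 discount multiplier are illustrative assumptions, not Raif's actual values.

def apply_batch_result(model_completion, raw_result)
  usage = raw_result.dig("response", "body", "usage") || {} # payload shape assumed

  model_completion.prompt_tokens = usage["prompt_tokens"]
  model_completion.completion_tokens = usage["completion_tokens"]

  # Apply the provider's batch discount to token costs. The attribute name,
  # helper, and 0.5 multiplier are assumptions for illustration.
  model_completion.prompt_token_cost = calculate_prompt_cost(model_completion) * 0.5

  model_completion.save!
  model_completion
end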

#batch_class ⇒ Class

The Raif::ModelCompletionBatches::* STI subclass that holds batches submitted to this provider.

Returns:

  • (Class)

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 35

def batch_class
  raise NotImplementedError, "#{self.class.name} must implement #batch_class"
end
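
An implementation is typically a one-liner naming the provider's subclass. The class name below follows the Raif::ModelCompletionBatches::* convention described above but is an assumption.

def batch_class
  Raif::ModelCompletionBatches::Anthropic # assumed subclass name
end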

#cancel_batch!(batch) ⇒ String

Optional. Requests cancellation of a batch from the provider.

Cancellation is typically asynchronous: the provider acknowledges with a transitional status (e.g. "canceling" / "cancelling"), and the next poll picks up the final "canceled" state. Implementations should send the cancel request, update batch.status from the response, and return the new (possibly transitional) status.

Implementations should refuse to cancel a batch that's already terminal or that hasn't been submitted yet (no provider_batch_id) by raising Raif::Errors::InvalidBatchError.

Parameters:

  • batch (Raif::ModelCompletionBatch)

    the batch to cancel

Returns:

  • (String)

    the batch's new status (one of Raif::ModelCompletionBatch::STATUSES)

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 109

def cancel_batch!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #cancel_batch!"
end
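
A hedged sketch of the guard-then-cancel flow described above. The provider client call, the response field names, and the terminal-status helper are placeholders, not a real API.

def cancel_batch!(batch)
  # Refuse to cancel unsubmitted or already-terminal batches.
  if batch.provider_batch_id.blank? || batch.status_terminal? # status check helper assumed
    raise Raif::Errors::InvalidBatchError, "batch cannot be canceled"
  end

  response = provider_client.cancel_batch(batch.provider_batch_id) # hypothetical client call
  batch.update!(status: response["status"]) # e.g. "canceling"; the next poll picks up "canceled"
  batch.status
end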

#create_batch(**attrs) ⇒ Raif::ModelCompletionBatch

Convenience method: creates and persists a Raif::ModelCompletionBatch configured for this LLM, saving callers from having to know the provider's batch subclass or repeat the LLM's model key / API name.

All other batch attributes (creator, completion_handler_class_name, metadata, ...) are forwarded.

Returns:

  • (Raif::ModelCompletionBatch)

    the newly created batch

# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 47

def create_batch(**attrs)
  batch_class.create!(
    llm_model_key: key.to_s,
    model_api_name: api_name,
    **attrs
  )
end
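
For example (the creator and handler class name here are illustrative):

batch = llm.create_batch(
  creator: current_user,
  completion_handler_class_name: "Summaries::BatchHandler"
)
batch.llm_model_key # => filled in from the LLM's key automatically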

#fetch_batch_results!(batch) ⇒ void

This method returns an undefined value.

Streams the batch's results from the provider and applies them to each child Raif::ModelCompletion via #apply_batch_result. Each child should be transitioned to completed! or failed! (via record_failure!) before this method returns.

Parameters:

  • batch (Raif::ModelCompletionBatch)

    the batch whose results should be fetched

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 80

def fetch_batch_results!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #fetch_batch_results!"
end
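
A hedged sketch of the fetch-and-apply loop. The streaming client call, the per-entry field names, and the model_completions association name are assumptions; matching entries by batch_custom_id follows from the pipeline described in the overview.

def fetch_batch_results!(batch)
  provider_client.each_batch_result(batch.provider_batch_id) do |raw_result| # hypothetical client call
    model_completion = batch.model_completions.find_by!( # association name assumed
      batch_custom_id: raw_result["custom_id"]            # field name assumed
    )

    if raw_result["error"] # error signaling assumed
      model_completion.record_failure!(raw_result["error"])
    else
      apply_batch_result(model_completion, raw_result)
      model_completion.completed!
    end
  end
end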

#fetch_batch_status!(batch) ⇒ String

Polls the provider for the batch's current status. Should update batch.status, batch.request_counts, and any provider-specific bookkeeping in provider_response.

Parameters:

  • batch (Raif::ModelCompletionBatch)

    the batch to poll

Returns:

  • (String)

    the new status (one of Raif::ModelCompletionBatch::STATUSES)

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 70

def fetch_batch_status!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #fetch_batch_status!"
end
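
A hedged sketch; the provider call and the response field names are placeholders.

def fetch_batch_status!(batch)
  response = provider_client.get_batch(batch.provider_batch_id) # hypothetical client call

  batch.update!(
    status: response["status"],                 # must map into ModelCompletionBatch::STATUSES
    request_counts: response["request_counts"], # field name assumed
    provider_response: response
  )
  batch.status
end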

#submit_batch!(batch) ⇒ Raif::ModelCompletionBatch

Submits a Raif::ModelCompletionBatch (with its child Raif::ModelCompletion records already built and persisted) to the provider's Batch API. Should populate provider_batch_id, provider_response, status, and submitted_at on the batch.

Parameters:

  • batch (Raif::ModelCompletionBatch)

    the batch to submit

Returns:

  • (Raif::ModelCompletionBatch)

    the updated batch

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 61

def submit_batch!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #submit_batch!"
end
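
A hedged sketch of the submit flow; the request-building helper and the client call are placeholders for whatever the provider's Batch API actually requires.

def submit_batch!(batch)
  entries = batch.model_completions.map do |mc| # association name assumed
    { custom_id: mc.batch_custom_id, body: build_request_body(mc) } # hypothetical helper
  end

  response = provider_client.create_batch(entries) # hypothetical client call

  batch.update!(
    provider_batch_id: response["id"],
    provider_response: response,
    status: response["status"],
    submitted_at: Time.current
  )
  batch
end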