Module: Raif::Concerns::Llms::SupportsBatchInference

Extended by:
ActiveSupport::Concern
Included in:
Anthropic::BatchInference, Google::BatchInference, OpenAi::BatchInference
Defined in:
app/models/raif/concerns/llms/supports_batch_inference.rb

Overview

Mixed into LLM provider classes (e.g. Raif::Llms::Anthropic, Raif::Llms::OpenAiBase) that can submit work to the provider's Batch API in exchange for a discounted rate.

The contract is intentionally narrow: a provider implementation must know how to (1) submit a Raif::ModelCompletionBatch holding pending Raif::ModelCompletion children, (2) poll the provider for status, (3) fetch and apply per-entry results, and (4) name the Raif::ModelCompletionBatches subclass that should back its batches. Provider-side cancellation (#cancel_batch!) is opt-in.

Submission orchestration: Raif::ModelCompletionBatch#submit! enqueues Raif::PollModelCompletionBatchJob automatically; the job self-reschedules until the batch reaches a terminal status, at which point it dispatches the batch's completion handler.
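
For intuition, here is a simplified sketch of that poll-until-terminal loop. It is not the actual Raif::PollModelCompletionBatchJob implementation; the terminal status names, the Raif.llm lookup, and the method name are assumptions for illustration.

# Simplified sketch only -- NOT the real Raif::PollModelCompletionBatchJob.
# Terminal status names are assumptions for illustration.
TERMINAL_STATUSES = %w[completed failed canceled].freeze # assumed

def poll_batch(batch)
  llm = Raif.llm(batch.llm_model_key.to_sym) # assumes this lookup resolves the batch's LLM

  status = llm.fetch_batch_status!(batch)

  if TERMINAL_STATUSES.include?(status)
    # On success, pull down per-entry results and apply them to the children...
    llm.fetch_batch_results!(batch) if status == "completed"
    # ...then dispatch the batch's completion handler (the real job does this).
  else
    # Not terminal yet: the real job re-enqueues itself to poll again later.
  end
end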

Producers: in v1, Raif::Task is the only built-in producer for batch entries (via Raif::Task.build_for_batch / Raif::Task#prepare_for_batch!). The pipeline itself (poll, finalize, dispatch handler) is producer-agnostic. Other call sites that want to attach Raif::ModelCompletion records to a batch can call Raif::Llm#build_pending_model_completion directly with batch_custom_id: set to a unique-within-batch identifier; the rest of the pipeline does not care where the completion came from.
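
A rough sketch of that flow follows. The model key, the handler class name, and every keyword argument to #build_pending_model_completion other than batch_custom_id: are assumptions here, not Raif's documented API.

llm = Raif.llm(:anthropic_claude_3_5_haiku) # model key assumed for illustration

batch = llm.create_batch(completion_handler_class_name: "MyBatchHandler") # handler class is hypothetical

[{ id: 1, prompt: "Summarize doc 1" }, { id: 2, prompt: "Summarize doc 2" }].each do |item|
  llm.build_pending_model_completion(
    model_completion_batch: batch,          # association kwarg assumed
    batch_custom_id: "item-#{item[:id]}",   # must be unique within the batch
    messages: [{ role: "user", content: item[:prompt] }] # message shape assumed
  )
end

batch.submit! # enqueues Raif::PollModelCompletionBatchJob; polling takes over from here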

Instance Method Summary

  • #apply_batch_result(model_completion, raw_result) ⇒ Raif::ModelCompletion
  • #batch_class ⇒ Class
  • #cancel_batch!(batch) ⇒ String
  • #create_batch(**attrs) ⇒ Raif::ModelCompletionBatch
  • #fetch_batch_results!(batch) ⇒ void
  • #fetch_batch_status!(batch) ⇒ String
  • #submit_batch!(batch) ⇒ Raif::ModelCompletionBatch

Instance Method Details

#apply_batch_result(model_completion, raw_result) ⇒ Raif::ModelCompletion

Applies a single per-entry batch result to its corresponding Raif::ModelCompletion. Implementations should populate prompt/completion tokens and any other usage data, apply the provider's batch discount to token costs, and persist the completion.

Parameters:

  • model_completion (Raif::ModelCompletion)
  • raw_result (Hash)

    provider-specific per-entry result payload

Returns:

  • (Raif::ModelCompletion)

    the updated, persisted completion

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 91

def apply_batch_result(model_completion, raw_result)
  raise NotImplementedError, "#{self.class.name} must implement #apply_batch_result"
end
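
A hedged sketch of what an implementation might look like. The payload shape (an OpenAI-style usage hash), the cost attribute, the helper method, and the 0.5 discount multiplier are illustrative assumptions, not Raif's actual values.

def apply_batch_result(model_completion, raw_result)
  usage = raw_result.dig("response", "body", "usage") || {} # payload shape assumed

  model_completion.prompt_tokens = usage["prompt_tokens"]
  model_completion.completion_tokens = usage["completion_tokens"]

  # Apply the provider's batch discount to token costs. The attribute name,
  # helper, and 0.5 multiplier are assumptions for illustration.
  model_completion.prompt_token_cost = calculate_prompt_cost(model_completion) * 0.5

  model_completion.save!
  model_completion
end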

#batch_class ⇒ Class

The Raif::ModelCompletionBatches::* STI subclass that holds batches submitted to this provider.

Returns:

  • (Class)

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 35

def batch_class
  raise NotImplementedError, "#{self.class.name} must implement #batch_class"
end
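
An implementation is typically a one-liner naming the provider's subclass. The class name below follows the Raif::ModelCompletionBatches::* convention described above but is an assumption.

def batch_class
  Raif::ModelCompletionBatches::Anthropic # assumed subclass name
end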

#cancel_batch!(batch) ⇒ String

Optional. Requests cancellation of a batch from the provider.

Cancellation is typically asynchronous: the provider acknowledges with a transitional status (e.g. "canceling" / "cancelling"), and the next poll picks up the final "canceled" state. Implementations should send the cancel request, update batch.status from the response, and return the new (possibly transitional) status.

Implementations should refuse to cancel a batch that's already terminal or that hasn't been submitted yet (no provider_batch_id) by raising Raif::Errors::InvalidBatchError.

Parameters:

  • batch (Raif::ModelCompletionBatch)

    the batch to cancel

Returns:

  • (String)

    the batch's new status (one of Raif::ModelCompletionBatch::STATUSES)

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 109

def cancel_batch!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #cancel_batch!"
end
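
A hedged sketch of the guard-then-cancel flow described above. The provider client call, the response field names, and the terminal-status helper are placeholders, not a real API.

def cancel_batch!(batch)
  # Refuse to cancel unsubmitted or already-terminal batches.
  if batch.provider_batch_id.blank? || batch.status_terminal? # status check helper assumed
    raise Raif::Errors::InvalidBatchError, "batch cannot be canceled"
  end

  response = provider_client.cancel_batch(batch.provider_batch_id) # hypothetical client call
  batch.update!(status: response["status"]) # e.g. "canceling"; the next poll picks up "canceled"
  batch.status
end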

#create_batch(**attrs) ⇒ Raif::ModelCompletionBatch

Convenience method: creates and persists a Raif::ModelCompletionBatch configured for this LLM, saving callers from having to know the provider's batch subclass or repeat the LLM's model key / API name.

All other batch attributes (creator, completion_handler_class_name, metadata, ...) are forwarded.

Returns:

  • (Raif::ModelCompletionBatch)

    the newly created batch

# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 47

def create_batch(**attrs)
  batch_class.create!(
    llm_model_key: key.to_s,
    model_api_name: api_name,
    **attrs
  )
end
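
For example (the creator and handler class name here are illustrative):

batch = llm.create_batch(
  creator: current_user,
  completion_handler_class_name: "Summaries::BatchHandler"
)
batch.llm_model_key # => filled in from the LLM's key automatically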

#fetch_batch_results!(batch) ⇒ void

This method returns an undefined value.

Streams the batch's results from the provider and applies them to each child Raif::ModelCompletion via #apply_batch_result. Each child should be transitioned to completed! or failed! (via record_failure!) before this method returns.

Parameters:

  • batch (Raif::ModelCompletionBatch)

    the batch whose results should be fetched

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 80

def fetch_batch_results!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #fetch_batch_results!"
end
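
A hedged sketch of the fetch-and-apply loop. The streaming client call, the per-entry field names, and the model_completions association name are assumptions; matching entries by batch_custom_id follows from the pipeline described in the overview.

def fetch_batch_results!(batch)
  provider_client.each_batch_result(batch.provider_batch_id) do |raw_result| # hypothetical client call
    model_completion = batch.model_completions.find_by!( # association name assumed
      batch_custom_id: raw_result["custom_id"]            # field name assumed
    )

    if raw_result["error"] # error signaling assumed
      model_completion.record_failure!(raw_result["error"])
    else
      apply_batch_result(model_completion, raw_result)
      model_completion.completed!
    end
  end
end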

#fetch_batch_status!(batch) ⇒ String

Polls the provider for the batch's current status. Should update batch.status, batch.request_counts, and any provider-specific bookkeeping in provider_response.

Parameters:

  • batch (Raif::ModelCompletionBatch)

    the batch to poll

Returns:

  • (String)

    the new status (one of Raif::ModelCompletionBatch::STATUSES)

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 70

def fetch_batch_status!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #fetch_batch_status!"
end
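
A hedged sketch; the provider call and the response field names are placeholders.

def fetch_batch_status!(batch)
  response = provider_client.get_batch(batch.provider_batch_id) # hypothetical client call

  batch.update!(
    status: response["status"],                 # must map into ModelCompletionBatch::STATUSES
    request_counts: response["request_counts"], # field name assumed
    provider_response: response
  )
  batch.status
end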

#submit_batch!(batch) ⇒ Raif::ModelCompletionBatch

Submits a Raif::ModelCompletionBatch (with its child Raif::ModelCompletion records already built and persisted) to the provider's Batch API. Should populate provider_batch_id, provider_response, status, and submitted_at on the batch.

Parameters:

  • batch (Raif::ModelCompletionBatch)

    the batch to submit

Returns:

  • (Raif::ModelCompletionBatch)

    the updated batch

Raises:

  • (NotImplementedError)


# File 'app/models/raif/concerns/llms/supports_batch_inference.rb', line 61

def submit_batch!(batch)
  raise NotImplementedError, "#{self.class.name} must implement #submit_batch!"
end
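
A hedged sketch of the submit flow; the request-building helper and the client call are placeholders for whatever the provider's Batch API actually requires.

def submit_batch!(batch)
  entries = batch.model_completions.map do |mc| # association name assumed
    { custom_id: mc.batch_custom_id, body: build_request_body(mc) } # hypothetical helper
  end

  response = provider_client.create_batch(entries) # hypothetical client call

  batch.update!(
    provider_batch_id: response["id"],
    provider_response: response,
    status: response["status"],
    submitted_at: Time.current
  )
  batch
end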