Table of Contents
- Task Batching
- Producing a batch
- The polling lifecycle
- Consuming a batch
- Cancellation
- Configuration
- Schema and accessors
- Testing
## Task Batching
Anthropic and OpenAI both expose a Batch API: you submit a JSONL of requests, the provider processes them asynchronously (typically minutes, with a 24-hour ceiling), and you fetch results in one go in exchange for ~50% off the per-token rate. Raif wraps both behind a single `Raif::ModelCompletionBatch` primitive so your task code doesn’t have to care which provider it’s talking to.
### Provider support
| Provider | Batch support |
|---|---|
| Anthropic | Yes (Messages Batches API) |
| OpenAI | Yes (Batch API, Completions and Responses) |
| Bedrock | No |
| OpenRouter | No |
Guard with `Raif.llm(model_key).supports_batch_inference?` if you need to fall back to `Raif::Task.run` for non-batch-capable providers.
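A minimal sketch of that guard (assumption: the `summarize_documents` method, its return values, and the fallback shape are illustrative, not Raif API; only `supports_batch_inference?`, `create_batch`, `build_for_batch`, `submit!`, and `Raif::Task.run` come from this guide):

```ruby
# Hedged sketch: batch when the provider supports it, otherwise fall back to
# one synchronous Raif::Task.run per document. Method name and return values
# are illustrative only.
def summarize_documents(llm, documents, creator:)
  if llm.supports_batch_inference?
    batch = llm.create_batch(completion_handler_class_name: "MyApp::DocumentSummaryBatchHandler")
    documents.each do |document|
      Raif::Tasks::DocumentSummarization.build_for_batch(
        batch: batch,
        batch_custom_id: "doc_#{document.id}",
        creator: creator,
        document: document
      )
    end
    batch.submit!
    :batched
  else
    documents.each do |document|
      Raif::Tasks::DocumentSummarization.run(creator: creator, document: document)
    end
    :synchronous
  end
end
```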
## Producing a batch
Three steps: ask the LLM for a batch, build your tasks, attach them.
```ruby
llm = Raif.llm(:anthropic_claude_4_6_sonnet)

batch = llm.create_batch(
  completion_handler_class_name: "MyApp::DocumentSummaryBatchHandler",
  metadata: {
    "campaign_id" => campaign.id,
    "requested_by_user_id" => current_user.id
  }
)

documents.each do |document|
  task = Raif::Tasks::DocumentSummarization.new(
    creator: current_user,
    document: document
  )
  batch.add_task(task, batch_custom_id: "doc_#{document.id}")
end

batch.submit!
```
`batch.add_task(task, batch_custom_id: ...)` saves the task if it isn’t already saved and routes it through `Raif::Task#prepare_for_batch!` to build the pending `Raif::ModelCompletion` attached to the batch. No LLM request is made until `batch.submit!`.
A few details worth knowing:
- `batch_custom_id` must be unique within a single batch. It’s the identifier the provider echoes back on each per-entry result so Raif can match it to the right child completion. Raif enforces a unique partial index on `(raif_model_completion_batch_id, batch_custom_id)`. It defaults to `"raif_task_<task.id>"` if you don’t pass one explicitly – pass your own when the task ID alone isn’t meaningful for debugging.
- `metadata` is a `jsonb` column. Use it to carry whatever context your completion handler needs to resolve domain records (account, workflow, campaign, etc.). It is not sent to the provider.
- `completion_handler_class_name` is the class that runs once the batch reaches a terminal state. See Consuming a batch.
`batch.submit!` is single-shot – it serializes every child completion into the provider’s batch payload, uploads/creates the batch, stamps `provider_batch_id` and `submitted_at`, and auto-enqueues `Raif::PollModelCompletionBatchJob`. A second call on the same batch raises `Raif::Errors::InvalidBatchError`. If you need to retry, cancel and create a new batch.
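A caller that might double-submit can rescue the documented error rather than let it propagate; a minimal sketch (assumption: the `submit_once` helper and its boolean return are ours, not Raif API – only `submit!` and `Raif::Errors::InvalidBatchError` come from this guide):

```ruby
# Sketch: treat a double submit as a signal rather than an exception.
def submit_once(batch)
  batch.submit!
  true
rescue Raif::Errors::InvalidBatchError
  false # already submitted; cancel and build a fresh batch if you need a retry
end
```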
### One-call shortcut
If you don’t need a separate handle on the task before attaching it, `Raif::Task.build_for_batch` collapses construction + save + attachment into a single call:
```ruby
documents.each do |document|
  Raif::Tasks::DocumentSummarization.build_for_batch(
    batch: batch,
    batch_custom_id: "doc_#{document.id}",
    creator: current_user,
    document: document
  )
end
```
Equivalent to the `new` + `add_task` pair above; reach for it when the loop body is the entire task lifecycle.
## The polling lifecycle
You don’t usually interact with the polling job directly. Once `submit!` is called:

1. `Raif::PollModelCompletionBatchJob` is enqueued with a delay from `Raif.config.model_completion_batch_poll_schedule` (default: `[1m, 2m, 5m, 10m, 30m]`, repeating the last entry).
2. Each poll calls `batch.fetch_status!`. Transient provider errors (the classes in `Raif.config.llm_request_retriable_exceptions`) cause the chain to self-reschedule rather than fail the job.
3. When the batch hits a terminal status:
   - `ended` (success): `batch.fetch_results!` streams the per-entry payload back from the provider. Each child completion is updated in place via `apply_batch_result` (token usage, costs, raw response, completed/failed transition).
   - `canceled` / `expired` / `failed`: every still-pending child is force-failed with the batch status as the reason. No per-entry fetch happens.
4. `batch.dispatch_completion_handler!` runs your handler.
If a batch outlives `Raif.config.model_completion_batch_max_age` (default 26 hours) without resolving, it’s expired: Raif issues a best-effort provider-side cancel via `batch.cancel!` and then force-fails the batch locally so any waiting workflow can advance. If the cancel call fails (network, 5xx, etc.), the local force-fail still happens — the provider-side batch may continue running and be billed, but Raif logs the error and moves on.

The hourly safety sweep `Raif::ExpireStuckModelCompletionBatchesJob` runs the same expiry path for batches whose polling chain was dropped entirely (e.g., a queue restart that loses scheduled jobs). Schedule it to run hourly in your host app’s cron – without it, a stranded batch never advances.
```yaml
# config/schedule.yml (sidekiq-cron example)
expire_stuck_model_completion_batches:
  cron: "every hour"
  class: "Raif::ExpireStuckModelCompletionBatchesJob"
  queue: "default"
```
## Consuming a batch
Subclass `Raif::TaskBatchCompletionHandler` and register one or more lifecycle blocks. Inside any block, `batch` and `tasks` are exposed as readers, plus any helper methods you define on the subclass.
Three macros are available:
- `on_batch_success` fires only when `batch.successful?` (status is `"ended"`). Tasks have been hydrated with real provider results — `parsed_response` populated, tool invocations processed. This is where you aggregate.
- `on_batch_failure` fires only when the batch reached a non-success terminal status (`canceled` / `expired` / `failed`). The per-entry fetch was skipped, so tasks are all in the `failed` state with no useful `parsed_response` — the meaningful state is on the batch itself: `batch.failure_reason`, `batch.failure_error`, `batch.status`, `batch.metadata`.
- `on_batch_completion` fires for every terminal status. Use it for unconditional logging or as a one-block catchall when you don’t need to branch on outcome.
When all three are registered, `on_batch_completion` fires first, then exactly one of `on_batch_success` / `on_batch_failure`.
```ruby
class MyApp::DocumentSummaryBatchHandler < Raif::TaskBatchCompletionHandler
  on_batch_success do
    campaign = Campaign.find_by(id: batch.metadata["campaign_id"])
    next unless campaign

    tasks.select(&:completed?).each do |task|
      campaign.summaries.create!(
        document_id: task.run_with["document"].id,
        text: task.parsed_response["summary"]
      )
    end
  end

  on_batch_failure do
    campaign = Campaign.find_by(id: batch.metadata["campaign_id"])
    next unless campaign

    Airbrake.notify(
      "Document summary batch ##{batch.id} failed: #{batch.status}",
      params: { campaign_id: campaign.id, failure_reason: batch.failure_reason }
    )
  end
end
```
If your success and failure paths share most of their work, the catchall is fine too — the macros are independent and you can use any subset:
```ruby
class MyApp::SimpleBatchHandler < Raif::TaskBatchCompletionHandler
  on_batch_completion do
    next unless batch.successful?
    # ... aggregate ...
  end
end
```
Things to know about the handlers:
- The base class hydrates each child `Raif::ModelCompletion` through its source `Raif::Task#process_completion!` before any block runs. By the time blocks execute, every task is in its terminal state (`completed?` or `failed?`).
- Blocks are `instance_exec`‘d on a handler instance, so helper methods on the subclass are reachable:

  ```ruby
  class MyApp::DocumentSummaryBatchHandler < Raif::TaskBatchCompletionHandler
    on_batch_success do
      next if successful_tasks.empty?
      persist!(successful_tasks)
    end

    def successful_tasks
      tasks.select(&:completed?)
    end

    def persist!(completed_tasks)
      # ...
    end
  end
  ```

- Use `next` (not `return`) for early exit. Blocks are defined at class-body scope, so `return` would raise `LocalJumpError`.
- Per-task hydration errors are caught and logged so one bad task doesn’t block the rest of the batch. Errors raised from inside a registered block propagate to `Raif::PollModelCompletionBatchJob` – handle them yourself if you need different semantics.
If you’re attaching non-task completions to a batch via `Raif::Llm#build_pending_model_completion` directly, override `handle_batch_completion(batch)` instead of using these macros – you’ll skip the task hydration step and read children off `batch.raif_model_completions`.
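A sketch of that override (assumptions: `RawBatchHandler` and its return value are hypothetical, and the superclass is shown only as a comment so the snippet stands alone; in your app you would actually subclass `Raif::TaskBatchCompletionHandler`):

```ruby
# Sketch: a handler for a batch of bare completions, with no Raif::Task
# hydration. In a real app: class RawBatchHandler < Raif::TaskBatchCompletionHandler
class RawBatchHandler
  def self.handle_batch_completion(batch)
    batch.raif_model_completions.map do |completion|
      # read provider output straight off each child completion
      completion.raw_response
    end
  end
end
```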
## Cancellation
Cancellation is provider-side and asynchronous on both Anthropic and OpenAI: the request is acknowledged with a transitional status, and the next poll picks up the final canceled state. Force-failing remaining children and dispatching the handler happens through the same path as any other terminal status, so the handler always runs once.
```ruby
batch.cancel!
```

`cancel!` refuses to cancel a batch that’s already terminal or hasn’t been submitted yet (no `provider_batch_id`).
## Configuration
```ruby
Raif.configure do |config|
  # Backoff between polls of a non-terminal batch. The Nth poll waits
  # poll_schedule[N], clamping to the last entry once exhausted.
  config.model_completion_batch_poll_schedule = [
    60.seconds,
    2.minutes,
    5.minutes,
    10.minutes,
    30.minutes
  ]

  # Hard ceiling on any non-terminal batch. The hourly safety sweep
  # force-fails older batches so the workflow can advance.
  config.model_completion_batch_max_age = 26.hours

  # OpenAI-only: the completion_window passed when creating an OpenAI batch.
  # OpenAI currently only accepts "24h".
  config.open_ai_batch_completion_window = "24h"

  # Anthropic-only: beta header for the Messages Batches API.
  config.anthropic_message_batches_beta_header = "message-batches-2024-09-24"
end
```
## Schema and accessors
```ruby
batch.status             # one of: pending, submitted, in_progress, ended, canceled, expired, failed
batch.terminal?          # true once status is ended/canceled/expired/failed
batch.successful?        # true only when status == "ended"
batch.provider_batch_id  # the provider's id for the batch
batch.submitted_at
batch.ended_at
batch.failure_error
batch.failure_reason
batch.request_counts     # provider-reported per-status counts (jsonb)
batch.metadata           # whatever you put on it (jsonb)
batch.raif_model_completions
batch.tasks              # convenience: Raif::Task records attached via their child completions
batch.total_cost         # aggregated after results are applied
batch.prompt_token_cost
batch.output_token_cost
```
Costs reflect the provider’s batch discount – `apply_batch_result` halves the per-token rate before persisting it on each child completion, and `recalculate_costs!` rolls the children up to the batch.
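To make the discount arithmetic concrete, a toy calculation (the rate here is a made-up number, not any provider's actual pricing):

```ruby
full_rate_per_million = 3.00              # hypothetical $/1M input tokens, synchronous rate
batch_rate = full_rate_per_million / 2.0  # the ~50% batch discount
prompt_tokens = 1_000_000

prompt_token_cost = (prompt_tokens / 1_000_000.0) * batch_rate
# => 1.5 ($1.50 instead of $3.00 at the synchronous rate)
```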
## Testing
The synchronous path of a task is exercised by `Raif::Task.run`. The batch path can be exercised by building a batch + children with `Task.build_for_batch`, populating each child completion’s `raw_response` directly, marking it `completed!` (or `failed!`), and then calling `YourHandler.handle_batch_completion(batch)`. The handler’s hydration step will route each child through `Raif::Task#process_completion!` exactly as the polling job would, so the resulting tasks (and any side records your handler creates) match production output.
Read next: Images/Files/PDFs