Class: Raif::ModelCompletion

Inherits:
ApplicationRecord
  • Object
Includes:
Concerns::BooleanTimestamp, Concerns::HasAvailableModelTools, Concerns::HasRuntimeDuration, Concerns::LlmResponseParsing, Concerns::ProviderManagedToolCalls
Defined in:
app/models/raif/model_completion.rb

Overview

Schema Information

Table name: raif_model_completions

id                              :bigint           not null, primary key
available_model_tools           :jsonb            not null
cache_creation_input_tokens     :integer
cache_read_input_tokens         :integer
citations                       :jsonb
completed_at                    :datetime
completion_tokens               :integer
failed_at                       :datetime
failure_error                   :string
failure_reason                  :text
failure_response_body           :text
failure_response_status         :integer
llm_model_key                   :string           not null
max_completion_tokens           :integer
messages                        :jsonb            not null
model_api_name                  :string           not null
output_token_cost               :decimal(10, 6)
prompt_token_cost               :decimal(10, 6)
prompt_tokens                   :integer
raw_response                    :text
response_array                  :jsonb
response_format                 :integer          default("text"), not null
response_format_parameter       :string
response_tool_calls             :jsonb
retry_count                     :integer          default(0), not null
source_type                     :string
started_at                      :datetime
stream_response                 :boolean          default(FALSE), not null
system_prompt                   :text
temperature                     :decimal(5, 3)
tool_choice                     :string
total_cost                      :decimal(10, 6)
total_tokens                    :integer
created_at                      :datetime         not null
updated_at                      :datetime         not null
batch_custom_id                 :string
raif_model_completion_batch_id  :bigint
response_id                     :string
source_id                       :bigint

Indexes

index_raif_model_completions_on_batch_custom_id (batch_custom_id)
index_raif_model_completions_on_batch_id_and_custom_id (raif_model_completion_batch_id,batch_custom_id) UNIQUE WHERE (raif_model_completion_batch_id IS NOT NULL)
index_raif_model_completions_on_completed_at (completed_at)
index_raif_model_completions_on_created_at (created_at)
index_raif_model_completions_on_failed_at (failed_at)
index_raif_model_completions_on_raif_model_completion_batch_id (raif_model_completion_batch_id)
index_raif_model_completions_on_source (source_type,source_id)
index_raif_model_completions_on_started_at (started_at)

Foreign Keys

fk_rails_... (raif_model_completion_batch_id => raif_model_completion_batches.id)

Constant Summary

FAILURE_RESPONSE_BODY_MAX_CHARS = 4_000

Maximum number of characters of an upstream HTTP response body that is persisted on failure. The body usually carries the provider's actual error reason (e.g. OpenAI/Anthropic structured error JSON), which does not fit in failure_reason because that value is truncated to 255 characters. 4,000 characters (roughly 4 KB) is enough to capture realistic error payloads without bloating storage.
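
The cap can be illustrated with a minimal plain-Ruby sketch. The helper name `truncate_failure_body` is hypothetical and not part of Raif; it only approximates the `body.to_s.first(FAILURE_RESPONSE_BODY_MAX_CHARS) if body.present?` expression used by `record_failure!`, replacing the ActiveSupport helpers with core Ruby.

```ruby
# Same value as the documented constant, redefined here so the sketch runs standalone.
FAILURE_RESPONSE_BODY_MAX_CHARS = 4_000

# Hypothetical helper: keeps at most FAILURE_RESPONSE_BODY_MAX_CHARS characters.
# Blank bodies (nil, empty, whitespace-only) are stored as nil, which is an
# approximation of ActiveSupport's `present?` check.
def truncate_failure_body(body)
  text = body.to_s
  return nil if text.strip.empty?

  text[0, FAILURE_RESPONSE_BODY_MAX_CHARS]
end

long_body  = "x" * 10_000
short_body = '{"error":{"type":"rate_limit_error"}}'

puts truncate_failure_body(long_body).length
puts truncate_failure_body(short_body)
puts truncate_failure_body(nil).inspect
```

Short provider error payloads pass through unchanged; only oversized bodies (for example, an HTML error page from a proxy) are cut.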

Constants included from Concerns::LlmResponseParsing

Concerns::LlmResponseParsing::ASCII_CONTROL_CHARS

Instance Attribute Summary

Instance Method Summary

Methods included from Concerns::ProviderManagedToolCalls

#provider_managed_tool_calls, #sanitized_citations

Methods included from Concerns::HasRuntimeDuration

#runtime_duration, #runtime_duration_seconds, #runtime_ended_at

Methods included from Concerns::HasAvailableModelTools

#available_model_tools_map

Methods included from Concerns::LlmResponseParsing

#parse_html_response, #parse_json_response, #parsed_response

Instance Attribute Details

#anthropic_prompt_caching_enabled ⇒ Object

Returns the value of attribute anthropic_prompt_caching_enabled.



# File 'app/models/raif/model_completion.rb', line 69

def anthropic_prompt_caching_enabled
  @anthropic_prompt_caching_enabled
end

#bedrock_prompt_caching_enabled ⇒ Object

Returns the value of attribute bedrock_prompt_caching_enabled.



# File 'app/models/raif/model_completion.rb', line 69

def bedrock_prompt_caching_enabled
  @bedrock_prompt_caching_enabled
end

Instance Method Details

#calculate_costs ⇒ Object



# File 'app/models/raif/model_completion.rb', line 111

def calculate_costs
  # Each retry resends the same prompt, so the provider charges input tokens
  # for every attempt. Factor in retry_count to reflect actual billing.
  total_attempts = (retry_count || 0) + 1

  if prompt_tokens.present? && llm_config[:input_token_cost].present?
    self.prompt_token_cost = calculate_prompt_token_cost(total_attempts)
  end

  if completion_tokens.present? && llm_config[:output_token_cost].present?
    self.output_token_cost = llm_config[:output_token_cost] * completion_tokens
  end

  if prompt_token_cost.present? || output_token_cost.present?
    self.total_cost = (prompt_token_cost || 0) + (output_token_cost || 0)
  end

  apply_batch_inference_discount if raif_model_completion_batch_id.present?
end
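
The arithmetic in `calculate_costs` can be worked through with a small standalone sketch. The per-token prices and the `estimate_costs` helper below are hypothetical. The sketch also assumes that prompt cost scales linearly with the number of attempts; the real `calculate_prompt_token_cost` is not shown in this page and may differ (for example, by pricing cached tokens separately). The batch inference discount is left out.

```ruby
# Hypothetical per-token prices in USD; real values come from llm_config.
INPUT_TOKEN_COST  = Rational(3, 1_000_000)   # $3 per million input tokens
OUTPUT_TOKEN_COST = Rational(15, 1_000_000)  # $15 per million output tokens

# Hypothetical helper mirroring the structure of calculate_costs.
# Assumption: input tokens are billed once per attempt, output tokens once.
def estimate_costs(prompt_tokens:, completion_tokens:, retry_count: 0)
  total_attempts = (retry_count || 0) + 1
  prompt_cost = INPUT_TOKEN_COST * prompt_tokens * total_attempts
  output_cost = OUTPUT_TOKEN_COST * completion_tokens

  { prompt: prompt_cost, output: output_cost, total: prompt_cost + output_cost }
end

# 1,000 prompt tokens sent three times (two retries), 200 completion tokens.
costs = estimate_costs(prompt_tokens: 1_000, completion_tokens: 200, retry_count: 2)
puts format("prompt=%.6f output=%.6f total=%.6f",
            costs[:prompt].to_f, costs[:output].to_f, costs[:total].to_f)
```

Under these assumed prices, two retries triple the prompt cost (3 x $0.003 = $0.009) while the output cost stays at $0.003, for a total of $0.012.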

#json_response_schema ⇒ Object



# File 'app/models/raif/model_completion.rb', line 103

def json_response_schema
  source.json_response_schema if source&.respond_to?(:json_response_schema)
end
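
The delegation in `json_response_schema` is duck-typed: the schema is only requested from a source that exists and responds to the method. A minimal sketch of that pattern, using hypothetical stand-in source classes rather than real Raif models:

```ruby
# Hypothetical sources: one exposes a schema, the other does not.
SchemaSource = Struct.new(:json_response_schema)
PlainSource  = Struct.new(:name)

# Same guard as the documented method, extracted into a free function.
def schema_for(source)
  source.json_response_schema if source&.respond_to?(:json_response_schema)
end

schema = { type: "object", properties: { title: { type: "string" } } }

puts schema_for(SchemaSource.new(schema)).inspect
puts schema_for(PlainSource.new("no schema")).inspect
puts schema_for(nil).inspect
```

A missing source and a source without the method both yield `nil`, so callers need only a single nil check.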

#pending? ⇒ Boolean

Returns:

  • (Boolean)


# File 'app/models/raif/model_completion.rb', line 86

def pending?
  started_at.nil? && completed_at.nil? && failed_at.nil?
end
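
A completion counts as pending only while all three lifecycle timestamps are unset. The sketch below reproduces the predicate on a hypothetical `CompletionState` struct that stands in for the `started_at`, `completed_at`, and `failed_at` columns.

```ruby
# Hypothetical stand-in for the three timestamp columns on the model.
CompletionState = Struct.new(:started_at, :completed_at, :failed_at, keyword_init: true) do
  # Same condition as Raif::ModelCompletion#pending?
  def pending?
    started_at.nil? && completed_at.nil? && failed_at.nil?
  end
end

fresh   = CompletionState.new
running = CompletionState.new(started_at: Time.now)
failed  = CompletionState.new(started_at: Time.now, failed_at: Time.now)

puts fresh.pending?
puts running.pending?
puts failed.pending?
```

Only the freshly built record reports `true`; setting any one of the timestamps moves it out of the pending state.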

#record_failure!(exception) ⇒ Object



# File 'app/models/raif/model_completion.rb', line 138

def record_failure!(exception)
  self.failed_at = Time.current
  self.failure_error = exception.class.name
  self.failure_reason = exception.message.truncate(255)
  # Always clear before re-populating so a second call with a different
  # exception kind doesn't leave stale response metadata attached.
  self.failure_response_status = nil
  self.failure_response_body = nil

  # Faraday errors carry the provider's HTTP status and response body —
  # the latter is where the actual provider-side error reason lives. Both
  # are nil when the failure happened before a response was received
  # (DNS/connection refused/timeout).
  if exception.is_a?(Faraday::Error)
    self.failure_response_status = exception.response_status
    body = exception.response_body
    self.failure_response_body = body.to_s.first(FAILURE_RESPONSE_BODY_MAX_CHARS) if body.present?
  end

  save!
end
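
The clear-then-repopulate step matters when `record_failure!` runs more than once on the same record. The sketch below shows that behavior without Rails or Faraday. `FakeHttpError` and `FailureLog` are hypothetical stand-ins for `Faraday::Error` and the `failure_*` columns, and the plain slice used for the reason only approximates ActiveSupport's `truncate(255)`, which also appends an omission marker.

```ruby
# Hypothetical stand-in for Faraday::Error: carries an HTTP status and body.
class FakeHttpError < StandardError
  attr_reader :response_status, :response_body

  def initialize(message, status: nil, body: nil)
    super(message)
    @response_status = status
    @response_body = body
  end
end

# Hypothetical stand-in for the failure_* columns on the model.
FailureLog = Struct.new(:error, :reason, :status, :body) do
  def record(exception)
    self.error = exception.class.name
    self.reason = exception.message[0, 255]
    # Clear first so an earlier HTTP failure cannot leak into this one.
    self.status = nil
    self.body = nil
    return self unless exception.is_a?(FakeHttpError)

    self.status = exception.response_status
    self.body = exception.response_body&.slice(0, 4_000)
    self
  end
end

log = FailureLog.new
log.record(FakeHttpError.new("rate limited", status: 429, body: '{"error":"slow down"}'))
puts log.to_a.inspect

# A second, non-HTTP failure must not keep the stale 429 metadata.
log.record(RuntimeError.new("connection reset"))
puts log.to_a.inspect
```

After the second call the status and body are `nil` again, so the stored metadata always describes the most recent failure.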

#set_total_tokens ⇒ Object



# File 'app/models/raif/model_completion.rb', line 107

def set_total_tokens
  self.total_tokens ||= completion_tokens.present? && prompt_tokens.present? ? completion_tokens + prompt_tokens : nil
end
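
The one-liner in `set_total_tokens` relies on Ruby operator precedence: `&&` binds tighter than the ternary, which in turn binds tighter than `||=`. The sketch below makes that grouping explicit on a hypothetical `TokenCounts` struct and uses plain truthiness in place of ActiveSupport's `present?`.

```ruby
# Hypothetical stand-in for the token columns on the model.
TokenCounts = Struct.new(:prompt_tokens, :completion_tokens, :total_tokens, keyword_init: true) do
  def set_total_tokens
    # Explicit grouping of the documented expression:
    # total_tokens ||= ((completion && prompt) ? completion + prompt : nil)
    self.total_tokens ||= (completion_tokens && prompt_tokens ? completion_tokens + prompt_tokens : nil)
  end
end

computed = TokenCounts.new(prompt_tokens: 1_000, completion_tokens: 200)
computed.set_total_tokens

partial = TokenCounts.new(prompt_tokens: 1_000)
partial.set_total_tokens

reported = TokenCounts.new(prompt_tokens: 1_000, completion_tokens: 200, total_tokens: 1_350)
reported.set_total_tokens

puts computed.total_tokens.inspect
puts partial.total_tokens.inspect
puts reported.total_tokens.inspect
```

A total already present on the record (for example, one reported by the provider) is preserved, the sum is filled in only when both counts exist, and a partial count leaves the total as `nil`.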