Class: Raif::ModelCompletion
- Inherits:
- ApplicationRecord (hierarchy: Raif::ModelCompletion < ApplicationRecord < Object)
- Includes:
- Concerns::BooleanTimestamp, Concerns::HasAvailableModelTools, Concerns::HasRuntimeDuration, Concerns::LlmResponseParsing, Concerns::ProviderManagedToolCalls
- Defined in:
- app/models/raif/model_completion.rb
Overview
Schema Information
Table name: raif_model_completions
id                              :bigint           not null, primary key
available_model_tools           :jsonb            not null
cache_creation_input_tokens     :integer
cache_read_input_tokens         :integer
citations                       :jsonb
completed_at                    :datetime
completion_tokens               :integer
failed_at                       :datetime
failure_error                   :string
failure_reason                  :text
failure_response_body           :text
failure_response_status         :integer
llm_model_key                   :string           not null
max_completion_tokens           :integer
messages                        :jsonb            not null
model_api_name                  :string           not null
output_token_cost               :decimal(10, 6)
prompt_token_cost               :decimal(10, 6)
prompt_tokens                   :integer
raw_response                    :text
response_array                  :jsonb
response_format                 :integer          default("text"), not null
response_format_parameter       :string
response_tool_calls             :jsonb
retry_count                     :integer          default(0), not null
source_type                     :string
started_at                      :datetime
stream_response                 :boolean          default(FALSE), not null
system_prompt                   :text
temperature                     :decimal(5, 3)
tool_choice                     :string
total_cost                      :decimal(10, 6)
total_tokens                    :integer
created_at                      :datetime         not null
updated_at                      :datetime         not null
batch_custom_id                 :string
raif_model_completion_batch_id  :bigint
response_id                     :string
source_id                       :bigint
Indexes
index_raif_model_completions_on_batch_custom_id (batch_custom_id)
index_raif_model_completions_on_batch_id_and_custom_id (raif_model_completion_batch_id,batch_custom_id) UNIQUE WHERE (raif_model_completion_batch_id IS NOT NULL)
index_raif_model_completions_on_completed_at (completed_at)
index_raif_model_completions_on_created_at (created_at)
index_raif_model_completions_on_failed_at (failed_at)
index_raif_model_completions_on_raif_model_completion_batch_id (raif_model_completion_batch_id)
index_raif_model_completions_on_source (source_type,source_id)
index_raif_model_completions_on_started_at (started_at)
Foreign Keys
fk_rails_... (raif_model_completion_batch_id => raif_model_completion_batches.id)
Constant Summary
- FAILURE_RESPONSE_BODY_MAX_CHARS =
Maximum number of characters of an upstream HTTP body we persist on failure. The body usually carries the provider's actual error reason (e.g. OpenAI/Anthropic structured error JSON), which failure_reason cannot fit in 255 chars. 4 KB is enough to capture realistic error payloads without bloating storage.
4_000
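As a rough illustration of the cap described above, the following plain-Ruby sketch truncates a response body to the limit. The helper name `truncate_failure_body` is hypothetical and not part of Raif; it approximates the `body.present?` check and `String#first` call (both ActiveSupport) with core Ruby.

```ruby
# Mirrors the documented constant; redefined here so the sketch is standalone.
FAILURE_RESPONSE_BODY_MAX_CHARS = 4_000

# Hypothetical helper: returns nil for blank bodies, otherwise at most
# FAILURE_RESPONSE_BODY_MAX_CHARS characters of the body.
def truncate_failure_body(body)
  text = body.to_s
  # Approximation of ActiveSupport's `present?` for strings.
  return nil if text.strip.empty?

  text[0, FAILURE_RESPONSE_BODY_MAX_CHARS]
end

puts truncate_failure_body("x" * 10_000).length
puts truncate_failure_body('{"error":{"message":"rate limited"}}')
```

The limit counts characters rather than bytes, so a body containing multi-byte characters may occupy more than 4,000 bytes in storage.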
Constants included from Concerns::LlmResponseParsing
Concerns::LlmResponseParsing::ASCII_CONTROL_CHARS
Instance Attribute Summary
- #anthropic_prompt_caching_enabled ⇒ Object
Returns the value of attribute anthropic_prompt_caching_enabled.
- #bedrock_prompt_caching_enabled ⇒ Object
Returns the value of attribute bedrock_prompt_caching_enabled.
Instance Method Summary
- #calculate_costs ⇒ Object
- #json_response_schema ⇒ Object
- #pending? ⇒ Boolean
- #record_failure!(exception) ⇒ Object
- #set_total_tokens ⇒ Object
Methods included from Concerns::ProviderManagedToolCalls
#provider_managed_tool_calls, #sanitized_citations
Methods included from Concerns::HasRuntimeDuration
#runtime_duration, #runtime_duration_seconds, #runtime_ended_at
Methods included from Concerns::HasAvailableModelTools
Methods included from Concerns::LlmResponseParsing
#parse_html_response, #parse_json_response, #parsed_response
Instance Attribute Details
#anthropic_prompt_caching_enabled ⇒ Object
Returns the value of attribute anthropic_prompt_caching_enabled.
# File 'app/models/raif/model_completion.rb', line 69
def anthropic_prompt_caching_enabled
  @anthropic_prompt_caching_enabled
end
#bedrock_prompt_caching_enabled ⇒ Object
Returns the value of attribute bedrock_prompt_caching_enabled.
# File 'app/models/raif/model_completion.rb', line 69
def bedrock_prompt_caching_enabled
  @bedrock_prompt_caching_enabled
end
Instance Method Details
#calculate_costs ⇒ Object
# File 'app/models/raif/model_completion.rb', line 111
def calculate_costs
  # Each retry resends the same prompt, so the provider charges input tokens
  # for every attempt. Factor in retry_count to reflect actual billing.
  total_attempts = (retry_count || 0) + 1

  if prompt_tokens.present? && llm_config[:input_token_cost].present?
    self.prompt_token_cost = calculate_prompt_token_cost(total_attempts)
  end

  if completion_tokens.present? && llm_config[:output_token_cost].present?
    self.output_token_cost = llm_config[:output_token_cost] * completion_tokens
  end

  if prompt_token_cost.present? || output_token_cost.present?
    self.total_cost = (prompt_token_cost || 0) + (output_token_cost || 0)
  end

  apply_batch_inference_discount if raif_model_completion_batch_id.present?
end
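To make the retry arithmetic concrete, here is a minimal standalone sketch. It assumes the prompt cost is simply price × tokens × attempts; the real `calculate_prompt_token_cost` is private and not shown on this page, so it may differ (for example by pricing cached tokens separately). The prices, the `LLM_CONFIG` hash, and the `estimate_costs` helper are all hypothetical, and the batch discount is left out.

```ruby
require "bigdecimal"

# Hypothetical per-token prices; in Raif these come from llm_config.
LLM_CONFIG = {
  input_token_cost: BigDecimal("0.000003"),
  output_token_cost: BigDecimal("0.000015")
}.freeze

# Assumption: prompt cost scales linearly with the number of attempts.
# Output tokens are charged once, as in the method shown above.
def estimate_costs(prompt_tokens:, completion_tokens:, retry_count: 0, config: LLM_CONFIG)
  total_attempts = (retry_count || 0) + 1

  prompt_cost = config[:input_token_cost] * prompt_tokens * total_attempts if prompt_tokens
  output_cost = config[:output_token_cost] * completion_tokens if completion_tokens
  total = (prompt_cost || 0) + (output_cost || 0) if prompt_cost || output_cost

  { prompt_token_cost: prompt_cost, output_token_cost: output_cost, total_cost: total }
end

costs = estimate_costs(prompt_tokens: 1_000, completion_tokens: 200, retry_count: 1)
puts costs[:total_cost].to_s("F")
```

With one retry, the 1,000 prompt tokens are billed twice while the 200 completion tokens are billed once. `BigDecimal` is used so the sums do not pick up floating-point error, in line with the `decimal(10, 6)` columns.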
#json_response_schema ⇒ Object
# File 'app/models/raif/model_completion.rb', line 103
def json_response_schema
  source.json_response_schema if source&.respond_to?(:json_response_schema)
end
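The delegation above relies on duck typing: the schema is only requested when the source exposes one. A small sketch of the same pattern, using hypothetical `TaskWithSchema` and `PlainSource` structs in place of real source records:

```ruby
# Hypothetical source types standing in for whatever the polymorphic
# `source` association points at.
TaskWithSchema = Struct.new(:name) do
  def json_response_schema
    { type: "object", properties: { title: { type: "string" } } }
  end
end

PlainSource = Struct.new(:name)

# Same guard as the method above: returns nil when there is no source or
# the source does not define json_response_schema.
def schema_for(source)
  source.json_response_schema if source&.respond_to?(:json_response_schema)
end

p schema_for(TaskWithSchema.new("summarize"))
p schema_for(PlainSource.new("chat"))
p schema_for(nil)
```

Callers therefore need to handle a `nil` schema rather than expecting an exception when the source has none.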
#pending? ⇒ Boolean
# File 'app/models/raif/model_completion.rb', line 86
def pending?
  started_at.nil? && completed_at.nil? && failed_at.nil?
end
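A record counts as pending only while all three timestamps are unset. The sketch below reproduces that predicate on a hypothetical `CompletionState` struct so the transitions can be tried without a database:

```ruby
# Hypothetical stand-in for the three timestamp columns.
CompletionState = Struct.new(:started_at, :completed_at, :failed_at, keyword_init: true) do
  # Same predicate as the method above.
  def pending?
    started_at.nil? && completed_at.nil? && failed_at.nil?
  end
end

fresh = CompletionState.new
running = CompletionState.new(started_at: Time.now)
failed = CompletionState.new(failed_at: Time.now)

puts fresh.pending?
puts running.pending?
puts failed.pending?
```

Note that a completion that has started but not yet finished is no longer pending under this definition.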
#record_failure!(exception) ⇒ Object
# File 'app/models/raif/model_completion.rb', line 138
def record_failure!(exception)
  self.failed_at = Time.current
  self.failure_error = exception.class.name
  self.failure_reason = exception.message.truncate(255)

  # Always clear before re-populating so a second call with a different
  # exception kind doesn't leave stale response metadata attached.
  self.failure_response_status = nil
  self.failure_response_body = nil

  # Faraday errors carry the provider's HTTP status and response body;
  # the latter is where the actual provider-side error reason lives. Both
  # are nil when the failure happened before a response was received
  # (DNS/connection refused/timeout).
  if exception.is_a?(Faraday::Error)
    self.failure_response_status = exception.response_status
    body = exception.response_body
    self.failure_response_body = body.to_s.first(FAILURE_RESPONSE_BODY_MAX_CHARS) if body.present?
  end

  save!
end
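The clear-then-repopulate step is easiest to see when the same record fails twice with different exception kinds. The sketch below is a plain-Ruby approximation: `FakeHttpError` is a hypothetical stand-in for `Faraday::Error`, `FailureRecord` stands in for the model, nothing is persisted, and `String#[]` replaces ActiveSupport's `truncate` and `first` (so no "..." suffix is appended).

```ruby
# Hypothetical stand-in for Faraday::Error, exposing the same two readers.
class FakeHttpError < StandardError
  attr_reader :response_status, :response_body

  def initialize(message, status: nil, body: nil)
    super(message)
    @response_status = status
    @response_body = body
  end
end

# Hypothetical in-memory stand-in for the model's failure columns.
class FailureRecord
  MAX_BODY_CHARS = 4_000

  attr_reader :failed_at, :failure_error, :failure_reason,
              :failure_response_status, :failure_response_body

  def record_failure(exception)
    @failed_at = Time.now
    @failure_error = exception.class.name
    @failure_reason = exception.message[0, 255]

    # Clear first so an earlier HTTP failure cannot leak into this one.
    @failure_response_status = nil
    @failure_response_body = nil

    if exception.is_a?(FakeHttpError)
      @failure_response_status = exception.response_status
      body = exception.response_body.to_s
      @failure_response_body = body[0, MAX_BODY_CHARS] unless body.strip.empty?
    end

    self
  end
end

record = FailureRecord.new
record.record_failure(FakeHttpError.new("rate limited", status: 429, body: "x" * 5_000))
puts record.failure_response_status
record.record_failure(RuntimeError.new("parse error"))
p record.failure_response_status
```

After the second call the status and body are `nil` again, because the non-HTTP exception carries no response metadata.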
#set_total_tokens ⇒ Object
# File 'app/models/raif/model_completion.rb', line 107
def set_total_tokens
  self.total_tokens ||= completion_tokens.present? && prompt_tokens.present? ? completion_tokens + prompt_tokens : nil
end
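The one-liner above depends on operator precedence: `&&` binds tighter than the ternary, and `||=` binds loosest, so an existing total is never overwritten. A standalone sketch of the same expression, using a hypothetical `resolve_total_tokens` helper and plain truthiness in place of `present?` (equivalent for integers and `nil`):

```ruby
# Hypothetical helper mirroring the expression in set_total_tokens.
# Parsed as: existing ||= ((prompt && completion) ? sum : nil)
def resolve_total_tokens(existing, prompt_tokens, completion_tokens)
  existing ||= prompt_tokens && completion_tokens ? completion_tokens + prompt_tokens : nil
  existing
end

p resolve_total_tokens(nil, 100, 20)  # both counts known: sum is used
p resolve_total_tokens(500, 100, 20)  # provider-reported total is kept
p resolve_total_tokens(nil, nil, 20)  # a missing count leaves the total nil
```

Keeping an existing value means a total reported directly by the provider takes priority over the locally computed sum.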