Class: Raif::Utils::TransientRetry

Inherits:
Object
Defined in:
lib/raif/utils/transient_retry.rb

Overview

Retries a block on transient errors using exponential backoff.

Single source of truth for "retry the HTTP call on a network blip" across Raif's synchronous (Raif::Llm#perform_model_completion!) and batch (Raif::Concerns::Llms::*::BatchInference) paths.

Defaults to Raif.config.llm_request_max_retries and Raif.config.llm_request_retriable_exceptions, so the synchronous and batch paths pick up the same retry behavior when a host application tunes those settings.

Constant Summary

DEFAULT_BASE_DELAY =
3
DEFAULT_MAX_DELAY =
30

Class Method Summary

Class Method Details

.call(label:, max_retries: nil, retriable_exceptions: nil, base_delay: DEFAULT_BASE_DELAY, max_delay: DEFAULT_MAX_DELAY, on_retry: nil) { ... } ⇒ Object

Returns whatever the block returns on its successful attempt.

Parameters:

  • label (String)

    short identifier for log lines (e.g. "open_ai submit_batch upload"). Surfaces in retry/exhaustion log messages so the call site is visible without grepping.

  • max_retries (Integer) (defaults to: nil)

    retries permitted after the initial attempt. Defaults to Raif.config.llm_request_max_retries.

  • retriable_exceptions (Array<Class>) (defaults to: nil)

    exception classes that trigger a retry. Anything else raises immediately. Defaults to Raif.config.llm_request_retriable_exceptions.

  • base_delay (Numeric) (defaults to: DEFAULT_BASE_DELAY)

    seconds to sleep before the first retry. The delay doubles on each subsequent retry, up to max_delay.

  • max_delay (Numeric) (defaults to: DEFAULT_MAX_DELAY)

    cap for the exponential backoff in seconds.

  • on_retry (Proc, nil) (defaults to: nil)

    optional callback invoked before each sleep with (error, attempt, max_retries, delay). Use this to layer call-site bookkeeping on top of the default logging (e.g. incrementing a counter).
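
To make the interaction of base_delay and max_delay concrete, the sketch below reproduces the delay formula from the method body, `[base_delay * (2**(attempt - 1)), max_delay].min`, in isolation. `backoff_delays` is a hypothetical helper written for this illustration and is not part of Raif; its defaults simply copy DEFAULT_BASE_DELAY (3) and DEFAULT_MAX_DELAY (30).

```ruby
# Hypothetical helper, not part of Raif: lists the sleep duration that would
# precede each retry, using the same formula as the method body.
def backoff_delays(max_retries:, base_delay: 3, max_delay: 30)
  (1..max_retries).map do |attempt|
    # Doubles on every retry, then clamps at max_delay.
    [base_delay * (2**(attempt - 1)), max_delay].min
  end
end

backoff_delays(max_retries: 5)
# => [3, 6, 12, 24, 30]   (the fifth delay would be 48, capped at 30)

backoff_delays(max_retries: 4, base_delay: 1, max_delay: 5)
# => [1, 2, 4, 5]
```

With the default constants the cap only takes effect from the fifth retry onward, so lowering max_delay matters mainly when max_retries is large.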

Yields:

  • the block to execute. Re-yielded on each retry.

Returns:

  • whatever the block returns on its successful attempt.

Raises:

  • the original exception once retries are exhausted, or immediately for non-retriable exceptions.



# File 'lib/raif/utils/transient_retry.rb', line 33

def self.call(label:, max_retries: nil, retriable_exceptions: nil, base_delay: DEFAULT_BASE_DELAY, max_delay: DEFAULT_MAX_DELAY, on_retry: nil)
  max_retries ||= Raif.config.llm_request_max_retries
  retriable_exceptions ||= Raif.config.llm_request_retriable_exceptions
  retriable_exceptions = Array(retriable_exceptions)

  attempt = 0
  begin
    yield
  rescue *retriable_exceptions => e
    attempt += 1
    if attempt <= max_retries
      delay = [base_delay * (2**(attempt - 1)), max_delay].min
      on_retry&.call(e, attempt, max_retries, delay)
      Raif.logger.warn(
        "Raif::Utils::TransientRetry[#{label}]: retry #{attempt}/#{max_retries} " \
          "after #{e.class}: #{e.message}. Sleeping #{delay}s."
      )
      sleep_for(delay)
      retry
    end

    Raif.logger.error(
      "Raif::Utils::TransientRetry[#{label}]: exhausted #{max_retries} retries. " \
        "Last error: #{e.class}: #{e.message}"
    )
    raise
  end
end
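
The control flow above can be exercised without loading Raif. The following is a minimal sketch, not Raif's implementation: `DemoRetry`, `FlakyError`, and the `sleeper:` argument are all hypothetical names introduced here. The stand-in mirrors the rescue/retry structure and the on_retry argument order shown above, omits logging, and replaces the real sleep with an injectable no-op so the example finishes instantly.

```ruby
# Stand-in for illustration only; NOT Raif::Utils::TransientRetry.
# `sleeper:` is a hypothetical extra argument so no real sleeping happens.
module DemoRetry
  def self.call(max_retries:, retriable_exceptions:, base_delay: 3, max_delay: 30,
                on_retry: nil, sleeper: ->(_seconds) {})
    retriable = Array(retriable_exceptions)
    attempt = 0
    begin
      yield
    rescue *retriable => e
      attempt += 1
      raise if attempt > max_retries # retries exhausted: re-raise the original error

      delay = [base_delay * (2**(attempt - 1)), max_delay].min
      on_retry&.call(e, attempt, max_retries, delay)
      sleeper.call(delay)
      retry
    end
  end
end

class FlakyError < StandardError; end # hypothetical transient error

calls = 0
seen = [] # call-site bookkeeping collected through on_retry

result = DemoRetry.call(
  max_retries: 3,
  retriable_exceptions: [FlakyError],
  on_retry: ->(error, attempt, max, delay) { seen << [error.class, attempt, max, delay] }
) do
  calls += 1
  raise FlakyError, "network blip" if calls < 3

  :ok
end

# Errors outside retriable_exceptions are never rescued, so they propagate
# on the first attempt without invoking on_retry.
other_calls = 0
begin
  DemoRetry.call(max_retries: 3, retriable_exceptions: [FlakyError]) do
    other_calls += 1
    raise ArgumentError, "not transient"
  end
rescue ArgumentError
  # other_calls is 1 here
end
```

In this run the block fails twice and succeeds on its third attempt, so `result` is `:ok` and `seen` holds one entry per retry: `[FlakyError, 1, 3, 3]` and `[FlakyError, 2, 3, 6]`. Because max_retries counts retries after the initial attempt, a block that always fails is executed max_retries + 1 times before the error is re-raised.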