Class: Raif::Evals::ScoringRubric

Inherits:
Object
Defined in:
lib/raif/evals/scoring_rubric.rb

Overview

ScoringRubric provides a standardized way to define evaluation criteria with multiple scoring levels. Each level can define either a score range or a single score value, along with descriptive text explaining what qualifies for that score.

Examples:

Creating a custom rubric

rubric = ScoringRubric.new(
  name: :technical_accuracy,
  description: "Evaluates technical correctness and precision",
  levels: [
    { score_range: (9..10), description: "Technically perfect with no errors" },
    { score_range: (7..8), description: "Mostly correct with minor technical issues" },
    { score_range: (5..6), description: "Generally correct but some technical problems" },
    { score_range: (3..4), description: "Significant technical errors present" },
    { score_range: (0..2), description: "Technically incorrect or misleading" }
  ]
)

Integer scoring levels

rubric = ScoringRubric.new(
  name: :technical_accuracy,
  description: "Evaluates technical correctness and precision",
  levels: [
    { score: 5, description: "Technically perfect with no errors" },
    { score: 4, description: "Mostly correct with minor technical issues" },
    { score: 3, description: "Generally correct but some technical problems" },
    { score: 2, description: "Significant technical errors present" },
    { score: 1, description: "Mostly incorrect or misleading" },
    { score: 0, description: "Completely incorrect or misleading" }
  ]
)

Using built-in rubrics

accuracy_rubric = ScoringRubric.accuracy
helpfulness_rubric = ScoringRubric.helpfulness
clarity_rubric = ScoringRubric.clarity

Constructor Details

#initialize(name:, description:, levels:) ⇒ ScoringRubric

Creates a new ScoringRubric with the specified criteria.

Parameters:

  • name (Symbol)

    Identifier for this rubric (e.g., :accuracy, :helpfulness)

  • description (String)

    Human-readable description of what this rubric evaluates

  • levels (Array<Hash>)

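    Array of scoring level definitions. Each level must contain either :score (Integer) or :score_range (Range), plus :description (String). The constructor stores the array as given; a level with neither key raises ArgumentError only when #to_prompt is called.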



# File 'lib/raif/evals/scoring_rubric.rb', line 55

def initialize(name:, description:, levels:)
  @name = name
  @description = description
  @levels = levels
end
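
The constructor listing above stores its arguments without checking them, so a malformed level goes unnoticed until the rubric is rendered. The sketch below shows one way to check levels up front. It assumes a hypothetical helper named validate_levels!, which is not part of Raif, and it applies only the rules stated for the levels parameter.

```ruby
# Hypothetical helper, not part of Raif: checks level hashes before they are
# passed to ScoringRubric.new, using the rules stated for the levels parameter.
def validate_levels!(levels)
  levels.each_with_index do |level, index|
    has_score = level[:score].is_a?(Integer)
    has_range = level[:score_range].is_a?(Range)

    unless has_score || has_range
      raise ArgumentError, "level #{index} needs :score (Integer) or :score_range (Range)"
    end

    unless level[:description].is_a?(String)
      raise ArgumentError, "level #{index} needs a :description (String)"
    end
  end

  levels
end

# Returns the same array when every level is well-formed.
levels = validate_levels!([
  { score_range: (4..5), description: "Meets the bar" },
  { score: 0, description: "Does not meet the bar" }
])
```

The returned array can then be passed as the levels: argument, so a typo in a level definition fails at construction time rather than at prompt time.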

Instance Attribute Details

#description ⇒ String (readonly)

Returns a human-readable description of what this rubric evaluates.

Returns:

  • (String)

    Human-readable description of what this rubric evaluates



# File 'lib/raif/evals/scoring_rubric.rb', line 45

def description
  @description
end

#levels ⇒ Array<Hash> (readonly)

Returns the array of scoring level definitions.

Returns:

  • (Array<Hash>)

    Array of scoring level definitions



# File 'lib/raif/evals/scoring_rubric.rb', line 47

def levels
  @levels
end
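
Because #levels returns plain hashes, callers can inspect them directly. The sketch below maps a numeric score back to the level that covers it. The helper name level_for is hypothetical and not part of Raif, and the hashes are written inline so the example runs without loading the library.

```ruby
# Hypothetical helper, not part of Raif: returns the level hash whose :score
# or :score_range covers the given score, or nil when no level matches.
def level_for(levels, score)
  levels.find do |level|
    if level.key?(:score)
      level[:score] == score
    else
      level[:score_range].is_a?(Range) && level[:score_range].cover?(score)
    end
  end
end

# Plain hashes in the same shape that ScoringRubric#levels returns.
levels = [
  { score_range: (9..10), description: "Technically perfect with no errors" },
  { score_range: (7..8), description: "Mostly correct with minor technical issues" },
  { score: 0, description: "Technically incorrect or misleading" }
]

matched = level_for(levels, 8)
puts matched[:description]
```

With a real rubric, the first argument would be rubric.levels.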

#name ⇒ Symbol (readonly)

Returns the rubric’s identifier name.

Returns:

  • (Symbol)

    The rubric’s identifier name



# File 'lib/raif/evals/scoring_rubric.rb', line 43

def name
  @name
end

Class Method Details

.accuracy ⇒ ScoringRubric

Creates a rubric for evaluating factual accuracy and correctness.

This rubric focuses on whether information is factually correct, precise, and free from errors or misconceptions.

Examples:

rubric = ScoringRubric.accuracy
expect_llm_judge_score(response, scoring_rubric: rubric, min_passing_score: 4)

Returns:

  • (ScoringRubric)

    Pre-configured accuracy rubric (1-5 scale)



# File 'lib/raif/evals/scoring_rubric.rb', line 110

def accuracy
  new(
    name: :accuracy,
    description: "Evaluates factual correctness and precision",
    levels: [
      { score: 5, description: "Completely accurate with no errors" },
      { score: 4, description: "Mostly accurate with minor imprecisions" },
      { score: 3, description: "Generally accurate but some notable errors" },
      { score: 2, description: "Significant inaccuracies present" },
      { score: 1, description: "Mostly or entirely inaccurate" }
    ]
  )
end

.clarity ⇒ ScoringRubric

Creates a rubric for evaluating clarity and comprehensibility.

This rubric focuses on how easy content is to understand, whether it’s well-organized, and if the language is appropriate for the audience.

Examples:

rubric = ScoringRubric.clarity
expect_llm_judge_score(response, scoring_rubric: rubric, min_passing_score: 4)

Returns:

  • (ScoringRubric)

    Pre-configured clarity rubric (1-5 scale)



# File 'lib/raif/evals/scoring_rubric.rb', line 158

def clarity
  new(
    name: :clarity,
    description: "Evaluates clarity and comprehensibility",
    levels: [
      { score: 5, description: "Crystal clear and easy to understand" },
      { score: 4, description: "Clear with minor ambiguities" },
      { score: 3, description: "Generally clear but some confusion" },
      { score: 2, description: "Unclear in significant ways" },
      { score: 1, description: "Very unclear or incomprehensible" }
    ]
  )
end

.helpfulness ⇒ ScoringRubric

Creates a rubric for evaluating how well content addresses user needs.

This rubric assesses whether the response is useful, relevant, and effectively helps the user accomplish their goals.

Examples:

rubric = ScoringRubric.helpfulness
expect_llm_judge_score(response, scoring_rubric: rubric, min_passing_score: 4)

Returns:

  • (ScoringRubric)

    Pre-configured helpfulness rubric (1-5 scale)



# File 'lib/raif/evals/scoring_rubric.rb', line 134

def helpfulness
  new(
    name: :helpfulness,
    description: "Evaluates how well the response addresses user needs",
    levels: [
      { score: 5, description: "Extremely helpful, fully addresses the need" },
      { score: 4, description: "Very helpful with good coverage" },
      { score: 3, description: "Moderately helpful but missing some aspects" },
      { score: 2, description: "Somewhat helpful but significant gaps" },
      { score: 1, description: "Not helpful or misleading" }
    ]
  )
end

Instance Method Details

#to_prompt ⇒ String

Converts the rubric into a formatted string suitable for LLM prompts.

The output includes the rubric description followed by a detailed breakdown of all scoring levels with their criteria.

Examples:

Output format

"Evaluates factual correctness and precision

Scoring levels:
- 9-10: Completely accurate with no errors
- 7-8: Mostly accurate with minor imprecisions
- 5-6: Generally accurate but some notable errors"

Returns:

  • (String)

    Formatted rubric text ready for inclusion in prompts

Raises:

  • (ArgumentError)

    If a level has no :score key and its :score_range is not a Range



# File 'lib/raif/evals/scoring_rubric.rb', line 77

def to_prompt
  prompt = "#{description}\n\nScoring levels:\n"

  levels.each do |level|
    if level.key?(:score)
      score = level[:score]
      prompt += "- #{score}: #{level[:description]}\n"
    else
      range = level[:score_range]
      min, max = case range
      when Range
        [range.begin, range.exclude_end? ? range.end - 1 : range.end]
      else
        raise ArgumentError, "level must include :score or :score_range (Range)"
      end
      prompt += "- #{min}-#{max}: #{level[:description]}\n"
    end
  end

  prompt.strip
end
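
The listing above renders single scores as-is and collapses ranges to "min-max", subtracting one from the upper bound when the range excludes its end. The sketch below exercises both branches. To stay runnable without Raif, it uses a stand-in class (StandInRubric, a name assumed here) that copies the initialize and to_prompt logic shown on this page. With Raif loaded, Raif::Evals::ScoringRubric would take its place.

```ruby
# Stand-in that mirrors the initialize and to_prompt listings on this page so
# the example runs without loading Raif. Not the library class itself.
class StandInRubric
  attr_reader :name, :description, :levels

  def initialize(name:, description:, levels:)
    @name = name
    @description = description
    @levels = levels
  end

  def to_prompt
    prompt = "#{description}\n\nScoring levels:\n"

    levels.each do |level|
      if level.key?(:score)
        prompt += "- #{level[:score]}: #{level[:description]}\n"
      else
        range = level[:score_range]
        raise ArgumentError, "level must include :score or :score_range (Range)" unless range.is_a?(Range)

        # An exclusive range such as (3...5) is reported by its last included integer.
        max = range.exclude_end? ? range.end - 1 : range.end
        prompt += "- #{range.begin}-#{max}: #{level[:description]}\n"
      end
    end

    prompt.strip
  end
end

rubric = StandInRubric.new(
  name: :tone,
  description: "Evaluates tone",
  levels: [
    { score_range: (3...5), description: "Appropriate tone" }, # exclusive end, rendered as 3-4
    { score: 0, description: "Inappropriate tone" }
  ]
)

puts rubric.to_prompt
# Evaluates tone
#
# Scoring levels:
# - 3-4: Appropriate tone
# - 0: Inappropriate tone
```

Levels are emitted in the order they appear in the array, so listing them from highest to lowest score keeps the prompt easy to scan.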