Overview

Opik provides a set of built-in evaluation metrics that can be used to evaluate the output of your LLM calls. These metrics are broken down into two main categories:

  1. Heuristic metrics
  2. LLM as a Judge metrics

Heuristic metrics are deterministic and are often statistical in nature. LLM as a Judge metrics are non-deterministic and are based on the idea of using an LLM to evaluate the output of another LLM.
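To make the distinction concrete, here is a minimal plain-Python sketch of the kind of checks a heuristic metric performs. The function names (`equals_score`, `contains_score`, `regex_match_score`, `is_json_score`) are hypothetical and are not Opik's implementation; the use of `re.search` and of 0.0/1.0 scores are also assumptions made for illustration. The point is only that the same input always produces the same score.

```python
import json
import re


def equals_score(output: str, reference: str) -> float:
    """Return 1.0 if the output exactly matches the reference, else 0.0."""
    return 1.0 if output == reference else 0.0


def contains_score(output: str, substring: str, case_sensitive: bool = False) -> float:
    """Return 1.0 if the substring occurs in the output, else 0.0."""
    if not case_sensitive:
        output, substring = output.lower(), substring.lower()
    return 1.0 if substring in output else 0.0


def regex_match_score(output: str, pattern: str) -> float:
    """Return 1.0 if the pattern is found anywhere in the output, else 0.0."""
    return 1.0 if re.search(pattern, output) else 0.0


def is_json_score(output: str) -> float:
    """Return 1.0 if the output parses as JSON, else 0.0."""
    try:
        json.loads(output)
    except (json.JSONDecodeError, TypeError):
        return 0.0
    return 1.0
```

An LLM as a Judge metric, by contrast, sends the output to a model and parses its verdict, so repeated calls on the same input may return different scores.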

Opik provides the following built-in evaluation metrics:

| Metric | Type | Description | Documentation |
| --- | --- | --- | --- |
| Equals | Heuristic | Checks if the output exactly matches an expected string | Equals |
| Contains | Heuristic | Checks if the output contains a specific substring; can be either case-sensitive or case-insensitive | Contains |
| RegexMatch | Heuristic | Checks if the output matches a specified regular expression pattern | RegexMatch |
| IsJson | Heuristic | Checks if the output is a valid JSON object | IsJson |
| Levenshtein | Heuristic | Calculates the Levenshtein distance between the output and an expected string | Levenshtein |
| Hallucination | LLM as a Judge | Checks if the output contains any hallucinations | Hallucination |
| G-Eval | LLM as a Judge | Task-agnostic LLM as a Judge metric | G-Eval |
| Moderation | LLM as a Judge | Checks if the output contains any harmful content | Moderation |
| AnswerRelevance | LLM as a Judge | Checks if the output is relevant to the question | AnswerRelevance |
| ContextRecall | LLM as a Judge | Checks if the output contains any hallucinations | ContextRecall |
| ContextPrecision | LLM as a Judge | Checks if the output contains any hallucinations | ContextPrecision |
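As an illustration of the Levenshtein entry in the list above, the edit distance between two strings can be computed with a standard dynamic-programming routine. This is a sketch, not Opik's implementation: `levenshtein_distance` and `levenshtein_similarity` are hypothetical names, and the normalization into a 0 to 1 score is an assumption about how a distance might be turned into a metric value.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character edits needed to turn a into b."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for fully different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest
```

For example, turning "kitten" into "sitting" takes three edits (two substitutions and one insertion).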

You can also create your own custom metric; learn more in the Custom Metric section.

Customizing LLM as a Judge metrics

By default, Opik uses GPT-4o from OpenAI as the LLM to evaluate the output of other LLMs. However, you can switch to another LLM provider by passing a different model through the model parameter of each LLM as a Judge metric.

from opik.evaluation.metrics import Hallucination

# Use a Claude model hosted on Amazon Bedrock as the judge instead of the default
metric = Hallucination(model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0")

# Score a single input/output pair
metric.score(
    input="What is the capital of France?",
    output="The capital of France is Paris. It is famous for its iconic Eiffel Tower and rich cultural heritage.",
)

This functionality is built on the LiteLLM framework. You can find the full list of supported LLM providers, and how to configure them, in the LiteLLM Providers guide.
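As a rough sketch of what provider configuration can look like for the Bedrock example above: LiteLLM typically reads provider credentials from environment variables, and its model strings generally take a "provider/model-id" form. The credential values below are placeholders, the region is an assumption, and the exact variables required depend on the provider, so treat the LiteLLM Providers guide as the authority.

```python
import os

# Placeholder credentials for illustration only; set real values in your
# shell or secrets manager rather than in source code.
os.environ.setdefault("AWS_ACCESS_KEY_ID", "<your-access-key-id>")
os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "<your-secret-access-key>")
os.environ.setdefault("AWS_REGION_NAME", "us-east-1")  # assumed region

# The part before the first "/" names the provider; the rest is the
# provider's own model identifier.
model = "bedrock/anthropic.claude-3-sonnet-20240229-v1:0"
provider, _, model_id = model.partition("/")

print(provider)
print(model_id)
```

With the environment configured, the same `model` string is what gets passed to an LLM as a Judge metric such as `Hallucination`.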