Context recall
The context recall metric evaluates the accuracy and relevance of an LLM’s response based on provided context, helping to identify potential hallucinations or misalignments with the given information.
How to use the ContextRecall metric
You can use the ContextRecall metric as follows:
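The snippet below is a minimal sketch: it assumes ContextRecall is imported from opik.evaluation.metrics and that score accepts input, output, expected_output, and context arguments; refer to the API reference for the exact signature.

```python
from opik.evaluation.metrics import ContextRecall

# Create the metric (uses the default judge model unless overridden)
metric = ContextRecall()

# Score a single response against the retrieved context
result = metric.score(
    input="What is the capital of France?",
    output="The capital of France is Paris.",
    expected_output="Paris",
    context=["France is a country in Western Europe. Its capital is Paris."],
)

print(result.value)   # numeric context recall score (assumed attribute name)
print(result.reason)  # judge's explanation for the score (assumed attribute name)
```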
Asynchronous scoring is also supported with the ascore scoring method.
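A sketch of asynchronous usage, assuming ascore takes the same arguments as score:

```python
import asyncio

from opik.evaluation.metrics import ContextRecall

metric = ContextRecall()

async def main():
    # ascore is the asynchronous counterpart of score
    result = await metric.ascore(
        input="What is the capital of France?",
        output="The capital of France is Paris.",
        expected_output="Paris",
        context=["France is a country in Western Europe. Its capital is Paris."],
    )
    print(result.value)

asyncio.run(main())
```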
ContextRecall Prompt
Opik uses an LLM as a Judge to compute context recall; a prompt template is used to generate the prompt sent to that LLM. By default, the gpt-4o model is used to compute the score, but you can change this to any model supported by LiteLLM by setting the model parameter. You can learn more about customizing models in the Customize models for LLM as a Judge metrics section.
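For example, to point the judge at a different LiteLLM-supported model (the model name below is only illustrative):

```python
from opik.evaluation.metrics import ContextRecall

# Any model identifier supported by LiteLLM can be passed here
metric = ContextRecall(model="gpt-4o-mini")
```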
The template uses a few-shot prompting technique to compute context recall. The sketch below illustrates the general structure of the template (the exact wording ships with Opik and may differ):
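```
YOU ARE AN EXPERT EVALUATOR OF CONTEXT RECALL. Given a user input, an expected
answer, an LLM response, and the retrieved context, rate how completely the
response recalls and stays grounded in the provided context.

Return a JSON object of the form:
{"{VERDICT_KEY}": <score between 0.0 and 1.0>, "{REASON_KEY}": "<short explanation>"}

[few-shot examples omitted]

INPUT: {input}
EXPECTED ANSWER: {expected_output}
RESPONSE: {output}
CONTEXT: {context}
```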
with VERDICT_KEY being context_recall_score and REASON_KEY being reason.