GEval

class opik.evaluation.metrics.GEval(task_introduction: str, evaluation_criteria: str, model: str | OpikBaseModel | None = None, name: str = 'g_eval_metric', track: bool = True)

Bases: BaseMetric

property llm_chain_of_thought: str
score(output: str, **ignored_kwargs: Any) → ScoreResult

Calculate the G-Eval score for the given LLM’s output.

Parameters:
  • output – The LLM’s output to evaluate.

  • **ignored_kwargs – Additional keyword arguments that are ignored.

Returns:

A ScoreResult object containing the G-Eval score (between 0.0 and 1.0) and a reason for the score.

Return type:

score_result.ScoreResult
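
A minimal usage sketch for score, not taken from the library's own examples. Because score accepts only output, the sketch assumes that any context the judge needs is packed into that single string. The OUTPUT:/CONTEXT: labels, the helper names format_payload, build_metric and evaluate, and the prompt texts are all illustrative. It further assumes that opik is installed and that credentials for the default model are configured when model is left as None.

```python
from typing import Any


def format_payload(output: str, context: str) -> str:
    """Pack the answer and its context into the single string that score() accepts."""
    # The labels are an illustrative convention, not something GEval requires.
    return f"OUTPUT: {output}\nCONTEXT: {context}"


def build_metric() -> Any:
    """Construct a GEval metric with hypothetical prompts (requires opik)."""
    # Imported here so the helpers in this sketch stay usable without opik installed.
    from opik.evaluation.metrics import GEval

    return GEval(
        task_introduction=(
            "You are an expert judge evaluating whether an AI-generated "
            "answer is faithful to the given context."
        ),
        evaluation_criteria=(
            "The OUTPUT must not introduce information beyond what is "
            "provided in the CONTEXT."
        ),
    )


def evaluate(metric: Any, output: str, context: str) -> Any:
    """Score one answer with any object that exposes score(output=...)."""
    return metric.score(output=format_payload(output, context))
```

Calling evaluate(build_metric(), answer, context) should return a ScoreResult whose value (between 0.0 and 1.0) and reason can then be inspected.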

async ascore(output: str, **ignored_kwargs: Any) → ScoreResult

Asynchronously calculate the G-Eval score for the given LLM’s output.

Parameters:
  • output – The LLM’s output to evaluate.

  • **ignored_kwargs – Additional keyword arguments that are ignored.

Returns:

A ScoreResult object containing the G-Eval score (between 0.0 and 1.0) and a reason for the score.

Return type:

score_result.ScoreResult
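
One possible use of ascore is scoring several outputs concurrently. The sketch below assumes only that the metric exposes the ascore coroutine documented above. The helper name score_many is hypothetical, and asyncio.gather is one way to fan out the calls, not necessarily what the library recommends.

```python
import asyncio
from typing import Any, List, Sequence


async def score_many(metric: Any, outputs: Sequence[str]) -> List[Any]:
    """Run metric.ascore on every output concurrently, preserving input order."""
    # gather() schedules all coroutines at once and returns results in argument order.
    return await asyncio.gather(*(metric.ascore(output=text) for text in outputs))
```

With a configured GEval instance named metric, asyncio.run(score_many(metric, outputs)) would issue the judge calls concurrently. Any rate limits of the underlying model provider still apply.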