GEval

class opik.evaluation.metrics.GEval(task_introduction: str, evaluation_criteria: str, model: str | OpikBaseModel | None = None, name: str = 'g_eval_metric', track: bool = True, project_name: str | None = None)

Bases: BaseMetric

property llm_chain_of_thought: str

score(output: str, **ignored_kwargs: Any) → ScoreResult

Calculate the G-Eval score for the given LLM’s output.

Parameters:

output – The LLM’s output to evaluate.
**ignored_kwargs – Additional keyword arguments that are ignored.

Returns:

A ScoreResult object containing the G-Eval score (between 0.0 and 1.0) and a reason for the score.

Return type:

score_result.ScoreResult
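
Example:

A minimal sketch of constructing the metric and calling score(), based only on the signatures above. The prompt strings, the helper names (evaluate_summary, passes_threshold) and the 0.7 cut-off are illustrative assumptions, not part of the library. It also assumes the returned ScoreResult exposes the score as .value. With model left at its default of None, an LLM provider presumably has to be configured in the environment before evaluate_summary can run.

```python
from types import SimpleNamespace


def passes_threshold(result, threshold: float = 0.7) -> bool:
    """Return True when a ScoreResult-like object scores at or above threshold.

    `threshold` is a hypothetical cut-off chosen for illustration; the
    documented score range is 0.0 to 1.0.
    """
    return result.value >= threshold


def evaluate_summary(output: str):
    """Score one LLM output with GEval (hypothetical helper)."""
    # Imported lazily so this module loads even where opik is not installed.
    from opik.evaluation.metrics import GEval

    metric = GEval(
        # Both strings are illustrative; write ones that match your task.
        task_introduction="You are judging the quality of a news summary.",
        evaluation_criteria="The summary must be faithful to the article and concise.",
    )
    # Extra keyword arguments would be ignored, per the signature above.
    return metric.score(output=output)


# Stand-in for a real ScoreResult, so the helper can be tried without an LLM call.
example = SimpleNamespace(value=0.8, reason="Faithful and concise.")
print(passes_threshold(example))
```

Calling evaluate_summary("...") performs a real LLM request, so it is left uncalled here.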

async ascore(output: str, **ignored_kwargs: Any) → ScoreResult
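
Example:

The reference gives only the signature of ascore. The sketch below assumes it is the awaitable counterpart of score() and shows one way to score several outputs concurrently. score_many and StubMetric are hypothetical names introduced for illustration; the stub stands in for a GEval instance so the pattern can be run without an LLM call.

```python
import asyncio
from types import SimpleNamespace


async def score_many(metric, outputs):
    """Await metric.ascore for every output concurrently.

    asyncio.gather returns results in the same order as `outputs`.
    """
    return await asyncio.gather(*(metric.ascore(output=o) for o in outputs))


class StubMetric:
    """Hypothetical stand-in that mimics the ascore signature shown above."""

    async def ascore(self, output: str, **ignored_kwargs):
        # Toy rule: longer outputs score higher, capped at 1.0.
        value = min(len(output) / 10, 1.0)
        return SimpleNamespace(value=value, reason="stub score")


results = asyncio.run(score_many(StubMetric(), ["abcde", "x" * 20]))
print([r.value for r in results])
```

To try this against the real metric, pass a GEval instance in place of StubMetric().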