evaluate_experiment

opik.evaluation.evaluate_experiment(experiment_name: str, scoring_metrics: List[BaseMetric], scoring_threads: int = 16, verbose: int = 1) → EvaluationResult

Update existing experiment with new evaluation metrics.

Parameters:
  • experiment_name – The name of the experiment to update.

  • scoring_metrics – List of metrics to compute during evaluation. Each metric exposes a score(…) method whose arguments are taken from the task output; check the signature of the score method of each metric you use to find out which keys are mandatory in the task-returned dictionary. A sketch of a custom metric is shown after this list.

  • scoring_threads – number of worker threads used to compute the scoring metrics.

  • verbose – an integer value that controls evaluation output logs, such as the summary and the tqdm progress bar.
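
Usage sketch. The custom metric class, the experiment name, and the checked phrase below are illustrative assumptions, not part of this reference; the sketch assumes the BaseMetric/ScoreResult pattern from opik.evaluation.metrics, where the keyword arguments of score(…) must match keys present in the task-returned dictionary (here, "output").

    from opik.evaluation import evaluate_experiment
    from opik.evaluation.metrics import base_metric, score_result


    class ContainsPhrase(base_metric.BaseMetric):
        """Hypothetical metric: scores 1.0 if the task output contains a phrase."""

        def __init__(self, phrase: str, name: str = "contains_phrase"):
            super().__init__(name=name)
            self.phrase = phrase

        def score(self, output: str, **ignored_kwargs) -> score_result.ScoreResult:
            # "output" must be a key in the dictionary returned by the task.
            return score_result.ScoreResult(
                value=1.0 if self.phrase in output else 0.0,
                name=self.name,
            )


    # Re-score an existing experiment with the new metric.
    result = evaluate_experiment(
        experiment_name="my-experiment",          # assumed experiment name
        scoring_metrics=[ContainsPhrase("hello")],
        scoring_threads=8,
        verbose=1,
    )

Because evaluate_experiment updates an existing experiment, no dataset or task is passed here; the metric is applied to the outputs already recorded for that experiment.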