Moderation
- class opik.evaluation.metrics.Moderation(model: str | OpikBaseModel | None = None, name: str = 'moderation_metric', few_shot_examples: List[FewShotExampleModeration] | None = None, track: bool = True)
Bases: BaseMetric
A metric that evaluates the moderation level of an input-output pair using an LLM.
This metric uses a language model to assess the moderation level of the given input and output. It returns a score between 0.0 and 1.0, where higher values indicate more appropriate content.
- Parameters:
model – The language model to use for moderation. Can be a string (model name) or an opik.evaluation.models.OpikBaseModel subclass instance. opik.evaluation.models.LiteLLMChatModel is used by default.
name – The name of the metric. Defaults to “moderation_metric”.
few_shot_examples – A list of few-shot examples to be used in the query. If None, default examples will be used.
track – Whether to track the metric. Defaults to True.
Example
>>> from opik.evaluation.metrics import Moderation
>>> moderation_metric = Moderation()
>>> result = moderation_metric.score("Hello, how can I help you?")
>>> print(result.value)  # A float between 0.0 and 1.0
>>> print(result.reason)  # Explanation for the score
- score(output: str, **ignored_kwargs: Any) → ScoreResult
Calculate the moderation score for the given output text.
- Parameters:
output – The output text to be evaluated.
**ignored_kwargs (Any) – Additional keyword arguments that are ignored.
- Returns:
A ScoreResult object containing the moderation score (between 0.0 and 1.0) and a reason for the score.
- Return type:
score_result.ScoreResult
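To illustrate the calling pattern of score(), here is a minimal sketch that scores several outputs and keeps each value and reason next to its text. StubModeration, StubScoreResult and score_outputs are hypothetical names introduced only for this example. The stub returns a fixed placeholder score so the snippet runs offline, and it is not part of opik.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class StubScoreResult:
    """Hypothetical stand-in exposing only the value and reason fields."""
    value: float
    reason: str


class StubModeration:
    """Hypothetical offline stand-in for Moderation; it never calls an LLM."""

    def score(self, output: str, **ignored_kwargs: Any) -> StubScoreResult:
        # Fixed placeholder score, not a real moderation judgment.
        return StubScoreResult(
            value=0.5,
            reason=f"stub result for {len(output)} characters",
        )


def score_outputs(metric: Any, outputs: List[str]) -> List[Dict[str, Any]]:
    """Score each output with metric.score() and collect value and reason."""
    rows = []
    for text in outputs:
        result = metric.score(output=text)
        rows.append({"output": text, "value": result.value, "reason": result.reason})
    return rows


rows = score_outputs(StubModeration(), ["Hello, how can I help you?", "Goodbye."])
for row in rows:
    print(row["value"], row["reason"])
```

Assuming a configured model backend, passing a Moderation() instance in place of StubModeration() should work the same way, since score_outputs relies only on score(output=...) and on the value and reason fields of the result.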
- async ascore(output: str, **ignored_kwargs: Any) → ScoreResult
Asynchronously calculate the moderation score for the given output text.
This method is the asynchronous version of score(). For detailed documentation, please refer to the score() method.
- Parameters:
output – The output text to be evaluated.
**ignored_kwargs – Additional keyword arguments that are ignored.
- Returns:
A ScoreResult object with the moderation score and reason.
- Return type:
score_result.ScoreResult
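Because ascore() is a coroutine, several outputs can be scored concurrently with asyncio.gather. The following is a sketch under stated assumptions: StubAsyncModeration, StubScoreResult and score_concurrently are hypothetical names, and the stub returns a fixed placeholder score instead of calling a model so that the snippet runs offline.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, List


@dataclass
class StubScoreResult:
    """Hypothetical stand-in exposing only the value and reason fields."""
    value: float
    reason: str


class StubAsyncModeration:
    """Hypothetical offline stand-in for Moderation; it never calls an LLM."""

    async def ascore(self, output: str, **ignored_kwargs: Any) -> StubScoreResult:
        # Yield control once, standing in for the awaited model call.
        await asyncio.sleep(0)
        return StubScoreResult(
            value=0.5,
            reason=f"stub result for {len(output)} characters",
        )


async def score_concurrently(metric: Any, outputs: List[str]) -> List[Any]:
    """Await metric.ascore() for all outputs; results keep the input order."""
    return await asyncio.gather(*(metric.ascore(output=text) for text in outputs))


results = asyncio.run(
    score_concurrently(StubAsyncModeration(), ["first reply", "second reply"])
)
for result in results:
    print(result.value, result.reason)
```

With a real Moderation() instance, each awaited call would involve a model request, so provider rate limits may make it worth bounding concurrency, for example with asyncio.Semaphore.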