Dataset

class opik.Dataset(name: str, description: str | None, rest_client: OpikApi)

Bases: object

__init__(name: str, description: str | None, rest_client: OpikApi) → None

A Dataset object. Do not create this object directly; use opik.Opik.create_dataset() or opik.Opik.get_dataset() instead.
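As a minimal sketch of obtaining a dataset through the client rather than the constructor, the helper below creates one and summarises its name and description. The helper name `create_and_describe` is hypothetical, `client` is assumed to be an `opik.Opik` instance, and the keyword arguments passed to `create_dataset()` are an assumption, since this page does not list that method's signature.

```python
def create_and_describe(client, name, description=None):
    """Create a dataset via the client and return it with a short summary.

    Assumption: `client` behaves like opik.Opik and create_dataset()
    accepts `name` and `description` keyword arguments.
    """
    dataset = client.create_dataset(name=name, description=description)
    # `description` is typed `str | None`, so handle the missing case explicitly.
    summary = f"{dataset.name}: {dataset.description or 'no description'}"
    return dataset, summary
```

With a real client this would be called as `create_and_describe(opik.Opik(), "my-dataset")`, which needs a reachable Opik backend.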

property name: str

The name of the dataset.

property description: str | None

The description of the dataset.

insert(items: Sequence[DatasetItem | Dict[str, Any]]) → None

Insert new items into the dataset.

Parameters:

items – List of DatasetItem objects or dicts (which will be converted to DatasetItem objects) to add to the dataset.
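Because insert() accepts plain dicts, items can be built without constructing DatasetItem objects by hand. The sketch below is one way to do that; the helper name `insert_qa_pairs` and the field names "input" and "expected_output" are illustrative assumptions, not requirements stated on this page.

```python
from typing import Any, Dict, Iterable, List, Tuple


def insert_qa_pairs(dataset, pairs: Iterable[Tuple[str, str]]) -> List[Dict[str, Any]]:
    """Turn (question, answer) pairs into dicts and insert them.

    Assumption: the field names below are illustrative only.
    """
    items = [{"input": question, "expected_output": answer} for question, answer in pairs]
    if items:  # skip the call entirely when there is nothing to add
        dataset.insert(items)
    return items
```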

update(items: List[DatasetItem]) → None

Update existing items in the dataset.

Parameters:

items – List of DatasetItem objects to update in the dataset. Provide the full item object each time, because it overwrites the previously stored item.

Raises:

DatasetItemUpdateOperationRequiresItemId – If any item in the list is missing an id.
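Since update() rejects items without an id, it can help to check for ids before making the call. The following is a sketch only: `update_checked` is a hypothetical helper, and it assumes each DatasetItem exposes its identifier as an `id` attribute.

```python
from typing import Sequence


def update_checked(dataset, items: Sequence) -> int:
    """Call dataset.update() only when every item carries an id.

    Assumption: items expose their identifier as an `id` attribute.
    """
    missing = [pos for pos, item in enumerate(items) if getattr(item, "id", None) is None]
    if missing:
        # Fail early with the offending positions instead of calling update().
        raise ValueError(f"items at positions {missing} have no id")
    dataset.update(list(items))
    return len(items)
```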

delete(items_ids: List[str]) → None

Delete items from the dataset.

Parameters:

items_ids – List of item ids to delete.
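A common pattern is to select items with get_all_items() and pass their ids to delete(). The sketch below assumes each returned item has an `id` attribute; `delete_where` and the predicate argument are hypothetical names introduced for illustration.

```python
from typing import Callable, List


def delete_where(dataset, predicate: Callable[[object], bool]) -> List[str]:
    """Delete every item for which `predicate(item)` is true.

    Assumption: items returned by get_all_items() expose an `id` attribute.
    """
    ids = [item.id for item in dataset.get_all_items() if predicate(item)]
    if ids:  # avoid a call with an empty id list
        dataset.delete(items_ids=ids)
    return ids
```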

clear() → None

Delete all items from the given dataset.

to_pandas() → DataFrame

Convert the dataset to a pandas DataFrame.

Returns:

A pandas DataFrame containing all items in the dataset.

to_json() → str

Convert the dataset to a JSON string.

Returns:

A JSON string representation of all items in the dataset.
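The JSON string can be parsed with the standard library and written to disk as a backup. This is a sketch under one assumption: that the string's top-level value is a JSON array with one entry per item, which this page does not state explicitly. The helper name `export_items` is hypothetical.

```python
import json
from pathlib import Path


def export_items(dataset, path) -> int:
    """Write the dataset's JSON representation to `path`, pretty-printed.

    Assumption: to_json() returns a JSON array with one entry per item.
    """
    items = json.loads(dataset.to_json())
    Path(path).write_text(
        json.dumps(items, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return len(items)
```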

get_all_items() → List[DatasetItem]

Retrieve all items from the dataset.

Returns:

A list of DatasetItem objects representing all items in the dataset.

insert_from_json(json_array: str, keys_mapping: Dict[str, str] | None = None, ignore_keys: List[str] | None = None) → None

Insert items into the dataset from a JSON string.

Parameters:
  • json_array – JSON string of the form "[{…}, {…}, {…}]", where each dictionary is converted into a dataset item.

  • keys_mapping – Dictionary that maps JSON keys to dataset item field names. Example: {'Expected output': 'expected_output'}

  • ignore_keys – Keys in the JSON dictionaries that are not needed for DatasetItem construction and should be skipped.
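To make the roles of keys_mapping and ignore_keys concrete, the sketch below previews their effect on a single record and then passes a JSON array to insert_from_json(). Both helpers (`preview_mapping`, `insert_records`) are hypothetical, and `preview_mapping` is only an illustration of the behaviour described above, assuming ignore_keys refers to the original JSON keys; it is not the library's implementation.

```python
import json
from typing import Any, Dict, List, Optional


def preview_mapping(
    record: Dict[str, Any],
    keys_mapping: Optional[Dict[str, str]] = None,
    ignore_keys: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Illustrate the described behaviour: drop ignored keys, rename mapped ones."""
    mapping = keys_mapping or {}
    ignored = set(ignore_keys or [])
    return {mapping.get(key, key): value for key, value in record.items() if key not in ignored}


def insert_records(dataset, records, keys_mapping=None, ignore_keys=None) -> str:
    """Serialise `records` to a JSON array and hand it to insert_from_json()."""
    json_array = json.dumps(records)
    dataset.insert_from_json(
        json_array=json_array, keys_mapping=keys_mapping, ignore_keys=ignore_keys
    )
    return json_array
```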

insert_from_pandas(dataframe: DataFrame, keys_mapping: Dict[str, str] | None = None, ignore_keys: List[str] | None = None) → None

Insert items into the dataset from a pandas DataFrame.

Parameters:
  • dataframe – The pandas DataFrame to read items from.

  • keys_mapping – Dictionary that maps DataFrame column names to dataset item field names. Example: {'Expected output': 'expected_output'}

  • ignore_keys – Columns in the DataFrame that are not needed for DatasetItem construction and should be skipped.
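When a DataFrame carries extra columns, ignore_keys can be derived from the columns you want to keep. The sketch below does this; `insert_frame` and its `keep` argument are hypothetical names, and the only pandas feature it relies on is iterating over `dataframe.columns`.

```python
from typing import Dict, List, Optional, Sequence


def insert_frame(
    dataset,
    dataframe,
    keep: Sequence[str],
    keys_mapping: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Insert only the `keep` columns; pass all others as ignore_keys."""
    wanted = set(keep)
    # Every column that was not explicitly kept is ignored.
    ignore_keys = [column for column in dataframe.columns if column not in wanted]
    dataset.insert_from_pandas(
        dataframe=dataframe, keys_mapping=keys_mapping, ignore_keys=ignore_keys
    )
    return ignore_keys
```

For a DataFrame with columns "Question", "Expected output" and "row_id", calling it with `keep=["Question", "Expected output"]` would pass `["row_id"]` as ignore_keys.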