Using Opik with LangChain

For this guide, we will perform a text-to-SQL query generation task using LangChain. We will use the Chinook database, a SQLite database for a music store that contains employee, customer, and invoice data.

We will highlight three different parts of the workflow:

  1. Creating a synthetic dataset of questions
  2. Creating a LangChain chain to generate SQL queries
  3. Automating the evaluation of the SQL queries on the synthetic dataset

Creating an account on Comet.com

Comet provides a hosted version of the Opik platform; simply create an account and grab your API key.

You can also run the Opik platform locally, see the installation guide for more information.

%pip install --upgrade --quiet opik langchain langchain-community langchain-openai
import opik

opik.configure(use_local=False)
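
If you prefer not to use the interactive prompt, the Opik SDK can also be configured through environment variables. The snippet below is an optional sketch assuming you are using the hosted Comet platform; replace the placeholder values with your own API key and workspace before running it.

# Optional: configure Opik via environment variables instead of the interactive prompt.
# The placeholder values below must be replaced with your own credentials.
import os

os.environ["OPIK_API_KEY"] = "<your-api-key>"
os.environ["OPIK_WORKSPACE"] = "<your-workspace>"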

Preparing our environment

First, we will download the Chinook database and set up our API keys.

import os
import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
# Download the relevant data
import os

import requests
from langchain_community.utilities import SQLDatabase

url = "https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite"
filename = "./data/chinook/Chinook_Sqlite.sqlite"

folder = os.path.dirname(filename)

if not os.path.exists(folder):
    os.makedirs(folder)

if not os.path.exists(filename):
    response = requests.get(url)
    with open(filename, "wb") as file:
        file.write(response.content)
    print("Chinook database downloaded")

db = SQLDatabase.from_uri(f"sqlite:///{filename}")
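
As a quick sanity check, you can confirm that the database loaded correctly before moving on. The snippet below is a minimal sketch that uses the SQLDatabase helpers dialect, get_usable_table_names, and run; Employee is one of the tables in the Chinook schema.

# Optional sanity check: confirm the database is reachable and list its tables
print(db.dialect)
print(db.get_usable_table_names())
print(db.run("SELECT COUNT(*) AS employee_count FROM Employee"))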

Creating a synthetic dataset

To create our synthetic dataset, we will use the OpenAI API to generate 20 different questions that a user might ask about the Chinook database.

To ensure that the OpenAI API calls are tracked, we will use the track_openai function from the opik library.

from opik.integrations.openai import track_openai
from openai import OpenAI
import json

os.environ["OPIK_PROJECT_NAME"] = "langchain-integration-demo"
client = OpenAI()

openai_client = track_openai(client)

prompt = """
Create 20 different example questions a user might ask based on the Chinook Database.

These questions should be complex and require the model to think. They should include complex joins and window functions to answer.

Return the response as a json object with a "result" key and an array of strings with the question.
"""

completion = openai_client.chat.completions.create(
    model="gpt-3.5-turbo", messages=[{"role": "user", "content": prompt}]
)

print(completion.choices[0].message.content)
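
Since the prompt asks for a JSON object but a plain chat completion is not guaranteed to return valid JSON, you can optionally make the parsing step below more robust by enabling OpenAI's JSON mode. The variant below is a sketch, assuming the model version you are using supports the response_format parameter; it is not part of the original call above.

# Optional variant: request JSON mode so the response is guaranteed to be valid JSON.
# The prompt must mention "json" for the API to accept this setting (ours does).
completion = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_format={"type": "json_object"},
    messages=[{"role": "user", "content": prompt}],
)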

Now that we have our synthetic dataset, we can create a dataset in Comet and insert the questions into it.

Since the SDK's insert method deduplicates items, we can safely insert all 20 questions: if any of them already exist in the dataset, Opik will automatically skip them.

# Create the synthetic dataset
import opik

synthetic_questions = json.loads(completion.choices[0].message.content)["result"]

client = opik.Opik()

dataset = client.get_or_create_dataset(name="synthetic_questions")
dataset.insert([{"question": question} for question in synthetic_questions])
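
If you want to double-check what was inserted, you can read the dataset back. The snippet below assumes the SDK's Dataset object exposes a get_items() method, as recent versions of the Opik SDK do.

# Optional: read the dataset back to confirm the questions were inserted
items = dataset.get_items()
print(f"Dataset contains {len(items)} items")
print(items[:2])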

Creating a LangChain chain

We will use the create_sql_query_chain function from the langchain library to build a chain that generates a SQL query answering each question.

We will use the OpikTracer class from the opik library to ensure that the LangChain traces are logged to Comet.

# Use langchain to create a SQL query to answer the question
from langchain.chains import create_sql_query_chain
from langchain_openai import ChatOpenAI
from opik.integrations.langchain import OpikTracer

opik_tracer = OpikTracer(tags=["simple_chain"])

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
chain = create_sql_query_chain(llm, db).with_config({"callbacks": [opik_tracer]})
response = chain.invoke({"question": "How many employees are there?"})

print(response)
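
Since the chain only generates the SQL text, it can be useful to execute the generated query against the database to confirm it is valid, and to flush the tracer so the trace is sent to the platform before moving on. This is a small optional sketch; the metric defined in the next section applies the same db.run check automatically during evaluation.

# Execute the generated SQL to confirm it is a valid query
print(db.run(response))

# Ensure the LangChain trace is sent to the Opik platform before continuing
opik_tracer.flush()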

Automating the evaluation

To ensure our LLM application works correctly, we will test it on our synthetic dataset.

For this, we will use the evaluate function from the opik library and score the application with a custom metric that checks whether the generated SQL query is valid.

from opik import Opik, track
from opik.evaluation import evaluate
from opik.evaluation.metrics import base_metric, score_result
from typing import Any


class ValidSQLQuery(base_metric.BaseMetric):
    def __init__(self, name: str, db: Any):
        self.name = name
        self.db = db

    def score(self, output: str, **ignored_kwargs: Any):
        # Add your logic here

        try:
            self.db.run(output)
            return score_result.ScoreResult(
                name=self.name, value=1, reason="Query ran successfully"
            )
        except Exception as e:
            return score_result.ScoreResult(name=self.name, value=0, reason=str(e))


valid_sql_query = ValidSQLQuery(name="valid_sql_query", db=db)

client = Opik()
dataset = client.get_dataset("synthetic_questions")


@track()
def llm_chain(input: str) -> str:
    response = chain.invoke({"question": input})

    return response


def evaluation_task(item):
    response = llm_chain(item["question"])

    return {"output": response}


res = evaluate(
    experiment_name="SQL question answering",
    dataset=dataset,
    task=evaluation_task,
    scoring_metrics=[valid_sql_query],
    nb_samples=20,
)
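
The same pattern can be extended with additional metrics. As an illustration only (not part of the original notebook), here is a sketch of a second heuristic metric that checks whether the generated output looks like a SELECT statement; it could be passed to evaluate alongside valid_sql_query in the scoring_metrics list.

# Illustrative only: a second heuristic metric following the same BaseMetric pattern
class IsSelectStatement(base_metric.BaseMetric):
    def __init__(self, name: str = "is_select_statement"):
        self.name = name

    def score(self, output: str, **ignored_kwargs: Any):
        is_select = output.strip().lower().startswith("select")
        return score_result.ScoreResult(
            name=self.name,
            value=1 if is_select else 0,
            reason="Output starts with SELECT"
            if is_select
            else "Output does not start with SELECT",
        )


# Example usage: scoring_metrics=[valid_sql_query, IsSelectStatement()]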

The evaluation results are now uploaded to the Opik platform and can be viewed in the UI.

LangChain Evaluation
