
🛳️ QueryGPT - Natural Language to SQL Using Generative AI

Status: Completed
Tags: LLM, QueryGPT, SQL, genAI
Link: https://www.uber.com/en-IN/blog/query-gpt/?uclick_id=6cfc9a34-aa3e-4140-9e8e-34e867b80b2b
Author: Jeffrey Johnson
Source: Uber
Start Date: October 22, 2024
End Date: October 22, 2024
Time: 15

QueryGPT - Natural Language to SQL Using Generative AI
Introduction
SQL → access and manipulate data

SQL syntax to craft queries to look things up in relational databases

QueryGPT → generate SQL queries through natural language prompts

uses LLMs, vector databases, and similarity search



Motivation
Uber → data platform handles 1.2 million interactive queries each month

authoring a query takes a lot of time: searching for relevant datasets in the data dictionary, then writing the query in an editor

query authoring = creating and refining queries to extract specific information from a database, search engine, or other data system

Architecture
original architecture

relied on simple RAG (retrieving relevant data from a database) to fetch the relevant samples to include in the query-generation call to the LLM (few-shot prompting) → vectorize the prompt and run a similarity search over SQL samples and schemas to fetch the 3 most relevant tables and 7 most relevant SQL samples (see the sketch below)

SQL sample queries → provide the LLM guidance on how to use the
table schemas provided

schema samples provided the LLM information about the columns that
existed on those tables

to help the LLM understand internal lingo and work with specific datasets,
some custom instructions were added in the LLM call



wrap all relevant schema samples, SQL samples, prompt, and business
instructions around a system prompt and send the request to the LLM

the answer includes a SQL query and an explanation of how the LLM generated it

worked well for a small set of schemas and SQL samples, but as more tables and SQL samples were added, accuracy declined
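
a minimal sketch of this original flow, assuming a generic embed() helper and precomputed embeddings for schemas and SQL samples (all names here are illustrative, not Uber’s actual implementation):

```python
# Illustrative sketch of the original RAG flow; embed() and all names are
# assumptions, not Uber's actual implementation.
import numpy as np

def top_k(query_vec, vecs, items, k):
    """Return the k items whose embeddings are most cosine-similar to query_vec."""
    vecs = np.asarray(vecs, dtype=float)
    query_vec = np.asarray(query_vec, dtype=float)
    sims = vecs @ query_vec / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(query_vec))
    return [items[i] for i in np.argsort(sims)[::-1][:k]]

def build_request(user_prompt, embed, schemas, schema_vecs, samples, sample_vecs,
                  custom_instructions):
    """Assemble the few-shot generation request: 3 schemas + 7 SQL samples."""
    q = embed(user_prompt)                        # vectorize the prompt
    tables = top_k(q, schema_vecs, schemas, k=3)  # 3 relevant table schemas
    shots = top_k(q, sample_vecs, samples, k=7)   # 7 relevant SQL samples
    system_prompt = "\n\n".join([
        "You are an assistant that writes SQL.",  # paraphrased system prompt
        custom_instructions,                      # internal lingo / datasets
        "Relevant schemas:\n" + "\n".join(tables),
        "Example queries:\n" + "\n".join(shots),
    ])
    return system_prompt, user_prompt             # sent as one request to the LLM
```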

better RAG

a simple similarity search of the prompt against schema samples and SQL queries doesn’t return relevant results

understanding user’s intent

very challenging to go from the user’s prompt to finding relevant schemas → an intermediate step was needed, which classifies the user’s prompt into an “intent” that maps to relevant schemas and SQL samples

handling large schemas

combining really large schemas → lots of tokens → problems with the context window

Current Design
workspaces

= curated collections of SQL samples and tables tailored to specific business domains → help narrow the focus for the LLM, improving relevance and accuracy of generated queries



you can also create custom workspaces to fit very niche requirements
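
a possible shape for a workspace (the note only says it bundles tables, SQL samples, and a business domain; field names and the example domain below are assumptions):

```python
# Hypothetical workspace structure; field names and the example domain are
# illustrative, not taken from Uber's implementation.
from dataclasses import dataclass, field

@dataclass
class Workspace:
    name: str                                             # business domain
    tables: list[str] = field(default_factory=list)       # curated tables
    sql_samples: list[str] = field(default_factory=list)  # curated SQL samples
    instructions: str = ""                                # domain-specific rules

# a custom workspace created for a niche requirement
ads = Workspace(
    name="Ads",
    tables=["ad_impressions", "ad_clicks"],
    sql_samples=["SELECT COUNT(*) FROM ad_clicks WHERE datestr = '2024-10-01'"],
    instructions="Join impressions to clicks on impression_uuid.",
)
```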

intent agent

incoming prompt first runs through an intent agent → map user question to
one or more business domains/workspaces (and by extension a set of SQL
samples and tables mapped to the domain)

LLM call to infer intent and mapping to workspaces
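
a sketch of that intent call, assuming a call_llm(prompt) -> str helper (any chat-completion API would do; the workspace names are illustrative):

```python
# Sketch of the intent agent; call_llm and the workspace names are assumptions.
import json

WORKSPACES = ["Mobility", "Ads", "Core Services"]  # illustrative domains

def infer_intent(user_prompt: str, call_llm) -> list[str]:
    """LLM call mapping a user question to one or more workspaces."""
    answer = call_llm(
        "Classify this question into one or more of these business domains: "
        f"{WORKSPACES}. Reply with a JSON list of domain names only.\n\n"
        f"Question: {user_prompt}"
    )
    # each matched workspace carries its own tables and SQL samples forward
    return [d for d in json.loads(answer) if d in WORKSPACES]
```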

table agent

allows users to select the tables used in query generation → agent provides suggestions for tables, user can validate
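
a sketch of that suggest-then-validate loop (call_llm and the prompt wording are assumptions):

```python
# Sketch of the table agent; call_llm and the prompt wording are assumptions.
def select_tables(user_prompt: str, workspace_tables: list[str], call_llm) -> list[str]:
    """LLM suggests tables; the user validates or overrides the suggestion."""
    answer = call_llm(
        f"Question: {user_prompt}\nAvailable tables: {workspace_tables}\n"
        "Reply with a comma-separated list of the tables needed."
    )
    suggested = [t.strip() for t in answer.split(",") if t.strip() in workspace_tables]
    reply = input(f"Use tables {suggested}? (Enter to accept, or type your own list) ")
    return [t.strip() for t in reply.split(",")] if reply.strip() else suggested
```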

column prune agent

intermittent token size issue → when some requests included one or more
tables that consumed a large amount of tokens

LLM prunes irrelevant columns from the schemas provided to the LLM + explanation why

improved cost and latency
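
a sketch of that pruning step (the JSON response format and helper names are assumptions):

```python
# Sketch of the column prune agent; call_llm and the response format are assumptions.
import json

def prune_columns(user_prompt: str, table: str, columns: list[str], call_llm) -> dict:
    """Drop columns irrelevant to the question before the generation call."""
    answer = call_llm(
        f"Question: {user_prompt}\nTable {table} has columns: {columns}\n"
        'Keep only the relevant columns. Reply as JSON: {"keep": [...], "reason": "..."}'
    )
    result = json.loads(answer)
    kept = [c for c in result["keep"] if c in columns]
    # a smaller schema means fewer tokens in the prompt → lower cost and latency
    return {"table": table, "columns": kept, "reason": result["reason"]}
```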

Evaluation
to track incremental improvements in performance → standardized evaluation
procedure is needed



evaluation set

curating a set of golden question-to-SQL answer mappings → manual investment

a set of real questions from logs, with the correct intent, required schemas, and golden SQL manually verified

evaluation procedure

for each question in evaluation, capture following signals

intent → is intent assigned accurate?

table overlap → are tables identified via Search + Table Agent correct?

successful run → does generated query run successfully?

run has output → does query execution return >0 records (to check for
hallucinations such as “Finished” instead of “Completed”)

qualitative query similarity → how similar is the generated query relative to the golden SQL → LLM assigns a similarity score between 0 and 1

also aggregate accuracy and latency metrics for each evaluation run to
track performance over time
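
a sketch of how one evaluation run could capture these signals (helper names like run_query and llm_similarity are assumptions):

```python
# Sketch of one evaluation run over a golden question; all helpers are assumed.
def evaluate(question, golden, infer_intent, pick_tables, generate_sql,
             run_query, llm_similarity):
    generated = generate_sql(question)
    rows = run_query(generated)          # None if the query fails to execute
    return {
        "intent": infer_intent(question) == golden["intent"],
        "table_overlap": len(set(pick_tables(question)) & set(golden["tables"]))
                         / len(golden["tables"]),
        "successful_run": rows is not None,
        "run_has_output": bool(rows),    # >0 records; catches hallucinated
                                         # literals like 'Finished'
        "query_similarity": llm_similarity(generated, golden["sql"]),  # 0..1
    }
```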

limitations

stochastic nature of LLMs → different outcomes for exactly the same query

identify error patterns over longer time periods that can be addressed
by specific feature improvements

not one correct answer, the same question can be answered by multiple queries/tables

Learnings
LLMs are excellent classifiers (intermediate agents)

hallucinations (LLMs might generate query with tables or columns that don’t
exist)



user prompts are not always ‘context’-rich (prompts range from very detailed
with the right keywords to very short prompts → requires ‘good’ input from
users)
