
Generating-SQL-Queries-with-AI

This document outlines a hands-on lab focused on generating SQL queries using generative AI, specifically ChatGPT, to enhance data retrieval and analysis for BI analysts. Participants will learn to create optimized SQL queries tailored to their data sets, using the Heart Disease data set as a practical example. The lab aims to streamline the query creation process, allowing analysts to focus on strategic insights and decision-making.

Uploaded by

iola

Generating-SQL-Queries-with-AI https://generative-ai-elevate-your-business-intelligence-f75b7d40f993...

Generating-SQL-Queries-with-AI

Hands-on lab: Generating SQL queries with AI

Estimated time: 30 minutes

Learning objectives

After completing this lab, you will be able to:

Leverage generative AI models to create efficient queries for your data set.

Improve the accuracy and performance of data retrieval and analysis tasks.

Welcome to the lab on Generating SQL queries with AI!

Introduction

BI analysts juggle large volumes of data, writing intricate queries to access processed data
from database tables according to specific requirements. These activities are time-consuming and
require meticulous attention to detail for efficient data retrieval and analysis. However, consider a
BI analyst who leverages generative AI tools and agents to streamline the creation of optimized
queries. This allows them to focus on more strategic tasks, such as interpreting data insights and
supporting decision-making processes.

Processed data stored in a database table can be accessed using queries tailored to your
requirements. Since queries are a crucial part of a data professional's workflow, mastering the skill
of writing efficient queries is essential.

In this lab, you will learn how to leverage generative AI platforms to create optimized queries for
your data, provided you supply the model with sufficient context.

Note: The prompts given in this lab are samples only. You can write prompts
based on your own requirements to generate responses. You may also get a different
response even if you use the exact prompts from this lab.

Exercise 1: Leverage ChatGPT to create efficient queries for your data set

In this exercise, you'll leverage OpenAI's AI-powered assistant, ChatGPT. ChatGPT is a powerful
language model designed to assist both consumers and businesses. It understands and processes
text, audio, video, and images. ChatGPT helps professionals streamline their tasks, and BI analysts
are no exception.

By using ChatGPT, BI analysts can simplify their workflow, create optimized queries, analyze data
efficiently, and focus more on strategic decision making and insights.

In this lab, you will leverage ChatGPT to create efficient SQL queries tailored to your specific data
analysis needs. You will start by providing a detailed description of your data set, including
attributes such as age, gender, and chest pain type, to give ChatGPT the necessary context. Using
this context, ChatGPT will then generate SQL queries for various data analysis tasks. For example,
you can prompt ChatGPT to create queries for obtaining age distribution, performing gender
analysis, determining the frequency of chest pain types, and investigating the distribution of heart
disease within different age groups. This process will enable you to efficiently generate and execute
SQL queries, enhancing your ability to analyze and interpret data as a BI analyst.

Step 1: Provide the required data set to ChatGPT

In this step, you will provide the model with a description of your data set so that it can generate
efficient, readily usable queries tailored to your data retrieval requirements.

For this lab, you will use the Heart Disease data set from the UCI Machine Learning Repository,
available publicly under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Note: You can download the data set and run the generated queries using any SQL querying
system.
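
As one way to follow the note above, the sketch below loads comma-separated rows into an in-memory SQLite database using Python's standard sqlite3 module. The table name heart_disease, the three sample rows, and the treatment of "?" as a missing-value marker are assumptions made for illustration; adapt them to the file you actually download.

```python
import csv
import io
import sqlite3

# Column names follow the data set description used in this lab.
COLUMNS = ["age", "gender", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]

# Hypothetical sample rows standing in for the downloaded file (no header row).
SAMPLE_CSV = """\
63,1,1,145,233,1,2,150,0,2.3,3,0,6,0
67,1,4,160,286,0,2,108,1,1.5,2,3,3,1
41,0,2,130,204,0,2,172,0,1.4,1,0,3,0
"""

def load_heart_disease(conn, csv_file):
    """Create the heart_disease table and load rows from a header-less CSV file object."""
    # NUMERIC affinity lets SQLite store "63" as an integer and "2.3" as a real.
    column_defs = ", ".join(f"{name} NUMERIC" for name in COLUMNS)
    conn.execute(f"CREATE TABLE heart_disease ({column_defs})")
    placeholders = ", ".join("?" for _ in COLUMNS)
    rows = []
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        # Assumption: empty fields or "?" mark missing values; store them as NULL.
        rows.append([None if value.strip() in ("", "?") else value for value in row])
    conn.executemany(f"INSERT INTO heart_disease VALUES ({placeholders})", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
loaded = load_heart_disease(conn, io.StringIO(SAMPLE_CSV))
print("rows loaded:", loaded)
```

To load a real file, pass an open file object (for example, open("your_file.csv", newline="")) instead of the io.StringIO wrapper.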

You can access the ChatGPT platform at https://chatgpt.com/. Right-click the link to open it in a
new tab, and log in to ChatGPT. If you are a first-time user, sign up for a ChatGPT account
first.

Paste the following text in the input box to provide ChatGPT with the appropriate context for the
data.

In this lab, we will use the Heart Disease data set from the UCI Machine Learning Repository. This
data set includes various features such as age, gender, chest pain type, cholesterol levels, and more.
The primary goal is to classify the presence of heart disease.

age - age in years

gender - gender (1 = male; 0 = female)

cp - chest pain type

--Value 1: typical angina
--Value 2: atypical angina
--Value 3: nonanginal pain
--Value 4: asymptomatic

trestbps - resting blood pressure (in mm Hg on admission to the hospital)

chol - serum cholesterol in mg/dl

fbs - (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)

restecg - resting electrocardiographic results

--Value 0: normal
--Value 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV)
--Value 2: showing probable or definite left ventricular hypertrophy by Estes’ criteria

thalach - maximum heart rate achieved

exang - exercise induced angina (1 = yes; 0 = no)

oldpeak - ST depression induced by exercise relative to rest

slope - the slope of the peak exercise ST segment

--Value 1: upsloping
--Value 2: flat
--Value 3: downsloping

ca - number of major vessels (0-3) colored by fluoroscopy

thal - 3 = normal; 6 = fixed defect; 7 = reversible defect

num (the predicted attribute) - diagnosis of heart disease (angiographic disease status)

--Value 0: < 50% diameter narrowing
--Value 1: > 50% diameter narrowing

Press Enter on your keyboard or select the up-arrow button on the screen to submit the input.

Step 2: Write prompts to create SQL queries

Once you have set the context, ChatGPT will have enough background to generate SQL queries for
your prompts. Consider the following prompts, asking ChatGPT to generate SQL queries for
different tasks:

Obtain age distribution

Write an SQL query to find the minimum, maximum, and average age of patients in the data set.

Press Enter on your keyboard or select the up-arrow button on the screen to obtain the response.

You can see the following SQL query generated in response.
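
The generated query is not reproduced in this text, and ChatGPT's wording varies between runs. As a rough illustration, a response to this prompt might resemble the query in the sketch below, which runs it against an in-memory SQLite table. The table name heart_disease and the sample rows are assumptions made for the example, not real patient data.

```python
import sqlite3

# Hypothetical rows (age, gender, cp, num) for illustration only.
SAMPLE_ROWS = [(29, 1, 2, 0), (45, 0, 3, 0), (54, 1, 4, 1),
               (63, 1, 1, 0), (67, 0, 4, 1), (58, 1, 4, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE heart_disease (age INTEGER, gender INTEGER, cp INTEGER, num INTEGER)")
conn.executemany("INSERT INTO heart_disease VALUES (?, ?, ?, ?)", SAMPLE_ROWS)

# Aggregate functions summarize the age column in a single pass over the table.
AGE_SUMMARY_SQL = """
SELECT MIN(age) AS min_age,
       MAX(age) AS max_age,
       AVG(age) AS avg_age
FROM heart_disease;
"""

min_age, max_age, avg_age = conn.execute(AGE_SUMMARY_SQL).fetchone()
print(f"min={min_age}, max={max_age}, avg={avg_age:.1f}")
```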

Perform gender analysis

Write an SQL query to count the number of male and female patients in the data set.

Press Enter on your keyboard or select the up-arrow button on the screen to obtain the response.

You can see the following SQL query generated in response.
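
The generated query is not reproduced in this text. A minimal sketch of what such a response could look like follows, using GROUP BY on the gender column and the encoding from the data set description (1 = male; 0 = female). The table name heart_disease and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical rows (age, gender, cp, num) for illustration only.
SAMPLE_ROWS = [(29, 1, 2, 0), (45, 0, 3, 0), (54, 1, 4, 1),
               (63, 1, 1, 0), (67, 0, 4, 1), (58, 1, 4, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE heart_disease (age INTEGER, gender INTEGER, cp INTEGER, num INTEGER)")
conn.executemany("INSERT INTO heart_disease VALUES (?, ?, ?, ?)", SAMPLE_ROWS)

# One output row per distinct gender code, with a readable label.
GENDER_COUNT_SQL = """
SELECT CASE gender WHEN 1 THEN 'Male' WHEN 0 THEN 'Female' END AS gender_label,
       COUNT(*) AS patient_count
FROM heart_disease
GROUP BY gender
ORDER BY gender;
"""

gender_counts = conn.execute(GENDER_COUNT_SQL).fetchall()
for label, count in gender_counts:
    print(label, count)
```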

Obtain the frequency of chest pain types

Write an SQL query to determine the frequency of each type of chest pain (typical angina, atypical
angina, nonanginal pain, asymptomatic) among patients.

Press Enter on your keyboard or select the up-arrow button on the screen to obtain the response.

You can see the following SQL query generated in response.
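
The generated query is not reproduced in this text. The sketch below shows one plausible shape for it: group by the cp code, map each code to the label given in the data set description, and sort by frequency. The table name heart_disease and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical rows (age, gender, cp, num) for illustration only.
SAMPLE_ROWS = [(29, 1, 2, 0), (45, 0, 3, 0), (54, 1, 4, 1),
               (63, 1, 1, 0), (67, 0, 4, 1), (58, 1, 4, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE heart_disease (age INTEGER, gender INTEGER, cp INTEGER, num INTEGER)")
conn.executemany("INSERT INTO heart_disease VALUES (?, ?, ?, ?)", SAMPLE_ROWS)

# Count patients per chest pain code; most frequent type first, ties broken by code.
CHEST_PAIN_SQL = """
SELECT cp,
       CASE cp
           WHEN 1 THEN 'typical angina'
           WHEN 2 THEN 'atypical angina'
           WHEN 3 THEN 'nonanginal pain'
           WHEN 4 THEN 'asymptomatic'
       END AS chest_pain_type,
       COUNT(*) AS frequency
FROM heart_disease
GROUP BY cp
ORDER BY frequency DESC, cp;
"""

chest_pain_counts = conn.execute(CHEST_PAIN_SQL).fetchall()
for code, label, frequency in chest_pain_counts:
    print(code, label, frequency)
```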

Investigate the target variable within different age groups

Write an SQL query to investigate the distribution of the target variable (presence or absence of
heart disease) within different age groups (e.g., 20-30, 30-40, etc.).

Press Enter on your keyboard or select the up-arrow button on the screen to obtain the response.

You can see the following SQL query generated in response.
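
The generated query is not reproduced in this text. One way such a query could be written is sketched below: integer division buckets each age into a ten-year group, and conditional sums count patients with and without heart disease in each bucket. The ranges in the prompt overlap at their edges (30 appears in both 20-30 and 30-40), so this sketch assumes non-overlapping groups such as 20-29 and 30-39. The table name heart_disease and the sample rows are also assumptions.

```python
import sqlite3

# Hypothetical rows (age, gender, cp, num) for illustration only.
SAMPLE_ROWS = [(29, 1, 2, 0), (45, 0, 3, 0), (54, 1, 4, 1),
               (63, 1, 1, 0), (67, 0, 4, 1), (58, 1, 4, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE heart_disease (age INTEGER, gender INTEGER, cp INTEGER, num INTEGER)")
conn.executemany("INSERT INTO heart_disease VALUES (?, ?, ?, ?)", SAMPLE_ROWS)

# (age / 10) * 10 uses integer division, so 67 falls into the group starting at 60.
# Any non-zero num is treated as disease present; in this lab's description num is 0 or 1.
AGE_GROUP_SQL = """
SELECT (age / 10) * 10 AS age_group_start,
       SUM(CASE WHEN num = 0 THEN 1 ELSE 0 END) AS no_disease,
       SUM(CASE WHEN num > 0 THEN 1 ELSE 0 END) AS disease,
       COUNT(*) AS total
FROM heart_disease
GROUP BY age_group_start
ORDER BY age_group_start;
"""

age_group_rows = conn.execute(AGE_GROUP_SQL).fetchall()
for start, no_disease, disease, total in age_group_rows:
    print(f"{start}-{start + 9}: no disease={no_disease}, disease={disease}, total={total}")
```

The integer-division trick relies on age being stored as an integer. If your SQL system stores it as a decimal, cast it first or use a floor function.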


Try it yourself

Write clear and descriptive prompts to leverage the full potential of ChatGPT.

Try to generate queries for the data set for the following prompts:

Cholesterol range:

Find the range of cholesterol levels among patients (minimum, maximum).

Age range and gender analysis:

Determine the age range (youngest and oldest) for male and female patients separately.

Maximum heart rate by age group:

Find the maximum heart rate achieved during exercise for different age groups (e.g., 30-40, 40-50,
etc.).

Percentage of patients with high blood sugar:

Calculate the percentage of patients with fasting blood sugar greater than 120 mg/dl.

Ratio of patients with resting electrocardiographic abnormality:

Find the ratio of patients with abnormal resting electrocardiographic results to those with normal
results.

Number of patients with a reversible defect:

Count the number of patients with a reversible defect (thal = 7) detected by thallium stress testing.

Average age of patients with chest pain:

Calculate the average age of patients who experienced chest pain during diagnosis.

Distribution of patients by the number of major vessels:

Investigate the distribution of patients based on the number of major vessels colored by fluoroscopy.
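
When you review the queries generated for the percentage and ratio prompts above, check how they handle division. The sketch below is one possible formulation, not the expected answer: multiplying by 100.0 or 1.0 forces decimal arithmetic, and NULLIF avoids a division-by-zero error on systems that raise one. The table name heart_disease and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical rows (fbs, restecg) for illustration only.
SAMPLE_ROWS = [(1, 2), (0, 0), (0, 1), (0, 0), (1, 0), (0, 2), (0, 0), (0, 0)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE heart_disease (fbs INTEGER, restecg INTEGER)")
conn.executemany("INSERT INTO heart_disease VALUES (?, ?)", SAMPLE_ROWS)

# fbs is 1 when fasting blood sugar > 120 mg/dl, so SUM(fbs) counts those patients.
# restecg = 0 is normal; values 1 and 2 are treated as abnormal here.
RATIO_SQL = """
SELECT 100.0 * SUM(fbs) / COUNT(*) AS pct_high_fbs,
       1.0 * SUM(CASE WHEN restecg <> 0 THEN 1 ELSE 0 END)
           / NULLIF(SUM(CASE WHEN restecg = 0 THEN 1 ELSE 0 END), 0) AS abnormal_to_normal
FROM heart_disease;
"""

pct_high_fbs, abnormal_to_normal = conn.execute(RATIO_SQL).fetchone()
print(f"high fasting blood sugar: {pct_high_fbs:.1f}%")
print(f"abnormal-to-normal ECG ratio: {abnormal_to_normal:.2f}")
```

Without the decimal literal, some SQL systems perform integer division and would report the percentage as a truncated whole number.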

Summary

Congratulations on completing the hands-on lab Generating SQL Queries with AI.

In this lab, you've leveraged ChatGPT, an AI chatbot, to create tailored SQL queries for
efficiently extracting insights from large data sets. ChatGPT assists BI analysts in generating SQL
queries for various business intelligence tasks, enhancing their ability to retrieve data quickly and
accurately.

11 of 11 1/3/2025, 12:05 PM
