PDF - 1000+ SQL Interview Questions & Answers v2
Warm Regards,
N.H.
Founder, Zero Analyst
1000+ SQL Interview Questions & Answers | By Zero Analyst
Introductions
Copyright Information
© 2025 Zero Analyst (N.H.)
All rights reserved. No part of this eBook may be reproduced, distributed, or
transmitted in any form or by any means, including photocopying, recording, or other
electronic or mechanical methods, without the prior written permission of the author,
except in the case of brief quotations embodied in critical reviews and certain other
non-commercial uses permitted by copyright law.
ISBN: 9798306737812
Imprint: Independently published
Dedication
To all aspiring Data Analysts, Data Engineers, Business Analysts, SQL Developers,
and tech enthusiasts working tirelessly to secure their dream job—this book is for
you.
Acknowledgments
• Data Analyst
• Data Engineer
• Business Analyst
• SQL Developer
• Full Stack Developer
• Cloud Data Engineer
What to Expect
• 1000+ SQL Interview Questions & Answers: Covering fundamental,
intermediate, and advanced levels.
• Detailed Explanations: Understand not just the “how” but also the “why”
behind SQL concepts.
• Practical Scenarios: Real-world problems and solutions to bridge the gap
between theory and practice.
• Interview Tips: Insights to help you stand out in interviews and impress
hiring managers.
1. Start with Basics: If you’re new to SQL, go through the foundational questions and
explanations.
2. Practice Hands-On: For intermediate and advanced questions, implement solutions
in your preferred SQL environment.
3. Mock Interviews: Use the questions to simulate interview scenarios and improve
your confidence.
4. Revisit Regularly: SQL is a skill that sharpens with practice—make this book your
go-to reference.
5. Utilize the eBook: Access the eBook for interactive datasets and solutions, making
your practice more engaging and practical.
Index
Introduction
GitHub Repository - 🛢
Learning:
• To retrieve all rows from the table, we use the SELECT statement.
• The * wildcard is used to select all columns from the table.
• We don’t need to filter any rows, so no WHERE clause is needed.
• The SELECT * query will return every row and every column from the
students table.
Answer:
SELECT * FROM students;
This query will retrieve all the data from the students table. The result will
include all five rows and four columns: student_id, first_name,
last_name, and age.
o Q.2
Learning:
Answer:
SELECT *
FROM employees
WHERE salary > 4000;
Explanation:
• This query will return all employees whose salary is greater than 4000.
The result will include the employee details (ID, name, and salary) for
those who meet the salary condition.
o Q.3
Learning:
Answer:
SELECT *
FROM books
WHERE published_year = 2023;
Explanation:
• This query retrieves all the books from the books table that were
published in the year 2023.
• The WHERE clause ensures that only the rows with the published_year
equal to 2023 are returned.
• The result will include the book_id, title, author, and published_year
for each book that matches the condition.
o Q.4
Question 4:
Question: Count the number of products in the products table.
);
Learning:
• To count the total number of rows in a table, we can use the COUNT()
aggregate function.
• The COUNT() function will return the number of rows in a table, based
on a specified column or all rows.
• If we use COUNT(*), it counts all rows regardless of any column’s
value, which is ideal when counting the total number of products in
this case.
Answer:
SELECT COUNT(*) AS total_products
FROM products;
Explanation:
• This query counts the total number of rows in the products table.
• The COUNT(*) function returns the total number of products in the
table, regardless of the product details.
• The result will be a single number showing the total number of
products, which in this case should return 5 since there are 5 rows in
the table.
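As a small, runnable illustration of the difference between COUNT(*) and COUNT(column), the sketch below uses Python's built-in sqlite3 module with an in-memory database. The price column and the sample rows are assumptions made for this demonstration; they are not the book's dataset.

```python
import sqlite3

# Hypothetical products table; one row has a NULL price on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    price        REAL
);
INSERT INTO products VALUES
    (1, 'Pen', 10.0), (2, 'Notebook', 45.0), (3, 'Eraser', NULL),
    (4, 'Ruler', 20.0), (5, 'Marker', 30.0);
""")

# COUNT(*) counts rows; COUNT(price) skips rows where price is NULL.
total_rows, priced_rows = conn.execute(
    "SELECT COUNT(*), COUNT(price) FROM products"
).fetchone()
print(total_rows, priced_rows)  # 5 4
```

When the goal is simply the number of rows, COUNT(*) is usually the safer choice because NULLs in a column cannot change the answer.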
o Q.5
Question 5:
Question: Find all orders placed by the customer with customer_id = 1.
product_name VARCHAR(50),
order_date DATE
);
Learning:
Answer:
SELECT *
FROM orders
WHERE customer_id = 1;
Explanation:
o Q.6
Learning:
Answer:
SELECT *
FROM cities;
Explanation:
o Q.7
Learning:
• To filter customers who are older than a specific age, we use the WHERE
clause with a condition.
• The > (greater than) operator helps us filter rows where the age is
greater than 30.
• The query will return all customers who satisfy this condition (i.e.,
customers whose age is greater than 30).
Answer:
SELECT *
FROM customers
WHERE age > 30;
Explanation:
o Q.8
Learning:
Answer:
SELECT *
FROM animals
WHERE species = 'Dog';
Explanation:
o Q.9
Learning:
Answer:
SELECT COUNT(*) AS total_movies
FROM movies;
Explanation:
o Q.10
Learning:
• The > (greater than) operator allows us to retrieve rows where the
amount is greater than 100.
• This query will return all columns (transaction_id, customer_id,
amount, transaction_date) for the transactions where the amount is
greater than 100.
Answer:
SELECT *
FROM transactions
WHERE amount > 100;
Explanation:
o Q.11
Learning:
Answer:
SELECT *
FROM employees
WHERE company = 'TCS';
Explanation:
o Q.12
Learning:
• To filter products based on both price and company, we use the WHERE
clause with multiple conditions.
Answer:
SELECT *
FROM products
WHERE company = 'Flipkart'
AND price > 1000;
Explanation:
o Q.13
Learning:
Answer:
SELECT COUNT(*) AS ola_ride_count
FROM rides
WHERE company = 'Ola'
AND ride_date BETWEEN '2024-12-01' AND '2024-12-31';
Explanation:
• The SELECT COUNT(*) function counts the total number of rows in the
rides table that meet the specified conditions.
• The WHERE company = 'Ola' condition filters for rides taken with the
company 'Ola'.
• The AND ride_date BETWEEN '2024-12-01' AND '2024-12-31'
condition ensures that only rides taken in December 2024 are counted.
• The result will be 3, as there are three rides taken with 'Ola' in
December 2024 (on the 1st, 5th, and 10th).
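The date filter can be tried out with Python's built-in sqlite3 module. This is only a sketch: the rides rows below are hypothetical (chosen so that three Ola rides fall in December 2024), and SQLite stores the dates as ISO-formatted text, which compares correctly as long as the YYYY-MM-DD format is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rides (ride_id INTEGER PRIMARY KEY, company TEXT, ride_date TEXT);
INSERT INTO rides VALUES
    (1, 'Ola',  '2024-11-30'),
    (2, 'Ola',  '2024-12-01'),
    (3, 'Uber', '2024-12-03'),
    (4, 'Ola',  '2024-12-05'),
    (5, 'Ola',  '2024-12-10');
""")

# BETWEEN is inclusive on both ends, so the ride on 2024-12-01 is counted.
between_count = conn.execute("""
    SELECT COUNT(*) FROM rides
    WHERE company = 'Ola'
      AND ride_date BETWEEN '2024-12-01' AND '2024-12-31'
""").fetchone()[0]

# Equivalent half-open range: start inclusive, end exclusive.
half_open_count = conn.execute("""
    SELECT COUNT(*) FROM rides
    WHERE company = 'Ola'
      AND ride_date >= '2024-12-01' AND ride_date < '2025-01-01'
""").fetchone()[0]

print(between_count, half_open_count)  # 3 3
```

The half-open form is often preferred when the column also stores a time of day, because an upper bound of '2024-12-31' can miss rides later on that day.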
o Q.14
Learning:
Answer:
SELECT DISTINCT city
FROM offices
WHERE company = 'Wipro';
Explanation:
• The SELECT DISTINCT city statement retrieves all unique cities from
the offices table.
• The WHERE company = 'Wipro' condition filters the rows to include
only those where the company is 'Wipro'.
• The result will return Hyderabad and Kolkata, as these are the cities
where Wipro has offices.
o Q.15
Learning:
Answer:
SELECT SUM(amount) AS total_revenue
FROM orders
WHERE company = 'Zomato';
Explanation:
o Q.16
Learning:
Answer:
SELECT *
FROM flights
WHERE airline = 'Indigo';
Explanation:
• The SELECT * statement retrieves all columns from the flights table.
• The WHERE airline = 'Indigo' condition filters the rows to include
only those where the airline is Indigo.
• The result will include Akshay, Vishal, and Kiran as these are the
customers who booked flights with Indigo.
o Q.17
Learning:
• The WHERE company = 'ITC' condition will filter the rows to return
only products where the company is ITC.
• This query will return all columns (product_id, product_name,
company, category) for ITC products.
Answer:
SELECT *
FROM products
WHERE company = 'ITC';
Explanation:
o Q.18
Find all Jio customers who recharged for more than 300.
Learning:
• To filter rows based on two conditions, we use the WHERE clause with
multiple conditions connected by AND.
• The first condition filters by company = 'Jio', and the second
condition filters by amount > 300.
• This query will return only Jio customers who have recharged for
amounts greater than 300.
Answer:
SELECT *
FROM recharges
WHERE company = 'Jio' AND amount > 300;
Explanation:
o Q.19
Learning:
• To count the number of rows that meet a certain condition, we use the
COUNT() function.
Answer:
SELECT COUNT(*) AS paytm_transactions_november
FROM transactions
WHERE company = 'Paytm'
AND transaction_date BETWEEN '2024-11-01' AND '2024-11-30';
Explanation:
• The COUNT(*) function counts the total number of rows that match the
given conditions.
• The WHERE company = 'Paytm' condition filters the rows to include
only Paytm transactions.
• The AND transaction_date BETWEEN '2024-11-01' AND '2024-11-30'
condition filters the transactions to include only those that
occurred in November 2024.
• The result will give the total number of Paytm transactions in
November 2024.
o Q.20
Learning:
Answer:
SELECT customer_name
FROM accounts
WHERE bank_name = 'SBI';
Explanation:
• Easy
o Q.21
Products Table:
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
supplier_id INT
);
Learning:
Answer:
SELECT DISTINCT s.supplier_name
FROM suppliers s
JOIN products p ON s.supplier_id = p.supplier_id
WHERE p.category = 'Electronics';
Explanation:
o Q.22
List all orders along with customer names from the orders and customers
tables.
Orders Table:
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT,
order_date DATE
);
Learning:
Answer:
SELECT c.first_name, c.last_name, o.order_date
FROM orders o
JOIN customers c ON o.customer_id = c.id;
Explanation:
o Q.23
(1, '2024-01-05'),
(1, '2024-01-05'),
(2, '2024-01-15'),
(3, '2023-02-10');
Learning:
Answer:
SELECT COUNT(DISTINCT order_id) AS unique_orders
FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-01-31';
Explanation:
o Q.24
Products Table:
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
supplier_id INT
);
Learning:
Answer:
SELECT DISTINCT s.supplier_name
FROM suppliers s
JOIN products p ON s.supplier_id = p.supplier_id
WHERE p.category = 'Electronics';
Explanation:
o Q.25
Learning:
Answer:
SELECT customer_id, COUNT(transaction_id) AS total_transactions
FROM transactions
GROUP BY customer_id;
Explanation:
o Q.26
Given the employee table with columns EMP_ID and SALARY, write an SQL
query to find all salaries greater than the average salary. Return EMP_ID and
SALARY.
Learning:
• To find salaries greater than the average salary, we will first calculate
the average salary using the AVG() function.
• Then, we will filter the employees whose salary is greater than the
calculated average.
• The HAVING clause filters groups on aggregates such as AVG(), but here
every row is compared with one overall average, so a scalar subquery in
the WHERE clause is the simpler choice.
Answer:
SELECT EMP_ID, SALARY
FROM employee
WHERE SALARY > (SELECT AVG(SALARY) FROM employee);
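The scalar-subquery approach can be checked with a short sketch using Python's built-in sqlite3 module. The four salary rows are hypothetical; with them the average is 4750, so only the two higher salaries should be returned. An ORDER BY is added here purely to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (EMP_ID INTEGER PRIMARY KEY, SALARY INTEGER);
INSERT INTO employee VALUES (1, 3000), (2, 5000), (3, 7000), (4, 4000);
""")

# The inner query returns a single value (the overall average),
# which the outer WHERE clause compares against every row.
above_average = conn.execute("""
    SELECT EMP_ID, SALARY
    FROM employee
    WHERE SALARY > (SELECT AVG(SALARY) FROM employee)
    ORDER BY EMP_ID
""").fetchall()

print(above_average)  # [(2, 5000), (3, 7000)]
```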
Explanation:
o Q.27
Learning:
Answer:
SELECT *
FROM courses;
Explanation:
• The SELECT * query retrieves all columns from the courses table,
which includes the course ID, course name, and duration.
• There is no need to filter the data as the question asks for all available
courses.
o Q.28
Learning:
• To retrieve banks established before the year 2000, we will use the
WHERE clause with a condition that filters the established_year.
Answer:
SELECT *
FROM public_sector_banks
WHERE established_year < 2000;
Explanation:
o Q.29
Products Table:
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100)
);
Learning:
• To find the total revenue for each product, we will use the SUM()
function, which calculates the total revenue for each product_id.
• We need to join the sales table with the products table on the
product_id to get the product name along with the total revenue.
• Use GROUP BY to aggregate the results by product_name.
Answer:
SELECT p.product_name, SUM(s.revenue) AS total_revenue
FROM sales s
JOIN products p ON s.product_id = p.product_id
GROUP BY p.product_name;
Explanation:
• JOIN: We are joining the sales table and the products table using the
product_id column so we can display the product names alongside
the revenue.
• SUM(s.revenue): This sums up the revenue for each product.
• GROUP BY p.product_name: This groups the results by the product
name so that we get the total revenue for each individual product.
Output Example:
product_name total_revenue
Smartphone 80000.00
Laptop 75000.00
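To see the JOIN, SUM, and GROUP BY steps working together, here is a minimal sketch with Python's built-in sqlite3 module. The sales table layout (sale_id, product_id, revenue) and its rows are assumptions, picked so that the totals match the output example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, revenue REAL);
INSERT INTO products VALUES (1, 'Smartphone'), (2, 'Laptop');
INSERT INTO sales VALUES (1, 1, 50000.0), (2, 1, 30000.0), (3, 2, 75000.0);
""")

# Join each sale to its product, then add up revenue per product name.
revenue_by_product = conn.execute("""
    SELECT p.product_name, SUM(s.revenue) AS total_revenue
    FROM sales s
    JOIN products p ON s.product_id = p.product_id
    GROUP BY p.product_name
    ORDER BY total_revenue DESC
""").fetchall()

print(revenue_by_product)  # [('Smartphone', 80000.0), ('Laptop', 75000.0)]
```

If two different products could share a name, grouping by p.product_id as well would keep their totals separate.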
o Q.30
Learning:
• To filter products with stock less than a certain value, we use the
WHERE clause.
• The < operator will help us filter products with stock less than 10.
• We can simply select the product_name and stock columns to display
the required results.
Answer:
SELECT product_name, stock
FROM products
WHERE stock < 10;
Explanation:
• WHERE stock < 10: This condition keeps only the products whose
stock is less than 10.
• The query will return the product_name and stock of all products that
meet this condition.
Output Example:
product_name stock
Smartphone 5
T-shirt 8
o Q.31
Learning:
Answer:
SELECT phone_name, brand
FROM mobile_phones;
Explanation:
• The query retrieves the phone_name and brand for all rows from the
mobile_phones table.
• No filter is applied, so it returns all records in the table.
Output Example:
phone_name brand
iPhone 14 Apple
OnePlus 9 OnePlus
Poco X3 Xiaomi
o Q.32
Write a SQL query to find the total number of employees in each company.
Employees Table:
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
employee_name VARCHAR(100),
company_id INT,
salary DECIMAL(10, 2),
FOREIGN KEY (company_id) REFERENCES companies(company_id)
);
Data Insertion:
INSERT INTO companies (company_id, company_name) VALUES
(1, 'TechCorp'),
(2, 'HealthInc'),
(3, 'FinanceSolutions'),
(4, 'EduGlobal'),
(5, 'RetailWorld');
Learning:
• To find the total number of employees for each company, you can use
the GROUP BY clause to group the employees by company.
• The COUNT() function will then count the number of employees for
each company.
Answer:
SELECT c.company_name, COUNT(e.employee_id) AS total_employees
FROM companies c
JOIN employees e ON c.company_id = e.company_id
GROUP BY c.company_name;
Explanation:
Output Example:
company_name total_employees
TechCorp 2
HealthInc 2
FinanceSolutions 1
EduGlobal 1
RetailWorld 1
o Q.33
List all Indian tech companies with more than 50,000 employees.
Data Insertion:
INSERT INTO tech_companies VALUES
(1, 'Infosys', 25000),
(2, 'TCS', 150000),
(3, 'Wipro', 20000);
Learning:
Answer:
SELECT company_name, employees
FROM tech_companies
WHERE employees > 50000;
Explanation:
Output Example:
company_name employees
TCS 150000
o Q.34
Data Insertion:
INSERT INTO clients VALUES
(1, 'ABC Corp', '9876543210'),
(2, 'XYZ Ltd.', '9123456780'),
(3, 'Tech Solutions', '8765432100'),
(4, 'Innovatech', '9988776655'),
(5, 'Alpha Industries', '9988123456');
Learning:
Answer:
SELECT client_name, contact_number
FROM clients;
Explanation:
Output Example:
client_name contact_number
ABC Corp 9876543210
XYZ Ltd. 9123456780
Tech Solutions 8765432100
Innovatech 9988776655
Alpha Industries 9988123456
o Q.35
Data Insertion:
INSERT INTO manufacturers VALUES
(1, 'Tata Motors', 'Electric Vehicle'),
(2, 'Mahindra', 'Diesel Vehicle'),
(3, 'Reva', 'Electric Vehicle');
Learning:
Answer:
SELECT manufacturer_name
FROM manufacturers
WHERE product_type = 'Electric Vehicle';
Explanation:
Output Example:
manufacturer_name
Tata Motors
Reva
o Q.36
Data Insertion:
INSERT INTO employees VALUES
Learning:
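Answer:
-- MySQL / PostgreSQL
SELECT emp_id, name, salary
FROM employees
ORDER BY salary DESC
LIMIT 3;
-- SQL Server
SELECT TOP 3 emp_id, name, salary
FROM employees
ORDER BY salary DESC;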
Explanation:
• SELECT emp_id, name, salary: This selects the emp_id, name, and
salary columns from the employees table.
• ORDER BY salary DESC: Orders the employees by salary in
descending order (highest to lowest).
• LIMIT 3: Restricts the results to the top 3 highest-paid employees (for
MySQL/PostgreSQL).
• TOP 3: In SQL Server, the TOP keyword is used to fetch the first 3
rows.
Output Example:
1 Amit 90000
2 Kavya 75000
3 Rahul 60000
This query will retrieve the top 3 highest-paid employees from the employees
table.
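The ORDER BY ... DESC with LIMIT pattern described in the explanation can be run as a quick sketch with Python's built-in sqlite3 module (SQLite uses the same LIMIT syntax as MySQL and PostgreSQL). The first three rows mirror the output example; employees 4 and 5 are hypothetical additions so that the limit actually cuts something off.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
INSERT INTO employees VALUES
    (1, 'Amit', 90000), (2, 'Kavya', 75000), (3, 'Rahul', 60000),
    (4, 'Sneha', 45000), (5, 'Vikram', 52000);
""")

# Sort from highest to lowest salary, then keep only the first three rows.
top_three = conn.execute("""
    SELECT emp_id, name, salary
    FROM employees
    ORDER BY salary DESC
    LIMIT 3
""").fetchall()

print(top_three)  # [(1, 'Amit', 90000), (2, 'Kavya', 75000), (3, 'Rahul', 60000)]
```

Note that LIMIT returns exactly three rows even when salaries tie at the cutoff; which tied row is kept is then not guaranteed unless a tie-breaking column is added to ORDER BY.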
o Q.37
Data Insertion:
INSERT INTO products VALUES
(1, 'Smartphone', 'Electronics'),
(2, 'T-shirt', 'Clothing'),
(3, 'Laptop', 'Electronics');
Learning:
Answer:
SELECT product_id, product_name, category
FROM products
WHERE category = 'Electronics';
Explanation:
Output Example:
1 Smartphone Electronics
3 Laptop Electronics
This query will return all the products that belong to the Electronics category
from the products table.
o Q.38
Employees Table:
CREATE TABLE employees (
employee_id INT,
employee_name VARCHAR(100),
department_id INT
);
Data Insertion:
INSERT INTO departments VALUES
(1, 'HR'),
(2, 'Finance'),
(3, 'IT'),
(4, 'Marketing');
Learning:
Answer:
SELECT e.employee_id, e.employee_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE d.department_name = 'IT';
Explanation:
Output Example:
employee_id employee_name
2 Nisha Patil
This query will return the list of employees who are working in the IT
department.
o Q.39
Orders Table:
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT
);
Data Insertion:
INSERT INTO customers VALUES
(1, 'Rajesh', 'Bangalore'),
(2, 'Aditi', 'Mumbai');
Learning:
Answer:
SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE c.city = 'Bangalore';
Explanation:
Output Example:
order_id customer_name
1 Rajesh
o Q.40
Retrieve the names of all employees who are also managers. In other words,
find employees who appear as managers in the manager_id column.
Data Insertion:
INSERT INTO employees (emp_id, name, manager_id)
VALUES
(1, 'John Doe', NULL),
(2, 'Jane Smith', 1),
(3, 'Alice Johnson', 1),
(4, 'Bob Brown', 3),
(5, 'Emily White', NULL),
(6, 'Michael Lee', 3),
(7, 'David Clark', NULL),
(8, 'Sarah Davis', 2),
(9, 'Kevin Wilson', 2),
(10, 'Laura Martinez', 4);
Learning:
Answer:
SELECT DISTINCT e.name
FROM employees e
WHERE e.emp_id IN (SELECT DISTINCT manager_id FROM employees WHERE
manager_id IS NOT NULL);
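The answer can be verified against the inserted rows with Python's built-in sqlite3 module. The sketch below also runs a self-join variant, which is an alternative formulation added here for comparison and not part of the original answer; the table definition is an assumption inferred from the INSERT statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
INSERT INTO employees (emp_id, name, manager_id) VALUES
    (1, 'John Doe', NULL), (2, 'Jane Smith', 1), (3, 'Alice Johnson', 1),
    (4, 'Bob Brown', 3), (5, 'Emily White', NULL), (6, 'Michael Lee', 3),
    (7, 'David Clark', NULL), (8, 'Sarah Davis', 2), (9, 'Kevin Wilson', 2),
    (10, 'Laura Martinez', 4);
""")

# Subquery form: keep employees whose id appears in the manager_id column.
managers_in = conn.execute("""
    SELECT DISTINCT e.name
    FROM employees e
    WHERE e.emp_id IN (SELECT manager_id FROM employees
                       WHERE manager_id IS NOT NULL)
    ORDER BY e.name
""").fetchall()

# Self-join form: match each employee row to the row of their manager.
managers_join = conn.execute("""
    SELECT DISTINCT m.name
    FROM employees e
    JOIN employees m ON e.manager_id = m.emp_id
    ORDER BY m.name
""").fetchall()

print(managers_in)
# [('Alice Johnson',), ('Bob Brown',), ('Jane Smith',), ('John Doe',)]
```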
Explanation:
• Medium
o Q.41
Find the total revenue generated by each product category for Flipkart.
Explanation:
-- Orders table
CREATE TABLE orders (
order_id INT PRIMARY KEY,
product_id INT,
quantity INT,
FOREIGN KEY (product_id) REFERENCES products(product_id)
);
Learnings:
• JOIN: We learn how to use JOIN to combine data from two related
tables based on a common column (product_id).
• GROUP BY: We understand how to group data by a column (in this
case, the category) to perform aggregate operations.
Solutions:
PostgreSQL Solution:
The solution works the same way in PostgreSQL because the query relies on
standard SQL functions (JOIN, SUM, GROUP BY), which are supported in both
PostgreSQL and MySQL.
-- PostgreSQL Solution
SELECT p.category,
SUM(p.price * o.quantity) AS total_revenue
FROM products p
JOIN orders o ON p.product_id = o.product_id
GROUP BY p.category;
MySQL Solution:
Similarly, the same solution will work in MySQL.
-- MySQL Solution
SELECT p.category,
SUM(p.price * o.quantity) AS total_revenue
FROM products p
JOIN orders o ON p.product_id = o.product_id
GROUP BY p.category;
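A quick way to sanity-check the revenue calculation is to run the same query on a tiny dataset with Python's built-in sqlite3 module. The products columns and all rows below are hypothetical; only the query shape comes from the solution above, with an ORDER BY added for a predictable result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT, price REAL);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER);
INSERT INTO products VALUES
    (1, 'Electronics', 500.0), (2, 'Electronics', 300.0), (3, 'Clothing', 100.0);
INSERT INTO orders VALUES (1, 1, 2), (2, 2, 1), (3, 3, 5), (4, 1, 1);
""")

# Revenue per order line is price * quantity; SUM adds the lines per category.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(p.price * o.quantity) AS total_revenue
    FROM products p
    JOIN orders o ON p.product_id = o.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(revenue_by_category)  # [('Clothing', 500.0), ('Electronics', 1800.0)]
```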
o Q.42
Explanation:
We need to find the employee with the highest salary in each department. A
subquery with MAX(salary) can help retrieve the highest salary for each
department, and then we can join this result with the original table to get the
employee details.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT e.department, e.name, e.salary
FROM employees e
WHERE e.salary = (
SELECT MAX(salary)
FROM employees
WHERE department = e.department
);
MySQL Solution:
-- MySQL Solution
SELECT e.department, e.name, e.salary
FROM employees e
WHERE e.salary = (
SELECT MAX(salary)
FROM employees
WHERE department = e.department
);
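The correlated subquery can be exercised with Python's built-in sqlite3 module. In this sketch the employees rows are hypothetical and include a deliberate tie in HR, to show that this approach returns every employee who shares the top salary in a department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
INSERT INTO employees VALUES
    ('Asha', 'IT', 90000), ('Ravi', 'IT', 70000),
    ('Meera', 'HR', 60000), ('Tarun', 'HR', 60000),
    ('Zoya', 'Sales', 50000);
""")

# For each outer row e, the inner query finds the maximum salary
# within e's own department.
top_paid = conn.execute("""
    SELECT e.department, e.name, e.salary
    FROM employees e
    WHERE e.salary = (
        SELECT MAX(salary)
        FROM employees
        WHERE department = e.department
    )
    ORDER BY e.department, e.name
""").fetchall()

print(top_paid)
# [('HR', 'Meera', 60000), ('HR', 'Tarun', 60000),
#  ('IT', 'Asha', 90000), ('Sales', 'Zoya', 50000)]
```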
o Q.43
List all customers who placed orders worth more than the average order value
on Swiggy.
Explanation:
We need to find customers who have placed orders with a total_amount
greater than the average order value. First, we calculate the average order
value, then we filter customers whose order values are above this average.
-- Orders table
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT c.name, o.total_amount
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.total_amount > (
SELECT AVG(total_amount)
FROM orders
);
MySQL Solution:
-- MySQL Solution
SELECT c.name, o.total_amount
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.total_amount > (
SELECT AVG(total_amount)
FROM orders
);
o Q.44
Explanation:
To solve this, we need to:
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT city
FROM employees
WHERE company = 'TCS'
GROUP BY city
HAVING COUNT(employee_id) > 3;
MySQL Solution:
-- MySQL Solution
SELECT city
FROM employees
WHERE company = 'TCS'
GROUP BY city
HAVING COUNT(employee_id) > 3;
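The order of operations (WHERE filters rows first, GROUP BY forms groups, HAVING filters groups) can be seen in a small sketch using Python's built-in sqlite3 module. The rows are hypothetical: Mumbai has seven employees in total but only three at TCS, so it should not appear in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, company TEXT, city TEXT)"
)
sample_rows = (
    [(i, "TCS", "Pune") for i in range(1, 5)]            # 4 TCS employees
    + [(i, "TCS", "Mumbai") for i in range(5, 8)]        # 3 TCS employees
    + [(i, "Infosys", "Mumbai") for i in range(8, 12)]   # 4 Infosys employees
)
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", sample_rows)

# WHERE removes non-TCS rows before grouping; HAVING then keeps
# only the cities whose remaining group has more than 3 rows.
tcs_cities = conn.execute("""
    SELECT city
    FROM employees
    WHERE company = 'TCS'
    GROUP BY city
    HAVING COUNT(employee_id) > 3
""").fetchall()

print(tcs_cities)  # [('Pune',)]
```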
o Q.45
List all sellers on Amazon who sold more than 5 different products.
Explanation:
We need to:
-- Products table
CREATE TABLE products (
product_id INT PRIMARY KEY,
seller_id INT,
name VARCHAR(50),
FOREIGN KEY (seller_id) REFERENCES sellers(seller_id)
);
Learnings:
• JOIN: Joining the sellers and products tables to link product sales
to sellers.
• COUNT(DISTINCT): Counting distinct products sold by each seller.
• HAVING: Filtering the results to only show sellers who sold more
than 5 distinct products.
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT s.name
FROM sellers s
JOIN products p ON s.seller_id = p.seller_id
GROUP BY s.seller_id, s.name
HAVING COUNT(DISTINCT p.product_id) > 5;
MySQL Solution:
-- MySQL Solution
SELECT s.name
FROM sellers s
JOIN products p ON s.seller_id = p.seller_id
GROUP BY s.seller_id, s.name
HAVING COUNT(DISTINCT p.product_id) > 5;
o Q.46
Explanation:
We need to identify which product has the highest total quantity ordered. To
do this, we can use GROUP BY to group the orders by product and then use
SUM(quantity) to calculate the total quantity for each product. Finally, we
can sort the results and pick the top product.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT product_name
FROM orders
GROUP BY product_name
ORDER BY SUM(quantity) DESC
LIMIT 1;
MySQL Solution:
-- MySQL Solution
SELECT product_name
FROM orders
GROUP BY product_name
ORDER BY SUM(quantity) DESC
LIMIT 1;
o Q.47
Find all customers who have accounts in both SBI and ICICI.
Explanation:
To solve this, we need to find customers who appear with both 'SBI' and
'ICICI' in the accounts table. This can be done using a JOIN or a GROUP BY
approach with a HAVING clause to ensure that each customer has accounts in
both banks.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT customer_id
FROM accounts
WHERE bank_name IN ('SBI', 'ICICI')
GROUP BY customer_id
HAVING COUNT(DISTINCT bank_name) = 2;
MySQL Solution:
-- MySQL Solution
SELECT customer_id
FROM accounts
WHERE bank_name IN ('SBI', 'ICICI')
GROUP BY customer_id
HAVING COUNT(DISTINCT bank_name) = 2;
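The GROUP BY with HAVING COUNT(DISTINCT ...) technique is easy to test on a few rows with Python's built-in sqlite3 module. The accounts rows below are hypothetical. Customer 102 holds two SBI accounts and no ICICI account, which shows why DISTINCT is needed in the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, customer_id INTEGER, bank_name TEXT);
INSERT INTO accounts VALUES
    (1, 101, 'SBI'),   (2, 101, 'ICICI'),
    (3, 102, 'SBI'),   (4, 102, 'SBI'),
    (5, 103, 'ICICI'), (6, 103, 'HDFC'),
    (7, 104, 'SBI'),   (8, 104, 'ICICI'), (9, 104, 'HDFC');
""")

# Keep only SBI/ICICI rows, then require both bank names per customer.
both_banks = conn.execute("""
    SELECT customer_id
    FROM accounts
    WHERE bank_name IN ('SBI', 'ICICI')
    GROUP BY customer_id
    HAVING COUNT(DISTINCT bank_name) = 2
    ORDER BY customer_id
""").fetchall()

print(both_banks)  # [(101,), (104,)]
```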
o Q.48
Explanation:
To find the employee(s) with the second-highest salary, we can use a subquery
to first identify the highest salary, then another query to find the maximum
salary that is less than the highest salary. Finally, we can use that result to
filter out the employee(s) with the second-highest salary.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT name, salary
FROM employees
WHERE salary = (
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees)
);
MySQL Solution:
-- MySQL Solution
SELECT name, salary
FROM employees
WHERE salary = (
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees)
);
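The nested MAX approach can be tried with Python's built-in sqlite3 module. The rows in this sketch are hypothetical and contain ties at both the highest and the second-highest salary, to show that the query still returns the correct second tier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT, salary INTEGER);
INSERT INTO employees VALUES
    ('Amit', 90000), ('Kavya', 90000),
    ('Rahul', 75000), ('Nisha', 75000),
    ('Dev', 60000);
""")

# Innermost query: highest salary (90000).
# Middle query: highest salary strictly below that (75000).
# Outer query: everyone earning exactly that amount.
second_highest = conn.execute("""
    SELECT name, salary
    FROM employees
    WHERE salary = (
        SELECT MAX(salary)
        FROM employees
        WHERE salary < (SELECT MAX(salary) FROM employees)
    )
    ORDER BY name
""").fetchall()

print(second_highest)  # [('Nisha', 75000), ('Rahul', 75000)]
```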
o Q.49
Find all movies released after 2015 with a rating higher than the average rating
of all movies.
Explanation:
To solve this:
1. First, calculate the average rating of all movies.
2. Then, filter movies that were released after 2015 and have a rating
higher than the average rating.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT name, release_year, rating
FROM movies
WHERE release_year > 2015
AND rating > (SELECT AVG(rating) FROM movies);
MySQL Solution:
-- MySQL Solution
SELECT name, release_year, rating
FROM movies
WHERE release_year > 2015
AND rating > (SELECT AVG(rating) FROM movies);
o Q.50
Find the total number of transactions done per day by Paytm, sorted in
descending order of the number of transactions.
Explanation:
We need to:
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT transaction_date, COUNT(transaction_id) AS total_transactions
FROM transactions
WHERE company = 'Paytm'
GROUP BY transaction_date
ORDER BY total_transactions DESC;
MySQL Solution:
-- MySQL Solution
SELECT transaction_date, COUNT(transaction_id) AS total_transactions
FROM transactions
WHERE company = 'Paytm'
GROUP BY transaction_date
ORDER BY total_transactions DESC;
o Q.51
Explanation:
To solve this:
3. Filter out companies whose rank is greater than 3 to get only the top 3
most profitable companies in each industry.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
WITH RankedCompanies AS (
SELECT name, industry, profit,
ROW_NUMBER() OVER (PARTITION BY industry ORDER BY profit DESC)
AS rank
FROM companies
)
SELECT name, industry, profit
FROM RankedCompanies
WHERE rank <= 3;
MySQL Solution:
-- MySQL Solution (requires MySQL 8.0+; RANK is a reserved word there, so
-- the alias is named profit_rank)
WITH RankedCompanies AS (
SELECT name, industry, profit,
ROW_NUMBER() OVER (PARTITION BY industry ORDER BY profit DESC)
AS profit_rank
FROM companies
)
SELECT name, industry, profit
FROM RankedCompanies
WHERE profit_rank <= 3;
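The top-N-per-group pattern with ROW_NUMBER() can be run locally with Python's built-in sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (the first release with window functions). The company names and profit figures are hypothetical, and the alias rn is used to avoid any keyword clash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (name TEXT, industry TEXT, profit INTEGER);
INSERT INTO companies VALUES
    ('AlphaSoft', 'Technology', 50), ('ByteWorks', 'Technology', 40),
    ('CodeNest', 'Technology', 30),  ('DataForge', 'Technology', 20),
    ('DailyMart', 'Retail', 15),     ('EasyShop', 'Retail', 10);
""")

# Number the rows inside each industry from most to least profitable,
# then keep the first three of every industry.
top_companies = conn.execute("""
    WITH ranked AS (
        SELECT name, industry, profit,
               ROW_NUMBER() OVER (PARTITION BY industry
                                  ORDER BY profit DESC) AS rn
        FROM companies
    )
    SELECT name, industry, profit
    FROM ranked
    WHERE rn <= 3
    ORDER BY industry, rn
""").fetchall()

for row in top_companies:
    print(row)
```

With this data, DataForge is dropped from Technology while both Retail companies are kept. If tied profits should share a position, DENSE_RANK() is a common substitute for ROW_NUMBER().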
o Q.52
Calculate the average revenue and profit for each sector and list sectors where
the average profit exceeds $10 billion.
Explanation:
To solve this:
2. Calculate the average revenue and average profit for each sector
using AVG().
3. Use a HAVING clause to filter the sectors where the average profit
exceeds $10 billion.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT industry,
AVG(revenue) AS avg_revenue,
AVG(profit) AS avg_profit
FROM companies
GROUP BY industry
HAVING AVG(profit) > 10000000000;
MySQL Solution:
-- MySQL Solution
SELECT industry,
AVG(revenue) AS avg_revenue,
AVG(profit) AS avg_profit
FROM companies
GROUP BY industry
HAVING AVG(profit) > 10000000000;
o Q.53
Find the company with the second-highest revenue in the Technology sector.
Explanation:
To solve this:
1. Use a subquery to find the company with the highest revenue in the
Technology sector.
2. Exclude this company and then use another query to find the maximum
revenue again to get the second-highest revenue in the same sector.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT name, revenue
FROM companies
WHERE industry = 'Technology'
AND revenue = (
SELECT MAX(revenue)
FROM companies
WHERE industry = 'Technology'
AND revenue < (SELECT MAX(revenue) FROM companies
WHERE industry = 'Technology')
);
MySQL Solution:
-- MySQL Solution
SELECT name, revenue
FROM companies
WHERE industry = 'Technology'
AND revenue = (
SELECT MAX(revenue)
FROM companies
WHERE industry = 'Technology'
AND revenue < (SELECT MAX(revenue) FROM companies
WHERE industry = 'Technology')
);
o Q.54
List all employees of Google who earn above the average salary of employees
in the Technology sector.
Explanation:
To solve this:
1. First, calculate the average salary of employees in the Technology sector.
2. Then, select the employees from Google whose salary is greater than
the calculated average salary.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT name, salary
FROM employees
WHERE company = 'Google'
AND salary > (
SELECT AVG(salary)
FROM employees
WHERE sector = 'Technology'
);
MySQL Solution:
-- MySQL Solution
SELECT name, salary
FROM employees
WHERE company = 'Google'
AND salary > (
SELECT AVG(salary)
FROM employees
WHERE sector = 'Technology'
);
o Q.55
Find all companies that generate more than 10% of the total revenue of their
respective industry.
Explanation:
To solve this:
2. Then, for each company, check if their revenue is more than 10% of
the total revenue of their respective industry.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT name, industry, revenue
FROM companies c
WHERE revenue > 0.1 * (
SELECT SUM(revenue)
FROM companies
WHERE industry = c.industry
);
MySQL Solution:
-- MySQL Solution
SELECT name, industry, revenue
FROM companies c
WHERE revenue > 0.1 * (
SELECT SUM(revenue)
FROM companies
WHERE industry = c.industry
);
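To see the correlated subquery at work, here is a small sketch using Python's sqlite3 module. The rows are hypothetical, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical rows; column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT, industry TEXT, revenue INTEGER)")
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?)",
    [
        ("Alpha", "Technology", 600),  # 60% of Technology
        ("Beta", "Technology", 350),   # 35% of Technology
        ("Gamma", "Technology", 50),   # 5% of Technology
        ("Delta", "Retail", 95),       # 95% of Retail
        ("Echo", "Retail", 5),         # 5% of Retail
    ],
)

# For every outer row, the subquery re-computes the total revenue
# of that row's own industry (c.industry).
big_players = conn.execute(
    """
    SELECT name, industry, revenue
    FROM companies c
    WHERE revenue > 0.1 * (
        SELECT SUM(revenue)
        FROM companies
        WHERE industry = c.industry
    )
    ORDER BY name
    """
).fetchall()
print(big_players)
```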
o Q.56
List all products sold by Amazon that generate more than 15% of Amazon's
total sales.
Explanation:
To solve this:
2. For each product sold by Amazon, check if its sales exceed 15% of the
total sales.
-- Sales table
CREATE TABLE sales (
sale_id INT PRIMARY KEY,
product_id INT,
quantity INT,
FOREIGN KEY (product_id) REFERENCES products(product_id)
);
Learnings:
• JOIN: Join the products and sales tables to get the sales details for
each product.
• SUM(): Calculate the total sales for each product and the overall total
sales for Amazon.
• Subquery: Use a subquery to calculate Amazon's total sales, and then
filter products whose total sales exceed 15% of this total.
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT p.name,
SUM(s.quantity * p.price) AS product_sales
FROM products p
JOIN sales s ON p.product_id = s.product_id
WHERE p.company = 'Amazon'
GROUP BY p.product_id, p.name
HAVING SUM(s.quantity * p.price) > 0.15 * (
SELECT SUM(s2.quantity * p2.price)
FROM products p2
JOIN sales s2 ON p2.product_id = s2.product_id
WHERE p2.company = 'Amazon'
);
MySQL Solution:
-- MySQL Solution
SELECT p.name,
SUM(s.quantity * p.price) AS product_sales
FROM products p
JOIN sales s ON p.product_id = s.product_id
WHERE p.company = 'Amazon'
GROUP BY p.product_id, p.name
HAVING SUM(s.quantity * p.price) > 0.15 * (
SELECT SUM(s2.quantity * p2.price)
FROM products p2
JOIN sales s2 ON p2.product_id = s2.product_id
WHERE p2.company = 'Amazon'
);
o Q.57
Find the total number of employees working in each sector and list sectors
with more than 1 million employees.
Explanation:
To solve this:
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT sector, COUNT(employee_id) AS total_employees
FROM employees
GROUP BY sector
HAVING COUNT(employee_id) > 1000000;
MySQL Solution:
-- MySQL Solution
SELECT sector, COUNT(employee_id) AS total_employees
FROM employees
GROUP BY sector
HAVING COUNT(employee_id) > 1000000;
o Q.58
Explanation:
To solve this:
3. Use ORDER BY to sort the companies by this ratio and then select the
top one.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT name, revenue, employees, (employees / revenue) AS employee_to_revenue_ratio
FROM companies
ORDER BY employee_to_revenue_ratio DESC
LIMIT 1;
MySQL Solution:
-- MySQL Solution
SELECT name, revenue, employees, (employees / revenue) AS employee_to_revenue_ratio
FROM companies
ORDER BY employee_to_revenue_ratio DESC
LIMIT 1;
o Q.59
Find the total sales for the top 5 performing products of Apple.
Explanation:
To solve this:
-- Sales table
CREATE TABLE sales (
sale_id INT PRIMARY KEY,
product_id INT,
quantity INT,
FOREIGN KEY (product_id) REFERENCES products(product_id)
);
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT p.name,
SUM(s.quantity * p.price) AS total_sales
FROM products p
JOIN sales s ON p.product_id = s.product_id
WHERE p.company = 'Apple'
GROUP BY p.product_id, p.name
ORDER BY total_sales DESC
LIMIT 5;
MySQL Solution:
-- MySQL Solution
SELECT p.name,
SUM(s.quantity * p.price) AS total_sales
FROM products p
JOIN sales s ON p.product_id = s.product_id
WHERE p.company = 'Apple'
GROUP BY p.product_id, p.name
ORDER BY total_sales DESC
LIMIT 5;
o Q.60
List all industries where at least 3 companies have profits above the industry
average.
Explanation:
To solve this:
3. Count the companies per industry: For each industry, count how
many companies have profits greater than the industry average.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
WITH IndustryAvgProfit AS (
SELECT industry, AVG(profit) AS avg_profit
FROM companies
GROUP BY industry
)
SELECT c.industry
FROM companies c
JOIN IndustryAvgProfit iap ON c.industry = iap.industry
WHERE c.profit > iap.avg_profit
GROUP BY c.industry
HAVING COUNT(c.company_id) >= 3;
MySQL Solution:
-- MySQL Solution
WITH IndustryAvgProfit AS (
SELECT industry, AVG(profit) AS avg_profit
FROM companies
GROUP BY industry
)
SELECT c.industry
FROM companies c
JOIN IndustryAvgProfit iap ON c.industry = iap.industry
WHERE c.profit > iap.avg_profit
GROUP BY c.industry
HAVING COUNT(c.company_id) >= 3;
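The CTE-plus-HAVING pattern can be checked on a tiny data set. This sketch uses Python's sqlite3 module with hypothetical rows; the column names are taken from the query above and the values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies ("
    "company_id INTEGER PRIMARY KEY, name TEXT, industry TEXT, profit INTEGER)"
)
# Hypothetical data: Technology averages 63, Retail averages 20.
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?, ?)",
    [
        (1, "T1", "Technology", 10),
        (2, "T2", "Technology", 20),
        (3, "T3", "Technology", 90),
        (4, "T4", "Technology", 95),
        (5, "T5", "Technology", 100),
        (6, "R1", "Retail", 10),
        (7, "R2", "Retail", 20),
        (8, "R3", "Retail", 30),
    ],
)

# Step 1 (CTE): average profit per industry.
# Step 2: keep companies above their industry average.
# Step 3: keep industries with at least 3 such companies.
industries = conn.execute(
    """
    WITH IndustryAvgProfit AS (
        SELECT industry, AVG(profit) AS avg_profit
        FROM companies
        GROUP BY industry
    )
    SELECT c.industry
    FROM companies c
    JOIN IndustryAvgProfit iap ON c.industry = iap.industry
    WHERE c.profit > iap.avg_profit
    GROUP BY c.industry
    HAVING COUNT(c.company_id) >= 3
    """
).fetchall()
print(industries)
```

Technology qualifies because three of its companies (90, 95, 100) beat the average of 63; Retail has only one company above its average of 20.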
o Q.61
Find the year with the highest number of new patents filed by Microsoft.
Explanation:
To solve this:
1. Group patents by year: First, group the patents by filing year for
Microsoft.
2. Count patents per year: For each year, count the number of patents
filed.
3. Identify the year with the maximum count: Use the ORDER BY clause
to order the years based on the count of patents in descending order
and use LIMIT 1 to get the year with the highest number.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT filing_year, COUNT(*) AS num_patents
FROM patents
WHERE company = 'Microsoft'
GROUP BY filing_year
ORDER BY num_patents DESC
LIMIT 1;
MySQL Solution:
-- MySQL Solution
SELECT filing_year, COUNT(*) AS num_patents
FROM patents
WHERE company = 'Microsoft'
GROUP BY filing_year
ORDER BY num_patents DESC
LIMIT 1;
o Q.62
List all companies whose profit margin (profit/revenue) exceeds the average
margin across all companies.
Explanation:
To solve this:
1. Calculate the profit margin: For each company, calculate the profit
margin as the ratio of profit to revenue.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
WITH AverageMargin AS (
SELECT AVG(profit / revenue) AS avg_margin
FROM companies
)
SELECT name
FROM companies
WHERE (profit / revenue) > (SELECT avg_margin FROM AverageMargin);
MySQL Solution:
-- MySQL Solution
WITH AverageMargin AS (
SELECT AVG(profit / revenue) AS avg_margin
FROM companies
)
SELECT name
FROM companies
WHERE (profit / revenue) > (SELECT avg_margin FROM AverageMargin);
o Q.63
Identify the city where Tesla has the maximum number of sales.
Explanation:
To solve this:
1. Group sales by city: First, group the sales data by city for Tesla.
2. Sum units sold per city: Calculate the total number of units sold in
each city.
3. Identify the city with the maximum sales: Use ORDER BY to sort the
cities based on total units sold in descending order and limit the result
to the top city.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT city
FROM sales
WHERE company = 'Tesla'
GROUP BY city
ORDER BY SUM(units_sold) DESC
LIMIT 1;
MySQL Solution:
-- MySQL Solution
SELECT city
FROM sales
WHERE company = 'Tesla'
GROUP BY city
ORDER BY SUM(units_sold) DESC
LIMIT 1;
o Q.64
List all employees who earn above the 90th percentile in their company.
Explanation:
To solve this:
);
-- Sample data for employees with their respective companies and salaries
INSERT INTO employees (employee_id, name, company, salary)
VALUES
(1, 'Alice', 'Apple', 200000.00),
(2, 'Bob', 'Apple', 180000.00),
(3, 'Charlie', 'Apple', 150000.00),
(4, 'Dave', 'Apple', 250000.00),
(5, 'Eve', 'Google', 220000.00),
(6, 'Frank', 'Google', 190000.00),
(7, 'Grace', 'Google', 170000.00),
(8, 'Hank', 'Google', 210000.00);
Learnings:
Solutions:
PostgreSQL Solution:
PostgreSQL has built-in support for percentile calculation using the
ordered-set aggregate PERCENTILE_CONT(). It cannot be combined with OVER, so
compute the threshold per company with GROUP BY and join it back:
-- PostgreSQL Solution
WITH Percentiles AS (
SELECT
company,
PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY salary) AS percentile_90
FROM employees
GROUP BY company
)
SELECT e.name, e.company, e.salary
FROM employees e
JOIN Percentiles p ON e.company = p.company
WHERE e.salary > p.percentile_90;
MySQL Solution:
MySQL doesn’t have PERCENTILE_CONT(), but in MySQL 8.0+ you can approximate
it with a window function such as PERCENT_RANK(). The cutoff may differ
slightly from the interpolated PERCENTILE_CONT() value:
-- MySQL Solution (8.0+)
SELECT name, company, salary
FROM (
SELECT
e.name,
e.company,
e.salary,
PERCENT_RANK() OVER (PARTITION BY e.company ORDER BY e.salary) AS percentile_rank
FROM employees e
) AS RankedEmployees
WHERE percentile_rank > 0.9;
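As a rough cross-check, the sample rows above can be loaded into SQLite through Python's sqlite3 module. SQLite has no PERCENTILE_CONT(), so this sketch approximates the 90th-percentile cutoff with PERCENT_RANK(), which needs SQLite 3.25 or later. The table definition is an assumption based on the INSERT statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout, inferred from the INSERT statement in the sample data.
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, company TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", "Apple", 200000.00),
        (2, "Bob", "Apple", 180000.00),
        (3, "Charlie", "Apple", 150000.00),
        (4, "Dave", "Apple", 250000.00),
        (5, "Eve", "Google", 220000.00),
        (6, "Frank", "Google", 190000.00),
        (7, "Grace", "Google", 170000.00),
        (8, "Hank", "Google", 210000.00),
    ],
)

# PERCENT_RANK() is (rank - 1) / (rows in partition - 1), so with four
# employees per company the values are 0, 1/3, 2/3 and 1.
top_earners = conn.execute(
    """
    SELECT name, company, salary
    FROM (
        SELECT name, company, salary,
               PERCENT_RANK() OVER (PARTITION BY company ORDER BY salary) AS pct
        FROM employees
    ) AS ranked
    WHERE pct > 0.9
    ORDER BY company
    """
).fetchall()
print(top_earners)
```

With only four rows per company, just the top earner in each company clears the 0.9 cutoff.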
o Q.65
Rank the top 3 industries by total profit and list the companies contributing to
those profits.
Explanation:
To solve this:
2. Sum profits: Calculate the total profit for each industry by summing
the profits of all companies in the industry.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
WITH IndustryProfit AS (
SELECT industry, SUM(profit) AS total_profit
FROM companies
GROUP BY industry
ORDER BY total_profit DESC
LIMIT 3
)
SELECT c.industry, c.name, c.profit
FROM companies c
JOIN IndustryProfit ip ON c.industry = ip.industry
ORDER BY ip.total_profit DESC, c.profit DESC;
MySQL Solution:
-- MySQL Solution
WITH IndustryProfit AS (
SELECT industry, SUM(profit) AS total_profit
FROM companies
GROUP BY industry
ORDER BY total_profit DESC
LIMIT 3
)
SELECT c.industry, c.name, c.profit
FROM companies c
JOIN IndustryProfit ip ON c.industry = ip.industry
ORDER BY ip.total_profit DESC, c.profit DESC;
o Q.66
Find the company that has the highest revenue per employee in the Retail
sector.
Explanation:
To solve this:
3. Identify the company with the highest revenue per employee: Use
ORDER BY to sort the companies based on the revenue per employee in
descending order and select the top company.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT name, (revenue / employees) AS revenue_per_employee
FROM companies
WHERE sector = 'Retail'
ORDER BY revenue_per_employee DESC
LIMIT 1;
MySQL Solution:
-- MySQL Solution
SELECT name, (revenue / employees) AS revenue_per_employee
FROM companies
WHERE sector = 'Retail'
ORDER BY revenue_per_employee DESC
LIMIT 1;
o Q.67
Identify the quarter in which Apple generated its highest revenue for 2024.
Explanation:
To solve this:
1. Filter the data for Apple: We are only interested in Apple's revenue.
2. Find the maximum revenue: Identify the highest revenue value for
Apple across the quarters.
Learnings:
Solutions:
PostgreSQL Solution:
-- PostgreSQL Solution
SELECT quarter, revenue
FROM quarterly_revenue
WHERE company = 'Apple'
ORDER BY revenue DESC
LIMIT 1;
MySQL Solution:
-- MySQL Solution
SELECT quarter, revenue
FROM quarterly_revenue
WHERE company = 'Apple'
ORDER BY revenue DESC
LIMIT 1;
o Q.68
Identify products from Amazon that had declining sales over the last 3
quarters.
Explanation:
To solve this:
Learnings:
• Self Join: You can join the table to itself to compare current quarter's
sales with previous quarters.
• Window Function (PostgreSQL): LAG() can help compare each
product's revenue with the previous quarter.
• Sales Decline: To detect a decline, check if the revenue in each quarter
is less than the previous quarter's revenue.
Solutions:
PostgreSQL Solution:
PostgreSQL has the LAG() function, which allows you to access the value of a
previous row within a window.
-- PostgreSQL Solution
WITH RevenueComparison AS (
SELECT
company,
product_name,
quarter,
revenue,
LAG(revenue, 1) OVER (PARTITION BY company, product_name ORDER BY
quarter) AS prev_quarter_revenue,
LAG(revenue, 2) OVER (PARTITION BY company, product_name ORDER BY
quarter) AS prev_quarter2_revenue
FROM quarterly_revenue
WHERE company = 'Amazon'
)
SELECT
company,
product_name
FROM RevenueComparison
WHERE revenue < prev_quarter_revenue AND prev_quarter_revenue < prev_quarter2_revenue
GROUP BY company, product_name;
MySQL Solution:
In MySQL, you can use LAG() or JOIN to compare revenue for each product
across quarters.
-- MySQL Solution (Using LAG function in MySQL 8.0+)
WITH RevenueComparison AS (
SELECT
company,
product_name,
quarter,
revenue,
LAG(revenue, 1) OVER (PARTITION BY company, product_name ORDER BY
quarter) AS prev_quarter_revenue,
LAG(revenue, 2) OVER (PARTITION BY company, product_name ORDER BY
quarter) AS prev_quarter2_revenue
FROM quarterly_revenue
WHERE company = 'Amazon'
)
SELECT
company,
product_name
FROM RevenueComparison
WHERE revenue < prev_quarter_revenue AND prev_quarter_revenue < prev_quarter2_revenue
GROUP BY company, product_name;
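The LAG() comparison can be exercised with a short sketch in Python's sqlite3 module (window functions need SQLite 3.25 or later). The quarterly_revenue rows and the 'YYYY-Qn' quarter labels are hypothetical; the labels are chosen so that text ordering matches chronological ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quarterly_revenue ("
    "company TEXT, product_name TEXT, quarter TEXT, revenue INTEGER)"
)
# Hypothetical data: Echo declines every quarter, Kindle does not.
conn.executemany(
    "INSERT INTO quarterly_revenue VALUES (?, ?, ?, ?)",
    [
        ("Amazon", "Echo", "2024-Q1", 300),
        ("Amazon", "Echo", "2024-Q2", 200),
        ("Amazon", "Echo", "2024-Q3", 100),
        ("Amazon", "Kindle", "2024-Q1", 100),
        ("Amazon", "Kindle", "2024-Q2", 150),
        ("Amazon", "Kindle", "2024-Q3", 120),
    ],
)

# LAG(revenue, n) reads the value n rows earlier within the same
# company/product partition; it is NULL when no such row exists.
declining = conn.execute(
    """
    WITH RevenueComparison AS (
        SELECT company, product_name, quarter, revenue,
               LAG(revenue, 1) OVER (PARTITION BY company, product_name
                                     ORDER BY quarter) AS prev_quarter_revenue,
               LAG(revenue, 2) OVER (PARTITION BY company, product_name
                                     ORDER BY quarter) AS prev_quarter2_revenue
        FROM quarterly_revenue
        WHERE company = 'Amazon'
    )
    SELECT company, product_name
    FROM RevenueComparison
    WHERE revenue < prev_quarter_revenue
      AND prev_quarter_revenue < prev_quarter2_revenue
    GROUP BY company, product_name
    """
).fetchall()
print(declining)
```

Rows whose LAG() values are NULL (the first two quarters of each product) fail the comparison and drop out, so only a full run of three falling quarters matches.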
Explanation:
o Q.69
Find the total revenue and profit for each company for the last 5 years, sorted
by profit in descending order.
Explanation:
This query involves calculating the total revenue and profit for each company
over a period of time (the last 5 years). The goal is to sort the results by profit
in descending order. The data provided is assumed to be for quarterly revenue,
and we will also include profit calculation (assumed to be 20% of revenue).
SQL Statement:
SELECT
company,
SUM(revenue) AS total_revenue,
SUM(profit) AS total_profit
FROM
quarterly_revenue
GROUP BY
company
ORDER BY
total_profit DESC;
Learnings:
Solutions:
o Q.70
List all companies whose revenue grew by more than 10% year-over-year
consistently for the past 3 years.
Explanation:
To identify companies whose revenue grew by more than 10% year-over-year
for the past 3 years, we need to calculate the year-over-year growth for each
company. We then check for those that consistently showed a growth rate
greater than 10% for each of the past three years.
Assumptions:
1. The data is broken down into quarters. For simplicity, we assume there
is sufficient data for the past 3 years, ideally having 4 quarters per
year.
SQL Statement:
WITH year_over_year_growth AS (
SELECT
a.company,
a.quarter,
a.revenue AS current_revenue,
b.revenue AS previous_year_revenue,
((a.revenue - b.revenue) / b.revenue) * 100 AS growth_percentage
FROM
quarterly_revenue a
JOIN
quarterly_revenue b
ON
a.company = b.company
Explanation:
Learnings:
• Self Join: The query uses a self-join to compare each quarter with the
same quarter in the previous year.
• Growth Calculation: The growth is calculated as
(current_year_revenue - previous_year_revenue) /
previous_year_revenue * 100.
• Filtering Consistent Growth: The query ensures that all four quarters
of each year have a growth rate of more than 10%.
• Hard
o Q.71
Assumptions:
Data:
CREATE TABLE performance (
employee_id INT PRIMARY KEY,
employee_name VARCHAR(100),
department VARCHAR(50),
performance_score INT
);
SQL Statement:
SELECT
employee_id,
employee_name,
department,
performance_score,
RANK() OVER (PARTITION BY department ORDER BY performance_score DESC)
AS rank
FROM
performance;
Explanation:
Output (Example):
3 Emma Wilson IT 95 1
4 Liam Brown IT 89 2
• The Finance department has a ranking for each employee with the
highest performer (Alice Smith) ranked 1.
Notes:
• If two employees have the same score, they will receive the same rank,
and the next rank will be skipped (e.g., if two people tie for rank 1, the
next rank will be 3, not 2).
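The tie behaviour described in the note can be observed directly. This sketch uses Python's sqlite3 module (SQLite 3.25 or later) with hypothetical employees; the alias score_rank is an illustrative name chosen to avoid clashing with the RANK keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE performance ("
    "employee_id INTEGER PRIMARY KEY, employee_name TEXT, "
    "department TEXT, performance_score INTEGER)"
)
# Hypothetical rows: Ann and Ben tie on the top score.
conn.executemany(
    "INSERT INTO performance VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "IT", 90),
        (2, "Ben", "IT", 90),
        (3, "Cara", "IT", 85),
    ],
)

# RANK() gives tied rows the same rank and then skips ahead,
# so the ranks here are 1, 1, 3 (DENSE_RANK() would give 1, 1, 2).
ranks = conn.execute(
    """
    SELECT employee_name,
           RANK() OVER (PARTITION BY department
                        ORDER BY performance_score DESC) AS score_rank
    FROM performance
    ORDER BY employee_id
    """
).fetchall()
print(ranks)
```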
o Q.72
Write a SQL query to find the customer who made the most recent order.
Explanation:
To solve this problem, you need to find the customer who made the most
recent order. The solution requires joining the customers and orders tables,
then identifying the most recent order by selecting the maximum order_date.
The query will then return the customer who made that order.
Learnings:
Solutions:
PostgreSQL Solution:
SELECT c.customer_name
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date = (SELECT MAX(order_date) FROM orders);
MySQL Solution:
SELECT c.customer_name
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date = (SELECT MAX(order_date) FROM orders);
In both PostgreSQL and MySQL, this query retrieves the customer who placed
the most recent order by matching the order_date to the maximum
order_date found in the orders table.
o Q.73
List all customers who have made purchases in all product categories and the
total amount they spent.
Explanation:
To solve this, you need to:
You can use a GROUP BY clause to aggregate the total amount spent per
customer. Additionally, you need to ensure that the customer has purchased
products from all available categories, which can be verified by checking that
the count of distinct categories a customer has bought products from matches
the total number of product categories.
customer_id INT,
product_id INT
);
Learnings:
Solutions:
PostgreSQL Solution:
SELECT c.customer_name, SUM(p.price) AS total_spent
FROM customers c
JOIN purchases pu ON c.customer_id = pu.customer_id
JOIN products p ON pu.product_id = p.product_id
GROUP BY c.customer_name
HAVING COUNT(DISTINCT p.category_id) = (SELECT COUNT(DISTINCT category_id)
FROM products);
MySQL Solution:
SELECT c.customer_name, SUM(p.price) AS total_spent
FROM customers c
JOIN purchases pu ON c.customer_id = pu.customer_id
JOIN products p ON pu.product_id = p.product_id
GROUP BY c.customer_name
HAVING COUNT(DISTINCT p.category_id) = (SELECT COUNT(DISTINCT category_id)
FROM products);
Explanation of Solutions:
o Q.74
Identify the top 3 employees with the highest salaries within each department
at PwC.
Explanation:
To solve this, you need to identify the top 3 employees with the highest
salaries within each department. You can use a window function such as
ROW_NUMBER() to assign a rank to employees within each department, ordered
by salary in descending order. Then, filter out employees who have a rank
greater than 3.
Learnings:
Solutions:
PostgreSQL Solution:
WITH RankedEmployees AS (
SELECT employee_id, employee_name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC
) AS rank
FROM employees
)
SELECT employee_id, employee_name, department, salary
FROM RankedEmployees
WHERE rank <= 3;
MySQL Solution:
In MySQL 8.0 and later, you can use the same approach with window
functions. RANK is a reserved word in MySQL 8.0, so the alias is renamed to
salary_rank:
WITH RankedEmployees AS (
SELECT employee_id, employee_name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees
)
SELECT employee_id, employee_name, department, salary
FROM RankedEmployees
WHERE salary_rank <= 3;
Explanation of Solutions:
This query works in both PostgreSQL and MySQL (8.0+), which support
window functions.
o Q.75
Determine the total number of unique suppliers used by both Barclays and
HSBC in the same year.
Explanation:
To solve this problem, you need to find the suppliers who have contracts with
both Barclays and HSBC in the same year. This involves:
Learnings:
Solutions:
PostgreSQL Solution:
SELECT COUNT(DISTINCT supplier_id) AS unique_suppliers
FROM (
SELECT c.supplier_id
FROM contracts c
JOIN suppliers s ON c.supplier_id = s.supplier_id
WHERE c.company IN ('Barclays', 'HSBC')
GROUP BY c.supplier_id, EXTRACT(YEAR FROM c.contract_date)
HAVING COUNT(DISTINCT c.company) = 2
) AS shared_suppliers;
MySQL Solution:
SELECT COUNT(DISTINCT supplier_id) AS unique_suppliers
FROM (
SELECT c.supplier_id
FROM contracts c
JOIN suppliers s ON c.supplier_id = s.supplier_id
WHERE c.company IN ('Barclays', 'HSBC')
GROUP BY c.supplier_id, YEAR(c.contract_date)
HAVING COUNT(DISTINCT c.company) = 2
) AS shared_suppliers;
Explanation of Solutions:
This solution works for both PostgreSQL and MySQL, with slight variations
in date extraction functions.
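One way to arrive at a single total is to group by supplier and year first, then count distinct suppliers over that grouped result. The sketch below tries this with Python's sqlite3 module; the contracts rows are hypothetical, and SQLite's strftime('%Y', ...) stands in for EXTRACT(YEAR ...) or YEAR().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts (supplier_id INTEGER, company TEXT, contract_date TEXT)"
)
# Hypothetical data:
#   supplier 1 serves both banks in 2023 and again in 2024,
#   supplier 2 serves them in different years,
#   supplier 3 serves HSBC only.
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?)",
    [
        (1, "Barclays", "2023-01-10"),
        (1, "HSBC", "2023-06-01"),
        (1, "Barclays", "2024-02-01"),
        (1, "HSBC", "2024-03-01"),
        (2, "Barclays", "2023-05-05"),
        (2, "HSBC", "2024-05-05"),
        (3, "HSBC", "2023-07-07"),
    ],
)

# Inner query: one row per (supplier, year) in which both companies appear.
# Outer query: count each qualifying supplier once, however many years match.
unique_suppliers = conn.execute(
    """
    SELECT COUNT(DISTINCT supplier_id)
    FROM (
        SELECT supplier_id
        FROM contracts
        WHERE company IN ('Barclays', 'HSBC')
        GROUP BY supplier_id, strftime('%Y', contract_date)
        HAVING COUNT(DISTINCT company) = 2
    ) AS shared_suppliers
    """
).fetchone()[0]
print(unique_suppliers)
```

Supplier 1 qualifies in two different years but is still counted once, which is why the outer COUNT uses DISTINCT.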
o Q.76
Write a SQL query to find customers who have ordered more than once and
their total spending.
Explanation:
To solve this, you need to:
You can use GROUP BY to aggregate orders by customer, use HAVING to filter
customers who have ordered more than once, and SUM() to calculate the total
spending.
Learnings:
Solutions:
PostgreSQL Solution:
SELECT c.customer_name, SUM(o.amount) AS total_spending
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id
HAVING COUNT(o.order_id) > 1;
MySQL Solution:
SELECT c.customer_name, SUM(o.amount) AS total_spending
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id
HAVING COUNT(o.order_id) > 1;
Explanation of Solutions:
This query works identically in both PostgreSQL and MySQL. It returns the
customers who have placed multiple orders and the total amount they have
spent.
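A quick way to try the GROUP BY and HAVING combination is the sketch below, written with Python's sqlite3 module. The customers and orders rows are hypothetical; only the column names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
)
# Hypothetical data: Alice ordered twice, Bob once.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 50), (11, 1, 70), (12, 2, 40)],
)

# HAVING filters whole groups after aggregation, so customers with a
# single order are dropped before the result is returned.
repeat_customers = conn.execute(
    """
    SELECT c.customer_name, SUM(o.amount) AS total_spending
    FROM customers c
    JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.customer_id, c.customer_name
    HAVING COUNT(o.order_id) > 1
    """
).fetchall()
print(repeat_customers)
```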
o Q.77
Find the top 5 products with the highest average rating in each category,
including their manufacturer and the number of reviews.
Explanation:
To solve this, you need to:
4. Return the top 5 products with the highest average rating per
category.
You can use window functions like ROW_NUMBER() or RANK() to rank the
products in each category based on their average ratings and limit the result to
the top 5 in each category.
Learnings:
Solutions:
PostgreSQL Solution:
WITH ProductRatings AS (
SELECT p.product_id, p.product_name, p.category_id, p.manufacturer,
AVG(r.rating) AS avg_rating, COUNT(r.review_id) AS num_reviews,
RANK() OVER (PARTITION BY p.category_id ORDER BY AVG(r.rating)
DESC) AS rank
FROM products p
JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.product_name, p.category_id, p.manufacturer
)
SELECT product_id, product_name, category_id, manufacturer, avg_rating, num_reviews
FROM ProductRatings
WHERE rank <= 5
ORDER BY category_id, rank;
MySQL Solution:
-- RANK is a reserved word in MySQL 8.0, so the alias is rating_rank
WITH ProductRatings AS (
SELECT p.product_id, p.product_name, p.category_id, p.manufacturer,
AVG(r.rating) AS avg_rating, COUNT(r.review_id) AS num_reviews,
RANK() OVER (PARTITION BY p.category_id ORDER BY AVG(r.rating) DESC) AS rating_rank
FROM products p
JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.product_name, p.category_id, p.manufacturer
)
SELECT product_id, product_name, category_id, manufacturer, avg_rating, num_reviews
FROM ProductRatings
WHERE rating_rank <= 5
ORDER BY category_id, rating_rank;
Explanation of Solutions:
This solution works for both PostgreSQL and MySQL (8.0+), which support
window functions like RANK(). It returns the top 5 products in each category
along with their average rating, manufacturer, and the number of reviews.
o Q.78
Find the top 5 products with the highest average rating in each category,
including their manufacturer and the number of reviews.
Explanation:
To solve this, you need to:
You can use window functions like RANK() or ROW_NUMBER() to rank the
products based on their average ratings. The products should be ordered by
rating in descending order, and you can use PARTITION BY to rank them
separately within each category.
Learnings:
Solutions:
PostgreSQL Solution:
WITH ProductRatings AS (
SELECT p.product_id, p.product_name, p.category_id, p.manufacturer,
AVG(r.rating) AS avg_rating, COUNT(r.review_id) AS num_reviews,
-- NULLS LAST keeps products without reviews (NULL average) out of the top ranks
RANK() OVER (PARTITION BY p.category_id ORDER BY AVG(r.rating) DESC NULLS LAST) AS rank
FROM products p
LEFT JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.product_name, p.category_id, p.manufacturer
)
SELECT product_id, product_name, category_id, manufacturer, avg_rating, num_reviews
FROM ProductRatings
WHERE rank <= 5
ORDER BY category_id, rank;
MySQL Solution:
-- RANK is a reserved word in MySQL 8.0, so the alias is rating_rank
WITH ProductRatings AS (
SELECT p.product_id, p.product_name, p.category_id, p.manufacturer,
AVG(r.rating) AS avg_rating, COUNT(r.review_id) AS num_reviews,
RANK() OVER (PARTITION BY p.category_id ORDER BY AVG(r.rating) DESC) AS rating_rank
FROM products p
LEFT JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.product_name, p.category_id, p.manufacturer
)
SELECT product_id, product_name, category_id, manufacturer, avg_rating, num_reviews
FROM ProductRatings
WHERE rating_rank <= 5
ORDER BY category_id, rating_rank;
Explanation of Solutions:
This solution works for both PostgreSQL and MySQL 8.0+, which support
window functions like RANK(). The query returns the top 5 products with the
highest average rating in each category, including their manufacturer and the
number of reviews.
o Q.79
Find all employees who have not taken any training sessions in the last year
and the number of projects they are currently assigned to.
Explanation:
To solve this, you need to:
1. Identify employees who have not taken any training sessions in the
last year. This involves filtering out employees who have a
training_date within the last 12 months.
3. Use LEFT JOIN to ensure all employees are included, even those
without any training sessions or project assignments.
4. Use WHERE and NOT EXISTS to filter out employees who have training
sessions within the last year.
Learnings:
Solutions:
PostgreSQL Solution:
SELECT e.employee_name, COUNT(p.project_id) AS num_projects
FROM employees e
LEFT JOIN training_sessions t ON e.employee_id = t.employee_id
AND t.training_date >= CURRENT_DATE - INTERVAL '1 year'
LEFT JOIN projects p ON e.employee_id = p.employee_id
WHERE t.session_id IS NULL -- No training session in the last year
GROUP BY e.employee_id, e.employee_name
ORDER BY e.employee_name;
MySQL Solution:
SELECT e.employee_name, COUNT(p.project_id) AS num_projects
FROM employees e
LEFT JOIN training_sessions t ON e.employee_id = t.employee_id
AND t.training_date >= CURDATE() - INTERVAL 1 YEAR
LEFT JOIN projects p ON e.employee_id = p.employee_id
WHERE t.session_id IS NULL -- No training session in the last year
GROUP BY e.employee_id, e.employee_name
ORDER BY e.employee_name;
Explanation of Solutions:
Key Points:
This query will give you the list of employees who haven't taken any training
in the past year along with the number of projects they are assigned to.
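The LEFT JOIN ... IS NULL pattern used above (an anti-join) can be tried with a sketch in Python's sqlite3 module. The three tables and their rows are hypothetical, and SQLite's date('now', '-1 year') stands in for the PostgreSQL and MySQL interval syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, employee_name TEXT)")
conn.execute(
    "CREATE TABLE training_sessions ("
    "session_id INTEGER PRIMARY KEY, employee_id INTEGER, training_date TEXT)"
)
conn.execute("CREATE TABLE projects (project_id INTEGER PRIMARY KEY, employee_id INTEGER)")

# Hypothetical data, with dates relative to today so the sketch stays valid:
#   Ann trained 30 days ago, Ben trained 2 years ago, Cara never trained.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)", [(1, "Ann"), (2, "Ben"), (3, "Cara")]
)
conn.execute("INSERT INTO training_sessions VALUES (1, 1, date('now', '-30 days'))")
conn.execute("INSERT INTO training_sessions VALUES (2, 2, date('now', '-2 years'))")
conn.executemany("INSERT INTO projects VALUES (?, ?)", [(100, 1), (101, 2), (102, 2)])

# The date condition sits in the ON clause, so only recent sessions can match.
# Employees with no recent session get NULLs from the join and pass the filter.
untrained = conn.execute(
    """
    SELECT e.employee_name, COUNT(p.project_id) AS num_projects
    FROM employees e
    LEFT JOIN training_sessions t
           ON e.employee_id = t.employee_id
          AND t.training_date >= date('now', '-1 year')
    LEFT JOIN projects p ON e.employee_id = p.employee_id
    WHERE t.session_id IS NULL
    GROUP BY e.employee_id, e.employee_name
    ORDER BY e.employee_name
    """
).fetchall()
print(untrained)
```

COUNT(p.project_id) ignores NULLs, which is why Cara shows 0 projects rather than 1.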
o Q.80
Write an SQL query to find the name of the product with the highest price in
each country.
Explanation:
To solve this problem, the task is to:
Learnings:
• Joins: You need to join two tables: Product and Supplier using
supplier_id to associate products with suppliers and their countries.
• Aggregation: Use MAX() or window functions to find the highest-
priced product for each country.
• Group By: Group the results by country to get one entry per country.
• Subqueries: To get the highest-priced product per country, you may
need a subquery or a window function to filter the product with the
highest price for each country.
Solutions:
FROM products p
JOIN suppliers s ON p.supplier_id = s.supplier_id
)
SELECT product_name, country, price
FROM MaxPriceProducts
WHERE rank = 1
ORDER BY country;
Explanation of Solution:
3. Final SELECT:
• The query retrieves the product_name, country, and price for
the top product in each country.
4. Sorting:
• The results are ordered by country for a clearer output.
Expected Output:
The query will return the name of the product with the highest price in each
country along with the price:
o Q.81
Write an SQL query to calculate the difference between the highest salaries in
the marketing and engineering departments. Output the absolute difference in
salaries.
Explanation:
To solve this problem, we need to:
Learnings:
Solutions:
Explanation of Solution:
1. Subqueries:
2. ABS() function:
• The ABS() function ensures that the result is always positive,
regardless of which department has the higher salary.
3. Final Output:
• The result will be the absolute difference between the highest
salaries in the Engineering and Marketing departments.
Expected Output:
The query will return the absolute difference between the highest salaries in
the two departments.
salary_difference
27000
This query works efficiently for both PostgreSQL and MySQL and provides
the correct absolute difference in salaries.
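A minimal sketch of the approach the explanation describes (two MAX() subqueries wrapped in ABS()) is shown below, using Python's sqlite3 module. The employees table, its column names, and the salary figures are all assumptions made for illustration, so the result differs from the expected output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout; the original table definition is not shown.
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ann", "Engineering", 130000),
        ("Ben", "Engineering", 110000),
        ("Cara", "Marketing", 95000),
        ("Dan", "Marketing", 80000),
    ],
)

# Each scalar subquery returns one department's top salary;
# ABS() makes the sign independent of which department pays more.
salary_difference = conn.execute(
    """
    SELECT ABS(
        (SELECT MAX(salary) FROM employees WHERE department = 'Engineering')
      - (SELECT MAX(salary) FROM employees WHERE department = 'Marketing')
    ) AS salary_difference
    """
).fetchone()[0]
print(salary_difference)
```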
o Q.82
Write an SQL query to find the average order amount for male and female
customers separately. Return the results with 2 decimal points.
Explanation:
To solve this problem:
1. Join the customers table with the orders table on the customer_id
column.
2. Group the data by gender to calculate the average order amount for
male and female customers separately.
3. Use the AVG() function to compute the average order amount for each
gender.
Learnings:
• JOIN operation: Using JOIN to combine data from two tables based
on a common key (customer_id).
• AVG() function: Computing the average of numerical data.
• GROUP BY: Grouping the result by gender to calculate averages for
each group.
• ROUND() function: Formatting the result to a specified number of
decimal places.
Solutions:
c.gender;
Explanation of Solution:
1. JOIN: The JOIN clause is used to combine the customers table (c)
with the orders table (o) on the customer_id column.
4. GROUP BY: The query groups the data by the gender column from
the customers table, so we calculate the average for each gender
separately.
Expected Output:
gender avg_order_amount
Male 226.18
Female 191.40
This query works in both PostgreSQL and MySQL and provides the correct
average order amounts for male and female customers, rounded to two
decimal places.
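The JOIN, GROUP BY, AVG() and ROUND() steps listed above can be sketched as follows with Python's sqlite3 module. The rows are hypothetical and do not reproduce the expected output shown; the column name amount is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, gender TEXT)")
# The amount column name is assumed for illustration.
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Male"), (2, "Female")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (10, 1, 100.0),
        (11, 1, 150.5),
        (12, 2, 80.0),
        (13, 2, 90.0),
        (14, 2, 100.1),
    ],
)

# AVG() runs once per gender group; ROUND(..., 2) trims it to two decimals.
averages = conn.execute(
    """
    SELECT c.gender, ROUND(AVG(o.amount), 2) AS avg_order_amount
    FROM customers c
    JOIN orders o ON c.customer_id = o.customer_id
    GROUP BY c.gender
    ORDER BY c.gender
    """
).fetchall()
print(averages)
```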
o Q.83
Write an SQL query to obtain the third transaction of every user. Output the
user id, spend, and transaction date.
Explanation:
To retrieve the third transaction for each user, the approach involves:
2. Filtering the results to only get the third transaction for each user.
Learnings:
Solutions:
Explanation of Solution:
3. Filtering the Third Transaction: The final query selects only the
rows where row_num = 3, which gives us the third transaction for each
user.
Expected Output:
• The result shows the third transaction for user 111, which is the
transaction on 2022-02-05 with a spend of 89.60.
Notes:
This solution works well for both PostgreSQL and MySQL with window
function support.
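The numbering-then-filtering idea can be sketched with Python's sqlite3 module (SQLite 3.25 or later). The transactions table and its rows are assumptions: only the third row for user 111 matches the expected output described above, and the rest is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout; the original table definition is not shown.
conn.execute(
    "CREATE TABLE transactions (user_id INTEGER, spend REAL, transaction_date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (111, 100.50, "2022-01-08"),  # hypothetical
        (111, 55.00, "2022-01-10"),   # hypothetical
        (111, 89.60, "2022-02-05"),   # third transaction for user 111
        (121, 36.00, "2022-01-18"),   # hypothetical; user 121 has only two
        (121, 22.20, "2022-04-01"),   # hypothetical
    ],
)

# ROW_NUMBER() restarts at 1 for every user and counts in date order,
# so row_num = 3 picks each user's third transaction.
third_transactions = conn.execute(
    """
    WITH numbered AS (
        SELECT user_id, spend, transaction_date,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY transaction_date) AS row_num
        FROM transactions
    )
    SELECT user_id, spend, transaction_date
    FROM numbered
    WHERE row_num = 3
    """
).fetchall()
print(third_transactions)
```

Users with fewer than three transactions, such as user 121 here, simply produce no row.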
o Q.84
Find the top 5 products whose revenue has decreased in comparison to the
previous year (both 2022 and 2023). Return the product name, revenue for the
previous year, revenue for the current year, revenue decreased, and the
decreased ratio (percentage).
Explanation:
We need to:
4. Sort the results by the largest decrease and return the top 5 products.
Learnings:
Solutions:
Explanation of Solution:
4. Filtering: The WHERE clause ensures that only products with a revenue
decrease in 2023 compared to 2022 are included.
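The solution query is not preserved in this copy. The sketch below shows one way the steps above could be implemented. Everything about the schema is an assumption: product_revenue, product_name, sales_year, and revenue are hypothetical names, and revenue is assumed to be a non-negative DECIMAL column.

```sql
-- Sketch only: product_revenue(product_name, sales_year, revenue) is a hypothetical table.
WITH yearly AS (
    SELECT
        product_name,
        SUM(CASE WHEN sales_year = 2022 THEN revenue ELSE 0 END) AS revenue_2022,
        SUM(CASE WHEN sales_year = 2023 THEN revenue ELSE 0 END) AS revenue_2023
    FROM
        product_revenue
    WHERE
        sales_year IN (2022, 2023)
    GROUP BY
        product_name
    HAVING
        COUNT(DISTINCT sales_year) = 2        -- product must have data in both years
)
SELECT
    product_name,
    revenue_2022,
    revenue_2023,
    revenue_2022 - revenue_2023 AS revenue_decreased,
    ROUND((revenue_2022 - revenue_2023) * 100.0 / revenue_2022, 2) AS decreased_ratio
FROM
    yearly
WHERE
    revenue_2023 < revenue_2022               -- keep only products whose revenue fell
ORDER BY
    revenue_decreased DESC                    -- largest decrease first
LIMIT 5;
```

Because the filter keeps only rows where revenue_2022 is strictly greater than a non-negative revenue_2023, the division in decreased_ratio cannot hit zero under the stated assumption.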
Expected Output:
Notes:
o Q.85
Write a query that calculates the total viewership for laptops and mobile
devices, where mobile is defined as the sum of tablet and phone viewership.
Output the total viewership for laptops as laptop_views and the total
viewership for mobile devices as mobile_views.
Explanation:
We need to:
Learnings:
Solutions:
Explanation:
Expected Output:
laptop_views mobile_views
16500 34500
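The solution query is missing from this copy. One common way to produce the two totals in a single row is conditional aggregation, sketched below. The table name viewership and the columns device_type and views are assumptions, as are the literal device values; only laptop_views and mobile_views come from the question.

```sql
-- Sketch only: viewership(device_type, views) is a hypothetical table.
SELECT
    SUM(CASE WHEN device_type = 'laptop' THEN views ELSE 0 END) AS laptop_views,
    SUM(CASE WHEN device_type IN ('tablet', 'phone') THEN views ELSE 0 END) AS mobile_views
FROM
    viewership;
```

If the table stores one row per viewing event rather than a view count, replacing views with 1 inside each CASE turns the sums into counts.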
Notes:
o Q.86
Write a query to identify the top two highest-grossing products within each
category in the year 2022. The output should include the category, product,
and total spend.
Explanation
To solve this problem, you need to calculate the total spend per product within
each category for the year 2022. Then, you need to rank the products by their
total spend within each category and select the top two highest-grossing
products per category. You can achieve this by using the RANK() or
ROW_NUMBER() window function, along with filtering by year and summing
the spend for each product.
Learnings
Solutions
PostgreSQL solution:
WITH total_spend_per_product AS (
SELECT
category,
product,
SUM(spend) AS total_spend
FROM
product_spend
WHERE
EXTRACT(YEAR FROM transaction_date) = 2022
GROUP BY
category, product
),
ranked_products AS (
SELECT
category,
product,
total_spend,
RANK() OVER (PARTITION BY category ORDER BY total_spend DESC) AS rank
FROM
total_spend_per_product
)
SELECT
category,
product,
total_spend
FROM
ranked_products
WHERE
rank <= 2
ORDER BY
category, rank;
MySQL solution:
WITH total_spend_per_product AS (
SELECT
category,
product,
SUM(spend) AS total_spend
FROM
product_spend
WHERE
YEAR(transaction_date) = 2022
GROUP BY
category, product
),
ranked_products AS (
SELECT
category,
product,
total_spend,
3. Filtering: The WHERE rank <= 2 filters to return only the top two
products per category.
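The explanation mentions both RANK() and ROW_NUMBER(), and the two behave differently when products tie on total spend. The self-contained illustration below uses made-up rows (not data from the book) to show the difference, with DENSE_RANK() included for comparison.

```sql
-- Illustrative data only: three products, two of them tied on total_spend.
WITH sample_spend AS (
    SELECT 'A' AS product, 300 AS total_spend
    UNION ALL SELECT 'B', 300
    UNION ALL SELECT 'C', 100
)
SELECT
    product,
    total_spend,
    ROW_NUMBER() OVER (ORDER BY total_spend DESC) AS row_num,    -- values 1, 2, 3; which tied row gets 1 is arbitrary
    RANK()       OVER (ORDER BY total_spend DESC) AS rank_pos,   -- A = 1, B = 1, C = 3 (gap after the tie)
    DENSE_RANK() OVER (ORDER BY total_spend DESC) AS dense_pos   -- A = 1, B = 1, C = 2 (no gap)
FROM
    sample_spend
ORDER BY
    total_spend DESC, product;
```

With a filter such as rank <= 2, RANK() can return more than two rows per category when there are ties, while ROW_NUMBER() always returns at most two but picks among tied rows arbitrarily.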
o Q.87
Write a query to obtain a histogram of tweets posted per user in 2022. The
output should include the tweet count per user as the bucket and the number of
Twitter users who fall into that bucket.
Explanation
To solve this problem, we need to calculate the total number of tweets posted
by each user in the year 2022, then create a histogram showing how many
users fall into each tweet count "bucket". This can be done by grouping the
tweets by user and counting the number of tweets per user, then counting how
many users fall into each bucket of tweet counts.
Learnings
Solutions
PostgreSQL solution:
WITH tweet_counts AS (
SELECT
user_id,
COUNT(tweet_id) AS tweet_count
FROM
tweets
WHERE
EXTRACT(YEAR FROM tweet_date) = 2022
GROUP BY
user_id
)
SELECT
tweet_count AS tweet_bucket,
COUNT(*) AS user_count
FROM
tweet_counts
GROUP BY
tweet_count
ORDER BY
tweet_bucket;
MySQL solution:
WITH tweet_counts AS (
SELECT
user_id,
COUNT(tweet_id) AS tweet_count
FROM
tweets
WHERE
YEAR(tweet_date) = 2022
GROUP BY
user_id
)
SELECT
tweet_count AS tweet_bucket,
COUNT(*) AS user_count
FROM
tweet_counts
GROUP BY
tweet_count
ORDER BY
tweet_bucket;
2. Main query: The main query then groups these tweet counts
(tweet_count) and counts how many users have a particular number
of tweets (i.e., how many users fall into each "bucket").
4. Grouping and ordering: The final result groups by the tweet count
and orders the output by the tweet count to generate the histogram.
o Q.88
Write a query to find the employees who are high earners in each of the
company's departments. A high earner in a department is an employee who
has a salary in the top three unique salaries for that department.
Explanation
To solve this problem, you need to find the employees who have one of the
top three unique salaries within each department. This can be achieved by first
ranking employees within each department based on their salary in descending
order, then filtering to keep only those with the top three distinct salary values.
The solution requires handling of ties and ensuring that the "top three" salaries
are unique.
salary INT,
departmentId INT,
FOREIGN KEY (departmentId) REFERENCES Department(id)
);
Learnings
Solutions
PostgreSQL solution:
WITH ranked_salaries AS (
SELECT
e.id AS employee_id,
e.name AS employee_name,
e.salary,
e.departmentId,
DENSE_RANK() OVER (PARTITION BY e.departmentId ORDER BY e.salary DESC) AS salary_rank
FROM
Employee e
)
SELECT
d.name AS department_name,
rs.employee_name,
rs.salary
FROM
ranked_salaries rs
JOIN
Department d ON rs.departmentId = d.id
WHERE
rs.salary_rank <= 3
ORDER BY
d.name, rs.salary DESC;
MySQL solution:
WITH ranked_salaries AS (
SELECT
e.id AS employee_id,
e.name AS employee_name,
e.salary,
e.departmentId,
DENSE_RANK() OVER (PARTITION BY e.departmentId ORDER BY e.salary DESC) AS salary_rank
FROM
Employee e
)
SELECT
d.name AS department_name,
rs.employee_name,
rs.salary
FROM
ranked_salaries rs
JOIN
Department d ON rs.departmentId = d.id
WHERE
rs.salary_rank <= 3
ORDER BY
d.name, rs.salary DESC;
o Q.89
Write an SQL query to find, for each month and country, the number of
transactions and their total amount, as well as the number of approved
transactions and their total amount.
The result should be ordered by country and month.
Explanation
You need to calculate the number of transactions and their total amount for
each month and country, as well as the number of approved transactions and
their total amount. This requires grouping the transactions by month and
Learnings
• Date manipulation: You need to extract the month and year from the
transaction date to group by month.
• Conditional aggregation: Using CASE WHEN to conditionally count
and sum the approved transactions.
• Grouping: Group by both country and the month/year extracted from
the transaction date.
• Sorting: Sorting the result by country and month.
Solutions
PostgreSQL solution:
SELECT
country,
TO_CHAR(trans_date, 'YYYY-MM') AS month, -- Format the date to year-month
COUNT(*) AS total_transactions, -- Total number of transactions
SUM(amount) AS total_amount, -- Total amount of all transactions
COUNT(CASE WHEN state = 'approved' THEN 1 END) AS approved_transactions, -- Approved transactions count
SUM(CASE WHEN state = 'approved' THEN amount END) AS approved_amount -- Approved transactions total amount
FROM
Transactions
GROUP BY
country, TO_CHAR(trans_date, 'YYYY-MM')
ORDER BY
country, month;
MySQL solution:
SELECT
country,
DATE_FORMAT(trans_date, '%Y-%m') AS month, -- Format the date to year-month
COUNT(*) AS total_transactions, -- Total number of transactions
SUM(amount) AS total_amount, -- Total amount of all transactions
COUNT(CASE WHEN state = 'approved' THEN 1 END) AS approved_transactions, -- Approved transactions count
SUM(CASE WHEN state = 'approved' THEN amount END) AS approved_amount -- Approved transactions total amount
FROM
Transactions
GROUP BY
country, DATE_FORMAT(trans_date, '%Y-%m')
ORDER BY
country, month;
2. Conditional aggregation:
• COUNT(CASE WHEN state = 'approved' THEN 1 END)
counts the number of approved transactions.
• SUM(CASE WHEN state = 'approved' THEN amount END)
sums the amounts for approved transactions.
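In PostgreSQL, the same conditional aggregation can also be written with the FILTER clause on an aggregate, which some readers find easier to scan than CASE WHEN. The sketch below reuses the Transactions table and column names from the solution above and applies to PostgreSQL only, since MySQL does not support FILTER.

```sql
-- PostgreSQL only: FILTER is an alternative to CASE WHEN inside an aggregate.
SELECT
    country,
    TO_CHAR(trans_date, 'YYYY-MM') AS month,
    COUNT(*) AS total_transactions,
    SUM(amount) AS total_amount,
    COUNT(*) FILTER (WHERE state = 'approved') AS approved_transactions,   -- count approved rows only
    SUM(amount) FILTER (WHERE state = 'approved') AS approved_amount       -- sum approved amounts only
FROM
    Transactions
GROUP BY
    country, TO_CHAR(trans_date, 'YYYY-MM')
ORDER BY
    country, month;
```

In both forms, the approved sum is NULL for a group with no approved rows. Wrapping it in COALESCE(..., 0) returns 0 instead, if that is the expected output.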
o Q.90
Given the reviews table, write a query to retrieve the average star rating for
each product, grouped by month. The output should display:
The result should be sorted first by month and then by product ID.
Explanation
To solve this problem:
1. We need to extract the month and year from the submit_date column
to group the reviews by month.
2. We calculate the average star rating for each product within each
month.
4. Finally, we sort the results first by month and then by product ID.
Learnings
Solutions
PostgreSQL solution:
SELECT
EXTRACT(MONTH FROM submit_date) AS month, -- Extract the month from the date
product_id,
ROUND(AVG(stars)::numeric, 2) AS avg_star_rating -- Calculate and round the average star rating
FROM
reviews
GROUP BY
EXTRACT(MONTH FROM submit_date), product_id
ORDER BY
month, product_id;
MySQL solution:
SELECT
MONTH(submit_date) AS month, -- Extract the month from the date
product_id,
ROUND(AVG(stars), 2) AS avg_star_rating -- Calculate and round the average star rating
FROM
reviews
GROUP BY
MONTH(submit_date), product_id
ORDER BY
month, product_id;
1. Month extraction:
• In PostgreSQL, EXTRACT(MONTH FROM submit_date) extracts
the month part of the submit_date.
• In MySQL, MONTH(submit_date) does the same.
4. Sorting: The ORDER BY clause ensures the results are ordered first by
month and then by product ID.
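The queries above group on the month number alone, so reviews from the same calendar month in different years share one bucket. If the data spans several years, one option is to group on year and month together, as in this PostgreSQL sketch. The alias review_year is introduced here for illustration; the table and other columns are those used above.

```sql
-- PostgreSQL sketch: keep the year so that, for example, June 2022 and June 2023 stay separate.
SELECT
    EXTRACT(YEAR FROM submit_date) AS review_year,    -- illustrative alias, not from the source
    EXTRACT(MONTH FROM submit_date) AS month,
    product_id,
    ROUND(AVG(stars)::numeric, 2) AS avg_star_rating
FROM
    reviews
GROUP BY
    EXTRACT(YEAR FROM submit_date),
    EXTRACT(MONTH FROM submit_date),
    product_id
ORDER BY
    review_year, month, product_id;
```

The MySQL equivalent would use YEAR(submit_date) and MONTH(submit_date) in the same positions.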
o Q.91
Identify users who have made purchases totaling more than $10,000 in the last
month from the purchases table. The table contains information about
purchases, including the user ID, date of purchase, product ID, and the amount
spent.
Explanation
To solve this:
1. We need to filter the records to include only purchases made in the last
month.
2. Sum the total amount spent by each user during this period.
3. Select only those users whose total amount spent exceeds $10,000.
4. Ensure that the query dynamically calculates "last month" based on the
current date.
Learnings
Solutions
PostgreSQL solution:
SELECT
user_id,
SUM(amount_spent) AS total_spent
FROM
purchases
WHERE
date_of_purchase >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 month' -- Start of last month
AND date_of_purchase < DATE_TRUNC('month', CURRENT_DATE) -- End of last month
GROUP BY
user_id
HAVING
SUM(amount_spent) > 10000 -- Only users who spent more than $10,000
ORDER BY
user_id;
MySQL solution:
SELECT
user_id,
SUM(amount_spent) AS total_spent
FROM
purchases
WHERE
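date_of_purchase >= DATE_FORMAT(CURDATE() - INTERVAL 1 MONTH, '%Y-%m-01') -- Start of last month
AND date_of_purchase < DATE_FORMAT(CURDATE(), '%Y-%m-01') -- Start of this month (exclusive end of last month)
GROUP BY
user_id
HAVING
SUM(amount_spent) > 10000 -- Only users who spent more than $10,000
ORDER BY
user_id;
1. Date filtering:
• In PostgreSQL, DATE_TRUNC('month', CURRENT_DATE) -
INTERVAL '1 month' gives the start of the previous month.
• In MySQL, DATE_FORMAT(CURDATE() - INTERVAL 1 MONTH, '%Y-%m-01')
gives the start of the previous month. CURDATE() - INTERVAL 1
MONTH on its own only steps back one month from today, which
would produce a rolling 30-day window rather than last month.
• The exclusive upper bound is the start of the current month:
DATE_TRUNC('month', CURRENT_DATE) in PostgreSQL and
DATE_FORMAT(CURDATE(), '%Y-%m-01') in MySQL.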
3. HAVING: The HAVING clause is used to filter out users whose total
spend is less than or equal to $10,000.
o Q.92
Given the data on IBM employees, write a query to find the average duration
of service for employees across different departments. The duration of service
is calculated as end_date - start_date. If end_date is NULL, consider it
as the current date.
Explanation
To solve this:
Learnings
Solutions
PostgreSQL solution:
SELECT
department,
AVG(
CASE
WHEN end_date IS NULL THEN CURRENT_DATE - start_date -- Use current date if end_date is NULL
MySQL solution:
SELECT
department,
AVG(
CASE
WHEN end_date IS NULL THEN DATEDIFF(CURDATE(), start_date) -- Use current date if end_date is NULL
ELSE DATEDIFF(end_date, start_date)
END
) AS avg_duration_of_service
FROM
employee_service
GROUP BY
department
ORDER BY
department;
1. Date calculation:
• PostgreSQL: CURRENT_DATE - start_date calculates the
duration from start_date to the current date when end_date
is NULL.
• MySQL: DATEDIFF(CURDATE(), start_date) calculates the
difference in days between the start_date and the current
date if end_date is NULL.
• When end_date is not NULL, the difference between
end_date and start_date is calculated directly.
This solution computes the average service duration for employees in each
department, accounting for those whose service is ongoing.
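The PostgreSQL solution is cut off in this copy. As a compact sketch of the same idea, COALESCE can replace the CASE expression. The table name employee_service is taken from the MySQL solution, and the sketch assumes start_date and end_date are DATE columns, so that subtracting them yields a number of days.

```sql
-- PostgreSQL sketch: COALESCE substitutes CURRENT_DATE when end_date is NULL.
-- Assumes start_date and end_date are DATE columns (date - date gives days).
SELECT
    department,
    AVG(COALESCE(end_date, CURRENT_DATE) - start_date) AS avg_duration_of_service
FROM
    employee_service
GROUP BY
    department
ORDER BY
    department;
```

The MySQL counterpart of the inner expression would be DATEDIFF(COALESCE(end_date, CURDATE()), start_date).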
o Q.93
Write a query to identify the top 3 posts with the highest engagement (likes +
comments) for each user on a Facebook page. Display the user ID, post ID,
engagement count, and rank for each post.
Explanation
To solve this:
1. Calculate the total engagement for each post by summing likes and
comments.
2. Rank the posts for each user based on the total engagement.
3. For each user, select the top 3 posts with the highest engagement.
4. The result should include the user ID, post ID, engagement count (likes
+ comments), and rank for each post.
Learnings
Solutions
PostgreSQL solution:
WITH ranked_posts AS (
SELECT
user_id,
post_id,
(likes + comments) AS engagement,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY (likes + comments) DESC) AS rank
FROM
fb_posts
)
SELECT
user_id,
post_id,
engagement,
rank
FROM
ranked_posts
WHERE
rank <= 3
ORDER BY
user_id, rank;
MySQL solution:
WITH ranked_posts AS (
SELECT
user_id,
post_id,
(likes + comments) AS engagement,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY (likes + comments) DESC) AS `rank` -- RANK is a reserved word in MySQL 8.0, so the alias is quoted
FROM
fb_posts
)
SELECT
user_id,
post_id,
engagement,
`rank`
FROM
ranked_posts
WHERE
`rank` <= 3
ORDER BY
user_id, `rank`;
2. Window Function:
3. Filter for Top 3 Posts: The WHERE rank <= 3 clause filters the results
to only include the top 3 posts for each user.
This query retrieves the top 3 posts for each user based on engagement, with
the user ID, post ID, engagement count, and rank for each post.
o Q.94
Write a query to retrieve the count of companies that have posted duplicate job
listings.
Definition:
Duplicate job listings are defined as two job listings within the same company
that share identical titles and descriptions.
Explanation
To solve this:
1. We need to identify job listings within the same company that have
identical titles and descriptions.
Learnings
Solutions
PostgreSQL Solution:
SELECT COUNT(DISTINCT company_id) AS duplicate_companies
FROM (
SELECT company_id
FROM job_listings
GROUP BY company_id, title, description
HAVING COUNT(*) > 1 -- Groups with more than one identical listing
) AS duplicate_groups;
MySQL Solution:
SELECT COUNT(DISTINCT company_id) AS duplicate_companies
FROM (
SELECT company_id
FROM job_listings
GROUP BY company_id, title, description
HAVING COUNT(*) > 1 -- Groups with more than one identical listing
) AS duplicate_groups;
1. Grouping:
• We group by company_id, title, and description to check
if there are any duplicates within the same company.
2. HAVING:
• HAVING COUNT(*) > 1: This filters the groups to only include
those that have more than one job posting with the same title
and description.
3. Counting Companies:
Expected Output:
The result will be a single value representing the number of companies that
have posted duplicate job listings based on identical titles and descriptions.
o Q.95
Identify the region with the lowest sales amount for the previous month.
Return the region name and total sales amount.
Explanation
To solve this, we need to:
1. Filter the data to consider only sales from the previous month
(February 2024).
2. Group the data by region to calculate the total sales amount for each
region.
4. Return the region name and the total sales amount for that region.
Learnings
Solutions
PostgreSQL Solution:
SELECT Region, SUM(Amount) AS total_sales
FROM Sales
WHERE SaleDate BETWEEN '2024-02-01' AND '2024-02-29'
GROUP BY Region
ORDER BY total_sales ASC
LIMIT 1;
MySQL Solution:
SELECT Region, SUM(Amount) AS total_sales
FROM Sales
WHERE SaleDate BETWEEN '2024-02-01' AND '2024-02-29'
GROUP BY Region
ORDER BY total_sales ASC
LIMIT 1;
1. Date Filtering:
• WHERE SaleDate BETWEEN '2024-02-01' AND '2024-02-29':
Filters sales records for February 2024.
2. Grouping:
• GROUP BY Region: Groups the data by region to calculate total
sales for each region.
3. Aggregation:
• SUM(Amount) AS total_sales: Sums up the sales amount for
each region.
4. Sorting:
5. Limit:
• LIMIT 1: Returns only the region with the lowest total sales.
Expected Output:
The result will display the region with the lowest sales for the month of
February 2024, along with the total sales amount for that region.
Example output:
Region total_sales
East 13800.00
o Q.96
Explanation
To find the median in SQL:
2. We need to:
• Sort the views column.
• Find the middle value (or the average of the two middle values
if the number of rows is even).
Learnings
Solutions
PostgreSQL Solution:
In PostgreSQL, we can use window functions to rank the rows and find the
middle value(s):
WITH RankedViews AS (
SELECT views,
ROW_NUMBER() OVER (ORDER BY views) AS row_num,
COUNT(*) OVER () AS total_rows
FROM tiktok
)
SELECT
CASE
WHEN (SELECT COUNT(*) FROM tiktok) % 2 = 1 THEN -- Odd number of rows
(SELECT views FROM RankedViews WHERE row_num = (total_rows + 1) / 2)
ELSE -- Even number of rows
(SELECT AVG(views) FROM RankedViews WHERE row_num IN (total_rows / 2, total_rows / 2 + 1))
END AS median;
Explanation:
1. Window Functions:
• ROW_NUMBER() OVER (ORDER BY views): This generates a
unique row number for each row based on the sorted views
column.
• COUNT(*) OVER (): This counts the total number of rows in
the table.
2. Median Logic:
• If the number of rows (total_rows) is odd, the median is the
row in the middle ((total_rows + 1) / 2).
• If the number of rows is even, the median is the average of the
two middle values (total_rows / 2 and total_rows / 2 +
1).
MySQL Solution:
MySQL 8.0 and later support ROW_NUMBER() and COUNT(*) OVER (), so the same
ranking approach applies. LIMIT and OFFSET do not accept expressions such as
COUNT(*), so the middle rows are selected by row number instead:
WITH RankedViews AS (
SELECT views,
ROW_NUMBER() OVER (ORDER BY views) AS row_num,
COUNT(*) OVER () AS total_rows
FROM tiktok
)
SELECT AVG(views) AS median -- One middle row if odd, two if even
FROM RankedViews
WHERE row_num IN ((total_rows + 1) DIV 2, (total_rows + 2) DIV 2);
Explanation:
Expected Output
For the given sample data:
views
100
800
350
150
600
700
700
950
Sorted Views:
100, 150, 350, 600, 700, 700, 800, 950
• Since there are 8 values (even), the median will be the average of the
4th and 5th values (600 and 700):
(600 + 700) / 2 = 650
650.00
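PostgreSQL also ships an ordered-set aggregate, PERCENTILE_CONT, that computes the median directly and interpolates between the two middle values when the row count is even. The sketch below is an alternative to the window-function approach, not the book's solution, and it applies to PostgreSQL only.

```sql
-- PostgreSQL only: the 0.5 percentile of views is the median.
SELECT
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY views) AS median
FROM
    tiktok;
```

For the eight sample values above, this interpolates halfway between 600 and 700 and returns 650.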
Key Takeaways
o Q.97
Identify the region with the lowest sales amount for the previous month.
Return the region name and the total sale amount.
Explanation
You need to identify which region had the lowest total sales amount in the
previous month. This involves filtering the sales records to include only those
from the last month, summing the sales amounts by region, and then selecting
the region with the smallest total sales.
Learnings
• Filtering by Date: You’ll need to filter the data for the previous
month, which can be done using date functions such as CURRENT_DATE,
DATE_TRUNC, and INTERVAL.
• Aggregation: Use the SUM() function to aggregate sales for each
region.
• Grouping: Use GROUP BY to group the data by region.
• Ordering and Limiting: Use ORDER BY and LIMIT to get the region
with the lowest total sales.
Solutions
PostgreSQL Solution
SELECT Region, SUM(Amount) AS total_sale
FROM Sales
WHERE SaleDate >= date_trunc('month', CURRENT_DATE) - INTERVAL '1 month'
AND SaleDate < date_trunc('month', CURRENT_DATE)
GROUP BY Region
ORDER BY total_sale ASC
LIMIT 1;
MySQL Solution
SELECT Region, SUM(Amount) AS total_sale
FROM Sales
WHERE SaleDate >= DATE_FORMAT(CURDATE() - INTERVAL 1 MONTH, '%Y-%m-01')
AND SaleDate < DATE_FORMAT(CURDATE(), '%Y-%m-01')
GROUP BY Region
ORDER BY total_sale ASC
LIMIT 1;
o Q.98
Which metro city had the highest number of restaurant orders in September
2021?
Write the SQL query to retrieve the city name and the total count of orders,
ordered by the total count of orders in descending order.
Explanation
You need to find out which of the listed metro cities had the highest number of
restaurant orders in September 2021. This requires filtering the orders based
on the month and year, counting the number of orders for each city, and then
sorting the results by the count in descending order.
Learnings
Solutions
PostgreSQL Solution
MySQL Solution
SELECT city, COUNT(order_id) AS total_orders
FROM restaurant_orders
WHERE city IN ('Delhi', 'Mumbai', 'Bangalore', 'Hyderabad')
AND order_date >= '2021-09-01'
AND order_date < '2021-10-01'
GROUP BY city
ORDER BY total_orders DESC
LIMIT 1;
o Q.99
Identify the drivers with the highest average rating in the last 6 months.
For each driver, calculate their average rating, the number of completed
rides, and rank them based on their average rating in descending order.
Display the driver ID, average rating, number of completed rides, and
rank.
Explanation
You need to calculate the average rating for each driver in the last 6 months,
count the number of completed rides for each driver, and rank them by their
average rating. The challenge here involves:
Learnings
Solutions
PostgreSQL Solution
WITH driver_ratings AS (
SELECT driver_id,
COUNT(ride_id) AS total_rides,
AVG(rating) AS avg_rating
FROM rapido_rides
WHERE ride_status = 'Completed'
AND ride_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY driver_id
)
SELECT driver_id,
avg_rating,
total_rides,
RANK() OVER (ORDER BY avg_rating DESC) AS rating_rank
FROM driver_ratings
ORDER BY rating_rank;
MySQL Solution
WITH driver_ratings AS (
SELECT driver_id,
COUNT(ride_id) AS total_rides,
AVG(rating) AS avg_rating
FROM rapido_rides
WHERE ride_status = 'Completed'
AND ride_date >= CURDATE() - INTERVAL 6 MONTH
GROUP BY driver_id
)
SELECT driver_id,
avg_rating,
total_rides,
Explanation of Solutions
1. Subquery (driver_ratings):
• This part of the query calculates the total number of rides and
the average rating for each driver in the last 6 months.
• We filter the rides to include only the ones with the status
"Completed".
• The COUNT() function is used to get the number of completed
rides, and AVG() is used to calculate the average rating.
3. Date Filtering:
• We use CURRENT_DATE - INTERVAL '6 months'
(PostgreSQL) or CURDATE() - INTERVAL 6 MONTH (MySQL)
to filter the records for the last 6 months.
4. Final Output:
• The query returns the driver_id, avg_rating, total_rides,
and the rank for each driver, ordered by their rank.
o Q.100
Explanation
You need to calculate:
Learnings
Solutions
PostgreSQL Solution
WITH customer_spending AS (
SELECT customer_id,
COUNT(*) AS total_items,
SUM(quantity * price) AS total_spent
FROM lenskart_purchases
WHERE product_category IN ('Eyewear', 'Sunglasses')
AND purchase_date >= CURRENT_DATE - INTERVAL '3 months'
GROUP BY customer_id
)
SELECT customer_id,
total_items,
total_spent,
RANK() OVER (ORDER BY total_spent DESC) AS rank
FROM customer_spending
ORDER BY rank;
MySQL Solution
WITH customer_spending AS (
SELECT customer_id,
COUNT(*) AS total_items,
SUM(quantity * price) AS total_spent
FROM lenskart_purchases
WHERE product_category IN ('Eyewear', 'Sunglasses')
AND purchase_date >= CURDATE() - INTERVAL 3 MONTH
GROUP BY customer_id
)
SELECT customer_id,
total_items,
total_spent,
RANK() OVER (ORDER BY total_spent DESC) AS `rank` -- Quoted because RANK is a reserved word in MySQL 8.0
FROM customer_spending
ORDER BY `rank`;
Explanation of Solutions
1. Subquery (customer_spending):
• This part of the query calculates the total number of items
purchased (COUNT(*)) and the total amount spent by each
customer (SUM(quantity * price)) on products in the
Eyewear or Sunglasses categories, considering only the last 3
months.
• We filter the purchases to include only those with the relevant
categories (Eyewear and Sunglasses) and within the specified
time range (purchase_date >= CURRENT_DATE - INTERVAL
'3 months' for PostgreSQL or CURDATE() - INTERVAL 3
MONTH for MySQL).
3. Final Output:
• The query returns the customer_id, total_items,
total_spent, and rank for each customer, ordered by rank.
Key Points
Questions By Topic
SELECT Statement
• Q.101
Question
Retrieve all distinct job titles from the employee table.
Explanation
You need to query the Employee table and retrieve distinct job titles, meaning no
duplicates.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Employee (
EmployeeID INT,
Name VARCHAR(50),
JobTitle VARCHAR(50),
Department VARCHAR(50),
Salary DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Employee (EmployeeID, Name, JobTitle, Department, Salary) VALUES
(1, 'Alice', 'Software Engineer', 'Engineering', 120000.00),
(2, 'Bob', 'Data Scientist', 'Data Analytics', 115000.00),
(3, 'Charlie', 'Software Engineer', 'Engineering', 120000.00),
(4, 'Daisy', 'HR Manager', 'Human Resources', 95000.00),
(5, 'Eve', 'Product Manager', 'Product', 130000.00);
Learnings
Solutions
• - PostgreSQL solution
SELECT DISTINCT JobTitle
FROM Employee;
• - MySQL solution
SELECT DISTINCT JobTitle
FROM Employee;
• Q.102
Question
List all distinct product categories available on the platform.
Explanation
You need to query the Product table and retrieve the distinct categories, ensuring no
duplicates in the result.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Product (
ProductID INT,
ProductName VARCHAR(50),
Category VARCHAR(50),
Price DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Product (ProductID, ProductName, Category, Price) VALUES
(1, 'Echo Dot', 'Electronics', 49.99),
(2, 'Fire Stick', 'Electronics', 39.99),
(3, 'Running Shoes', 'Sports', 79.99),
(4, 'Yoga Mat', 'Sports', 19.99),
(5, 'Smart Bulb', 'Electronics', 24.99);
Learnings
Solutions
• - PostgreSQL solution
SELECT DISTINCT Category
FROM Product;
• - MySQL solution
SELECT DISTINCT Category
FROM Product;
• Q.103
Question
Select the product name and price of the most expensive product from the Products
table.
Explanation
You need to find the most expensive product and return only its name and price. Use
MAX() to get the highest price.
-- Datasets
INSERT INTO Products (ProductID, ProductName, Category, Price) VALUES
(1, 'Laptop', 'Electronics', 1200.00),
(2, 'Smartphone', 'Electronics', 800.00),
(3, 'Tablet', 'Electronics', 600.00),
(4, 'Coffee Maker', 'Appliances', 100.00),
(5, 'Toaster', 'Appliances', 50.00);
Solutions
• - PostgreSQL solution
SELECT ProductName, Price AS MaxPrice
FROM Products
WHERE Price = (SELECT MAX(Price) FROM Products);
• - MySQL solution
SELECT ProductName, Price AS MaxPrice
FROM Products
WHERE Price = (SELECT MAX(Price) FROM Products);
• Q.104
Question
List all distinct industries running ad campaigns.
Explanation
You need to query the AdCampaigns table and retrieve distinct industries involved in
ad campaigns. The DISTINCT keyword ensures no duplicates in the result.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE AdCampaigns (
CampaignID INT,
CompanyName VARCHAR(50),
Industry VARCHAR(50),
Budget DECIMAL(10, 2)
);
-- Datasets
INSERT INTO AdCampaigns (CampaignID, CompanyName, Industry, Budget) VALUES
Learnings
Solutions
• - PostgreSQL solution
SELECT DISTINCT Industry
FROM AdCampaigns;
• - MySQL solution
SELECT DISTINCT Industry
FROM AdCampaigns;
• Q.105
Question
Find all genres of Netflix titles.
Explanation
You need to query the NetflixTitles table and retrieve distinct genres of Netflix
titles using the DISTINCT keyword to ensure no duplicates in the result.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE NetflixTitles (
TitleID INT,
TitleName VARCHAR(50),
Genre VARCHAR(50),
ReleaseYear INT
);
-- Datasets
INSERT INTO NetflixTitles (TitleID, TitleName, Genre, ReleaseYear) VALUES
(1, 'Stranger Things', 'Sci-Fi', 2016),
(2, 'The Crown', 'Drama', 2016),
(3, 'Money Heist', 'Thriller', 2017),
(4, 'Bridgerton', 'Romance', 2020),
(5, 'Breaking Bad', 'Crime', 2008);
Learnings
Solutions
• - PostgreSQL solution
SELECT DISTINCT Genre
FROM NetflixTitles;
• - MySQL solution
SELECT DISTINCT Genre
FROM NetflixTitles;
• Q.106
Question
Display distinct license types available for Microsoft software.
Explanation
You need to query the Licenses table and retrieve distinct license types for Microsoft
software. You can filter by the SoftwareName column containing Microsoft products
and ensure there are no duplicates in the license types.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Licenses (
LicenseID INT,
SoftwareName VARCHAR(50),
LicenseType VARCHAR(50),
Price DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Licenses (LicenseID, SoftwareName, LicenseType, Price) VALUES
(1, 'Microsoft Office', 'Personal', 69.99),
(2, 'Microsoft Office', 'Business', 149.99),
(3, 'Windows 11', 'Home', 119.99),
(4, 'Windows 11', 'Pro', 199.99),
(5, 'Azure', 'Enterprise', 499.99);
Learnings
Solutions
• - PostgreSQL solution
SELECT DISTINCT LicenseType
FROM Licenses
WHERE SoftwareName LIKE 'Microsoft%';
• - MySQL solution
SELECT DISTINCT LicenseType
FROM Licenses
WHERE SoftwareName LIKE 'Microsoft%';
• Q.107
Question
Select the first order received time on 31st December 2024 from the Orders table.
Explanation
You need to find the earliest order time (OrderTime) for all orders made on 31st
December 2024. This assumes all data corresponds to that date, and you simply want
the first order's time.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Orders (
OrderID INT,
CustomerID INT,
OrderDate DATE,
OrderTime TIME,
Amount DECIMAL(10, 2)
);
Solutions
• - PostgreSQL solution
SELECT MIN(OrderTime) AS FirstOrderTime
FROM Orders
WHERE OrderDate = '2024-12-31';
• - MySQL solution
SELECT MIN(OrderTime) AS FirstOrderTime
FROM Orders
WHERE OrderDate = '2024-12-31';
• Q.108
Question
Select the total price of all products from the Products table.
Explanation
You need to find the total price of all products in the Products table using the SUM()
function.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Products (
ProductID INT,
ProductName VARCHAR(50),
Category VARCHAR(50),
Price DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Products (ProductID, ProductName, Category, Price) VALUES
(1, 'Laptop', 'Electronics', 1200.00),
(2, 'Smartphone', 'Electronics', 800.00),
(3, 'Tablet', 'Electronics', 600.00),
(4, 'Coffee Maker', 'Appliances', 100.00),
Solutions
• - PostgreSQL solution
SELECT SUM(Price) AS TotalPrice
FROM Products;
• - MySQL solution
SELECT SUM(Price) AS TotalPrice
FROM Products;
• Q.109
Question
Select the average salary of employees in the Employees table.
Explanation
You need to calculate the average salary of all employees in the Employees table
using the AVG() function.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Employees (
EmployeeID INT,
EmployeeName VARCHAR(50),
Department VARCHAR(50),
Salary DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Employees (EmployeeID, EmployeeName, Department, Salary) VALUES
(1, 'Alice', 'Engineering', 100000.00),
(2, 'Bob', 'Engineering', 95000.00),
(3, 'Charlie', 'HR', 70000.00),
(4, 'David', 'HR', 75000.00),
(5, 'Eve', 'Marketing', 60000.00);
Solutions
• - PostgreSQL solution
SELECT AVG(Salary) AS AvgSalary
FROM Employees;
• - MySQL solution
SELECT AVG(Salary) AS AvgSalary
FROM Employees;
• Q.110
Question
Select the maximum salary from the Employees table.
Explanation
You need to find the highest salary in the Employees table using the MAX() function.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Employees (
EmployeeID INT,
EmployeeName VARCHAR(50),
Department VARCHAR(50),
Salary DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Employees (EmployeeID, EmployeeName, Department, Salary) VALUES
(1, 'Alice', 'Engineering', 100000.00),
(2, 'Bob', 'Engineering', 95000.00),
(3, 'Charlie', 'HR', 70000.00),
(4, 'David', 'HR', 75000.00),
(5, 'Eve', 'Marketing', 60000.00);
Solutions
• - PostgreSQL solution
SELECT MAX(Salary) AS MaxSalary
FROM Employees;
• - MySQL solution
SELECT MAX(Salary) AS MaxSalary
FROM Employees;
• COUNT
o Q.111
Question
Count the total number of orders placed in the Orders table.
Explanation
You need to count the total number of orders in the Orders table using
COUNT().
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Orders (
OrderID INT,
CustomerID INT,
OrderDate DATE,
Amount DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Orders (OrderID, CustomerID, OrderDate, Amount) VALUES
(1, 101, '2024-12-01', 50.00),
(2, 102, '2024-12-02', 30.00),
(3, 103, '2024-12-03', 70.00),
(4, 104, '2024-12-04', 100.00),
(5, 105, '2024-12-05', 25.00);
Solutions
• - PostgreSQL solution
SELECT COUNT(OrderID) AS TotalOrders
FROM Orders;
• - MySQL solution
SELECT COUNT(OrderID) AS TotalOrders
FROM Orders;
o Q.112
Question
Count the number of unique products in the Products table.
Explanation
You need to count the distinct number of products in the Products table using
COUNT(DISTINCT).
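Datasets and SQL Schemas
-- Table creation
CREATE TABLE Products (
ProductID INT,
ProductName VARCHAR(50),
Category VARCHAR(50),
Price DECIMAL(10, 2)
);
-- Datasets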
INSERT INTO Products (ProductID, ProductName, Category, Price) VALUES
(1, 'Laptop', 'Electronics', 800.00),
(2, 'Smartphone', 'Electronics', 500.00),
(3, 'Tablet', 'Electronics', 300.00),
(4, 'Smartwatch', 'Accessories', 150.00),
(5, 'Laptop', 'Electronics', 800.00);
Solutions
• - PostgreSQL solution
SELECT COUNT(DISTINCT ProductName) AS UniqueProducts
FROM Products;
• - MySQL solution
SELECT COUNT(DISTINCT ProductName) AS UniqueProducts
FROM Products;
o Q.113
Question
Count the number of customers who made an order greater than $50 from the
Orders table.
Explanation
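You need to count the orders where the amount is greater than 50, using COUNT()
with a condition in the WHERE clause. In this dataset every customer has exactly
one order, so the order count equals the customer count; if customers could
repeat, you would use COUNT(DISTINCT CustomerID) instead.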
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Orders (
OrderID INT,
CustomerID INT,
OrderDate DATE,
Amount DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Orders (OrderID, CustomerID, OrderDate, Amount) VALUES
(1, 101, '2024-12-01', 50.00),
(2, 102, '2024-12-02', 30.00),
(3, 103, '2024-12-03', 70.00),
(4, 104, '2024-12-04', 100.00),
(5, 105, '2024-12-05', 25.00);
Solutions
• - PostgreSQL solution
SELECT COUNT(OrderID) AS OrdersAbove50
FROM Orders
WHERE Amount > 50;
• - MySQL solution
SELECT COUNT(OrderID) AS OrdersAbove50
FROM Orders
WHERE Amount > 50;
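Counting orders and counting customers give different answers as soon as one customer places several qualifying orders. The sketch below illustrates this with hypothetical rows (not the book's dataset) in which customer 101 has two orders above 50. SQLite via Python's sqlite3 module is an assumption used only to make the example runnable.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (OrderID INT, CustomerID INT, OrderDate DATE, Amount DECIMAL(10, 2))"
)
# Hypothetical rows: customer 101 appears twice with amounts above 50.
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, 101, "2024-12-01", 70.00),
        (2, 101, "2024-12-02", 120.00),
        (3, 102, "2024-12-03", 30.00),
        (4, 103, "2024-12-04", 90.00),
    ],
)

orders_above_50, customers_above_50 = conn.execute("""
    SELECT COUNT(OrderID)             AS OrdersAbove50,
           COUNT(DISTINCT CustomerID) AS CustomersAbove50
    FROM Orders
    WHERE Amount > 50
""").fetchone()

print(orders_above_50)     # 3 orders exceed 50
print(customers_above_50)  # 2 distinct customers placed them
```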
o Q.114
Question
Count the total number of products available in each category from the
Products table.
Explanation
You need to count the products in each category. Group the rows by Category
and apply COUNT() to each group, so that every category gets its own total.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Products (
ProductID INT,
ProductName VARCHAR(50),
Category VARCHAR(50),
Price DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Products (ProductID, ProductName, Category, Price) VALUES
(1, 'Laptop', 'Electronics', 800.00),
(2, 'Smartphone', 'Electronics', 500.00),
(3, 'Washing Machine', 'Home Appliances', 300.00),
(4, 'Refrigerator', 'Home Appliances', 600.00),
(5, 'Air Conditioner', 'Home Appliances', 700.00);
Solutions
• - PostgreSQL solution
SELECT Category, COUNT(ProductID) AS TotalProducts
FROM Products
GROUP BY Category;
• - MySQL solution
SELECT Category, COUNT(ProductID) AS TotalProducts
FROM Products
GROUP BY Category;
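To see what grouping changes, the sketch below runs the count on the dataset above twice: once over the whole table and once per category. SQLite via Python's sqlite3 module is an assumption made for runnability, and the ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (ProductID INT, ProductName VARCHAR(50), "
    "Category VARCHAR(50), Price DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        (1, "Laptop", "Electronics", 800.00),
        (2, "Smartphone", "Electronics", 500.00),
        (3, "Washing Machine", "Home Appliances", 300.00),
        (4, "Refrigerator", "Home Appliances", 600.00),
        (5, "Air Conditioner", "Home Appliances", 700.00),
    ],
)

# Without GROUP BY: a single total for the whole table.
total = conn.execute("SELECT COUNT(ProductID) FROM Products").fetchone()[0]

# With GROUP BY: one row, and one count, per category.
per_category = conn.execute("""
    SELECT Category, COUNT(ProductID) AS TotalProducts
    FROM Products
    GROUP BY Category
    ORDER BY Category
""").fetchall()

print(total)         # 5
print(per_category)  # [('Electronics', 2), ('Home Appliances', 3)]
```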
o Q.115
Question
Count the number of employees assigned to each project from the
EmployeeProjects table.
Explanation
You need to count how many employees are assigned to each project. Use
COUNT() to aggregate the number of employees for each project.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE EmployeeProjects (
EmployeeID INT,
ProjectName VARCHAR(50),
Department VARCHAR(50)
);
-- Datasets
INSERT INTO EmployeeProjects (EmployeeID, ProjectName, Department) VALUES
(1, 'AI Development', 'Engineering'),
(2, 'Cloud Migration', 'Engineering'),
(3, 'AI Development', 'Engineering'),
(4, 'Data Analytics', 'Analytics'),
(5, 'Cloud Migration', 'Engineering');
Learnings
Solutions
• - PostgreSQL solution
SELECT ProjectName, COUNT(EmployeeID) AS EmployeeCount
FROM EmployeeProjects
GROUP BY ProjectName;
• - MySQL solution
SELECT ProjectName, COUNT(EmployeeID) AS EmployeeCount
FROM EmployeeProjects
GROUP BY ProjectName;
o Q.116
Question
Count the number of vehicles in each type available in the Vehicles table.
Explanation
You need to count the vehicles of each type in the Vehicles table. Group the
rows by VehicleType and use COUNT() to get the number of entries in each
group.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Vehicles (
VehicleID INT,
VehicleType VARCHAR(50),
Model VARCHAR(50),
Year INT
);
-- Datasets
INSERT INTO Vehicles (VehicleID, VehicleType, Model, Year) VALUES
(1, 'Car', 'Sedan', 2022),
(2, 'Car', 'SUV', 2021),
(3, 'Bike', 'Mountain', 2020),
(4, 'Car', 'Hatchback', 2023),
(5, 'Bike', 'Road', 2021);
Solutions
• - PostgreSQL solution
SELECT VehicleType, COUNT(VehicleID) AS TotalVehicles
FROM Vehicles
GROUP BY VehicleType;
• - MySQL solution
SELECT VehicleType, COUNT(VehicleID) AS TotalVehicles
FROM Vehicles
GROUP BY VehicleType;
o Q.117
Question
Count the total number of student records in the Students table.
Explanation
You need to count the total number of student records in the Students table
using COUNT().
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Students (
StudentID INT,
Name VARCHAR(50),
Grade VARCHAR(2),
EnrollmentDate DATE
);
-- Datasets
INSERT INTO Students (StudentID, Name, Grade, EnrollmentDate) VALUES
(1, 'John Doe', 'A', '2023-01-10'),
(2, 'Jane Smith', 'B', '2022-09-20'),
(3, 'Sam Brown', 'A', '2021-05-15'),
(4, 'Anna White', 'C', '2024-02-22'),
(5, 'Peter Black', 'B', '2022-11-03');
Solutions
• - PostgreSQL solution
SELECT COUNT(StudentID) AS TotalStudents
FROM Students;
• - MySQL solution
SELECT COUNT(StudentID) AS TotalStudents
FROM Students;
o Q.118
Question
Count how many rows in the Users table have a non-null date of birth (DOB).
Explanation
You need to count how many users have a non-null DOB in the Users table.
COUNT(DOB) skips rows where DOB is NULL, so it returns exactly that number.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Users (
UserID INT,
UserName VARCHAR(50),
DOB DATE
);
-- Datasets
INSERT INTO Users (UserID, UserName, DOB) VALUES
(1, 'Alice', '1990-06-15'),
(2, 'Bob', NULL),
(3, 'Charlie', '1985-09-22'),
(4, 'David', NULL),
(5, 'Eve', '1992-01-11');
Solutions
• - PostgreSQL solution
SELECT COUNT(DOB) AS UsersWithDOB
FROM Users;
• - MySQL solution
SELECT COUNT(DOB) AS UsersWithDOB
FROM Users;
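The difference between COUNT(*) and COUNT(column) is easy to check on the dataset above. The sketch below compares the two and derives the number of missing dates of birth from them. SQLite via Python's sqlite3 module is an assumption made for runnability; Python's None is stored as SQL NULL.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserID INT, UserName VARCHAR(50), DOB DATE)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [
        (1, "Alice", "1990-06-15"),
        (2, "Bob", None),       # None is stored as NULL
        (3, "Charlie", "1985-09-22"),
        (4, "David", None),
        (5, "Eve", "1992-01-11"),
    ],
)

total_rows, users_with_dob, users_without_dob = conn.execute("""
    SELECT COUNT(*)              AS TotalRows,       -- counts every row
           COUNT(DOB)            AS UsersWithDOB,    -- skips NULL values
           COUNT(*) - COUNT(DOB) AS UsersWithoutDOB
    FROM Users
""").fetchone()

print(total_rows, users_with_dob, users_without_dob)  # 5 3 2
```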
o Q.119
Question
Count how many records in the Books table have a publication year after
2010.
Explanation
You need to count how many books in the Books table were published after
2010. Filter the rows with WHERE PublicationYear > 2010 and apply COUNT() to
what remains.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Books (
BookID INT,
BookTitle VARCHAR(100),
Author VARCHAR(50),
PublicationYear INT
);
-- Datasets
INSERT INTO Books (BookID, BookTitle, Author, PublicationYear) VALUES
(1, 'The Great Gatsby', 'F. Scott Fitzgerald', 1925),
(2, 'The Catcher in the Rye', 'J.D. Salinger', 1951),
(3, 'Sapiens', 'Yuval Noah Harari', 2014),
(4, 'Becoming', 'Michelle Obama', 2018),
(5, 'Educated', 'Tara Westover', 2018);
Solutions
• - PostgreSQL solution
SELECT COUNT(BookID) AS BooksAfter2010
FROM Books
WHERE PublicationYear > 2010;
• - MySQL solution
SELECT COUNT(BookID) AS BooksAfter2010
FROM Books
WHERE PublicationYear > 2010;
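If an interviewer rules out WHERE, the same count can be produced with a CASE expression inside COUNT(): rows that fail the condition yield NULL, and COUNT() ignores NULL. The sketch below runs both forms on the dataset above. SQLite via Python's sqlite3 module is an assumption made for runnability.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Books (BookID INT, BookTitle VARCHAR(100), "
    "Author VARCHAR(50), PublicationYear INT)"
)
conn.executemany(
    "INSERT INTO Books VALUES (?, ?, ?, ?)",
    [
        (1, "The Great Gatsby", "F. Scott Fitzgerald", 1925),
        (2, "The Catcher in the Rye", "J.D. Salinger", 1951),
        (3, "Sapiens", "Yuval Noah Harari", 2014),
        (4, "Becoming", "Michelle Obama", 2018),
        (5, "Educated", "Tara Westover", 2018),
    ],
)

# Filter first, then count the surviving rows.
with_where = conn.execute(
    "SELECT COUNT(BookID) FROM Books WHERE PublicationYear > 2010"
).fetchone()[0]

# No WHERE: CASE returns NULL for older books, and COUNT() skips NULL.
without_where = conn.execute("""
    SELECT COUNT(CASE WHEN PublicationYear > 2010 THEN 1 END) AS BooksAfter2010
    FROM Books
""").fetchone()[0]

print(with_where, without_where)  # 3 3
```

The CASE form is standard SQL, so it should carry over to PostgreSQL and MySQL unchanged.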
o Q.120
Question
Count the number of distinct cities in the Locations table.
Explanation
You need to count how many distinct cities are available in the Locations
table using COUNT(DISTINCT).
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Locations (
LocationID INT,
City VARCHAR(50),
Country VARCHAR(50),
Population INT
);
-- Datasets
INSERT INTO Locations (LocationID, City, Country, Population) VALUES
(1, 'Mumbai', 'India', 20000000),
(2, 'Delhi', 'India', 19000000),
(3, 'New York', 'USA', 8500000),
(4, 'Los Angeles', 'USA', 4000000),
(5, 'Mumbai', 'India', 20000000);
Solutions
• - PostgreSQL solution
SELECT COUNT(DISTINCT City) AS DistinctCities
FROM Locations;
• - MySQL solution
SELECT COUNT(DISTINCT City) AS DistinctCities
FROM Locations;
WHERE Clause
• Q.121
Question
Fetch the distinct customer IDs who made purchases in the last month.
Explanation
You need to retrieve the unique customer IDs from the Transactions table where the
TransactionDate falls within the last month, taken here as 15th December 2024 to 15th
January 2025. Compare TransactionDate against both ends of that range in the WHERE
clause.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Transactions (
TransactionID INT,
CustomerID INT,
ProductID INT,
TransactionDate DATE,
Amount DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Transactions (TransactionID, CustomerID, ProductID, TransactionDate,
Amount) VALUES
(1, 101, 1, '2024-12-01', 49.99),
(2, 102, 2, '2024-12-02', 39.99),
(3, 101, 3, '2024-12-03', 79.99),
(4, 103, 4, '2024-12-04', 19.99),
(5, 104, 5, '2024-12-05', 24.99);
Learnings
Solutions
• - PostgreSQL solution
SELECT DISTINCT CustomerID
FROM Transactions
WHERE TransactionDate >= '2024-12-15' AND TransactionDate <= '2025-01-15';
• - MySQL solution
SELECT DISTINCT CustomerID
FROM Transactions
WHERE TransactionDate >= '2024-12-15' AND TransactionDate <= '2025-01-15';
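This solution covers purchases made in the last month, taken here as 15th December
2024 to 15th January 2025 (both dates included). Every row in the sample dataset is
dated 1st to 5th December 2024, so the query returns no rows for this data.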
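Rather than typing both dates by hand, the start of the window can be derived from a reference date. The sketch below does this with SQLite's date() function through Python's sqlite3 module; both are assumptions made for runnability. In PostgreSQL the equivalent expression is typically CURRENT_DATE - INTERVAL '1 month', and in MySQL DATE_SUB(CURDATE(), INTERVAL 1 MONTH). The reference date is fixed at 2025-01-15 so the result is reproducible, and row 6 is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Transactions (TransactionID INT, CustomerID INT, "
    "ProductID INT, TransactionDate DATE, Amount DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO Transactions VALUES (?, ?, ?, ?, ?)",
    [
        (1, 101, 1, "2024-12-01", 49.99),
        (2, 102, 2, "2024-12-02", 39.99),
        (3, 101, 3, "2024-12-03", 79.99),
        (4, 103, 4, "2024-12-04", 19.99),
        (5, 104, 5, "2024-12-05", 24.99),
    ],
)

ref = {"ref": "2025-01-15"}  # fixed "today" so the example is reproducible
query = """
    SELECT DISTINCT CustomerID
    FROM Transactions
    WHERE TransactionDate >= date(:ref, '-1 month')
      AND TransactionDate <= :ref
    ORDER BY CustomerID
"""

# Start of the window, one month before the reference date.
window_start = conn.execute("SELECT date(:ref, '-1 month')", ref).fetchone()[0]

# The book's rows all fall before the window, so nothing matches.
book_rows = [row[0] for row in conn.execute(query, ref)]

# Hypothetical extra row inside the window.
conn.execute("INSERT INTO Transactions VALUES (6, 105, 1, '2024-12-20', 59.99)")
after_extra = [row[0] for row in conn.execute(query, ref)]

print(window_start)  # 2024-12-15
print(book_rows)     # []
print(after_extra)   # [105]
```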
• Q.122
Question
Retrieve all aircraft orders where the quantity is greater than 50 from the
AircraftOrders table.
Explanation
You need to select the rows from the AircraftOrders table where the Quantity
column is greater than 50. The WHERE clause is used to apply this condition.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE AircraftOrders (
OrderID INT,
CustomerName VARCHAR(50),
AircraftModel VARCHAR(50),
Quantity INT
);
-- Datasets
INSERT INTO AircraftOrders (OrderID, CustomerName, AircraftModel, Quantity) VALUES
(1, 'Lufthansa', 'A320', 60),
(2, 'Air France', 'A380', 30),
(3, 'British Airways', 'A350', 70),
(4, 'Ryanair', 'A320', 40),
(5, 'Iberia', 'A321', 55);
Learnings
Solutions
• - PostgreSQL solution
SELECT OrderID, CustomerName, AircraftModel, Quantity
FROM AircraftOrders
WHERE Quantity > 50;
• - MySQL solution
SELECT OrderID, CustomerName, AircraftModel, Quantity
FROM AircraftOrders
WHERE Quantity > 50;
• Q.123
Question
Retrieve all cars sold in the year 2024 from the CarSales table.
Explanation
You need to select all records from the CarSales table where the SaleYear is 2024.
The WHERE clause will be used to filter the data based on the year.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE CarSales (
SaleID INT,
ModelName VARCHAR(50),
SaleYear INT,
SaleAmount DECIMAL(10, 2)
);
-- Datasets
INSERT INTO CarSales (SaleID, ModelName, SaleYear, SaleAmount) VALUES
(1, 'X5', 2024, 80000.00),
(2, 'i4', 2023, 60000.00),
(3, '3 Series', 2024, 45000.00),
(4, 'i7', 2022, 120000.00),
(5, '5 Series', 2024, 65000.00);
Learnings
• Filtering data based on specific conditions (e.g., a certain year) using the
WHERE clause.
• Working with date-based or year-based data in SQL queries.
Solutions
• - PostgreSQL solution
SELECT SaleID, ModelName, SaleYear, SaleAmount
FROM CarSales
WHERE SaleYear = 2024;
• - MySQL solution
SELECT SaleID, ModelName, SaleYear, SaleAmount
FROM CarSales
WHERE SaleYear = 2024;
• Q.124
Question
Get all distinct Tesla vehicle models ordered.
Explanation
You need to query the VehicleOrders table and retrieve distinct vehicle models that
are Tesla vehicles. No duplicates should appear in the result.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE VehicleOrders (
OrderID INT,
CustomerID INT,
Model VARCHAR(50),
OrderDate DATE,
Price DECIMAL(10, 2)
);
-- Datasets
INSERT INTO VehicleOrders (OrderID, CustomerID, Model, OrderDate, Price) VALUES
(1, 201, 'Model S', '2024-11-01', 79999.99),
(2, 202, 'Model 3', '2024-11-02', 49999.99),
(3, 201, 'Model X', '2024-11-03', 89999.99),
(4, 203, 'Model 3', '2024-11-04', 49999.99),
(5, 204, 'Model Y', '2024-11-05', 69999.99);
Learnings
Solutions
• - PostgreSQL solution
SELECT DISTINCT Model
FROM VehicleOrders
WHERE Model LIKE 'Model%';
• - MySQL solution
SELECT DISTINCT Model
FROM VehicleOrders
WHERE Model LIKE 'Model%';
• Q.125
Question
Select the total amount spent by a specific customer (CustomerID = 101) from the
Purchases table.
Explanation
You need to calculate the total amount spent by a specific customer. Filter the rows
with WHERE CustomerID = 101 and use SUM() on the Amount column to calculate the total.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Purchases (
PurchaseID INT,
CustomerID INT,
Amount DECIMAL(10, 2),
PurchaseDate DATE
);
-- Datasets
INSERT INTO Purchases (PurchaseID, CustomerID, Amount, PurchaseDate) VALUES
(1, 101, 250.00, '2024-01-10'),
(2, 102, 150.00, '2024-02-20'),
(3, 101, 100.00, '2024-03-15'),
(4, 103, 200.00, '2024-04-01'),
(5, 101, 50.00, '2024-05-22');
Solutions
• - PostgreSQL solution
SELECT CustomerID, SUM(Amount) AS TotalSpent
FROM Purchases
WHERE CustomerID = 101
GROUP BY CustomerID;
• - MySQL solution
SELECT CustomerID, SUM(Amount) AS TotalSpent
FROM Purchases
WHERE CustomerID = 101
GROUP BY CustomerID;
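PostgreSQL rejects a query that selects a plain column next to an aggregate unless that column appears in GROUP BY, so a portable version of this query groups by CustomerID. The sketch below runs that grouped form on the dataset above. SQLite via Python's sqlite3 module is an assumption made for runnability; SQLite itself is more lenient than PostgreSQL on this point.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Purchases (PurchaseID INT, CustomerID INT, "
    "Amount DECIMAL(10, 2), PurchaseDate DATE)"
)
conn.executemany(
    "INSERT INTO Purchases VALUES (?, ?, ?, ?)",
    [
        (1, 101, 250.00, "2024-01-10"),
        (2, 102, 150.00, "2024-02-20"),
        (3, 101, 100.00, "2024-03-15"),
        (4, 103, 200.00, "2024-04-01"),
        (5, 101, 50.00, "2024-05-22"),
    ],
)

# WHERE keeps only customer 101's rows; GROUP BY makes CustomerID a grouped column.
customer_id, total_spent = conn.execute("""
    SELECT CustomerID, SUM(Amount) AS TotalSpent
    FROM Purchases
    WHERE CustomerID = 101
    GROUP BY CustomerID
""").fetchone()

print(customer_id, total_spent)  # customer 101 spent 250 + 100 + 50 = 400
```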
• Q.126
Question
Retrieve all sales of Diesel where the quantity is between 1000 and 5000 liters from
the FuelSales table.
Explanation
You need to select all records from the FuelSales table where the FuelType is
'Diesel' and the QuantityLiters is between 1000 and 5000 liters. The WHERE clause
will be used to filter both conditions.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE FuelSales (
SaleID INT,
FuelType VARCHAR(50),
QuantityLiters INT,
SaleAmount DECIMAL(10, 2)
);
-- Datasets
INSERT INTO FuelSales (SaleID, FuelType, QuantityLiters, SaleAmount) VALUES
(1, 'Diesel', 1200, 2500.00),
(2, 'Petrol', 800, 2000.00),
(3, 'Diesel', 4000, 5000.00),
(4, 'Diesel', 6000, 8000.00),
(5, 'Petrol', 1500, 3000.00);
Learnings
Solutions
• - PostgreSQL solution
SELECT SaleID, FuelType, QuantityLiters, SaleAmount
FROM FuelSales
WHERE FuelType = 'Diesel' AND QuantityLiters BETWEEN 1000 AND 5000;
• - MySQL solution
SELECT SaleID, FuelType, QuantityLiters, SaleAmount
FROM FuelSales
WHERE FuelType = 'Diesel' AND QuantityLiters BETWEEN 1000 AND 5000;
• Q.127
Question
Retrieve all products with "Table" in their name and a price less than 200 EUR from
the ProductInventory table.
Explanation
You need to select all products where the ProductName contains the word "Table"
and the Price is less than 200 EUR. The WHERE clause will be used to filter both
conditions: a substring match for the product name and a condition for the price.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE ProductInventory (
ProductID INT,
ProductName VARCHAR(50),
Price DECIMAL(10, 2),
Stock INT
);
-- Datasets
INSERT INTO ProductInventory (ProductID, ProductName, Price, Stock) VALUES
(1, 'Dining Table', 150.00, 100),
(2, 'Coffee Table', 180.00, 200),
(3, 'Office Chair', 100.00, 300),
(4, 'Bed Frame', 250.00, 150),
(5, 'Side Table', 90.00, 500);
Learnings
• Using LIKE for pattern matching in the WHERE clause to find a substring within
a column (e.g., products containing "Table").
• Combining multiple conditions in the WHERE clause (e.g., price less than 200
EUR).
Solutions
• - PostgreSQL solution
SELECT ProductID, ProductName, Price, Stock
FROM ProductInventory
WHERE ProductName LIKE '%Table%' AND Price < 200;
• - MySQL solution
SELECT ProductID, ProductName, Price, Stock
FROM ProductInventory
WHERE ProductName LIKE '%Table%' AND Price < 200;
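The sketch below runs the substring match and the price condition together on the dataset above. SQLite via Python's sqlite3 module is an assumption made for runnability. One portability point worth knowing: LIKE is case-sensitive in PostgreSQL, case-insensitive for ASCII text in SQLite, and depends on the column's collation in MySQL, so matching the stored capitalisation ('Table') is the safe choice.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductInventory (ProductID INT, ProductName VARCHAR(50), "
    "Price DECIMAL(10, 2), Stock INT)"
)
conn.executemany(
    "INSERT INTO ProductInventory VALUES (?, ?, ?, ?)",
    [
        (1, "Dining Table", 150.00, 100),
        (2, "Coffee Table", 180.00, 200),
        (3, "Office Chair", 100.00, 300),
        (4, "Bed Frame", 250.00, 150),
        (5, "Side Table", 90.00, 500),
    ],
)

# '%Table%' matches "Table" anywhere in the name; both conditions must hold.
matches = [
    row[0]
    for row in conn.execute("""
        SELECT ProductName
        FROM ProductInventory
        WHERE ProductName LIKE '%Table%' AND Price < 200
        ORDER BY ProductID
    """)
]

print(matches)  # ['Dining Table', 'Coffee Table', 'Side Table']
```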
• Q.128
Question
Retrieve all users with a Premium subscription who joined after 2022 from the
UserSubscriptions table.
Explanation
You need to select all users where the SubscriptionType is 'Premium' and the
JoinYear is greater than 2022. The WHERE clause will be used to filter these two
conditions.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE UserSubscriptions (
UserID INT,
UserName VARCHAR(50),
SubscriptionType VARCHAR(50),
JoinYear INT
);
-- Datasets
INSERT INTO UserSubscriptions (UserID, UserName, SubscriptionType, JoinYear) VALUES
(1, 'Alice', 'Premium', 2023),
(2, 'Bob', 'Free', 2022),
(3, 'Charlie', 'Premium', 2024),
(4, 'Diana', 'Free', 2021),
(5, 'Eve', 'Premium', 2021);
Learnings
• Filtering data using the WHERE clause with multiple conditions (e.g.,
SubscriptionType = 'Premium' and JoinYear > 2022).
• Working with text-based columns (e.g., SubscriptionType) and numeric
columns (e.g., JoinYear).
Solutions
• - PostgreSQL solution
SELECT UserID, UserName, SubscriptionType, JoinYear
FROM UserSubscriptions
WHERE SubscriptionType = 'Premium' AND JoinYear > 2022;
• - MySQL solution
SELECT UserID, UserName, SubscriptionType, JoinYear
FROM UserSubscriptions
WHERE SubscriptionType = 'Premium' AND JoinYear > 2022;
• Q.129
Question
Retrieve all chocolate products sold in either Germany or France with sales above 500
units from the ProductSales table.
Explanation
You need to select all records where the ProductCategory is 'Chocolate', the
Country is either 'Germany' or 'France', and the UnitsSold is greater than 500. Use
the WHERE clause to filter the conditions.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE ProductSales (
SaleID INT,
ProductCategory VARCHAR(50),
Country VARCHAR(50),
UnitsSold INT
);
-- Datasets
INSERT INTO ProductSales (SaleID, ProductCategory, Country, UnitsSold) VALUES
(1, 'Chocolate', 'Germany', 600),
(2, 'Chocolate', 'France', 700),
(3, 'Beverage', 'Germany', 400),
(4, 'Snacks', 'France', 300),
(5, 'Chocolate', 'Spain', 200);
Learnings
Solutions
• - PostgreSQL solution
SELECT SaleID, ProductCategory, Country, UnitsSold
FROM ProductSales
WHERE ProductCategory = 'Chocolate' AND (Country = 'Germany' OR Country =
'France') AND UnitsSold > 500;
• - MySQL solution
SELECT SaleID, ProductCategory, Country, UnitsSold
FROM ProductSales
WHERE ProductCategory = 'Chocolate' AND (Country = 'Germany' OR Country =
'France') AND UnitsSold > 500;
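The parentheses around the two country checks matter because AND binds more tightly than OR. An IN list expresses the same condition without that risk. The sketch below runs both forms on the dataset above and shows they return the same rows. SQLite via Python's sqlite3 module is an assumption made for runnability.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductSales (SaleID INT, ProductCategory VARCHAR(50), "
    "Country VARCHAR(50), UnitsSold INT)"
)
conn.executemany(
    "INSERT INTO ProductSales VALUES (?, ?, ?, ?)",
    [
        (1, "Chocolate", "Germany", 600),
        (2, "Chocolate", "France", 700),
        (3, "Beverage", "Germany", 400),
        (4, "Snacks", "France", 300),
        (5, "Chocolate", "Spain", 200),
    ],
)

# Form used in the solutions: OR inside parentheses.
with_or = [
    row[0]
    for row in conn.execute("""
        SELECT SaleID FROM ProductSales
        WHERE ProductCategory = 'Chocolate'
          AND (Country = 'Germany' OR Country = 'France')
          AND UnitsSold > 500
        ORDER BY SaleID
    """)
]

# Equivalent form: IN replaces the chain of OR comparisons.
with_in = [
    row[0]
    for row in conn.execute("""
        SELECT SaleID FROM ProductSales
        WHERE ProductCategory = 'Chocolate'
          AND Country IN ('Germany', 'France')
          AND UnitsSold > 500
        ORDER BY SaleID
    """)
]

print(with_or)  # [1, 2]
print(with_in)  # [1, 2]
```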
• Q.130
Question
Retrieve all users whose average daily data usage exceeds 2GB but is below 5GB
from the NetworkUsage table.
Explanation
You need to select all records where the AverageDailyUsageGB is greater than 2 and
less than 5. Use the WHERE clause to filter this numeric condition.
-- Datasets
INSERT INTO NetworkUsage (UsageID, UserID, AverageDailyUsageGB) VALUES
(1, 101, 2.5),
(2, 102, 1.8),
(3, 103, 4.7),
(4, 104, 5.5),
(5, 105, 3.2);
Learnings
• Filtering numeric data using comparison operators (>, <) to find values within
a specific range.
• Working with DECIMAL data type and performing range-based filtering.
Solutions
• - PostgreSQL solution
SELECT UsageID, UserID, AverageDailyUsageGB
FROM NetworkUsage
WHERE AverageDailyUsageGB > 2 AND AverageDailyUsageGB < 5;
• - MySQL solution
SELECT UsageID, UserID, AverageDailyUsageGB
FROM NetworkUsage
WHERE AverageDailyUsageGB > 2 AND AverageDailyUsageGB < 5;
GROUP BY
• Q.131
Question
Retrieve the distinct product names along with their corresponding average price from
each manufacturer.
Explanation
You need to select the product names and calculate the average price of each product
for each manufacturer. Use GROUP BY on ProductName and Manufacturer, and AVG() for
the average price.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Products (
ProductID INT,
ProductName VARCHAR(50),
Manufacturer VARCHAR(50),
Price DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Products (ProductID, ProductName, Manufacturer, Price) VALUES
(1, 'Laptop', 'Company A', 1200.00),
(2, 'Smartphone', 'Company B', 800.00),
(3, 'Tablet', 'Company A', 600.00),
(4, 'Smartwatch', 'Company B', 200.00),
(5, 'Laptop', 'Company A', 1100.00);
Solutions
• - PostgreSQL solution
SELECT ProductName, Manufacturer, AVG(Price) AS AveragePrice
FROM Products
GROUP BY ProductName, Manufacturer;
• - MySQL solution
SELECT ProductName, Manufacturer, AVG(Price) AS AveragePrice
FROM Products
GROUP BY ProductName, Manufacturer;
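Grouping by two columns creates one group per distinct (ProductName, Manufacturer) pair. The sketch below runs the query on the dataset above, where the two 'Laptop' rows from 'Company A' collapse into a single group. SQLite via Python's sqlite3 module is an assumption made for runnability, and the ORDER BY is added only for a predictable output order.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (ProductID INT, ProductName VARCHAR(50), "
    "Manufacturer VARCHAR(50), Price DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        (1, "Laptop", "Company A", 1200.00),
        (2, "Smartphone", "Company B", 800.00),
        (3, "Tablet", "Company A", 600.00),
        (4, "Smartwatch", "Company B", 200.00),
        (5, "Laptop", "Company A", 1100.00),
    ],
)

# Five rows become four groups: the two Company A laptops share one group.
rows = conn.execute("""
    SELECT ProductName, Manufacturer, AVG(Price) AS AveragePrice
    FROM Products
    GROUP BY ProductName, Manufacturer
    ORDER BY ProductName
""").fetchall()

for row in rows:
    print(row)
# ('Laptop', 'Company A', 1150.0)
# ('Smartphone', 'Company B', 800.0)
# ('Smartwatch', 'Company B', 200.0)
# ('Tablet', 'Company A', 600.0)
```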
• Q.132
Question
Select the customer names along with their corresponding order IDs and the total
amount spent for each order.
Explanation
You need to select the customer name, order ID, and the total amount for each order
by grouping data based on CustomerName and OrderID. Use SUM() to calculate the
total amount spent for each order.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Orders (
OrderID INT,
CustomerID INT,
CustomerName VARCHAR(50),
Amount DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Orders (OrderID, CustomerID, CustomerName, Amount) VALUES
(1, 101, 'Alice', 250.00),
(2, 102, 'Bob', 150.00),
(3, 101, 'Alice', 100.00),
(4, 103, 'Charlie', 200.00),
(5, 102, 'Bob', 50.00);
Solutions
• - PostgreSQL solution
SELECT CustomerName, OrderID, SUM(Amount) AS TotalAmount
FROM Orders
GROUP BY CustomerName, OrderID;
• - MySQL solution
SELECT CustomerName, OrderID, SUM(Amount) AS TotalAmount
FROM Orders
GROUP BY CustomerName, OrderID;
• Q.133
Question
Fetch all distinct ride types offered by Uber, along with the average price for each ride
type.
Explanation
You need to retrieve distinct ride types from the Rides table. Additionally, for each
ride type, calculate the average price. Use the GROUP BY clause to group the data by
RideType and AVG() to calculate the average price.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Rides (
RideID INT,
CustomerID INT,
RideType VARCHAR(50),
Distance DECIMAL(5, 2),
Price DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Rides (RideID, CustomerID, RideType, Distance, Price) VALUES
(1, 401, 'UberX', 5.2, 12.50),
(2, 402, 'UberXL', 10.5, 25.00),
(3, 403, 'UberX', 7.8, 15.00),
(4, 404, 'Uber Black', 3.2, 20.00),
(5, 405, 'UberXL', 8.1, 30.00);
Learnings
Solutions
• - PostgreSQL solution
SELECT RideType, AVG(Price) AS AveragePrice
FROM Rides
GROUP BY RideType;
• - MySQL solution
SELECT RideType, AVG(Price) AS AveragePrice
FROM Rides
GROUP BY RideType;
• Q.134
Question
Select the average salary per department and the department with the highest average
salary.
Explanation
You need to calculate the average salary for each department with AVG() and GROUP
BY. Then, to select the department that has the highest average salary, sort the
averages in descending order and keep only the first row with LIMIT 1.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Employees (
EmployeeID INT,
EmployeeName VARCHAR(50),
Department VARCHAR(50),
Salary DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Employees (EmployeeID, EmployeeName, Department, Salary) VALUES
(1, 'Alice', 'Engineering', 100000.00),
(2, 'Bob', 'Engineering', 95000.00),
(3, 'Charlie', 'HR', 70000.00),
(4, 'David', 'HR', 75000.00),
(5, 'Eve', 'Marketing', 60000.00);
Solutions
• - PostgreSQL solution
SELECT Department, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY Department
ORDER BY AvgSalary DESC
LIMIT 1;
• - MySQL solution
SELECT Department, AVG(Salary) AS AvgSalary
FROM Employees
GROUP BY Department
ORDER BY AvgSalary DESC
LIMIT 1;
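The question has two parts, and the LIMIT 1 query returns only the second. The sketch below runs both on the dataset above: first every department's average, then the single top row. SQLite via Python's sqlite3 module is an assumption made for runnability.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EmployeeID INT, EmployeeName VARCHAR(50), "
    "Department VARCHAR(50), Salary DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", "Engineering", 100000.00),
        (2, "Bob", "Engineering", 95000.00),
        (3, "Charlie", "HR", 70000.00),
        (4, "David", "HR", 75000.00),
        (5, "Eve", "Marketing", 60000.00),
    ],
)

base_query = """
    SELECT Department, AVG(Salary) AS AvgSalary
    FROM Employees
    GROUP BY Department
    ORDER BY AvgSalary DESC
"""

# Part 1: the average salary of every department, highest first.
all_averages = conn.execute(base_query).fetchall()

# Part 2: keep only the first row, the department with the highest average.
top_department = conn.execute(base_query + " LIMIT 1").fetchone()

print(all_averages)
# [('Engineering', 97500.0), ('HR', 72500.0), ('Marketing', 60000.0)]
print(top_department)
# ('Engineering', 97500.0)
```

Note that LIMIT 1 returns a single row even when two departments tie for the highest average.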
Question 2
Find the highest and lowest priced products from each category.
Explanation
You need to find both the maximum (MAX()) and minimum (MIN()) price for products
within each category. This requires grouping the products by their category.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Products (
ProductID INT,
ProductName VARCHAR(50),
Category VARCHAR(50),
Price DECIMAL(10, 2)
);
-- Datasets
Solutions
• - PostgreSQL solution
SELECT Category, MAX(Price) AS MaxPrice, MIN(Price) AS MinPrice
FROM Products
GROUP BY Category;
• - MySQL solution
SELECT Category, MAX(Price) AS MaxPrice, MIN(Price) AS MinPrice
FROM Products
GROUP BY Category;
Question 3
Select the total amount spent by each customer, along with the date of their first and
last purchase.
Explanation
For each customer, calculate the total amount spent (use SUM()), and also find the date
of their first and last purchase using MIN() and MAX() respectively. Group the result
by customer.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Purchases (
PurchaseID INT,
CustomerID INT,
CustomerName VARCHAR(50),
Amount DECIMAL(10, 2),
PurchaseDate DATE
);
-- Datasets
INSERT INTO Purchases (PurchaseID, CustomerID, CustomerName, Amount, PurchaseDate)
VALUES
(1, 101, 'Alice', 250.00, '2024-01-10'),
(2, 102, 'Bob', 150.00, '2024-02-20'),
(3, 101, 'Alice', 100.00, '2024-03-15'),
(4, 103, 'Charlie', 200.00, '2024-04-01'),
(5, 102, 'Bob', 50.00, '2024-05-22');
Solutions
• - PostgreSQL solution
SELECT CustomerName, SUM(Amount) AS TotalSpent, MIN(PurchaseDate) AS FirstPurchase,
MAX(PurchaseDate) AS LastPurchase
FROM Purchases
GROUP BY CustomerName;
• - MySQL solution
SELECT CustomerName, SUM(Amount) AS TotalSpent, MIN(PurchaseDate) AS FirstPurchase,
MAX(PurchaseDate) AS LastPurchase
FROM Purchases
GROUP BY CustomerName;
• Q.135
Question
Find the number of missions launched by each country and filter those with more than
2 missions from the SpaceMissions table.
Explanation
You need to calculate the number of missions launched by each country. After
counting the missions per country, filter the results to show only countries with more
than 2 missions. This can be achieved by using aggregation and a HAVING clause to
filter based on the count.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE SpaceMissions (
MissionID INT,
MissionName VARCHAR(50),
LaunchCountry VARCHAR(50),
LaunchDate DATE
);
-- Datasets
INSERT INTO SpaceMissions (MissionID, MissionName, LaunchCountry, LaunchDate)
VALUES
(1, 'Apollo 11', 'USA', '1969-07-16'),
(2, 'Luna 2', 'USSR', '1959-09-12'),
(3, 'Voyager 1', 'USA', '1977-09-05'),
(4, 'Mars Rover', 'USA', '2003-06-10'),
(5, 'Venera 7', 'USSR', '1970-08-17');
Learnings
• Using the COUNT() function to aggregate the number of records (missions) per
group (country).
• Filtering the aggregated results using the HAVING clause to display only groups
that meet the condition (missions > 2).
Solutions
• - PostgreSQL solution
SELECT LaunchCountry, COUNT(MissionID) AS MissionCount
FROM SpaceMissions
GROUP BY LaunchCountry
HAVING COUNT(MissionID) > 2;
• - MySQL solution
SELECT LaunchCountry, COUNT(MissionID) AS MissionCount
FROM SpaceMissions
GROUP BY LaunchCountry
HAVING COUNT(MissionID) > 2;
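The sketch below runs the grouped count on the SpaceMissions dataset twice, without and with the HAVING clause, to show which group the filter removes. SQLite via Python's sqlite3 module is an assumption made for runnability, and the ORDER BY is added only for a predictable output order.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SpaceMissions (MissionID INT, MissionName VARCHAR(50), "
    "LaunchCountry VARCHAR(50), LaunchDate DATE)"
)
conn.executemany(
    "INSERT INTO SpaceMissions VALUES (?, ?, ?, ?)",
    [
        (1, "Apollo 11", "USA", "1969-07-16"),
        (2, "Luna 2", "USSR", "1959-09-12"),
        (3, "Voyager 1", "USA", "1977-09-05"),
        (4, "Mars Rover", "USA", "2003-06-10"),
        (5, "Venera 7", "USSR", "1970-08-17"),
    ],
)

# All groups: one row per launch country.
all_counts = conn.execute("""
    SELECT LaunchCountry, COUNT(MissionID) AS MissionCount
    FROM SpaceMissions
    GROUP BY LaunchCountry
    ORDER BY LaunchCountry
""").fetchall()

# HAVING filters the groups after counting; only counts above 2 remain.
more_than_two = conn.execute("""
    SELECT LaunchCountry, COUNT(MissionID) AS MissionCount
    FROM SpaceMissions
    GROUP BY LaunchCountry
    HAVING COUNT(MissionID) > 2
    ORDER BY LaunchCountry
""").fetchall()

print(all_counts)     # [('USA', 3), ('USSR', 2)]
print(more_than_two)  # [('USA', 3)]
```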
• Q.136
Question
List the number of aircraft delivered for each model and filter for models with total
deliveries exceeding 50 from the AircraftDeliveries table.
Explanation
You need to calculate the total number of deliveries for each aircraft model. After
calculating the sum, filter the results to show only models with total deliveries greater
than 50. This can be done using aggregation and the HAVING clause to filter based on
the summed units.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE AircraftDeliveries (
DeliveryID INT,
Model VARCHAR(50),
UnitsDelivered INT
);
-- Datasets
INSERT INTO AircraftDeliveries (DeliveryID, Model, UnitsDelivered) VALUES
(1, 'A320', 30),
(2, 'A380', 20),
(3, 'A350', 60),
(4, 'A320', 40),
(5, 'A321', 70);
Learnings
• Using the SUM() function to aggregate the total units delivered for each model.
• Filtering aggregated results using the HAVING clause to only display models
with total deliveries exceeding 50.
Solutions
• - PostgreSQL solution
SELECT Model, SUM(UnitsDelivered) AS TotalDeliveries
FROM AircraftDeliveries
GROUP BY Model
HAVING SUM(UnitsDelivered) > 50;
• - MySQL solution
SELECT Model, SUM(UnitsDelivered) AS TotalDeliveries
FROM AircraftDeliveries
GROUP BY Model
HAVING SUM(UnitsDelivered) > 50;
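A common follow-up is why this filter cannot go in WHERE. WHERE tests individual rows before grouping, while HAVING tests each group's total after aggregation. The sketch below runs both on the dataset above; the A320 passes the HAVING filter (30 + 40 = 70) but disappears under WHERE because neither of its deliveries exceeds 50 on its own. SQLite via Python's sqlite3 module is an assumption made for runnability.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AircraftDeliveries (DeliveryID INT, Model VARCHAR(50), UnitsDelivered INT)"
)
conn.executemany(
    "INSERT INTO AircraftDeliveries VALUES (?, ?, ?)",
    [(1, "A320", 30), (2, "A380", 20), (3, "A350", 60), (4, "A320", 40), (5, "A321", 70)],
)

# HAVING: sum per model first, then keep models whose total exceeds 50.
having_filter = conn.execute("""
    SELECT Model, SUM(UnitsDelivered) AS TotalDeliveries
    FROM AircraftDeliveries
    GROUP BY Model
    HAVING SUM(UnitsDelivered) > 50
    ORDER BY Model
""").fetchall()

# WHERE: drop individual deliveries of 50 or fewer first, then sum what is left.
where_filter = conn.execute("""
    SELECT Model, SUM(UnitsDelivered) AS TotalDeliveries
    FROM AircraftDeliveries
    WHERE UnitsDelivered > 50
    GROUP BY Model
    ORDER BY Model
""").fetchall()

print(having_filter)  # [('A320', 70), ('A321', 70), ('A350', 60)]
print(where_filter)   # [('A321', 70), ('A350', 60)]
```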
• Q.137
Question
Find the total quantity of fuel sold for each type and filter fuel types with total sales
exceeding 5000 liters from the FuelTypes table.
Explanation
You need to calculate the total quantity of fuel sold for each fuel type. After
calculating the total, filter the results to show only fuel types where the total quantity
sold exceeds 5000 liters. This can be achieved by using aggregation and the HAVING
clause to filter based on the summed quantity.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE FuelTypes (
SaleID INT,
FuelType VARCHAR(50),
QuantityLiters INT
);
Learnings
• Using the SUM() function to aggregate the total quantity of fuel sold for each
fuel type.
• Filtering aggregated results using the HAVING clause to only display fuel types
with sales exceeding 5000 liters.
Solutions
• - PostgreSQL solution
SELECT FuelType, SUM(QuantityLiters) AS TotalQuantitySold
FROM FuelTypes
GROUP BY FuelType
HAVING SUM(QuantityLiters) > 5000;
• - MySQL solution
SELECT FuelType, SUM(QuantityLiters) AS TotalQuantitySold
FROM FuelTypes
GROUP BY FuelType
HAVING SUM(QuantityLiters) > 5000;
• Q.138
Question
Find the total number of streams for each artist and filter for artists with more than
1000 streams from the MusicStreams table.
Explanation
You need to calculate the total number of streams for each artist. After calculating the
total, filter the results to show only those artists who have more than 1000 streams.
This can be achieved by using aggregation and the HAVING clause to filter based on
the summed streams.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE MusicStreams (
StreamID INT,
ArtistName VARCHAR(50),
Streams INT
);
Learnings
• Using the SUM() function to aggregate the total number of streams for each
artist.
• Filtering aggregated results using the HAVING clause to only display artists
with more than 1000 streams.
Solutions
• - PostgreSQL solution
SELECT ArtistName, SUM(Streams) AS TotalStreams
FROM MusicStreams
GROUP BY ArtistName
HAVING SUM(Streams) > 1000;
• - MySQL solution
SELECT ArtistName, SUM(Streams) AS TotalStreams
FROM MusicStreams
GROUP BY ArtistName
HAVING SUM(Streams) > 1000;
• Q.139
Question
Count the distinct products sold for each category and filter categories with more than
2 distinct products from the ProductCategories table.
Explanation
You need to count the number of distinct products sold within each category. After
counting, filter the categories to include only those with more than 2 distinct products.
This can be done using the COUNT(DISTINCT ProductName) to count distinct
products and the HAVING clause to filter the categories.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE ProductCategories (
ProductID INT,
CategoryName VARCHAR(50),
ProductName VARCHAR(50)
);
Learnings
Solutions
• - PostgreSQL solution
SELECT CategoryName, COUNT(DISTINCT ProductName) AS DistinctProductCount
FROM ProductCategories
GROUP BY CategoryName
HAVING COUNT(DISTINCT ProductName) > 2;
• - MySQL solution
SELECT CategoryName, COUNT(DISTINCT ProductName) AS DistinctProductCount
FROM ProductCategories
GROUP BY CategoryName
HAVING COUNT(DISTINCT ProductName) > 2;
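To see why DISTINCT matters here, it can help to put the three common COUNT forms side by side. This is a minimal sketch against the same ProductCategories table; the column aliases are made up for illustration.

```sql
SELECT CategoryName,
       COUNT(*)                    AS TotalRows,            -- every row in the group
       COUNT(ProductName)          AS NonNullNames,         -- rows where ProductName is not NULL
       COUNT(DISTINCT ProductName) AS DistinctProductCount  -- unique non-NULL product names
FROM ProductCategories
GROUP BY CategoryName;
```

If the same product is listed twice in a category, TotalRows and NonNullNames count it twice, while DistinctProductCount counts it once.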
• Q.140
Question
Calculate the total data usage for each plan type and filter plans with data usage
exceeding 50 GB from the NetworkUsage table.
Explanation
You need to calculate the total data usage for each PlanType. After calculating the
total, filter the results to show only the plan types with total data usage exceeding 50
GB. This can be done using the SUM() function to aggregate the data usage and the
HAVING clause to filter the results.
Learnings
Solutions
• - PostgreSQL solution
SELECT PlanType, SUM(DataUsageGB) AS TotalDataUsage
FROM NetworkUsage
GROUP BY PlanType
HAVING SUM(DataUsageGB) > 50;
• - MySQL solution
SELECT PlanType, SUM(DataUsageGB) AS TotalDataUsage
FROM NetworkUsage
GROUP BY PlanType
HAVING SUM(DataUsageGB) > 50;
GROUP BY + HAVING
• Q.141
Question
Find the total order amount for each customer and filter customers who spent more
than 500 EUR.
Explanation
Calculate the total order amount for each customer and then filter the customers
whose total order amount exceeds 500 EUR.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE CustomerOrders (
OrderID INT,
CustomerName VARCHAR(50),
OrderAmount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO CustomerOrders (OrderID, CustomerName, OrderAmount) VALUES
(1, 'Alice', 150.00),
(2, 'Bob', 400.00),
(3, 'Charlie', 550.00),
(4, 'Diana', 200.00),
(5, 'Eve', 300.00),
(6, 'Frank', 600.00),
(7, 'Grace', 700.00),
(8, 'Hannah', 800.00),
(9, 'Ivan', 450.00),
(10, 'Jack', 250.00),
(11, 'Kathy', 650.00),
(12, 'Leo', 100.00),
(13, 'Mia', 1200.00),
(14, 'Nina', 350.00),
(15, 'Oscar', 800.00);
Learnings
• Aggregation (SUM)
• Grouping data (GROUP BY)
• Filtering using HAVING clause
Solutions
• - PostgreSQL solution
SELECT CustomerName, SUM(OrderAmount) AS TotalSpent
FROM CustomerOrders
GROUP BY CustomerName
HAVING SUM(OrderAmount) > 500;
• - MySQL solution
SELECT CustomerName, SUM(OrderAmount) AS TotalSpent
FROM CustomerOrders
GROUP BY CustomerName
HAVING SUM(OrderAmount) > 500;
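MySQL also accepts the alias in the filter (HAVING TotalSpent > 500), but PostgreSQL does not. When you would rather filter on the alias in both systems, one option is to aggregate in a derived table and filter outside it. The alias t below is arbitrary.

```sql
-- Aggregate first, then filter on the aliased column with WHERE.
SELECT CustomerName, TotalSpent
FROM (
    SELECT CustomerName, SUM(OrderAmount) AS TotalSpent
    FROM CustomerOrders
    GROUP BY CustomerName
) AS t
WHERE TotalSpent > 500
ORDER BY TotalSpent DESC;
```

With the sample rows above, each customer has a single order, so the result should contain the seven customers whose order exceeds 500.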
• Q.142
Question
Count the number of returns for each category and filter categories with more than 3
returns.
Explanation
Count the number of returns for each product category and then filter out categories
that have 3 or fewer returns.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE ProductReturns (
ReturnID INT,
CategoryName VARCHAR(50),
ReturnReason VARCHAR(100)
);
• - Datasets
INSERT INTO ProductReturns (ReturnID, CategoryName, ReturnReason) VALUES
(1, 'Shoes', 'Size issue'),
(2, 'Apparel', 'Defect'),
(3, 'Shoes', 'Damaged'),
(4, 'Apparel', 'Color mismatch'),
(5, 'Shoes', 'Wrong size'),
(6, 'Shoes', 'Quality issue'),
(7, 'Apparel', 'Defect'),
(8, 'Shoes', 'Size issue'),
(9, 'Accessories', 'Color mismatch'),
(10, 'Shoes', 'Wrong size'),
(11, 'Shoes', 'Damaged'),
(12, 'Apparel', 'Size issue'),
(13, 'Accessories', 'Defect'),
(14, 'Shoes', 'Quality issue'),
(15, 'Apparel', 'Fit issue'),
(16, 'Shoes', 'Wrong size'),
(17, 'Accessories', 'Color mismatch'),
(18, 'Shoes', 'Defect'),
(19, 'Apparel', 'Defect'),
(20, 'Shoes', 'Size issue');
Learnings
Solutions
• - PostgreSQL solution
SELECT CategoryName, COUNT(ReturnID) AS ReturnCount
FROM ProductReturns
GROUP BY CategoryName
HAVING COUNT(ReturnID) > 3;
• - MySQL solution
SELECT CategoryName, COUNT(ReturnID) AS ReturnCount
FROM ProductReturns
GROUP BY CategoryName
HAVING COUNT(ReturnID) > 3;
• Q.143
Question
Find the total absences for each department and filter departments with total absences
greater than 20.
Explanation
Sum the absences for each department and filter the results to include only
departments with a total absence count greater than 20.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE EmployeeAbsences (
AbsenceID INT,
Department VARCHAR(50),
Absences INT
);
• - Datasets
INSERT INTO EmployeeAbsences (AbsenceID, Department, Absences) VALUES
(1, 'Engineering', 12),
(2, 'HR', 5),
(3, 'Marketing', 10),
(4, 'Engineering', 15),
(5, 'HR', 8),
(6, 'Engineering', 10),
(7, 'Marketing', 7),
(8, 'HR', 4),
(9, 'Engineering', 20),
(10, 'Sales', 12),
(11, 'Engineering', 5),
(12, 'Marketing', 15),
(13, 'Sales', 5),
(14, 'HR', 6),
(15, 'Sales', 8),
(16, 'Marketing', 9),
(17, 'Engineering', 13),
(18, 'Sales', 10),
(19, 'HR', 3),
(20, 'Marketing', 10);
Learnings
Solutions
• - PostgreSQL solution
SELECT Department, SUM(Absences) AS TotalAbsences
FROM EmployeeAbsences
GROUP BY Department
HAVING SUM(Absences) > 20;
• - MySQL solution
SELECT Department, SUM(Absences) AS TotalAbsences
FROM EmployeeAbsences
GROUP BY Department
HAVING SUM(Absences) > 20;
• Q.144
Question
Find the number of books borrowed by each member and filter members who
borrowed more than 10 books in total.
Explanation
Sum the number of books borrowed by each member, then filter to only include
members who borrowed more than 10 books.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE BookBorrowing (
BorrowID INT,
MemberID INT,
BookTitle VARCHAR(100),
BorrowCount INT
);
• - Datasets
INSERT INTO BookBorrowing (BorrowID, MemberID, BookTitle, BorrowCount) VALUES
(1, 101, 'Introduction to SQL', 3),
(2, 102, 'Advanced SQL', 5),
(3, 103, 'Database Design', 4),
(4, 101, 'Data Structures', 2),
(5, 102, 'Algorithms', 6),
(6, 104, 'Operating Systems', 1),
(7, 105, 'Computer Networks', 7),
(8, 101, 'Machine Learning', 3),
(9, 103, 'Web Development', 6),
(10, 105, 'Cloud Computing', 5);
Learnings
Solutions
• - PostgreSQL solution
SELECT MemberID, SUM(BorrowCount) AS TotalBooksBorrowed
FROM BookBorrowing
GROUP BY MemberID
HAVING SUM(BorrowCount) > 10;
• - MySQL solution
SELECT MemberID, SUM(BorrowCount) AS TotalBooksBorrowed
FROM BookBorrowing
GROUP BY MemberID
HAVING SUM(BorrowCount) > 10;
• Q.145
Question
Find the total number of patient visits by department and filter departments with more
than 30 visits.
Explanation
Sum the total number of visits for each department and filter out departments with 30
or fewer visits.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE PatientVisits (
VisitID INT,
Department VARCHAR(50),
PatientID INT,
VisitCount INT
);
• - Datasets
INSERT INTO PatientVisits (VisitID, Department, PatientID, VisitCount) VALUES
(1, 'Cardiology', 201, 3),
(2, 'Neurology', 202, 4),
(3, 'Orthopedics', 203, 5),
(4, 'Cardiology', 204, 6),
(5, 'Neurology', 205, 7),
(6, 'Orthopedics', 206, 2),
(7, 'Pediatrics', 207, 8),
(8, 'Cardiology', 208, 4),
(9, 'Orthopedics', 209, 5),
(10, 'Pediatrics', 210, 5),
(11, 'Neurology', 211, 6),
(12, 'Orthopedics', 212, 4),
(13, 'Cardiology', 213, 3),
(14, 'Pediatrics', 214, 9),
(15, 'Orthopedics', 215, 6);
Learnings
Solutions
• - PostgreSQL solution
SELECT Department, SUM(VisitCount) AS TotalVisits
FROM PatientVisits
GROUP BY Department
HAVING SUM(VisitCount) > 30;
• - MySQL solution
SELECT Department, SUM(VisitCount) AS TotalVisits
FROM PatientVisits
GROUP BY Department
HAVING SUM(VisitCount) > 30;
• Q.146
Question
Find the number of students enrolled in each course and filter out courses with fewer
than 5 students.
Explanation
Count the number of students in each course and keep only the courses that have at
least 5 students enrolled.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE CourseEnrollments (
EnrollmentID INT,
CourseName VARCHAR(100),
StudentID INT
);
• - Datasets
INSERT INTO CourseEnrollments (EnrollmentID, CourseName, StudentID) VALUES
(1, 'Data Science', 301),
(2, 'Machine Learning', 302),
(3, 'Data Science', 303),
(4, 'Artificial Intelligence', 304),
(5, 'Data Science', 305),
(6, 'Web Development', 306),
(7, 'Machine Learning', 307),
(8, 'Web Development', 308),
(9, 'Artificial Intelligence', 309),
(10, 'Data Science', 310),
(11, 'Machine Learning', 311),
(12, 'Web Development', 312),
(13, 'Artificial Intelligence', 313),
(14, 'Data Science', 314),
(15, 'Machine Learning', 315);
Learnings
Solutions
• - PostgreSQL solution
SELECT CourseName, COUNT(StudentID) AS TotalStudents
FROM CourseEnrollments
GROUP BY CourseName
HAVING COUNT(StudentID) >= 5;
• - MySQL solution
SELECT CourseName, COUNT(StudentID) AS TotalStudents
FROM CourseEnrollments
GROUP BY CourseName
HAVING COUNT(StudentID) >= 5;
• Q.147
Question
Find the top 5 customers who spent the most on products in each product category.
Include only customers who have spent more than 1000 EUR in a single category and
sort by the total amount spent in descending order.
Explanation
For each product category, sum the spending for each customer. Filter to only include
customers with spending greater than 1000 EUR, and return the top 5 customers in
each category based on their spending.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE ProductSales (
SaleID INT,
ProductID INT,
ProductName VARCHAR(100),
Category VARCHAR(50),
CustomerID INT,
SaleAmount DECIMAL(10, 2),
SaleDate DATE
);
• - Datasets
INSERT INTO ProductSales (SaleID, ProductID, ProductName, Category, CustomerID, SaleAmount, SaleDate) VALUES
(1, 101, 'Laptop', 'Electronics', 201, 1200.00, '2023-01-01'),
(2, 102, 'Phone', 'Electronics', 202, 800.00, '2023-01-10'),
(3, 103, 'Shoes', 'Apparel', 201, 150.00, '2023-02-15'),
(4, 104, 'Jacket', 'Apparel', 203, 350.00, '2023-02-20'),
(5, 105, 'Headphones', 'Electronics', 204, 300.00, '2023-03-05'),
(6, 106, 'Smartwatch', 'Electronics', 201, 900.00, '2023-03-10'),
(7, 107, 'Shirt', 'Apparel', 202, 100.00, '2023-04-01'),
(8, 108, 'Laptop', 'Electronics', 203, 1300.00, '2023-04-05'),
(9, 109, 'Phone', 'Electronics', 205, 500.00, '2023-04-10'),
(10, 110, 'Shoes', 'Apparel', 206, 200.00, '2023-05-15'),
(11, 111, 'Tablet', 'Electronics', 207, 1100.00, '2023-05-25'),
(12, 112, 'Shirt', 'Apparel', 204, 250.00, '2023-06-05'),
(13, 113, 'Smartwatch', 'Electronics', 201, 1100.00, '2023-06-20');
Learnings
Solutions
• - PostgreSQL solution
WITH RankedCustomers AS (
SELECT Category, CustomerID, SUM(SaleAmount) AS TotalSpent,
ROW_NUMBER() OVER (PARTITION BY Category ORDER BY SUM(SaleAmount) DESC) AS SpendRank
FROM ProductSales
GROUP BY Category, CustomerID
HAVING SUM(SaleAmount) > 1000
)
SELECT Category, CustomerID, TotalSpent
FROM RankedCustomers
WHERE SpendRank <= 5
ORDER BY Category, TotalSpent DESC;
• - MySQL solution
-- RANK is a reserved word in MySQL 8.0, so the alias is named SpendRank.
WITH RankedCustomers AS (
SELECT Category, CustomerID, SUM(SaleAmount) AS TotalSpent,
ROW_NUMBER() OVER (PARTITION BY Category ORDER BY SUM(SaleAmount) DESC) AS SpendRank
FROM ProductSales
GROUP BY Category, CustomerID
HAVING SUM(SaleAmount) > 1000
)
SELECT Category, CustomerID, TotalSpent
FROM RankedCustomers
WHERE SpendRank <= 5
ORDER BY Category, TotalSpent DESC;
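A common follow-up question is how ties are handled. The sketch below computes the three standard ranking functions over the same grouped totals so the difference is visible; the aliases RowNum, RankWithGaps, and RankNoGaps are illustrative names, and window functions require MySQL 8.0 or later.

```sql
SELECT Category, CustomerID, SUM(SaleAmount) AS TotalSpent,
       -- ROW_NUMBER: always 1, 2, 3, ... ; ties are broken arbitrarily
       ROW_NUMBER() OVER (PARTITION BY Category ORDER BY SUM(SaleAmount) DESC) AS RowNum,
       -- RANK: tied rows share a rank and the next rank is skipped (1, 1, 3)
       RANK()       OVER (PARTITION BY Category ORDER BY SUM(SaleAmount) DESC) AS RankWithGaps,
       -- DENSE_RANK: tied rows share a rank and no rank is skipped (1, 1, 2)
       DENSE_RANK() OVER (PARTITION BY Category ORDER BY SUM(SaleAmount) DESC) AS RankNoGaps
FROM ProductSales
GROUP BY Category, CustomerID;
```

Filtering on ROW_NUMBER returns exactly five rows per category, while filtering on RANK or DENSE_RANK can return more than five when customers tie.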
• Q.148
Question
Identify the products that have more than 10 returns and have an average return
amount greater than 100 EUR. Only include products where the total sales amount is
greater than 5000 EUR.
Explanation
For each product, calculate the number of returns, the average return amount, and the
total sales amount. Keep only products with more than 10 returns, an average return
amount greater than 100 EUR, and total sales greater than 5000 EUR.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE SalesReturns (
ReturnID INT,
ProductID INT,
ProductName VARCHAR(100),
SaleAmount DECIMAL(10, 2),
ReturnAmount DECIMAL(10, 2),
ReturnDate DATE
);
• - Datasets
Learnings
Solutions
• - PostgreSQL solution
SELECT ProductID, ProductName, COUNT(ReturnID) AS TotalReturns, AVG(ReturnAmount)
AS AvgReturnAmount, SUM(SaleAmount) AS TotalSales
FROM SalesReturns
GROUP BY ProductID, ProductName
HAVING COUNT(ReturnID) > 10
AND AVG(ReturnAmount) > 100
AND SUM(SaleAmount) > 5000;
• - MySQL solution
SELECT ProductID, ProductName, COUNT(ReturnID) AS TotalReturns, AVG(ReturnAmount)
AS AvgReturnAmount, SUM(SaleAmount) AS TotalSales
FROM SalesReturns
GROUP BY ProductID, ProductName
HAVING COUNT(ReturnID) > 10
AND AVG(ReturnAmount) > 100
AND SUM(SaleAmount) > 5000;
• Q.149
Question
Find the top 5 customers who spent the most across all their orders in a specific year
(e.g., 2023). Include only customers who placed at least 3 orders in that year, and
filter for customers whose total spending exceeds 5000 EUR.
Explanation
For each customer, sum the spending across all their orders in 2023. Keep only
customers with at least 3 orders and total spending above 5000 EUR. Sort by
total spending in descending order and return the top 5.
Datasets and SQL Schemas
• - Table creation
• - Datasets
INSERT INTO Orders (OrderID, CustomerID, OrderDate, OrderAmount) VALUES
(1, 301, '2023-01-15', 1500.00),
(2, 302, '2023-02-10', 1200.00),
(3, 303, '2023-03-05', 200.00),
(4, 301, '2023-04-01', 700.00),
(5, 304, '2023-05-15', 250.00),
(6, 305, '2023-06-01', 350.00),
(7, 301, '2023-07-10', 1200.00),
(8, 303, '2023-08-25', 700.00),
(9, 302, '2023-09-05', 1200.00),
(10, 305, '2023-09-20', 500.00),
(11, 301, '2023-10-15', 3000.00),
(12, 304, '2023-11-10', 450.00),
(13, 305, '2023-12-01', 600.00);
Learnings
• Grouping by customer and filtering by multiple conditions
• Aggregation using SUM()
• Filtering using HAVING for both count and value conditions
Solutions
• - PostgreSQL solution
WITH CustomerSpending AS (
SELECT CustomerID, COUNT(OrderID) AS OrderCount, SUM(OrderAmount) AS TotalSpending
FROM Orders
WHERE EXTRACT(YEAR FROM OrderDate) = 2023
GROUP BY CustomerID
HAVING COUNT(OrderID) >= 3 AND SUM(OrderAmount) > 5000
)
SELECT CustomerID, TotalSpending
FROM CustomerSpending
ORDER BY TotalSpending DESC
LIMIT 5;
• - MySQL solution
WITH CustomerSpending AS (
SELECT CustomerID, COUNT(OrderID) AS OrderCount, SUM(OrderAmount) AS TotalSpending
FROM Orders
WHERE YEAR(OrderDate) = 2023
GROUP BY CustomerID
HAVING COUNT(OrderID) >= 3 AND SUM(OrderAmount) > 5000
)
SELECT CustomerID, TotalSpending
FROM CustomerSpending
ORDER BY TotalSpending DESC
LIMIT 5;
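Wrapping the date column in EXTRACT() or YEAR() usually prevents the database from using an index on that column. A half-open date range is a common alternative that works in both PostgreSQL and MySQL. This sketch assumes OrderDate is stored as a DATE, since the Orders table definition is not shown above.

```sql
SELECT CustomerID, SUM(OrderAmount) AS TotalSpending
FROM Orders
WHERE OrderDate >= DATE '2023-01-01'   -- first day of the year, inclusive
  AND OrderDate <  DATE '2024-01-01'   -- first day of the next year, exclusive
GROUP BY CustomerID
HAVING COUNT(OrderID) >= 3 AND SUM(OrderAmount) > 5000
ORDER BY TotalSpending DESC
LIMIT 5;
```

With the sample rows above, only customer 301 (four orders totalling 6400.00) should satisfy both conditions.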
• Q.150
Question
Identify the top 5 products with the highest return rate, where the return rate is defined
as the percentage of returns relative to the total number of sales. Include only products
where the total number of returns is greater than 20, and filter for products where the
return rate is greater than 15%.
Explanation
For each product, calculate the return rate (number of returns / total number of
sales × 100). Keep only products with more than 20 returns and a return rate greater
than 15%. Sort by return rate in descending order and return the top 5 products.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE ProductSales (
SaleID INT,
ProductID INT,
ProductName VARCHAR(100),
SaleAmount DECIMAL(10, 2),
SaleDate DATE
);
• - Datasets
INSERT INTO ProductSales (SaleID, ProductID, ProductName, SaleAmount, SaleDate) VALUES
(1, 101, 'Laptop', 1500.00, '2023-01-01'),
(2, 101, 'Laptop', 1200.00, '2023-01-15'),
(3, 102, 'Phone', 800.00, '2023-02-10'),
(4, 101, 'Laptop', 1300.00, '2023-02-20'),
(5, 103, 'Headphones', 150.00, '2023-03-01'),
(6, 102, 'Phone', 900.00, '2023-03-05'),
(7, 104, 'Smartwatch', 300.00, '2023-04-01'),
(8, 101, 'Laptop', 1600.00, '2023-04-10'),
(9, 102, 'Phone', 700.00, '2023-05-15'),
(10, 103, 'Headphones', 250.00, '2023-06-01'),
(11, 101, 'Laptop', 1200.00, '2023-06-20');
Learnings
Solutions
• - PostgreSQL solution
-- Assumes a ProductReturns table with ReturnID and ProductID columns
-- (its schema is not shown in this question).
-- The join pairs every sale with every return of the same product,
-- so both counts use DISTINCT to avoid counting the multiplied rows.
WITH ProductReturnRates AS (
SELECT ps.ProductID, ps.ProductName,
COUNT(DISTINCT pr.ReturnID) AS TotalReturns,
COUNT(DISTINCT ps.SaleID) AS TotalSales,
COUNT(DISTINCT pr.ReturnID) * 100.0 / COUNT(DISTINCT ps.SaleID) AS ReturnRate
FROM ProductSales ps
LEFT JOIN ProductReturns pr ON ps.ProductID = pr.ProductID
GROUP BY ps.ProductID, ps.ProductName
HAVING COUNT(DISTINCT pr.ReturnID) > 20
AND COUNT(DISTINCT pr.ReturnID) * 100.0 / COUNT(DISTINCT ps.SaleID) > 15
)
SELECT ProductID, ProductName, TotalReturns, ReturnRate
FROM ProductReturnRates
ORDER BY ReturnRate DESC
LIMIT 5;
• - MySQL solution
-- Same assumptions as the PostgreSQL solution.
WITH ProductReturnRates AS (
SELECT ps.ProductID, ps.ProductName,
COUNT(DISTINCT pr.ReturnID) AS TotalReturns,
COUNT(DISTINCT ps.SaleID) AS TotalSales,
COUNT(DISTINCT pr.ReturnID) * 100.0 / COUNT(DISTINCT ps.SaleID) AS ReturnRate
FROM ProductSales ps
LEFT JOIN ProductReturns pr ON ps.ProductID = pr.ProductID
GROUP BY ps.ProductID, ps.ProductName
HAVING COUNT(DISTINCT pr.ReturnID) > 20
AND COUNT(DISTINCT pr.ReturnID) * 100.0 / COUNT(DISTINCT ps.SaleID) > 15
)
SELECT ProductID, ProductName, TotalReturns, ReturnRate
FROM ProductReturnRates
ORDER BY ReturnRate DESC
LIMIT 5;
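Joining raw sales rows to raw return rows produces one row per sale-return pair for each product, which can distort plain counts. One way to sidestep this is to aggregate each table on its own and join the two summaries. This is a sketch only: it assumes a ProductReturns table with ReturnID and ProductID columns, which is not defined in this question, and the CTE names are made up.

```sql
WITH SalesPerProduct AS (
    SELECT ProductID, ProductName, COUNT(*) AS TotalSales
    FROM ProductSales
    GROUP BY ProductID, ProductName
),
ReturnsPerProduct AS (
    SELECT ProductID, COUNT(*) AS TotalReturns
    FROM ProductReturns            -- assumed columns: ReturnID, ProductID
    GROUP BY ProductID
)
SELECT s.ProductID, s.ProductName, r.TotalReturns,
       r.TotalReturns * 100.0 / s.TotalSales AS ReturnRate
FROM SalesPerProduct s
JOIN ReturnsPerProduct r ON r.ProductID = s.ProductID
WHERE r.TotalReturns > 20
  AND r.TotalReturns * 100.0 / s.TotalSales > 15
ORDER BY ReturnRate DESC
LIMIT 5;
```

Because each summary has one row per product, the join cannot multiply rows, and the filters can use WHERE instead of HAVING.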
ORDER BY
• Q.151
Question
Get all distinct sizes of beverages offered, but ensure that the sizes are ordered by
their price in descending order.
Explanation
You need to retrieve distinct sizes of beverages from the Beverages table.
Additionally, order the sizes by their corresponding price in descending order to
display the most expensive sizes first.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Beverages (
BeverageID INT,
BeverageName VARCHAR(50),
Size VARCHAR(50),
Price DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Beverages (BeverageID, BeverageName, Size, Price) VALUES
(1, 'Latte', 'Tall', 3.50),
(2, 'Latte', 'Grande', 4.00),
(3, 'Latte', 'Venti', 4.50),
(4, 'Espresso', 'Solo', 2.50),
(5, 'Espresso', 'Doppio', 3.00);
Learnings
Solutions
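• - PostgreSQL solution
-- PostgreSQL rejects SELECT DISTINCT Size ... ORDER BY Price because Price is not
-- in the select list. Grouping by Size and ordering by each size's highest price
-- returns one row per size in a well-defined order.
SELECT Size
FROM Beverages
GROUP BY Size
ORDER BY MAX(Price) DESC;
• - MySQL solution
SELECT Size
FROM Beverages
GROUP BY Size
ORDER BY MAX(Price) DESC;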
• Q.152
Question
Sort Properties by Price in Ascending Order
Explanation
The task is to retrieve all property listings from the PropertyListings table and sort
them by the Price column in ascending order. This can be achieved using the ORDER
BY clause with the ASC keyword, which ensures that the rows are sorted from the
lowest to the highest price.
Datasets and SQL Schemas
• - Table creation
• - Datasets
INSERT INTO PropertyListings (ListingID, PropertyName, City, Price) VALUES
(1, 'Luxury Villa', 'Miami', 2500000.00),
(2, 'Cozy Condo', 'San Diego', 800000.00),
(3, 'Suburban House', 'Austin', 350000.00),
(4, 'Downtown Loft', 'Seattle', 1200000.00),
(5, 'Lakefront Cabin', 'Denver', 900000.00);
Learnings
Solutions
• - PostgreSQL solution
SELECT *
FROM PropertyListings
ORDER BY Price ASC;
• - MySQL solution
SELECT *
FROM PropertyListings
ORDER BY Price ASC;
• Q.153
Question
Sort Rental Properties by Monthly Rent in Descending Order
Explanation
The task is to retrieve all rental property listings from the RentalProperties table
and sort them by the MonthlyRent column in descending order. This can be done
using the ORDER BY clause with the DESC keyword, which ensures that the rows are
sorted from the highest to the lowest monthly rent.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE RentalProperties (
RentalID INT,
PropertyName VARCHAR(100),
City VARCHAR(50),
MonthlyRent DECIMAL(10, 2)
);
• - Datasets
INSERT INTO RentalProperties (RentalID, PropertyName, City, MonthlyRent) VALUES
(1, 'Urban Apartment', 'Boston', 2500.00),
(2, 'Lakeside Duplex', 'Denver', 1800.00),
(3, 'Riverside Studio', 'Portland', 2200.00),
(4, 'City Heights Condo', 'San Francisco', 3500.00),
(5, 'Suburban Flat', 'Austin', 1500.00);
Learnings
Solutions
• - PostgreSQL solution
SELECT *
FROM RentalProperties
ORDER BY MonthlyRent DESC;
• - MySQL solution
SELECT *
FROM RentalProperties
ORDER BY MonthlyRent DESC;
• Q.154
Question
Sort Properties by Year Built with Unknown Years Last
Explanation
The task is to sort the properties by the YearBuilt column in ascending order,
ensuring that properties with a NULL value in the YearBuilt column appear at the end
of the list. In PostgreSQL this can be done using the ORDER BY clause with the ASC
keyword and NULLS LAST, which places NULL values after all non-NULL values. MySQL
does not support NULLS LAST and sorts NULL values first in ascending order, so it
has to sort on whether the value is NULL before sorting on the value itself.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE LuxuryProperties (
PropertyID INT,
PropertyName VARCHAR(100),
City VARCHAR(50),
Price DECIMAL(10, 2),
YearBuilt INT
);
• - Datasets
Learnings
Solutions
• - PostgreSQL solution
SELECT *
FROM LuxuryProperties
ORDER BY YearBuilt ASC NULLS LAST;
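• - MySQL solution
-- YearBuilt IS NULL evaluates to 0 for known years and 1 for NULL,
-- so rows with a known year sort first and NULL rows sort last.
SELECT *
FROM LuxuryProperties
ORDER BY YearBuilt IS NULL, YearBuilt ASC;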
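If a single query has to run unchanged on both PostgreSQL and MySQL, a CASE expression is one portable way to push NULL values to the end. This is a minimal sketch against the same LuxuryProperties table.

```sql
SELECT *
FROM LuxuryProperties
ORDER BY CASE WHEN YearBuilt IS NULL THEN 1 ELSE 0 END,  -- known years (0) before NULLs (1)
         YearBuilt ASC;                                  -- then oldest to newest
```

The first sort key separates the rows into two buckets, and the second key orders the rows within the bucket of known years.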
• Q.155
Question
Retrieve Properties Sorted by Price-per-Square-Foot Value
Explanation
The task is to calculate the price-per-square-foot value for each property and sort the
results in ascending order based on this value. The price-per-square-foot is calculated
by dividing the Price by the SquareFeet. The query then orders the properties by
this calculated value.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE RealEstateProperties (
PropertyID INT,
PropertyName VARCHAR(100),
City VARCHAR(50),
• - Datasets
INSERT INTO RealEstateProperties (PropertyID, PropertyName, City, Price, SquareFeet) VALUES
(1, 'Seaside Cottage', 'Miami', 800000.00, 2000),
(2, 'Downtown Loft', 'Seattle', 1200000.00, 1500),
(3, 'Suburban House', 'Austin', 350000.00, 2500),
(4, 'Lakefront Cabin', 'Denver', 900000.00, 1800);
Learnings
Solutions
• - PostgreSQL solution
SELECT PropertyName, Price / SquareFeet AS PricePerSquareFoot
FROM RealEstateProperties
ORDER BY PricePerSquareFoot ASC;
• - MySQL solution
SELECT PropertyName, Price / SquareFeet AS PricePerSquareFoot
FROM RealEstateProperties
ORDER BY PricePerSquareFoot ASC;
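A division like this fails in PostgreSQL if SquareFeet is ever 0. A defensive variant, sketched below, turns a zero divisor into NULL with NULLIF so the row is kept with an unknown ratio instead of raising an error. It assumes Price is a decimal column, as the sample values suggest; if both columns were integers, PostgreSQL would perform integer division.

```sql
SELECT PropertyName,
       Price / NULLIF(SquareFeet, 0) AS PricePerSquareFoot  -- NULL when SquareFeet = 0
FROM RealEstateProperties
ORDER BY PricePerSquareFoot ASC;
```

With the sample rows above, the order should be Suburban House (140), Seaside Cottage (400), Lakefront Cabin (500), and Downtown Loft (800).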
• Q.156
Question
Sorting Dates Stored as Text
Explanation
The task is to sort real estate transactions by their sale date, but the dates are stored as
text in the SaleDate column, formatted as 'DD-MM-YYYY'. Since text sorting is
lexicographical and doesn't handle dates as expected, we need to first convert the text
into a valid date format for correct sorting.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE RealEstateTransactions (
TransactionID INT,
PropertyName VARCHAR(100),
City VARCHAR(50),
SaleDate VARCHAR(20) -- Dates stored as text
);
• - Datasets
INSERT INTO RealEstateTransactions (TransactionID, PropertyName, City, SaleDate)
VALUES
(1, 'Luxury Villa', 'London', '12-05-2023'),
(2, 'Suburban House', 'Berlin', '01-03-2022'),
(3, 'Downtown Loft', 'Paris', '25-12-2021'),
(4, 'Cozy Apartment', 'Madrid', '08-07-2024'),
(5, 'Lakeside Cabin', 'Geneva', '15-09-2023');
Learnings
• Handling dates stored as text and converting them to a valid date format for
sorting.
• Using date functions (TO_DATE in PostgreSQL and STR_TO_DATE in MySQL)
to convert text into date format.
• Sorting by the converted date values.
Solutions
• - PostgreSQL solution
SELECT *
FROM RealEstateTransactions
ORDER BY TO_DATE(SaleDate, 'DD-MM-YYYY') ASC;
• - MySQL solution
SELECT *
FROM RealEstateTransactions
ORDER BY STR_TO_DATE(SaleDate, '%d-%m-%Y') ASC;
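For contrast, the sketch below sorts on the raw text column. It runs in both PostgreSQL and MySQL and shows why the conversion is needed: the comparison goes character by character from the left, so the day of the month decides the order.

```sql
-- Lexicographic sort on the text column (not chronological).
SELECT PropertyName, SaleDate
FROM RealEstateTransactions
ORDER BY SaleDate ASC;
-- Order of SaleDate with the sample rows:
--   '01-03-2022', '08-07-2024', '12-05-2023', '15-09-2023', '25-12-2021'
-- Chronological order would instead be:
--   '25-12-2021', '01-03-2022', '12-05-2023', '15-09-2023', '08-07-2024'
```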
• Q.157
Question
Sort Employees by Department and Calculate Average Salary
Explanation
The task is to retrieve all employees, sorted first by their Department, and then by
their Salary in descending order within each department. Additionally, calculate the
average salary for each department.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Employees (
EmployeeID INT,
EmployeeName VARCHAR(100),
Department VARCHAR(50),
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Employees (EmployeeID, EmployeeName, Department, Salary) VALUES
(1, 'Alice', 'Engineering', 90000.00),
(2, 'Bob', 'HR', 50000.00),
(3, 'Charlie', 'Engineering', 95000.00),
(4, 'Diana', 'Marketing', 60000.00),
(5, 'Eve', 'HR', 55000.00),
(6, 'Frank', 'Marketing', 75000.00);
Learnings
Solutions
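• - PostgreSQL solution
SELECT EmployeeName, Department, Salary,
AVG(Salary) OVER (PARTITION BY Department) AS AvgDepartmentSalary
FROM Employees
ORDER BY Department, Salary DESC;
• - MySQL solution
-- Window functions require MySQL 8.0 or later.
SELECT EmployeeName, Department, Salary,
AVG(Salary) OVER (PARTITION BY Department) AS AvgDepartmentSalary
FROM Employees
ORDER BY Department, Salary DESC;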
• Q.158
Question
Sort Projects by Deadline and Count Employees Involved
Explanation
The task is to sort projects by their deadline in ascending order and count the number
of employees involved in each project. The data is spread across two tables: Projects
and ProjectEmployees.
Datasets and SQL Schemas
ProjectID INT,
EmployeeID INT
);
• - Datasets
INSERT INTO Projects (ProjectID, ProjectName, Deadline) VALUES
(1, 'Project Alpha', '2024-12-01'),
(2, 'Project Beta', '2023-08-15'),
(3, 'Project Gamma', '2023-11-30');
Learnings
Solutions
• - PostgreSQL solution
-- LEFT JOIN keeps projects that have no employees (EmployeeCount = 0).
-- Every selected non-aggregated column is listed in GROUP BY.
SELECT p.ProjectName, p.Deadline, COUNT(pe.EmployeeID) AS EmployeeCount
FROM Projects p
LEFT JOIN ProjectEmployees pe ON p.ProjectID = pe.ProjectID
GROUP BY p.ProjectID, p.ProjectName, p.Deadline
ORDER BY p.Deadline ASC;
• - MySQL solution
SELECT p.ProjectName, p.Deadline, COUNT(pe.EmployeeID) AS EmployeeCount
FROM Projects p
LEFT JOIN ProjectEmployees pe ON p.ProjectID = pe.ProjectID
GROUP BY p.ProjectID, p.ProjectName, p.Deadline
ORDER BY p.Deadline ASC;
• Q.159
Question
Sort Students by Total Marks and Filter Those Above a Threshold
Explanation
The task is to retrieve all students, sort them by their total marks in descending order,
and filter only those students whose total marks exceed a specific threshold (e.g.,
300).
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Students (
StudentID INT,
StudentName VARCHAR(100)
);
• - Datasets
INSERT INTO Students (StudentID, StudentName) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie'),
(4, 'Diana');
Learnings
Solutions
• - PostgreSQL solution
-- StudentID is not declared as a primary key, so StudentName must also be in GROUP BY.
SELECT s.StudentName, SUM(m.Marks) AS TotalMarks
FROM Students s
JOIN Marks m ON s.StudentID = m.StudentID
GROUP BY s.StudentID, s.StudentName
HAVING SUM(m.Marks) > 300
ORDER BY TotalMarks DESC;
• - MySQL solution
SELECT s.StudentName, SUM(m.Marks) AS TotalMarks
FROM Students s
JOIN Marks m ON s.StudentID = m.StudentID
GROUP BY s.StudentID, s.StudentName
HAVING SUM(m.Marks) > 300
ORDER BY TotalMarks DESC;
• Q.160
Question
Select the maximum salary from the Employees table.
Explanation
You need to find the highest salary in the Employees table using the MAX() function.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE Employees (
EmployeeID INT,
EmployeeName VARCHAR(50),
Department VARCHAR(50),
Salary DECIMAL(10, 2)
);
-- Datasets
INSERT INTO Employees (EmployeeID, EmployeeName, Department, Salary) VALUES
(1, 'Alice', 'Engineering', 100000.00),
(2, 'Bob', 'Engineering', 95000.00),
(3, 'Charlie', 'HR', 70000.00),
(4, 'David', 'HR', 75000.00),
(5, 'Eve', 'Marketing', 60000.00);
Solutions
• - PostgreSQL solution
SELECT MAX(Salary) AS MaxSalary
FROM Employees;
• - MySQL solution
SELECT MAX(Salary) AS MaxSalary
FROM Employees;
JOINS
• Q.161
Question
Write an SQL query to retrieve all students along with their grades.
Explanation
This question requires you to perform a JOIN operation between the Students and
Grades tables to get a list of students along with their corresponding grades based on
the StudentID.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Students (
StudentID INT,
Name VARCHAR(50)
);
Grade VARCHAR(2)
);
(23, 'B'),
(24, 'C'),
(25, 'A'),
(26, 'B'),
(27, 'C'),
(28, 'A'),
(29, 'A'),
(30, 'B'),
(31, 'B'),
(32, 'C'),
(33, 'A'),
(34, 'B'),
(35, 'C');
Learnings
Solutions
• - PostgreSQL solution
SELECT s.StudentID, s.Name, g.Grade
FROM Students s
JOIN Grades g ON s.StudentID = g.StudentID;
• - MySQL solution
SELECT s.StudentID, s.Name, g.Grade
FROM Students s
JOIN Grades g ON s.StudentID = g.StudentID;
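An inner join returns only students that have at least one row in Grades. If the interviewer means every student, including those without a grade yet, a LEFT JOIN is the usual choice. The sketch below assumes the Grades table has the StudentID and Grade columns used in the solutions above.

```sql
SELECT s.StudentID, s.Name, g.Grade
FROM Students s
LEFT JOIN Grades g ON s.StudentID = g.StudentID  -- Grade is NULL when no match exists
ORDER BY s.StudentID;
```

It is worth asking which reading is intended, since the two queries return different row counts whenever some students have no grade.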
• Q.162
Question
Write an SQL query to retrieve all employees along with their department names.
Explanation
This question requires performing an INNER JOIN between the Employees and
Departments tables using the common DepartmentID column. The goal is to retrieve
the department name for each employee.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Employees (
EmployeeID INT,
Name VARCHAR(50),
DepartmentID INT
);
Learnings
Solutions
• - PostgreSQL solution
SELECT e.EmployeeID, e.Name, d.DepartmentName
FROM Employees e
JOIN Departments d ON e.DepartmentID = d.DepartmentID;
• - MySQL solution
SELECT e.EmployeeID, e.Name, d.DepartmentName
FROM Employees e
JOIN Departments d ON e.DepartmentID = d.DepartmentID;
• Q.163
Question
Write an SQL query to select all orders along with their customer names.
Explanation
This question requires you to perform a JOIN operation between the Customers and
Orders tables using the common CustomerID column. The goal is to retrieve both the
order details and the corresponding customer names.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Customers (
CustomerID INT,
CustomerName VARCHAR(50)
);
Learnings
Solutions
• - PostgreSQL solution
SELECT o.OrderID, o.TotalAmount, c.CustomerName
FROM Orders o
JOIN Customers c ON o.CustomerID = c.CustomerID;
• - MySQL solution
SELECT o.OrderID, o.TotalAmount, c.CustomerName
FROM Orders o
JOIN Customers c ON o.CustomerID = c.CustomerID;
• Q.164
Question
Write an SQL query to select all products along with their category names.
Explanation
This question requires you to perform a JOIN between the Products and Categories
tables using the CategoryID column, in order to retrieve product details along with
their respective category names.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Categories (
CategoryID INT,
CategoryName VARCHAR(50)
);
Learnings
Solutions
• - PostgreSQL solution
SELECT p.ProductID, p.ProductName, c.CategoryName
FROM Products p
JOIN Categories c ON p.CategoryID = c.CategoryID;
• - MySQL solution
SELECT p.ProductID, p.ProductName, c.CategoryName
FROM Products p
JOIN Categories c ON p.CategoryID = c.CategoryID;
• Q.165
Question
Write an SQL query to select all employees and their salaries. Include both employees
without a salary and salary records without a corresponding employee.
Explanation
This question requires using a FULL OUTER JOIN between the Employees and
Salaries tables. The FULL OUTER JOIN ensures that all records from both tables are
included, even if there is no corresponding match in the other table. If an employee
has no salary record, the salary will be NULL, and if a salary record does not have a
corresponding employee, the employee's name will be NULL.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Employees (
EmployeeID INT,
Name VARCHAR(50)
);
Learnings
• Using a FULL OUTER JOIN to return all records from both tables, even if there
is no matching record in the other table.
• Handling NULL values for unmatched records in either table.
Solutions
• - PostgreSQL solution
SELECT e.Name, s.Salary
FROM Employees e
FULL OUTER JOIN Salaries s ON e.EmployeeID = s.EmployeeID;
• - MySQL solution (MySQL does not support FULL OUTER JOIN directly)
SELECT e.Name, s.Salary
FROM Employees e
LEFT JOIN Salaries s ON e.EmployeeID = s.EmployeeID
UNION
SELECT e.Name, s.Salary
FROM Employees e
RIGHT JOIN Salaries s ON e.EmployeeID = s.EmployeeID;
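One caveat with the UNION form above: UNION removes duplicate rows, so two genuinely identical (Name, Salary) pairs collapse into one. A minimal sketch of a variant that keeps them, using UNION ALL plus a filter that adds only the unmatched salary rows. The Salaries columns and all sample rows are assumptions for illustration, and the script is meant for an empty scratch database.

```sql
-- Hypothetical sample schema and data (assumed for illustration)
CREATE TABLE Employees (
    EmployeeID INT,
    Name VARCHAR(50)
);
CREATE TABLE Salaries (
    EmployeeID INT,
    Salary DECIMAL(10, 2)
);
INSERT INTO Employees VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Ben');
INSERT INTO Salaries VALUES (2, 50000.00), (3, 50000.00), (9, 42000.00);

-- Part 1: every employee, with a salary where one exists
SELECT e.Name, s.Salary
FROM Employees e
LEFT JOIN Salaries s ON e.EmployeeID = s.EmployeeID
UNION ALL
-- Part 2: only the salary rows that matched no employee
SELECT e.Name, s.Salary
FROM Employees e
RIGHT JOIN Salaries s ON e.EmployeeID = s.EmployeeID
WHERE e.EmployeeID IS NULL;
```

With this sample data the query returns four rows, while the plain UNION version returns three because the two ('Ben', 50000.00) rows are merged.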
• Q.166
Question
Write an SQL query to select all teachers along with the subjects they teach.
Explanation
This question requires you to perform a JOIN between the Teachers and Subjects
tables using the SubjectID column, in order to retrieve the teacher names along with
the subjects they are assigned to.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Subjects (
SubjectID INT,
SubjectName VARCHAR(50)
);
Learnings
Solutions
• - PostgreSQL solution
SELECT t.Name, s.SubjectName
FROM Teachers t
JOIN Subjects s ON t.SubjectID = s.SubjectID;
• - MySQL solution
SELECT t.Name, s.SubjectName
FROM Teachers t
JOIN Subjects s ON t.SubjectID = s.SubjectID;
• Q.167
Question
Write an SQL query to retrieve the list of customers who have made orders for
products in the "Electronics" or "Books" categories, including their names, order total
amounts, and the product names. The query should include data from the following
tables: Customers, Orders, Products, and Categories.
Explanation
This question requires performing a JOIN across four tables: Customers, Orders,
Products, and Categories. You will need to filter products to the "Electronics" or
"Books" categories and join them with the corresponding customer order data.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Customers (
CustomerID INT,
CustomerName VARCHAR(50)
);
VALUES
(1001, 'Laptop', 1),
(1002, 'Smartphone', 1),
(1003, 'Novel', 3),
(1004, 'Cookbook', 3),
(1005, 'Headphones', 1),
(1006, 'Action Figure', 5);
Learnings
Solutions
• - PostgreSQL solution
SELECT c.CustomerName, o.TotalAmount, p.ProductName
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
JOIN Products p ON o.ProductID = p.ProductID -- assumes Orders has a ProductID column
JOIN Categories cat ON p.CategoryID = cat.CategoryID
WHERE cat.CategoryName IN ('Electronics', 'Books');
• - MySQL solution
SELECT c.CustomerName, o.TotalAmount, p.ProductName
FROM Customers c
JOIN Orders o ON c.CustomerID = o.CustomerID
JOIN Products p ON o.ProductID = p.ProductID -- assumes Orders has a ProductID column
JOIN Categories cat ON p.CategoryID = cat.CategoryID
WHERE cat.CategoryName IN ('Electronics', 'Books');
• Q.167
Question
Write an SQL query to retrieve each employee together with the name of their
manager. You need to use a self-join on the Employees table. The result should show
the employee's name and the name of their manager.
Explanation
This question requires performing a self-join on the Employees table. The idea is to
join the table with itself, where one instance of the table represents employees and the
other instance represents their managers. You will need to link the ManagerID (a
foreign key to EmployeeID) to fetch the manager's name for each employee.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT,
Name VARCHAR(50),
ManagerID INT
);
Learnings
Solutions
• - PostgreSQL solution
SELECT e1.Name AS EmployeeName, e2.Name AS ManagerName
FROM Employees e1
JOIN Employees e2 ON e1.ManagerID = e2.EmployeeID
ORDER BY e1.Name;
• - MySQL solution
SELECT e1.Name AS EmployeeName, e2.Name AS ManagerName
FROM Employees e1
JOIN Employees e2 ON e1.ManagerID = e2.EmployeeID
ORDER BY e1.Name;
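Note that the inner self-join above drops employees whose ManagerID is NULL, such as the person at the top of the hierarchy. A minimal sketch of keeping them with a LEFT JOIN; the sample rows are hypothetical and the script is meant for an empty scratch database.

```sql
-- Hypothetical sample data (assumed for illustration)
CREATE TABLE Employees (
    EmployeeID INT,
    Name VARCHAR(50),
    ManagerID INT
);
INSERT INTO Employees VALUES
    (1, 'Priya', NULL),  -- no manager
    (2, 'Omar', 1),
    (3, 'Lena', 1);

-- LEFT JOIN keeps every employee; ManagerName is NULL when there is no manager
SELECT e1.Name AS EmployeeName, e2.Name AS ManagerName
FROM Employees e1
LEFT JOIN Employees e2 ON e1.ManagerID = e2.EmployeeID
ORDER BY e1.Name;
```

Here Lena and Omar are listed with Priya as their manager, and Priya appears with a NULL ManagerName instead of being dropped.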
• Q.168
Question
Write an SQL query to select all subjects and the corresponding teacher names.
Include subjects without a corresponding teacher.
Explanation
This question requires using a LEFT JOIN between the Subjects and Teachers
tables. The LEFT JOIN ensures that all records from the Subjects table are returned,
even if there is no corresponding teacher in the Teachers table. If there is no
matching teacher, the teacher's name will be NULL.
• - Table creation
CREATE TABLE Subjects (
SubjectID INT,
SubjectName VARCHAR(50)
);
Learnings
• Using a LEFT JOIN to return all records from the left table (Subjects), even if
there is no matching record in the right table (Teachers).
• Handling NULL values when a subject does not have a corresponding teacher.
Solutions
• - PostgreSQL solution
SELECT s.SubjectName, t.Name AS TeacherName
FROM Subjects s
LEFT JOIN Teachers t ON s.SubjectID = t.SubjectID;
• - MySQL solution
SELECT s.SubjectName, t.Name AS TeacherName
FROM Subjects s
LEFT JOIN Teachers t ON s.SubjectID = t.SubjectID;
• Q.169
Question
Write an SQL query to retrieve a list of all products and their respective sales,
including products that have not been sold yet. The result should show the product
name, sales amount, and quantity sold. For products that have not been sold, the sales
amount and quantity sold should be NULL. Use a RIGHT JOIN between the Products
and Sales tables.
Explanation
This question requires performing a RIGHT JOIN with Sales as the left table and
Products as the right table. A RIGHT JOIN keeps every row of the right-hand table, so
all products are listed, even if they have no corresponding sales records. For those
products with no sales, the sales-related fields (SalesAmount, QuantitySold) show NULL.
• - Table creation
CREATE TABLE Products (
ProductID INT,
ProductName VARCHAR(50)
);
Learnings
• Using a RIGHT JOIN to include all records from the right table (Products) and
matching records from the left table (Sales).
• Handling NULL values for products with no sales records.
• Retrieving data where some rows may not have corresponding records in one
of the tables.
Solutions
• - PostgreSQL solution
SELECT p.ProductName, s.SalesAmount, s.QuantitySold
FROM Sales s
RIGHT JOIN Products p ON s.ProductID = p.ProductID
ORDER BY p.ProductName;
• - MySQL solution
SELECT p.ProductName, s.SalesAmount, s.QuantitySold
FROM Sales s
RIGHT JOIN Products p ON s.ProductID = p.ProductID
ORDER BY p.ProductName;
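A RIGHT JOIN can be rewritten as a LEFT JOIN by swapping the table order, which many readers find easier to follow. A minimal sketch of the "every product, sold or not" result in that form; the Sales columns and the sample rows are assumptions, and the script is meant for an empty scratch database.

```sql
-- Hypothetical sample schema and data (assumed for illustration)
CREATE TABLE Products (
    ProductID INT,
    ProductName VARCHAR(50)
);
CREATE TABLE Sales (
    SaleID INT,
    ProductID INT,
    SalesAmount DECIMAL(10, 2),
    QuantitySold INT
);
INSERT INTO Products VALUES (1, 'Desk'), (2, 'Lamp');
INSERT INTO Sales VALUES (10, 1, 250.00, 2);

-- Products is the preserved (left) table, so unsold products still appear
SELECT p.ProductName, s.SalesAmount, s.QuantitySold
FROM Products p
LEFT JOIN Sales s ON p.ProductID = s.ProductID
ORDER BY p.ProductName;
-- 'Lamp' has no sale, so its SalesAmount and QuantitySold are NULL
```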
• Q.170
Question
Write an SQL query to select all grades along with the student names. Include grades
without a corresponding student.
Explanation
This question requires using a LEFT JOIN between the Grades and Students tables.
The LEFT JOIN ensures that all records from the Grades table are returned, even if
there is no corresponding student in the Students table. If there is no matching
student, the student name will be NULL.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Students (
StudentID INT,
Name VARCHAR(50)
);
Learnings
• Using a LEFT JOIN to ensure all records from one table (Grades) are included,
even when there is no matching record in the other table (Students).
• Handling NULL values when there is no corresponding match in the Students
table.
Solutions
• - PostgreSQL solution
SELECT g.Grade, s.Name
FROM Grades g
LEFT JOIN Students s ON g.StudentID = s.StudentID;
• - MySQL solution
SELECT g.Grade, s.Name
FROM Grades g
LEFT JOIN Students s ON g.StudentID = s.StudentID;
Subqueries
• Q.171
Question
Select the name and date of joining (DOJ) of the employee who joined the earliest in
the Employees table.
Explanation
You need to find the employee with the earliest DOJ (Date of Joining) and select their
name and the DOJ field. Use MIN() to find the earliest date.
-- Datasets
INSERT INTO Employees (EmployeeID, EmployeeName, DOJ, Department, Salary) VALUES
(1, 'Alice', '2015-01-10', 'Engineering', 100000.00),
(2, 'Bob', '2017-06-25', 'HR', 95000.00),
(3, 'Charlie', '2014-08-15', 'Marketing', 70000.00),
(4, 'David', '2019-04-01', 'Engineering', 75000.00),
(5, 'Eve', '2016-03-22', 'HR', 80000.00);
Solutions
• - PostgreSQL solution
-- PostgreSQL solution
SELECT EmployeeName, DOJ AS EarliestDOJ
FROM Employees
WHERE DOJ = (SELECT MIN(DOJ) FROM Employees);
• - MySQL solution
-- MySQL solution
SELECT EmployeeName, DOJ AS EarliestDOJ
FROM Employees
WHERE DOJ = (SELECT MIN(DOJ) FROM Employees);
• Q.172
Question
Write an SQL query to select all employees who belong to the department with ID 5.
Explanation
This query requires a simple SELECT statement with a WHERE clause that filters the
employees based on their DepartmentID. The condition should be DepartmentID =
5.
• - Table creation
CREATE TABLE Employees (
ID INT,
Name VARCHAR(50),
DepartmentID INT
);
Learnings
• Filtering data using a WHERE clause to find specific records based on column
values.
Solutions
• - PostgreSQL solution
SELECT *
FROM Employees
WHERE DepartmentID = 5;
• - MySQL solution
SELECT *
FROM Employees
WHERE DepartmentID = 5;
• Q.173
Question
Write a SQL query to select the order with the highest total amount.
Explanation
To find the order with the highest total amount, we can use a subquery. The subquery
will first find the maximum TotalAmount from the Orders table, and then the outer
query will select the order that has this maximum value.
• - Table creation
CREATE TABLE Orders (
OrderID INT,
CustomerID INT,
TotalAmount DECIMAL(10, 2)
);
Learnings
• Using a subquery to first find the maximum value and then using it in the
outer query to filter the result.
• Subqueries in SELECT or WHERE clauses to perform aggregate operations.
Solutions
• - PostgreSQL solution
SELECT *
FROM Orders
WHERE TotalAmount = (SELECT MAX(TotalAmount) FROM Orders);
• - MySQL solution
SELECT *
FROM Orders
WHERE TotalAmount = (SELECT MAX(TotalAmount) FROM Orders);
• Q.174
Question
Write an SQL query to retrieve employee details from each department who have a
salary greater than the average salary in their department.
Explanation
This task requires a correlated subquery. For each employee, compare their salary
with the average salary in their department. The subquery should calculate the
average salary for the department, and the main query should return employees whose
salary exceeds that average.
• - Table creation
CREATE TABLE Employees (
Emp_No DECIMAL(4,0) NOT NULL,
Emp_Name VARCHAR(10),
Job_Name VARCHAR(9),
Manager_Id DECIMAL(4,0),
HireDate DATE,
Salary DECIMAL(7,2),
Commission DECIMAL(7,2),
Department VARCHAR(20)
);
• - Insert data
INSERT INTO Employees (Emp_No, Emp_Name, Job_Name, Manager_Id, HireDate, Salary,
Commission, Department) VALUES
(7839, 'KING', 'PRESIDENT', NULL, '1981-11-17', 5000, NULL, 'IT'),
(7698, 'BLAKE', 'MANAGER', 7839, '1981-05-01', 2850, NULL, 'HR'),
(7782, 'CLARK', 'MANAGER', 7839, '1981-06-09', 2450, NULL, 'Marketing'),
(7566, 'JONES', 'MANAGER', 7839, '1981-04-02', 2975, NULL, 'Operations'),
(7788, 'SCOTT', 'ANALYST', 7566, '1987-07-29', 3000, NULL, 'Operations'),
(7902, 'FORD', 'ANALYST', 7566, '1981-12-03', 3000, NULL, 'Operations'),
(7369, 'SMITH', 'CLERK', 7902, '1980-12-17', 800, NULL, 'Operations'),
(7499, 'ALLEN', 'SALESMAN', 7698, '1981-02-20', 1600, 300, 'HR'),
(7521, 'WARD', 'SALESMAN', 7698, '1981-02-22', 1250, 500, 'HR'),
(7654, 'MARTIN', 'SALESMAN', 7698, '1981-09-28', 1250, 1400, 'HR'),
(7844, 'TURNER', 'SALESMAN', 7698, '1981-09-08', 1500, 0, 'HR'),
(7876, 'ADAMS', 'CLERK', 7788, '1987-06-02', 1100, NULL, 'Operations'),
(7900, 'JAMES', 'CLERK', 7698, '1981-12-03', 950, NULL, 'HR'),
(7934, 'MILLER', 'CLERK', 7782, '1982-01-23', 1300, NULL, 'Marketing'),
(7905, 'BROWN', 'SALESMAN', 7698, '1981-11-12', 1250, 1400, 'HR'),
(7906, 'DAVIS', 'ANALYST', 7566, '1987-07-13', 3000, NULL, 'Operations'),
(7907, 'GARCIA', 'MANAGER', 7839, '1981-08-12', 2975, NULL, 'IT'),
(7908, 'HARRIS', 'SALESMAN', 7698, '1981-06-21', 1600, 300, 'HR'),
(7909, 'JACKSON', 'CLERK', 7902, '1981-11-17', 800, NULL, 'Operations'),
(7910, 'JOHNSON', 'MANAGER', 7839, '1981-04-02', 2850, NULL, 'Marketing'),
(7911, 'LEE', 'ANALYST', 7566, '1981-09-28', 1250, 1400, 'Operations'),
(7912, 'MARTINEZ', 'CLERK', 7902, '1981-12-03', 1250, NULL, 'Operations'),
(7913, 'MILLER', 'MANAGER', 7839, '1981-01-23', 2450, NULL, 'HR'),
Learnings
Solutions
• - PostgreSQL solution
SELECT Emp_No, Emp_Name, Job_Name, Salary, Department
FROM Employees e1
WHERE Salary > (
SELECT AVG(Salary)
FROM Employees e2
WHERE e1.Department = e2.Department
);
• - MySQL solution
SELECT Emp_No, Emp_Name, Job_Name, Salary, Department
FROM Employees e1
WHERE Salary > (
SELECT AVG(Salary)
FROM Employees e2
WHERE e1.Department = e2.Department
);
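The correlated subquery is conceptually re-evaluated for each employee row. As a point of comparison, here is a minimal sketch of the same filter written with a window function, which computes each department's average once per partition. It assumes PostgreSQL or MySQL 8.0 and later (window functions are not available in older MySQL versions); the trimmed-down table and sample rows are hypothetical, and the script is meant for an empty scratch database.

```sql
-- Hypothetical, trimmed-down sample table (assumed for illustration)
CREATE TABLE Employees (
    Emp_No INT,
    Emp_Name VARCHAR(10),
    Salary DECIMAL(7, 2),
    Department VARCHAR(20)
);
INSERT INTO Employees VALUES
    (1, 'ANN', 3000, 'IT'),
    (2, 'RAJ', 1000, 'IT'),
    (3, 'MEI', 2000, 'HR');

-- Attach the department average to every row, then filter on it
SELECT Emp_No, Emp_Name, Salary, Department
FROM (
    SELECT Emp_No, Emp_Name, Salary, Department,
           AVG(Salary) OVER (PARTITION BY Department) AS DeptAvg
    FROM Employees
) t
WHERE Salary > DeptAvg;
```

With this sample data only ANN is returned: the IT average is 2000, and MEI equals the HR average rather than exceeding it.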
• Q.175
Question
Find the details of employees whose salary is greater than the average salary across
the entire company.
Explanation
This problem involves calculating the overall average salary for all employees in the
company and retrieving the details of employees whose salary exceeds this average. A
subquery can be used to compute the average salary, which is then compared with
each employee's salary in the main query.
• - Table creation
• - Insert data
INSERT INTO employees (employee_name, department, salary)
VALUES
('John Doe', 'HR', 50000.00),
('Jane Smith', 'HR', 55000.00),
('Michael Johnson', 'HR', 60000.00),
('Emily Davis', 'IT', 60000.00),
('David Brown', 'IT', 65000.00),
('Sarah Wilson', 'Finance', 70000.00),
('Robert Taylor', 'Finance', 75000.00),
('Jennifer Martinez', 'Finance', 80000.00);
Learnings
Solutions
• - PostgreSQL solution
SELECT employee_id, employee_name, department, salary
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
);
• - MySQL solution
SELECT employee_id, employee_name, department, salary
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
);
• Q.176
Question
Write a SQL query to find the names of managers who have at least five direct
reports. Ensure that no employee is their own manager. Return the result table in any
order.
Explanation
To solve this, we need to identify employees who have at least five direct reports. A
direct report is an employee whose managerId matches the id of another employee.
A subquery can be used to count the number of direct reports for each manager and
filter managers who have five or more direct reports.
• - Table creation
CREATE TABLE Employees (
id INT PRIMARY KEY,
name VARCHAR(255),
department VARCHAR(255),
managerId INT
);
• - Insert data
INSERT INTO Employees (id, name, department, managerId) VALUES
(101, 'John', 'A', NULL),
(102, 'Dan', 'A', 101),
(103, 'James', 'A', 101),
(104, 'Amy', 'A', 101),
(105, 'Anne', 'A', 101),
(106, 'Ron', 'B', 101),
(107, 'Michael', 'C', NULL),
(108, 'Sarah', 'C', 107),
(109, 'Emily', 'C', 107),
(110, 'Brian', 'C', 107);
Learnings
• Self-joins or subqueries can be used to count related rows (i.e., direct reports).
• Filtering based on the count of related rows (employees managed by each
manager).
• Ensuring there are no circular relationships (no employee is their own
manager).
Solutions
• - PostgreSQL solution
SELECT name
FROM Employees
WHERE id IN (
SELECT managerId
FROM Employees
WHERE managerId IS NOT NULL AND managerId <> id -- ignore self-references
GROUP BY managerId
HAVING COUNT(*) >= 5
);
• - MySQL solution
SELECT name
FROM Employees
WHERE id IN (
SELECT managerId
FROM Employees
WHERE managerId IS NOT NULL AND managerId <> id -- ignore self-references
GROUP BY managerId
HAVING COUNT(*) >= 5
);
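The Learnings mention that a self-join can also count direct reports. A minimal sketch of that variant, with the "no employee is their own manager" rule expressed in the join condition; the sample rows are hypothetical and the script is meant for an empty scratch database.

```sql
-- Hypothetical sample data (assumed for illustration)
CREATE TABLE Employees (
    id INT PRIMARY KEY,
    name VARCHAR(255),
    managerId INT
);
INSERT INTO Employees VALUES
    (1, 'Kim', NULL),
    (2, 'Ali', 1), (3, 'Bo', 1), (4, 'Cy', 1), (5, 'Di', 1), (6, 'Ed', 1),
    (7, 'Flo', 7),  -- self-reference, excluded by r.id <> m.id
    (8, 'Gus', 7);

-- m is the manager side, r is the direct-report side
SELECT m.name
FROM Employees m
JOIN Employees r ON r.managerId = m.id AND r.id <> m.id
GROUP BY m.id, m.name
HAVING COUNT(*) >= 5;
```

Only Kim is returned here: Kim has five direct reports, while Flo has one once the self-reference is excluded.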
• Q.177
Question
Write an SQL query to find the average market price of houses in each state and city,
where the average market price exceeds 300,000, using a correlated subquery.
Explanation
The task is to calculate the average market price for each state and city using a
correlated subquery. For each row, the subquery calculates the average market
price of houses in the same state and city, and the main query keeps only the rows
whose state and city average exceeds 300,000 before grouping them.
• - Table creation
CREATE TABLE house_price (
id INT,
state VARCHAR(255),
city VARCHAR(255),
street_address VARCHAR(255),
mkt_price INT
);
• - Insert data
INSERT INTO house_price (id, state, city, street_address, mkt_price) VALUES
(1, 'NY', 'New York City', '66 Trout Drive', 449761),
(2, 'NY', 'New York City', 'Atwater', 277527),
(3, 'NY', 'New York City', '58 Gates Street', 268394),
(4, 'NY', 'New York City', 'Norcross', 279929),
(5, 'NY', 'New York City', '337 Shore Ave.', 151592),
(6, 'NY', 'New York City', 'Plainfield', 624531),
(7, 'NY', 'New York City', '84 Central Street', 267345),
(8, 'NY', 'New York City', 'Passaic', 88504),
(9, 'NY', 'New York City', '951 Fulton Road', 270476),
(10, 'NY', 'New York City', 'Oxon Hill', 118112),
(11, 'CA', 'Los Angeles', '692 Redwood Court', 150707),
(12, 'CA', 'Los Angeles', 'Lewiston', 463180),
(13, 'CA', 'Los Angeles', '8368 West Acacia Ave.', 538865),
(14, 'CA', 'Los Angeles', 'Pearl', 390896),
(15, 'CA', 'Los Angeles', '8206 Old Riverview Rd.', 117754),
(16, 'CA', 'Los Angeles', 'Seattle', 424588),
(17, 'CA', 'Los Angeles', '7227 Joy Ridge Rd.', 156850),
(18, 'CA', 'Los Angeles', 'Battle Ground', 643454),
(19, 'CA', 'Los Angeles', '233 Bedford Ave.', 713841),
(20, 'CA', 'Los Angeles', 'Saint Albans', 295852),
(21, 'IL', 'Chicago', '8830 Baker St.', 12944),
(22, 'IL', 'Chicago', 'Watertown', 410766),
(23, 'IL', 'Chicago', '632 Princeton St.', 160696),
(24, 'IL', 'Chicago', 'Waxhaw', 464144),
(25, 'IL', 'Chicago', '7773 Tailwater Drive', 129393),
(26, 'IL', 'Chicago', 'Bonita Springs', 174886),
(27, 'IL', 'Chicago', '31 Summerhouse Rd.', 296008),
(28, 'IL', 'Chicago', 'Middleburg', 279000),
(29, 'IL', 'Chicago', '273 Windfall Avenue', 424846),
(30, 'IL', 'Chicago', 'Graham', 592268),
(31, 'TX', 'Houston', '91 Canterbury Dr.', 632014),
(32, 'TX', 'Houston', 'Dallas', 68868),
(33, 'TX', 'Houston', '503 Elmwood St.', 454184),
Learnings
Solutions
• - PostgreSQL solution
SELECT state, city, AVG(mkt_price) AS avg_mkt_price
FROM house_price hp1
WHERE (
SELECT AVG(mkt_price)
FROM house_price hp2
WHERE hp1.state = hp2.state AND hp1.city = hp2.city
) > 300000
GROUP BY state, city;
• - MySQL solution
SELECT state, city, AVG(mkt_price) AS avg_mkt_price
FROM house_price hp1
WHERE (
SELECT AVG(mkt_price)
FROM house_price hp2
WHERE hp1.state = hp2.state AND hp1.city = hp2.city
) > 300000
GROUP BY state, city;
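For comparison, the same groups can be found without a correlated subquery by aggregating once and filtering with HAVING. This is a minimal sketch and not the form the question asks for; the trimmed-down table and sample rows are hypothetical, and the script is meant for an empty scratch database.

```sql
-- Hypothetical, trimmed-down sample table (assumed for illustration)
CREATE TABLE house_price (
    id INT,
    state VARCHAR(255),
    city VARCHAR(255),
    mkt_price INT
);
INSERT INTO house_price VALUES
    (1, 'NY', 'Albany', 400000),
    (2, 'NY', 'Albany', 300000),
    (3, 'TX', 'Austin', 250000);

-- HAVING filters on the aggregate after grouping
SELECT state, city, AVG(mkt_price) AS avg_mkt_price
FROM house_price
GROUP BY state, city
HAVING AVG(mkt_price) > 300000;
```

With this sample data only NY / Albany qualifies, with an average of 350,000.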
• Q.178
Question
Find the customer with the highest total purchase value based on their orders and
order items.
Explanation
The task is to calculate the total purchase value for each customer by multiplying the
quantity of each item ordered by its unit price, then summing the values for all items
ordered by each customer. The customer with the highest total purchase value should
be returned.
• - Create tables
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(100)
);
• - Insert data
INSERT INTO Customers VALUES
(1, 'John Doe'),
(2, 'Jane Smith'),
(3, 'Alice Johnson');
Learnings
Solutions
• - PostgreSQL solution
SELECT CustomerName
FROM Customers
WHERE CustomerID = (
SELECT CustomerID
FROM Orders
JOIN OrderItems ON Orders.OrderID = OrderItems.OrderID
GROUP BY CustomerID
ORDER BY SUM(Quantity * UnitPrice) DESC
LIMIT 1
);
• - MySQL solution
SELECT CustomerName
FROM Customers
WHERE CustomerID = (
SELECT CustomerID
FROM Orders
JOIN OrderItems ON Orders.OrderID = OrderItems.OrderID
GROUP BY CustomerID
ORDER BY SUM(Quantity * UnitPrice) DESC
LIMIT 1
);
• Q.179
Question
Find employees who earn above the average salary of their department.
Explanation
The task is to find employees whose salary is greater than the average salary of
employees in the same department. A correlated subquery is used to calculate the
average salary for each department while comparing it against each employee’s salary
in the main query.
• - Create tables
CREATE TABLE Departments (
DepartmentID INT PRIMARY KEY,
DepartmentName VARCHAR(100)
);
• - Insert data
INSERT INTO Departments VALUES
(1, 'Engineering'),
(2, 'Marketing'),
(3, 'Finance');
Learnings
Solutions
• - PostgreSQL solution
SELECT EmployeeName
FROM Employees E
WHERE Salary > (
SELECT AVG(Salary)
FROM Employees
WHERE DepartmentID = E.DepartmentID
);
• - MySQL solution
SELECT EmployeeName
FROM Employees E
WHERE Salary > (
SELECT AVG(Salary)
FROM Employees
WHERE DepartmentID = E.DepartmentID
);
• Q.180
Question
Find the names of employees who earn more than the average salary of employees in
the same department, but only for departments where the average salary is greater
than $50,000.
Explanation
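This query involves two key parts:
1. Correlated Subquery: The inner subquery calculates the average salary of each
employee's department, and the outer query keeps employees who earn more than it.
2. Subquery Condition: A second subquery restricts the result to departments where
the average salary is greater than $50,000.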
• - Create tables
• - Insert data
INSERT INTO Departments VALUES
(1, 'Engineering'),
(2, 'Marketing'),
(3, 'Finance');
Learnings
• Correlated Subqueries: The inner subquery calculates the average salary for
each department.
• Subquery Filtering: The outer query filters only departments where the
average salary exceeds $50,000.
• Aggregation: The AVG() function is used for calculating the average salary in
each department.
Solutions
• - PostgreSQL solution
SELECT EmployeeName
FROM Employees E
WHERE Salary > (
SELECT AVG(Salary)
FROM Employees
WHERE DepartmentID = E.DepartmentID
)
AND E.DepartmentID IN (
SELECT DepartmentID
FROM Employees
GROUP BY DepartmentID
HAVING AVG(Salary) > 50000
);
• - MySQL solution
SELECT EmployeeName
FROM Employees E
WHERE Salary > (
SELECT AVG(Salary)
FROM Employees
WHERE DepartmentID = E.DepartmentID
)
AND E.DepartmentID IN (
SELECT DepartmentID
FROM Employees
GROUP BY DepartmentID
HAVING AVG(Salary) > 50000
);
• The inner subquery inside the WHERE clause calculates the average salary for
each department.
• The outer query compares each employee’s salary with the average salary of
their department.
• The second subquery in the AND condition ensures that only those departments
where the average salary is greater than $50,000 are considered.
• Q.181
Question
Write a query to find employees whose salary is greater than the average salary of all
employees.
Explanation
This can be solved using a CTE to calculate the average salary first, then filter
employees based on this average.
• - Create tables
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
EmployeeName VARCHAR(100),
Salary DECIMAL(10, 2)
);
• - Insert data
INSERT INTO Employees VALUES
(1, 'John Doe', 50000.00),
(2, 'Jane Smith', 60000.00),
(3, 'Alice Johnson', 70000.00),
(4, 'Bob Brown', 65000.00);
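A minimal sketch of the CTE approach described in the explanation, run against the Employees table and rows defined just above. The names AvgSalary and OverallAvg are illustrative choices, not taken from the book.

```sql
-- Step 1: compute the company-wide average once in a CTE
WITH AvgSalary AS (
    SELECT AVG(Salary) AS OverallAvg
    FROM Employees
)
-- Step 2: compare every employee against that single-row result
SELECT E.EmployeeName, E.Salary
FROM Employees E
CROSS JOIN AvgSalary A
WHERE E.Salary > A.OverallAvg;
```

With the four rows above the average is 61,250, so the query returns Alice Johnson (70,000) and Bob Brown (65,000). The CTE syntax needs PostgreSQL or MySQL 8.0 and later.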
Learnings
• Q.182
Question
Write a query to find all departments that have more than 2 employees, along with the
department name and the number of employees in each department.
Explanation
This can be solved using a CTE to calculate the number of employees in each
department, then filtering the results to only show departments with more than 2
employees.
• - Create tables
CREATE TABLE Departments (
DepartmentID INT PRIMARY KEY,
DepartmentName VARCHAR(100)
);
• - Insert data
INSERT INTO Departments VALUES
(1, 'Engineering'),
(2, 'Marketing'),
(3, 'Finance');
Learnings
WITH DeptEmployeeCount AS (
SELECT DepartmentID, COUNT(*) AS EmployeeCount
FROM Employees
GROUP BY DepartmentID
)
-- Alias EC is used instead of DEC, which is a reserved word in MySQL
SELECT D.DepartmentName, EC.EmployeeCount
FROM Departments D
JOIN DeptEmployeeCount EC ON D.DepartmentID = EC.DepartmentID
WHERE EC.EmployeeCount > 2;
• Q.183
Question
Write a query to find employees who report to the same manager. The result should
include the employee name and their manager's name.
Explanation
This can be solved using a CTE that joins the Employees table with itself to find
employees who have the same manager.
• - Create tables
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
EmployeeName VARCHAR(100),
ManagerID INT
);
• - Insert data
INSERT INTO Employees VALUES
(1, 'John Doe', NULL),
(2, 'Jane Smith', 1),
(3, 'Alice Johnson', 1),
(4, 'Bob Brown', 2),
(5, 'Charlie Davis', 2);
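A minimal sketch of the CTE self-join described in the explanation, run against the Employees table and rows defined just above. The CTE name EmployeeManager and the ReportCount filter (which keeps only managers with at least two reports, so that every listed employee shares a manager with someone) are illustrative assumptions about what "the same manager" should mean.

```sql
WITH EmployeeManager AS (
    SELECT E.EmployeeName,
           M.EmployeeName AS ManagerName,
           -- number of employees reporting to the same manager
           COUNT(*) OVER (PARTITION BY E.ManagerID) AS ReportCount
    FROM Employees E
    JOIN Employees M ON E.ManagerID = M.EmployeeID
)
SELECT EmployeeName, ManagerName
FROM EmployeeManager
WHERE ReportCount > 1
ORDER BY ManagerName, EmployeeName;
```

With the sample rows above this returns four rows: Jane Smith and Alice Johnson under John Doe, and Bob Brown and Charlie Davis under Jane Smith. John Doe himself has no manager and is not listed as an employee.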
Learnings
• Q.184
Question
Write a query to find the names of products that have never been ordered.
Explanation
This can be done using a CTE to list all products and then using a LEFT JOIN to
check which products don't have any corresponding orders in the OrderItems table.
• - Create tables
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
ProductName VARCHAR(100)
);
• - Insert data
INSERT INTO Products VALUES
(1, 'Laptop'),
(2, 'Mouse'),
(3, 'Keyboard'),
(4, 'Monitor');
Learnings
MySQL Solution
WITH ProductList AS (
SELECT ProductID, ProductName
FROM Products
)
SELECT ProductName
FROM ProductList PL
LEFT JOIN OrderItems OI ON PL.ProductID = OI.ProductID
WHERE OI.OrderID IS NULL;
Postgres Solution
WITH ProductList AS (
SELECT ProductID, ProductName
FROM Products
)
SELECT ProductName
FROM ProductList PL
LEFT JOIN OrderItems OI ON PL.ProductID = OI.ProductID
WHERE OI.OrderID IS NULL;
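The CTE in the solutions above only wraps the Products table, so the same anti-join can be written without it. A minimal sketch using NOT EXISTS, which also runs on MySQL versions before 8.0 where CTEs are unavailable; the OrderItems columns and sample rows are assumptions, and the script is meant for an empty scratch database.

```sql
-- Hypothetical sample schema and data (assumed for illustration)
CREATE TABLE Products (
    ProductID INT PRIMARY KEY,
    ProductName VARCHAR(100)
);
CREATE TABLE OrderItems (
    OrderItemID INT PRIMARY KEY,
    OrderID INT,
    ProductID INT
);
INSERT INTO Products VALUES (1, 'Laptop'), (2, 'Mouse');
INSERT INTO OrderItems VALUES (1, 100, 1);

-- Keep a product only if no order item references it
SELECT P.ProductName
FROM Products P
WHERE NOT EXISTS (
    SELECT 1
    FROM OrderItems OI
    WHERE OI.ProductID = P.ProductID
);
```

With this sample data the query returns only 'Mouse'.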
• Q.185
Question
Write a query to find customers who have purchased more than 5 different products.
Explanation
Using a CTE, we first count the number of distinct products each customer has
purchased, then filter those who have purchased more than 5 products.
• - Create tables
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(100)
);
• - Insert data
INSERT INTO Customers VALUES
(1, 'John Doe'),
(2, 'Jane Smith');
Learnings
MySQL Solution
WITH ProductCounts AS (
SELECT O.CustomerID, COUNT(DISTINCT OI.ProductID) AS ProductCount
FROM Orders O
JOIN OrderItems OI ON O.OrderID = OI.OrderID
GROUP BY O.CustomerID
)
SELECT C.CustomerName
FROM Customers C
JOIN ProductCounts PC ON C.CustomerID = PC.CustomerID
WHERE PC.ProductCount > 5;
Postgres Solution
WITH ProductCounts AS (
SELECT O.CustomerID, COUNT(DISTINCT OI.ProductID) AS ProductCount
FROM Orders O
JOIN OrderItems OI ON O.OrderID = OI.OrderID
GROUP BY O.CustomerID
)
SELECT C.CustomerName
FROM Customers C
JOIN ProductCounts PC ON C.CustomerID = PC.CustomerID
WHERE PC.ProductCount > 5;
• Q.186
Question
Write a query to find the most expensive product in each category.
Explanation
We can use a CTE to calculate the maximum price for each category, then join it back
to the Products table to pick out the most expensive product in each category.
• - Create tables
CREATE TABLE Categories (
CategoryID INT PRIMARY KEY,
CategoryName VARCHAR(100)
);
• - Insert data
Learnings
MySQL Solution
WITH MaxPrices AS (
SELECT CategoryID, MAX(Price) AS MaxPrice
FROM Products
GROUP BY CategoryID
)
SELECT P.ProductName, C.CategoryName, P.Price
FROM Products P
JOIN MaxPrices MP ON P.CategoryID = MP.CategoryID AND P.Price = MP.MaxPrice
JOIN Categories C ON P.CategoryID = C.CategoryID;
Postgres Solution
WITH MaxPrices AS (
SELECT CategoryID, MAX(Price) AS MaxPrice
FROM Products
GROUP BY CategoryID
)
SELECT P.ProductName, C.CategoryName, P.Price
FROM Products P
JOIN MaxPrices MP ON P.CategoryID = MP.CategoryID AND P.Price = MP.MaxPrice
JOIN Categories C ON P.CategoryID = C.CategoryID;
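An alternative to joining back on the maximum price is to rank products within each category. A minimal sketch with RANK(), assuming PostgreSQL or MySQL 8.0 and later; the Products columns and sample rows are assumptions, and the script is meant for an empty scratch database.

```sql
-- Hypothetical sample schema and data (assumed for illustration)
CREATE TABLE Products (
    ProductID INT PRIMARY KEY,
    ProductName VARCHAR(100),
    CategoryID INT,
    Price DECIMAL(10, 2)
);
INSERT INTO Products VALUES
    (1, 'Laptop', 1, 1200.00),
    (2, 'Phone', 1, 800.00),
    (3, 'Novel', 2, 20.00),
    (4, 'Atlas', 2, 20.00);

-- Rank products by price inside each category, highest first
WITH Ranked AS (
    SELECT ProductName, CategoryID, Price,
           RANK() OVER (PARTITION BY CategoryID ORDER BY Price DESC) AS PriceRank
    FROM Products
)
SELECT ProductName, CategoryID, Price
FROM Ranked
WHERE PriceRank = 1;
```

Like the MAX-based solutions, this returns every product tied for the top price, so both 'Novel' and 'Atlas' appear for category 2 alongside 'Laptop' for category 1. Swapping RANK() for ROW_NUMBER() would return exactly one row per category, chosen arbitrarily among ties.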
• Q.187
Question
Calculate the department-wise average salary using a Common Table Expression
(CTE).
Explanation
You need to calculate the average salary for each department. Use a CTE to join the
Employees, Salaries, and Departments tables, then group by department to
calculate the average salary.
• - Table creation
• - Datasets
INSERT INTO Employees VALUES
(1, 'Arun Kumar', 1),
(2, 'Priya Sharma', 2),
(3, 'Ravi Patel', 1),
(4, 'Sita Mehta', 3);
Learnings
Solutions
• - PostgreSQL solution
WITH DepartmentSalaries AS (
SELECT E.DepartmentID, D.DepartmentName, AVG(S.Salary) AS AvgSalary
FROM Employees E
JOIN Salaries S ON E.EmployeeID = S.EmployeeID
JOIN Departments D ON E.DepartmentID = D.DepartmentID
GROUP BY E.DepartmentID, D.DepartmentName
)
SELECT * FROM DepartmentSalaries;
• - MySQL solution
WITH DepartmentSalaries AS (
SELECT E.DepartmentID, D.DepartmentName, AVG(S.Salary) AS AvgSalary
FROM Employees E
JOIN Salaries S ON E.EmployeeID = S.EmployeeID
JOIN Departments D ON E.DepartmentID = D.DepartmentID
GROUP BY E.DepartmentID, D.DepartmentName
)
SELECT * FROM DepartmentSalaries;
• Q.188
Question
Find the Employees Who Have the Highest Salary in Each Department.
Explanation
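The task is to find the employees with the highest salary in each department using a
Common Table Expression (CTE). The solution involves computing the maximum
salary per department in a CTE, then joining that result back to the employee and
salary rows to keep only the employees who earn that maximum.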
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
EmployeeName VARCHAR(100),
DepartmentID INT
);
• - Datasets
INSERT INTO Employees VALUES
(1, 'Arun Kumar', 1),
(2, 'Priya Sharma', 2),
(3, 'Ravi Patel', 1),
(4, 'Sita Mehta', 3);
Learnings
• Using CTEs to calculate aggregate values (like MAX salary) for each group
(e.g., department).
• Joins across multiple tables (Employees, Salaries, Departments) to fetch
related information.
• Filtering the results to only show employees with the maximum salary in each
department.
Solutions
• - PostgreSQL solution
WITH DepartmentMaxSalary AS (
SELECT E.DepartmentID, MAX(S.Salary) AS MaxSalary
FROM Employees E
JOIN Salaries S ON E.EmployeeID = S.EmployeeID
GROUP BY E.DepartmentID
)
SELECT E.EmployeeName, S.Salary, D.DepartmentName
FROM Employees E
JOIN Salaries S ON E.EmployeeID = S.EmployeeID
JOIN Departments D ON E.DepartmentID = D.DepartmentID
JOIN DepartmentMaxSalary DMS ON E.DepartmentID = DMS.DepartmentID
WHERE S.Salary = DMS.MaxSalary;
• - MySQL solution
WITH DepartmentMaxSalary AS (
SELECT E.DepartmentID, MAX(S.Salary) AS MaxSalary
FROM Employees E
JOIN Salaries S ON E.EmployeeID = S.EmployeeID
GROUP BY E.DepartmentID
)
SELECT E.EmployeeName, S.Salary, D.DepartmentName
FROM Employees E
JOIN Salaries S ON E.EmployeeID = S.EmployeeID
JOIN Departments D ON E.DepartmentID = D.DepartmentID
JOIN DepartmentMaxSalary DMS ON E.DepartmentID = DMS.DepartmentID
WHERE S.Salary = DMS.MaxSalary;
• Q.189
Question
Find Patients Who Have Visited More Than 3 Times.
Explanation
The goal is to identify patients who have had more than 3 appointments. The solution
involves:
• Counting the number of visits for each patient using a Common Table
Expression (CTE).
• Filtering patients whose visit count is greater than 3.
• - Table creation
• - Datasets
INSERT INTO Patients VALUES
(1, 'Anita Roy', 35, 'Female'),
(2, 'Sandeep Kumar', 40, 'Male'),
(3, 'Ravi Gupta', 28, 'Male'),
(4, 'Priya Sharma', 55, 'Female');
Learnings
Solutions
• - PostgreSQL solution
WITH PatientVisitCount AS (
SELECT PatientID, COUNT(*) AS VisitCount
FROM Appointments
GROUP BY PatientID
)
SELECT P.PatientName
FROM Patients P
JOIN PatientVisitCount PVC ON P.PatientID = PVC.PatientID
WHERE PVC.VisitCount > 3;
• - MySQL solution
WITH PatientVisitCount AS (
SELECT PatientID, COUNT(*) AS VisitCount
FROM Appointments
GROUP BY PatientID
)
SELECT P.PatientName
FROM Patients P
JOIN PatientVisitCount PVC ON P.PatientID = PVC.PatientID
WHERE PVC.VisitCount > 3;
• Q.190
Question
List the Most Frequent Doctor for Each Patient.
Explanation
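The task is to identify the most frequent doctor for each patient. This can be achieved
by counting appointments per patient and doctor in one CTE, finding each patient's
highest count in a second CTE, and keeping the patient and doctor pairs whose count
equals that maximum.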
• - Table creation
CREATE TABLE Patients (
PatientID INT PRIMARY KEY,
PatientName VARCHAR(100)
);
• - Datasets
INSERT INTO Patients VALUES
(1, 'Anita Roy'),
(2, 'Sandeep Kumar'),
(3, 'Ravi Gupta'),
(4, 'Priya Sharma');
Learnings
Solutions
• - PostgreSQL solution
WITH PatientDoctorCount AS (
SELECT PatientID, DoctorID, COUNT(*) AS VisitCount
FROM Appointments
GROUP BY PatientID, DoctorID
),
MaxDoctorVisit AS (
SELECT PatientID, MAX(VisitCount) AS MaxVisits
FROM PatientDoctorCount
GROUP BY PatientID
)
SELECT P.PatientName, D.DoctorName
FROM Patients P
JOIN PatientDoctorCount PDC ON P.PatientID = PDC.PatientID
JOIN Doctors D ON PDC.DoctorID = D.DoctorID
JOIN MaxDoctorVisit MDV ON PDC.PatientID = MDV.PatientID
WHERE PDC.VisitCount = MDV.MaxVisits;
• - MySQL solution
WITH PatientDoctorCount AS (
SELECT PatientID, DoctorID, COUNT(*) AS VisitCount
FROM Appointments
GROUP BY PatientID, DoctorID
),
MaxDoctorVisit AS (
SELECT PatientID, MAX(VisitCount) AS MaxVisits
FROM PatientDoctorCount
GROUP BY PatientID
)
SELECT P.PatientName, D.DoctorName
FROM Patients P
JOIN PatientDoctorCount PDC ON P.PatientID = PDC.PatientID
JOIN Doctors D ON PDC.DoctorID = D.DoctorID
JOIN MaxDoctorVisit MDV ON PDC.PatientID = MDV.PatientID
WHERE PDC.VisitCount = MDV.MaxVisits;
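One detail worth checking is what this query does when a patient has seen two doctors equally often. The sketch below runs the same two-CTE query in Python's sqlite3 module (a stand-in engine, not PostgreSQL or MySQL). The Doctors and Appointments schemas and all appointment rows are hypothetical, because the book's definitions for them are not shown.

```python
import sqlite3

QUERY = """
WITH PatientDoctorCount AS (
    SELECT PatientID, DoctorID, COUNT(*) AS VisitCount
    FROM Appointments
    GROUP BY PatientID, DoctorID
),
MaxDoctorVisit AS (
    SELECT PatientID, MAX(VisitCount) AS MaxVisits
    FROM PatientDoctorCount
    GROUP BY PatientID
)
SELECT P.PatientName, D.DoctorName
FROM Patients P
JOIN PatientDoctorCount PDC ON P.PatientID = PDC.PatientID
JOIN Doctors D ON PDC.DoctorID = D.DoctorID
JOIN MaxDoctorVisit MDV ON PDC.PatientID = MDV.PatientID
WHERE PDC.VisitCount = MDV.MaxVisits
ORDER BY P.PatientName, D.DoctorName;
"""

def most_frequent_doctors():
    """Run the query on a tiny hypothetical data set and return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Patients (PatientID INT PRIMARY KEY, PatientName VARCHAR(100));
        CREATE TABLE Doctors (DoctorID INT PRIMARY KEY, DoctorName VARCHAR(100));
        CREATE TABLE Appointments (
            AppointmentID INT PRIMARY KEY, PatientID INT, DoctorID INT
        );
        INSERT INTO Patients VALUES (1, 'Anita Roy'), (2, 'Sandeep Kumar');
        INSERT INTO Doctors VALUES (10, 'Dr. A'), (20, 'Dr. B');
        INSERT INTO Appointments VALUES
            (1, 1, 10), (2, 1, 10), (3, 1, 20),  -- Anita: Dr. A twice, Dr. B once
            (4, 2, 10), (5, 2, 20);              -- Sandeep: one visit each (a tie)
    """)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows

for row in most_frequent_doctors():
    print(row)
```

Because the final filter compares against the maximum count, a tied patient appears once per tied doctor. If exactly one row per patient is required, a tie-breaking rule has to be added.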
Window Functions
• Q.191
Question
Rank Employees by Salary (Descending).
Explanation
The task is to rank employees based on their salary in descending order using the
RANK() window function. This function assigns a rank to each row within a result set.
Tied rows receive the same rank, and the ranks that would have followed are skipped
(for example, 1, 2, 2, 4).
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
EmployeeName VARCHAR(100),
Salary DECIMAL(10, 2),
DepartmentID INT
);
• - Datasets
INSERT INTO Employees VALUES
(1, 'John Smith', 55000, 1),
(2, 'Sarah Brown', 60000, 2),
(3, 'James White', 50000, 1),
(4, 'Emma Green', 65000, 3),
(5, 'Michael Clark', 48000, 2);
Learnings
Solutions
• - PostgreSQL solution
SELECT EmployeeName, Salary,
RANK() OVER (ORDER BY Salary DESC) AS SalaryRank
FROM Employees;
• - MySQL solution
SELECT EmployeeName, Salary,
RANK() OVER (ORDER BY Salary DESC) AS SalaryRank
FROM Employees;
• Q.192
Question
Assign a Unique Row Number to Each Employee in a Department.
Explanation
The task is to assign a unique row number to each employee within their respective
department using the ROW_NUMBER() window function. This function provides a
sequential number for each row within a partition, ordered by salary in descending
order.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
Salary DECIMAL(10, 2),
DepartmentID INT
);
• - Datasets
INSERT INTO Employees VALUES
(1, 'Oliver Harris', 45000, 1),
(2, 'Emily Walker', 52000, 2),
(3, 'Charlotte King', 48000, 2),
(4, 'James Thompson', 56000, 1),
(5, 'Liam White', 54000, 1);
Learnings
Solutions
• - PostgreSQL solution
SELECT Name, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS RowNum
FROM Employees;
• - MySQL solution
SELECT Name, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS RowNum
FROM Employees;
• Q.193
Question
List Employees with Their Previous Salary (Using LAG).
Explanation
The task is to retrieve the salary of each employee along with their previous salary
using the LAG() window function. The LAG() function returns the value of the
specified column (in this case, Salary) from the previous row in the result set, based
on the given order.
• - Table creation
CREATE TABLE StaffMembers (
StaffID INT PRIMARY KEY,
Name VARCHAR(100),
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO StaffMembers VALUES
(1, 'William Brown', 49000),
(2, 'Sophia Harris', 54000),
(3, 'Isabella Clark', 46000),
(4, 'Mia Lewis', 55000),
(5, 'Jacob Davis', 50000);
Learnings
• LAG(): The LAG() window function retrieves the value of a column from the
previous row within the specified ordering.
• ORDER BY: The order of salary is used to determine the previous salary. It
assigns a previous salary to each row based on ascending salary order.
• Handling NULLs: For the first row, where there is no previous row, LAG()
returns NULL.
Solutions
• - PostgreSQL solution
SELECT Name, Salary,
LAG(Salary, 1) OVER (ORDER BY Salary) AS PreviousSalary
FROM StaffMembers;
• - MySQL solution
SELECT Name, Salary,
LAG(Salary, 1) OVER (ORDER BY Salary) AS PreviousSalary
FROM StaffMembers;
• Q.194
Question
Find the Top 3 Highest Paid Employees Using ROW_NUMBER.
Explanation
The task is to find the top 3 highest-paid employees by assigning a unique rank using
the ROW_NUMBER() window function. The function assigns a number to each row
based on salary in descending order, and then the result is filtered to retrieve only the
top 3 employees.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Employees VALUES
(1, 'Harry Williams', 55000),
(2, 'Olivia Jackson', 60000),
(3, 'George Smith', 58000),
(4, 'Charlotte Brown', 67000),
(5, 'Amelia Harris', 65000);
Learnings
Solutions
• - PostgreSQL solution
WITH TopSalaries AS (
SELECT Name, Salary,
ROW_NUMBER() OVER (ORDER BY Salary DESC) AS Rank
FROM Employees
)
SELECT Name, Salary
FROM TopSalaries
WHERE Rank <= 3;
• - MySQL solution
-- RANK is a reserved word in MySQL 8.0, so the alias is RowNum.
WITH TopSalaries AS (
SELECT Name, Salary,
ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum
FROM Employees
)
SELECT Name, Salary
FROM TopSalaries
WHERE RowNum <= 3;
• Q.195
Question
Rank Employees with Ties Using DENSE_RANK.
Explanation
The task is to rank employees by salary in descending order using the DENSE_RANK()
window function. Unlike RANK(), which skips rank numbers in case of ties,
DENSE_RANK() assigns the same rank to tied values but does not skip subsequent
ranks.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Employees VALUES
(1, 'Lucas Evans', 55000),
(2, 'Ava Johnson', 65000),
(3, 'Ethan Harris', 65000),
(4, 'Ella Walker', 47000),
(5, 'Mason Davis', 48000);
Learnings
Solutions
• - PostgreSQL solution
SELECT Name, Salary,
DENSE_RANK() OVER (ORDER BY Salary DESC) AS SalaryRank
FROM Employees;
• - MySQL solution
SELECT Name, Salary,
DENSE_RANK() OVER (ORDER BY Salary DESC) AS SalaryRank
FROM Employees;
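To see the three ranking functions side by side, the sketch below loads the Q.195 rows into an in-memory SQLite database through Python's sqlite3 module (assumed here as a convenient stand-in; the window-function syntax is the same in PostgreSQL and MySQL 8.0). The column aliases and the EmployeeID tie-breaker for ROW_NUMBER() are choices made for this illustration.

```python
import sqlite3

def ranking_comparison():
    """Return (Name, RANK, DENSE_RANK, ROW_NUMBER) for the Q.195 sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employees (
            EmployeeID INT PRIMARY KEY,
            Name VARCHAR(100),
            Salary DECIMAL(10, 2)
        );
        INSERT INTO Employees VALUES
            (1, 'Lucas Evans', 55000),
            (2, 'Ava Johnson', 65000),
            (3, 'Ethan Harris', 65000),
            (4, 'Ella Walker', 47000),
            (5, 'Mason Davis', 48000);
    """)
    rows = conn.execute("""
        SELECT Name,
               RANK()       OVER (ORDER BY Salary DESC)             AS RankWithGaps,
               DENSE_RANK() OVER (ORDER BY Salary DESC)             AS DenseRank,
               ROW_NUMBER() OVER (ORDER BY Salary DESC, EmployeeID) AS RowNum
        FROM Employees
        ORDER BY Salary DESC, EmployeeID
    """).fetchall()
    conn.close()
    return rows

for row in ranking_comparison():
    print(row)
```

The two employees tied at 65000 share rank 1 under both RANK() and DENSE_RANK(). The next salary then gets 3 under RANK() and 2 under DENSE_RANK(), while ROW_NUMBER() never repeats a value.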
• Q.196
Question
Find Employees Who Are 2nd in Their Department by Salary Using
ROW_NUMBER.
Explanation
The task is to find employees who are ranked 2nd in their department by salary using
the ROW_NUMBER() window function. The ROW_NUMBER() function assigns a unique
rank to each employee within their department, ordered by salary in descending order.
We then filter for employees who have a rank of 2.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
DepartmentID INT,
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Employees (EmployeeID, Name, Salary, DepartmentID) VALUES
(1, 'Daniel Miller', 55000, 1),
(2, 'Sophia Allen', 60000, 2),
(3, 'Ethan Jackson', 57000, 1),
(4, 'Olivia White', 62000, 2),
(5, 'Mason Harris', 45000, 1);
Learnings
Solutions
• - PostgreSQL solution
WITH RankedEmployees AS (
SELECT Name, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS Rank
FROM Employees
)
SELECT Name, DepartmentID, Salary
FROM RankedEmployees
WHERE Rank = 2;
• - MySQL solution
-- RANK is a reserved word in MySQL 8.0, so the alias is RowNum.
WITH RankedEmployees AS (
SELECT Name, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS RowNum
FROM Employees
)
SELECT Name, DepartmentID, Salary
FROM RankedEmployees
WHERE RowNum = 2;
• Q.197
Question
Use NTILE to Divide Employees into 4 Salary Groups.
Explanation
The task is to divide employees into 4 groups based on their salary using the NTILE()
window function. The NTILE(n) function assigns rows into n approximately equal
groups, ordered by the specified column. Here, employees are divided into 4 salary
groups based on descending salary.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Employees VALUES
(1, 'Lily Martin', 55000),
(2, 'James White', 70000),
(3, 'Benjamin Lewis', 80000),
(4, 'Lucas Walker', 95000),
(5, 'Mia Scott', 40000);
Learnings
• NTILE(): The NTILE(n) function divides the data into n groups based on an
ordered column. If the rows do not divide evenly, the extra rows go to the first
groups, one each, so group sizes differ by at most one.
• ORDER BY: The ORDER BY clause ensures that employees are ranked by
salary in descending order before dividing them into groups.
• Grouping: Employees are distributed into 4 salary groups based on their
relative ranking in terms of salary.
Solutions
• - PostgreSQL solution
SELECT Name, Salary,
NTILE(4) OVER (ORDER BY Salary DESC) AS SalaryGroup
FROM Employees;
• - MySQL solution
SELECT Name, Salary,
NTILE(4) OVER (ORDER BY Salary DESC) AS SalaryGroup
FROM Employees;
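With 5 employees and 4 groups, the split cannot be even. The sketch below runs the query on the Q.197 rows using Python's sqlite3 module (an assumed stand-in engine) to show where the extra row lands. The helper name salary_groups and its n_groups parameter are hypothetical.

```python
import sqlite3

def salary_groups(n_groups=4):
    """Return (Name, SalaryGroup) pairs, highest salary first."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employees (
            EmployeeID INT PRIMARY KEY,
            Name VARCHAR(100),
            Salary DECIMAL(10, 2)
        );
        INSERT INTO Employees VALUES
            (1, 'Lily Martin', 55000),
            (2, 'James White', 70000),
            (3, 'Benjamin Lewis', 80000),
            (4, 'Lucas Walker', 95000),
            (5, 'Mia Scott', 40000);
    """)
    # int() guards the value that is formatted into the SQL text.
    sql = f"""
        SELECT Name, NTILE({int(n_groups)}) OVER (ORDER BY Salary DESC) AS SalaryGroup
        FROM Employees
        ORDER BY Salary DESC
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows

for row in salary_groups():
    print(row)
```

The first group receives two rows and the remaining three groups receive one each, so the two highest salaries share group 1.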
• Q.198
Question
Find the Salary Difference Between Employees and Their Preceding Employee Using
LAG.
Explanation
The task is to calculate the salary difference between each employee and the
preceding employee in terms of salary using the LAG() window function. The LAG()
function allows you to access the salary of the previous employee in the ordered list,
and then compute the difference.
• - Table creation
CREATE TABLE EmployeeSalaries (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO EmployeeSalaries VALUES
(1, 'Henry Clark', 51000),
(2, 'Ava White', 55000),
(3, 'Isabella Davis', 56000),
(4, 'George Lewis', 58000),
(5, 'Mason Harris', 60000);
Learnings
• LAG(): The LAG() function allows you to reference the previous row's value,
here used to get the previous employee's salary.
• Salary Difference: Subtracting the previous salary from the current one
calculates the difference.
• ORDER BY: The employees are ordered by salary to ensure that the
"preceding employee" refers to the one with the lower salary.
Solutions
• - PostgreSQL solution
SELECT Name, Salary,
LAG(Salary, 1) OVER (ORDER BY Salary) AS PreviousSalary,
Salary - LAG(Salary, 1) OVER (ORDER BY Salary) AS SalaryDifference
FROM EmployeeSalaries;
• - MySQL solution
SELECT Name, Salary,
LAG(Salary, 1) OVER (ORDER BY Salary) AS PreviousSalary,
Salary - LAG(Salary, 1) OVER (ORDER BY Salary) AS SalaryDifference
FROM EmployeeSalaries;
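Both LAG() calls above repeat the same OVER clause. A named window is one way to state it once. The sketch below tries this on the Q.198 rows through Python's sqlite3 module (an assumed stand-in engine; the WINDOW clause is also accepted by PostgreSQL and MySQL 8.0). The window name w is arbitrary.

```python
import sqlite3

def salary_differences():
    """Return (Name, Salary, PreviousSalary, SalaryDifference), lowest salary first."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE EmployeeSalaries (
            EmployeeID INT PRIMARY KEY,
            Name VARCHAR(100),
            Salary DECIMAL(10, 2)
        );
        INSERT INTO EmployeeSalaries VALUES
            (1, 'Henry Clark', 51000),
            (2, 'Ava White', 55000),
            (3, 'Isabella Davis', 56000),
            (4, 'George Lewis', 58000),
            (5, 'Mason Harris', 60000);
    """)
    rows = conn.execute("""
        SELECT Name, Salary,
               LAG(Salary) OVER w          AS PreviousSalary,
               Salary - LAG(Salary) OVER w AS SalaryDifference
        FROM EmployeeSalaries
        WINDOW w AS (ORDER BY Salary)
        ORDER BY Salary
    """).fetchall()
    conn.close()
    return rows

for row in salary_differences():
    print(row)
```

The lowest-paid employee has no preceding row, so both PreviousSalary and SalaryDifference come back as NULL (None in Python) for that row.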
• Q.199
Question
Find the Employee with the Highest Salary in Each Department Using
ROW_NUMBER.
Explanation
The task is to identify the employee with the highest salary in each department using
the ROW_NUMBER() window function. The ROW_NUMBER() function assigns a unique
rank to employees within each department, ordered by salary in descending order. The
employee with the highest salary in each department will have a rank of 1.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
DepartmentID INT,
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Employees (EmployeeID, Name, Salary, DepartmentID) VALUES
(1, 'Oliver Phillips', 64000, 1),
(2, 'Sophia Wood', 67000, 2),
(3, 'Liam White', 60000, 1),
(4, 'Charlotte Scott', 72000, 2),
(5, 'Amelia Johnson', 50000, 1);
Learnings
Solutions
• - PostgreSQL solution
WITH DepartmentSalaries AS (
SELECT Name, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS Rank
FROM Employees
)
SELECT Name, DepartmentID, Salary
FROM DepartmentSalaries
WHERE Rank = 1;
• - MySQL solution
-- RANK is a reserved word in MySQL 8.0, so the alias is RowNum.
WITH DepartmentSalaries AS (
SELECT Name, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS RowNum
FROM Employees
)
SELECT Name, DepartmentID, Salary
FROM DepartmentSalaries
WHERE RowNum = 1;
• Q.200
Question
Get Employees Who Are 3rd in Terms of Salary in Each Department Using
ROW_NUMBER.
Explanation
The task is to find employees who rank 3rd in terms of salary within their respective
departments using the ROW_NUMBER() window function. The ROW_NUMBER() function
assigns a unique rank to employees within each department, ordered by salary in
descending order. The employees who rank 3rd in their departments will have a rank
of 3.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
DepartmentID INT,
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Employees (EmployeeID, Name, Salary, DepartmentID) VALUES
(1, 'John Black', 48000, 1),
(2, 'Oliver Smith', 55000, 2),
(3, 'Emily Harris', 60000, 1),
(4, 'Daniel Brown', 65000, 2),
(5, 'Sophia King', 48000, 1);
Learnings
Solutions
• - PostgreSQL solution
WITH DepartmentRank AS (
SELECT Name, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS Rank
FROM Employees
)
SELECT Name, DepartmentID, Salary
FROM DepartmentRank
WHERE Rank = 3;
• - MySQL solution
-- RANK is a reserved word in MySQL 8.0, so the alias is RowNum.
WITH DepartmentRank AS (
SELECT Name, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary DESC) AS RowNum
FROM Employees
)
SELECT Name, DepartmentID, Salary
FROM DepartmentRank
WHERE RowNum = 3;
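In this data set two employees in department 1 earn 48000, so ROW_NUMBER() ordered by salary alone may number them in either order. The sketch below adds EmployeeID as a tie-breaker (an assumption made for this illustration, not part of the original solution) and runs the query with Python's sqlite3 module as a stand-in engine. The INSERT names its columns so that each value lands in the intended column.

```python
import sqlite3

def nth_in_department(n=3):
    """Return (Name, DepartmentID, Salary) for the n-th highest paid row per department."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employees (
            EmployeeID INT PRIMARY KEY,
            Name VARCHAR(100),
            DepartmentID INT,
            Salary DECIMAL(10, 2)
        );
        INSERT INTO Employees (EmployeeID, Name, Salary, DepartmentID) VALUES
            (1, 'John Black', 48000, 1),
            (2, 'Oliver Smith', 55000, 2),
            (3, 'Emily Harris', 60000, 1),
            (4, 'Daniel Brown', 65000, 2),
            (5, 'Sophia King', 48000, 1);
    """)
    rows = conn.execute("""
        WITH DepartmentRank AS (
            SELECT Name, DepartmentID, Salary,
                   ROW_NUMBER() OVER (
                       PARTITION BY DepartmentID
                       ORDER BY Salary DESC, EmployeeID  -- tie-breaker keeps results stable
                   ) AS RowNum
            FROM Employees
        )
        SELECT Name, DepartmentID, Salary
        FROM DepartmentRank
        WHERE RowNum = ?
        ORDER BY DepartmentID
    """, (n,)).fetchall()
    conn.close()
    return rows

print(nth_in_department(3))
```

Department 2 has only two employees, so it contributes no row for the third position. If tied salaries should share a position instead, DENSE_RANK() is the usual alternative.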
String Functions
• Q.201
Question
Concatenate First and Last Names.
Explanation
The task is to concatenate the first and last names of employees into a single full name
using the CONCAT() function. This function combines the values of the first and last
name with a space in between.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50)
);
• - Datasets
INSERT INTO Employees VALUES
(1, 'John', 'Black'),
(2, 'Oliver', 'Smith'),
(3, 'Emily', 'Harris'),
(4, 'Daniel', 'Brown');
Learnings
Solutions
• - PostgreSQL solution
SELECT EmployeeID, CONCAT(FirstName, ' ', LastName) AS FullName
FROM Employees;
• - MySQL solution
SELECT EmployeeID, CONCAT(FirstName, ' ', LastName) AS FullName
FROM Employees;
• Q.202
Question
Extract Year from Date of Birth (TEXT).
Explanation
The task is to extract the year from the DateOfBirth column using the EXTRACT()
function, which is commonly used to retrieve specific parts of a date (e.g., year,
month, day).
• - Table creation
CREATE TABLE People (
PersonID INT PRIMARY KEY,
Name VARCHAR(100),
DateOfBirth DATE
);
• - Datasets
INSERT INTO People VALUES
(1, 'Lucas Green', '1990-04-15'),
(2, 'Charlotte Brown', '1985-09-25'),
(3, 'Ethan White', '1992-03-10'),
(4, 'Mason Harris', '1988-11-05');
Learnings
Solutions
• - PostgreSQL solution
SELECT Name, DateOfBirth,
EXTRACT(YEAR FROM DateOfBirth) AS BirthYear
FROM People;
• - MySQL solution
SELECT Name, DateOfBirth,
YEAR(DateOfBirth) AS BirthYear
FROM People;
Note: In PostgreSQL, the EXTRACT() function is used, whereas MySQL provides the
YEAR() function to directly extract the year from a DATE field.
• Q.203
Question
Remove Extra Spaces in a String (TRIM).
Explanation
The task is to remove any leading or trailing spaces from the ProductName column
using the TRIM() function, which eliminates whitespace characters from both ends of
a string.
• - Table creation
CREATE TABLE Products (
• - Datasets
INSERT INTO Products VALUES
(1, ' Apple iPhone 12 '),
(2, ' Samsung Galaxy S21 '),
(3, ' Sony Xperia 1 II '),
(4, ' OnePlus 8 Pro ');
Learnings
• TRIM(): This function is used to remove leading and trailing spaces from a
string.
• String Cleaning: Removing extra spaces is essential for clean and consistent
data, especially when processing text for search or reporting.
Solutions
• - PostgreSQL solution
SELECT ProductID, TRIM(ProductName) AS TrimmedProductName
FROM Products;
• - MySQL solution
SELECT ProductID, TRIM(ProductName) AS TrimmedProductName
FROM Products;
Note: The TRIM() function works similarly in both PostgreSQL and MySQL for
removing leading and trailing whitespace.
• Q.204
Question
Replace Specific Characters in a Text (REPLACE).
Explanation
The task is to replace all commas with semicolons in the Address field using the
REPLACE() function. This function finds all occurrences of a specified substring (in
this case, commas) and replaces them with a new substring (semicolons).
• - Table creation
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
Name VARCHAR(100),
Address VARCHAR(200)
);
• - Datasets
INSERT INTO Customers VALUES
(1, 'John Doe', '1234 Elm St., New York, NY 10001'),
(2, 'Alice Johnson', '5678 Oak Ave, Los Angeles, CA 90001'),
(3, 'Bob Brown', '4321 Pine Rd, Chicago, IL 60001');
Learnings
Solutions
• - PostgreSQL solution
SELECT CustomerID, Name, REPLACE(Address, ',', ';') AS UpdatedAddress
FROM Customers;
• - MySQL solution
SELECT CustomerID, Name, REPLACE(Address, ',', ';') AS UpdatedAddress
FROM Customers;
Note: The REPLACE() function works similarly in both PostgreSQL and MySQL for
replacing specific characters or substrings within a string.
• Q.205
Question
Substring Extraction for Specific Position.
Explanation
The task is to extract the first 4 characters from the CompanyName column using the
SUBSTRING() function. The function allows you to extract a portion of a string
starting from a specific position.
• - Table creation
CREATE TABLE Companies (
CompanyID INT PRIMARY KEY,
CompanyName VARCHAR(100),
Industry VARCHAR(100)
);
• - Datasets
INSERT INTO Companies VALUES
(1, 'Tech Corp', 'Technology'),
(2, 'Green Earth Ltd.', 'Environmental'),
(3, 'Global Ventures', 'Investment');
Learnings
Solutions
• - PostgreSQL solution
SELECT CompanyID, CompanyName, SUBSTRING(CompanyName FROM 1 FOR 4) AS NamePrefix
FROM Companies;
• - MySQL solution
SELECT CompanyID, CompanyName, SUBSTRING(CompanyName, 1, 4) AS NamePrefix
FROM Companies;
Note: SUBSTRING(CompanyName, 1, 4) takes a starting position and a length, while
SUBSTRING(CompanyName FROM 1 FOR 4) is the SQL-standard spelling of the same call.
Both PostgreSQL and MySQL accept either form.
• Q.206
Question
Search for a Pattern in a String (LIKE Operator).
Explanation
The task is to find countries with names starting with the word "United" using the
LIKE operator. The LIKE operator is used for pattern matching in strings, where %
represents any sequence of characters.
• - Table creation
CREATE TABLE Countries (
CountryID INT PRIMARY KEY,
CountryName VARCHAR(100),
Continent VARCHAR(50)
);
• - Datasets
INSERT INTO Countries VALUES
(1, 'United States', 'North America'),
(2, 'India', 'Asia'),
(3, 'United Kingdom', 'Europe'),
(4, 'South Korea', 'Asia');
Learnings
• LIKE Operator: This is used for pattern matching in SQL. The % symbol
matches any sequence of characters.
• Pattern Matching: The LIKE operator can be used for more complex
searches, such as finding words that start with, end with, or contain certain
substrings.
Solutions
• - PostgreSQL solution
SELECT CountryID, CountryName
FROM Countries
WHERE CountryName LIKE 'United%';
• - MySQL solution
SELECT CountryID, CountryName
FROM Countries
WHERE CountryName LIKE 'United%';
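Note: The LIKE syntax is the same in both PostgreSQL and MySQL. Case sensitivity
differs: LIKE is case-sensitive in PostgreSQL, while in MySQL it follows the column's
collation and is case-insensitive under the default collations.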
• Q.207
Question
Check for Null or Empty Strings.
Explanation
The task is to identify clients with either NULL or empty strings ('') in the Email
field. This can be done using the IS NULL condition to check for NULL values and a
comparison (= '') to check for empty strings.
• - Table creation
CREATE TABLE Clients (
ClientID INT PRIMARY KEY,
ClientName VARCHAR(100),
Email VARCHAR(150)
);
• - Datasets
INSERT INTO Clients VALUES
(1, 'Jane Doe', '[email protected]'),
(2, 'Peter Parker', ''),
(3, 'Clark Kent', '[email protected]'),
(4, 'Bruce Wayne', NULL);
Learnings
• Empty string check: Use = '' to identify columns that contain empty strings.
• Combined conditions: You can use OR to combine multiple conditions for
filtering.
Solutions
• - PostgreSQL solution
SELECT ClientID, ClientName, Email
FROM Clients
WHERE Email IS NULL OR Email = '';
• - MySQL solution
SELECT ClientID, ClientName, Email
FROM Clients
WHERE Email IS NULL OR Email = '';
Note: Both PostgreSQL and MySQL use the same syntax to check for NULL values
and empty strings.
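A value made up only of spaces passes both checks above even though it holds no usable address. One way to cover that case is NULLIF(TRIM(Email), '') IS NULL, which treats NULL, empty, and whitespace-only values alike. The sketch below tries it with Python's sqlite3 module as a stand-in engine. The example.com addresses and the fifth client are hypothetical rows added for illustration.

```python
import sqlite3

def clients_missing_email():
    """Return names of clients whose Email is NULL, empty, or only spaces."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Clients (
            ClientID INT PRIMARY KEY,
            ClientName VARCHAR(100),
            Email VARCHAR(150)
        );
        INSERT INTO Clients VALUES
            (1, 'Jane Doe', 'jane@example.com'),     -- hypothetical address
            (2, 'Peter Parker', ''),
            (3, 'Clark Kent', 'clark@example.com'),  -- hypothetical address
            (4, 'Bruce Wayne', NULL),
            (5, 'Diana Prince', '   ');              -- hypothetical whitespace-only row
    """)
    rows = conn.execute("""
        SELECT ClientName
        FROM Clients
        WHERE NULLIF(TRIM(Email), '') IS NULL
        ORDER BY ClientID
    """).fetchall()
    conn.close()
    return [name for (name,) in rows]

print(clients_missing_email())
```

TRIM() turns a whitespace-only value into an empty string, and NULLIF() turns the empty string into NULL, so a single IS NULL test covers all three cases.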
• Q.208
Question
Find All Companies with a Specific Suffix (REGEX).
Explanation
The task is to find all startups that have the suffix "Inc." or "LLC" in their name using
regular expressions (REGEXP). The regular expression pattern (Inc\.|LLC)$ will
match any startup name ending with either "Inc." or "LLC". The $ symbol asserts that
the match is at the end of the string.
• - Table creation
CREATE TABLE Startups (
StartupID INT PRIMARY KEY,
StartupName VARCHAR(100),
FoundedYear INT
);
• - Datasets
INSERT INTO Startups VALUES
(1, 'TechFusion Inc.', 2015),
(2, 'GreenPlanet LLC', 2018),
(3, 'EduTech Solutions', 2020),
(4, 'MedicaCorp Ltd.', 2017);
Learnings
• $ symbol: The $ asserts that the match must occur at the end of the string,
ensuring that only companies with the specified suffix are selected.
• Pattern Matching: The pipe | is used to match either "Inc." or "LLC".
Solutions
• - PostgreSQL solution
SELECT StartupID, StartupName
FROM Startups
WHERE StartupName ~ '(Inc\.|LLC)$';
• - MySQL solution
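SELECT StartupID, StartupName
FROM Startups
WHERE StartupName REGEXP '(Inc\\.|LLC)$';
Note: In MySQL, REGEXP is used for pattern matching, while in PostgreSQL, the tilde
~ operator is used to apply regular expressions. With default settings, MySQL treats the
backslash as an escape character inside string literals, so the dot is escaped with a
double backslash (\\.). A single backslash would be dropped, and the dot would then
match any character.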
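The effect of escaping the dot can be checked outside the database. The sketch below uses Python's re module, whose syntax for this particular pattern matches the regular expression shown above. The name 'Zinc Incx' is a hypothetical extra row added to expose the difference.

```python
import re

SUFFIX = re.compile(r"(Inc\.|LLC)$")  # escaped dot: matches a literal "."
LOOSE = re.compile(r"(Inc.|LLC)$")    # unescaped dot: matches any character

NAMES = [
    "TechFusion Inc.",
    "GreenPlanet LLC",
    "EduTech Solutions",
    "MedicaCorp Ltd.",
    "Zinc Incx",  # hypothetical name, not in the book's data set
]

def with_suffix(pattern, candidates):
    """Return the candidates whose end matches the compiled pattern."""
    return [name for name in candidates if pattern.search(name)]

print(with_suffix(SUFFIX, NAMES))
print(with_suffix(LOOSE, NAMES))
```

On the book's four rows both patterns select the same names, which is why an unescaped dot can go unnoticed until a name such as the hypothetical one appears.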
• Q.209
Question
Find the Top 3 Most Frequent Words in a Text Column.
Explanation
This query extracts words from the Content column of the BlogPosts table, then
counts their frequency while excluding common stop words (e.g., "the", "is", "and",
etc.). It uses STRING_TO_ARRAY to split the content into words and UNNEST to flatten
the array into rows. The result is grouped by the word and ordered by frequency,
limiting the result to the top 3 most frequent words.
• - Table creation
CREATE TABLE BlogPosts (
PostID INT PRIMARY KEY,
Title VARCHAR(255),
Content TEXT
);
• - Datasets
INSERT INTO BlogPosts VALUES
(1, 'SQL Best Practices', 'SQL best practices are important for performance. Learn SQL!'),
(2, 'Learn SQL for Data Analysis', 'Learn SQL for data analysis. SQL is essential for data analysts.'),
(3, 'Advanced SQL Techniques', 'Master advanced SQL techniques and improve your skills.'),
(4, 'Introduction to Databases', 'Databases are the backbone of most modern applications. Learn how to manage databases efficiently.'),
(5, 'SQL vs NoSQL', 'SQL and NoSQL databases serve different purposes. SQL is great for structured data, while NoSQL is good for unstructured data.'),
(6, 'Optimizing SQL Queries', 'Optimizing SQL queries is key to improving performance. Learn how to write efficient queries and index your tables.'),
(7, 'Database Design Best Practices', 'Good database design can save time and effort. Learn how to structure your database for performance and scalability.'),
(8, 'What is Data Science?', 'Data science combines statistical analysis and machine learning to extract insights from data. Learn the fundamentals of data science!'),
(9, 'Introduction to Machine Learning', 'Machine learning algorithms can make predictions based on data. Start learning machine learning concepts today!'),
(10, 'Big Data and Cloud Computing', 'Big data and cloud computing are transforming industries. Learn how to leverage big data and cloud technologies for growth.');
Learnings
Solutions
• - PostgreSQL solution
SELECT Word, COUNT(*) AS Frequency
FROM (
SELECT UNNEST(STRING_TO_ARRAY(LOWER(Content), ' ')) AS Word
FROM BlogPosts
) AS Words
WHERE Word NOT IN ('the', 'and', 'is', 'for', 'a', 'an', 'in', 'of', 'to')
GROUP BY Word
ORDER BY Frequency DESC
LIMIT 3;
• - MySQL solution
Note: The MySQL solution is more complex and would require additional handling to
break the text into individual words since MySQL lacks the direct functionality for
array splitting like PostgreSQL.
• Q.210
Question
Extract Country Code from Email Addresses (REGEX).
Explanation
The task is to extract the country code (or domain name) from the email domain using
a regular expression. The REGEXP_SUBSTR() function is used to capture the part of the
email after the "@" symbol, which typically represents the domain of the email
address.
• - Table creation
CREATE TABLE Users (
UserID INT PRIMARY KEY,
Name VARCHAR(100),
Email VARCHAR(150)
);
• - Datasets
INSERT INTO Users VALUES
(1, 'John Doe', '[email protected]'),
(2, 'Alice Smith', '[email protected]'),
(3, 'Bob Johnson', '[email protected]'),
(4, 'Sophia Green', '[email protected]');
Learnings
Solutions
• - PostgreSQL solution
SELECT Name, Email,
REGEXP_SUBSTR(Email, '@([a-zA-Z]+)') AS CountryCode
FROM Users;
• - MySQL solution
SELECT Name, Email,
REGEXP_SUBSTR(Email, '@([a-zA-Z]+)') AS CountryCode
FROM Users;
Note: The REGEXP_SUBSTR() function is available in MySQL 8.0 and later and in
PostgreSQL 15 and later. As written, it returns the whole match, including the "@"
symbol and the letters that follow it, because the parentheses alone do not select the
captured group.
Date Functions
• Q.211
Question:
Convert Date to Text (Date Format Change)
Tables: Orders
-- Create table
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
OrderDate DATE,
CustomerID INT
);
-- Insert data
INSERT INTO Orders VALUES
(1, '2024-01-15', 101),
(2, '2024-02-20', 102),
(3, '2024-03-18', 103),
(4, '2024-04-25', 104),
(5, '2024-05-05', 105),
(6, '2024-06-10', 106),
(7, '2024-07-22', 107),
(8, '2024-08-13', 108),
(9, '2024-09-09', 109),
(10, '2024-10-30', 110);
Explanation:
Convert the OrderDate from a DATE type to a text string with the format "DD-Mon-
YYYY", where DD is the day, Mon is the abbreviated month name, and YYYY is the year.
Learnings:
Solutions:
• PostgreSQL Solution:
SELECT OrderID, TO_CHAR(OrderDate, 'DD-Mon-YYYY') AS OrderDateText
FROM Orders;
• MySQL Solution:
SELECT OrderID, DATE_FORMAT(OrderDate, '%d-%b-%Y') AS OrderDateText
FROM Orders;
• Q.212
Question
-- Datasets
INSERT INTO EmployeeRecords (EmployeeID, JoiningDate, Department)
VALUES
(1, '01-12-2020', 'HR'),
(2, '15-02-2021', 'Finance'),
(3, '28-06-2019', 'Engineering'),
(4, '07-08-2018', 'Sales'),
(5, '20-11-2017', 'Marketing'),
(6, '11-05-2022', 'Product'),
(7, '02-09-2020', 'Operations'),
(8, '18-03-2019', 'Legal');
Learnings
Solutions
• - PostgreSQL solution
SELECT EmployeeID, TO_DATE(JoiningDate, 'DD-MM-YYYY') AS JoiningDate
FROM EmployeeRecords;
• - MySQL solution
SELECT EmployeeID, STR_TO_DATE(JoiningDate, '%d-%m-%Y') AS JoiningDate
FROM EmployeeRecords;
• Q.213
Question
Calculate the end date by adding the LeaveDuration to the LeaveStartDate.
Explanation
You need to calculate the end date of a leave request by adding the LeaveDuration
(in days) to the LeaveStartDate.
-- Datasets
INSERT INTO LeaveRequests (RequestID, EmployeeID, LeaveStartDate, LeaveDuration)
VALUES
(1, 101, '2024-02-01', 5),
(2, 102, '2024-03-05', 3),
(3, 103, '2024-04-10', 7);
Learnings
Solutions
• - PostgreSQL solution
SELECT RequestID, EmployeeID, LeaveStartDate,
LeaveStartDate + INTERVAL '1 day' * LeaveDuration AS LeaveEndDate
FROM LeaveRequests;
• - MySQL solution
SELECT RequestID, EmployeeID, LeaveStartDate,
DATE_ADD(LeaveStartDate, INTERVAL LeaveDuration DAY) AS LeaveEndDate
FROM LeaveRequests;
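The date arithmetic can be cross-checked by hand or with a few lines of Python. The sketch below mirrors the queries with the standard datetime module. The helper leave_end and its inclusive flag are hypothetical: the flag covers the alternative reading in which the start date counts as the first day of leave, which the original queries do not assume.

```python
from datetime import date, timedelta

# Rows from the LeaveRequests data set: (RequestID, EmployeeID, start, duration).
REQUESTS = [
    (1, 101, date(2024, 2, 1), 5),
    (2, 102, date(2024, 3, 5), 3),
    (3, 103, date(2024, 4, 10), 7),
]

def leave_end(start, duration_days, inclusive=False):
    """Add the duration to the start date, as the SQL solutions do.

    With inclusive=True the start date counts as day one, so the
    last day of leave is one day earlier.
    """
    offset = duration_days - 1 if inclusive else duration_days
    return start + timedelta(days=offset)

for request_id, _, start, duration in REQUESTS:
    print(request_id, leave_end(start, duration))
```

Which reading applies is a business rule. The SQL above implements the first one, start date plus duration.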
• Q.214
Question
Extract the year from the TransactionDate.
Explanation
You need to extract the year part from the TransactionDate column.
-- Datasets
INSERT INTO Transactions (TransactionID, TransactionDate, Amount)
VALUES
(1, '2023-11-20', 250.50),
(2, '2022-05-15', 180.75),
(3, '2024-01-10', 540.60);
Learnings
Solutions
• - PostgreSQL solution
SELECT TransactionID, TransactionDate, EXTRACT(YEAR FROM TransactionDate) AS TransactionYear
FROM Transactions;
• - MySQL solution
SELECT TransactionID, TransactionDate, YEAR(TransactionDate) AS TransactionYear
FROM Transactions;
• Q.215
Question
Find the longest streak of consecutive days each user has logged in, based on
their LoginDate.
Explanation
You need to find the longest consecutive streak of days that each user has logged in.
A consecutive streak means there are no gaps (i.e., the difference between consecutive
login dates is 1 day). This requires identifying and grouping consecutive days, then
calculating the longest streak for each user.
-- Datasets
INSERT INTO UserLogins (UserID, LoginDate)
VALUES
(1, '2024-01-01'),
(1, '2024-01-02'),
(1, '2024-01-04'),
(1, '2024-01-05'),
(1, '2024-01-06'),
(2, '2024-01-01'),
(2, '2024-01-02'),
(2, '2024-01-04'),
(2, '2024-01-05'),
(3, '2024-01-03'),
(3, '2024-01-04'),
(3, '2024-01-05'),
(3, '2024-01-06'),
(4, '2024-01-01'),
(4, '2024-01-03'),
(4, '2024-01-04');
Learnings
Solutions
PostgreSQL Solution
-- Subtracting the row number from the date gives the same value for every day in a
-- run of consecutive dates. Assumes LoginDate is a DATE with one row per user per day.
WITH RankedLogins AS (
SELECT UserID, LoginDate,
LoginDate - CAST(ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY LoginDate) AS INT) AS StreakGroup
FROM UserLogins
),
ConsecutiveStreaks AS (
SELECT UserID, MIN(LoginDate) AS StreakStart, MAX(LoginDate) AS StreakEnd,
COUNT(*) AS StreakLength
FROM RankedLogins
GROUP BY UserID, StreakGroup
)
SELECT UserID, MAX(StreakLength) AS LongestStreak
FROM ConsecutiveStreaks
GROUP BY UserID;
MySQL Solution
-- Requires MySQL 8.0 or later (window functions). Assumes LoginDate is a DATE with one
-- row per user per day.
WITH RankedLogins AS (
SELECT UserID, LoginDate,
DATE_SUB(LoginDate, INTERVAL ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY LoginDate) DAY) AS StreakGroup
FROM UserLogins
),
ConsecutiveStreaks AS (
SELECT UserID, MIN(LoginDate) AS StreakStart, MAX(LoginDate) AS StreakEnd,
COUNT(*) AS StreakLength
FROM RankedLogins
GROUP BY UserID, StreakGroup
)
SELECT UserID, MAX(StreakLength) AS LongestStreak
FROM ConsecutiveStreaks
GROUP BY UserID;
3. Step 3: Finally, we select the longest streak for each user by finding the
maximum streak length (MAX(StreakLength)).
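The grouping idea behind this problem is often called "gaps and islands": within one user, a login date minus its row number stays constant across a run of consecutive days and changes after every gap. The sketch below applies it to the Q.215 rows with Python's sqlite3 module as a stand-in engine. The two-column UserLogins schema and the use of SQLite's julianday() to turn dates into day numbers are assumptions of this illustration.

```python
import sqlite3

LOGINS = [
    (1, "2024-01-01"), (1, "2024-01-02"), (1, "2024-01-04"),
    (1, "2024-01-05"), (1, "2024-01-06"),
    (2, "2024-01-01"), (2, "2024-01-02"), (2, "2024-01-04"), (2, "2024-01-05"),
    (3, "2024-01-03"), (3, "2024-01-04"), (3, "2024-01-05"), (3, "2024-01-06"),
    (4, "2024-01-01"), (4, "2024-01-03"), (4, "2024-01-04"),
]

QUERY = """
WITH RankedLogins AS (
    SELECT UserID, LoginDate,
           -- day number minus row number is constant within a consecutive run
           CAST(julianday(LoginDate) AS INTEGER)
             - ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY LoginDate) AS StreakGroup
    FROM UserLogins
),
ConsecutiveStreaks AS (
    SELECT UserID, COUNT(*) AS StreakLength
    FROM RankedLogins
    GROUP BY UserID, StreakGroup
)
SELECT UserID, MAX(StreakLength) AS LongestStreak
FROM ConsecutiveStreaks
GROUP BY UserID
ORDER BY UserID;
"""

def longest_streaks():
    """Return (UserID, LongestStreak) pairs for the sample logins."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE UserLogins (UserID INT, LoginDate DATE)")
    conn.executemany("INSERT INTO UserLogins VALUES (?, ?)", LOGINS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows

print(longest_streaks())
```

For user 1 the dates fall into two runs (1 to 2 January and 4 to 6 January), so the longest streak is 3. Duplicate logins on the same day would break the arithmetic and would need to be removed first.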
• Q.216
Question
Find the total sales for each day of the week.
Explanation
You need to extract the day of the week from the TransactionDate and calculate the
total sales for each day using the Amount column.
-- Datasets
INSERT INTO Transactions (TransactionID, TransactionDate, Amount)
VALUES
(1, '2023-11-20', 250.50),
(2, '2022-05-15', 180.75),
(3, '2024-01-10', 540.60),
(4, '2023-11-21', 320.40),
(5, '2023-11-22', 150.20),
(6, '2023-11-20', 430.30),
(7, '2023-11-23', 210.10),
(8, '2023-11-24', 300.00),
(9, '2023-11-25', 150.00),
(10, '2023-11-26', 500.00),
(11, '2023-11-27', 410.25),
(12, '2023-11-28', 100.40),
(13, '2023-11-29', 750.90),
(14, '2023-11-30', 600.75),
(15, '2023-12-01', 230.15),
(16, '2023-12-02', 185.20),
(17, '2023-12-03', 420.60),
(18, '2023-12-04', 520.45),
(19, '2023-12-05', 310.10),
(20, '2023-12-06', 450.25),
(21, '2023-12-07', 650.80),
(22, '2023-12-08', 370.50),
(23, '2023-12-09', 330.30),
(24, '2023-12-10', 490.40),
(25, '2023-12-11', 210.75),
(26, '2023-12-12', 320.10);
Learnings
• Using date functions to extract the day of the week (e.g., DAYOFWEEK(),
EXTRACT())
• Aggregating data using SUM() to calculate total sales
• Grouping results by day of the week
Solutions
• - PostgreSQL solution
SELECT EXTRACT(DOW FROM TransactionDate) AS DayOfWeek,
SUM(Amount) AS TotalSales
FROM Transactions
GROUP BY EXTRACT(DOW FROM TransactionDate)
ORDER BY DayOfWeek;
• - MySQL solution
SELECT DAYOFWEEK(TransactionDate) AS DayOfWeek,
SUM(Amount) AS TotalSales
FROM Transactions
GROUP BY DAYOFWEEK(TransactionDate)
ORDER BY DayOfWeek;
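The two solutions number the days differently: PostgreSQL's EXTRACT(DOW ...) returns 0 (Sunday) through 6 (Saturday), while MySQL's DAYOFWEEK() returns 1 (Sunday) through 7 (Saturday). If readable day names are preferred, a minimal MySQL sketch could look like the following. The DayName alias is an illustrative choice, not part of the original solution.

```sql
-- MySQL sketch: total sales per weekday, labelled by name.
-- DAYOFWEEK() stays in GROUP BY / ORDER BY so rows sort Sunday..Saturday
-- instead of alphabetically by name.
SELECT DAYNAME(TransactionDate) AS DayName,
       SUM(Amount) AS TotalSales
FROM Transactions
GROUP BY DAYOFWEEK(TransactionDate), DAYNAME(TransactionDate)
ORDER BY DAYOFWEEK(TransactionDate);
```

In PostgreSQL, TO_CHAR(TransactionDate, 'Day') is one way to obtain a comparable label.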
• Q.217
Question
Calculate the total ride duration for each ride, considering rides that span across
midnight (e.g., from 11:58 PM to 12:15 AM the next day) as well as rides that end on
the same day.
Explanation
You need to calculate the duration between RideStartTime and RideEndTime for every ride.
Because both columns store full timestamps (date and time), subtracting them handles rides
that cross midnight the same way as rides that end on the same day; no special case is needed.
-- Datasets
INSERT INTO UberRides (RideID, RideStartTime, RideEndTime)
VALUES
(1, '2024-01-01 23:58:00', '2024-01-02 00:15:00'),
(2, '2024-01-02 14:30:00', '2024-01-02 15:10:00'),
(3, '2024-01-03 07:45:00', '2024-01-03 08:25:00'),
(4, '2024-01-03 22:10:00', '2024-01-04 00:05:00'),
(5, '2024-01-04 12:00:00', '2024-01-04 12:45:00'),
(6, '2024-01-02 08:00:00', '2024-01-02 08:45:00'),
(7, '2024-01-02 09:15:00', '2024-01-02 10:00:00'),
(8, '2024-01-02 11:30:00', '2024-01-02 12:15:00'),
(9, '2024-01-02 13:00:00', '2024-01-02 13:50:00'),
(10, '2024-01-03 06:30:00', '2024-01-03 07:00:00'),
(11, '2024-01-03 09:00:00', '2024-01-03 09:30:00'),
(12, '2024-01-03 11:45:00', '2024-01-03 12:30:00'),
(13, '2024-01-03 15:00:00', '2024-01-03 15:40:00'),
(14, '2024-01-03 16:10:00', '2024-01-03 16:55:00'),
(15, '2024-01-03 17:00:00', '2024-01-03 17:30:00'),
(16, '2024-01-03 18:15:00', '2024-01-03 19:00:00'),
(17, '2024-01-03 20:00:00', '2024-01-03 20:45:00'),
(18, '2024-01-03 21:30:00', '2024-01-03 22:15:00'),
(19, '2024-01-03 23:00:00', '2024-01-03 23:40:00'),
(20, '2024-01-04 00:10:00', '2024-01-04 00:40:00'),
(21, '2024-01-04 02:00:00', '2024-01-04 02:45:00'),
(22, '2024-01-04 03:00:00', '2024-01-04 03:35:00'),
(23, '2024-01-04 05:10:00', '2024-01-04 05:50:00'),
(24, '2024-01-04 06:30:00', '2024-01-04 07:15:00'),
(25, '2024-01-04 08:00:00', '2024-01-04 08:50:00');
Learnings
Solutions
• - PostgreSQL solution
SELECT RideID,
RideStartTime,
RideEndTime,
EXTRACT(EPOCH FROM (RideEndTime - RideStartTime)) / 60 AS RideDurationInMinutes
FROM UberRides;
• - MySQL solution
SELECT RideID,
RideStartTime,
RideEndTime,
TIMESTAMPDIFF(MINUTE, RideStartTime, RideEndTime) AS RideDurationInMinutes
FROM UberRides;
• Q.218
Question
Calculate the difference in months between the first and last purchase dates for
each customer.
Explanation
You need to calculate the number of months between each customer's earliest purchase
(MIN(PurchaseDate)) and latest purchase (MAX(PurchaseDate)). Make sure the calculation
still works when the two dates fall in different years.
-- Datasets
INSERT INTO CustomerPurchases (CustomerID, PurchaseID, PurchaseDate)
VALUES
(1, 101, '2023-03-15'),
(1, 102, '2023-08-20'),
(1, 103, '2024-01-10'),
(2, 104, '2022-02-05'),
(2, 105, '2023-07-15'),
(2, 106, '2024-04-22'),
(3, 107, '2023-06-11'),
(3, 108, '2023-09-25');
Learnings
Solutions
• - PostgreSQL solution
SELECT CustomerID,
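DATE_PART('year', MAX(PurchaseDate)) - DATE_PART('year', MIN(PurchaseDate)) AS YearDiff,
DATE_PART('month', MAX(PurchaseDate)) - DATE_PART('month', MIN(PurchaseDate)) AS MonthDiff,
-- AGE() returns an interval; convert its years and months into whole months
DATE_PART('year', AGE(MAX(PurchaseDate), MIN(PurchaseDate))) * 12
+ DATE_PART('month', AGE(MAX(PurchaseDate), MIN(PurchaseDate))) AS MonthsBetween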
FROM CustomerPurchases
GROUP BY CustomerID;
• - MySQL solution
SELECT CustomerID,
TIMESTAMPDIFF(MONTH, MIN(PurchaseDate), MAX(PurchaseDate)) AS MonthsBetween
FROM CustomerPurchases
GROUP BY CustomerID;
• Q.219
Question
Find the date of the last Friday of each month for the last 12 months.
Explanation
For each of the last 12 months, find the date of the last Friday. This involves
calculating the last day of the month and then adjusting backward to the previous
Friday if necessary.
Learnings
Solutions
• - PostgreSQL solution
SELECT
• - MySQL solution
SELECT
LAST_DAY(CURRENT_DATE - INTERVAL n MONTH) AS LastDayOfMonth,
DATE_SUB(LAST_DAY(CURRENT_DATE - INTERVAL n MONTH),
INTERVAL (DAYOFWEEK(LAST_DAY(CURRENT_DATE - INTERVAL n MONTH)) + 1) % 7 DAY) AS LastFriday
FROM (SELECT 0 AS n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7
UNION SELECT 8 UNION SELECT 9 UNION SELECT 10 UNION SELECT 11) AS months;
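For PostgreSQL, the same idea (take the last day of each month, then step back to the nearest Friday) might be sketched as below. This is an illustrative sketch rather than the book's own solution: it assumes "the last 12 months" includes the current month, as in the MySQL query above, and the names month_end and months are hypothetical.

```sql
-- PostgreSQL sketch: last Friday of each of the last 12 months.
-- EXTRACT(DOW ...) returns 0 = Sunday .. 5 = Friday, 6 = Saturday,
-- so (DOW + 2) % 7 is the number of days to step back to reach a Friday.
SELECT month_end AS LastDayOfMonth,
       month_end - ((EXTRACT(DOW FROM month_end)::int + 2) % 7) AS LastFriday
FROM (
    -- First day of the month n months ago, plus one month, minus one day
    SELECT (date_trunc('month', CURRENT_DATE::timestamp)
            - n * INTERVAL '1 month'
            + INTERVAL '1 month'
            - INTERVAL '1 day')::date AS month_end
    FROM generate_series(0, 11) AS n
) AS months
ORDER BY month_end DESC;
```

Subtracting an integer from a DATE in PostgreSQL yields a DATE, which keeps both output columns free of a time component.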
• Q.220
Question
Find the number of days since each user last logged in.
Explanation
You need to find the number of days since each user last logged in, based on the
LastLoginDate column and measured from the system's current date.
-- Datasets
INSERT INTO UserLogins (UserID, LastLoginDate)
VALUES
(1, '2023-12-01'),
(2, '2024-01-05'),
(3, '2023-10-15'),
(4, '2023-11-23'),
(5, '2024-01-01');
Learnings
Solutions
• - PostgreSQL solution
SELECT UserID,
CURRENT_DATE - LastLoginDate AS DaysSinceLastLogin
FROM UserLogins;
• - MySQL solution
SELECT UserID,
DATEDIFF(CURRENT_DATE, LastLoginDate) AS DaysSinceLastLogin
FROM UserLogins;
Case Statements
• Q.221
Question
Use a CASE statement to categorize customers based on their TotalSpent.
Explanation
This question asks you to categorize customers into three spending categories: 'Low
Spender', 'Medium Spender', and 'High Spender', based on the TotalSpent value. Use
a CASE statement to assign these categories.
• - Table creation
CREATE TABLE CustomerPurchases (
CustomerID INT PRIMARY KEY,
TotalSpent DECIMAL(10, 2)
);
• - Datasets
INSERT INTO CustomerPurchases VALUES
(1, 350.00),
(2, 1500.00),
(3, 75.00),
(4, 230.00),
(5, 1200.00),
(6, 450.00),
(7, 25.00),
(8, 890.00),
(9, 1020.00),
(10, 300.00),
(11, 150.00),
(12, 600.00),
(13, 750.00),
(14, 50.00),
(15, 10.00);
Learnings
Solutions
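• - PostgreSQL solution
SELECT CustomerID, TotalSpent,
CASE
WHEN TotalSpent < 100 THEN 'Low Spender'
WHEN TotalSpent BETWEEN 100 AND 500 THEN 'Medium Spender'
ELSE 'High Spender'
END AS SpendingCategory
FROM CustomerPurchases;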
• - MySQL solution
SELECT CustomerID, TotalSpent,
CASE
WHEN TotalSpent < 100 THEN 'Low Spender'
WHEN TotalSpent BETWEEN 100 AND 500 THEN 'Medium Spender'
ELSE 'High Spender'
END AS SpendingCategory
FROM CustomerPurchases;
• Q.222
Question
Use a CASE statement to assign a salary grade based on Salary.
Explanation
This question asks you to categorize employees into salary grades: 'Grade A', 'Grade
B', and 'Grade C', based on their salary. Use a CASE statement to assign the grades
accordingly.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
EmployeeName VARCHAR(100),
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Employees VALUES
(1, 'John Doe', 120000),
(2, 'Alice Smith', 90000),
(3, 'Bob Brown', 60000),
(4, 'Charlie Davis', 135000),
(5, 'Eve Harris', 75000),
(6, 'Frank Black', 110000),
(7, 'Grace White', 85000),
(8, 'Helen Green', 95000),
(9, 'Igor King', 65000),
(10, 'Jackie Lewis', 140000),
(11, 'Kevin Moore', 115000),
(12, 'Liam Wilson', 80000),
(13, 'Mona Clark', 90000),
(14, 'Nancy Adams', 70000),
(15, 'Oscar Scott', 105000);
Learnings
Solutions
• - PostgreSQL solution
SELECT EmployeeID, EmployeeName, Salary,
CASE
WHEN Salary > 100000 THEN 'Grade A'
WHEN Salary BETWEEN 70000 AND 100000 THEN 'Grade B'
ELSE 'Grade C'
END AS SalaryGrade
FROM Employees;
• - MySQL solution
SELECT EmployeeID, EmployeeName, Salary,
CASE
WHEN Salary > 100000 THEN 'Grade A'
WHEN Salary BETWEEN 70000 AND 100000 THEN 'Grade B'
ELSE 'Grade C'
END AS SalaryGrade
FROM Employees;
• Q.223
Question
Use a CASE statement to determine the delivery status of each order.
Explanation
This question asks you to classify the delivery status of orders into three categories:
'Pending', 'Delivered On Time', and 'Late Delivery', based on whether the
DeliveryDate is NULL or falls before or after the OrderDate. Use a CASE statement to
categorize each order accordingly.
• - Table creation
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE,
DeliveryDate DATE
);
• - Datasets
INSERT INTO Orders VALUES
(1, 101, '2024-01-10', '2024-01-15'),
(2, 102, '2024-02-05', NULL),
(3, 103, '2024-01-15', '2024-01-20'),
(4, 104, '2024-02-01', '2024-02-03'),
Learnings
Solutions
• - PostgreSQL solution
SELECT OrderID, CustomerID, OrderDate, DeliveryDate,
CASE
WHEN DeliveryDate IS NULL THEN 'Pending'
WHEN DeliveryDate > OrderDate THEN 'Delivered On Time'
ELSE 'Late Delivery'
END AS DeliveryStatus
FROM Orders;
• - MySQL solution
SELECT OrderID, CustomerID, OrderDate, DeliveryDate,
CASE
WHEN DeliveryDate IS NULL THEN 'Pending'
WHEN DeliveryDate > OrderDate THEN 'Delivered On Time'
ELSE 'Late Delivery'
END AS DeliveryStatus
FROM Orders;
• Q.224
Question
Use a CASE statement to categorize products based on their Rating.
Explanation
This question requires you to categorize products based on their Rating. The
categories are 'Excellent' for ratings 4.5 and above, 'Good' for ratings between 3.5 and
4.4, and 'Average' for ratings below 3.5. Use a CASE statement to assign the
appropriate category.
• - Table creation
• - Datasets
INSERT INTO ProductReviews VALUES
(101, 4.7),
(102, 3.5),
(103, 2.9),
(104, 4.0),
(105, 4.8),
(106, 3.2),
(107, 4.4),
(108, 3.8),
(109, 4.1),
(110, 2.5),
(111, 3.6),
(112, 4.9),
(113, 3.0),
(114, 2.0),
(115, 4.2);
Learnings
Solutions
• - PostgreSQL solution
SELECT ProductID, Rating,
CASE
WHEN Rating >= 4.5 THEN 'Excellent'
WHEN Rating BETWEEN 3.5 AND 4.4 THEN 'Good'
ELSE 'Average'
END AS RatingCategory
FROM ProductReviews;
• - MySQL solution
SELECT ProductID, Rating,
CASE
WHEN Rating >= 4.5 THEN 'Excellent'
WHEN Rating BETWEEN 3.5 AND 4.4 THEN 'Good'
ELSE 'Average'
END AS RatingCategory
FROM ProductReviews;
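One thing to watch with BETWEEN 3.5 AND 4.4: if Rating can hold more than one decimal place (the table definition is not shown here, so this is an assumption), a value such as 4.45 matches neither of the first two branches and falls through to 'Average'. Because CASE evaluates its branches in order, a gap-free variant could be sketched like this:

```sql
-- Works in both PostgreSQL and MySQL.
-- Each branch only sees rows that the branches above it rejected,
-- so ">= 3.5" here effectively means "3.5 up to, but not including, 4.5".
SELECT ProductID, Rating,
       CASE
           WHEN Rating >= 4.5 THEN 'Excellent'
           WHEN Rating >= 3.5 THEN 'Good'
           ELSE 'Average'
       END AS RatingCategory
FROM ProductReviews;
```

For the one-decimal sample data above, this returns the same categories as the BETWEEN version.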
• Q.225
Question
Use a CASE statement to mark high-value transactions.
Explanation
This question asks you to categorize transactions as either 'High Value' or 'Standard
Value' based on the TransactionAmount. If the transaction amount is greater than
1000, mark it as 'High Value'; otherwise, it should be 'Standard Value'. Use a CASE
statement for this categorization.
• - Table creation
CREATE TABLE CryptoTransactions (
TransactionID INT PRIMARY KEY,
TransactionDate DATE,
TransactionAmount DECIMAL(10, 2),
TransactionType VARCHAR(50)
);
• - Datasets
INSERT INTO CryptoTransactions VALUES
(1, '2024-01-10', 1000, 'Deposit'),
(2, '2024-02-05', 500, 'Withdrawal'),
(3, '2024-02-10', 1500, 'Deposit'),
(4, '2024-02-15', 300, 'Deposit'),
(5, '2024-02-20', 100, 'Withdrawal'),
(6, '2024-02-25', 800, 'Deposit'),
(7, '2024-03-01', 1200, 'Withdrawal'),
(8, '2024-03-05', 500, 'Deposit'),
(9, '2024-03-10', 200, 'Withdrawal'),
(10, '2024-03-15', 1500, 'Deposit'),
(11, '2024-03-20', 900, 'Withdrawal'),
(12, '2024-03-25', 1300, 'Deposit'),
(13, '2024-03-30', 700, 'Withdrawal'),
(14, '2024-04-01', 2500, 'Deposit'),
(15, '2024-04-05', 1000, 'Withdrawal');
Learnings
Solutions
• - PostgreSQL solution
SELECT TransactionID, TransactionDate, TransactionAmount, TransactionType,
CASE
WHEN TransactionAmount > 1000 THEN 'High Value'
ELSE 'Standard Value'
END AS TransactionStatus
FROM CryptoTransactions;
• - MySQL solution
SELECT TransactionID, TransactionDate, TransactionAmount, TransactionType,
CASE
WHEN TransactionAmount > 1000 THEN 'High Value'
ELSE 'Standard Value'
END AS TransactionStatus
FROM CryptoTransactions;
• Q.226
Question
Use a CASE statement to mark products as Available or Out of Stock.
Explanation
This question requires you to determine the availability of products based on their
StockQuantity. If the stock quantity is greater than 0, mark the product as
'Available'; otherwise, mark it as 'Out of Stock'. A CASE statement is used for this
categorization.
• - Table creation
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
ProductName VARCHAR(100),
StockQuantity INT
);
• - Datasets
INSERT INTO Products VALUES
(101, 'T-shirt', 50),
(102, 'Jeans', 0),
(103, 'Jacket', 5),
(104, 'Hat', 30),
(105, 'Scarf', 0),
(106, 'Socks', 15),
(107, 'Sweater', 10),
(108, 'Shoes', 20),
(109, 'Gloves', 0),
(110, 'Coat', 8),
(111, 'Dress', 25),
(112, 'Skirt', 12),
(113, 'Blouse', 40),
(114, 'Belt', 5),
(115, 'Trousers', 0);
Learnings
Solutions
• - PostgreSQL solution
SELECT ProductID, ProductName, StockQuantity,
CASE
WHEN StockQuantity > 0 THEN 'Available'
ELSE 'Out of Stock'
END AS Availability
FROM Products;
• - MySQL solution
SELECT ProductID, ProductName, StockQuantity,
CASE
WHEN StockQuantity > 0 THEN 'Available'
ELSE 'Out of Stock'
END AS Availability
FROM Products;
• Q.227
Question
Use a CASE statement to categorize customers by their age.
Explanation
This question asks you to categorize customers based on their Age into three groups:
'Young' for ages less than 35, 'Middle-aged' for ages between 35 and 55, and 'Senior'
for ages 56 and above. A CASE statement is used to assign the appropriate age
category.
• - Table creation
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(100),
Age INT
);
• - Datasets
INSERT INTO Customers VALUES
(1, 'John Doe', 30),
(2, 'Alice Smith', 65),
(3, 'Bob Brown', 45),
(4, 'Charlie Davis', 25),
(5, 'Eve Harris', 55),
(6, 'Frank Black', 40),
(7, 'Grace White', 20),
(8, 'Helen Green', 33),
(9, 'Igor King', 60),
(10, 'Jackie Lewis', 28),
(11, 'Kevin Moore', 50),
(12, 'Liam Wilson', 60),
(13, 'Mona Clark', 35),
(14, 'Nancy Adams', 70),
(15, 'Oscar Scott', 41);
Learnings
Solutions
• - PostgreSQL solution
SELECT CustomerID, CustomerName, Age,
CASE
WHEN Age < 35 THEN 'Young'
WHEN Age BETWEEN 35 AND 55 THEN 'Middle-aged'
ELSE 'Senior'
END AS AgeCategory
FROM Customers;
• - MySQL solution
SELECT CustomerID, CustomerName, Age,
CASE
WHEN Age < 35 THEN 'Young'
WHEN Age BETWEEN 35 AND 55 THEN 'Middle-aged'
ELSE 'Senior'
END AS AgeCategory
FROM Customers;
• Q.228
Question
Sales Performance Categorization
Given a table of sales transactions, categorize sales performance based on the total
amount of sales for each employee. If an employee’s total sales exceed $100,000,
classify them as 'Top Performer'. If the total sales are between $50,000 and $100,000,
classify them as 'Average Performer'. If the total sales are below $50,000, classify
them as 'Low Performer'.
Explanation
You need to group the sales transactions by employee, calculate the total sales per
employee, and then categorize them based on their sales performance.
• - Table creation
CREATE TABLE SalesTransactions (
EmployeeID INT,
TransactionAmount DECIMAL(10, 2)
);
• - Datasets
Learnings
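A minimal sketch of one way to approach this, assuming the SalesTransactions table defined above. The column aliases TotalSales and PerformanceCategory are illustrative, and totals of exactly $50,000 or $100,000 are treated as 'Average Performer', following the "between $50,000 and $100,000" wording of the question.

```sql
-- Works in both PostgreSQL and MySQL.
-- Aggregate per employee first, then classify the aggregated total.
SELECT EmployeeID,
       SUM(TransactionAmount) AS TotalSales,
       CASE
           WHEN SUM(TransactionAmount) > 100000 THEN 'Top Performer'
           WHEN SUM(TransactionAmount) >= 50000 THEN 'Average Performer'
           ELSE 'Low Performer'
       END AS PerformanceCategory
FROM SalesTransactions
GROUP BY EmployeeID;
```

The CASE expression is applied to the aggregate, not to individual rows, which is why SUM() appears inside each WHEN condition.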
• Q.229
Question
Product Pricing Strategy
Given a table of product prices, identify the pricing strategy for each product. If the
price is greater than $100, categorize it as 'Premium'. If the price is between $50 and
$100, categorize it as 'Standard'. If the price is below $50, categorize it as 'Discount'.
Explanation
You need to categorize products based on their prices.
• - Table creation
CREATE TABLE Products (
ProductID INT,
ProductName VARCHAR(100),
Price DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Products VALUES
(1, 'Smartphone', 150),
Learnings
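A minimal sketch of one possible query, assuming the Products table defined above. The PricingStrategy alias is illustrative, and prices of exactly $50 or $100 are treated as 'Standard', following the "between $50 and $100" wording of the question.

```sql
-- Works in both PostgreSQL and MySQL.
-- Branches are checked top to bottom, so ">= 50" only sees prices of 100 or less.
SELECT ProductID, ProductName, Price,
       CASE
           WHEN Price > 100 THEN 'Premium'
           WHEN Price >= 50 THEN 'Standard'
           ELSE 'Discount'
       END AS PricingStrategy
FROM Products;
```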
• Q.230
Question
Employee Tenure and Salary Adjustment
For each employee, determine if they are eligible for a salary increase based on their
years of service. If an employee has been with the company for 5 years or more, they
are eligible for a '10% Increase'. If they have been with the company for less than 5
years but more than 2 years, they are eligible for a '5% Increase'. Employees with less
than 2 years are not eligible for any increase.
Explanation
You need to calculate the years of service for each employee and apply different
salary increases based on their tenure. The date of hire is provided, and you need to
compute the years of service by comparing the current date with the hire date.
• - Table creation
CREATE TABLE Employees (
EmployeeID INT,
EmployeeName VARCHAR(100),
HireDate DATE,
Salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO Employees VALUES
Learnings
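A minimal sketch of one possible approach, written for MySQL and assuming the Employees table defined above. Two points are assumptions rather than part of the question: the label 'No Increase' for ineligible employees, and treating anyone with at least two completed years of service as eligible for the 5% tier (the question does not say what happens at exactly two years).

```sql
-- MySQL sketch. In PostgreSQL, one equivalent for the years-of-service
-- expression is EXTRACT(YEAR FROM AGE(CURRENT_DATE, HireDate)).
-- TIMESTAMPDIFF(YEAR, ...) counts completed years only.
SELECT EmployeeID, EmployeeName, HireDate, Salary,
       TIMESTAMPDIFF(YEAR, HireDate, CURRENT_DATE) AS YearsOfService,
       CASE
           WHEN TIMESTAMPDIFF(YEAR, HireDate, CURRENT_DATE) >= 5 THEN '10% Increase'
           WHEN TIMESTAMPDIFF(YEAR, HireDate, CURRENT_DATE) >= 2 THEN '5% Increase'
           ELSE 'No Increase'
       END AS SalaryAdjustment
FROM Employees;
```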
• Q.231
• - Table creation
CREATE TABLE WebsiteFeedback (
CustomerID INT,
Feedback TEXT
);
• - Datasets
• - Table creation
CREATE TABLE StoreFeedback (
CustomerID INT,
Feedback TEXT
);
• - Datasets
INSERT INTO StoreFeedback VALUES
(1, 'Great website!'),
(5, 'Friendly staff'),
(6, 'Store was clean and organized'),
(4, 'Excellent service');
Learnings
• Q.232
Question
You are given two product catalogs: CatalogA and CatalogB. Each table contains
columns ProductID and ProductName. Write a query to find the products that appear
in both catalogs (common products), and also find the products that are in either one
but not the other (unique products). Use appropriate set operations to achieve this.
Explanation
This question requires you to:
• - Table creation
• - Datasets
INSERT INTO CatalogA VALUES
(101, 'Laptop'),
(102, 'Smartphone'),
(103, 'Tablet'),
(104, 'Monitor');
• - Table creation
CREATE TABLE CatalogB (
ProductID INT,
ProductName VARCHAR(100)
);
• - Datasets
INSERT INTO CatalogB VALUES
(102, 'Smartphone'),
(103, 'Tablet'),
(105, 'Keyboard'),
(106, 'Mouse');
Learnings
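-- Find unique products (items in either CatalogA or CatalogB but not both)
-- Parentheses are required: UNION and EXCEPT share the same precedence and are
-- evaluated left to right, so without them the A-only products would be lost.
(SELECT ProductID, ProductName
FROM CatalogA
EXCEPT
SELECT ProductID, ProductName
FROM CatalogB)
UNION
(SELECT ProductID, ProductName
FROM CatalogB
EXCEPT
SELECT ProductID, ProductName
FROM CatalogA);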
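The question also asks for the products that appear in both catalogs. A minimal sketch using INTERSECT, assuming CatalogA has the same ProductID and ProductName columns as CatalogB (as the question states):

```sql
-- Products common to both catalogs.
-- PostgreSQL supports INTERSECT; in MySQL it is available from version 8.0.31.
SELECT ProductID, ProductName
FROM CatalogA
INTERSECT
SELECT ProductID, ProductName
FROM CatalogB;
```

With the sample data above, this returns the Smartphone (102) and Tablet (103) rows.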
• Q.233
Question
1. Find students who are enrolled in either the Math or Science course (or both).
2. Find students who are only enrolled in one course (i.e., students enrolled only
in Math or only in Science, but not both).
Explanation
This involves:
• - Table creation
CREATE TABLE MathEnrollment (
StudentID INT,
CourseName VARCHAR(100)
);
• - Datasets
INSERT INTO MathEnrollment VALUES
(1, 'Math'),
(2, 'Math'),
(3, 'Math'),
(4, 'Math');
• - Table creation
CREATE TABLE ScienceEnrollment (
StudentID INT,
CourseName VARCHAR(100)
);
• - Datasets
INSERT INTO ScienceEnrollment VALUES
(2, 'Science'),
(3, 'Science'),
(5, 'Science');
Learnings
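-- Find students enrolled only in one course (Math or Science but not both)
-- Compare on StudentID only: CourseName differs between the two tables, so
-- including it would stop EXCEPT from ever matching a student in both courses.
-- Parentheses make each EXCEPT run before the UNION.
(SELECT StudentID
FROM MathEnrollment
EXCEPT
SELECT StudentID
FROM ScienceEnrollment)
UNION
(SELECT StudentID
FROM ScienceEnrollment
EXCEPT
SELECT StudentID
FROM MathEnrollment);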
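For the first part of the question (students enrolled in either course, or both), a minimal sketch using UNION on StudentID could look like this:

```sql
-- UNION removes duplicates, so a student enrolled in both courses appears once.
SELECT StudentID
FROM MathEnrollment
UNION
SELECT StudentID
FROM ScienceEnrollment
ORDER BY StudentID;
```

With the sample data above, this returns StudentID 1 through 5.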
• Q.234
Question
You are given two tables, CarModels2023 and CarModels2024, which contain the list
of Tesla car models for the years 2023 and 2024. Both tables have columns ModelID
and ModelName. Write a query to get a unified list of all Tesla car models from both
years, ensuring that duplicates are removed.
Explanation
Use UNION to merge the car models from both years and remove duplicates.
• - Table creation
CREATE TABLE CarModels2023 (
ModelID INT,
ModelName VARCHAR(100)
);
• - Datasets
INSERT INTO CarModels2023 VALUES
(1, 'Model S'),
(2, 'Model 3'),
(3, 'Model X'),
(4, 'Model Y');
• - Table creation
CREATE TABLE CarModels2024 (
ModelID INT,
ModelName VARCHAR(100)
);
• - Datasets
INSERT INTO CarModels2024 VALUES
(1, 'Model S'),
(2, 'Model 3'),
(5, 'Cybertruck'),
(6, 'Roadster');
Learnings
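A minimal sketch of the UNION approach described in the explanation, using the two tables defined above:

```sql
-- Works in both PostgreSQL and MySQL.
-- UNION (without ALL) removes rows that are identical in both tables.
SELECT ModelID, ModelName
FROM CarModels2023
UNION
SELECT ModelID, ModelName
FROM CarModels2024
ORDER BY ModelID;
```

With the sample data, 'Model S' and 'Model 3' appear in both years but are listed once each, giving six rows in total. Note that UNION compares whole rows, so a model would still appear twice if its ModelID differed between the two tables.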
• Q.235
Question
You are given two tables, ElectricTeslaModels and StandardTeslaModels, that
contain lists of Tesla car models. The ElectricTeslaModels table lists models that
are electric, and the StandardTeslaModels table lists standard (non-electric) Tesla
models. Write a query to get a list of all Tesla models, including electric and non-
electric, ensuring that duplicate entries are included when a model appears in both
tables.
Explanation
Use UNION ALL to include all models from both tables, even if some models appear in
both tables.
• - Table creation
CREATE TABLE ElectricTeslaModels (
ModelID INT,
ModelName VARCHAR(100)
);
• - Datasets
INSERT INTO ElectricTeslaModels VALUES
(1, 'Model S'),
(2, 'Model 3'),
(3, 'Model X'),
(4, 'Model Y');
• - Table creation
• - Datasets
INSERT INTO StandardTeslaModels VALUES
(2, 'Model 3'),
(5, 'Cybertruck'),
(6, 'Roadster');
Learnings
• Using UNION ALL to combine all records from different tables, including
duplicates.
• Merging different data sets while preserving duplicate entries.
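A minimal sketch of the UNION ALL approach, assuming StandardTeslaModels has the same ModelID and ModelName columns as ElectricTeslaModels (its table definition is not shown here):

```sql
-- Works in both PostgreSQL and MySQL.
-- UNION ALL keeps every row from both inputs, including duplicates.
SELECT ModelID, ModelName
FROM ElectricTeslaModels
UNION ALL
SELECT ModelID, ModelName
FROM StandardTeslaModels;
```

With the sample data, this returns seven rows, and 'Model 3' appears twice because it is present in both tables.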
Learnings
• Using INTERSECT to find common records.
• Using EXCEPT to find unique records between tables.
(2, 201),
(3, 202),
(5, 203),
(6, 204);
Learnings
• Using EXCEPT to find records in one table but not the other.
• Filtering unique customers based on their purchase categories.
Learnings
• Using INTERSECT to find common items purchased by the same customer across two
different categories.
• Identifying customers who purchase across multiple product categories.
Learnings
• Using UNION ALL to include all records, even duplicates.
• Combining data from different product categories.
-- Combine customers who bought from either Fashion or Beauty, including duplicates
SELECT CustomerID
FROM FashionPurchases
UNION ALL
SELECT CustomerID
FROM BeautyPurchases;
• Q.240
Question
You are given three tables:
• ElectronicsPurchases: Contains CustomerID, ProductID, and PurchaseDate for
electronics purchases.
• ClothingPurchases: Contains CustomerID, ProductID, and PurchaseDate for clothing
purchases.
• GroceryPurchases: Contains CustomerID, ProductID, and PurchaseDate for grocery
purchases.
Write a query to identify customer purchase behavior, categorizing them into three categories
based on their activity:
• "Heavy Shopper": Customers who have purchased items from all three categories
(Electronics, Clothing, and Grocery).
• "Seasonal Shopper": Customers who have purchased items from two of the categories.
• "Category-Specific Shopper": Customers who have only purchased from one category.
Additionally:
• Use CTEs to calculate the number of unique categories each customer has purchased from.
• Use CASE to categorize the customer into the three categories.
• Use Set Operations to eliminate customers who are not in the desired categories.
Explanation
• CTEs will help track the number of unique categories each customer has purchased from.
• CASE will categorize customers into one of the three categories based on the count of
unique categories.
• Set Operations (UNION ALL, EXCEPT, etc.) will be used to combine and filter customers.
Learnings
• CTEs: Used to calculate and track unique values and aggregations across multiple tables.
• Set Operations: Help combine data from different tables.
• CASE Statements: Categorize customers based on their purchase behaviors.
• Combining Concepts: The question demonstrates the use of multiple concepts (CTEs,
CASE, Set Operations) in a complex scenario.
Explanation of Solution
• CTE (CategoryPurchases):
• This CTE aggregates data from the three purchase tables (ElectronicsPurchases,
ClothingPurchases, and GroceryPurchases) and calculates the number of unique
categories each customer has purchased from. The COUNT(DISTINCT ...) ensures that only
unique categories are counted for each customer.
• The FULL OUTER JOIN is used to ensure we capture customers who may have only
purchased from one or two categories (even if they are missing from some of the tables).
• CASE Statement:
• The CASE is used to categorize customers into "Heavy Shopper", "Seasonal Shopper", or
"Category-Specific Shopper" based on how many categories they have purchased from.
• The condition CategoryCount >= 1 is used in the final WHERE clause to exclude customers
who haven’t purchased from any category.
• Set Operations:
• FULL OUTER JOIN is a join rather than a true set operator, but it plays a similar role
here: it merges customers from all three tables, even if they don't appear in all three.
Note that MySQL does not support FULL OUTER JOIN.
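As an illustration, the same categorization can be sketched with UNION ALL in place of the FULL OUTER JOIN described above; this is a swapped-in technique, chosen here because it runs on both PostgreSQL and MySQL. The CTE name AllPurchases, the Category labels, and the ShopperType alias are hypothetical names introduced for this sketch.

```sql
WITH AllPurchases AS (
    -- Stack the three tables and tag each row with its category
    SELECT CustomerID, 'Electronics' AS Category FROM ElectronicsPurchases
    UNION ALL
    SELECT CustomerID, 'Clothing' FROM ClothingPurchases
    UNION ALL
    SELECT CustomerID, 'Grocery' FROM GroceryPurchases
),
CategoryPurchases AS (
    -- Count how many distinct categories each customer has bought from
    SELECT CustomerID, COUNT(DISTINCT Category) AS CategoryCount
    FROM AllPurchases
    GROUP BY CustomerID
)
SELECT CustomerID, CategoryCount,
       CASE
           WHEN CategoryCount = 3 THEN 'Heavy Shopper'
           WHEN CategoryCount = 2 THEN 'Seasonal Shopper'
           ELSE 'Category-Specific Shopper'
       END AS ShopperType
FROM CategoryPurchases
ORDER BY CustomerID;
```

Because only customers with at least one purchase row reach AllPurchases, every CategoryCount is already 1 or more, so no extra filter is needed in this variant.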
Recursive CTE
• Q.241
Question
You are given a table Employees that contains employee information and their direct
manager. Write a query to generate a report showing each employee's name and their
manager's name, starting from the top-level manager and recursively listing employees under
them.
Explanation
This query uses a Recursive CTE to first identify the top-level manager (who has no
manager), and then recursively find employees under each manager.
MySQL Solution
WITH RECURSIVE EmployeeHierarchy AS (
-- Base case: Start with the top-level manager (EmployeeID 1, John Doe)
SELECT EmployeeID, EmployeeName, ManagerID
FROM Employees
WHERE ManagerID IS NULL
UNION ALL
-- Recursive case: Join to get employees under each manager
SELECT e.EmployeeID, e.EmployeeName, e.ManagerID
FROM Employees e
INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
-- Select the employees and their managers
SELECT e.EmployeeName AS Employee, m.EmployeeName AS Manager
FROM EmployeeHierarchy e
LEFT JOIN Employees m ON e.ManagerID = m.EmployeeID
ORDER BY e.EmployeeName;
Postgres Solution
WITH RECURSIVE EmployeeHierarchy AS (
-- Base case: Start with the top-level manager (EmployeeID 1, John Doe)
SELECT EmployeeID, EmployeeName, ManagerID
FROM Employees
WHERE ManagerID IS NULL
UNION ALL
-- Recursive case: Join to get employees under each manager
SELECT e.EmployeeID, e.EmployeeName, e.ManagerID
FROM Employees e
INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
-- Select the employees and their managers
SELECT e.EmployeeName AS Employee, m.EmployeeName AS Manager
FROM EmployeeHierarchy e
LEFT JOIN Employees m ON e.ManagerID = m.EmployeeID
ORDER BY e.EmployeeName;
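Ordering by EmployeeName alone does not show the report from the top of the hierarchy downward. One possible variation, sketched here with a hypothetical HierarchyLevel column that is not part of the original solution, carries the depth through the recursion and sorts on it:

```sql
-- Works in both PostgreSQL and MySQL 8.0+.
WITH RECURSIVE EmployeeHierarchy AS (
    -- Base case: top-level managers sit at level 0
    SELECT EmployeeID, EmployeeName, ManagerID, 0 AS HierarchyLevel
    FROM Employees
    WHERE ManagerID IS NULL
    UNION ALL
    -- Recursive case: each report is one level below its manager
    SELECT e.EmployeeID, e.EmployeeName, e.ManagerID, eh.HierarchyLevel + 1
    FROM Employees e
    INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT eh.HierarchyLevel,
       eh.EmployeeName AS Employee,
       m.EmployeeName AS Manager
FROM EmployeeHierarchy eh
LEFT JOIN Employees m ON eh.ManagerID = m.EmployeeID
ORDER BY eh.HierarchyLevel, eh.EmployeeName;
```

Top-level managers appear first with a NULL Manager, followed by their direct reports, and so on down the tree.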
• Q.242
Question
Write a query that uses a Recursive CTE to generate the first 10 numbers of the Fibonacci
sequence (0, 1, 1, 2, 3, 5, 8, 13, 21, 34).
Explanation
A Recursive CTE can be used here to generate the Fibonacci sequence by using the previous
two numbers to calculate the next one.
MySQL Solution
WITH RECURSIVE Fibonacci(n, fib_value, next_value) AS (
-- Base case: a single anchor row holding the first number (0) and the one after it (1).
-- Using two anchor rows would start two separate chains and produce extra rows.
SELECT 1, 0, 1 -- n=1, fib_value=0, next_value=1
UNION ALL
-- Recursive case: shift one position forward in the sequence
SELECT n + 1, next_value, fib_value + next_value
FROM Fibonacci
WHERE n < 10
)
SELECT fib_value
FROM Fibonacci
ORDER BY n;
Postgres Solution
WITH RECURSIVE Fibonacci(n, fib_value, next_value) AS (
-- Base case: a single anchor row holding the first number (0) and the one after it (1).
-- Using two anchor rows would start two separate chains and produce extra rows.
SELECT 1, 0, 1 -- n=1, fib_value=0, next_value=1
UNION ALL
-- Recursive case: shift one position forward in the sequence
SELECT n + 1, next_value, fib_value + next_value
FROM Fibonacci
WHERE n < 10
)
SELECT fib_value
FROM Fibonacci
ORDER BY n;
• Q.243
Question
Write a query using a Recursive CTE to generate a date range from '2024-01-01' to '2024-
01-10'. Return all the dates in this range.
Explanation
A Recursive CTE is used here to generate a sequence of dates starting from a specific date
and incrementing by one day for each iteration.
MySQL Solution
WITH RECURSIVE DateRange AS (
-- Base case: Start with '2024-01-01'
-- The column is named dt because CURRENT_DATE is a reserved built-in
SELECT CAST('2024-01-01' AS DATE) AS dt
UNION ALL
-- Recursive case: Add one day to the previous date
SELECT DATE_ADD(dt, INTERVAL 1 DAY)
FROM DateRange
WHERE dt < '2024-01-10'
)
SELECT dt
FROM DateRange;
Postgres Solution
WITH RECURSIVE DateRange AS (
-- Base case: Start with '2024-01-01'
-- The column is named dt because CURRENT_DATE is a reserved built-in
SELECT CAST('2024-01-01' AS DATE) AS dt
UNION ALL
-- Recursive case: Add one day; the cast keeps the column type DATE
SELECT CAST(dt + INTERVAL '1 day' AS DATE)
FROM DateRange
WHERE dt < '2024-01-10'
)
SELECT dt
FROM DateRange;
• Q.244
Question
Write a query using a Recursive CTE to calculate the factorial of a given number (for
example, 5).
Explanation
A Recursive CTE can compute a factorial by starting at 1 and multiplying the running result
by each successive integer until the given number is reached. The factorial is the product
of all positive integers up to a given number.
MySQL Solution
WITH RECURSIVE Factorial(n, result) AS (
-- Base case: Start with 1
SELECT 1, 1
UNION ALL
-- Recursive case: Multiply n by the result of the previous iteration
SELECT n + 1, (n + 1) * result
FROM Factorial
WHERE n < 5 -- For factorial of 5, change this number as needed
)
SELECT result
FROM Factorial
WHERE n = 5; -- To get the factorial of 5
Postgres Solution
WITH RECURSIVE Factorial(n, result) AS (
-- Base case: Start with 1
SELECT 1, 1
UNION ALL
-- Recursive case: Multiply n by the result of the previous iteration
SELECT n + 1, (n + 1) * result
FROM Factorial
WHERE n < 5 -- For factorial of 5, change this number as needed
)
SELECT result
FROM Factorial
WHERE n = 5; -- To get the factorial of 5
• Q.245
Question
Given a FamilyMembers table where each person has a ParentID (which references the
PersonID of their parent), write a query to list all descendants of a specific person, say
PersonID = 1, including indirect descendants.
Explanation
A Recursive CTE is ideal for traversing hierarchical structures like family trees. It allows us
to recursively find all descendants, starting with the direct children and then moving down
the tree.
MySQL Solution
WITH RECURSIVE FamilyTree AS (
-- Base case: Start with the root person (e.g., PersonID = 1)
SELECT PersonID, PersonName, ParentID
FROM FamilyMembers
WHERE PersonID = 1 -- Change to the person you want to start from
UNION ALL
-- Recursive case: Join to find descendants (children)
SELECT fm.PersonID, fm.PersonName, fm.ParentID
FROM FamilyMembers fm
INNER JOIN FamilyTree ft ON fm.ParentID = ft.PersonID
)
SELECT PersonID, PersonName
FROM FamilyTree
WHERE PersonID <> 1; -- Exclude the starting person so only descendants are listed
Postgres Solution
WITH RECURSIVE FamilyTree AS (
-- Base case: Start with the root person (e.g., PersonID = 1)
SELECT PersonID, PersonName, ParentID
FROM FamilyMembers
WHERE PersonID = 1 -- Change to the person you want to start from
UNION ALL
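-- Recursive case: Join to find descendants (children)
SELECT fm.PersonID, fm.PersonName, fm.ParentID
FROM FamilyMembers fm
INNER JOIN FamilyTree ft ON fm.ParentID = ft.PersonID
)
SELECT PersonID, PersonName
FROM FamilyTree
WHERE PersonID <> 1; -- Exclude the starting person so only descendants are listed
• Q.246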
Question
Generate an Arithmetic Progression
Question
Write a query to generate an arithmetic progression (e.g., 2, 5, 8, 11, 14, ...) starting from 2
with a common difference of 3 up to the 10th term using a Recursive CTE.
Explanation
A Recursive CTE is well-suited for generating sequences where each term depends on the
previous one. In this case, we can recursively add 3 to the starting number (2) until we reach
the 10th term.
MySQL Solution
WITH RECURSIVE ArithmeticProgression(n, value) AS (
-- Base case: Start with 2
SELECT 1, 2
UNION ALL
-- Recursive case: Add 3 to the previous value
SELECT n + 1, value + 3
FROM ArithmeticProgression
WHERE n < 10 -- Stop at the 10th term
)
-- Select the generated values
SELECT value
FROM ArithmeticProgression;
Postgres Solution
WITH RECURSIVE ArithmeticProgression(n, value) AS (
-- Base case: Start with 2
SELECT 1, 2
UNION ALL
-- Recursive case: Add 3 to the previous value
SELECT n + 1, value + 3
FROM ArithmeticProgression
WHERE n < 10 -- Stop at the 10th term
)
-- Select the generated values
SELECT value
FROM ArithmeticProgression;
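As a quick sanity check on the recursion above, the generated terms can be compared with the closed-form expression for the n-th term, 2 + 3 * (n - 1). This is only a sketch; the closed_form column name is an illustrative assumption, not part of the original solution. The same statement should run on both MySQL 8+ and PostgreSQL.

```sql
WITH RECURSIVE ArithmeticProgression(n, value) AS (
    -- Base case: first term
    SELECT 1, 2
    UNION ALL
    -- Recursive case: next term is the previous term plus 3
    SELECT n + 1, value + 3
    FROM ArithmeticProgression
    WHERE n < 10
)
-- closed_form is a hypothetical helper column used only for comparison
SELECT n, value, 2 + 3 * (n - 1) AS closed_form
FROM ArithmeticProgression
ORDER BY n;
```

Both columns should agree on every row, and the 10th term works out to 2 + 3 * 9 = 29.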
• Q.247
Question
Find All Ancestors in a Family Tree
Question
Given a FamilyMembers table, write a query to find all ancestors of a specific person (e.g.,
PersonID = 5), including direct and indirect ancestors.
Explanation
This query uses a Recursive CTE to find all ancestors of a person, starting from their direct
parent and then recursively climbing up the family tree.
MySQL Solution
WITH RECURSIVE Ancestors AS (
-- Base case: Start with the person whose ancestors we need to find
SELECT PersonID, PersonName, ParentID
FROM FamilyMembers
WHERE PersonID = 5 -- Change to the person ID of interest
UNION ALL
-- Recursive case: Find each person's parent recursively
SELECT fm.PersonID, fm.PersonName, fm.ParentID
FROM FamilyMembers fm
INNER JOIN Ancestors a ON fm.PersonID = a.ParentID
)
-- Final selection of the ancestors (excluding the starting person)
SELECT PersonID, PersonName
FROM Ancestors
WHERE PersonID <> 5;
Postgres Solution
WITH RECURSIVE Ancestors AS (
-- Base case: Start with the person whose ancestors we need to find
SELECT PersonID, PersonName, ParentID
FROM FamilyMembers
WHERE PersonID = 5 -- Change to the person ID of interest
UNION ALL
-- Recursive case: Find each person's parent recursively
SELECT fm.PersonID, fm.PersonName, fm.ParentID
FROM FamilyMembers fm
INNER JOIN Ancestors a ON fm.PersonID = a.ParentID
)
-- Final selection of the ancestors (excluding the starting person)
SELECT PersonID, PersonName
FROM Ancestors
WHERE PersonID <> 5;
• Q.248
Question
Given an Employees table where each employee has a ManagerID, write a query to calculate
the maximum depth (or level) of employees under any manager. The depth is defined as the
number of levels below the top manager.
Explanation
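A Recursive CTE is used to traverse the hierarchy, starting from the top-level managers
(those with no ManagerID) at depth 1 and adding 1 for each level below. Because the top
level counts as 1, the number of levels below the top manager is MaxDepth - 1.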
MySQL Solution
WITH RECURSIVE EmployeeHierarchy AS (
-- Base case: Start with the top-level employees (employees with no manager)
SELECT EmployeeID, EmployeeName, ManagerID, 1 AS Depth
FROM Employees
WHERE ManagerID IS NULL
UNION ALL
-- Recursive case: Join employees with their managers to calculate depth
SELECT e.EmployeeID, e.EmployeeName, e.ManagerID, eh.Depth + 1
FROM Employees e
INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
-- Select the maximum depth in the employee hierarchy
SELECT MAX(Depth) AS MaxDepth
FROM EmployeeHierarchy;
Postgres Solution
WITH RECURSIVE EmployeeHierarchy AS (
-- Base case: Start with the top-level employees (employees with no manager)
SELECT EmployeeID, EmployeeName, ManagerID, 1 AS Depth
FROM Employees
WHERE ManagerID IS NULL
UNION ALL
-- Recursive case: Join employees with their managers to calculate depth
SELECT e.EmployeeID, e.EmployeeName, e.ManagerID, eh.Depth + 1
FROM Employees e
INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
-- Select the maximum depth in the employee hierarchy
SELECT MAX(Depth) AS MaxDepth
FROM EmployeeHierarchy;
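To make the depth counting concrete, here is a small sketch with sample data. The five rows and the LevelsBelowTop column are assumptions made up for illustration; only the table and column names come from the question.

```sql
-- Hypothetical sample data: a single chain 1 -> 2 -> 4 -> 5, plus 3 reporting to 1
CREATE TABLE Employees (
    EmployeeID   INT PRIMARY KEY,
    EmployeeName VARCHAR(50),
    ManagerID    INT
);

INSERT INTO Employees (EmployeeID, EmployeeName, ManagerID)
VALUES
    (1, 'Asha', NULL),
    (2, 'Ben',  1),
    (3, 'Chen', 1),
    (4, 'Dana', 2),
    (5, 'Eli',  4);

WITH RECURSIVE EmployeeHierarchy AS (
    -- Top-level employees start at depth 1
    SELECT EmployeeID, ManagerID, 1 AS Depth
    FROM Employees
    WHERE ManagerID IS NULL
    UNION ALL
    -- Each report sits one level deeper than its manager
    SELECT e.EmployeeID, e.ManagerID, eh.Depth + 1
    FROM Employees e
    INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT MAX(Depth)     AS MaxDepth,        -- 4 for the sample rows
       MAX(Depth) - 1 AS LevelsBelowTop   -- 3 for the sample rows
FROM EmployeeHierarchy;
```

With these rows, Asha is at depth 1, Ben and Chen at 2, Dana at 3, and Eli at 4, so there are three levels below the top manager.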
• Q.249
Question
Generate All Possible Parent-Child Pairs in a Tree
Question
Given a Categories table, which stores a hierarchical product category structure (where
each CategoryID references its parent category via ParentCategoryID), write a query to
generate all possible parent-child pairs, showing the CategoryID of the parent and the
CategoryID of the child.
Explanation
This query uses a Recursive CTE to traverse the category tree and produce all parent-child
combinations in the hierarchy.
MySQL Solution
WITH RECURSIVE CategoryPairs AS (
-- Base case: Direct parent-child pairs
SELECT ParentCategoryID AS ParentID, CategoryID AS ChildID
FROM Categories
WHERE ParentCategoryID IS NOT NULL -- Root categories have no parent
UNION ALL
-- Recursive case: Pair each ancestor with the children of its descendants
SELECT cp.ParentID, c.CategoryID
FROM Categories c
INNER JOIN CategoryPairs cp ON c.ParentCategoryID = cp.ChildID
)
-- Select every ancestor-descendant pair in the category tree
SELECT ParentID, ChildID
FROM CategoryPairs;
Postgres Solution
WITH RECURSIVE CategoryPairs AS (
-- Base case: Direct parent-child pairs
SELECT ParentCategoryID AS ParentID, CategoryID AS ChildID
FROM Categories
WHERE ParentCategoryID IS NOT NULL -- Root categories have no parent
UNION ALL
-- Recursive case: Pair each ancestor with the children of its descendants
SELECT cp.ParentID, c.CategoryID
FROM Categories c
INNER JOIN CategoryPairs cp ON c.ParentCategoryID = cp.ChildID
)
-- Select every ancestor-descendant pair in the category tree
SELECT ParentID, ChildID
FROM CategoryPairs;
• Q.250
Question
Given a Tasks table where each task has a TaskID, TaskName, and a PredecessorID (which
references the TaskID of the task that must be completed before the current task), write a
query using Recursive CTEs to find the longest dependency chain (path) of tasks. The result
should include the TaskID and TaskName in the longest path.
Explanation
In this problem, the Recursive CTE will traverse the task dependencies, and for each task, it
will recursively follow the chain of tasks. We are interested in finding the longest
dependency chain of tasks, so we need to track the depth (number of tasks in the path) as we
go through the dependencies. The longest path will be the one with the maximum depth.
MySQL Solution
WITH RECURSIVE TaskDependencies AS (
-- Base case: Start with tasks that have no predecessor (PredecessorID IS NULL)
SELECT TaskID, TaskName, PredecessorID, 1 AS PathLength,
CAST(TaskName AS CHAR(1000)) AS Path -- Wide enough to hold the concatenated path
FROM Tasks
WHERE PredecessorID IS NULL -- Tasks without any predecessor
UNION ALL
-- Recursive case: Find the next task in the dependency chain
SELECT t.TaskID, t.TaskName, t.PredecessorID, td.PathLength + 1,
CONCAT(td.Path, ' -> ', t.TaskName)
FROM Tasks t
INNER JOIN TaskDependencies td ON t.PredecessorID = td.TaskID
)
SELECT TaskID, TaskName, PathLength, Path
FROM TaskDependencies
WHERE PathLength = (
SELECT MAX(PathLength) FROM TaskDependencies -- Find the longest path
)
ORDER BY PathLength DESC;
Postgres Solution
WITH RECURSIVE TaskDependencies AS (
-- Base case: Start with tasks that have no predecessor (PredecessorID IS NULL)
SELECT TaskID, TaskName, PredecessorID, 1 AS PathLength,
CAST(TaskName AS TEXT) AS Path -- TEXT matches the type produced by || in the recursive term
FROM Tasks
WHERE PredecessorID IS NULL -- Tasks without any predecessor
UNION ALL
-- Recursive case: Find the next task in the dependency chain
SELECT t.TaskID, t.TaskName, t.PredecessorID, td.PathLength + 1,
td.Path || ' -> ' || t.TaskName
FROM Tasks t
INNER JOIN TaskDependencies td ON t.PredecessorID = td.TaskID
)
SELECT TaskID, TaskName, PathLength, Path
FROM TaskDependencies
WHERE PathLength = (
SELECT MAX(PathLength) FROM TaskDependencies -- Find the longest path
)
ORDER BY PathLength DESC;
DDL
• Q.251
Question
Write an SQL query to create a table named Employees with the following columns:
• EmployeeID (integer, primary key, auto-increment)
• FirstName (variable character, maximum 50 characters, not null)
• LastName (variable character, maximum 50 characters, not null)
• HireDate (date)
Explanation
Create a table named Employees where the EmployeeID auto-increments, and other columns
have appropriate data types and constraints.
Learnings
Solutions
• - PostgreSQL solution
CREATE TABLE Employees (
EmployeeID SERIAL PRIMARY KEY,
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
HireDate DATE
);
• - MySQL solution
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY AUTO_INCREMENT,
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
HireDate DATE
);
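As an alternative sketch for PostgreSQL 10 and later, the SQL-standard identity column syntax can be used instead of SERIAL. This is not the original solution, just a variant that behaves similarly; GENERATED BY DEFAULT is chosen here as an assumption so that an explicit EmployeeID can still be supplied on insert.

```sql
-- PostgreSQL 10+: identity column instead of SERIAL
CREATE TABLE Employees (
    EmployeeID INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    FirstName  VARCHAR(50) NOT NULL,
    LastName   VARCHAR(50) NOT NULL,
    HireDate   DATE
);

-- EmployeeID is filled in automatically when it is omitted
INSERT INTO Employees (FirstName, LastName, HireDate)
VALUES ('Michael', 'Taylor', '2023-01-15');
```

With GENERATED ALWAYS instead of BY DEFAULT, PostgreSQL rejects explicit EmployeeID values unless the insert uses OVERRIDING SYSTEM VALUE.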
• Q.252
Question
Write an SQL query to add a column named Email (variable character, maximum 100
characters) to the Employees table.
Explanation
Alter the existing Employees table by adding a new column Email with the specified data
type and size.
Learnings
• Using the ALTER TABLE statement to modify an existing table
• Adding a column with specific data type and size
• SQL syntax for altering tables
Solutions
• - PostgreSQL solution
ALTER TABLE Employees
ADD COLUMN Email VARCHAR(100);
• - MySQL solution
ALTER TABLE Employees
ADD COLUMN Email VARCHAR(100);
Explanation
Use the ALTER TABLE statement to rename the Employees table to Staff.
Learnings
• Using the ALTER TABLE statement to rename a table
• SQL syntax for renaming tables
Solutions
• - PostgreSQL solution
ALTER TABLE Employees
RENAME TO Staff;
• - MySQL solution
ALTER TABLE Employees
RENAME TO Staff;
• Q.254
Question
Write an SQL query to change the data type of the HireDate column in the Employees table
to DATETIME.
Explanation
Use the ALTER TABLE statement to modify the data type of the HireDate column from DATE
to DATETIME.
Learnings
• Using the ALTER TABLE statement to modify column data types
• Syntax for changing a column’s data type in SQL
• Understanding the difference between DATE and DATETIME data types
Solutions
• - PostgreSQL solution
ALTER TABLE Employees
ALTER COLUMN HireDate TYPE TIMESTAMP; -- PostgreSQL has no DATETIME type; TIMESTAMP is the equivalent
• - MySQL solution
ALTER TABLE Employees
MODIFY COLUMN HireDate DATETIME;
• Q.255
Question
Write an SQL query to add a primary key constraint to the EmployeeID column in the
Employees table.
Explanation
Use the ALTER TABLE statement to add a primary key constraint to the EmployeeID column
if it doesn't already have one.
Learnings
• Using the ALTER TABLE statement to add constraints
• Adding a PRIMARY KEY constraint to an existing column
• Understanding the importance of primary key constraints for uniqueness and indexing
Solutions
• - PostgreSQL solution
ALTER TABLE Employees
ADD CONSTRAINT pk_employeeid PRIMARY KEY (EmployeeID);
• - MySQL solution
ALTER TABLE Employees
ADD PRIMARY KEY (EmployeeID);
• Q.256
Question
Write an SQL query to create a view named EmployeeNames that includes EmployeeID,
FirstName, and LastName from the Employees table.
Explanation
Use the CREATE VIEW statement to create a view that selects the EmployeeID, FirstName,
and LastName columns from the Employees table.
Learnings
• Using the CREATE VIEW statement to create a view in SQL
• Selecting specific columns from a table in a view
• Views provide a way to simplify queries and encapsulate frequently used logic
Solutions
• - PostgreSQL solution
CREATE VIEW EmployeeNames AS
SELECT EmployeeID, FirstName, LastName
FROM Employees;
• - MySQL solution
CREATE VIEW EmployeeNames AS
SELECT EmployeeID, FirstName, LastName
FROM Employees;
• Q.257
Question
Assume there is a table named Departments with a primary key DepartmentID. Write an
SQL query to add a foreign key constraint to the Employees table, linking the
DepartmentID column to the DepartmentID column in the Departments table.
Explanation
Use the ALTER TABLE statement to add a foreign key constraint on the DepartmentID
column in the Employees table, referencing the DepartmentID column in the Departments
table.
Learnings
• Adding a foreign key constraint to enforce referential integrity
• Understanding how foreign keys link columns between tables
• Using ALTER TABLE to add foreign key constraints
Solutions
• - PostgreSQL solution
ALTER TABLE Employees
ADD CONSTRAINT fk_department
FOREIGN KEY (DepartmentID)
REFERENCES Departments (DepartmentID);
• - MySQL solution
ALTER TABLE Employees
ADD CONSTRAINT fk_department
FOREIGN KEY (DepartmentID)
REFERENCES Departments (DepartmentID);
• Q.258
Question
Write an SQL query to create a non-unique index on the LastName column in the
Employees table to improve search performance.
Explanation
Use the CREATE INDEX statement to create a non-unique index on the LastName column in
the Employees table, which will improve the performance of queries that search by
LastName.
Learnings
• Using CREATE INDEX to improve query performance
• Understanding the difference between unique and non-unique indexes
• Indexes speed up SELECT queries but can slow down INSERT/UPDATE operations
Solutions
• - PostgreSQL solution
CREATE INDEX idx_lastname ON Employees (LastName);
• - MySQL solution
CREATE INDEX idx_lastname ON Employees (LastName);
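One way to check whether the index is actually used is to ask the database for its query plan. The sketch below assumes the idx_lastname index from the solution exists; the search value 'Taylor' is only an example. EXPLAIN is available in both PostgreSQL and MySQL, although the output format differs.

```sql
-- Show the plan for a lookup by LastName without running the query itself
EXPLAIN
SELECT EmployeeID, FirstName, LastName
FROM Employees
WHERE LastName = 'Taylor';
```

On a very small table the planner may still prefer a full table scan, so the index appearing in the plan is likely but not guaranteed.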
• Q.259
Question
Write an SQL query to remove the Email column from the Employees table.
Explanation
Use the ALTER TABLE statement with the DROP COLUMN clause to remove the Email column
from the Employees table.
Learnings
• Using ALTER TABLE with DROP COLUMN to remove a column
• SQL syntax for modifying tables
• When removing a column, ensure that the column is not needed for any other constraints or
relationships
Solutions
• - PostgreSQL solution
ALTER TABLE Employees
DROP COLUMN Email;
• - MySQL solution
ALTER TABLE Employees
DROP COLUMN Email;
• Q.260
Question
Write an SQL query to create a table named Projects with the following specifications:
• ProjectID (integer, primary key, auto-increment)
• ProjectName (variable character, maximum 100 characters, not null, unique)
• StartDate (date, not null)
• EndDate (date, not null, must be after StartDate)
Explanation
Create a table named Projects with the specified columns, adding constraints like PRIMARY
KEY, NOT NULL, UNIQUE, and ensuring that EndDate is after StartDate using a check
constraint.
Learnings
• Using constraints such as PRIMARY KEY, NOT NULL, UNIQUE, and CHECK
• How to enforce data integrity by ensuring EndDate is after StartDate
• Syntax for creating tables with multiple constraints
Solutions
• - PostgreSQL solution
CREATE TABLE Projects (
ProjectID SERIAL PRIMARY KEY,
ProjectName VARCHAR(100) NOT NULL UNIQUE,
StartDate DATE NOT NULL,
EndDate DATE NOT NULL,
CONSTRAINT check_enddate CHECK (EndDate > StartDate)
);
• - MySQL solution
CREATE TABLE Projects (
ProjectID INT PRIMARY KEY AUTO_INCREMENT,
ProjectName VARCHAR(100) NOT NULL UNIQUE,
StartDate DATE NOT NULL,
EndDate DATE NOT NULL,
CHECK (EndDate > StartDate) -- Enforced from MySQL 8.0.16; earlier versions parse but ignore CHECK
);
DML
• Q.261
Question
Write an SQL query to update the JobTitle of the employee with EmployeeID = 3 to 'Senior
Software Engineer' in the Employees table.
Explanation
Use the UPDATE statement to modify the JobTitle for the employee with EmployeeID = 3.
Learnings
• Using the UPDATE statement to modify data in a table
• Specifying the row to update using the WHERE clause
• Avoiding accidental updates to all rows by ensuring the correct condition in the WHERE
clause
Solutions
• - PostgreSQL solution
UPDATE Employees
SET JobTitle = 'Senior Software Engineer'
WHERE EmployeeID = 3;
• - MySQL solution
UPDATE Employees
SET JobTitle = 'Senior Software Engineer'
WHERE EmployeeID = 3;
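To reduce the risk of an accidental mass update, the statement can be run inside a transaction and inspected before it is made permanent. This is a minimal sketch, assuming a transactional storage engine (for example InnoDB in MySQL); START TRANSACTION works in both PostgreSQL and MySQL.

```sql
START TRANSACTION;

UPDATE Employees
SET JobTitle = 'Senior Software Engineer'
WHERE EmployeeID = 3;

-- Inspect the affected row before deciding
SELECT EmployeeID, JobTitle
FROM Employees
WHERE EmployeeID = 3;

COMMIT;  -- or ROLLBACK; if the result is not what was intended
```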
• Q.262
Question
Write an SQL query to insert a new employee into the Employees table with the following
details:
• FirstName: 'Michael'
• LastName: 'Taylor'
• JobTitle: 'UX Designer'
• HireDate: '2023-01-15'
Explanation
Use the INSERT INTO statement to add a new record into the Employees table with the
provided values.
Learnings
• Using the INSERT INTO statement to add new rows into a table
• Inserting multiple columns of data at once
• SQL syntax for specifying values for each column
Solutions
• - PostgreSQL solution
INSERT INTO Employees (FirstName, LastName, JobTitle, HireDate)
VALUES ('Michael', 'Taylor', 'UX Designer', '2023-01-15');
• - MySQL solution
INSERT INTO Employees (FirstName, LastName, JobTitle, HireDate)
VALUES ('Michael', 'Taylor', 'UX Designer', '2023-01-15');
• Q.263
Question
Write a query to delete all customers from the Customers table who have not placed an order
in the last two years. Assume there is an Orders table with a CustomerID and OrderDate.
Explanation
Use a subquery in the DELETE statement to identify CustomerID values that have placed an
order within the last two years. Then delete all customers whose CustomerID does not
appear in the results of that subquery.
(5, 2, '2021-09-22'),
(6, 5, '2020-12-01'),
(7, 6, '2023-03-10'),
(8, 1, '2021-02-25');
Learnings
• Using a subquery to filter rows for deletion
• Deleting rows based on conditions from another table
• SQL syntax for deleting data with complex conditions
Solutions
• - PostgreSQL solution
DELETE FROM Customers
WHERE CustomerID NOT IN (
SELECT DISTINCT CustomerID
FROM Orders
WHERE OrderDate >= CURRENT_DATE - INTERVAL '2 years'
);
• - MySQL solution
DELETE FROM Customers
WHERE CustomerID NOT IN (
SELECT DISTINCT CustomerID
FROM Orders
WHERE OrderDate >= CURDATE() - INTERVAL 2 YEAR
);
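One caveat with NOT IN: if the subquery returns even one NULL CustomerID, the NOT IN condition evaluates to unknown for every customer and no rows are deleted. A NOT EXISTS version avoids this. The sketch below uses PostgreSQL syntax; the aliases c and o are illustrative.

```sql
-- PostgreSQL: NULL-safe variant using NOT EXISTS
DELETE FROM Customers c
WHERE NOT EXISTS (
    SELECT 1
    FROM Orders o
    WHERE o.CustomerID = c.CustomerID
      AND o.OrderDate >= CURRENT_DATE - INTERVAL '2 years'
);
```

In MySQL the same idea applies, with CURDATE() - INTERVAL 2 YEAR as the date expression.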
• Q.264
Question
Write a query to insert a new employee into the Employees table with the following details:
• EmployeeID = 101
• FirstName = 'John'
• LastName = 'Doe'
• HireDate = '2025-01-14'
Explanation
Use the INSERT INTO statement to add a new employee record with the specified details into
the Employees table.
Learnings
• Using the INSERT INTO statement to add a specific row to a table
• Providing values for all columns when inserting new records
Solutions
• - PostgreSQL solution
INSERT INTO Employees (EmployeeID, FirstName, LastName, HireDate)
VALUES (101, 'John', 'Doe', '2025-01-14');
• - MySQL solution
INSERT INTO Employees (EmployeeID, FirstName, LastName, HireDate)
VALUES (101, 'John', 'Doe', '2025-01-14');
• Q.265
Question
Insert New Car Model
Write an SQL query to insert a new car model into the Cars table with the following details:
• ModelID = 501
• ModelName = 'Model X'
• ReleaseYear = 2024
• Price = 89999.99
• Status = 'Available'
Explanation
Use the INSERT INTO statement to add a new record into the Cars table with the provided
values.
Learnings
• Understanding the INSERT INTO statement for adding records.
• Using proper data types for different fields (e.g., INT, VARCHAR, DECIMAL).
• How to handle string, numeric, and date-based data in SQL.
Solutions
• - PostgreSQL solution
INSERT INTO Cars (ModelID, ModelName, ReleaseYear, Price, Status)
VALUES (501, 'Model X', 2024, 89999.99, 'Available');
• - MySQL solution
INSERT INTO Cars (ModelID, ModelName, ReleaseYear, Price, Status)
VALUES (501, 'Model X', 2024, 89999.99, 'Available');
• Q.266
Question
Explanation
Use the UPDATE statement to modify the Price column of the car model where ModelID is
502.
Learnings
• Using the UPDATE statement to modify existing records.
• Filtering records using WHERE clause to target specific rows.
• Understanding how to modify specific columns in a table.
Solutions
• - PostgreSQL solution
UPDATE Cars
SET Price = 74999.99
WHERE ModelID = 502;
• - MySQL solution
UPDATE Cars
SET Price = 74999.99
WHERE ModelID = 502;
• Q.267
Question
Delete Outdated Car Models
Write an SQL query to delete all cars from the Cars table that were released before 2019.
Explanation
Use the DELETE statement with a WHERE clause to remove records from the Cars table where
the ReleaseYear is earlier than 2019.
);
• - Datasets
INSERT INTO Cars (ModelID, ModelName, ReleaseYear, Price, Status)
VALUES
(601, 'Model A', 2017, 45999.99, 'Discontinued'),
(602, 'Model B', 2019, 50999.99, 'Available'),
(603, 'Model C', 2018, 47999.99, 'Discontinued');
Learnings
• Using the DELETE statement to remove records from a table.
• Applying conditions with WHERE to target specific rows for deletion.
• Managing data retention based on business rules (e.g., removing outdated records).
Solutions
• - PostgreSQL solution
DELETE FROM Cars
WHERE ReleaseYear < 2019;
• - MySQL solution
DELETE FROM Cars
WHERE ReleaseYear < 2019;
• Q.268
Question
Update Product Stock Based on Sales
Write an SQL query to update the Stock of all products in the Products table by reducing the
stock by the quantity sold in the Sales table. Assume the Sales table contains ProductID and
QuantitySold columns, and the Products table contains ProductID and Stock columns.
Explanation
Use an UPDATE statement combined with a JOIN to update the Stock in the Products table by
subtracting the QuantitySold from the Sales table.
Learnings
• Using UPDATE with JOIN to modify values based on related data in another table.
• Subtracting values from columns using arithmetic operators (-).
• Handling updates across multiple tables in a relational database.
Solutions
• - PostgreSQL solution
UPDATE Products p
SET Stock = p.Stock - COALESCE(s.QuantitySold, 0)
FROM Sales s
WHERE p.ProductID = s.ProductID;
• - MySQL solution
UPDATE Products p
JOIN Sales s ON p.ProductID = s.ProductID
SET p.Stock = p.Stock - s.QuantitySold;
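If a product can appear more than once in Sales, a plain join updates each product row only once, so only one of its sales rows is subtracted. A common way around this is to total the sales per product first. The sketch below assumes that situation; the TotalSold alias is illustrative.

```sql
-- PostgreSQL: subtract the summed quantity per product
UPDATE Products p
SET Stock = p.Stock - s.TotalSold
FROM (
    SELECT ProductID, SUM(QuantitySold) AS TotalSold
    FROM Sales
    GROUP BY ProductID
) AS s
WHERE p.ProductID = s.ProductID;

-- MySQL: same idea with a joined derived table
UPDATE Products p
JOIN (
    SELECT ProductID, SUM(QuantitySold) AS TotalSold
    FROM Sales
    GROUP BY ProductID
) AS s ON p.ProductID = s.ProductID
SET p.Stock = p.Stock - s.TotalSold;
```

Products with no rows in Sales are not matched by either statement and keep their current Stock.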
• Q.269
Question
Delete Out-of-Stock Products Older Than 5 Years
Write an SQL query to delete all products from the Products table that are out of stock (i.e.,
Stock = 0) and have not been sold in the last 5 years. Assume the Products table contains
ProductID, ProductName, Stock, and LastSoldDate columns, and the Sales table contains
ProductID and SaleDate.
Explanation
Use a DELETE statement with a combination of WHERE clauses, checking for products with
Stock = 0 and no sales for the last 5 years by comparing LastSoldDate with the current
date.
Learnings
• Combining multiple conditions in a WHERE clause.
Solutions
• - PostgreSQL solution
DELETE FROM Products p
WHERE p.Stock = 0
AND (p.LastSoldDate IS NULL OR p.LastSoldDate < CURRENT_DATE - INTERVAL '5 years')
AND NOT EXISTS (
SELECT 1 FROM Sales s
WHERE s.ProductID = p.ProductID
AND s.SaleDate >= CURRENT_DATE - INTERVAL '5 years'
);
• - MySQL solution
DELETE p
FROM Products p
WHERE p.Stock = 0
AND (p.LastSoldDate IS NULL OR p.LastSoldDate < CURDATE() - INTERVAL 5 YEAR)
-- NOT EXISTS keeps any product that has at least one sale in the last 5 years
AND NOT EXISTS (
SELECT 1 FROM Sales s
WHERE s.ProductID = p.ProductID
AND s.SaleDate >= CURDATE() - INTERVAL 5 YEAR
);
• Q.270
Question
Delete Duplicate Employees Based on Email
Write an SQL query to delete all duplicate employee records from the Employees table,
keeping only one record for each unique email. Assume the Employees table contains
EmployeeID, EmployeeName, and Email columns.
Explanation
Use a DELETE statement combined with the CTID system column (PostgreSQL) or the
EmployeeID (MySQL) to identify and remove duplicate records based on the Email column,
keeping only the first occurrence.
Learnings
• Removing duplicate records based on a unique column like Email.
• Using the CTID in PostgreSQL to uniquely identify rows in a table.
• Leveraging window functions and CTID for efficient deletion of duplicates.
Solutions
• - PostgreSQL solution
WITH DuplicateEmployees AS (
SELECT MIN(CTID) AS keep_ctid, Email
FROM Employees
GROUP BY Email
)
DELETE FROM Employees
WHERE CTID NOT IN (
SELECT keep_ctid FROM DuplicateEmployees
)
AND Email IN (
SELECT Email FROM DuplicateEmployees
);
• - MySQL solution
DELETE e FROM Employees e
JOIN (
SELECT MIN(EmployeeID) AS EmployeeID, Email
FROM Employees
GROUP BY Email
) AS first_occurrence
ON e.Email = first_occurrence.Email
WHERE e.EmployeeID > first_occurrence.EmployeeID;
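The Learnings mention window functions, so here is a sketch of that approach for PostgreSQL. It assumes EmployeeID is unique and that the row with the lowest EmployeeID per Email is the one to keep; the ranked alias and rn column are illustrative names.

```sql
-- PostgreSQL: number the rows within each Email and delete everything after the first
DELETE FROM Employees
WHERE EmployeeID IN (
    SELECT EmployeeID
    FROM (
        SELECT EmployeeID,
               ROW_NUMBER() OVER (PARTITION BY Email ORDER BY EmployeeID) AS rn
        FROM Employees
    ) AS ranked
    WHERE rn > 1
);
```

Note that PARTITION BY places all rows with a NULL Email in one group, so this sketch would also keep only one such row; add a filter on Email IS NOT NULL inside the subquery if those rows should be left alone.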
Data Cleaning
• Q.271
Question
Handle Missing Values in Customer Data
Write an SQL query to update all NULL values in the Email column of the Customers table
with a default value of '[email protected]'.
Explanation
Use the UPDATE statement combined with the IS NULL condition to find and update rows
where the Email is NULL, and set a default value for those records.
Email VARCHAR(255)
);
• - Datasets
INSERT INTO Customers (CustomerID, CustomerName, Email)
VALUES
(1, 'John Doe', '[email protected]'),
(2, 'Jane Smith', NULL),
(3, 'Sam Johnson', '[email protected]'),
(4, 'Mike Brown', NULL),
(5, 'Emily Davis', '[email protected]');
Learnings
• Handling missing or NULL values in SQL.
• Using the IS NULL condition to identify missing data.
• Applying the UPDATE statement to modify data in a specific column.
Solutions
• - PostgreSQL solution
UPDATE Customers
SET Email = '[email protected]'
WHERE Email IS NULL;
• - MySQL solution
UPDATE Customers
SET Email = '[email protected]'
WHERE Email IS NULL;
• Q.272
Question
Remove Duplicate Orders
Write an SQL query to delete all duplicate orders in the Orders table where both the
CustomerID and OrderDate are identical. Keep only the first instance of each duplicated
order.
Explanation
Use the DELETE statement combined with a JOIN to identify and remove rows with duplicate
CustomerID and OrderDate, keeping only the first instance of each duplicated order.
Learnings
• Removing duplicate rows based on multiple columns (CustomerID, OrderDate).
• Using JOIN to identify duplicates across multiple records.
• Understanding how to keep the first occurrence of a duplicate and remove the rest.
Solutions
• - PostgreSQL solution
WITH DuplicateOrders AS (
SELECT MIN(OrderID) AS keep_orderid, CustomerID, OrderDate
FROM Orders
GROUP BY CustomerID, OrderDate
)
DELETE FROM Orders
WHERE OrderID NOT IN (
SELECT keep_orderid FROM DuplicateOrders
)
AND (CustomerID, OrderDate) IN (
SELECT CustomerID, OrderDate FROM DuplicateOrders
);
• - MySQL solution
DELETE o FROM Orders o
JOIN (
SELECT MIN(OrderID) AS keep_orderid, CustomerID, OrderDate
FROM Orders
GROUP BY CustomerID, OrderDate
) AS first_occurrence
ON o.CustomerID = first_occurrence.CustomerID
AND o.OrderDate = first_occurrence.OrderDate
WHERE o.OrderID > first_occurrence.keep_orderid;
• Q.273
Question
Write an SQL query to find all PhoneNumber values in the Customers table that do not
follow the standard UK phone format (i.e., +44 followed by exactly 9 digits, 11 digits in
total) and update them to 'Invalid'.
Explanation
Use the UPDATE statement with a WHERE clause to identify rows where the PhoneNumber does
not match the standard UK phone format (+44 followed by 9 digits). This can be achieved
using pattern matching or regular expressions, depending on the SQL database being used.
Learnings
Solutions
PostgreSQL solution
PostgreSQL supports POSIX regular expressions through the ~ operator (negated as !~),
which can be used to match the phone format.
UPDATE Customers
SET PhoneNumber = 'Invalid'
WHERE PhoneNumber !~ '^\+44\d{9}$';
• ^\+44\d{9}$ checks that the phone number starts with +44 and is followed by exactly 9
digits.
MySQL solution
MySQL supports REGEXP for regular expressions.
UPDATE Customers
SET PhoneNumber = 'Invalid'
WHERE PhoneNumber NOT REGEXP '^\\+44[0-9]{9}$';
• ^\\+44[0-9]{9}$ ensures that the phone number starts with +44 and is followed by
exactly 9 digits.
• In MySQL, we need to escape the + symbol with a double backslash (\\).
Notes
• This solution assumes the phone numbers are stored as strings (VARCHAR) in the table.
• Regular expressions can vary in syntax across different databases, so make sure to adjust
based on your SQL platform's capabilities.
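Before running the UPDATE, it can help to preview which rows would be affected. The sketch below is for PostgreSQL and assumes a CustomerID column exists on Customers. It also lists NULL phone numbers separately, because a regular expression comparison against NULL yields NULL, so the UPDATE above leaves those rows untouched.

```sql
-- PostgreSQL: preview rows that fail the +44 followed by 9 digits check
SELECT CustomerID, PhoneNumber
FROM Customers
WHERE PhoneNumber !~ '^\+44[0-9]{9}$'
   OR PhoneNumber IS NULL;  -- NULLs are shown here but not changed by the UPDATE
```

In MySQL the equivalent condition is PhoneNumber NOT REGEXP '^\\+44[0-9]{9}$'.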
• Q.274
Question
Remove Outdated Products
Write an SQL query to delete all products from the Products table that have not been sold in
the last 2 years. Assume the Products table has a LastSoldDate column.
Explanation
Use the DELETE statement to remove records where the LastSoldDate is older than 2 years
compared to the current date. This can be achieved using a date comparison in the WHERE
clause.
Learnings
• Using date functions (CURRENT_DATE or CURDATE()) to compare dates.
• Deleting records based on specific date conditions.
• Ensuring accurate date comparisons using SQL date operators.
Solutions
PostgreSQL solution
PostgreSQL uses CURRENT_DATE to get the current date.
DELETE FROM Products
WHERE LastSoldDate < CURRENT_DATE - INTERVAL '2 years';
• CURRENT_DATE - INTERVAL '2 years' calculates the date 2 years ago from the current
date and deletes any records where LastSoldDate is earlier than this.
MySQL solution
MySQL uses CURDATE() to get the current date.
DELETE FROM Products
WHERE LastSoldDate < CURDATE() - INTERVAL 2 YEAR;
• CURDATE() - INTERVAL 2 YEAR calculates the date 2 years ago from today and deletes
any records where LastSoldDate is earlier than this.
Notes
• This solution assumes LastSoldDate is stored in the DATE format.
• Date functions like CURRENT_DATE in PostgreSQL and CURDATE() in MySQL help in
comparing dates directly.
• Q.275
Question
Normalize Customer Names
Write an SQL query to convert all FirstName and LastName values in the Customers table
to proper case (e.g., 'john' → 'John').
Explanation
Use the UPDATE statement combined with string functions such as UPPER() and LOWER() to
capitalize the first letter of each name and convert the rest to lowercase, thereby converting
them to proper case.
• - Datasets
INSERT INTO Customers (CustomerID, FirstName, LastName)
VALUES
(1, 'john', 'doe'),
(2, 'jane', 'smith'),
(3, 'rajesh', 'kumar'),
(4, 'neha', 'gupta'),
(5, 'ravi', 'mehra');
Learnings
• Using string functions such as UPPER(), LOWER(), and INITCAP() to manipulate text case.
• Updating data in a table based on text transformation.
Solutions
PostgreSQL solution
PostgreSQL provides the INITCAP() function to convert text to proper case.
UPDATE Customers
SET FirstName = INITCAP(LOWER(FirstName)),
LastName = INITCAP(LOWER(LastName));
• LOWER(FirstName) converts the first name to lowercase, and INITCAP() capitalizes the
first letter of each word.
MySQL solution
MySQL does not have a direct INITCAP() function, so we can use a combination of
CONCAT(), UPPER(), and LOWER() functions to achieve proper case.
UPDATE Customers
SET FirstName = CONCAT(UPPER(SUBSTRING(FirstName, 1, 1)), LOWER(SUBSTRING(FirstName, 2))
),
LastName = CONCAT(UPPER(SUBSTRING(LastName, 1, 1)), LOWER(SUBSTRING(LastName, 2)));
• UPPER(SUBSTRING(..., 1, 1)) capitalizes the first letter, and LOWER(SUBSTRING(...,
2)) ensures the rest are in lowercase.
Notes
• The PostgreSQL solution utilizes the built-in INITCAP() function, which makes it
straightforward.
• In MySQL, we use a combination of SUBSTRING(), UPPER(), and LOWER() to achieve the
same result as INITCAP().
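The two approaches differ for names that contain more than one word. The following sketch uses the literal 'mary ann' as an assumed example value to show the difference; each statement is meant for the dialect named in its comment.

```sql
-- PostgreSQL: INITCAP capitalizes the first letter of every word
SELECT INITCAP('mary ann');  -- Mary Ann

-- MySQL: the CONCAT/SUBSTRING expression capitalizes only the first character
SELECT CONCAT(UPPER(SUBSTRING('mary ann', 1, 1)),
              LOWER(SUBSTRING('mary ann', 2)));  -- Mary ann
```

For single-word names such as those in the sample dataset, both produce the same result.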
• Q.276
Question
Remove Inconsistent Date Formats
Write an SQL query to standardize the OrderDate column in the Orders table, ensuring all
dates are in the 'YYYY-MM-DD' format. Assume the OrderDate column may contain
inconsistent date formats.
Explanation
Use the UPDATE statement combined with date functions such as STR_TO_DATE() in MySQL
or TO_DATE() in PostgreSQL to convert the OrderDate column values to a consistent date
format ('YYYY-MM-DD').
Learnings
• Handling inconsistent date formats in SQL.
• Using STR_TO_DATE() (MySQL) and TO_DATE() (PostgreSQL) to convert string
representations of dates into standardized date formats.
• Ensuring consistency in date storage and representation.
Solutions
PostgreSQL solution
PostgreSQL uses the TO_DATE() function to convert string data into the DATE format.
UPDATE Orders
SET OrderDate = TO_CHAR(TO_DATE(OrderDate, 'YYYY-MM-DD'), 'YYYY-MM-DD')
WHERE OrderDate ~ '^\d{4}-\d{2}-\d{2}$';
• TO_DATE(OrderDate, 'YYYY-MM-DD') parses the OrderDate string, and TO_CHAR(...,
'YYYY-MM-DD') writes it back in the 'YYYY-MM-DD' format. This assumes OrderDate is stored
as text; the WHERE clause limits the update to values that already have this shape.
Other formats (like 'April 25, 2023' or '15-02-2023') each need their own format pattern,
and each UPDATE should touch only the rows in that format. For example, for 'DD-MM-YYYY'
values:
UPDATE Orders
SET OrderDate = TO_CHAR(TO_DATE(OrderDate, 'DD-MM-YYYY'), 'YYYY-MM-DD')
WHERE OrderDate ~ '^\d{2}-\d{2}-\d{4}$';
MySQL solution
In MySQL, use the STR_TO_DATE() function to convert string values into a standardized date
format.
UPDATE Orders
SET OrderDate = DATE_FORMAT(STR_TO_DATE(OrderDate, '%Y-%m-%d'), '%Y-%m-%d')
WHERE OrderDate IS NOT NULL;
• STR_TO_DATE(OrderDate, '%Y-%m-%d') converts the OrderDate string into a date
object, and DATE_FORMAT(..., '%Y-%m-%d') ensures it is stored in the 'YYYY-MM-DD'
format.
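For different formats like '15-02-2023', match only the rows written in that format (a plain
LIKE '%-%' would also match rows that are already 'YYYY-MM-DD'):
UPDATE Orders
SET OrderDate = STR_TO_DATE(OrderDate, '%d-%m-%Y')
WHERE OrderDate REGEXP '^[0-9]{2}-[0-9]{2}-[0-9]{4}$';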
Notes
• In PostgreSQL, TO_DATE() can handle different formats based on the date pattern you
specify, making it flexible to work with various date formats.
• In MySQL, STR_TO_DATE() converts strings into date types, and DATE_FORMAT() ensures
the output is in the desired format.
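The "try each known input format" idea behind these updates can be prototyped outside the database before writing the UPDATE statements. The sketch below is illustrative only: the helper name standardize_date, the KNOWN_FORMATS list, and the sample strings are assumptions, not part of the original solution, and the month-name format assumes an English locale.

```python
from datetime import datetime

# Candidate input formats, tried in order (illustrative list; extend as needed).
KNOWN_FORMATS = ["%Y-%m-%d", "%d-%m-%Y", "%B %d, %Y"]


def standardize_date(raw):
    """Return raw as 'YYYY-MM-DD', or None if no known format matches."""
    if raw is None:
        return None
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # this format did not match; try the next one
    return None


samples = ["2023-04-25", "15-02-2023", "April 25, 2023", "not a date"]
cleaned = [standardize_date(s) for s in samples]
print(cleaned)
```

Values that match none of the formats come back as None, which mirrors the safer practice of leaving unparseable rows for manual review instead of overwriting them.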
• Q.277
Question
Handle Missing Product Prices
Write an SQL query to replace NULL values in the Price column of the Products table with
the average price of all products.
Explanation
Use the UPDATE statement combined with a subquery to calculate the average price of all
products, and then replace NULL values in the Price column with this calculated average.
Learnings
• Handling NULL values in SQL.
• Using subqueries to calculate aggregate values like averages.
• Updating data in a table based on calculated values.
Solutions
PostgreSQL solution
PostgreSQL supports the use of subqueries in the UPDATE statement.
UPDATE Products
SET Price = (SELECT AVG(Price) FROM Products WHERE Price IS NOT NULL)
WHERE Price IS NULL;
• The subquery (SELECT AVG(Price) FROM Products WHERE Price IS NOT NULL)
calculates the average price of all products that have a non-NULL price.
• The UPDATE statement then replaces NULL values in the Price column with this average.
MySQL solution
In MySQL, an UPDATE cannot read from its own target table in a direct subquery (error 1093), so
the average is computed in a derived table first.
UPDATE Products
SET Price = (
SELECT avg_price
FROM (SELECT AVG(Price) AS avg_price FROM Products WHERE Price IS NOT NULL) AS avg_table
)
WHERE Price IS NULL;
• The derived table calculates the average of the non-NULL prices, and the UPDATE then writes
that value into the rows where Price is NULL.
Notes
• This solution ensures that only NULL prices are updated, leaving non-NULL values
unchanged.
• Both PostgreSQL and MySQL support subqueries in UPDATE statements, but MySQL requires the
extra derived table when the subquery reads from the table being updated.
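A quick way to check the NULL-filling logic is to run it against a throwaway in-memory SQLite database from Python. This is a minimal sketch: SQLite stands in for PostgreSQL here (it also accepts the direct subquery), and the ProductID column and sample prices are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Price REAL);
INSERT INTO Products (ProductID, Price) VALUES
  (1, 10.0), (2, NULL), (3, 30.0), (4, NULL);
""")

# AVG() skips NULLs, so the average is taken over 10.0 and 30.0 only.
conn.execute("""
UPDATE Products
SET Price = (SELECT AVG(Price) FROM Products)
WHERE Price IS NULL
""")

prices = conn.execute(
    "SELECT ProductID, Price FROM Products ORDER BY ProductID"
).fetchall()
print(prices)
```

Because AVG() ignores NULL values on its own, the WHERE Price IS NOT NULL filter inside the subquery is optional; keeping it only makes the intent explicit.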
• Q.278
Question
Standardize Country Names
Write an SQL query to standardize the Country column in the Customers table, ensuring all
instances of 'United Kingdom', 'UK', and 'GB' are replaced with 'United Kingdom'.
Explanation
Use the UPDATE statement with a CASE or REPLACE function to standardize the values in the
Country column. The goal is to replace different representations of the same country ('UK',
'GB') with a single standardized name ('United Kingdom').
Learnings
• Using UPDATE to modify multiple values in a column.
• Standardizing inconsistent text values in a column.
• Using CASE or REPLACE to handle multiple conditions in SQL.
Solutions
PostgreSQL solution
PostgreSQL allows you to use the CASE expression for conditional updates.
UPDATE Customers
SET Country = CASE
WHEN Country IN ('UK', 'GB') THEN 'United Kingdom'
ELSE Country
END;
• The CASE expression checks if the Country is either 'UK' or 'GB' and replaces it with
'United Kingdom'.
• If the country is neither of these values, it keeps the original value.
Alternatively, you can use the REPLACE() function if you are certain that only the specific
values need to be replaced.
UPDATE Customers
SET Country = REPLACE(REPLACE(Country, 'UK', 'United Kingdom'), 'GB', 'United Kingdom');
• The nested REPLACE() functions first replace 'UK' with 'United Kingdom' and then
replace 'GB' with 'United Kingdom'.
MySQL solution
In MySQL, both the CASE expression and REPLACE() function are supported.
Using the CASE expression:
UPDATE Customers
SET Country = CASE
WHEN Country IN ('UK', 'GB') THEN 'United Kingdom'
ELSE Country
END;
Notes
• The CASE expression is more flexible as it allows you to handle multiple conditions and
ensures you can expand the logic easily in the future.
• The REPLACE() method is a more straightforward approach, though it could potentially
cause unexpected changes if there are partial matches (e.g., 'UK' within a longer string).
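The partial-match risk mentioned in the notes is easy to see side by side. The sketch below runs both expressions as a SELECT in an in-memory SQLite database (used here as a stand-in for PostgreSQL/MySQL); the CustomerID column and the sample rows, including the deliberately awkward 'UKRAINE' value, are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Country TEXT);
INSERT INTO Customers (CustomerID, Country) VALUES
  (1, 'UK'), (2, 'GB'), (3, 'United Kingdom'), (4, 'UKRAINE'), (5, 'India');
""")

# Compare what each approach would write, without modifying the table.
comparison = conn.execute("""
SELECT Country,
       CASE WHEN Country IN ('UK', 'GB') THEN 'United Kingdom' ELSE Country END AS via_case,
       REPLACE(REPLACE(Country, 'UK', 'United Kingdom'), 'GB', 'United Kingdom') AS via_replace
FROM Customers
ORDER BY CustomerID
""").fetchall()

for original, via_case, via_replace in comparison:
    print(f"{original!r:18} CASE -> {via_case!r:18} REPLACE -> {via_replace!r}")
```

CASE compares the whole value, so 'UKRAINE' is left alone, while the nested REPLACE() rewrites the 'UK' substring inside it. Previewing an UPDATE as a SELECT like this is a cheap safety check before changing data.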
• Q.279
Question
Fix Number Format Issues in Text Column
Write an SQL query to fix the number format in the Amount column of the Transactions
table, which is stored as text. The column contains numeric values with commas as thousand
separators (e.g., '1,000', '1,500.50'). Remove the commas and standardize the format to store
the number as a plain text value (e.g., '1000', '1500.50').
Explanation
Use the UPDATE statement combined with the REPLACE() function to remove the commas
from the Amount column, while keeping the column as text. The goal is to standardize the
format without changing the column type.
Learnings
• Using string functions like REPLACE() to manipulate text data.
• Removing non-numeric characters while keeping the original data type as text.
• Handling number format inconsistencies stored in text columns.
Solutions
PostgreSQL solution
PostgreSQL allows the use of the REPLACE() function to remove unwanted characters.
UPDATE Transactions
SET Amount = REPLACE(Amount, ',', '');
• The REPLACE() function removes all commas from the Amount column, leaving the
number in a standardized format.
MySQL solution
MySQL also supports the REPLACE() function for text manipulation.
UPDATE Transactions
SET Amount = REPLACE(Amount, ',', '');
• Similar to PostgreSQL, this REPLACE() function removes commas from the Amount
column, ensuring the number is correctly formatted as text.
Notes
• The Amount column is kept as text, but the number format is corrected by removing
commas.
• This solution does not convert the text to an actual numeric type but ensures that the
formatting is consistent.
• Q.280
Question
Remove Invalid Email Domains
Write an SQL query to delete all customer records from the Customers table where the
Email column contains an invalid domain, such as 'example.com' or 'fake.com'.
Explanation
Use the DELETE statement combined with the WHERE clause and LIKE or regular expressions to
filter out records with specific invalid email domains. You can identify invalid domains by
using pattern matching to check for email addresses ending in the undesired domains
('example.com' or 'fake.com').
-- Table creation
CREATE TABLE Customers (
CustomerID INT,
CustomerName VARCHAR(255),
Email VARCHAR(255)
);
-- Datasets
INSERT INTO Customers (CustomerID, CustomerName, Email)
VALUES
(1, 'John Doe', '[email protected]'),
(2, 'Jane Smith', '[email protected]'),
(3, 'Rajesh Kumar', '[email protected]'),
(4, 'Neha Gupta', '[email protected]'),
(5, 'Ravi Mehra', '[email protected]');
Learnings
• Deleting records based on string pattern matching in SQL.
• Using LIKE or regular expressions to filter records by specific conditions.
• Understanding how to manage email-related data by identifying invalid or unwanted
domains.
Solutions
PostgreSQL solution
PostgreSQL supports regular expressions with the ~ operator to match patterns.
DELETE FROM Customers
WHERE Email ~* '@(example\.com|fake\.com)$';
• The regular expression @(example\.com|fake\.com)$ matches email addresses that end
with either 'example.com' or 'fake.com'.
• The ~* operator performs a case-insensitive match.
Alternatively, using LIKE:
DELETE FROM Customers
WHERE Email LIKE '%@example.com' OR Email LIKE '%@fake.com';
• The LIKE operator checks if the email address ends with the specified invalid domains.
MySQL solution
In MySQL, you can use the REGEXP operator for regular expressions.
DELETE FROM Customers
WHERE Email REGEXP '@(example\\.com|fake\\.com)$';
• The regular expression @(example\.com|fake\.com)$ matches email addresses ending
with 'example.com' or 'fake.com'.
• The REGEXP operator allows pattern matching in MySQL.
Alternatively, using LIKE:
DELETE FROM Customers
WHERE Email LIKE '%@example.com' OR Email LIKE '%@fake.com';
• The LIKE operator in MySQL works similarly to the PostgreSQL version for matching
specific email domains.
Notes
• Regular expressions provide more flexibility and precision, especially when checking
complex patterns.
• Both REGEXP (MySQL) and ~ (PostgreSQL) are useful for pattern-based filtering, while
LIKE is simpler but may be less flexible for complex patterns.
• Be cautious when using LIKE with % as it may lead to partial matches in unintended places.
Regular expressions allow more control over the pattern matching.
Questions By Company
Amazon
• Q.281
Question
Write an SQL query to find all dates' id with a higher temperature compared to its previous
day's temperature.
Explanation
We need to compare the temperature of each day with the temperature of the previous day.
This can be done by using a self-join or a window function to get the temperature of the
previous day for each row, then filtering those rows where the temperature is higher than the
previous day's temperature.
Learnings
• Self-joins to compare a row with its previous row.
• Use of date comparison to match consecutive records.
• Understanding the use of LAG() or self-join techniques for handling consecutive records.
Solutions
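-- PostgreSQL solution (using LAG() function)
-- LAG() must run over the whole Weather table; the previous row's date is also
-- fetched so that only consecutive days are compared (recordDate is assumed to be a DATE).
SELECT id
FROM (
SELECT id, temperature, recordDate,
LAG(temperature) OVER (ORDER BY recordDate) AS prev_temp,
LAG(recordDate) OVER (ORDER BY recordDate) AS prev_date
FROM Weather
) w
WHERE temperature > prev_temp
AND recordDate - prev_date = 1;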
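The LAG() approach described in the explanation can be tried out against an in-memory SQLite database from Python. This is a sketch under stated assumptions: SQLite (3.25 or later, for window functions) stands in for PostgreSQL, dates are stored as ISO text and compared with julianday(), and the sample rows are invented, with a one-day gap before the last row on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Weather (id INTEGER PRIMARY KEY, recordDate TEXT, temperature INTEGER);
INSERT INTO Weather (id, recordDate, temperature) VALUES
  (1, '2015-01-01', 10),
  (2, '2015-01-02', 25),
  (3, '2015-01-03', 20),
  (4, '2015-01-05', 30);
""")

warmer_ids = [row[0] for row in conn.execute("""
SELECT id
FROM (
    SELECT id, temperature, recordDate,
           LAG(temperature) OVER (ORDER BY recordDate) AS prev_temp,
           LAG(recordDate)  OVER (ORDER BY recordDate) AS prev_date
    FROM Weather
)
WHERE temperature > prev_temp
  AND julianday(recordDate) - julianday(prev_date) = 1
ORDER BY id
""")]
print(warmer_ids)
```

Row 4 is warmer than the previous row, but that row is two days earlier, so it is excluded. Without the date check, a gap in the data would be treated as "the previous day".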
Explanation
We need to generate a sequence of numbers from 1 to the maximum customer_id, and then
find which numbers do not exist in the Customers table. This can be done by comparing the
generated sequence with the customer_id values in the table. The MAX() function will help
identify the highest customer_id, and then a range of numbers can be generated to identify
the missing IDs.
Learnings
• Generating a range of numbers using JOIN or WITH clause.
• Using NOT IN or LEFT JOIN to filter missing records.
• Understanding how to dynamically calculate ranges and compare them to table values.
Solutions
-- PostgreSQL solution (using generate_series() to create a range of numbers)
SELECT series.ids
FROM generate_series(1, (SELECT MAX(customer_id) FROM Customers)) AS series(ids)
WHERE series.ids NOT IN (SELECT customer_id FROM Customers)
ORDER BY series.ids;
-- MySQL solution (using a recursive CTE to generate the sequence of numbers)
WITH RECURSIVE NumberSequence AS (
SELECT 1 AS ids
UNION ALL
SELECT ids + 1
FROM NumberSequence
WHERE ids < (SELECT MAX(customer_id) FROM Customers)
)
SELECT ids
FROM NumberSequence
WHERE ids NOT IN (SELECT customer_id FROM Customers)
ORDER BY ids;
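The recursive-CTE version can be exercised in an in-memory SQLite database, which supports WITH RECURSIVE with the same shape as the MySQL query. This is a minimal sketch; the customer_name column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
INSERT INTO Customers (customer_id, customer_name) VALUES
  (1, 'Alice'), (4, 'Bob'), (5, 'Charlie');
""")

missing = [row[0] for row in conn.execute("""
WITH RECURSIVE NumberSequence(ids) AS (
    SELECT 1
    UNION ALL
    SELECT ids + 1
    FROM NumberSequence
    WHERE ids < (SELECT MAX(customer_id) FROM Customers)
)
SELECT ids
FROM NumberSequence
WHERE ids NOT IN (SELECT customer_id FROM Customers)
ORDER BY ids
""")]
print(missing)
```

One design point to keep in mind: NOT IN returns no rows at all if the subquery produces a NULL, so when customer_id can be NULL a NOT EXISTS or LEFT JOIN ... IS NULL filter is the safer choice.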
• Q.283
Question
Write an SQL query to show the second most recent activity of each user. If the user only has
one activity, return that one.
Explanation
We need to identify the second most recent activity for each user based on the startDate of
each activity. If a user has only one activity, we should return that single activity. To achieve
this, we can use the ROW_NUMBER() window function to rank the activities per user, then filter
for the second most recent one. If there's only one activity, we return that record.
Learnings
• Use of window functions (ROW_NUMBER()) to rank records based on a specific order (in this
case, by startDate).
• Handling cases where a user has only one record.
• Using conditional logic to return the correct result when there are fewer than two activities.
Solutions
-- PostgreSQL solution (using ROW_NUMBER() to rank activities per user)
WITH RankedActivities AS (
SELECT username, activity, startDate, endDate,
ROW_NUMBER() OVER (PARTITION BY username ORDER BY startDate DESC) AS rn
FROM UserActivity
)
SELECT username, activity, startDate, endDate
FROM RankedActivities
WHERE rn = 2
UNION ALL
SELECT username, activity, startDate, endDate
FROM RankedActivities
WHERE rn = 1
AND username NOT IN (SELECT username FROM RankedActivities WHERE rn = 2);
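A more compact variant of the same idea adds a per-user COUNT(*) window, so the "only one activity" case becomes a simple OR condition instead of a UNION ALL. The sketch below runs it in an in-memory SQLite database (3.25 or later) as a stand-in for PostgreSQL; the cnt column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserActivity (username TEXT, activity TEXT, startDate TEXT, endDate TEXT);
INSERT INTO UserActivity (username, activity, startDate, endDate) VALUES
  ('Alice', 'Travel',  '2020-02-12', '2020-02-20'),
  ('Alice', 'Dancing', '2020-02-21', '2020-02-23'),
  ('Alice', 'Travel',  '2020-02-24', '2020-02-28'),
  ('Bob',   'Travel',  '2020-02-11', '2020-02-18');
""")

second_latest = conn.execute("""
WITH RankedActivities AS (
    SELECT username, activity, startDate, endDate,
           ROW_NUMBER() OVER (PARTITION BY username ORDER BY startDate DESC) AS rn,
           COUNT(*)     OVER (PARTITION BY username) AS cnt
    FROM UserActivity
)
SELECT username, activity, startDate, endDate
FROM RankedActivities
WHERE rn = 2 OR cnt = 1
ORDER BY username
""").fetchall()
print(second_latest)
```

Alice has three activities, so her second most recent one is returned; Bob has a single activity, so that row is returned as-is.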
-- MySQL solution (using ROW_NUMBER() and IFNULL() for similar functionality)
WITH RankedActivities AS (
Explanation
We need to calculate the total distance travelled by each user, which requires summing the
distance from the Rides table grouped by user_id. Additionally, we need to include users
who have no rides, which can be handled by performing a LEFT JOIN between the Users and
Rides tables. The result should be ordered first by the total distance in descending order and
then by name in ascending order in case of ties.
Learnings
• Using LEFT JOIN to include all users, even those without a corresponding record in the
Rides table.
• Grouping and aggregating data using SUM() to calculate the total distance travelled by each
user.
• Sorting results based on multiple columns (ORDER BY).
Solutions
-- PostgreSQL / MySQL solution (using LEFT JOIN and GROUP BY)
SELECT u.name, COALESCE(SUM(r.distance), 0) AS travelled_distance
FROM Users u
LEFT JOIN Rides r ON u.id = r.user_id
GROUP BY u.id, u.name
ORDER BY travelled_distance DESC, u.name ASC;
In this solution:
• The LEFT JOIN ensures that all users are included, even those without any rides.
• The SUM(r.distance) calculates the total distance each user has travelled.
• The COALESCE() function is used to return 0 for users who have no corresponding ride
data.
• The result is ordered first by travelled_distance in descending order, then by name in
ascending order for tie-breaking.
• Q.285
Question
Write an SQL query to report the current balance of each user after performing transactions,
along with a flag indicating whether the user has breached their credit limit (i.e., if their
balance is less than 0). The result should show user_id, user_name, credit, and
credit_limit_breached ("Yes" or "No").
Explanation
To calculate the current balance of each user, we need to account for both incoming and
outgoing transactions:
• For every transaction, we need to subtract the amount from the credit of the user who
paid (paid_by) and add the amount to the credit of the user who received (paid_to).
• After calculating the new balance, check if the balance is less than 0. If so, mark
credit_limit_breached as "Yes"; otherwise, it should be "No".
• The result should report each user's updated balance after transactions; the solution below
names this column current_balance, whereas the question calls it credit.
Learnings
• Using LEFT JOIN to include users who have not participated in any transactions.
• Aggregating the changes to user balances based on transactions.
• Using conditional logic (CASE) to check whether the credit limit is breached.
Solutions
-- PostgreSQL / MySQL solution (calculating the balance and checking the credit breach)
SELECT u.user_id,
u.user_name,
u.credit
- COALESCE(SUM(CASE WHEN t.paid_by = u.user_id THEN t.amount ELSE 0 END), 0)
+ COALESCE(SUM(CASE WHEN t.paid_to = u.user_id THEN t.amount ELSE 0 END), 0) AS current_balance,
CASE
WHEN u.credit
- COALESCE(SUM(CASE WHEN t.paid_by = u.user_id THEN t.amount ELSE 0 END), 0)
+ COALESCE(SUM(CASE WHEN t.paid_to = u.user_id THEN t.amount ELSE 0 END), 0) < 0 THEN 'Yes'
ELSE 'No'
END AS credit_limit_breached
FROM Users u
LEFT JOIN Transactions t ON u.user_id = t.paid_by OR u.user_id = t.paid_to
GROUP BY u.user_id, u.user_name, u.credit;
In this solution:
• We calculate the current balance by adjusting the user's credit based on both outgoing
(paid_by) and incoming (paid_to) transactions using SUM() and CASE statements.
• COALESCE() ensures that users with no transactions are still included (i.e., their balance
remains unchanged).
• The CASE expression checks whether the calculated balance is less than 0 to determine if
the credit limit is breached.
• The query has no ORDER BY, so the row order is not guaranteed; add ORDER BY u.user_id if a
stable order is needed.
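The balance calculation can be checked on a small dataset in an in-memory SQLite database. In this sketch the balance expression is computed once in a subquery and reused for the flag, which avoids repeating the long expression; the trans_id column, the alias b, and the sample rows are assumptions for illustration, and SQLite stands in for PostgreSQL/MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (user_id INTEGER PRIMARY KEY, user_name TEXT, credit INTEGER);
CREATE TABLE Transactions (trans_id INTEGER PRIMARY KEY, paid_by INTEGER,
                           paid_to INTEGER, amount INTEGER);
INSERT INTO Users (user_id, user_name, credit) VALUES
  (1, 'Asha', 100), (2, 'Ben', 200), (3, 'Chen', 10000), (4, 'Dana', 800);
INSERT INTO Transactions (trans_id, paid_by, paid_to, amount) VALUES
  (1, 1, 3, 400), (2, 3, 2, 500), (3, 2, 1, 200);
""")

balances = conn.execute("""
SELECT user_id, user_name, current_balance,
       CASE WHEN current_balance < 0 THEN 'Yes' ELSE 'No' END AS credit_limit_breached
FROM (
    SELECT u.user_id, u.user_name,
           u.credit
             - COALESCE(SUM(CASE WHEN t.paid_by = u.user_id THEN t.amount ELSE 0 END), 0)
             + COALESCE(SUM(CASE WHEN t.paid_to = u.user_id THEN t.amount ELSE 0 END), 0)
             AS current_balance
    FROM Users u
    LEFT JOIN Transactions t ON u.user_id = t.paid_by OR u.user_id = t.paid_to
    GROUP BY u.user_id, u.user_name, u.credit
) AS b
ORDER BY user_id
""").fetchall()
print(balances)
```

User 1 pays 400 and receives 200 on a starting credit of 100, ending at -100 and flagged 'Yes'. User 4 has no transactions, so the LEFT JOIN keeps the row and the balance stays at 800.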
• Q.286
Question
Write an SQL query to find the most frequently ordered product(s) for each customer. Return
the product_id and product_name for each customer_id who ordered at least one product.
If there are multiple most frequently ordered products for a customer, return all of them. The
result table should be ordered by customer_id.
Explanation
To solve this problem:
• We need to join the Orders table with the Products table to get product details (such as
product_name) for each order.
• Then, we need to count how many times each customer ordered each product using GROUP
BY on customer_id and product_id.
• For each customer, we need to find the most frequently ordered product(s). If there are
multiple products with the same frequency, we need to include all of them.
• The result should be ordered by customer_id to meet the requirement.
Learnings
• Using JOIN to combine related information from multiple tables (Orders and Products).
• Using GROUP BY to aggregate orders by customer and product.
• Using HAVING or window functions (ROW_NUMBER(), RANK()) to filter for the most frequent
items per customer.
• Handling cases where there are multiple products with the same frequency.
Solutions
-- PostgreSQL / MySQL solution (finding the most frequently ordered products for each customer)
WITH ProductFrequency AS (
SELECT o.customer_id,
o.product_id,
p.product_name,
COUNT(*) AS frequency
FROM Orders o
JOIN Products p ON o.product_id = p.product_id
GROUP BY o.customer_id, o.product_id, p.product_name
),
MaxFrequency AS (
SELECT customer_id,
MAX(frequency) AS max_frequency
FROM ProductFrequency
GROUP BY customer_id
)
SELECT pf.customer_id,
pf.product_id,
pf.product_name
FROM ProductFrequency pf
JOIN MaxFrequency mf ON pf.customer_id = mf.customer_id
WHERE pf.frequency = mf.max_frequency
ORDER BY pf.customer_id;
This solution should efficiently solve the problem of identifying the most frequently ordered
products for each customer, considering cases where multiple products might be equally
frequent for a customer.
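The Learnings mention window functions as an alternative to the MAX() join. A sketch of that variant is shown below, using RANK() so that ties are all kept (ROW_NUMBER() would keep only one row per customer). It runs in an in-memory SQLite database (3.25 or later) as a stand-in for PostgreSQL/MySQL; the order_id column, the freq_rank alias, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, product_id INTEGER);
INSERT INTO Products (product_id, product_name) VALUES
  (1, 'keyboard'), (2, 'mouse'), (3, 'screen');
INSERT INTO Orders (order_id, customer_id, product_id) VALUES
  (1, 1, 1), (2, 1, 2), (3, 1, 1), (4, 2, 2), (5, 2, 3);
""")

top_products = conn.execute("""
WITH ProductFrequency AS (
    SELECT o.customer_id, o.product_id, p.product_name, COUNT(*) AS frequency
    FROM Orders o
    JOIN Products p ON o.product_id = p.product_id
    GROUP BY o.customer_id, o.product_id, p.product_name
),
Ranked AS (
    SELECT customer_id, product_id, product_name,
           RANK() OVER (PARTITION BY customer_id ORDER BY frequency DESC) AS freq_rank
    FROM ProductFrequency
)
SELECT customer_id, product_id, product_name
FROM Ranked
WHERE freq_rank = 1
ORDER BY customer_id, product_id
""").fetchall()
print(top_products)
```

Customer 1 ordered the keyboard twice, so only that product is returned; customer 2 ordered two products once each, so both rows come back.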
• Q.287
Question
Write an SQL query to report the number of bank accounts in each salary category. The
salary categories are:
• "Low Salary": All salaries strictly less than $20,000.
• "Average Salary": All salaries in the inclusive range [$20,000, $50,000].
• "High Salary": All salaries strictly greater than $50,000.
The result table must contain all three categories. If there are no accounts in a category, report
0. Return the result table in any order.
Explanation
To solve this problem, we can categorize the income values into three groups based on the
given salary ranges. We will use CASE expressions to count the number of accounts falling
into each category. If there are no accounts for a given category, the result should be 0 for
that category.
Steps:
• Use CASE expressions to categorize the incomes into the three groups: "Low Salary",
"Average Salary", and "High Salary".
• Use COUNT to count the number of accounts in each category.
• Make sure to include all categories, even if there are no accounts in a particular category
(using UNION ALL to ensure we always get a row for each category).
Learnings
• Using CASE to classify data into different categories.
• Using COUNT to count the number of records for each category.
• Handling cases with no records in a category using UNION ALL.
Solutions
-- PostgreSQL / MySQL solution (counting the number of accounts in each salary category)
SELECT 'Low Salary' AS category,
COUNT(*) AS accounts_count
FROM Accounts
WHERE income < 20000
UNION ALL
SELECT 'Average Salary',
COUNT(*)
FROM Accounts
WHERE income BETWEEN 20000 AND 50000
UNION ALL
SELECT 'High Salary',
COUNT(*)
FROM Accounts
WHERE income > 50000;
This solution guarantees that all salary categories will appear in the result, even if some
categories have no accounts.
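The three-branch UNION ALL pattern described in the steps can be run in full against an in-memory SQLite database. This is a minimal sketch: the account_id column and the sample incomes are assumptions, chosen so that the boundary values 20000 and 50000 are covered and the "High Salary" category is empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Accounts (account_id INTEGER PRIMARY KEY, income INTEGER);
INSERT INTO Accounts (account_id, income) VALUES
  (1, 15000), (2, 20000), (3, 50000), (4, 18000);
""")

rows = conn.execute("""
SELECT 'Low Salary' AS category, COUNT(*) AS accounts_count
FROM Accounts WHERE income < 20000
UNION ALL
SELECT 'Average Salary', COUNT(*)
FROM Accounts WHERE income BETWEEN 20000 AND 50000
UNION ALL
SELECT 'High Salary', COUNT(*)
FROM Accounts WHERE income > 50000
""").fetchall()

counts = dict(rows)
print(counts)
```

Each branch is an aggregate without GROUP BY, so it always returns exactly one row, even when no account matches. That is why the empty category still appears with a count of 0, which a single GROUP BY on a CASE expression would not give.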
• Q.288
Question
Write an SQL query to find the name of the product with the highest price in each country.
Explanation
You need to find the product with the highest price for each country. This involves joining
the two tables (Product and Supplier) on Supplier_id, then grouping by the Country
column. After that, you should use an aggregation function or window function to select the
highest priced product for each country.
Datasets and SQL Schemas
-- Table creation and sample data
CREATE TABLE suppliers (
supplier_id INT PRIMARY KEY,
supplier_name VARCHAR(25),
country VARCHAR(25)
);
);
Learnings
• How to join tables using a foreign key (supplier_id).
• Use of aggregation (MAX) and grouping by Country to find the highest priced product for
each country.
• Importance of dealing with subqueries or window functions to isolate the highest value.
Solutions
-- PostgreSQL and MySQL solution
-- RANK is a reserved word in MySQL 8.0, so the column is aliased as price_rank.
WITH RankedProducts AS (
SELECT
s.country,
p.product_name,
p.price,
RANK() OVER (PARTITION BY s.country ORDER BY p.price DESC) AS price_rank
FROM products p
JOIN suppliers s ON p.supplier_id = s.supplier_id
)
SELECT country, product_name
FROM RankedProducts
WHERE price_rank = 1
ORDER BY country;
• Q.289
Question
Write an SQL query to calculate the total transaction amount for each customer for the
current year. The output should contain Customer_Name and the total amount.
Explanation
You need to calculate the total transaction amount for each customer for the current year. To
do this:
• Join the Customer and Transaction tables on Customer_id.
• Filter the transactions to only include those from the current year by using the
EXTRACT(YEAR FROM date) function.
• Group the results by Customer_Name and aggregate the total amount for each customer
using SUM().
Datasets and SQL Schemas
-- Table creation and sample data
-- Create Customer table
CREATE TABLE Customers (
Customer_id INT PRIMARY KEY,
Customer_Name VARCHAR(100),
Registration_Date DATE
);
Learnings
• How to filter data based on the current year using EXTRACT(YEAR FROM date) or
equivalent functions.
• Using JOIN to combine information from two related tables (Customer and Transaction).
• Aggregating data using SUM() and grouping by customer.
Solutions
-- PostgreSQL solution (using EXTRACT to filter the current year)
SELECT
c.customer_name,
SUM(t.amount) AS total_amt
FROM customers AS c
JOIN transaction AS t ON c.customer_id = t.customer_id
WHERE EXTRACT(YEAR FROM t.transaction_date) = EXTRACT(YEAR FROM CURRENT_DATE)
GROUP BY c.customer_name;
-- MySQL solution (using YEAR() to extract the year)
SELECT
c.customer_name,
SUM(t.amount) AS total_amt
FROM customers AS c
JOIN transaction AS t ON c.customer_id = t.customer_id
WHERE YEAR(t.transaction_date) = YEAR(CURRENT_DATE)
GROUP BY c.customer_name;
• Q.290
Question
Write a SQL query to get the average review ratings for every product every month. The
output should include the month in numerical value, product id, and average star rating
rounded to two decimal places. Sort the output based on the month followed by the product
id.
Explanation
The goal is to calculate the average star rating for each product on a monthly basis. This
involves:
• Extracting the month and year from the submit_date in the reviews table.
• Grouping the results by product and month.
• Calculating the average rating (stars) for each group.
• Sorting the output by month and product id.
• Rounding the average rating to two decimal places.
Datasets and SQL Schemas
-- Table creation and sample data
-- Create reviews table
CREATE TABLE reviews (
review_id INT PRIMARY KEY,
user_id INT,
submit_date TIMESTAMP,
product_id INT,
stars INT
);
Learnings
• Extracting the month and year from a TIMESTAMP or DATE column.
• Using GROUP BY to aggregate data by month and product.
• Applying AVG() to calculate the average star rating.
• Rounding the result to two decimal places using ROUND().
• Sorting the output by month and product ID.
Solutions
-- PostgreSQL and MySQL solution
SELECT
EXTRACT(MONTH FROM submit_date) AS mth,
product_id AS product,
ROUND(AVG(stars), 2) AS avg_stars
FROM reviews
GROUP BY mth, product
ORDER BY mth, product;
• Q.291
Question
Write a SQL query to find the highest-grossing items. Identify the top two highest-grossing
products within each category in 2022. Output the category, product, and total spend.
Explanation
The task is to find the top two highest-grossing products within each category for the year
2022. To achieve this:
• Filter the transactions that occurred in 2022.
• Aggregate the total spend for each product within each category using SUM().
• Use a window function like RANK() or ROW_NUMBER() to rank products within each
category by total spend.
• Select only the top two highest-grossing products for each category.
• Sort the results by category and the total spend in descending order.
Datasets and SQL Schemas
-- Table creation and sample data
-- Create product_spend table
CREATE TABLE product_spend (
category VARCHAR(50),
product VARCHAR(50),
user_id INT,
spend DECIMAL(10, 2),
transaction_date TIMESTAMP
);
Learnings
• How to filter data by a specific year using EXTRACT(YEAR FROM ...) or equivalent
functions.
• Aggregating data by category and product using SUM().
• Using RANK() or ROW_NUMBER() to rank the products based on total spend within each
category.
• Sorting the result by total spend and selecting the top two products per category.
Solutions
-- PostgreSQL and MySQL solution (using RANK() to rank products)
-- RANK is a reserved word in MySQL 8.0, so the column is aliased as spend_rank.
WITH RankedProducts AS (
SELECT
category,
product,
SUM(spend) AS total_spend,
RANK() OVER (PARTITION BY category ORDER BY SUM(spend) DESC) AS spend_rank
FROM product_spend
WHERE EXTRACT(YEAR FROM transaction_date) = 2022
GROUP BY category, product
)
SELECT category, product, total_spend
FROM RankedProducts
WHERE spend_rank <= 2
ORDER BY category, total_spend DESC;
• Sorting the result: ORDER BY category, total_spend DESC sorts the output first by
category and then by total spend in descending order.
This query will return the top two products per category with their total spend in 2022, sorted
as required.
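The filter, aggregate, rank, and select steps can be checked end to end in an in-memory SQLite database. This sketch makes a few substitutions that should be read as assumptions: SQLite (3.25 or later) stands in for PostgreSQL/MySQL, the year is extracted with strftime() because SQLite has no EXTRACT(), the aggregation and ranking are split into two CTEs, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_spend (category TEXT, product TEXT, user_id INTEGER,
                            spend REAL, transaction_date TEXT);
INSERT INTO product_spend (category, product, user_id, spend, transaction_date) VALUES
  ('appliance',   'refrigerator',    1, 250.0, '2022-03-01 10:00:00'),
  ('appliance',   'refrigerator',    2, 300.0, '2022-05-01 10:00:00'),
  ('appliance',   'washing machine', 3, 220.0, '2022-04-01 10:00:00'),
  ('appliance',   'microwave',       4, 100.0, '2022-06-01 10:00:00'),
  ('appliance',   'microwave',       4, 900.0, '2021-06-01 10:00:00'),
  ('electronics', 'vacuum',          5, 190.0, '2022-07-01 10:00:00');
""")

top_two = conn.execute("""
WITH Totals AS (
    SELECT category, product, SUM(spend) AS total_spend
    FROM product_spend
    WHERE strftime('%Y', transaction_date) = '2022'
    GROUP BY category, product
),
Ranked AS (
    SELECT category, product, total_spend,
           RANK() OVER (PARTITION BY category ORDER BY total_spend DESC) AS spend_rank
    FROM Totals
)
SELECT category, product, total_spend
FROM Ranked
WHERE spend_rank <= 2
ORDER BY category, total_spend DESC
""").fetchall()
print(top_two)
```

The 900.0 microwave purchase is from 2021 and is filtered out before aggregation, so the microwave ranks third in its category and is dropped. With RANK(), a tie for second place would return more than two rows for that category; ROW_NUMBER() would cap it at two.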
• Q.292
Question
Write a SQL query to identify high-spending customers who have made purchases exceeding
$100. You have two tables called customers (containing customer_id and name) and
orders (containing order_id, customer_id, and order_amount). The task is to join these
tables and calculate the total purchase amount for each customer, selecting customers whose
total purchase amount exceeds $100.
Explanation
The task requires identifying customers who have made purchases exceeding $100 by:
• Joining the customers and orders tables using the customer_id.
• Aggregating the total purchase amount for each customer using SUM().
• Filtering the result to include only customers whose total purchase amount exceeds $100
using the HAVING clause.
• Grouping the result by customer_id and name.
Datasets and SQL Schemas
-- Table creation and sample data
-- Create customers table
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100)
);
Learnings
• How to join two tables using a common key (customer_id).
• How to use SUM() to calculate the total purchase amount for each customer.
• How to filter results based on aggregated values using the HAVING clause.
• The importance of grouping by relevant columns (customer_id and name).
Solutions
-- PostgreSQL and MySQL solution
SELECT c.customer_id, c.name
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
HAVING SUM(o.order_amount) > 100;
This query correctly identifies high-spending customers based on the total amount spent in
the orders table.
• Q.293
Write a SQL query to generate a histogram showing the count of comments made by each
user. You are given two tables, users and comments. The users table contains information
about users, and the comments table contains comments made by users. The task is to
calculate the number of comments each user has made and sort the results in descending
order of comment count to identify the most active users.
Explanation
To calculate the number of comments made by each user and generate a histogram of their
activity:
• Join the users table and the comments table on the user_id.
• Count the number of comments for each user using the COUNT() function.
• Group the results by user_id and name to aggregate the comment counts.
• Sort the results in descending order by the comment count, showing the most active users
first.
Learnings
• How to perform a LEFT JOIN to include all users, even those who haven't made any
comments.
• Using the COUNT() function to count the number of comments for each user.
• Grouping by multiple columns (user_id, name) to aggregate data.
• Sorting the results using ORDER BY to display users with the highest comment count first.
Solution
-- PostgreSQL and MySQL solution
SELECT u.user_id, u.name, COUNT(c.comment_id) AS comment_count
FROM users u
LEFT JOIN comments c ON u.user_id = c.user_id
GROUP BY u.user_id, u.name
ORDER BY comment_count DESC;
• GROUP BY u.user_id, u.name: The GROUP BY clause aggregates the results by user, so
that we can calculate the total comment count for each user.
• ORDER BY comment_count DESC: The ORDER BY clause sorts the result by the
comment_count in descending order to show the most active users first.
Expected Output:
user_id | name | comment_count
--------|---------|---------------
1 | Alice | 5
2 | Bob | 4
3 | Charlie | 2
4 | David | 1
• Q.294
Write a SQL query to determine the daily aggregate count of new users and the cumulative
count of users over time. You are given a users table with the columns user_id and
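registration_date. The task is to generate a report showing:
• The count of new users per day (new_users).
• The cumulative count of users up to each registration date (cumulative_count).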
Explanation
To generate the required report:
• Group by registration_date to count how many users registered on each specific day.
• Use COUNT(user_id) to count the number of new users each day.
• Use a window function (SUM() with OVER clause) to calculate the cumulative count of
users by summing up the number of new users up to the current date.
• Order by registration_date to ensure the result is sorted by the date of registration.
Learnings
• How to use COUNT() for counting new users for each day.
Solution
-- PostgreSQL and MySQL solution
SELECT
registration_date,
COUNT(user_id) AS new_users,
SUM(COUNT(user_id)) OVER (ORDER BY registration_date) AS cumulative_count
FROM
users
GROUP BY
registration_date
ORDER BY
registration_date;
Expected Output:
registration_date | new_users | cumulative_count
------------------|-----------|-----------------
2024-01-01 | 2 | 2
2024-01-02 | 2 | 4
2024-01-03 | 3 | 7
2024-01-04 | 3 | 10
Summary:
This query helps track user registration trends by counting the new users per day and
calculating the cumulative number of users over time. The use of window functions (SUM()
OVER()) makes it easy to compute the cumulative count dynamically as the query processes
each date in the dataset.
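The running total can be reproduced on a few rows in an in-memory SQLite database. In this sketch the daily counts are computed in a CTE first and the window SUM() is applied to them in a second step, which gives the same result as nesting COUNT() inside SUM() OVER; SQLite (3.25 or later) stands in for PostgreSQL/MySQL, and the Daily CTE name and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, registration_date TEXT);
INSERT INTO users (user_id, registration_date) VALUES
  (1, '2024-01-01'), (2, '2024-01-01'),
  (3, '2024-01-02'), (4, '2024-01-02'),
  (5, '2024-01-03');
""")

report = conn.execute("""
WITH Daily AS (
    SELECT registration_date, COUNT(user_id) AS new_users
    FROM users
    GROUP BY registration_date
)
SELECT registration_date,
       new_users,
       SUM(new_users) OVER (ORDER BY registration_date) AS cumulative_count
FROM Daily
ORDER BY registration_date
""").fetchall()
print(report)
```

With an ORDER BY and no explicit frame, the window defaults to everything from the first row up to the current row, which is what turns SUM() into a running total.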
• Q.295
Write an SQL query to track daily user registrations and calculate the daily count of new
users and the cumulative count of users over time. The given users table contains user_id
and registration_date. Your goal is to generate a report showing:
• The count of new users per day (new_users).
• The cumulative count of users up to each registration date (cumulative_count).
The results should be ordered by registration_date.
Explanation
To generate the report:
• Group the data by registration_date to calculate the daily new user count.
• Use COUNT(user_id) to count the number of new users on each specific day.
• Use the window function SUM() OVER() to calculate the cumulative count of users up to
the current date.
• Order the results by registration_date to show the daily progression of user
registrations.
Learnings
• Using COUNT() to calculate the number of users on each day.
• Using SUM() OVER() to calculate the cumulative count over a range of dates.
• Grouping by registration_date to aggregate user registrations.
• Ordering the data by date to maintain chronological order.
Solution
• - PostgreSQL and MySQL solution
SELECT
registration_date,
COUNT(user_id) AS new_users,
SUM(COUNT(user_id)) OVER (ORDER BY registration_date) AS cumulative_count
FROM
users
GROUP BY
registration_date
ORDER BY
registration_date;
Summary:
This SQL query effectively tracks user registrations by showing both daily new user counts
and cumulative totals, helping to analyze user growth over time. The use of window functions
(like SUM() OVER()) ensures a dynamic and efficient way to calculate cumulative values
based on chronological ordering.
• Q.296
Write an SQL query to find the second-highest salary of employees in the Engineering
department. The query should retrieve the second-highest salary for employees specifically in
the Engineering department.
Explanation
To solve this problem:
• Join the employees table with the departments table on department_id to get the
relevant salary and department information.
• Filter by the Engineering department.
• Use the RANK() window function to assign a rank to each employee’s salary in descending
order within the Engineering department.
• Select the salary with a rank of 2 to identify the second-highest salary.
• Group the result by department_name to ensure you output the result at the department
level.
Learnings
• Using RANK() to rank employees' salaries in descending order within a specific
department.
• Filtering by rank to retrieve the second-highest salary.
• Window functions allow partitioning and ordering within groups (i.e., departments in this
case).
• JOIN operations to combine data from multiple tables.
Solution
• - PostgreSQL and MySQL solution
SELECT
    department_name,
    MAX(salary) AS second_highest_salary
FROM
    (
        SELECT
            d.department_name,
            e.salary,
            -- DENSE_RANK() rather than RANK(): if the top salary is tied, RANK() skips rank 2
            DENSE_RANK() OVER (PARTITION BY d.department_name ORDER BY e.salary DESC) AS salary_rank
        FROM
            employees e
            JOIN departments d ON e.department_id = d.department_id
        WHERE
            d.department_name = 'Engineering'
    ) ranked_salaries
WHERE
    salary_rank = 2
GROUP BY
    department_name;
Summary:
This SQL query identifies the second-highest salary in the Engineering department by
ranking salaries in descending order and keeping rank 2. The choice of ranking function
matters when salaries tie: RANK() leaves a gap after a tie, so if two employees share the
highest salary the ranks run 1, 1, 3 and no row has rank 2, whereas DENSE_RANK() numbers
distinct salaries consecutively and always yields the second-highest distinct salary. MAX()
with GROUP BY collapses the result to one row when several employees share that salary.
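A small sketch of how the two ranking functions number tied salaries. The eng_salaries CTE is hypothetical sample data with a tie for the highest salary.

```sql
WITH eng_salaries AS (  -- hypothetical Engineering salaries, tied at the top
    SELECT 90000 AS salary
    UNION ALL SELECT 90000
    UNION ALL SELECT 80000
    UNION ALL SELECT 70000
)
SELECT salary,
       RANK()       OVER (ORDER BY salary DESC) AS rank_value,
       DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank_value
FROM eng_salaries
ORDER BY salary DESC;
-- salary | rank_value | dense_rank_value
-- 90000  | 1          | 1
-- 90000  | 1          | 1
-- 80000  | 3          | 2
-- 70000  | 4          | 3
```

Filtering on rank_value = 2 returns nothing here, while dense_rank_value = 2 returns 80000.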
• Q.297
Write an SQL query to determine which manager oversees the largest team. The employees
table contains employee_id, manager_id, and department_id. Your task is to find the
manager_id and the team size (the number of employees they manage).
Explanation
To solve this problem:
• Group the data by manager_id to calculate the number of employees they manage.
• Use COUNT(employee_id) to count the number of employees under each manager.
• Order the results by team size in descending order to identify the manager with the
largest team.
• Limit the result to the top manager using LIMIT 1.
Learnings
• COUNT(employee_id): Used to calculate the number of employees managed by each
manager.
• GROUP BY manager_id: Aggregates the employee data based on the manager.
• ORDER BY team_size DESC: Ensures that the manager with the largest team is sorted at
the top.
• LIMIT 1: Restricts the result to the top manager only.
Solution
• - PostgreSQL and MySQL solution
SELECT
    manager_id,
    COUNT(employee_id) AS team_size
FROM
    employees
WHERE
    manager_id IS NOT NULL  -- employees without a manager would otherwise form a NULL "team"
GROUP BY
    manager_id
ORDER BY
    team_size DESC
LIMIT 1;
Summary:
This SQL query identifies the manager who oversees the largest team by grouping employees
by manager_id and counting the number of employees under each manager. The result is
sorted in descending order by team size, and the LIMIT 1 clause ensures only the manager
with the largest team is selected.
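LIMIT 1 returns a single row even when two managers tie for the largest team. If ties should be reported, one possible variant ranks the team sizes and keeps every manager at rank 1. The sample_employees rows below are hypothetical.

```sql
WITH sample_employees AS (      -- hypothetical data
    SELECT 2 AS employee_id, 1 AS manager_id
    UNION ALL SELECT 3, 1
    UNION ALL SELECT 4, 2
    UNION ALL SELECT 5, 2
    UNION ALL SELECT 1, NULL    -- top-level employee with no manager
),
team_sizes AS (
    SELECT manager_id, COUNT(employee_id) AS team_size
    FROM sample_employees
    WHERE manager_id IS NOT NULL
    GROUP BY manager_id
),
ranked AS (
    SELECT manager_id,
           team_size,
           RANK() OVER (ORDER BY team_size DESC) AS size_rank
    FROM team_sizes
)
SELECT manager_id, team_size
FROM ranked
WHERE size_rank = 1             -- keeps all managers tied for the largest team
ORDER BY manager_id;
-- returns managers 1 and 2, each with team_size 2
```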
• Q.298
Write an SQL query to generate a report of product names, sale years, and prices for each
sale from the sales table. The output should include the sale ID, product name, the year the
sale was made, and the price.
Explanation
To generate this report:
• Extract the year from the sale_date using the EXTRACT(YEAR FROM sale_date)
function.
• Select the necessary columns: sale_id, product_name, year (extracted from
sale_date), and price.
• Return all rows from the sales table to get a complete list of sales data with these details.
Learnings
• EXTRACT(YEAR FROM date): Used to extract the year from a date field.
• Selecting multiple columns: Ensures the report includes all relevant data, including sale
ID, product name, sale year, and price.
• Simple SELECT statement: Pulls data directly from the table.
Solution
• - PostgreSQL and MySQL solution
SELECT
sale_id,
product_name,
EXTRACT(YEAR FROM sale_date) AS year,
price
FROM
sales;
Expected Output:
sale_id | product_name | year | price
--------|--------------|------|------
101 | Laptop | 2024 | 1200.00
102 | Smartphone | 2024 | 800.00
103 | Tablet | 2024 | 600.00
104 | Laptop | 2024 | 1100.00
105 | Smartwatch | 2024 | 300.00
Summary:
This SQL query retrieves detailed sales information from the sales table, including the
product name, sale year (extracted from sale_date), and price for each sale. The EXTRACT()
function is used to extract the year from the sale date, and the output is structured to display
the required fields.
• Q.299
Write an SQL query to find the second most recent order date for each customer from the
Orders table. The table contains the following columns:
• OrderID
• CustomerID
• OrderDate
The output should display the CustomerID and the second most recent order date for each
customer.
Explanation
To find the second most recent order date for each customer:
• Order the data by OrderDate for each customer in descending order.
• Rank the rows for each customer using the ROW_NUMBER() or RANK() window function.
• Filter the results to keep only the rows with the second most recent order (i.e., where the
rank is 2).
Learnings
• Window Functions (ROW_NUMBER() or RANK()): Used to assign a rank or number to each
row within a partition (grouped by CustomerID), which helps in identifying the most recent
and second most recent records.
• Using WHERE to filter by rank: Once rows are ranked, filter to get the second most recent
record.
Solution
• - PostgreSQL and MySQL solution
WITH RankedOrders AS (
SELECT
CustomerID,
OrderDate,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate DESC) AS OrderRank
FROM Orders
)
SELECT
CustomerID,
OrderDate AS SecondMostRecentOrderDate
FROM RankedOrders
WHERE OrderRank = 2;
• Q.300
Write an SQL query to calculate the running total of sales for each brand, ordered by
sale_date. The brand_sales table contains the following columns:
• sale_id
• brand
• sale_date
• sale_amount
The output should display the brand, sale_id, sale_date, and the cumulative sum of
sale_amount for each brand in a running total.
Explanation
To calculate the running total of sales by each brand:
• Use the SUM() function along with a window function to calculate the cumulative sum of
sale_amount for each brand.
• Partition the data by brand to calculate the running total separately for each brand.
• Order the data by sale_date to ensure the running total is calculated chronologically for
each brand.
Learnings
• Window Functions (SUM() with OVER() clause): The SUM() function is used with the
OVER() clause to calculate the running total of sales. The PARTITION BY clause groups the
data by brand, and the ORDER BY clause ensures the sales are summed in chronological order.
• Cumulative sum: The running total is computed progressively for each brand, which is
useful for tracking sales growth over time.
Solution
• - PostgreSQL and MySQL solution
SELECT
brand,
sale_id,
sale_date,
sale_amount,
SUM(sale_amount) OVER (PARTITION BY brand ORDER BY sale_date) AS running_total
FROM
brand_sales
ORDER BY
brand, sale_date;
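One detail worth checking is what happens when a brand has two sales on the same date. With only ORDER BY sale_date, the default window frame treats rows with equal dates as peers, so they share one running total. The sketch below, on hypothetical sample_sales rows, compares that with an explicit ROWS frame and sale_id as a tie-breaker.

```sql
WITH sample_sales AS (  -- hypothetical data: two sales on the same day
    SELECT 1 AS sale_id, 'Acme' AS brand, DATE '2024-01-01' AS sale_date, 100 AS sale_amount
    UNION ALL SELECT 2, 'Acme', DATE '2024-01-01', 50
    UNION ALL SELECT 3, 'Acme', DATE '2024-01-02', 25
)
SELECT sale_id,
       sale_date,
       sale_amount,
       -- default frame: same-date rows are peers and get the same total
       SUM(sale_amount) OVER (PARTITION BY brand ORDER BY sale_date) AS total_default_frame,
       -- explicit ROWS frame with a tie-breaker: strictly row-by-row accumulation
       SUM(sale_amount) OVER (PARTITION BY brand ORDER BY sale_date, sale_id
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS total_row_by_row
FROM sample_sales
ORDER BY sale_date, sale_id;
-- sale_id | total_default_frame | total_row_by_row
-- 1       | 150                 | 100
-- 2       | 150                 | 150
-- 3       | 175                 | 175
```

Which behavior is wanted depends on whether the report is per day or per sale.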
Google
• Q.301
Question
Calculate the 3-month rolling average of total revenue from purchases, excluding returns
(negative amounts), grouped by year-month (YYYY-MM). Sort the result from the earliest to
the latest month.
Explanation
To solve this, you need to:
• Exclude negative purchase amounts (returns).
• Group data by year-month.
• Calculate the total revenue for each month.
• Compute the 3-month rolling average using a window function.
• Sort the results by year-month in ascending order.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE amazon_purchases (
created_at DATETIME,
purchase_amt BIGINT,
user_id BIGINT
);
Learnings
• Grouping by month using TO_CHAR() or DATE_FORMAT().
• Filtering out negative purchase values with WHERE purchase_amt > 0.
• Using window functions (AVG()) for rolling averages.
• Sorting by date to maintain chronological order.
Solutions
• - PostgreSQL solution
WITH monthly_revenue AS (
SELECT
TO_CHAR(created_at, 'YYYY-MM') AS month,
SUM(purchase_amt) AS total_revenue
FROM amazon_purchases
WHERE purchase_amt > 0
GROUP BY TO_CHAR(created_at, 'YYYY-MM')
),
rolling_avg AS (
SELECT
month,
        AVG(total_revenue) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS three_month_avg
FROM monthly_revenue
)
SELECT month, three_month_avg
FROM rolling_avg
ORDER BY month;
• - MySQL solution
WITH monthly_revenue AS (
SELECT
DATE_FORMAT(created_at, '%Y-%m') AS month,
SUM(purchase_amt) AS total_revenue
FROM amazon_purchases
WHERE purchase_amt > 0
GROUP BY DATE_FORMAT(created_at, '%Y-%m')
),
rolling_avg AS (
SELECT
month,
        AVG(total_revenue) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS three_month_avg
FROM monthly_revenue
)
SELECT month, three_month_avg
FROM rolling_avg
ORDER BY month;
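To see what the ROWS BETWEEN 2 PRECEDING AND CURRENT ROW frame computes, here is a minimal sketch on hypothetical monthly totals (the monthly_totals CTE and its figures are made up for illustration).

```sql
WITH monthly_totals AS (  -- hypothetical monthly revenue
    SELECT '2024-01' AS revenue_month, 300 AS total_revenue
    UNION ALL SELECT '2024-02', 600
    UNION ALL SELECT '2024-03', 900
    UNION ALL SELECT '2024-04', 1200
)
SELECT revenue_month,
       total_revenue,
       AVG(total_revenue) OVER (ORDER BY revenue_month
                                ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS three_month_avg
FROM monthly_totals
ORDER BY revenue_month;
-- three_month_avg (decimal formatting varies by database):
-- 2024-01: 300   (only one row in the frame)
-- 2024-02: 450   (300 and 600)
-- 2024-03: 600   (300, 600, 900)
-- 2024-04: 900   (600, 900, 1200)
```

The frame counts rows, not calendar months, so the first two months average over fewer than three values, and a month with no purchases would simply be absent rather than counted as zero.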
• Q.302
Find the fifth highest salary from the com_worker table without using TOP or LIMIT. Note:
Duplicate salaries should not be removed.
Explanation
To solve this, you can use a common table expression (CTE) and window functions such as
RANK() or DENSE_RANK() to rank the salaries in descending order. Then, you can filter for the
salary that has the rank of 5.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE com_worker (
worker_id BIGINT PRIMARY KEY,
department VARCHAR(25),
first_name VARCHAR(25),
last_name VARCHAR(25),
joining_date DATETIME,
salary BIGINT
);
Learnings
• Using window functions like RANK() or DENSE_RANK() to assign ranks based on salary.
• Handling duplicate values in ranking without filtering them out.
• Using CTEs for better query organization and readability.
• Filtering based on rank to find specific salary positions.
Solutions
• - PostgreSQL solution
WITH ranked_salaries AS (
SELECT
salary,
RANK() OVER (ORDER BY salary DESC) AS rank
FROM com_worker
)
SELECT salary
FROM ranked_salaries
WHERE rank = 5;
• - MySQL solution
WITH ranked_salaries AS (
    SELECT
        salary,
        -- RANK is a reserved word in MySQL 8.0, so the alias uses a different name
        RANK() OVER (ORDER BY salary DESC) AS salary_rank
    FROM com_worker
)
SELECT salary
FROM ranked_salaries
WHERE salary_rank = 5;
• Q.303
Find the top 3 most common letters across all words in both google_file_store and
google_word_lists tables (ignore the filename column in google_file_store). Output
the letter along with the number of occurrences, ordered in descending order by occurrences.
Explanation
To solve this, you need to:
• Extract all words from both tables.
• Break each word into individual characters.
• Count the frequency of each letter, ignoring spaces and non-alphabetic characters.
• Sort the results based on the number of occurrences, and select the top 3 most common
letters.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE google_file_store (
contents TEXT,
filename VARCHAR(255)
);
Learnings
• String manipulation and breaking down words into characters using REGEXP_REPLACE() or
similar functions.
• Using UNION ALL to combine results from multiple tables.
• Filtering out non-alphabetic characters to focus on letters only.
• Aggregating data using GROUP BY and ordering the result by frequency.
Solutions
• - PostgreSQL solution
WITH letter_counts AS (
    -- Strip non-letters from each word, then split what remains into single characters
    -- (string_to_array with a NULL delimiter returns one element per character)
    SELECT unnest(string_to_array(
               LOWER(regexp_replace(word, '[^a-zA-Z]', '', 'g')), NULL
           )) AS letter
    FROM (
        SELECT unnest(string_to_array(contents, ' ')) AS word FROM google_file_store
        UNION ALL
        SELECT unnest(string_to_array(words1, ' ')) FROM google_word_lists
        UNION ALL
        SELECT unnest(string_to_array(words2, ' ')) FROM google_word_lists
    ) AS all_words
)
SELECT letter, COUNT(*) AS occurrences
FROM letter_counts
WHERE letter <> ''
GROUP BY letter
ORDER BY occurrences DESC
LIMIT 3;
• - MySQL solution
WITH letter_counts AS (
-- Extract letters from both tables and count occurrences
Learnings
• Using PERCENTILE_CONT() to calculate percentiles.
Learnings
• Using GROUP BY to group data by multiple columns (department and salary).
• Counting occurrences with COUNT().
• Sorting results in descending order using ORDER BY.
Solutions
• - PostgreSQL solution
SELECT department, salary, COUNT(*) AS employee_count
FROM employees
GROUP BY department, salary
ORDER BY employee_count DESC;
• - MySQL solution
SELECT department, salary, COUNT(*) AS employee_count
FROM employees
GROUP BY department, salary
ORDER BY employee_count DESC;
• Q.306
Find the most common domain (excluding the www. prefix) used in email addresses from the
users table. Output the domain and the number of occurrences, ordered by the count in
descending order.
Explanation
To solve this:
• Extract the domain from the email addresses by splitting the string on the @ symbol.
• Remove the www. prefix (if it exists) from the domain part.
• Group the results by domain and count the occurrences.
• Order the result by the count in descending order to find the most common domains.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE users (
user_id INT PRIMARY KEY,
email VARCHAR(255)
);
Learnings
• Using SUBSTRING_INDEX() to extract parts of a string.
• Using REPLACE() to remove unwanted substrings (www. in this case).
• Grouping data by specific substrings (in this case, the domain of email addresses).
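The steps listed in the explanation could be sketched in MySQL roughly as follows. This is only an illustration on hypothetical sample_users rows, not necessarily the book's own solution. Note that REPLACE() removes every occurrence of 'www.' in the domain, not just a leading one.

```sql
-- MySQL sketch
WITH sample_users AS (  -- hypothetical data standing in for the users table
    SELECT 1 AS user_id, 'a@www.example.com' AS email
    UNION ALL SELECT 2, 'b@example.com'
    UNION ALL SELECT 3, 'c@gmail.com'
)
SELECT REPLACE(SUBSTRING_INDEX(email, '@', -1), 'www.', '') AS domain,  -- text after '@', minus 'www.'
       COUNT(*) AS occurrences
FROM sample_users
GROUP BY domain
ORDER BY occurrences DESC;
-- domain      | occurrences
-- example.com | 2
-- gmail.com   | 1
```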
Learnings
• Using RANK() or ROW_NUMBER() to rank items by price within a category.
• Filtering for the second most expensive item by selecting the row with rank = 2.
• Grouping data by category to find the second most expensive item in each.
Solutions
• - PostgreSQL solution
WITH ranked_products AS (
SELECT
category,
product_name,
price,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS rank
FROM products
)
SELECT category, product_name, price
FROM ranked_products
WHERE rank = 2;
• - MySQL solution
WITH ranked_products AS (
    SELECT
        category,
        product_name,
        price,
        -- RANK is a reserved word in MySQL 8.0, so the alias uses a different name
        ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
    FROM products
)
SELECT category, product_name, price
FROM ranked_products
WHERE price_rank = 2;
• Q.308
Find the number of unique products purchased by each customer in the orders table. Output
the customer ID and the count of distinct product IDs they have purchased.
Explanation
To solve this:
• Group the data by customer_id.
• Count the distinct product_id for each customer.
• Output the customer ID and the number of unique products they have purchased.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
product_id INT,
order_date DATE
);
Learnings
• Using COUNT(DISTINCT ...) to count unique occurrences.
• Grouping data by a specific attribute (customer_id in this case).
• Understanding how to aggregate data by grouping and counting distinct values.
Solutions
• - PostgreSQL solution
SELECT customer_id, COUNT(DISTINCT product_id) AS unique_products
FROM orders
GROUP BY customer_id;
• - MySQL solution
SELECT customer_id, COUNT(DISTINCT product_id) AS unique_products
FROM orders
GROUP BY customer_id;
• Q.309
Find the longest streak of consecutive days with a purchase made by the same customer in the
orders table. Output the customer ID, the starting date of the streak, the ending date, and the
number of consecutive days in the streak.
Explanation
To solve this:
• You need to identify consecutive dates for each customer. A sequence of consecutive dates
is defined by the difference between each order date and the previous one being exactly 1
day.
• Group the data by customer_id and create a "group" for each consecutive streak of days.
• Count the length of each streak and select the longest one for each customer.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE
);
Learnings
• Using LEAD() or LAG() window functions to find consecutive rows.
• Using a combination of date comparison and grouping techniques to identify consecutive
streaks.
• Aggregating data using window functions and conditional logic to group consecutive days.
Solutions
• - PostgreSQL solution
WITH distinct_days AS (
    -- One row per customer per day, so several orders on one day do not break a streak
    SELECT DISTINCT customer_id, order_date
    FROM orders
),
consecutive_dates AS (
    SELECT
        customer_id,
        order_date,
        -- Dates in one consecutive run, minus their row number, give the same anchor date
        order_date - CAST(ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS INTEGER) AS streak_group
    FROM distinct_days
),
streak_lengths AS (
    SELECT
        customer_id,
        MIN(order_date) AS streak_start,
        MAX(order_date) AS streak_end,
        COUNT(*) AS streak_length
    FROM consecutive_dates
    GROUP BY customer_id, streak_group
)
SELECT s.customer_id, s.streak_start, s.streak_end, s.streak_length
FROM streak_lengths s
WHERE s.streak_length = (
    -- Aliases make the subquery correlate with the outer row's customer
    SELECT MAX(s2.streak_length)
    FROM streak_lengths s2
    WHERE s2.customer_id = s.customer_id
)
ORDER BY s.customer_id;
• - MySQL solution
WITH distinct_days AS (
    -- One row per customer per day, so several orders on one day do not break a streak
    SELECT DISTINCT customer_id, order_date
    FROM orders
),
consecutive_dates AS (
    SELECT
        customer_id,
        order_date,
        -- Dates in one consecutive run, minus their row number in days, give the same anchor date
        DATE_SUB(order_date,
                 INTERVAL ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) DAY
        ) AS streak_group
    FROM distinct_days
),
streak_lengths AS (
    SELECT
        customer_id,
        MIN(order_date) AS streak_start,
        MAX(order_date) AS streak_end,
        COUNT(*) AS streak_length
    FROM consecutive_dates
    GROUP BY customer_id, streak_group
)
SELECT s.customer_id, s.streak_start, s.streak_end, s.streak_length
FROM streak_lengths s
WHERE s.streak_length = (
    -- Aliases make the subquery correlate with the outer row's customer
    SELECT MAX(s2.streak_length)
    FROM streak_lengths s2
    WHERE s2.customer_id = s.customer_id
)
ORDER BY s.customer_id;
Explanation
• Each customer's order dates are numbered in chronological order with ROW_NUMBER().
• Subtracting that row number, as a number of days, from the order date gives the same value
for every date in a run of consecutive days, so the result serves as the streak group.
• The final step aggregates each streak (start, end, length) and keeps the longest streak per
customer.
This query combines window functions, grouping, and date arithmetic, making it
a more advanced solution for identifying streaks.
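The idea of subtracting a row number from a date to label runs of consecutive days can be traced on a few rows. This is a PostgreSQL sketch over hypothetical sample_orders data for a single customer.

```sql
-- PostgreSQL sketch
WITH sample_orders AS (  -- hypothetical order dates: a 3-day run, a gap, then a 2-day run
    SELECT 1 AS customer_id, DATE '2024-01-01' AS order_date
    UNION ALL SELECT 1, DATE '2024-01-02'
    UNION ALL SELECT 1, DATE '2024-01-03'
    UNION ALL SELECT 1, DATE '2024-01-07'
    UNION ALL SELECT 1, DATE '2024-01-08'
)
SELECT order_date,
       ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS rn,
       order_date
           - CAST(ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS INTEGER)
           AS streak_group
FROM sample_orders
ORDER BY order_date;
-- order_date | rn | streak_group
-- 2024-01-01 | 1  | 2023-12-31
-- 2024-01-02 | 2  | 2023-12-31
-- 2024-01-03 | 3  | 2023-12-31
-- 2024-01-07 | 4  | 2024-01-03
-- 2024-01-08 | 5  | 2024-01-03
```

Grouping by streak_group then yields one row per run: a 3-day streak and a 2-day streak.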
• Q.310
Find all email addresses from the users table that are from Gmail but have non-standard
domains (i.e., they do not end with "gmail.com", but may have additional subdomains).
Output the user ID, email address, and the domain part of the email (after the '@' symbol).
Explanation
To solve this:
• Use regular expressions to match Gmail email addresses.
• Identify Gmail emails with non-standard domains (those that have additional subdomains).
• Extract the domain part of the email using string functions.
• Output the user ID, email, and the domain part for each matching email.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE users (
user_id INT PRIMARY KEY,
email VARCHAR(255)
);
Learnings
• Using regular expressions to match patterns in email addresses.
• Extracting domain parts from emails using string functions.
• Filtering emails based on specific patterns such as non-standard Gmail domains.
Solutions
• - PostgreSQL solution
SELECT
user_id,
email,
SUBSTRING(email FROM '@(.*)$') AS domain
FROM users
WHERE email ~* '^.*@gmail\..+'
AND email !~* '@gmail\.com$'
ORDER BY user_id;
• - MySQL solution
SELECT
user_id,
email,
SUBSTRING_INDEX(email, '@', -1) AS domain
FROM users
WHERE email REGEXP '^.*@gmail\\..+'
AND email NOT REGEXP '@gmail\\.com$'
ORDER BY user_id;
Explanation
• PostgreSQL:
• email ~* '^.*@gmail\..+': This regular expression matches emails whose domain starts with
gmail. followed by any suffix (for example gmail.co.uk). It does not match a subdomain placed
before gmail, such as mail.gmail.com.
• email !~* '@gmail\.com$': This ensures the email does not end with "gmail.com",
excluding standard Gmail addresses.
• SUBSTRING(email FROM '@(.*)$'): Extracts the domain part from the email.
• MySQL:
• REGEXP '^.*@gmail\\..+': Matches emails whose domain starts with gmail. followed by any suffix.
• NOT REGEXP '@gmail\\.com$': Ensures the email address does not end with
"gmail.com".
• SUBSTRING_INDEX(email, '@', -1): Extracts the domain part of the email.
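To check what the two patterns actually accept, they can be evaluated as boolean columns over a few addresses. This is a PostgreSQL sketch and the sample_users rows are hypothetical.

```sql
-- PostgreSQL sketch
WITH sample_users AS (  -- hypothetical data
    SELECT 1 AS user_id, 'alice@gmail.com' AS email
    UNION ALL SELECT 2, 'bob@gmail.co.uk'
    UNION ALL SELECT 3, 'carol@mail.gmail.com'
    UNION ALL SELECT 4, 'dave@yahoo.com'
)
SELECT user_id,
       email,
       email ~* '^.*@gmail\..+' AS domain_starts_with_gmail,
       email !~* '@gmail\.com$' AS not_plain_gmail_com
FROM sample_users
ORDER BY user_id;
-- Only user 2 (bob@gmail.co.uk) is true for both conditions.
-- User 1 fails the second test because the address ends with @gmail.com.
-- Users 3 and 4 fail the first test because '@' is not directly followed by 'gmail.'.
```

If addresses such as carol@mail.gmail.com should also count, the first pattern would need to allow a prefix between '@' and 'gmail'.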
• Q.311
Question
Write a SQL query to find employees who have the highest salary in each of the departments.
Explanation
To solve this, you need to join the Employee table with the Department table, group by the
DepartmentId, and select the employee(s) with the highest salary in each department. This
can be achieved using a subquery or JOIN with aggregation.
Learnings
• Use of JOIN to combine related data from different tables.
• GROUP BY and MAX() to find the highest salary in each department.
• Using subqueries to filter out employees with the highest salary.
Solutions
• - PostgreSQL solution
SELECT d.Name AS Department, e.Name AS Employee, e.Salary
FROM Employee e
JOIN Department d ON e.DepartmentId = d.Id
WHERE e.Salary = (
SELECT MAX(Salary)
FROM Employee
WHERE DepartmentId = e.DepartmentId
);
• - MySQL solution
SELECT d.Name AS Department, e.Name AS Employee, e.Salary
FROM Employee e
JOIN Department d ON e.DepartmentId = d.Id
WHERE e.Salary = (
SELECT MAX(Salary)
FROM Employee
WHERE DepartmentId = e.DepartmentId
);
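An alternative to the correlated subquery is to rank salaries within each department and keep rank 1, which also returns every employee tied for the top salary. The sketch below uses hypothetical sample_employee and sample_department rows with the same column names as the solution.

```sql
WITH sample_employee AS (       -- hypothetical data
    SELECT 'Joe' AS Name, 70000 AS Salary, 1 AS DepartmentId
    UNION ALL SELECT 'Jim', 90000, 1
    UNION ALL SELECT 'Max', 90000, 1
    UNION ALL SELECT 'Sam', 60000, 2
),
sample_department AS (          -- hypothetical data
    SELECT 1 AS Id, 'IT' AS Name
    UNION ALL SELECT 2, 'Sales'
),
ranked AS (
    SELECT d.Name AS Department,
           e.Name AS Employee,
           e.Salary,
           RANK() OVER (PARTITION BY e.DepartmentId ORDER BY e.Salary DESC) AS salary_rank
    FROM sample_employee e
    JOIN sample_department d ON e.DepartmentId = d.Id
)
SELECT Department, Employee, Salary
FROM ranked
WHERE salary_rank = 1
ORDER BY Department, Employee;
-- IT    | Jim | 90000
-- IT    | Max | 90000
-- Sales | Sam | 60000
```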
• Q.312
Question
Write a SQL query to print the node id and the type of the node (Root, Inner, Leaf). Sort the
result by the node id.
Explanation
To classify the nodes into Root, Inner, or Leaf:
• A Root node has p_id as NULL.
• An Inner node has a parent (p_id is not NULL) and at least one child.
• A Leaf node has a parent (p_id is not NULL) and no children.
Learnings
• Use of CASE statements to categorize the nodes.
• Identifying child nodes by checking for the absence or presence of a node with a matching
p_id.
• Sorting the output by node id.
Solutions
• - PostgreSQL solution
SELECT id,
CASE
WHEN p_id IS NULL THEN 'Root'
        WHEN id IN (SELECT DISTINCT p_id FROM tree WHERE p_id IS NOT NULL) THEN 'Inner'
ELSE 'Leaf'
END AS Type
FROM tree
ORDER BY id;
• - MySQL solution
SELECT id,
CASE
Question
Write a SQL query to report the league statistics, including matches played, points, goals
scored, goals conceded, and goal difference for each team.
Explanation
To calculate the statistics:
• matches_played: Count the number of times the team appears as a home or away team.
• points: 3 points for a win, 0 points for a loss, and 1 point for a draw.
• goal_for: The total goals scored by the team in all matches.
• goal_against: The total goals conceded by the team in all matches.
• goal_diff: Calculated as goal_for - goal_against.
The results need to be sorted by:
• points in descending order.
• If points are tied, by goal_diff in descending order.
• If both points and goal_diff are tied, by team_name lexicographically.
Learnings
• Combining data from multiple tables using JOIN.
• Using conditional aggregation to calculate points, goals scored, and goals conceded.
• Sorting with multiple criteria using ORDER BY.
Solutions
• - PostgreSQL solution
SELECT t.team_name,
       -- Each joined row is one match, so count a single column (counting both IDs doubles it)
       COUNT(m.home_team_id) AS matches_played,
       SUM(CASE
               WHEN (m.home_team_id = t.team_id AND m.home_team_goals > m.away_team_goals)
                    OR (m.away_team_id = t.team_id AND m.away_team_goals > m.home_team_goals) THEN 3
               WHEN m.home_team_goals = m.away_team_goals THEN 1
               ELSE 0
           END) AS points,
       -- COALESCE turns the NULL sums of teams with no matches into 0
       COALESCE(SUM(CASE
               WHEN m.home_team_id = t.team_id THEN m.home_team_goals
               WHEN m.away_team_id = t.team_id THEN m.away_team_goals
           END), 0) AS goal_for,
       COALESCE(SUM(CASE
               WHEN m.home_team_id = t.team_id THEN m.away_team_goals
               WHEN m.away_team_id = t.team_id THEN m.home_team_goals
           END), 0) AS goal_against,
       COALESCE(SUM(CASE
               WHEN m.home_team_id = t.team_id THEN m.home_team_goals - m.away_team_goals
               WHEN m.away_team_id = t.team_id THEN m.away_team_goals - m.home_team_goals
           END), 0) AS goal_diff
FROM Teams t
LEFT JOIN Matches m ON t.team_id = m.home_team_id OR t.team_id = m.away_team_id
GROUP BY t.team_id, t.team_name
ORDER BY points DESC, goal_diff DESC, t.team_name;
• - MySQL solution
SELECT t.team_name,
       -- Each joined row is one match, so count a single column (counting both IDs doubles it)
       COUNT(m.home_team_id) AS matches_played,
       SUM(CASE
               WHEN (m.home_team_id = t.team_id AND m.home_team_goals > m.away_team_goals)
                    OR (m.away_team_id = t.team_id AND m.away_team_goals > m.home_team_goals) THEN 3
               WHEN m.home_team_goals = m.away_team_goals THEN 1
               ELSE 0
           END) AS points,
       -- COALESCE turns the NULL sums of teams with no matches into 0
       COALESCE(SUM(CASE
               WHEN m.home_team_id = t.team_id THEN m.home_team_goals
               WHEN m.away_team_id = t.team_id THEN m.away_team_goals
           END), 0) AS goal_for,
       COALESCE(SUM(CASE
               WHEN m.home_team_id = t.team_id THEN m.away_team_goals
               WHEN m.away_team_id = t.team_id THEN m.home_team_goals
           END), 0) AS goal_against,
       COALESCE(SUM(CASE
               WHEN m.home_team_id = t.team_id THEN m.home_team_goals - m.away_team_goals
               WHEN m.away_team_id = t.team_id THEN m.away_team_goals - m.home_team_goals
           END), 0) AS goal_diff
FROM Teams t
LEFT JOIN Matches m ON t.team_id = m.home_team_id OR t.team_id = m.away_team_id
GROUP BY t.team_id, t.team_name
ORDER BY points DESC, goal_diff DESC, t.team_name;
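Another way to organise the same calculation is to first turn every match into two rows, one from each team's point of view, and then aggregate. The sketch below uses hypothetical sample_teams and sample_matches rows with the column names from the solution. The per_team CTE is an illustrative helper, not part of the original schema.

```sql
WITH sample_teams AS (          -- hypothetical data
    SELECT 1 AS team_id, 'Ajax' AS team_name
    UNION ALL SELECT 2, 'Benfica'
    UNION ALL SELECT 3, 'Celtic'        -- has played no matches
),
sample_matches AS (             -- hypothetical data
    SELECT 1 AS home_team_id, 2 AS away_team_id, 3 AS home_team_goals, 1 AS away_team_goals
    UNION ALL SELECT 2, 1, 2, 2
),
per_team AS (                   -- one row per team per match, seen from that team's side
    SELECT home_team_id AS team_id, home_team_goals AS goals_for, away_team_goals AS goals_against
    FROM sample_matches
    UNION ALL
    SELECT away_team_id, away_team_goals, home_team_goals
    FROM sample_matches
)
SELECT t.team_name,
       COUNT(p.team_id) AS matches_played,
       COALESCE(SUM(CASE WHEN p.goals_for > p.goals_against THEN 3
                         WHEN p.goals_for = p.goals_against THEN 1
                         ELSE 0 END), 0) AS points,
       COALESCE(SUM(p.goals_for), 0) AS goal_for,
       COALESCE(SUM(p.goals_against), 0) AS goal_against,
       COALESCE(SUM(p.goals_for - p.goals_against), 0) AS goal_diff
FROM sample_teams t
LEFT JOIN per_team p ON p.team_id = t.team_id
GROUP BY t.team_id, t.team_name
ORDER BY points DESC, goal_diff DESC, t.team_name;
-- team_name | matches_played | points | goal_for | goal_against | goal_diff
-- Ajax      | 2              | 4      | 5        | 3            | 2
-- Benfica   | 2              | 1      | 3        | 5            | -2
-- Celtic    | 0              | 0      | 0        | 0            | 0
```

This form avoids the OR condition in the join and keeps each CASE expression to a simple comparison.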
• Q.314
Question
Write a SQL query to generate a report of the continuous periods where tasks either failed or
succeeded between 2019-01-01 and 2019-12-31. The report should include start_date,
end_date, and period_state (either 'failed' or 'succeeded'). Order the result by start_date.
Explanation
To solve this:
• Combine both the Failed and Succeeded tables and select the dates within the period
2019-01-01 to 2019-12-31.
• Group consecutive days together using a method to identify continuous periods of success
or failure.
• For each period, determine if the task was failed or succeeded.
• Output the start_date, end_date, and period_state for each continuous period,
ordering by start_date.
Learnings
• Combining data from multiple tables using UNION ALL.
• Identifying runs of consecutive days from the difference between two ROW_NUMBER()
sequences, then grouping on that difference.
• Tagging each row with a literal period_state label ('failed' or 'succeeded') before combining
the tables.
Solutions
• - PostgreSQL solution
WITH Combined AS (
SELECT fail_date AS task_date, 'failed' AS period_state
FROM Failed
WHERE fail_date BETWEEN '2019-01-01' AND '2019-12-31'
UNION ALL
SELECT success_date AS task_date, 'succeeded' AS period_state
FROM Succeeded
WHERE success_date BETWEEN '2019-01-01' AND '2019-12-31'
),
Ranked AS (
SELECT task_date,
period_state,
ROW_NUMBER() OVER (ORDER BY task_date) -
ROW_NUMBER() OVER (PARTITION BY period_state ORDER BY task_date) AS grp
FROM Combined
)
SELECT MIN(task_date) AS start_date,
MAX(task_date) AS end_date,
period_state
FROM Ranked
GROUP BY period_state, grp
ORDER BY start_date;
• - MySQL solution
WITH Combined AS (
SELECT fail_date AS task_date, 'failed' AS period_state
FROM Failed
WHERE fail_date BETWEEN '2019-01-01' AND '2019-12-31'
UNION ALL
SELECT success_date AS task_date, 'succeeded' AS period_state
FROM Succeeded
WHERE success_date BETWEEN '2019-01-01' AND '2019-12-31'
),
Ranked AS (
SELECT task_date,
period_state,
ROW_NUMBER() OVER (ORDER BY task_date) -
ROW_NUMBER() OVER (PARTITION BY period_state ORDER BY task_date) AS grp
FROM Combined
)
SELECT MIN(task_date) AS start_date,
MAX(task_date) AS end_date,
period_state
FROM Ranked
GROUP BY period_state, grp
ORDER BY start_date;
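The grp column is easier to follow with its two row-number inputs shown side by side. This sketch uses a hypothetical combined CTE in place of the Failed and Succeeded tables, and it assumes every date in the range appears exactly once.

```sql
WITH combined AS (  -- hypothetical data: 3 succeeded days, 2 failed days, 1 succeeded day
    SELECT DATE '2019-01-01' AS task_date, 'succeeded' AS period_state
    UNION ALL SELECT DATE '2019-01-02', 'succeeded'
    UNION ALL SELECT DATE '2019-01-03', 'succeeded'
    UNION ALL SELECT DATE '2019-01-04', 'failed'
    UNION ALL SELECT DATE '2019-01-05', 'failed'
    UNION ALL SELECT DATE '2019-01-06', 'succeeded'
)
SELECT task_date,
       period_state,
       ROW_NUMBER() OVER (ORDER BY task_date) AS overall_rn,
       ROW_NUMBER() OVER (PARTITION BY period_state ORDER BY task_date) AS state_rn,
       ROW_NUMBER() OVER (ORDER BY task_date)
           - ROW_NUMBER() OVER (PARTITION BY period_state ORDER BY task_date) AS grp
FROM combined
ORDER BY task_date;
-- task_date  | period_state | overall_rn | state_rn | grp
-- 2019-01-01 | succeeded    | 1          | 1        | 0
-- 2019-01-02 | succeeded    | 2          | 2        | 0
-- 2019-01-03 | succeeded    | 3          | 3        | 0
-- 2019-01-04 | failed       | 4          | 1        | 3
-- 2019-01-05 | failed       | 5          | 2        | 3
-- 2019-01-06 | succeeded    | 6          | 4        | 2
```

Within an unbroken run both counters advance together, so their difference stays constant. Grouping by (period_state, grp) therefore gives three periods here.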
• Q.315
Question
Write an SQL query to report the current credit balance for each user after processing all
transactions, and check if they have breached their credit limit (i.e., if their credit balance is
less than 0). The result should include user_id, user_name, credit (current balance), and
credit_limit_breached (with values 'Yes' or 'No').
Explanation
To solve this:
• Calculate the total transactions for each user by summing the amounts where the user is
either the paid_by or paid_to in the Transactions table.
• Update the user's credit balance by adjusting it for both money paid and received:
• If the user is the payer (paid_by), subtract the transaction amount.
• If the user is the receiver (paid_to), add the transaction amount.
• Check whether the updated credit balance is below 0 to determine if the user has breached
their credit limit.
• Return the user details along with their current balance and whether they have breached the
credit limit.
user_name VARCHAR(100),
credit INT
);
Learnings
• SUM() and conditional aggregation to calculate net transaction changes for each user.
• JOIN to combine Users and Transactions tables.
• Conditional logic with CASE to check credit breaches.
Solutions
• - PostgreSQL solution
SELECT u.user_id,
       u.user_name,
       u.credit
           - COALESCE(SUM(CASE WHEN t.paid_by = u.user_id THEN t.amount ELSE 0 END), 0)
           + COALESCE(SUM(CASE WHEN t.paid_to = u.user_id THEN t.amount ELSE 0 END), 0) AS credit,
       CASE
           WHEN u.credit
                    - COALESCE(SUM(CASE WHEN t.paid_by = u.user_id THEN t.amount ELSE 0 END), 0)
                    + COALESCE(SUM(CASE WHEN t.paid_to = u.user_id THEN t.amount ELSE 0 END), 0) < 0
           THEN 'Yes'
           ELSE 'No'
       END AS credit_limit_breached
FROM Users u
LEFT JOIN Transactions t ON u.user_id = t.paid_by OR u.user_id = t.paid_to
-- Group by every non-aggregated column so the query does not depend on user_id being a key
GROUP BY u.user_id, u.user_name, u.credit
ORDER BY u.user_id;
• - MySQL solution
SELECT u.user_id,
u.user_name,
u.credit
- COALESCE(SUM(CASE WHEN t.paid_by = u.user_id THEN t.amount ELSE 0 END), 0)
+ COALESCE(SUM(CASE WHEN t.paid_to = u.user_id THEN t.amount ELSE 0 END), 0) AS credit,
CASE
WHEN u.credit
- COALESCE(SUM(CASE WHEN t.paid_by = u.user_id THEN t.amount ELSE 0 END), 0)
+ COALESCE(SUM(CASE WHEN t.paid_to = u.user_id THEN t.amount ELSE 0 END), 0) < 0
THEN 'Yes'
ELSE 'No'
END AS credit_limit_breached
FROM Users u
LEFT JOIN Transactions t ON u.user_id = t.paid_by OR u.user_id = t.paid_to
-- Group by every non-aggregated column so the query also runs under ONLY_FULL_GROUP_BY
GROUP BY u.user_id, u.user_name, u.credit
ORDER BY u.user_id;
• Q.316
Question
Write an SQL query to find the countries where the telecommunications company can invest.
The company wants to invest in countries where the average call duration is strictly greater
than the global average call duration.
Explanation
To solve this:
• Calculate the global average call duration by averaging the duration of all calls.
• Calculate the average call duration for each country by joining the Calls, Person, and
Country tables. This requires:
• Mapping each caller's and callee's phone number to their country based on the country
code.
• Grouping the calls by country and calculating the average duration for each.
• Compare the average call duration of each country with the global average and return
the countries where the country's average duration is greater than the global average.
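As a rough check of this logic, the sketch below runs it on an in-memory SQLite database through Python's sqlite3 module (a stand-in for PostgreSQL or MySQL, so SUBSTR replaces SUBSTRING). The column names follow the solutions in this section; the people, phone numbers, and durations are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INT PRIMARY KEY, name TEXT, phone_number TEXT);
CREATE TABLE Country (name TEXT, country_code TEXT);
CREATE TABLE Calls (caller_id INT, callee_id INT, duration INT);
INSERT INTO Country VALUES ('Peru', '051'), ('Morocco', '212');
INSERT INTO Person VALUES (1, 'Ana', '051-1234567'),
                          (2, 'Ben', '212-7654321'),
                          (3, 'Cy',  '212-0000000');
INSERT INTO Calls VALUES (1, 2, 10), (2, 3, 2), (3, 2, 4);
""")

# Joining each person to every call they took part in (as caller or callee)
# makes a call count once for each side's country.
countries = [row[0] for row in conn.execute("""
SELECT co.name
FROM Person p
JOIN Country co ON SUBSTR(p.phone_number, 1, 3) = co.country_code
JOIN Calls c ON p.id IN (c.caller_id, c.callee_id)
GROUP BY co.name
HAVING AVG(c.duration) > (SELECT AVG(duration) FROM Calls)
ORDER BY co.name
""")]

print(countries)
```

With this data the global average is 16/3 (about 5.33), Peru averages 10, and Morocco averages 22/5 = 4.4, so only Peru qualifies.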
(7, 9, 13),
(7, 1, 3),
(9, 7, 1),
(1, 7, 7);
Learnings
• Using JOIN to combine multiple tables based on relationships.
• Aggregating with AVG() to calculate averages.
• Filtering results based on comparison of averages (global vs. local).
• Matching phone-number prefixes to country codes with SUBSTRING().
Solutions
• - PostgreSQL solution
WITH GlobalAverage AS (
SELECT AVG(duration) AS global_avg
FROM Calls
),
CountryAverage AS (
-- Each call counts once for the caller's country and once for the callee's country
SELECT c.name AS country,
AVG(cl.duration) AS country_avg
FROM Person p
JOIN Country c ON SUBSTRING(p.phone_number FROM 1 FOR 3) = c.country_code
JOIN Calls cl ON p.id IN (cl.caller_id, cl.callee_id)
GROUP BY c.name
)
SELECT ca.country
FROM CountryAverage ca
CROSS JOIN GlobalAverage ga
WHERE ca.country_avg > ga.global_avg;
• - MySQL solution
WITH GlobalAverage AS (
SELECT AVG(duration) AS global_avg
FROM Calls
),
CountryAverage AS (
-- Each call counts once for the caller's country and once for the callee's country
-- (the alias cl is used because CALL is a reserved word in MySQL)
SELECT c.name AS country,
AVG(cl.duration) AS country_avg
FROM Person p
JOIN Country c ON SUBSTRING(p.phone_number, 1, 3) = c.country_code
JOIN Calls cl ON p.id IN (cl.caller_id, cl.callee_id)
GROUP BY c.name
)
SELECT ca.country
FROM CountryAverage ca
CROSS JOIN GlobalAverage ga
WHERE ca.country_avg > ga.global_avg;
• Q.317
• Q.318
Question
Write an SQL query to find the bank accounts where the total income from deposits exceeds
the max_income for two or more consecutive months. The total income of an account in a
month is the sum of all its 'Creditor' transactions during that month.
Explanation
To solve this:
• Extract the monthly total income for each account:
• We will group the Transactions by account_id and month (using YEAR() and MONTH()
functions).
• We will sum the amount for transactions of type 'Creditor' for each account and month.
• Identify accounts with consecutive months where the total income exceeds
max_income:
• For each account, check if the total income for a month exceeds max_income and then
check if this occurs for two or more consecutive months.
• Return the account_id of suspicious accounts:
• An account is suspicious if its total income exceeds the max_income for two or more
consecutive months.
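One way these steps might fit together is sketched here on an in-memory SQLite database via Python's sqlite3 module, standing in for PostgreSQL or MySQL. The Accounts table name, the transaction_id column, the month_index helper, and all sample rows are assumptions for illustration; only account_id, max_income, type, amount, and day come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Accounts (account_id INT PRIMARY KEY, max_income INT);
CREATE TABLE Transactions (transaction_id INT PRIMARY KEY, account_id INT,
                           type TEXT, amount INT, day TEXT);
INSERT INTO Accounts VALUES (1, 1000), (2, 1000);
INSERT INTO Transactions VALUES
  (1, 1, 'Creditor', 1500, '2024-01-10'),
  (2, 1, 'Creditor', 1200, '2024-02-05'),
  (3, 2, 'Creditor', 2000, '2024-01-15'),
  (4, 2, 'Creditor',  500, '2024-02-15'),
  (5, 2, 'Creditor', 3000, '2024-03-01'),
  (6, 2, 'Debtor',   9000, '2024-02-20');
""")

suspicious = [row[0] for row in conn.execute("""
WITH monthly AS (
    -- year * 12 + month gives consecutive integers for consecutive months,
    -- including the December-to-January boundary.
    SELECT account_id,
           CAST(strftime('%Y', day) AS INT) * 12
             + CAST(strftime('%m', day) AS INT) AS month_index,
           SUM(amount) AS total_income
    FROM Transactions
    WHERE type = 'Creditor'
    GROUP BY account_id, month_index
),
over_limit AS (
    SELECT m.account_id, m.month_index
    FROM monthly m
    JOIN Accounts a ON a.account_id = m.account_id
    WHERE m.total_income > a.max_income
)
-- An account is suspicious if an over-limit month is followed directly by another.
SELECT DISTINCT o1.account_id
FROM over_limit o1
JOIN over_limit o2
  ON o2.account_id = o1.account_id
 AND o2.month_index = o1.month_index + 1
ORDER BY o1.account_id
""")]

print(suspicious)
```

Account 1 exceeds its limit in January and February, so it is returned. Account 2 exceeds it in January and March only, so the gap in February keeps it out of the result.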
Learnings
• Using SUM() to calculate total income for each account in each month.
• Grouping by account_id, year, and month.
• Using JOIN to compare each month's total income with the account's max_income and keep
the months that exceed it.
• Window functions or subqueries to detect consecutive months with suspicious activity.
Solutions
• - PostgreSQL solution
WITH MonthlyIncome AS (
SELECT account_id,
EXTRACT(YEAR FROM day) AS year,
EXTRACT(MONTH FROM day) AS month,
SUM(amount) AS total_income
FROM Transactions
Question
Write a query to find the number of students majoring in each department. Include all
departments, even those with no students. Sort the result by the number of students in
descending order, and in case of ties, alphabetically by the department name.
Explanation
To solve this:
• Join the tables: We will join the student and department tables using the dept_id
column.
• Use a LEFT JOIN to ensure that all departments are included, even those without
students.
• Count the students per department: After the join, we will count the number of students
for each department.
• Handle sorting: We will order the result by the number of students in descending order,
and by department name alphabetically in case of ties.
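The difference between counting a student column and counting rows is easy to miss, so here is a small sketch of it using Python's sqlite3 module as a stand-in database. The student_name column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INT PRIMARY KEY, dept_name TEXT);
CREATE TABLE student (student_id INT PRIMARY KEY, student_name TEXT, dept_id INT);
INSERT INTO department VALUES (1, 'Engineering'), (2, 'Science'), (3, 'Law'), (4, 'Art');
INSERT INTO student VALUES (1, 'Jack', 1), (2, 'Jane', 1), (3, 'Mark', 2);
""")

# COUNT(s.student_id) ignores the NULLs produced by the LEFT JOIN,
# so departments without students are reported with 0.
student_counts = conn.execute("""
SELECT d.dept_name, COUNT(s.student_id) AS student_number
FROM department d
LEFT JOIN student s ON d.dept_id = s.dept_id
GROUP BY d.dept_id, d.dept_name
ORDER BY student_number DESC, d.dept_name
""").fetchall()

# COUNT(*) counts the joined row itself, so empty departments wrongly show 1.
star_counts = dict(conn.execute("""
SELECT d.dept_name, COUNT(*)
FROM department d
LEFT JOIN student s ON d.dept_id = s.dept_id
GROUP BY d.dept_id, d.dept_name
""").fetchall())

print(student_counts)
print(star_counts)
```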
Learnings
• Using LEFT JOIN to include all rows from one table even when there are no matching rows
in the other table.
• Using COUNT() with GROUP BY to aggregate results.
• Sorting by multiple criteria using ORDER BY.
Solutions
• - PostgreSQL solution
SELECT d.dept_name,
COUNT(s.student_id) AS student_number
FROM department d
LEFT JOIN student s ON d.dept_id = s.dept_id
GROUP BY d.dept_name
ORDER BY student_number DESC, d.dept_name;
• - MySQL solution
SELECT d.dept_name,
COUNT(s.student_id) AS student_number
FROM department d
LEFT JOIN student s ON d.dept_id = s.dept_id
GROUP BY d.dept_name
ORDER BY student_number DESC, d.dept_name;
• Q.320
Question
Write an SQL query to calculate the quality of each query. The quality is defined as the
average of the ratio between the query's rating and its position. The result should be rounded
to two decimal places.
Explanation
• Calculating the ratio: For each query, the ratio is defined as the ratio between the rating
and position columns:
$\text{Ratio} = \frac{\text{rating}}{\text{position}}$
• Aggregating by query_name: For each query, we need to calculate the average of these
ratios.
• Filtering poor queries: Since the problem doesn't specifically ask to exclude poor queries
(rating less than 3), we will include all rows in the calculation.
• Rounding the result: The final average ratio (quality) should be rounded to 2 decimal
places.
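A common pitfall with this calculation is integer division: when rating and position are both integers, some engines truncate each ratio before averaging. The sketch below shows the effect in SQLite through Python's sqlite3 module, which is only a stand-in here. The result column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Queries (query_name TEXT, result TEXT, position INT, rating INT);
INSERT INTO Queries VALUES
  ('Dog', 'a', 1, 5), ('Dog', 'b', 2, 5), ('Dog', 'c', 200, 1),
  ('Cat', 'd', 5, 2), ('Cat', 'e', 3, 3), ('Cat', 'f', 7, 4);
""")

quality = conn.execute("""
SELECT query_name,
       ROUND(AVG(rating * 1.0 / position), 2) AS quality,           -- real division
       ROUND(AVG(rating / position), 2)       AS truncated_quality  -- integer division
FROM Queries
GROUP BY query_name
ORDER BY query_name
""").fetchall()

for row in quality:
    print(row)
```

Multiplying by 1.0 (or casting to a numeric type) before dividing keeps the fractional part of each ratio. MySQL's `/` operator already returns a fractional result, while PostgreSQL and SQLite truncate integer operands.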
Learnings
• Using AVG() to calculate the average of a calculated expression.
• Using ROUND() to round the result to a specified number of decimal places.
Solutions
• - PostgreSQL solution
SELECT query_name,
ROUND(AVG(rating::NUMERIC / position), 2) AS quality
FROM Queries
GROUP BY query_name;
• - MySQL solution
SELECT query_name,
ROUND(AVG(rating / position), 2) AS quality
FROM Queries
GROUP BY query_name;
Walmart
• Q.321
Question
Find the longest sequence of consecutive days a Walmart customer has made purchases.
Output the customer ID, the start date, the end date, and the number of consecutive days they
made purchases.
Explanation
To solve this:
• Identify consecutive purchase dates for each customer. A sequence of consecutive dates is
defined by the difference between each order date and the previous one being exactly 1 day.
• Group the purchases by customer and identify streaks of consecutive dates.
• Calculate the length of each streak, and then output the longest streak per customer.
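The streak-grouping step is often done with the "date minus row number" trick: within a run of consecutive days, subtracting an increasing row number from each date always gives the same value. The sketch below tries it on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The order_id column and the sample rows are assumptions, and julianday() stands in for the date arithmetic of PostgreSQL or MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE walmart_orders (order_id INT PRIMARY KEY, customer_id INT, order_date TEXT);
INSERT INTO walmart_orders VALUES
  (1, 1, '2024-01-01'), (2, 1, '2024-01-02'), (3, 1, '2024-01-02'),
  (4, 1, '2024-01-03'), (5, 1, '2024-01-05'),
  (6, 2, '2024-02-10'), (7, 2, '2024-02-12'), (8, 2, '2024-02-13');
""")

longest_streaks = conn.execute("""
WITH days AS (
    -- Two orders on the same day should count as one purchase day.
    SELECT DISTINCT customer_id, order_date FROM walmart_orders
),
grouped AS (
    SELECT customer_id, order_date,
           julianday(order_date)
             - ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS streak_group
    FROM days
),
streaks AS (
    SELECT customer_id,
           MIN(order_date) AS streak_start,
           MAX(order_date) AS streak_end,
           COUNT(*) AS streak_length
    FROM grouped
    GROUP BY customer_id, streak_group
),
ranked AS (
    SELECT *, RANK() OVER (PARTITION BY customer_id ORDER BY streak_length DESC) AS rnk
    FROM streaks
)
SELECT customer_id, streak_start, streak_end, streak_length
FROM ranked
WHERE rnk = 1
ORDER BY customer_id
""").fetchall()

for row in longest_streaks:
    print(row)
```

Ranking the streaks per customer with RANK() is one alternative to a correlated MAX() subquery; ties for the longest streak are all kept.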
Learnings
• Using window functions like LEAD() or LAG() to find consecutive dates.
• Using date difference logic to identify consecutive streaks.
• Grouping and aggregating based on streaks and customer IDs.
• Handling edge cases where streaks might have gaps.
Solutions
• - PostgreSQL solution
WITH distinct_days AS (
-- One row per customer per purchase day, so repeat orders on a day do not break a streak
SELECT DISTINCT customer_id, order_date
FROM walmart_orders
),
consecutive_dates AS (
SELECT
customer_id,
order_date,
-- Consecutive dates minus an increasing row number all land on the same anchor date
order_date - CAST(ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS INT) AS streak_group
FROM distinct_days
),
streak_lengths AS (
SELECT
customer_id,
MIN(order_date) AS streak_start,
MAX(order_date) AS streak_end,
COUNT(*) AS streak_length
FROM consecutive_dates
GROUP BY customer_id, streak_group
)
SELECT sl.customer_id, sl.streak_start, sl.streak_end, sl.streak_length
FROM streak_lengths sl
WHERE sl.streak_length = (
SELECT MAX(s2.streak_length)
FROM streak_lengths s2
WHERE s2.customer_id = sl.customer_id
)
ORDER BY sl.customer_id;
• - MySQL solution
WITH distinct_days AS (
-- One row per customer per purchase day, so repeat orders on a day do not break a streak
SELECT DISTINCT customer_id, order_date
FROM walmart_orders
),
consecutive_dates AS (
SELECT
customer_id,
order_date,
-- Consecutive dates minus an increasing row number all land on the same anchor date
DATE_SUB(order_date, INTERVAL ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) DAY) AS streak_group
FROM distinct_days
),
streak_lengths AS (
SELECT
customer_id,
MIN(order_date) AS streak_start,
MAX(order_date) AS streak_end,
COUNT(*) AS streak_length
FROM consecutive_dates
GROUP BY customer_id, streak_group
)
SELECT sl.customer_id, sl.streak_start, sl.streak_end, sl.streak_length
FROM streak_lengths sl
WHERE sl.streak_length = (
SELECT MAX(s2.streak_length)
FROM streak_lengths s2
WHERE s2.customer_id = sl.customer_id
)
ORDER BY sl.customer_id;
Explanation
• PostgreSQL:
• We subtract each purchase day's row number (from ROW_NUMBER(), ordered by date within
each customer) from its order_date. Every date in an unbroken run of days maps to the same
value, which serves as streak_group. This assumes order_date is a DATE column.
• We then aggregate the streaks, calculating the start and end dates and the length of each
streak.
• The final query keeps the longest streak for each customer by comparing each streak with
that customer's maximum. The subquery needs its own alias to be correlated correctly, and
streaks that tie for the longest are all returned.
• MySQL:
• The logic is identical; DATE_SUB() with INTERVAL ... DAY performs the date-minus-row-number
step. CTEs and window functions require MySQL 8.0 or later.
This solution uses window functions and date arithmetic to identify and group consecutive
days of purchases (the "gaps and islands" pattern), making it a more advanced problem.
• Q.322
Question
Find the top 3 most sold products globally (across all stores) in the walmart_inventory
table based on the total quantity sold. Output the product name, total quantity sold, and the
total revenue generated from that product.
Explanation
To solve this:
• Aggregate the data by product (product_name in the solutions below) to get the total
quantity sold and total revenue for each product globally.
• Sort the results by total quantity sold in descending order to identify the top-selling
products.
• Filter to show only the top 3 products.
• Calculate the total revenue by multiplying quantity_sold by price_per_unit.
Datasets and SQL Schemas
Learnings
• Using aggregation (SUM()) to calculate total quantities and revenues.
• Grouping data by product to summarize sales information.
• Sorting the results to find the top 3 products by quantity sold.
• Calculating total revenue using multiplication of quantity_sold and price_per_unit.
Solutions
• - PostgreSQL solution
SELECT
product_name,
SUM(quantity_sold) AS total_quantity_sold,
SUM(quantity_sold * price_per_unit) AS total_revenue
FROM walmart_inventory
GROUP BY product_name
ORDER BY total_quantity_sold DESC
LIMIT 3;
• - MySQL solution
SELECT
product_name,
SUM(quantity_sold) AS total_quantity_sold,
SUM(quantity_sold * price_per_unit) AS total_revenue
FROM walmart_inventory
GROUP BY product_name
ORDER BY total_quantity_sold DESC
LIMIT 3;
Explanation
• Both PostgreSQL and MySQL solutions aggregate sales data by product_name using
SUM(quantity_sold) to calculate the total quantity sold and SUM(quantity_sold *
price_per_unit) to calculate the total revenue.
• We order the products by total_quantity_sold in descending order to identify the most
sold products.
• The query limits the output to the top 3 products using LIMIT 3.
This problem requires basic aggregation, sorting, and filtering techniques, but it challenges
you to think globally about data across multiple stores and to summarize it effectively.
• Q.323
Question
Find the second most recent purchase made by each customer in the purchases table.
Output the customer ID, the second most recent purchase date, and the amount spent on that
purchase.
Explanation
To solve this:
• For each customer, you need to determine the second most recent purchase date.
• This requires sorting the purchases by purchase_date for each customer and then
selecting the second entry.
• Handle edge cases where a customer has fewer than two purchases.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE purchases (
purchase_id INT PRIMARY KEY,
customer_id INT,
purchase_date DATE,
amount_spent DECIMAL(10, 2)
);
Learnings
• Using window functions to rank purchases by date for each customer.
• Using ROW_NUMBER() to determine the ranking of purchases for each customer.
• Handling cases where customers might have fewer than two purchases.
Solutions
• - PostgreSQL solution
WITH ranked_purchases AS (
SELECT
customer_id,
purchase_date,
amount_spent,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchase_date DESC) AS purchase_rank
FROM purchases
)
SELECT customer_id, purchase_date, amount_spent
FROM ranked_purchases
WHERE purchase_rank = 2
ORDER BY customer_id;
• - MySQL solution
WITH ranked_purchases AS (
SELECT
customer_id,
purchase_date,
amount_spent,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY purchase_date DESC) AS purchase_rank
FROM purchases
)
SELECT customer_id, purchase_date, amount_spent
FROM ranked_purchases
WHERE purchase_rank = 2
ORDER BY customer_id;
Explanation
• PostgreSQL and MySQL:
• We use ROW_NUMBER() window function to rank purchases by purchase_date for each
customer in descending order.
• The second most recent purchase for each customer is identified where purchase_rank =
2.
• The results are ordered by customer_id to list each customer and their second most recent
purchase.
This problem tests your ability to handle window functions, ranking data, and managing edge
cases like customers with fewer than two purchases.
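To see how the edge case plays out, the sketch below runs the ranking on SQLite through Python's sqlite3 module (a stand-in that needs SQLite 3.25 or later for window functions). The sample rows are made up, and the extra purchase_id tie-breaker in the ORDER BY is an assumption added so that two purchases on the same date are ranked deterministically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases (
    purchase_id INT PRIMARY KEY,
    customer_id INT,
    purchase_date DATE,
    amount_spent DECIMAL(10, 2)
);
INSERT INTO purchases VALUES
  (1, 1, '2024-01-05', 10.50),
  (2, 1, '2024-02-01', 20.25),
  (3, 1, '2024-03-01', 30.75),
  (4, 2, '2024-01-20',  5.75);   -- customer 2 has a single purchase
""")

second_latest = conn.execute("""
WITH ranked_purchases AS (
    SELECT customer_id, purchase_date, amount_spent,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id
               ORDER BY purchase_date DESC, purchase_id DESC  -- tie-breaker (assumption)
           ) AS purchase_rank
    FROM purchases
)
SELECT customer_id, purchase_date, amount_spent
FROM ranked_purchases
WHERE purchase_rank = 2
ORDER BY customer_id
""").fetchall()

print(second_latest)
```

Customer 2 never receives a row with purchase_rank = 2, so the filter drops that customer without any special handling.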
• Q.324
Question
Find the most frequent day of the week on which each customer makes purchases. Output
the customer ID, the most frequent day of the week, and the number of times they made a
purchase on that day.
Explanation
To solve this:
• You need to extract the day of the week from the purchase_date (e.g., Monday, Tuesday,
etc.).
• Count the frequency of purchases for each day of the week for each customer.
• Identify the most frequent day for each customer by selecting the day with the highest
count.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE purchases (
purchase_id INT PRIMARY KEY,
customer_id INT,
purchase_date DATE,
amount_spent DECIMAL(10, 2)
);
Learnings
• Using EXTRACT(DOW FROM date) to get the day of the week.
• Grouping and counting the frequency of each day for each customer.
• Using RANK() or similar methods to select the most frequent day of the week for each
customer.
Solutions
• - PostgreSQL solution
WITH day_of_week AS (
SELECT
customer_id,
EXTRACT(DOW FROM purchase_date) AS purchase_day,
COUNT(*) AS purchase_count
FROM purchases
GROUP BY customer_id, purchase_day
),
ranked_days AS (
SELECT
customer_id,
purchase_day,
purchase_count,
RANK() OVER (PARTITION BY customer_id ORDER BY purchase_count DESC) AS day_rank
FROM day_of_week
)
SELECT
customer_id,
CASE
WHEN purchase_day = 0 THEN 'Sunday'
WHEN purchase_day = 1 THEN 'Monday'
WHEN purchase_day = 2 THEN 'Tuesday'
WHEN purchase_day = 3 THEN 'Wednesday'
WHEN purchase_day = 4 THEN 'Thursday'
WHEN purchase_day = 5 THEN 'Friday'
WHEN purchase_day = 6 THEN 'Saturday'
END AS most_frequent_day,
purchase_count
FROM ranked_days
WHERE day_rank = 1
ORDER BY customer_id;
• - MySQL solution
WITH day_of_week AS (
SELECT
customer_id,
DAYOFWEEK(purchase_date) AS purchase_day,
COUNT(*) AS purchase_count
FROM purchases
GROUP BY customer_id, purchase_day
),
ranked_days AS (
SELECT
customer_id,
purchase_day,
purchase_count,
RANK() OVER (PARTITION BY customer_id ORDER BY purchase_count DESC) AS day_rank
FROM day_of_week
)
SELECT
customer_id,
CASE
WHEN purchase_day = 1 THEN 'Sunday'
WHEN purchase_day = 2 THEN 'Monday'
WHEN purchase_day = 3 THEN 'Tuesday'
WHEN purchase_day = 4 THEN 'Wednesday'
WHEN purchase_day = 5 THEN 'Thursday'
WHEN purchase_day = 6 THEN 'Friday'
WHEN purchase_day = 7 THEN 'Saturday'
END AS most_frequent_day,
purchase_count
FROM ranked_days
WHERE day_rank = 1
ORDER BY customer_id;
Explanation
• PostgreSQL and MySQL:
• We use EXTRACT(DOW FROM purchase_date) in PostgreSQL (0 for Sunday through 6 for
Saturday) and DAYOFWEEK(purchase_date) in MySQL (1 for Sunday through 7 for Saturday) to
get the day of the week, which is why the two CASE mappings differ.
• We then count how many purchases were made on each day of the week for each
customer.
• Using RANK(), we rank the days based on the count of purchases.
• The final query selects the day with the highest rank (day_rank = 1) and converts the
numeric value of the day into its corresponding name.
This problem helps you practice date manipulation, grouping, and ranking functions to
analyze customer behavior in a creative way.
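Day-of-week numbering differs between engines, so it helps to check the mapping on known dates. The sketch below uses SQLite through Python's sqlite3 module, where strftime('%w', ...) numbers days 0 (Sunday) to 6 (Saturday), the same convention as PostgreSQL's EXTRACT(DOW ...). The sample rows are made up; 2024-01-01 was a Monday.

```python
import sqlite3

# Index 0 is Sunday, matching SQLite's strftime('%w') and PostgreSQL's DOW.
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases (purchase_id INT PRIMARY KEY, customer_id INT,
                        purchase_date DATE, amount_spent DECIMAL(10, 2));
INSERT INTO purchases VALUES
  (1, 1, '2024-01-01', 10.00),  -- Monday
  (2, 1, '2024-01-08', 12.00),  -- Monday
  (3, 1, '2024-01-03',  8.00),  -- Wednesday
  (4, 2, '2024-01-06',  5.00),  -- Saturday
  (5, 2, '2024-01-07',  7.00),  -- Sunday
  (6, 2, '2024-01-14',  9.00);  -- Sunday
""")

rows = conn.execute("""
WITH day_counts AS (
    SELECT customer_id,
           CAST(strftime('%w', purchase_date) AS INT) AS purchase_day,
           COUNT(*) AS purchase_count
    FROM purchases
    GROUP BY customer_id, purchase_day
),
ranked_days AS (
    SELECT *, RANK() OVER (PARTITION BY customer_id ORDER BY purchase_count DESC) AS day_rank
    FROM day_counts
)
SELECT customer_id, purchase_day, purchase_count
FROM ranked_days
WHERE day_rank = 1
ORDER BY customer_id, purchase_day
""").fetchall()

most_frequent = [(cid, DAY_NAMES[day], count) for cid, day, count in rows]
print(most_frequent)
```

Because RANK() is used, a customer whose top days tie would appear once per tied day.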
• Q.325
Question
Find the longest sequence of products purchased in a single session by each customer. A
session is defined as a series of purchases where the time between consecutive purchases is
less than or equal to 2 hours. Output the customer ID, the session start time (earliest
purchase), session end time (latest purchase), and the number of products purchased in that
session.
Explanation
To solve this:
• We need to group purchases into sessions for each customer. A session is defined as
purchases made within 2 hours of each other.
• Once the purchases are grouped by session, we need to count how many products were
purchased in each session and determine the start and end times for each session.
• Select the longest session for each customer based on the number of products purchased.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE customer_purchases (
purchase_id INT PRIMARY KEY,
customer_id INT,
product_id INT,
purchase_time TIMESTAMP
);
Learnings
• Using LAG() or LEAD() window functions to calculate time differences between
consecutive purchases.
• Grouping purchases into sessions based on time intervals.
• Handling time-based conditions using TIMESTAMP and INTERVAL.
• Aggregating results to identify the longest session.
Solutions
• - PostgreSQL solution
WITH session_groups AS (
SELECT
customer_id,
purchase_time,
product_id,
-- Running count of session starts. Window functions cannot be nested,
-- so LAG() is computed in the derived table below.
SUM(CASE
WHEN purchase_time - prev_purchase_time <= INTERVAL '2 hours'
THEN 0
ELSE 1
END)
OVER (PARTITION BY customer_id ORDER BY purchase_time) AS session_id
FROM (
SELECT
customer_id,
purchase_time,
product_id,
LAG(purchase_time) OVER (PARTITION BY customer_id ORDER BY purchase_time) AS prev_purchase_time
FROM customer_purchases
) AS with_previous
),
session_lengths AS (
SELECT
customer_id,
session_id,
MIN(purchase_time) AS session_start,
MAX(purchase_time) AS session_end,
COUNT(product_id) AS products_in_session
FROM session_groups
GROUP BY customer_id, session_id
)
SELECT
sl.customer_id,
sl.session_start,
sl.session_end,
sl.products_in_session
FROM session_lengths sl
WHERE sl.products_in_session = (
SELECT MAX(s2.products_in_session)
FROM session_lengths s2
WHERE s2.customer_id = sl.customer_id
)
ORDER BY sl.customer_id;
• - MySQL solution
WITH session_groups AS (
SELECT
customer_id,
purchase_time,
product_id,
-- Running count of session starts. Window functions cannot be nested,
-- so LAG() is computed in the derived table below.
-- Seconds are compared because TIMESTAMPDIFF(HOUR, ...) truncates, so a gap of 2h59m would count as 2.
SUM(CASE
WHEN TIMESTAMPDIFF(SECOND, prev_purchase_time, purchase_time) <= 7200
THEN 0
ELSE 1
END)
OVER (PARTITION BY customer_id ORDER BY purchase_time) AS session_id
FROM (
SELECT
customer_id,
purchase_time,
product_id,
LAG(purchase_time) OVER (PARTITION BY customer_id ORDER BY purchase_time) AS prev_purchase_time
FROM customer_purchases
) AS with_previous
),
session_lengths AS (
SELECT
customer_id,
session_id,
MIN(purchase_time) AS session_start,
MAX(purchase_time) AS session_end,
COUNT(product_id) AS products_in_session
FROM session_groups
GROUP BY customer_id, session_id
)
SELECT
sl.customer_id,
sl.session_start,
sl.session_end,
sl.products_in_session
FROM session_lengths sl
WHERE sl.products_in_session = (
SELECT MAX(s2.products_in_session)
FROM session_lengths s2
WHERE s2.customer_id = sl.customer_id
)
ORDER BY sl.customer_id;
Explanation
• PostgreSQL and MySQL:
• In the session_groups CTE, we calculate the session for each customer using LAG() to
get the previous purchase time and check if the difference between consecutive purchases is
greater than 2 hours. If it's greater than 2 hours, it means a new session has started.
• In the session_lengths CTE, we aggregate by customer_id and session_id to
calculate the session start and end times, and count the number of products purchased in that
session.
• The final query selects the longest session for each customer by finding the session with
the highest number of products purchased.
This problem involves using window functions, date/time manipulation, and aggregating data
in a way that identifies meaningful purchase behavior. It tests your ability to handle
time-based sessions and complex grouping scenarios.
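The session boundary is where this kind of query usually goes wrong, so the sketch below exercises it with gaps of exactly 2 hours (same session) and 2 hours 59 minutes (new session). It uses SQLite through Python's sqlite3 module as a stand-in (SQLite 3.25 or later for window functions), comparing gaps in whole seconds via strftime('%s', ...). The gap_seconds helper and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_purchases (purchase_id INT PRIMARY KEY, customer_id INT,
                                 product_id INT, purchase_time TIMESTAMP);
INSERT INTO customer_purchases VALUES
  (1, 1, 101, '2024-01-01 09:00:00'),
  (2, 1, 102, '2024-01-01 10:30:00'),  -- 1h30m after the previous purchase
  (3, 1, 103, '2024-01-01 12:30:00'),  -- exactly 2h: still the same session
  (4, 1, 104, '2024-01-01 15:29:00'),  -- 2h59m: starts a new session
  (5, 1, 105, '2024-01-01 15:40:00');
""")

longest_sessions = conn.execute("""
WITH session_groups AS (
    -- A running total of "new session" flags yields a session number per purchase.
    SELECT customer_id, purchase_time, product_id,
           SUM(CASE WHEN gap_seconds <= 7200 THEN 0 ELSE 1 END)
             OVER (PARTITION BY customer_id ORDER BY purchase_time) AS session_id
    FROM (
        -- The gap is NULL for a customer's first purchase, which falls into ELSE 1.
        SELECT customer_id, purchase_time, product_id,
               CAST(strftime('%s', purchase_time) AS INT)
                 - CAST(strftime('%s', LAG(purchase_time) OVER (
                       PARTITION BY customer_id ORDER BY purchase_time)) AS INT) AS gap_seconds
        FROM customer_purchases
    ) AS with_gap
),
session_lengths AS (
    SELECT customer_id, session_id,
           MIN(purchase_time) AS session_start,
           MAX(purchase_time) AS session_end,
           COUNT(product_id) AS products_in_session
    FROM session_groups
    GROUP BY customer_id, session_id
),
ranked AS (
    SELECT *, RANK() OVER (PARTITION BY customer_id
                           ORDER BY products_in_session DESC) AS session_rank
    FROM session_lengths
)
SELECT customer_id, session_start, session_end, products_in_session
FROM ranked
WHERE session_rank = 1
ORDER BY customer_id
""").fetchall()

print(longest_sessions)
```

Comparing in seconds (or with an interval type) avoids the truncation that whole-hour differences introduce near the 2-hour boundary.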
• Q.326
Question
Find the top 3 most profitable stores based on the total revenue from their sales in the
store_sales table. Output the store ID, store name, and the total revenue, sorted by revenue
in descending order.
Explanation
To solve this:
• Calculate the total revenue for each store by summing up the sale_amount for each store.
• Sort the stores based on the total revenue in descending order.
• Output the top 3 stores based on total revenue.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE store_sales (
sale_id INT PRIMARY KEY,
store_id INT,
store_name VARCHAR(100),
sale_amount DECIMAL(10, 2),
sale_date DATE
);
Learnings
• Using aggregation (SUM()) to calculate total sales for each store.
• Grouping by store_id and store_name to aggregate sales data at the store level.
• Sorting and limiting the results to get the top N records based on total revenue.
Solutions
• - PostgreSQL solution
SELECT
store_id,
store_name,
SUM(sale_amount) AS total_revenue
FROM store_sales
GROUP BY store_id, store_name
ORDER BY total_revenue DESC
LIMIT 3;
• - MySQL solution
SELECT
store_id,
store_name,
SUM(sale_amount) AS total_revenue
FROM store_sales
GROUP BY store_id, store_name
ORDER BY total_revenue DESC
LIMIT 3;
Explanation
• PostgreSQL and MySQL:
• The query uses SUM(sale_amount) to calculate the total revenue for each store.
• We group the data by store_id and store_name to aggregate the sales by store.
• The results are sorted in descending order based on the total revenue, and only the top 3
stores are selected using LIMIT 3.
This problem helps practice grouping and aggregating data at the store level, as well as
sorting and limiting the result set to focus on the most profitable stores. It tests basic SQL
skills, but also helps refine your ability to analyze sales data effectively.
• Q.327
Question
Find the most profitable product category for each brand based on the total revenue from
sales in the product_sales table. Revenue for a category is calculated by summing the total
sales of all products within that category for each brand. Output the brand name, product
category, and the total revenue for that category, ordered by brand and revenue.
Explanation
To solve this:
• Calculate the total revenue for each product category within each brand by summing the
sale_amount.
• Group the sales data by brand_id, brand_name, and category_name.
• Identify the most profitable category for each brand, which can be done by finding the
category with the highest total revenue for each brand.
• Output the brand name, category, and revenue, sorted by brand name and revenue.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE product_sales (
sale_id INT PRIMARY KEY,
product_id INT,
product_name VARCHAR(100),
brand_id INT,
brand_name VARCHAR(100),
category_name VARCHAR(100),
sale_amount DECIMAL(10, 2),
sale_date DATE
);
Learnings
• Using aggregation (SUM()) to calculate the total revenue per category for each brand.
• Grouping by multiple columns (brand_id, category_name) to aggregate the data.
• Identifying the highest revenue category for each brand using window functions or
subqueries.
• Sorting and selecting top records within groups (brands).
Solutions
• - PostgreSQL solution
WITH category_revenue AS (
SELECT
brand_name,
category_name,
SUM(sale_amount) AS total_revenue
FROM product_sales
GROUP BY brand_name, category_name
),
ranked_categories AS (
SELECT
brand_name,
category_name,
total_revenue,
RANK() OVER (PARTITION BY brand_name ORDER BY total_revenue DESC) AS category_rank
FROM category_revenue
)
SELECT
brand_name,
category_name,
total_revenue
FROM ranked_categories
WHERE category_rank = 1
ORDER BY brand_name, total_revenue DESC;
• - MySQL solution
WITH category_revenue AS (
SELECT
brand_name,
category_name,
SUM(sale_amount) AS total_revenue
FROM product_sales
GROUP BY brand_name, category_name
),
ranked_categories AS (
SELECT
brand_name,
category_name,
total_revenue,
RANK() OVER (PARTITION BY brand_name ORDER BY total_revenue DESC) AS category_rank
FROM category_revenue
)
SELECT
brand_name,
category_name,
total_revenue
FROM ranked_categories
WHERE category_rank = 1
ORDER BY brand_name, total_revenue DESC;
Explanation
• PostgreSQL and MySQL:
• In the category_revenue CTE, we aggregate the sales data by brand_name and
category_name to calculate the total revenue for each category within each brand.
• In the ranked_categories CTE, we use the RANK() window function to rank the
categories within each brand based on the total_revenue.
• The final query filters for the most profitable category (category_rank = 1) for each
brand and orders the results by brand and total revenue.
This problem requires using aggregation and ranking techniques to find the most profitable
category for each brand. It helps you practice window functions and data partitioning, as well
as applying ranking to identify top values within groups.
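One behavior worth knowing is how RANK() treats ties: if two categories of a brand have the same total revenue, both get rank 1 and both are returned. The sketch below shows this on SQLite through Python's sqlite3 module (a stand-in; SQLite 3.25 or later). The table is trimmed to the columns the query needs, and the brands, categories, and amounts are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced version of product_sales with only the columns used by the query.
CREATE TABLE product_sales (sale_id INT PRIMARY KEY, brand_name TEXT,
                            category_name TEXT, sale_amount REAL);
INSERT INTO product_sales VALUES
  (1, 'A', 'Shoes', 100.0), (2, 'A', 'Shoes', 50.0), (3, 'A', 'Hats', 150.0),
  (4, 'B', 'Toys',   80.0), (5, 'B', 'Games', 20.0);
""")

top_categories = conn.execute("""
WITH category_revenue AS (
    SELECT brand_name, category_name, SUM(sale_amount) AS total_revenue
    FROM product_sales
    GROUP BY brand_name, category_name
),
ranked_categories AS (
    SELECT *, RANK() OVER (PARTITION BY brand_name
                           ORDER BY total_revenue DESC) AS category_rank
    FROM category_revenue
)
SELECT brand_name, category_name, total_revenue
FROM ranked_categories
WHERE category_rank = 1
ORDER BY brand_name, total_revenue DESC, category_name
""").fetchall()

for row in top_categories:
    print(row)
```

Brand A's Shoes and Hats both total 150, so both rows appear. Switching to ROW_NUMBER() would return exactly one row per brand, at the cost of picking arbitrarily between tied categories unless a tie-breaker is added.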
• Q.328
Question
Find the top 3 customers who have made the most number of purchases in a single month,
across all months. Output the customer ID, customer name, the month (YYYY-MM), and the
number of purchases made by that customer in that month, ordered by number of purchases
(highest to lowest).
Explanation
To solve this:
• Group the purchases by customer_id, customer_name, and the month of purchase.
• Count the number of purchases made by each customer in each month.
• Identify the top 3 customers who made the most purchases in any single month.
• Output the results, sorted by the number of purchases in descending order.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE customer_purchases (
purchase_id INT PRIMARY KEY,
customer_id INT,
customer_name VARCHAR(100),
purchase_date DATE,
product_id INT,
purchase_amount DECIMAL(10, 2)
);
Learnings
• Using GROUP BY with date-formatting functions (TO_CHAR() in PostgreSQL, DATE_FORMAT() in
MySQL) to group data by month.
• Counting the number of purchases made by each customer in each month using COUNT().
• Sorting and limiting results to get the top N customers based on purchase counts.
Solutions
• - PostgreSQL solution
WITH monthly_purchases AS (
SELECT
customer_id,
customer_name,
TO_CHAR(purchase_date, 'YYYY-MM') AS purchase_month,
COUNT(purchase_id) AS num_purchases
FROM customer_purchases
GROUP BY customer_id, customer_name, TO_CHAR(purchase_date, 'YYYY-MM')
)
SELECT
customer_id,
customer_name,
purchase_month,
num_purchases
FROM monthly_purchases
ORDER BY num_purchases DESC
LIMIT 3;
• - MySQL solution
WITH monthly_purchases AS (
SELECT
customer_id,
customer_name,
DATE_FORMAT(purchase_date, '%Y-%m') AS purchase_month,
COUNT(purchase_id) AS num_purchases
FROM customer_purchases
GROUP BY customer_id, customer_name, DATE_FORMAT(purchase_date, '%Y-%m')
)
SELECT
customer_id,
customer_name,
purchase_month,
num_purchases
FROM monthly_purchases
ORDER BY num_purchases DESC
LIMIT 3;
Explanation
• PostgreSQL and MySQL:
• The monthly_purchases CTE groups the data by customer_id, customer_name, and the
formatted month (YYYY-MM). It then calculates the number of purchases made by each
customer within each month using COUNT(purchase_id).
• The final query sorts the results by num_purchases in descending order and limits the
output to the top 3 records using LIMIT 3.
This problem helps you practice using date formatting and aggregation to analyze customer
purchase behavior over time. It tests your ability to group data by time intervals (months) and
sort results based on counts or other aggregated metrics.
• Q.329
Question
Find all customer names whose email addresses contain a domain from a specific list of
domains (e.g., gmail.com, yahoo.com, outlook.com). Output the customer name and email
address.
Explanation
To solve this:
• Use regular expressions to match email addresses that contain specific domains.
• Filter the emails by the given domains using REGEXP or REGEXP_LIKE (depending on the
DBMS).
• Output the customer name and email address.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
email VARCHAR(100)
);
Learnings
• Using regular expressions (REGEXP or REGEXP_LIKE) to filter email addresses.
• Understanding how to match patterns, specifically domain names within email addresses.
• Filtering results based on pattern matching in a column.
Solutions
• - PostgreSQL solution
SELECT
customer_name,
email
FROM customers
WHERE email ~* '@(gmail\.com|yahoo\.com|outlook\.com)$';
• - MySQL solution
SELECT
customer_name,
email
FROM customers
WHERE email REGEXP '@(gmail\\.com|yahoo\\.com|outlook\\.com)$';
Explanation
• PostgreSQL and MySQL:
• The regular expression checks if the email ends with any of the specified domains
(gmail.com, yahoo.com, outlook.com). Anchoring the pattern on @ makes the whole domain
match, so an address such as user@notgmail.com is excluded.
• The ~* operator in PostgreSQL matches case-insensitively. REGEXP in MySQL is
case-insensitive when the column uses a case-insensitive collation, which is the default.
• The dot (.) is a special character in regex, so it is escaped to be treated literally.
PostgreSQL takes a single backslash (\.) with the default standard_conforming_strings
setting; in a MySQL string literal the backslash itself must be doubled (\\.).
This question helps you practice using regular expressions to filter data based on patterns,
and it's useful for identifying customers from specific email domains.
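Regex patterns like this are easy to test outside the database first. The sketch below uses Python's re module, whose syntax for this pattern matches what PostgreSQL and MySQL accept; the sample addresses are made up. It anchors the domain list on @ so that only whole domains match.

```python
import re

# "@" pins the start of the domain, "\." matches a literal dot, "$" pins the end.
DOMAIN_PATTERN = re.compile(r"@(gmail\.com|yahoo\.com|outlook\.com)$", re.IGNORECASE)

emails = [
    "ana@gmail.com",
    "ben@Yahoo.com",         # matched because of IGNORECASE
    "cy@example.com",        # different domain
    "dee@notgmail.com",      # ends with "gmail.com" but is a different domain
    "eli@mail.outlook.com",  # subdomain, not outlook.com itself
]

matches = [email for email in emails if DOMAIN_PATTERN.search(email)]
print(matches)
```

Without the leading @, a pattern that only checks the ending would also accept dee@notgmail.com.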
• Q.330
Question
Based on each user's most recent transaction date, write a query to retrieve the users along
with the number of products they bought. Output the user's most recent transaction date, user
ID, and the number of products bought, sorted in chronological order by the transaction date.
Explanation
To solve this:
• Use the MAX() function to find the most recent transaction date for each user.
• Count the number of products bought by each user in their most recent transaction.
• Sort the results by the transaction date in chronological order.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE walmart_transactions (
transaction_id INT PRIMARY KEY,
user_id INT,
transaction_date DATE,
product_id INT,
product_name VARCHAR(100),
quantity INT
);
Learnings
• Using MAX() to find the most recent transaction date.
• Counting products bought in a specific transaction using SUM().
• Using GROUP BY to aggregate results by user.
• Sorting results based on date using ORDER BY.
Solutions
• - PostgreSQL solution
WITH recent_transactions AS (
SELECT
user_id,
MAX(transaction_date) AS recent_transaction_date
FROM walmart_transactions
GROUP BY user_id
)
SELECT
rt.recent_transaction_date,
wt.user_id,
SUM(wt.quantity) AS num_products
FROM recent_transactions rt
JOIN walmart_transactions wt ON rt.user_id = wt.user_id AND rt.recent_transaction_date =
wt.transaction_date
GROUP BY rt.recent_transaction_date, wt.user_id
ORDER BY rt.recent_transaction_date;
• - MySQL solution
WITH recent_transactions AS (
SELECT
user_id,
MAX(transaction_date) AS recent_transaction_date
FROM walmart_transactions
GROUP BY user_id
)
SELECT
rt.recent_transaction_date,
wt.user_id,
SUM(wt.quantity) AS num_products
FROM recent_transactions rt
JOIN walmart_transactions wt ON rt.user_id = wt.user_id AND rt.recent_transaction_date =
wt.transaction_date
GROUP BY rt.recent_transaction_date, wt.user_id
ORDER BY rt.recent_transaction_date;
Explanation
• PostgreSQL and MySQL:
• The recent_transactions CTE finds the most recent transaction date for each user by
grouping the data by user_id and using MAX() on transaction_date.
• The main query joins this CTE with the walmart_transactions table to get the user_id,
transaction_date, and the total number of products bought in their most recent transaction
(calculated using SUM(wt.quantity)).
• The results are grouped by user_id and recent_transaction_date to aggregate product
quantities.
• The final result is sorted by recent_transaction_date to output the users in
chronological order of their most recent transactions.
This problem helps you practice working with CTEs, joins, and aggregation to
analyze transaction data by user. It also improves your ability to work with dates and
summarize user activity.
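The same result can also be reached with a window function instead of a join. The following is a sketch rather than the book's solution; the CTE name ranked and the column date_rank are illustrative names. It should run on PostgreSQL and on MySQL 8 or later.

```sql
WITH ranked AS (
    SELECT
        user_id,
        transaction_date,
        quantity,
        -- Every row on a user's latest date gets rank 1.
        RANK() OVER (PARTITION BY user_id ORDER BY transaction_date DESC) AS date_rank
    FROM walmart_transactions
)
SELECT
    transaction_date AS recent_transaction_date,
    user_id,
    SUM(quantity) AS num_products
FROM ranked
WHERE date_rank = 1
GROUP BY transaction_date, user_id
ORDER BY transaction_date;
```

RANK() is used rather than ROW_NUMBER() so that several rows sharing the latest date are all kept and summed.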
• Q.331
Question
Given a table of employee sales, write a query to select the Employee_id, Store_id, and a
rank based on their Sale_amount for the year 2023, with 1 being the highest performing
employee.
Explanation
To solve this:
• Filter the data for the year 2023.
• Use a window function to rank employees based on their Sale_amount in descending
order.
• Return the Employee_id, Store_id, and the rank for each employee based on their total
sales.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE employee_sales (
transaction_id INT PRIMARY KEY,
employee_id INT,
store_id INT,
sale_date DATE,
sale_amount DECIMAL(10, 2)
);
Learnings
• Using the RANK() window function to assign rankings based on a specific column.
• Filtering data for a specific year.
Explanation
• PostgreSQL and MySQL:
• The query filters the employee_sales table for transactions in 2023 using the
EXTRACT(YEAR FROM sale_date) in PostgreSQL and YEAR(sale_date) in MySQL.
• It then calculates the total sales for each employee by grouping the data by employee_id
and store_id, and uses the SUM(sale_amount) function.
• The RANK() window function is used to assign a ranking to each employee based on their
total sales in descending order, with the highest sales receiving a rank of 1. The PARTITION
BY store_id ensures that rankings are calculated separately for each store.
• The final result is ordered by store_id and sales_rank to ensure the rankings are in the
correct order.
This problem helps you practice working with window functions, filtering data by year, and
grouping results to analyze employee performance. It also gives you insight into ranking
methods like RANK() and DENSE_RANK().
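The query itself is not shown for this question. A minimal sketch that follows the steps in the explanation is given below; it is a reconstruction, not the author's original text. The alias sales_rank comes from the explanation, and everything else is taken from the employee_sales schema.

```sql
SELECT
    employee_id,
    store_id,
    -- Rank employees within each store by their total 2023 sales.
    RANK() OVER (
        PARTITION BY store_id
        ORDER BY SUM(sale_amount) DESC
    ) AS sales_rank
FROM employee_sales
WHERE EXTRACT(YEAR FROM sale_date) = 2023   -- MySQL also accepts YEAR(sale_date) = 2023
GROUP BY employee_id, store_id
ORDER BY store_id, sales_rank;
```

If the interviewer wants a single company-wide ranking, dropping the PARTITION BY clause would rank all employees together. A range filter on sale_date (from 2023-01-01 up to but not including 2024-01-01) is an alternative that lets an index on sale_date be used.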
• Q.332
Question
Write a SQL query to select the Supplier_id, Product_id, and start date of the period when
the stock quantity was below 50 units for more than two consecutive days.
Explanation
To solve this:
• Identify periods where the stock quantity is below 50 for consecutive days.
• Use LEAD() or LAG() window functions to compare stock quantities from one day to the
next.
• Detect streaks of days where the stock quantity remains below 50.
• Return the Supplier_id, Product_id, and the start date of these periods.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE inventory (
record_id INT PRIMARY KEY,
supplier_id INT,
product_id INT,
stock_quantity INT,
record_date DATE
);
Learnings
• Using window functions (LAG(), LEAD()) to compare adjacent rows.
• Identifying patterns or streaks of data, such as consecutive days of low stock.
• Applying conditions to find specific periods where stock is below a threshold.
Solutions
• - PostgreSQL solution
WITH consecutive_low_stock AS (
SELECT
supplier_id,
product_id,
record_date,
stock_quantity,
LEAD(stock_quantity, 1) OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS next_day_stock,
LEAD(stock_quantity, 2) OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS next_2_day_stock
FROM inventory
)
SELECT
supplier_id,
product_id,
record_date AS start_date
FROM consecutive_low_stock
WHERE stock_quantity < 50
AND next_day_stock < 50
AND next_2_day_stock < 50
ORDER BY record_date;
• - MySQL solution
WITH consecutive_low_stock AS (
SELECT
supplier_id,
product_id,
record_date,
stock_quantity,
LEAD(stock_quantity, 1) OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS next_day_stock,
LEAD(stock_quantity, 2) OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS next_2_day_stock
FROM inventory
)
SELECT
supplier_id,
product_id,
record_date AS start_date
FROM consecutive_low_stock
WHERE stock_quantity < 50
AND next_day_stock < 50
AND next_2_day_stock < 50
ORDER BY record_date;
Explanation
• PostgreSQL and MySQL:
• The LEAD() window function retrieves the stock_quantity of the next record and the one
after it for each product and supplier. LEAD() is used rather than LAG() so that the row
passing the filter is the first day of the three-day window, not the last.
• The query checks if the current day’s stock and the following two days’ stock are all below
50.
• If this condition is true, the stock was below 50 for at least three consecutive days, that
is, for more than two consecutive days.
• The result includes the supplier_id, product_id, and the record_date of the first day in
this three-day window (the "start date").
• The query assumes one inventory record per product per day with no missing dates. In a
streak longer than three days, every three-day window inside the streak qualifies, so later
days of the same streak are also returned as start dates.
• The result is sorted by record_date to show the periods in chronological order.
This problem tests your ability to work with window functions (LEAD() and LAG()) and
identify patterns in sequential data, such as consecutive days of low stock. It also helps
practice filtering on conditions that span multiple rows.
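When each low-stock streak should be reported exactly once, a common alternative is the "gaps and islands" technique. The sketch below is PostgreSQL-specific (it relies on date minus integer arithmetic) and assumes at most one row per supplier, product, and date. The names low_days, grp, and days_low are illustrative.

```sql
WITH low_days AS (
    SELECT
        supplier_id,
        product_id,
        record_date,
        -- Consecutive dates minus a consecutive row number give a constant,
        -- so every unbroken run of low-stock days shares one grp value.
        record_date - CAST(ROW_NUMBER() OVER (
            PARTITION BY supplier_id, product_id
            ORDER BY record_date
        ) AS INTEGER) AS grp
    FROM inventory
    WHERE stock_quantity < 50
)
SELECT
    supplier_id,
    product_id,
    MIN(record_date) AS start_date,
    COUNT(*)         AS days_low
FROM low_days
GROUP BY supplier_id, product_id, grp
HAVING COUNT(*) > 2          -- more than two consecutive days
ORDER BY start_date;
```

Because the grouping is driven by calendar dates, a missing day breaks a streak instead of being silently skipped. MySQL would need its own date arithmetic in place of the subtraction.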
• Q.333
Question
Write a SQL query to select the Customer_id and Store_id for customers with more than
10 purchases from the same store in the past year.
Explanation
To solve this:
• Filter the transactions for the past year using CURRENT_DATE or NOW().
• Group the data by Customer_id and Store_id.
• Count the number of purchases for each customer at each store.
• Return customers who have made more than 10 purchases at the same store.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE customer_purchases (
purchase_id INT PRIMARY KEY,
customer_id INT,
store_id INT,
purchase_date DATE,
amount DECIMAL(10, 2)
);
Learnings
• Using GROUP BY to group data by multiple columns.
• Counting occurrences of events (purchases) using COUNT().
• Filtering groups on the result of an aggregate function using HAVING.
• Working with date ranges (e.g., "past year") and the CURRENT_DATE function.
Solutions
• - PostgreSQL solution
SELECT
customer_id,
store_id
FROM customer_purchases
WHERE purchase_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY customer_id, store_id
HAVING COUNT(purchase_id) > 10;
• - MySQL solution
SELECT
customer_id,
store_id
FROM customer_purchases
WHERE purchase_date >= CURDATE() - INTERVAL 1 YEAR
GROUP BY customer_id, store_id
HAVING COUNT(purchase_id) > 10;
Explanation
• PostgreSQL and MySQL:
• The query filters the data to include only purchases made in the past year using
CURRENT_DATE - INTERVAL '1 year' in PostgreSQL and CURDATE() - INTERVAL 1
YEAR in MySQL.
• The data is grouped by customer_id and store_id to identify each customer’s purchases
at each store.
• The HAVING COUNT(purchase_id) > 10 condition ensures that only customers with more
than 10 purchases from the same store in the past year are included.
• The result includes the customer_id and store_id of frequent shoppers.
This problem demonstrates how to filter and aggregate data based on time intervals and
customer activity, and it emphasizes the importance of using the HAVING clause for
aggregated conditions.
• Q.334
Question
Write a SQL query to find all orders with a total amount greater than twice the average order
amount.
Explanation
To solve this:
• Calculate the average order amount across all orders.
• Find orders where the total order amount exceeds twice this average.
• Return the relevant order details such as Order_id, Customer_id, and Total_amount.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10, 2)
);
Learnings
• Calculating averages using AVG().
• Using subqueries to compare an individual value against an aggregate (in this case,
comparing order totals against twice the average).
• Filtering data based on a condition applied to aggregated results.
Solutions
• - PostgreSQL solution
WITH avg_order_amount AS (
SELECT AVG(total_amount) AS avg_amount FROM orders
)
SELECT
order_id,
customer_id,
total_amount
FROM orders, avg_order_amount
WHERE total_amount > 2 * avg_amount;
• - MySQL solution
WITH avg_order_amount AS (
SELECT AVG(total_amount) AS avg_amount FROM orders
)
SELECT
order_id,
customer_id,
total_amount
FROM orders, avg_order_amount
WHERE total_amount > 2 * avg_amount;
Explanation
• PostgreSQL and MySQL:
• The CTE avg_order_amount calculates the average total_amount from the orders
table. It returns a single row, so the comma join in the main query attaches that average to
every order.
• The main query compares the total_amount of each order to twice the average order
amount.
• Only orders where the total_amount exceeds twice the average are selected.
• The result includes order_id, customer_id, and total_amount for all qualifying orders.
This problem tests your ability to work with aggregates and subqueries, particularly in the
context of comparing individual rows to overall summary statistics (like averages). It helps
you practice filtering based on calculated metrics.
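The same comparison can be written with a scalar subquery, which some readers find easier to follow than joining to a one-row CTE. This is an equivalent sketch, not a replacement for the solutions above, and it should run on both PostgreSQL and MySQL.

```sql
SELECT
    order_id,
    customer_id,
    total_amount
FROM orders
-- The subquery returns one value: the average across all orders.
WHERE total_amount > 2 * (SELECT AVG(total_amount) FROM orders);
```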
• Q.335
Question
Write a SQL query to find the top 5 products with the highest increase in sales compared to
the previous month.
Explanation
To solve this:
• Calculate the sales of each product for the current month and the previous month.
• Subtract the previous month's sales from the current month's sales to calculate the increase.
• Order the results by the highest increase in sales.
• Return the top 5 products with the highest sales increase.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE product_sales (
product_id INT,
product_name VARCHAR(100),
sale_date DATE,
sale_amount DECIMAL(10, 2)
);
Learnings
• Using GROUP BY to aggregate sales by month and product.
• Applying LAG() or self-joins to compare the current month’s sales to the previous month's
sales.
• Sorting data to get the top 5 products based on sales increase.
Solutions
• - PostgreSQL solution
WITH sales_by_month AS (
SELECT
product_id,
product_name,
EXTRACT(YEAR FROM sale_date) AS year,
EXTRACT(MONTH FROM sale_date) AS month,
SUM(sale_amount) AS total_sales
FROM product_sales
GROUP BY product_id, product_name, EXTRACT(YEAR FROM sale_date), EXTRACT(MONTH FROM
sale_date)
),
sales_diff AS (
SELECT
a.product_id,
a.product_name,
a.year,
a.month,
a.total_sales - COALESCE(b.total_sales, 0) AS sales_increase
FROM sales_by_month a
LEFT JOIN sales_by_month b
ON a.product_id = b.product_id
-- Compare month indexes so that January also links to the previous December
AND a.year * 12 + a.month = b.year * 12 + b.month + 1
)
SELECT
product_id,
product_name,
sales_increase
FROM sales_diff
ORDER BY sales_increase DESC
LIMIT 5;
• - MySQL solution
WITH sales_by_month AS (
SELECT
product_id,
product_name,
YEAR(sale_date) AS year,
MONTH(sale_date) AS month,
SUM(sale_amount) AS total_sales
FROM product_sales
GROUP BY product_id, product_name, YEAR(sale_date), MONTH(sale_date)
),
sales_diff AS (
SELECT
a.product_id,
a.product_name,
a.year,
a.month,
a.total_sales - COALESCE(b.total_sales, 0) AS sales_increase
FROM sales_by_month a
LEFT JOIN sales_by_month b
ON a.product_id = b.product_id
-- Compare month indexes so that January also links to the previous December
AND a.year * 12 + a.month = b.year * 12 + b.month + 1
)
SELECT
product_id,
product_name,
sales_increase
FROM sales_diff
ORDER BY sales_increase DESC
LIMIT 5;
Explanation
• PostgreSQL and MySQL:
• First, the sales_by_month CTE calculates the total sales for each product in each month
by extracting the year and month from the sale_date.
• The sales_diff CTE calculates the difference in sales between the current month
(a.total_sales) and the previous month (b.total_sales). The COALESCE() function
ensures that if there is no previous month's data, the sales increase is treated as the total sales
for the current month.
• The final result selects the top 5 products with the highest sales increase, ordered by
sales_increase in descending order.
This query demonstrates how to compare aggregated data across different time periods
(months) and how to find products with the highest performance improvement over time.
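If "current month" is read as the calendar month containing today's date, the comparison can be limited to just that month and the one before it. The sketch below is PostgreSQL-specific and rests on that interpretation, which the question does not state explicitly. The names bounds and this_month_start are illustrative.

```sql
WITH bounds AS (
    -- First day of the current calendar month.
    SELECT CAST(DATE_TRUNC('month', CURRENT_DATE) AS DATE) AS this_month_start
)
SELECT
    s.product_id,
    s.product_name,
    -- Current-month sales minus previous-month sales.
    SUM(CASE WHEN s.sale_date >= b.this_month_start THEN s.sale_amount ELSE 0 END)
  - SUM(CASE WHEN s.sale_date <  b.this_month_start THEN s.sale_amount ELSE 0 END)
        AS sales_increase
FROM product_sales s
CROSS JOIN bounds b
WHERE s.sale_date >= b.this_month_start - INTERVAL '1 month'
  AND s.sale_date <  b.this_month_start + INTERVAL '1 month'
GROUP BY s.product_id, s.product_name
ORDER BY sales_increase DESC
LIMIT 5;
```

Products with no sales in either month do not appear at all, while a product that sold only in the current month is credited with its full current-month sales as the increase.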
• Q.336
Question
Write a SQL query to calculate the average time taken to fulfill orders, from order placement
to delivery, for each city.
Explanation
To solve this:
• Calculate the difference between the order placement date (order_date) and the delivery
date (delivery_date) for each order.
• Group the data by city and calculate the average time taken to fulfill the order for each
city.
• Output the city and the average fulfillment time (in days).
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
city VARCHAR(100),
order_date DATE,
delivery_date DATE,
total_amount DECIMAL(10, 2)
);
Learnings
• Calculating date differences in SQL using DATEDIFF() or direct subtraction.
• Using GROUP BY to group the data by city and calculate aggregates.
• Aggregating with AVG() to find the average fulfillment time per city.
Solutions
• - PostgreSQL solution
SELECT
city,
AVG(CAST(delivery_date - order_date AS INTEGER)) AS avg_fulfillment_time
FROM orders
GROUP BY city;
• - MySQL solution
SELECT
city,
AVG(DATEDIFF(delivery_date, order_date)) AS avg_fulfillment_time
FROM orders
GROUP BY city;
Explanation
• PostgreSQL:
• The difference between delivery_date and order_date is calculated directly. Subtracting
two DATE values in PostgreSQL already returns an integer number of days, so the cast to
INTEGER is not strictly needed; it only makes the intent explicit.
• AVG() is used to calculate the average fulfillment time for each city.
• MySQL:
• The DATEDIFF() function calculates the difference between delivery_date and
order_date in days.
• AVG() is used to calculate the average fulfillment time for each city.
Both solutions group the results by city to show the average time taken to fulfill orders for
each city.
This problem is a good exercise in working with date functions and aggregating time-based
data.
• Q.337
Question
Write a SQL query to identify customers who have not placed any orders in the last 6 months
but had placed more than 5 orders in the 6 months prior.
Explanation
To solve this:
• Identify customers who have not placed any orders in the last 6 months.
• Check the number of orders placed by these customers in the 6 months prior.
• Return customers who meet the condition of placing more than 5 orders in the previous 6
months and none in the last 6 months.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10, 2)
);
Learnings
• Filtering data by date ranges (using CURRENT_DATE or NOW()).
• Using COUNT() to find the number of orders for a customer in a specific time period.
• Combining conditions using HAVING to filter based on aggregated results.
Solutions
• - PostgreSQL solution
WITH orders_in_last_6_months AS (
SELECT
customer_id
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY customer_id
),
orders_in_previous_6_months AS (
SELECT
customer_id
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '12 months'
AND order_date < CURRENT_DATE - INTERVAL '6 months'
GROUP BY customer_id
HAVING COUNT(order_id) > 5
)
SELECT
o.customer_id
FROM orders_in_previous_6_months o
LEFT JOIN orders_in_last_6_months l
ON o.customer_id = l.customer_id
WHERE l.customer_id IS NULL;
• - MySQL solution
WITH orders_in_last_6_months AS (
SELECT
customer_id
FROM orders
WHERE order_date >= CURDATE() - INTERVAL 6 MONTH
GROUP BY customer_id
),
orders_in_previous_6_months AS (
SELECT
customer_id
FROM orders
WHERE order_date >= CURDATE() - INTERVAL 12 MONTH
AND order_date < CURDATE() - INTERVAL 6 MONTH
GROUP BY customer_id
HAVING COUNT(order_id) > 5
)
SELECT
o.customer_id
FROM orders_in_previous_6_months o
LEFT JOIN orders_in_last_6_months l
ON o.customer_id = l.customer_id
WHERE l.customer_id IS NULL;
Explanation
• PostgreSQL and MySQL:
• The first CTE (orders_in_last_6_months) finds all customers who placed an order in
the last 6 months.
• The second CTE (orders_in_previous_6_months) finds customers who placed more
than 5 orders in the 6-month period prior to the last 6 months. This is done using HAVING
COUNT(order_id) > 5.
• A LEFT JOIN is performed between the two CTEs to ensure we identify customers who
have no orders in the last 6 months (l.customer_id IS NULL).
• The final result returns customers who placed more than 5 orders in the previous 6 months
but have not placed any orders in the last 6 months.
This query demonstrates the use of CTEs, filtering based on date ranges, and using COUNT()
with HAVING to aggregate and filter data.
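The "no recent orders" condition can also be expressed with NOT EXISTS instead of a LEFT JOIN with an IS NULL check. The following is a PostgreSQL sketch of that variant; the alias recent is an illustrative name. In MySQL, the interval literals would be written as INTERVAL 6 MONTH and INTERVAL 12 MONTH.

```sql
SELECT o.customer_id
FROM orders o
WHERE o.order_date >= CURRENT_DATE - INTERVAL '12 months'
  AND o.order_date <  CURRENT_DATE - INTERVAL '6 months'
  -- Exclude any customer with at least one order in the last 6 months.
  AND NOT EXISTS (
      SELECT 1
      FROM orders recent
      WHERE recent.customer_id = o.customer_id
        AND recent.order_date >= CURRENT_DATE - INTERVAL '6 months'
  )
GROUP BY o.customer_id
HAVING COUNT(o.order_id) > 5;
```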
• Q.338
Question
Write a SQL query to find the supplier_id, product_id, and starting record_date for
which stock quantity is less than 50 for two or more consecutive days.
Explanation
To solve this:
• Identify days where stock quantity is less than 50.
• Check for consecutive days where stock remains below 50.
• Return the supplier_id, product_id, and the first date of consecutive days when stock
quantity was below 50.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE inventory (
supplier_id INT,
product_id INT,
record_date DATE,
stock_quantity INT
);
Learnings
• Identifying consecutive rows with specific conditions (in this case, stock quantity less than
50).
• Using LEAD() or LAG() window functions for identifying consecutive data.
• Combining CASE statements or joins to filter consecutive periods with specific conditions.
Solutions
• - PostgreSQL solution
WITH consecutive_low_stock AS (
SELECT
supplier_id,
product_id,
record_date,
LAG(record_date) OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS prev_low_day,
LEAD(record_date) OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS next_low_day
FROM inventory
WHERE stock_quantity < 50
)
SELECT
supplier_id,
product_id,
record_date
FROM consecutive_low_stock
WHERE next_low_day = record_date + 1
AND (prev_low_day IS NULL OR prev_low_day <> record_date - 1)
ORDER BY record_date;
• - MySQL solution
WITH consecutive_low_stock AS (
SELECT
supplier_id,
product_id,
record_date,
LAG(record_date) OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS prev_low_day,
LEAD(record_date) OVER (PARTITION BY supplier_id, product_id ORDER BY record_date) AS next_low_day
FROM inventory
WHERE stock_quantity < 50
)
SELECT
supplier_id,
product_id,
record_date
FROM consecutive_low_stock
WHERE next_low_day = record_date + INTERVAL 1 DAY
AND (prev_low_day IS NULL OR prev_low_day <> record_date - INTERVAL 1 DAY)
ORDER BY record_date;
Explanation
• PostgreSQL and MySQL:
• The CTE keeps only rows with stock_quantity below 50, so LAG() and LEAD() return the
previous and next low-stock dates for the same supplier and product. These are not
necessarily the adjacent calendar days.
• The next_low_day condition confirms that the following calendar day was also below 50,
which gives at least two consecutive low-stock days.
• The prev_low_day condition keeps a row only if the day before it was not a low-stock day,
so each streak is reported once, at its first date.
• The query assumes at most one record per supplier, product, and date.
• The result includes supplier_id, product_id, and record_date for the starting date of
each consecutive low-stock period.
This solution uses window functions (LAG() and LEAD()) to examine consecutive days and
finds periods where stock is consistently low, helping to identify potential inventory issues
for suppliers.
• Q.339
Question
Given a table containing product sales data, write a query to find the top-selling product by
revenue in each product category. Include the category, product name, and total sales for each
product.
Explanation
To solve this:
• Calculate the total sales (total_sales) for each product in each category by multiplying
the quantity sold by the price.
• Group the data by product category and product name, then sum up the total sales for each
product.
• Use a window function (RANK() or ROW_NUMBER()) to rank the products within each
category based on total sales in descending order.
• Retrieve the top-ranked product in each category, which will be the one with the highest
total sales.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE product_sales (
category VARCHAR(100),
product_name VARCHAR(100),
quantity_sold INT,
price DECIMAL(10, 2)
);
Learnings
• Aggregating data based on categories and products.
• Using SUM() to calculate total sales.
• Using window functions like RANK() or ROW_NUMBER() to rank products by their total sales
within each category.
• Using PARTITION BY with window functions to apply the ranking within each category.
Solutions
• - PostgreSQL solution
WITH ranked_sales AS (
SELECT
category,
product_name,
SUM(quantity_sold * price) AS total_sales,
RANK() OVER (PARTITION BY category ORDER BY SUM(quantity_sold * price) DESC) AS
sales_rank
FROM product_sales
GROUP BY category, product_name
)
SELECT
category,
product_name,
total_sales
FROM ranked_sales
WHERE sales_rank = 1
ORDER BY category;
• - MySQL solution
WITH ranked_sales AS (
SELECT
category,
product_name,
SUM(quantity_sold * price) AS total_sales,
RANK() OVER (PARTITION BY category ORDER BY SUM(quantity_sold * price) DESC) AS
sales_rank
FROM product_sales
GROUP BY category, product_name
)
SELECT
category,
product_name,
total_sales
FROM ranked_sales
WHERE sales_rank = 1
ORDER BY category;
Explanation
• PostgreSQL and MySQL:
• The SUM(quantity_sold * price) calculates the total sales for each product.
• RANK() is used to assign a rank to each product within its category, ordering by total sales
in descending order.
• The PARTITION BY category ensures the ranking is calculated separately for each
category.
• Only the top-ranked product in each category is selected by filtering on sales_rank = 1.
• The result shows the category, product name, and total sales for the top-selling product in
each category.
This query demonstrates the use of window functions (RANK()) to rank products within each
category based on their total sales, allowing you to find the top-sellers efficiently.
• Q.340
Question
Write a SQL query to find all possible combinations of employees and departments (cross
join). Include the employee's employee_id, employee_name, department_id, and
department_name.
Explanation
To solve this:
• Use a CROSS JOIN to combine all rows from the employees table with all rows from the
departments table.
• Ensure the output includes the employee's ID and name, as well as the department's ID and
name.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE employees (
employee_id INT,
employee_name VARCHAR(100)
);
Learnings
• Using CROSS JOIN to generate all possible combinations of two tables.
• Understanding how Cartesian products work (each row of the first table is combined with
all rows from the second table).
Solutions
• - PostgreSQL and MySQL solution
SELECT
e.employee_id,
e.employee_name,
d.department_id,
d.department_name
FROM employees e
CROSS JOIN departments d;
Explanation
• The CROSS JOIN creates a Cartesian product of the employees and departments tables.
This means each employee will be paired with every department, resulting in all possible
combinations.
• The query outputs the employee_id, employee_name, department_id, and
department_name for each combination.
This query demonstrates how to use a CROSS JOIN to generate all combinations between two
tables without any filtering or condition.
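To see the row count of a Cartesian product concretely, the join can be run on small inline row sets. The sketch below uses PostgreSQL's VALUES syntax and made-up employees and departments, so it does not depend on any table existing.

```sql
SELECT
    e.employee_id,
    e.employee_name,
    d.department_id,
    d.department_name
FROM (VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen'))
         AS e(employee_id, employee_name)
CROSS JOIN (VALUES (10, 'Sales'), (20, 'Engineering'))
         AS d(department_id, department_name)
ORDER BY e.employee_id, d.department_id;
```

Three employees paired with two departments produce 3 x 2 = 6 rows, which is why CROSS JOIN results grow quickly on large tables.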
Flipkart
• Q.341
Question
How would you concatenate two strings in SQL?
Explanation
To solve this:
• Use the string concatenation operator or function specific to the SQL dialect you are
working with.
• The goal is to combine two string values into a single string.
Datasets and SQL Schemas
No table creation is needed for this specific question as we are just focusing on string
concatenation.
Learnings
• Understanding how to concatenate strings in different SQL databases.
• Using the string concatenation operator (|| or +) or the built-in function (CONCAT()).
Solutions
• - PostgreSQL solution
SELECT 'Hello' || ' ' || 'World' AS concatenated_string;
• - MySQL solution
SELECT CONCAT('Hello', ' ', 'World') AS concatenated_string;
Explanation
• PostgreSQL: The || operator is used to concatenate strings.
• MySQL: The CONCAT() function is used to concatenate multiple strings.
In both examples, the query combines "Hello" and "World" with a space between them to
return the concatenated result "Hello World".
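A frequent follow-up question is what happens when one of the values is NULL. The sketch below is PostgreSQL-specific and uses illustrative column aliases; it shows that the || operator propagates NULL, while COALESCE() or CONCAT_WS() can be used to skip it.

```sql
SELECT
    -- Any NULL operand makes the whole || expression NULL.
    'Hello' || CAST(NULL AS TEXT) || 'World'               AS with_operator,
    -- COALESCE substitutes an empty string, giving 'HelloWorld'.
    'Hello' || COALESCE(CAST(NULL AS TEXT), '') || 'World' AS with_coalesce,
    -- CONCAT_WS skips NULL arguments, giving 'Hello World'.
    CONCAT_WS(' ', 'Hello', NULL, 'World')                 AS with_concat_ws;
```

In MySQL, CONCAT() also returns NULL when any argument is NULL, and CONCAT_WS() skips NULL arguments in the same way. Note that || is a logical OR in MySQL's default SQL mode, so CONCAT() is the safer choice there.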
• Q.342
Question
Write a query to find all the big countries from the World table. A country is considered big if
it has:
• An area greater than 3 million square kilometers, or
• A population of more than 25 million.
Output the name, population, and area of these countries.
Explanation
To solve this:
• Use a WHERE clause to keep the countries that either have an area greater than 3 million
square kilometers or a population of more than 25 million.
Learnings
• Using logical conditions (OR) in the WHERE clause to filter rows.
• Filtering numeric data based on multiple criteria.
Solutions
• - PostgreSQL solution
SELECT name, population, area
FROM World
WHERE area > 3000000 OR population > 25000000;
• - MySQL solution
SELECT name, population, area
FROM World
WHERE area > 3000000 OR population > 25000000;
• Q.343
Question
Write a SQL query to find pairs of (actor_id, director_id) where the actor has co-
worked with the director at least 3 times.
Explanation
• Group by actor_id and director_id: We need to count how many times each (actor_id,
director_id) pair appears in the table.
• HAVING clause: We filter out the pairs that have fewer than 3 occurrences.
• Select distinct pairs: After applying the HAVING clause, select the pairs that meet the
condition.
Learnings
• GROUP BY: Useful for aggregating data (like counting occurrences) based on one or
more columns.
• HAVING: Used in combination with GROUP BY to filter groups based on an aggregate
condition.
• COUNT(): The aggregate function that counts the number of rows per group.
Solutions
• - PostgreSQL solution
SELECT actor_id, director_id
FROM ActorDirector
GROUP BY actor_id, director_id
HAVING COUNT(*) >= 3;
• - MySQL solution
SELECT actor_id, director_id
FROM ActorDirector
GROUP BY actor_id, director_id
HAVING COUNT(*) >= 3;
• Q.344
Question
Write a SQL query to retrieve the FirstName, LastName, City, and State for each person in
the Person table, regardless of whether there is an associated address for each person.
Explanation
• LEFT JOIN: Since we want to include all people from the Person table, even those
without an address, we should use a LEFT JOIN to join the Person table with the Address
table. This will include all rows from the Person table and matching rows from the Address
table. If no match is found, the columns from Address will contain NULL.
• Select required columns: After joining the tables, we select the FirstName, LastName,
City, and State from the result.
Learnings
• LEFT JOIN: Ensures that all records from the left table (Person) are included, even if
there is no matching record in the right table (Address).
• Handling NULLs: When there's no matching address, the City and State will be NULL
for that person.
Solutions
• - PostgreSQL solution
SELECT p.FirstName, p.LastName, a.City, a.State
FROM Person p
LEFT JOIN Address a ON p.PersonId = a.PersonId;
• - MySQL solution
SELECT p.FirstName, p.LastName, a.City, a.State
FROM Person p
LEFT JOIN Address a ON p.PersonId = a.PersonId;
• Q.345
Question
Write a SQL query to get the second-highest salary from the Employee table.
Explanation
• Subquery or Ranking Functions: We can use a subquery to find the maximum salary that
is less than the highest salary, which will effectively give us the second-highest salary.
• Handle edge case: If there's no second-highest salary (i.e., all employees have the same
salary), we need to return NULL.
Learnings
• Subquery: Using a subquery to select the maximum salary less than the highest salary
gives us the second-highest value.
• Edge case: If there is no second-highest salary (e.g., only one distinct salary value), the
query should return NULL.
Solutions
• - PostgreSQL and MySQL solution
SELECT MAX(Salary) AS salary
FROM Employee
WHERE Salary < (SELECT MAX(Salary) FROM Employee);
Explanation:
• The subquery (SELECT MAX(Salary) FROM Employee) finds the highest salary.
• The outer query finds the maximum salary that is less than the highest salary, which is the
second-highest salary.
• If all salaries are the same, the WHERE clause will filter out all rows, and the result will be
NULL.
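The explanation mentions ranking functions as an alternative to the subquery. A sketch of that variant is shown below; the names ranked and salary_rank are illustrative, and the query should run on PostgreSQL and on MySQL 8 or later.

```sql
SELECT MAX(Salary) AS salary
FROM (
    SELECT
        Salary,
        -- DENSE_RANK gives equal salaries the same rank with no gaps.
        DENSE_RANK() OVER (ORDER BY Salary DESC) AS salary_rank
    FROM Employee
) ranked
WHERE salary_rank = 2;
```

Wrapping the result in MAX() keeps the NULL behavior when no second distinct salary exists, and changing the 2 to another number generalizes the query to the Nth-highest salary.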
• Q.346
Question
Write a SQL query to find the employee with the maximum salary for each gender from the
Salary table.
Explanation
• Group By: We need to group the records by gender (sex column).
• Max Salary: For each gender group, we need to select the employee with the maximum
salary.
• Tie-breaker: If there are multiple employees with the same maximum salary in the same
gender, the query should still return all of them.
Learnings
• Aggregate functions: Using MAX() allows us to fetch the highest salary.
• Grouping: We need to group by sex to separate male and female employees.
• Handling ties: If multiple employees have the same highest salary, the query should return
all such employees.
Solutions
• - PostgreSQL and MySQL solution
SELECT sex, name, salary
FROM Salary
WHERE (sex, salary) IN (
SELECT sex, MAX(salary)
FROM Salary
GROUP BY sex
);
Explanation:
• The subquery (SELECT sex, MAX(salary) FROM Salary GROUP BY sex) finds the
maximum salary for each gender.
• The outer query selects the employees whose sex and salary match those maximum
values, thus returning the employees with the highest salary in each gender.
• If multiple employees share the highest salary within the same gender, they are all
returned.
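The same result can be obtained with a window function, which avoids the row-value IN comparison. This is a sketch with illustrative names (ranked, salary_rank) that should run on PostgreSQL and on MySQL 8 or later.

```sql
SELECT sex, name, salary
FROM (
    SELECT
        sex,
        name,
        salary,
        -- RANK assigns 1 to every employee tied for the top salary.
        RANK() OVER (PARTITION BY sex ORDER BY salary DESC) AS salary_rank
    FROM Salary
) ranked
WHERE salary_rank = 1;
```

Using ROW_NUMBER() instead of RANK() would return exactly one employee per gender and break ties arbitrarily, which is not what the question asks for.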
• Q.347
Question
Find all numbers that appear at least three times consecutively in the "Logs" table.
Explanation
You need to identify numbers that appear in at least three consecutive rows. This can be done
by comparing each row with the two rows that follow it, which relies on the id values being
contiguous.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Logs (
id INT PRIMARY KEY,
num VARCHAR(10)
);
Learnings
• Using JOIN to compare rows with their subsequent ones
• Identifying consecutive rows
• Using DISTINCT so that each qualifying number is returned only once
Solutions
• - PostgreSQL solution
SELECT DISTINCT l1.num AS ConsecutiveNums
FROM Logs l1, Logs l2, Logs l3
WHERE l1.id = l2.id - 1 AND l2.id = l3.id - 1
AND l1.num = l2.num AND l2.num = l3.num;
• - MySQL solution
SELECT DISTINCT l1.num AS ConsecutiveNums
FROM Logs l1, Logs l2, Logs l3
WHERE l1.id = l2.id - 1 AND l2.id = l3.id - 1
AND l1.num = l2.num AND l2.num = l3.num;
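The row-to-row comparison can also be written with LEAD() instead of a three-way self-join. This is a sketch under the assumption that window functions are available (PostgreSQL, or MySQL 8.0+); windowed, next_num, and next_next_num are illustrative names. Because it orders by id rather than requiring id + 1, it treats rows as consecutive even when id values have gaps, which differs slightly from the self-join.

```sql
WITH windowed AS (
    SELECT num,
           LEAD(num, 1) OVER (ORDER BY id) AS next_num,       -- value in the following row
           LEAD(num, 2) OVER (ORDER BY id) AS next_next_num   -- value two rows ahead
    FROM Logs
)
SELECT DISTINCT num AS ConsecutiveNums
FROM windowed
WHERE num = next_num
  AND num = next_next_num;
```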
• Q.348
Question
Find the number of followers each follower has if they themselves have at least one follower.
Explanation
You need to find each follower's follower count. This can be done by joining the "follow"
table on itself and counting how many distinct followers each user has.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE follow (
followee VARCHAR(10),
follower VARCHAR(10)
);
Learnings
• Self-join to count related records
• Using GROUP BY and COUNT() for aggregation
• Filtering with HAVING to include only those with at least one follower
Solutions
• - PostgreSQL solution
SELECT f1.follower, COUNT(DISTINCT f2.follower) AS num
FROM follow f1
JOIN follow f2 ON f1.follower = f2.followee
GROUP BY f1.follower
HAVING COUNT(DISTINCT f2.follower) > 0;
• - MySQL solution
SELECT f1.follower, COUNT(DISTINCT f2.follower) AS num
FROM follow f1
JOIN follow f2 ON f1.follower = f2.followee
GROUP BY f1.follower
HAVING COUNT(DISTINCT f2.follower) > 0;
• Q.349
Question
For each user, find the largest gap in days between consecutive visits or from the last visit to
today's date.
Explanation
You need to calculate the difference in days between each user's visit and the next visit (or
today’s date for the last visit). The largest gap should be selected for each user.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE UserVisits (
user_id INT,
visit_date DATE
);
Learnings
• Using LEAD() to get the next visit date
• Calculating the date difference with DATEDIFF() in MySQL or date subtraction in PostgreSQL
• Using COALESCE() to handle the last visit date by comparing it to today
• Grouping by user_id and selecting the maximum gap
Solutions
• - PostgreSQL solution
WITH VisitGaps AS (
SELECT user_id,
visit_date,
-- The last visit has no successor, so it is compared with today's date
COALESCE(LEAD(visit_date) OVER (PARTITION BY user_id ORDER BY visit_date),
CURRENT_DATE) AS next_visit_date
FROM UserVisits
)
SELECT user_id, MAX(next_visit_date - visit_date) AS biggest_window
FROM VisitGaps
GROUP BY user_id
ORDER BY user_id;
• - MySQL solution
WITH VisitGaps AS (
SELECT user_id,
visit_date,
-- The last visit has no successor, so it is compared with today's date
COALESCE(LEAD(visit_date) OVER (PARTITION BY user_id ORDER BY visit_date),
CURDATE()) AS next_visit_date
FROM UserVisits
)
SELECT user_id, MAX(DATEDIFF(next_visit_date, visit_date)) AS biggest_window
FROM VisitGaps
GROUP BY user_id
ORDER BY user_id;
• Q.350
Question
Find the transaction IDs with the maximum amount for each day. If there are multiple
transactions with the same maximum amount on a day, return all of them.
Explanation
You need to identify the transaction(s) with the highest amount for each day. To achieve this,
you can first find the maximum amount per day and then filter the transactions based on that.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Transactions (
transaction_id INT PRIMARY KEY,
day TIMESTAMP, -- TIMESTAMP is accepted by both PostgreSQL and MySQL; DATETIME is MySQL-only
amount INT
);
Learnings
• Using MAX() with GROUP BY to find the maximum value
• Filtering results based on the maximum value for each day
• Sorting the result by transaction ID
Solutions
• - PostgreSQL solution
WITH MaxAmounts AS (
SELECT DATE(day) AS day, MAX(amount) AS max_amount
FROM Transactions
GROUP BY DATE(day)
)
SELECT t.transaction_id
FROM Transactions t
JOIN MaxAmounts ma ON DATE(t.day) = ma.day
WHERE t.amount = ma.max_amount
ORDER BY t.transaction_id;
• - MySQL solution
WITH MaxAmounts AS (
SELECT DATE(day) AS day, MAX(amount) AS max_amount
FROM Transactions
GROUP BY DATE(day)
)
SELECT t.transaction_id
FROM Transactions t
JOIN MaxAmounts ma ON DATE(t.day) = ma.day
WHERE t.amount = ma.max_amount
ORDER BY t.transaction_id;
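The "maximum per day, keep ties" requirement can also be expressed with RANK(). This is a sketch of an alternative, assuming window-function support (PostgreSQL, or MySQL 8.0+); ranked and amount_rank are illustrative names.

```sql
WITH ranked AS (
    SELECT transaction_id,
           -- Rank amounts within each calendar day; ties share rank 1
           RANK() OVER (PARTITION BY DATE(day) ORDER BY amount DESC) AS amount_rank
    FROM Transactions
)
SELECT transaction_id
FROM ranked
WHERE amount_rank = 1
ORDER BY transaction_id;
```

This form reads the table once instead of joining it to an aggregated copy of itself.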
• Q.351
Question
Find the records with three or more consecutive rows where the number of people is greater
than or equal to 100 for each row.
Explanation
You need to identify groups of three or more consecutive rows where the number of people is
greater than or equal to 100 for all rows in the group. You can achieve this by checking for
consecutive id values and applying a condition on the people column.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Stadium (
id INT PRIMARY KEY,
visit_date DATE,
people INT
);
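One way to apply the consecutive-id check described above is the "gaps and islands" pattern. The sketch below is an illustration only, not the book's own solution. It assumes id values are positive and increase by one per row, and that window functions are available (PostgreSQL, or MySQL 8.0+); busy, sized, grp, and run_length are made-up names.

```sql
WITH busy AS (
    SELECT id, visit_date, people,
           -- Within a run of consecutive ids, id minus the row number is constant
           id - ROW_NUMBER() OVER (ORDER BY id) AS grp
    FROM Stadium
    WHERE people >= 100
),
sized AS (
    SELECT id, visit_date, people,
           COUNT(*) OVER (PARTITION BY grp) AS run_length   -- rows in this run
    FROM busy
)
SELECT id, visit_date, people
FROM sized
WHERE run_length >= 3
ORDER BY visit_date;
```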
Learnings
Learnings
• Using JOIN to combine data from multiple tables
• Calculating volume by multiplying product dimensions with the units
• Grouping by warehouse name to get the total volume
Solutions
• - PostgreSQL solution
SELECT w.name AS warehouse_name,
SUM(w.units * p.Width * p.Length * p.Height) AS volume
FROM Warehouse w
JOIN Products p ON w.product_id = p.product_id
GROUP BY w.name;
• - MySQL solution
SELECT w.name AS warehouse_name,
SUM(w.units * p.Width * p.Length * p.Height) AS volume
FROM Warehouse w
JOIN Products p ON w.product_id = p.product_id
GROUP BY w.name;
• Q.353
Question
Find the patient_id, patient_name, and all conditions for patients who have Type I
Diabetes, where the conditions contain codes that start with the prefix DIAB1.
Explanation
You need to filter patients whose conditions column contains at least one code starting with
DIAB1. This can be done using the LIKE operator in SQL to match any condition that begins
with DIAB1.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Patients (
patient_id INT PRIMARY KEY,
patient_name VARCHAR(100),
conditions VARCHAR(255)
);
Learnings
• Using the LIKE operator to filter rows based on pattern matching
• Handling conditions stored as space-separated strings
• Querying based on a specific prefix (DIAB1)
Solutions
• - PostgreSQL solution
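SELECT patient_id, patient_name, conditions
FROM Patients
-- Match DIAB1 at the start of the string or right after a space,
-- so codes that merely contain DIAB1 elsewhere are not matched
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%';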
• - MySQL solution
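SELECT patient_id, patient_name, conditions
FROM Patients
-- Match DIAB1 at the start of the string or right after a space,
-- so codes that merely contain DIAB1 elsewhere are not matched
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%';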
• Q.354
Question
Find the second most recent activity for each user. If a user has only one activity, return that
activity.
Explanation
To find the second most recent activity, you need to order the activities for each user by
startDate in descending order and then fetch the second row (or the first row if only one
exists). You can achieve this using ROW_NUMBER() to rank the activities for each user and then
filter the result.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE UserActivity (
username VARCHAR(100),
activity VARCHAR(100),
startDate DATE,
endDate DATE
);
Learnings
• Using ROW_NUMBER() window function to rank rows for each user
• Filtering rows based on the rank for the second most recent activity
• Handling cases where there is only one activity for a user
Solutions
• - PostgreSQL solution
WITH RankedActivities AS (
SELECT username, activity, startDate, endDate,
ROW_NUMBER() OVER (PARTITION BY username ORDER BY startDate DESC) AS activity_rank
FROM UserActivity
)
SELECT username, activity, startDate, endDate
FROM RankedActivities
WHERE activity_rank = 2
UNION
SELECT username, activity, startDate, endDate
FROM RankedActivities
WHERE activity_rank = 1
AND username NOT IN (SELECT username FROM RankedActivities WHERE activity_rank = 2)
ORDER BY username;
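The single-activity case can also be handled without UNION by counting each user's rows alongside the ranking. This is a sketch of an alternative, assuming window-function support (PostgreSQL, or MySQL 8.0+); ranked and activity_count are illustrative names.

```sql
WITH ranked AS (
    SELECT username, activity, startDate, endDate,
           ROW_NUMBER() OVER (PARTITION BY username ORDER BY startDate DESC) AS activity_rank,
           COUNT(*)     OVER (PARTITION BY username)                         AS activity_count
    FROM UserActivity
)
SELECT username, activity, startDate, endDate
FROM ranked
WHERE activity_rank = 2       -- second most recent activity
   OR activity_count = 1      -- or the only activity the user has
ORDER BY username;
```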
• - MySQL solution
WITH RankedActivities AS (
SELECT username, activity, startDate, endDate,
Learnings
• Using GROUP BY with MAX() and MIN() to calculate the highest and lowest scores per exam
• Using conditional aggregation (SUM with CASE) in HAVING to exclude students who had the
highest or lowest score in any exam
• Ensuring the condition applies to all exams a student participated in
Solutions
• - PostgreSQL solution
WITH ExamStats AS (
SELECT exam_id, MAX(score) AS max_score, MIN(score) AS min_score
FROM Exam
GROUP BY exam_id
)
SELECT s.student_id, s.student_name
FROM Student s
JOIN Exam e ON s.student_id = e.student_id
JOIN ExamStats es ON e.exam_id = es.exam_id
GROUP BY s.student_id, s.student_name
HAVING SUM(CASE
WHEN e.score = es.max_score OR e.score = es.min_score THEN 1
ELSE 0
END) = 0
ORDER BY s.student_id;
• - MySQL solution
WITH ExamStats AS (
SELECT exam_id, MAX(score) AS max_score, MIN(score) AS min_score
FROM Exam
GROUP BY exam_id
)
SELECT s.student_id, s.student_name
FROM Student s
JOIN Exam e ON s.student_id = e.student_id
JOIN ExamStats es ON e.exam_id = es.exam_id
GROUP BY s.student_id, s.student_name
HAVING SUM(CASE
WHEN e.score = es.max_score OR e.score = es.min_score THEN 1
ELSE 0
END) = 0
ORDER BY s.student_id;
• Q.356
Question
Query all columns for all Marvel cities in the CITY table with populations larger than
100,000. The CountryCode for Marvel is 'Marv'.
Explanation
You need to filter the cities that belong to Marvel (i.e., where CountryCode = 'Marv') and
have a population greater than 100,000. Select all columns from the CITY table that match
these conditions.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE CITY (
ID INT,
Name VARCHAR(100),
CountryCode VARCHAR(4), -- wide enough to hold the 4-character code 'Marv'
Population INT
);
Learnings
• Filtering data based on conditions with WHERE
• Using specific country code filter for Marvel cities
• Selecting all columns with * for the result
Solutions
• - PostgreSQL solution
SELECT *
FROM CITY
WHERE CountryCode = 'Marv' AND Population > 100000;
• - MySQL solution
SELECT *
FROM CITY
WHERE CountryCode = 'Marv' AND Population > 100000;
• Q.357
Question
Find the shortest distance between two points on the x-axis from the given list of points.
Explanation
To find the shortest distance between two points, first, you need to calculate the absolute
differences between all pairs of points. The minimum of these differences will be the shortest
distance. You can achieve this by joining the table with itself and comparing all pairs of
points.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE point (
x INT
);
Learnings
• Using self-joins to compare all pairs of points
• Using the ABS() function to calculate the absolute difference
• Filtering and finding the minimum distance
Solutions
• - PostgreSQL solution
SELECT MIN(ABS(p1.x - p2.x)) AS shortest
FROM point p1, point p2
WHERE p1.x < p2.x;
• - MySQL solution
SELECT MIN(ABS(p1.x - p2.x)) AS shortest
FROM point p1, point p2
WHERE p1.x < p2.x;
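Because the closest pair of points is always adjacent once the points are sorted, the pairwise comparison can be replaced by a single ordered pass. This is a sketch of that alternative, assuming window-function support (PostgreSQL, or MySQL 8.0+); ordered and gap are illustrative names. Unlike the self-join with p1.x < p2.x, it reports 0 if the same x value appears twice.

```sql
WITH ordered AS (
    SELECT x - LAG(x) OVER (ORDER BY x) AS gap   -- distance to the previous point; NULL for the first row
    FROM point
)
SELECT MIN(gap) AS shortest                      -- MIN() ignores the NULL from the first row
FROM ordered;
```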
• Q.358
Question
Find the top 3 brands with the highest average product ratings in each category from a
products table and a reviews table.
Explanation
You need to join the "products" table with the "reviews" table, group the results by category
and brand, calculate the average product rating for each brand in each category, and then
filter to get the top 3 brands for each category.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
brand VARCHAR(50),
category VARCHAR(50)
);
Learnings
• Using JOINs to combine product and review data.
• GROUP BY to group data by category and brand.
• AVG() to calculate average ratings.
• Using ROW_NUMBER() or RANK() for filtering top results.
Solutions
• - PostgreSQL solution
WITH ranked_brands AS (
SELECT p.category,
p.brand,
AVG(r.rating) AS avg_rating,
ROW_NUMBER() OVER (PARTITION BY p.category ORDER BY AVG(r.rating) DESC) AS rank
FROM products p
JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.category, p.brand
)
SELECT category, brand, avg_rating
FROM ranked_brands
WHERE rank <= 3
ORDER BY category, rank;
• - MySQL solution
WITH ranked_brands AS (
SELECT p.category,
p.brand,
AVG(r.rating) AS avg_rating,
-- RANK is a reserved word in MySQL 8.0, so the alias is named brand_rank
RANK() OVER (PARTITION BY p.category ORDER BY AVG(r.rating) DESC) AS brand_rank
FROM products p
JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.category, p.brand
)
SELECT category, brand, avg_rating
FROM ranked_brands
WHERE brand_rank <= 3
ORDER BY category, brand_rank;
• Q.359
Question
Identify and remove products with customer feedback that contains inappropriate words (e.g.,
nudity or offensive language) from a product review system. Only include reviews that do not
contain flagged words.
Explanation
You need to filter out reviews that contain inappropriate or offensive words using a
predefined list of such words. You will join the "products" and "reviews" tables, then apply a
filter to exclude reviews with flagged words.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50)
);
Learnings
• Filtering text data for specific words using LIKE or REGEXP (regular expressions).
• JOINs to combine product and review data.
• Using WHERE clause to exclude reviews with inappropriate content.
Solutions
• - PostgreSQL solution
SELECT p.product_name, r.review_text, r.rating
FROM products p
JOIN reviews r ON p.product_id = r.product_id
WHERE NOT (r.review_text ILIKE '%nudity%' OR r.review_text ILIKE '%offensive_word%')
ORDER BY r.review_date;
• - MySQL solution
SELECT p.product_name, r.review_text, r.rating
FROM products p
JOIN reviews r ON p.product_id = r.product_id
WHERE NOT (r.review_text LIKE '%nudity%' OR r.review_text LIKE '%offensive_word%')
ORDER BY r.review_date;
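When the list of flagged words grows, a single regular expression can replace the chain of LIKE conditions. The sketch below shows one statement per dialect, so run only the one that matches your database. The word list is the same placeholder used above, and the columns are those referenced by the solutions.

```sql
-- PostgreSQL: !~* is the case-insensitive "does not match" operator
SELECT p.product_name, r.review_text, r.rating
FROM products p
JOIN reviews r ON p.product_id = r.product_id
WHERE r.review_text !~* '(nudity|offensive_word)'
ORDER BY r.review_date;

-- MySQL: NOT REGEXP; case sensitivity follows the column's collation
SELECT p.product_name, r.review_text, r.rating
FROM products p
JOIN reviews r ON p.product_id = r.product_id
WHERE r.review_text NOT REGEXP 'nudity|offensive_word'
ORDER BY r.review_date;
```

In a regular expression the underscore is a literal character, whereas in LIKE it matches any single character.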
• Q.360
Question
Identify customers who have returned products in their last 3 consecutive orders and
categorize them as "suspect" for potential abuse.
Explanation
You need to join the "orders" and "returns" tables, filter for customers who have made 3
consecutive returns, and categorize them as "suspect." The data should be ordered by order
date to identify the sequence of returns.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100)
);
Learnings
• Using JOINs to combine multiple tables.
• Filtering with WHERE clause to identify returns.
• ROW_NUMBER() or LAG() to identify consecutive orders.
• Categorizing customers based on consecutive actions.
Solutions
• - PostgreSQL solution
WITH recent_returns AS (
SELECT r.customer_id, o.order_id, o.order_date, r.return_date,
ROW_NUMBER() OVER (PARTITION BY r.customer_id ORDER BY o.order_date DESC) AS rn
FROM returns r
JOIN orders o ON r.order_id = o.order_id
WHERE o.order_status = 'Completed'
)
SELECT customer_id, COUNT(*) AS return_count
FROM recent_returns
WHERE rn <= 3
GROUP BY customer_id
HAVING COUNT(*) = 3
ORDER BY customer_id;
• - MySQL solution
WITH recent_returns AS (
SELECT r.customer_id, o.order_id, o.order_date, r.return_date,
ROW_NUMBER() OVER (PARTITION BY r.customer_id ORDER BY o.order_date DESC) AS rn
FROM returns r
JOIN orders o ON r.order_id = o.order_id
WHERE o.order_status = 'Completed'
)
SELECT customer_id, COUNT(*) AS return_count
FROM recent_returns
WHERE rn <= 3
GROUP BY customer_id
HAVING COUNT(*) = 3
ORDER BY customer_id;
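The solutions above number only the orders that were returned, so they flag any customer with three returns even if other orders were placed in between. A stricter reading of "last 3 consecutive orders" numbers every order first and then checks that each of the three most recent has a return. The sketch below is one way to do that, under the assumption that orders has a customer_id column (the orders and returns schemas are not shown here); last_orders and category are illustrative names.

```sql
WITH last_orders AS (
    SELECT o.customer_id, o.order_id,
           -- Number all orders per customer, most recent first (assumes orders.customer_id exists)
           ROW_NUMBER() OVER (PARTITION BY o.customer_id ORDER BY o.order_date DESC) AS rn
    FROM orders o
)
SELECT lo.customer_id, 'suspect' AS category
FROM last_orders lo
LEFT JOIN returns r ON r.order_id = lo.order_id
WHERE lo.rn <= 3
GROUP BY lo.customer_id
HAVING COUNT(DISTINCT lo.order_id) = 3    -- the customer has at least three orders
   AND COUNT(DISTINCT r.order_id) = 3     -- and each of the last three was returned
ORDER BY lo.customer_id;
```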
Spotify
• Datasets
Note
Copy the datasets below into pgAdmin or MySQL and run the queries to create the tables
and insert the data, then solve the questions in the following sections.
1. Users Table
CREATE TABLE users (
user_id INT PRIMARY KEY,
user_name VARCHAR(100),
email VARCHAR(100),
country VARCHAR(50),
subscription_type VARCHAR(50), -- Free, Premium
sign_up_date DATE,
last_login TIMESTAMP
);
2. Artists Table
CREATE TABLE artists (
artist_id INT PRIMARY KEY,
artist_name VARCHAR(100),
genre VARCHAR(50),
country VARCHAR(50),
date_joined TIMESTAMP
);
3. Albums Table
CREATE TABLE albums (
album_id INT PRIMARY KEY,
album_name VARCHAR(100),
artist_id INT,
release_date DATE,
genre VARCHAR(50),
total_tracks INT,
FOREIGN KEY (artist_id) REFERENCES artists(artist_id)
);
'Hip-Hop', 14),
(20, 'Bangerz', 20, '2013-10-08', 'Pop', 13);
4. Tracks Table
CREATE TABLE tracks (
track_id INT PRIMARY KEY,
track_name VARCHAR(100),
album_id INT,
artist_id INT,
genre VARCHAR(50),
duration INT, -- in seconds
FOREIGN KEY (album_id) REFERENCES albums(album_id),
FOREIGN KEY (artist_id) REFERENCES artists(artist_id)
);
• Q.361
Find the total number of tracks played by users from the USA in the last 6 months.
💡
Learnings:
• Use JOIN to combine the user_activity, users, and tracks tables.
• Apply date filters and group by user country.
Explanation:
This query gives insights into user activity in the USA, specifically the number of tracks
played by users within the last 6 months.
Expected Outcome:
A single result showing the total number of tracks played by users from the USA in the past 6
months.
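A minimal sketch of such a query follows. The user_activity table is not defined in this section, so its shape is assumed here: one row per play, with user_id and a played_at timestamp (both hypothetical names). The tracks join mentioned in the learnings is left out because a plain play count does not need it.

```sql
-- PostgreSQL syntax; in MySQL write NOW() - INTERVAL 6 MONTH
SELECT COUNT(*) AS total_tracks_played
FROM user_activity ua                              -- assumed table: one row per play
JOIN users u ON u.user_id = ua.user_id
WHERE u.country = 'USA'
  AND ua.played_at >= NOW() - INTERVAL '6 months'; -- played_at is an assumed column
```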
• Q.362
Identify the top 5 users who have listened to the most distinct tracks in the last 30 days.
💡
Learnings:
• Use COUNT(DISTINCT) for distinct tracks.
• Filter by the last 30 days using NOW() - INTERVAL 30 DAY (MySQL) or NOW() - INTERVAL '30 days' (PostgreSQL).
Explanation:
This query helps identify users who have shown the most variety in their listening habits over
the past month.
Expected Outcome:
The top 5 users who have listened to the most distinct tracks in the last 30 days.
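A minimal sketch of such a query follows. It assumes a user_activity table with user_id, track_id, and a played_at timestamp; that table is not defined in this section, so these names are hypothetical.

```sql
-- PostgreSQL syntax; in MySQL write NOW() - INTERVAL 30 DAY
SELECT ua.user_id,
       COUNT(DISTINCT ua.track_id) AS distinct_tracks   -- each track counted once per user
FROM user_activity ua                                   -- assumed table: one row per play
WHERE ua.played_at >= NOW() - INTERVAL '30 days'        -- played_at is an assumed column
GROUP BY ua.user_id
ORDER BY distinct_tracks DESC
LIMIT 5;
```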
• Q.363
Calculate the average play duration per track in the last 60 days, grouped by genre.
💡
Learnings:
• Use AVG() to calculate the average.
• Group by genre and apply filters for the last 60 days.
Explanation:
This query calculates how long tracks in each genre are played on average, helping identify
user preferences.
Expected Outcome:
The average play duration for each genre in the last 60 days.
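A minimal sketch of such a query follows. It assumes a user_activity table with track_id, a played_at timestamp, and a play_duration column in seconds; that table is not defined in this section, so these names are hypothetical. The genre comes from the tracks table shown in the datasets.

```sql
-- PostgreSQL syntax; in MySQL write NOW() - INTERVAL 60 DAY
SELECT t.genre,
       AVG(ua.play_duration) AS avg_play_duration   -- play_duration is an assumed column
FROM user_activity ua                               -- assumed table: one row per play
JOIN tracks t ON t.track_id = ua.track_id
WHERE ua.played_at >= NOW() - INTERVAL '60 days'    -- played_at is an assumed column
GROUP BY t.genre;
```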
💡
Learnings:
• Use JOIN to connect user_activity, tracks, and artists.
• Use HAVING to filter results based on track count.
Explanation:
This query identifies users with a strong preference for an artist (e.g., Drake), showing active
engagement with their tracks.
Expected Outcome:
A list of users who have played more than 50 tracks by Drake.
List the top 3 genres with the highest total play duration in the last 3 months.
💡
Learnings:
• Use SUM() to calculate total play duration.
• Filter results by the last 3 months.
Explanation:
This query identifies the top 3 genres based on the total play time over the past 3 months,
helping to understand genre popularity.
Expected Outcome:
The top 3 genres with the highest play duration.
• Q.366
Find users who have listened to the most tracks from the album 'Scorpion' by Drake.
💡
Learnings:
• Use JOIN to link user_activity, tracks, and albums.
• Apply filters for album name and track count.
Explanation:
This query helps identify users who have shown a high preference for Drake’s "Scorpion"
album.
Expected Outcome:
A list of users who have listened to the most tracks from "Scorpion."
💡
Learnings:
• Aggregate by artist and filter by date.
• Use COUNT() to count the number of tracks played.
Explanation:
This query identifies the top 5 artists who have had the most engagement with listeners in the
last 60 days.
Expected Outcome:
The top 5 artists based on the number of tracks played.
💡
Learnings:
• Aggregate track plays using COUNT().
• Apply date filters and use HAVING to filter tracks with high play counts.
Explanation:
This query identifies popular tracks that have been played over 1000 times in the past 30
days.
Expected Outcome:
A list of tracks with over 1000 plays in the last month.
💡
Learnings:
• Use aggregation functions like COUNT() to measure popularity.
• Apply date filters and ORDER BY to find the top albums.
Explanation:
This query helps to identify the albums with the most engagement in the last 90 days.
Expected Outcome:
The top 3 most played albums.
• Q.370
Identify users who have upgraded from Free to Premium subscription in the last 6
months.
💡
Learnings:
• Use JOIN to link the users table with itself.
• Apply WHERE and AND to filter by subscription status change.
Explanation:
This query finds users who have moved from Free to Premium subscriptions, providing
insight into user behavior and potential engagement strategies.
Expected Outcome:
A list of users who upgraded their subscription in the last 6 months.
Find the top 3 Indian artists with the most number of tracks that have a rating greater than
4.5, grouped by genre.
Explanation
You need to join the "artists," "tracks," and "ratings" tables. Filter tracks that have ratings
greater than 4.5, then group the data by genre and artist. For each artist in each genre, count
the number of such high-rated tracks and retrieve the top 3 artists.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE artists (
artist_id INT PRIMARY KEY,
artist_name VARCHAR(100),
nationality VARCHAR(50),
genre VARCHAR(50)
);
Learnings
• Filtering data based on specific conditions using the WHERE clause.
• JOINs to combine related tables.
• COUNT() to count the number of high-rated tracks.
• GROUP BY to group the results by artist and genre.
Solutions
• - PostgreSQL solution
WITH high_rated_tracks AS (
SELECT a.artist_name, a.genre, COUNT(DISTINCT t.track_id) AS track_count
FROM artists a
JOIN tracks t ON a.artist_id = t.artist_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.nationality = 'Indian' AND r.rating > 4.5
GROUP BY a.artist_name, a.genre
)
SELECT artist_name, genre, track_count
FROM high_rated_tracks
ORDER BY track_count DESC
LIMIT 3;
• - MySQL solution
WITH high_rated_tracks AS (
SELECT a.artist_name, a.genre, COUNT(DISTINCT t.track_id) AS track_count
FROM artists a
JOIN tracks t ON a.artist_id = t.artist_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.nationality = 'Indian' AND r.rating > 4.5
GROUP BY a.artist_name, a.genre
)
SELECT artist_name, genre, track_count
FROM high_rated_tracks
ORDER BY track_count DESC
LIMIT 3;
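The solutions above return the top 3 artists overall. If "grouped by genre" is read as the top 3 artists within each genre, a per-genre ranking is needed. The sketch below is one way to do that, assuming window-function support (PostgreSQL, or MySQL 8.0+); ranked and genre_rank are illustrative names.

```sql
WITH high_rated_tracks AS (
    SELECT a.artist_name, a.genre,
           COUNT(DISTINCT t.track_id) AS track_count   -- tracks with at least one rating above 4.5
    FROM artists a
    JOIN tracks t  ON a.artist_id = t.artist_id
    JOIN ratings r ON t.track_id = r.track_id
    WHERE a.nationality = 'Indian' AND r.rating > 4.5
    GROUP BY a.artist_name, a.genre
),
ranked AS (
    SELECT artist_name, genre, track_count,
           ROW_NUMBER() OVER (PARTITION BY genre ORDER BY track_count DESC) AS genre_rank
    FROM high_rated_tracks
)
SELECT artist_name, genre, track_count
FROM ranked
WHERE genre_rank <= 3
ORDER BY genre, genre_rank;
```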
• Q.372
Question
Identify the top 3 albums in the 'Pop' genre with the highest average track ratings, but only
consider albums released within the last 6 months.
Explanation
You need to join the "albums," "tracks," and "ratings" tables. Filter albums based on the
genre ('Pop') and release date (last 6 months). Then, calculate the average rating of tracks
within each album, rank the albums by average rating, and retrieve the top 3.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE albums (
album_id INT PRIMARY KEY,
album_name VARCHAR(100),
genre VARCHAR(50),
release_date DATE
);
Learnings
• Using JOINs to combine album, track, and rating data.
• GROUP BY to aggregate ratings by album.
• AVG() to calculate the average rating for each album.
• Filtering by release date and genre.
• Using ORDER BY to rank albums by average rating.
Solutions
• - PostgreSQL solution
WITH album_ratings AS (
SELECT a.album_id, a.album_name, AVG(r.rating) AS avg_rating
FROM albums a
JOIN tracks t ON a.album_id = t.album_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.genre = 'Pop'
AND a.release_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY a.album_id, a.album_name
)
SELECT album_name, avg_rating
FROM album_ratings
ORDER BY avg_rating DESC
LIMIT 3;
• - MySQL solution
WITH album_ratings AS (
SELECT a.album_id, a.album_name, AVG(r.rating) AS avg_rating
FROM albums a
JOIN tracks t ON a.album_id = t.album_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.genre = 'Pop'
AND a.release_date >= CURDATE() - INTERVAL 6 MONTH
GROUP BY a.album_id, a.album_name
)
SELECT album_name, avg_rating
FROM album_ratings
ORDER BY avg_rating DESC
LIMIT 3;
• Q.373
Question
Identify artists who have at least 5 tracks with an average rating of 4.5 or higher, but only
count tracks released within the last 12 months. Also, filter out any artists who have more
than 3 tracks with a rating below 3.5.
Explanation
You need to join the "artists," "tracks," and "ratings" tables. Filter for tracks released within
the last 12 months and calculate the average rating for each track. Then keep the artists that
have at least 5 tracks with an average rating of 4.5 or higher, and exclude any artist with more
than 3 tracks that received a rating below 3.5. The result should show the artists meeting both
conditions.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE artists (
artist_id INT PRIMARY KEY,
artist_name VARCHAR(100),
nationality VARCHAR(50),
genre VARCHAR(50)
);
Learnings
• Using JOINs to combine multiple tables.
• WHERE clause to filter by release dates and ratings.
• HAVING clause to apply conditions after aggregation (e.g., the minimum number of
tracks).
• COUNT() and AVG() to aggregate and compute the total number of tracks and average
ratings.
• GROUP BY to group by artist.
Solutions
• - PostgreSQL solution
WITH track_ratings AS (
-- Average rating per track, restricted to tracks released in the last 12 months
SELECT t.artist_id, t.track_id, AVG(r.rating) AS avg_rating
FROM tracks t
JOIN ratings r ON t.track_id = r.track_id
WHERE t.release_date >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY t.artist_id, t.track_id
),
artist_ratings AS (
SELECT artist_id, COUNT(*) AS high_rating_tracks
FROM track_ratings
WHERE avg_rating >= 4.5
GROUP BY artist_id
HAVING COUNT(*) >= 5
),
artist_low_ratings AS (
-- Artists to exclude: more than 3 tracks with a rating below 3.5
SELECT t.artist_id
FROM tracks t
JOIN ratings r ON t.track_id = r.track_id
WHERE r.rating < 3.5
GROUP BY t.artist_id
HAVING COUNT(DISTINCT t.track_id) > 3
)
SELECT a.artist_name
FROM artists a
JOIN artist_ratings ar ON a.artist_id = ar.artist_id
-- Anti-join: keeps artists with no low-rated tracks at all as well
LEFT JOIN artist_low_ratings alr ON a.artist_id = alr.artist_id
WHERE alr.artist_id IS NULL;
• - MySQL solution
WITH track_ratings AS (
-- Average rating per track, restricted to tracks released in the last 12 months
SELECT t.artist_id, t.track_id, AVG(r.rating) AS avg_rating
FROM tracks t
JOIN ratings r ON t.track_id = r.track_id
WHERE t.release_date >= CURDATE() - INTERVAL 12 MONTH
GROUP BY t.artist_id, t.track_id
),
artist_ratings AS (
SELECT artist_id, COUNT(*) AS high_rating_tracks
FROM track_ratings
WHERE avg_rating >= 4.5
GROUP BY artist_id
HAVING COUNT(*) >= 5
),
artist_low_ratings AS (
-- Artists to exclude: more than 3 tracks with a rating below 3.5
SELECT t.artist_id
FROM tracks t
JOIN ratings r ON t.track_id = r.track_id
WHERE r.rating < 3.5
GROUP BY t.artist_id
HAVING COUNT(DISTINCT t.track_id) > 3
)
SELECT a.artist_name
FROM artists a
JOIN artist_ratings ar ON a.artist_id = ar.artist_id
-- Anti-join: keeps artists with no low-rated tracks at all as well
LEFT JOIN artist_low_ratings alr ON a.artist_id = alr.artist_id
WHERE alr.artist_id IS NULL;
• Q.374
Question
Identify the top 3 Indian users who have the highest average rating on tracks they have
listened to, but only include tracks from the 'Pop' genre that were released in the last 6
months.
Explanation
You need to join the "users," "tracks," "ratings," and "albums" tables. Filter the tracks by
genre ('Pop') and release date (last 6 months). Then, calculate the average rating of tracks for
each user. Finally, rank the users by their average rating and retrieve the top 3.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE users (
user_id INT PRIMARY KEY,
user_name VARCHAR(100),
nationality VARCHAR(50)
);
Learnings
• Using JOINs to combine data from multiple tables (users, tracks, ratings, albums).
• Filtering data based on genre and release date.
• AVG() to calculate the average rating for each user.
• GROUP BY to group the results by user.
• Using ORDER BY to sort the results and retrieve the top users.
Solutions
• - PostgreSQL solution
WITH user_ratings AS (
SELECT r.user_id, AVG(r.rating) AS avg_rating
FROM ratings r
JOIN tracks t ON r.track_id = t.track_id
JOIN albums a ON t.album_id = a.album_id
WHERE a.genre = 'Pop'
AND a.release_date >= CURRENT_DATE - INTERVAL '6 months'
AND r.user_id IN (SELECT user_id FROM users WHERE nationality = 'Indian')
GROUP BY r.user_id
)
SELECT u.user_name, ur.avg_rating
FROM user_ratings ur
JOIN users u ON ur.user_id = u.user_id
ORDER BY ur.avg_rating DESC
LIMIT 3;
• - MySQL solution
WITH user_ratings AS (
SELECT r.user_id, AVG(r.rating) AS avg_rating
FROM ratings r
JOIN tracks t ON r.track_id = t.track_id
JOIN albums a ON t.album_id = a.album_id
WHERE a.genre = 'Pop'
AND a.release_date >= CURDATE() - INTERVAL 6 MONTH
AND r.user_id IN (SELECT user_id FROM users WHERE nationality = 'Indian')
GROUP BY r.user_id
)
SELECT u.user_name, ur.avg_rating
FROM user_ratings ur
JOIN users u ON ur.user_id = u.user_id
ORDER BY ur.avg_rating DESC
LIMIT 3;
• Q.375
Question
Identify the top 3 Indian albums in the 'Classical' genre that have the highest average track
rating, considering only albums released in the last 12 months. Also, each album must have
at least 3 tracks with ratings of 4.0 or higher.
Explanation
You need to join the "albums," "tracks," and "ratings" tables. Filter albums by genre
('Classical') and release date (last 12 months). Then, calculate the average rating of tracks for
each album, and filter albums that have at least 3 tracks with ratings of 4.0 or higher. Finally,
rank the albums by their average track rating and retrieve the top 3.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE albums (
album_id INT PRIMARY KEY,
album_name VARCHAR(100),
genre VARCHAR(50),
release_date DATE
);
Learnings
• Filtering data based on genre and release date.
• Using JOINs to combine albums, tracks, and ratings.
• HAVING clause to filter albums that meet the track rating threshold.
• AVG() to calculate the average rating for each album.
• GROUP BY to group by album and apply filtering after aggregation.
Solutions
• - PostgreSQL solution
WITH album_ratings AS (
SELECT a.album_id, a.album_name, AVG(r.rating) AS avg_rating,
-- Count distinct tracks, not rating rows, that received a rating of 4.0 or higher
COUNT(DISTINCT CASE WHEN r.rating >= 4.0 THEN t.track_id END) AS high_rating_count
FROM albums a
JOIN tracks t ON a.album_id = t.album_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.genre = 'Classical'
AND a.release_date >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY a.album_id, a.album_name
HAVING COUNT(DISTINCT CASE WHEN r.rating >= 4.0 THEN t.track_id END) >= 3
)
SELECT album_name, avg_rating
FROM album_ratings
ORDER BY avg_rating DESC
LIMIT 3;
• - MySQL solution
WITH album_ratings AS (
SELECT a.album_id, a.album_name, AVG(r.rating) AS avg_rating,
-- Count distinct tracks, not rating rows, that received a rating of 4.0 or higher
COUNT(DISTINCT CASE WHEN r.rating >= 4.0 THEN t.track_id END) AS high_rating_count
FROM albums a
JOIN tracks t ON a.album_id = t.album_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.genre = 'Classical'
AND a.release_date >= CURDATE() - INTERVAL 12 MONTH
GROUP BY a.album_id, a.album_name
HAVING COUNT(DISTINCT CASE WHEN r.rating >= 4.0 THEN t.track_id END) >= 3
)
SELECT album_name, avg_rating
FROM album_ratings
ORDER BY avg_rating DESC
LIMIT 3;
• Q.376
Question
Identify the peak hours when the most Indian users are active on Spotify, based on the
timestamps of when they rated tracks. Show the top 3 hours during which the highest
number of ratings are given by users.
Explanation
You need to analyze the ratings table by extracting the hour from the rating_date field.
Then, filter for Indian users and count the number of ratings given by users during each hour.
Finally, retrieve the top 3 hours during which the most ratings are given.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE users (
user_id INT PRIMARY KEY,
user_name VARCHAR(100),
nationality VARCHAR(50)
);
Learnings
• Extracting the hour from a timestamp using EXTRACT() or HOUR().
• GROUP BY to group ratings by hour.
• Filtering users by nationality.
• COUNT() to count ratings in each hour.
• Using ORDER BY to rank hours by activity.
Solutions
• - PostgreSQL solution
SELECT EXTRACT(HOUR FROM r.rating_date) AS rating_hour, COUNT(*) AS ratings_count
FROM ratings r
JOIN users u ON r.user_id = u.user_id
WHERE u.nationality = 'Indian'
GROUP BY rating_hour
ORDER BY ratings_count DESC
LIMIT 3;
• - MySQL solution
SELECT HOUR(r.rating_date) AS rating_hour, COUNT(*) AS ratings_count
FROM ratings r
JOIN users u ON r.user_id = u.user_id
WHERE u.nationality = 'Indian'
GROUP BY rating_hour
ORDER BY ratings_count DESC
LIMIT 3;
• Q.377
Question
Identify the top 5 most listened tracks in the 'Pop' genre based on the number of ratings
given by users, considering only tracks released in the last 6 months.
Explanation
You need to join the "tracks," "albums," and "ratings" tables. Filter tracks based on the genre
('Pop') and release date (last 6 months). Then, count the number of ratings each track has
received. Finally, retrieve the top 5 tracks based on the highest number of ratings.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE albums (
album_id INT PRIMARY KEY,
album_name VARCHAR(100),
genre VARCHAR(50),
release_date DATE
);
Learnings
• Filtering data by genre and release date.
• Using COUNT() to count ratings for each track.
• GROUP BY to aggregate data by track.
• Sorting results with ORDER BY to find the top tracks.
Solutions
• - PostgreSQL solution
SELECT t.track_name, COUNT(r.rating_id) AS ratings_count
FROM tracks t
JOIN albums a ON t.album_id = a.album_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.genre = 'Pop'
AND a.release_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY t.track_name
ORDER BY ratings_count DESC
LIMIT 5;
• - MySQL solution
SELECT t.track_name, COUNT(r.rating_id) AS ratings_count
FROM tracks t
JOIN albums a ON t.album_id = a.album_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.genre = 'Pop'
AND a.release_date >= CURDATE() - INTERVAL 6 MONTH
GROUP BY t.track_name
ORDER BY ratings_count DESC
LIMIT 5;
• Q.378
Question
Write a query to calculate the average rating for each track on Spotify, but apply the
following logic using a CASE statement:
• If the average rating is greater than or equal to 4.5, label the track as 'Excellent'.
• If the average rating is at least 3.5 but below 4.5, label the track as 'Good'.
• If the average rating is below 3.5, label the track as 'Poor'.
Explanation
You need to calculate the average rating for each track, and based on that average rating, use
a CASE statement to categorize the track as 'Excellent', 'Good', or 'Poor'. This will involve
joining the "tracks" and "ratings" tables, then applying the AVG() function for each track,
followed by the CASE statement to assign the appropriate label.
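The labeling logic described above can be sketched as follows. This is a minimal illustration rather than the book's exact solution. It assumes a ratings table with track_id and rating columns, which the schema shown in this section does not include. Because CASE branches are evaluated in order, the second branch only needs to check >= 3.5, so no average can fall between the 'Good' and 'Excellent' bands.

```sql
-- Assumed (not shown in this section): ratings(track_id, rating).
SELECT t.track_id,
       t.track_name,
       AVG(r.rating) AS avg_rating,
       CASE
           WHEN AVG(r.rating) >= 4.5 THEN 'Excellent'
           WHEN AVG(r.rating) >= 3.5 THEN 'Good'   -- reached only when the average is below 4.5
           ELSE 'Poor'
       END AS rating_label
FROM tracks t
JOIN ratings r ON r.track_id = t.track_id
GROUP BY t.track_id, t.track_name;
```

The inner join drops tracks that have no ratings. A LEFT JOIN would keep them, with a NULL average that falls through to 'Poor' unless it is handled separately.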
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE tracks (
track_id INT PRIMARY KEY,
track_name VARCHAR(100)
);
Learnings
• Using the AVG() function to calculate the average of a column.
• Applying CASE statements to create conditional logic.
• Using GROUP BY to aggregate results by track.
• Filtering and categorizing data dynamically based on the calculated average.
Solutions
• - PostgreSQL solution
SELECT t.track_name,
AVG(r.rating) AS avg_rating,
CASE
Learnings
• Using JOINs to combine data from multiple tables.
• Calculating average ratings using AVG().
• RANK() to rank tracks based on average rating.
• Filtering results based on genre.
Solutions
• - PostgreSQL solution
WITH ranked_tracks AS (
SELECT t.track_name,
a.album_name,
AVG(r.rating) AS avg_rating,
RANK() OVER (ORDER BY AVG(r.rating) DESC) AS track_rank
FROM tracks t
JOIN albums a ON t.album_id = a.album_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.genre = 'Hollywood'
GROUP BY t.track_name, a.album_name
)
SELECT track_name, album_name, avg_rating, track_rank
FROM ranked_tracks
WHERE track_rank <= 5
ORDER BY track_rank;
• - MySQL solution
WITH ranked_tracks AS (
SELECT t.track_name,
a.album_name,
AVG(r.rating) AS avg_rating,
RANK() OVER (ORDER BY AVG(r.rating) DESC) AS track_rank
FROM tracks t
JOIN albums a ON t.album_id = a.album_id
JOIN ratings r ON t.track_id = r.track_id
WHERE a.genre = 'Hollywood'
GROUP BY t.track_name, a.album_name
)
SELECT track_name, album_name, avg_rating, track_rank
FROM ranked_tracks
WHERE track_rank <= 5
ORDER BY track_rank;
• Q.380
Question
Identify the users who canceled their subscriptions in the last month. For each of these
users, also calculate their listen time growth (difference in total listening time between the
last month before cancellation and the month prior to that).
Explanation
You need to:
• Identify users who canceled their subscriptions in the last month.
• For these users, calculate their total listening time for the month immediately before the
cancellation and the month before that.
• Calculate the listen time growth as the difference in listening time between these two
months.
This will involve joining the "users," "subscriptions," and "listen_logs" tables, using date
manipulation to get the correct months, and then calculating the growth in listening time.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE users (
user_id INT PRIMARY KEY,
user_name VARCHAR(100),
subscription_status VARCHAR(50),
subscription_end_date DATE
);
Learnings
• Using date manipulation functions like EXTRACT() or DATE_TRUNC() to extract months
and years.
• JOINs to link users, subscriptions, and listen logs.
• Calculating total listen time using SUM() and filtering by months.
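A minimal PostgreSQL sketch of the steps above. It assumes a listen_logs table with user_id, listen_date, and listen_minutes columns (hypothetical names, since the table definition is not shown here) and a subscription_status value of 'Canceled' (also an assumption). The two months are measured back from each user's subscription_end_date.

```sql
-- Assumed (hypothetical): listen_logs(user_id, listen_date, listen_minutes).
WITH canceled AS (
    SELECT user_id, user_name, subscription_end_date
    FROM users
    WHERE subscription_status = 'Canceled'          -- assumed status label
      AND subscription_end_date >= CURRENT_DATE - INTERVAL '1 month'
      AND subscription_end_date <= CURRENT_DATE
)
SELECT c.user_id,
       c.user_name,
       -- listening time in the month right before cancellation
       COALESCE(SUM(l.listen_minutes) FILTER (
           WHERE l.listen_date >= c.subscription_end_date - INTERVAL '1 month'), 0)
       -- minus listening time in the month before that
     - COALESCE(SUM(l.listen_minutes) FILTER (
           WHERE l.listen_date < c.subscription_end_date - INTERVAL '1 month'), 0)
       AS listen_time_growth
FROM canceled c
LEFT JOIN listen_logs l
       ON l.user_id = c.user_id
      AND l.listen_date >= c.subscription_end_date - INTERVAL '2 months'
      AND l.listen_date <  c.subscription_end_date
GROUP BY c.user_id, c.user_name, c.subscription_end_date;
```

The LEFT JOIN keeps canceled users who have no listening activity, and COALESCE turns their missing totals into 0. A negative listen_time_growth means the user listened less in the final month than in the month before it.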
AirBnB
• Q.381
Problem statement
Write an SQL query to find the total number of listings available in each city on Airbnb.
Explanation
• Group the data by city.
• Count the number of listings in each city.
Learnings
• COUNT(): To count the number of listings in each city.
• GROUP BY: To group by City to calculate the total per city.
Solutions
• - PostgreSQL and MySQL solution
SELECT City, COUNT(ListingID) AS TotalListings
FROM Listings
GROUP BY City
ORDER BY City;
• Q.382
Problem statement
Write an SQL query to find the average price of listings in each city where the listings are
available (i.e., Available = TRUE) on Airbnb.
Explanation
• Filter the listings where Available = TRUE.
• Group the data by City.
• Calculate the average price for each city.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Listings (
ListingID INT PRIMARY KEY,
City VARCHAR(100),
Price DECIMAL(10, 2),
Available BOOLEAN
);
Learnings
• AVG(): To calculate the average price.
• WHERE: To filter the records based on availability.
• GROUP BY: To group data by City.
Solutions
• - PostgreSQL and MySQL solution
SELECT City, AVG(Price) AS AveragePrice
FROM Listings
WHERE Available = TRUE
GROUP BY City;
• Q.383
Problem statement
Write an SQL query to find the top 3 highest-priced listings for each city on Airbnb. The
result should return the ListingID, City, and Price, sorted in descending order by price
within each city.
Explanation
• Sort the listings within each city by Price in descending order.
• Use a window function to assign a rank (1 to N) for the price within each city.
• Filter the results to return only the top 3 listings for each city.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Listings (
ListingID INT PRIMARY KEY,
City VARCHAR(100),
Price DECIMAL(10, 2),
Available BOOLEAN
);
Learnings
• ROW_NUMBER(): To rank listings within each city by price.
• PARTITION BY: To restart ranking for each city.
• ORDER BY: To order listings by price in descending order.
Solutions
• - PostgreSQL and MySQL solution
WITH RankedListings AS (
SELECT ListingID,
City,
Price,
ROW_NUMBER() OVER (PARTITION BY City ORDER BY Price DESC) AS price_rank
FROM Listings
)
SELECT ListingID, City, Price
FROM RankedListings
WHERE price_rank <= 3
ORDER BY City, Price DESC;
Explanation:
• ROW_NUMBER() OVER (PARTITION BY City ORDER BY Price DESC): This
window function numbers the listings from 1 upward by Price in descending order within each City.
• WITH RankedListings: This common table expression (CTE) ranks all listings within
their respective cities.
• WHERE price_rank <= 3: Filters the results to return only the top 3 listings for each city.
• price_rank: The alias avoids rank, which is a reserved word in MySQL 8.0 and later.
• ORDER BY City, Price DESC: Orders the final results by city name and price in
descending order.
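ROW_NUMBER() breaks ties arbitrarily, so when several listings in a city share a price, some of them may be cut off at the boundary. If tied listings should all be kept, one option is DENSE_RANK(). The sketch below is a variant, not the book's stated solution, and works in both PostgreSQL and MySQL 8.0+.

```sql
-- Variant: keep every listing whose price is among the 3 highest distinct prices in its city.
WITH RankedListings AS (
    SELECT ListingID,
           City,
           Price,
           DENSE_RANK() OVER (PARTITION BY City ORDER BY Price DESC) AS price_rank
    FROM Listings
)
SELECT ListingID, City, Price
FROM RankedListings
WHERE price_rank <= 3
ORDER BY City, Price DESC;
```

With this variant a city can return more than 3 rows when prices are tied, which is usually worth stating explicitly in an interview answer.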
• Q.384
Problem statement
Write an SQL query to find the number of available listings in each city on Airbnb.
Explanation
• Filter the data to include only listings where Available = TRUE.
• Group the results by city.
• Count the number of available listings for each city.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Listings (
ListingID INT PRIMARY KEY,
City VARCHAR(100),
Price DECIMAL(10, 2),
Available BOOLEAN
);
Learnings
• COUNT(): To count the number of available listings.
• WHERE: To filter only the available listings.
• GROUP BY: To group the data by City.
Solutions
• - PostgreSQL and MySQL solution
SELECT City, COUNT(ListingID) AS AvailableListings
FROM Listings
WHERE Available = TRUE
GROUP BY City;
• Q.385
Problem statement
Write an SQL query to find the top 2 highest-priced listings in each city on Airbnb.
Explanation
• Sort listings by Price in descending order for each city.
• Use a window function to assign ranks to listings based on price within each city.
• Filter to only return the top 2 highest-priced listings.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Listings (
ListingID INT PRIMARY KEY,
City VARCHAR(100),
Price DECIMAL(10, 2),
Available BOOLEAN
);
Learnings
• ROW_NUMBER(): To rank the listings within each city based on price.
• PARTITION BY: To partition the data by city.
• ORDER BY: To order by price in descending order.
Solutions
• - PostgreSQL and MySQL solution
WITH RankedListings AS (
SELECT ListingID,
City,
Price,
ROW_NUMBER() OVER (PARTITION BY City ORDER BY Price DESC) AS price_rank
FROM Listings
)
SELECT ListingID, City, Price
FROM RankedListings
WHERE price_rank <= 2
ORDER BY City, Price DESC;
• Q.386
Problem statement
Write an SQL query to find the average price of listings by city and availability status
(Available vs. Not Available) on Airbnb.
Explanation
• Group the data by City and Availability (Available or Not Available).
Learnings
• GROUP BY: To group the data by both City and Available status.
• AVG(): To calculate the average price for each group.
Solutions
• - PostgreSQL and MySQL solution
SELECT City,
CASE
WHEN Available = TRUE THEN 'Available'
ELSE 'Not Available'
END AS AvailabilityStatus,
AVG(Price) AS AveragePrice
FROM Listings
GROUP BY City, AvailabilityStatus
ORDER BY City, AvailabilityStatus;
• Q.387
Problem statement
Write an SQL query to find the total revenue generated by Airbnb listings on each day of
the week. Use the OrderDate from the Orders table.
Explanation
• DAYOFWEEK() function to extract the day of the week from the OrderDate.
• SUM() to calculate the total revenue for each day of the week.
• Use a CASE statement to ensure that days are labeled with their corresponding names
(e.g., Monday, Tuesday).
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
ListingID INT,
TotalAmount DECIMAL(10, 2),
OrderDate DATE
);
Learnings
• DAYOFWEEK(): Extracts the day of the week from a date.
• SUM(): Aggregates the total revenue.
• CASE: Used to label the days of the week.
Solutions
• - MySQL solution
SELECT
CASE
WHEN DAYOFWEEK(OrderDate) = 1 THEN 'Sunday'
WHEN DAYOFWEEK(OrderDate) = 2 THEN 'Monday'
WHEN DAYOFWEEK(OrderDate) = 3 THEN 'Tuesday'
WHEN DAYOFWEEK(OrderDate) = 4 THEN 'Wednesday'
WHEN DAYOFWEEK(OrderDate) = 5 THEN 'Thursday'
WHEN DAYOFWEEK(OrderDate) = 6 THEN 'Friday'
WHEN DAYOFWEEK(OrderDate) = 7 THEN 'Saturday'
END AS DayOfWeek,
SUM(TotalAmount) AS TotalRevenue
FROM Orders
GROUP BY DayOfWeek
ORDER BY FIELD(DayOfWeek, 'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday',
'Friday', 'Saturday');
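DAYOFWEEK() and FIELD() are MySQL functions and are not available in PostgreSQL. A PostgreSQL equivalent can be sketched with EXTRACT(DOW ...) for ordering and TO_CHAR() for the day name. This is an illustrative sketch, not taken from the book.

```sql
-- PostgreSQL sketch: EXTRACT(DOW ...) returns 0 (Sunday) through 6 (Saturday).
SELECT TO_CHAR(OrderDate, 'FMDay') AS DayOfWeek,     -- 'FM' strips the blank padding
       SUM(TotalAmount) AS TotalRevenue
FROM Orders
GROUP BY EXTRACT(DOW FROM OrderDate), TO_CHAR(OrderDate, 'FMDay')
ORDER BY EXTRACT(DOW FROM OrderDate);
```

Grouping by the numeric day as well as the name lets the result be ordered Sunday through Saturday without a CASE expression.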
• Q.388
Problem statement
Write an SQL query to calculate the average price of listings that were booked more than 3
times in the past month. Return the listing ID and average price.
Explanation
• DATE_SUB() and CURDATE() (or CURRENT_DATE()) to filter orders in the past
month.
• COUNT() to filter listings that have more than 3 bookings.
• A subquery with GROUP BY and HAVING to restrict the average price to listings that meet
the booking condition.
Learnings
• DATE_SUB(): Subtracts a specified interval from a date.
• COUNT(): Filters based on the number of bookings.
• HAVING: Keeps only the listings with more than 3 bookings, so the average price is
calculated for those listings alone.
Solutions
• - MySQL solution
SELECT ListingID,
AVG(Price) AS AvgPrice
FROM Listings
WHERE ListingID IN (
SELECT ListingID
FROM Orders
WHERE OrderDate >= DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH)
GROUP BY ListingID
HAVING COUNT(OrderID) > 3
)
GROUP BY ListingID;
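DATE_SUB() is MySQL syntax. In PostgreSQL the same filter can be written with interval arithmetic. The sketch below keeps the rest of the query unchanged.

```sql
-- PostgreSQL sketch: interval arithmetic replaces DATE_SUB().
SELECT ListingID,
       AVG(Price) AS AvgPrice
FROM Listings
WHERE ListingID IN (
    SELECT ListingID
    FROM Orders
    WHERE OrderDate >= CURRENT_DATE - INTERVAL '1 month'
    GROUP BY ListingID
    HAVING COUNT(OrderID) > 3
)
GROUP BY ListingID;
```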
• Q.389
Problem statement
Write an SQL query to calculate the average length of stay for each customer. Use the
BookingDate and CheckoutDate from the Bookings table. The result should also include the
total spending by each customer, and categorize the total spending using a CASE statement
into "Low", "Medium", and "High" based on the following criteria:
• Low: Total spending <= 500
• Medium: Total spending between 500 and 1000
Learnings
• DATEDIFF(): To calculate the difference between two dates (length of stay).
• CASE: To categorize total spending into "Low", "Medium", and "High".
• AVG(): To calculate the average length of stay.
• SUM(): To calculate total spending.
Solutions
• - MySQL solution
SELECT CustomerID,
AVG(DATEDIFF(CheckoutDate, BookingDate)) AS AvgLengthOfStay,
SUM(TotalAmount) AS TotalSpending,
CASE
WHEN SUM(TotalAmount) <= 500 THEN 'Low'
WHEN SUM(TotalAmount) BETWEEN 500 AND 1000 THEN 'Medium'
ELSE 'High'
END AS SpendingCategory
FROM Bookings
GROUP BY CustomerID;
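The two-argument DATEDIFF() is MySQL-specific. In PostgreSQL, subtracting one DATE from another returns the number of days directly. The sketch below assumes BookingDate and CheckoutDate are DATE columns, since the Bookings definition for this question is not shown here.

```sql
-- PostgreSQL sketch: DATE - DATE yields an integer number of days.
SELECT CustomerID,
       AVG(CheckoutDate - BookingDate) AS AvgLengthOfStay,
       SUM(TotalAmount) AS TotalSpending,
       CASE
           WHEN SUM(TotalAmount) <= 500 THEN 'Low'
           WHEN SUM(TotalAmount) <= 1000 THEN 'Medium'   -- reached only when the total is above 500
           ELSE 'High'
       END AS SpendingCategory
FROM Bookings
GROUP BY CustomerID;
```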
• Q.390
Problem statement
Write an SQL query to find the unique domain names from the Email column in the Users
table. The domain name should be everything after the "@" symbol.
Explanation
• Use SUBSTRING_INDEX() or REGEXP_SUBSTR() (depending on your SQL dialect)
to extract the domain part of the email.
• Use DISTINCT to return unique domain names.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Users (
UserID INT PRIMARY KEY,
Name VARCHAR(100),
Email VARCHAR(100)
);
Learnings
• SUBSTRING_INDEX(): To extract the part of a string before or after a specified
delimiter.
• DISTINCT: To get unique values from a result set.
Solutions
• - MySQL solution
SELECT DISTINCT
SUBSTRING_INDEX(Email, '@', -1) AS Domain
FROM Users;
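In PostgreSQL, SPLIT_PART() is a simple way to take everything after the "@" without the "@" itself. The sketch below assumes each email contains exactly one "@".

```sql
-- PostgreSQL sketch: the second piece of the '@'-delimited string is the domain.
SELECT DISTINCT
       SPLIT_PART(Email, '@', 2) AS Domain
FROM Users;
```

Wrapping the expression in LOWER() would treat "Example.com" and "example.com" as the same domain, if that is the desired behavior.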
• Q.391
Problem statement
Write an SQL query to extract the first 5 characters of the description column from the
Listings table and return only the rows where the description starts with "Lux".
Explanation
• Use SUBSTRING() or REGEXP_SUBSTR() to extract the first 5 characters.
• Use REGEXP or LIKE to filter descriptions that start with "Lux".
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Listings (
ListingID INT PRIMARY KEY,
Description VARCHAR(255),
Price DECIMAL(10, 2)
);
Learnings
• SUBSTRING(): To extract part of a string.
• LIKE: To filter strings based on a pattern.
• REGEXP: To filter strings using a regular expression.
Solutions
• - PostgreSQL and MySQL solution
SELECT ListingID,
SUBSTRING(Description, 1, 5) AS ShortDescription
FROM Listings
WHERE Description LIKE 'Lux%';
• Q.392
Problem statement
Write an SQL query to find all listings whose Description column contains a valid phone
number in the format "(xxx) xxx-xxxx" (where "x" is a digit). Return the ListingID and
Description.
Explanation
• Use REGEXP to match the phone number pattern in the description.
• The phone number should be in the format (xxx) xxx-xxxx, where x is a digit.
• Use the ~ operator (in PostgreSQL) or REGEXP (in MySQL) to filter descriptions
that match this pattern.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Listings (
ListingID INT PRIMARY KEY,
Description VARCHAR(255),
Price DECIMAL(10, 2)
);
Learnings
• REGEXP: To match a pattern in a string.
• Pattern Matching: Using regex to find a phone number in a specific format.
Solutions
• - PostgreSQL solution
SELECT ListingID,
Description
FROM Listings
WHERE Description ~ '\(\d{3}\) \d{3}-\d{4}';
• - MySQL solution
SELECT ListingID,
Description
FROM Listings
WHERE Description REGEXP '\\([0-9]{3}\\) [0-9]{3}-[0-9]{4}';
• Q.393
Problem statement
Write an SQL query to calculate the rank of each listing based on its price within the
Listings table, sorted from highest to lowest price. The query should return the ListingID,
Price, and Rank.
Explanation
• Use the RANK() window function to assign ranks to the listings based on the Price in
descending order.
• The RANK() function assigns the same rank to listings with the same price.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Listings (
ListingID INT PRIMARY KEY,
Description VARCHAR(255),
Price DECIMAL(10, 2)
);
Learnings
• RANK(): To assign ranks based on the sorting order of a column.
• PARTITION BY (optional): Not used here but can be added if we want to partition the
data by some column.
Solutions
• - PostgreSQL and MySQL solution
SELECT ListingID, Price,
RANK() OVER (ORDER BY Price DESC) AS PriceRank
FROM Listings;
• Q.394
Problem statement
Write an SQL query to calculate the average price of listings in each price range quartile.
The price ranges should be divided into 4 equal quartiles. Return the Quartile (1 to 4) and
the average price in each quartile.
Explanation
• Use NTILE(4) to divide the listings into 4 quartiles based on the Price.
• Use AVG() to calculate the average price within each quartile.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Listings (
ListingID INT PRIMARY KEY,
Description VARCHAR(255),
Price DECIMAL(10, 2)
);
Learnings
• NTILE(): To divide the data into equal groups (quartiles, deciles, etc.).
• AVG(): To calculate the average of a set of values within each group.
Solutions
• - PostgreSQL and MySQL solution
WITH PriceQuartiles AS (
SELECT Price,
NTILE(4) OVER (ORDER BY Price) AS Quartile
FROM Listings
)
SELECT Quartile,
AVG(Price) AS AvgPrice
FROM PriceQuartiles
GROUP BY Quartile
ORDER BY Quartile;
• Q.395
Advanced Question:
Problem statement
Write an SQL query to find the moving average of prices over a 3-listing window based on
the Price column in the Listings table. Return the ListingID, Price, and 3-listing
moving average. The result should be sorted by ListingID in ascending order.
Explanation
• Use the AVG() window function with a 3-row sliding window to calculate the moving
average of the prices.
• Use ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING to create a 3-listing
window for calculating the moving average.
Datasets and SQL Schemas
• - Table creation and sample data
Learnings
• AVG(): To calculate the average of a window of rows.
• ROWS BETWEEN: To define the sliding window for the moving average.
• Sliding Window: Using ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING to
include 3 rows for the moving average.
Solutions
• - PostgreSQL and MySQL solution
SELECT ListingID, Price,
AVG(Price) OVER (ORDER BY ListingID
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS MovingAvgPrice
FROM Listings
ORDER BY ListingID;
Key Concepts:
• NTILE(): To divide data into equal groups based on a numeric column (used for quartiles,
deciles, etc.).
• RANK(): Used to assign a rank to rows based on a specified ordering, with ties receiving
the same rank.
• Sliding Window (ROWS BETWEEN): A method to calculate running averages or other
aggregate functions over a sliding window of rows.
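The frame 1 PRECEDING AND 1 FOLLOWING is centered on the current row, so each average looks one row ahead. When the average should use only the current and earlier rows, a trailing frame is a common alternative. The sketch below is a variant for comparison, and the TrailingAvgPrice alias is illustrative.

```sql
-- Trailing 3-row window: the current listing and the two before it.
SELECT ListingID, Price,
       AVG(Price) OVER (ORDER BY ListingID
                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS TrailingAvgPrice
FROM Listings
ORDER BY ListingID;
```

In both forms the frame is simply smaller at the edges of the table, so the first rows are averaged over fewer than 3 listings.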
• Q.396
Problem statement
Write an SQL query to list all orders along with the customer name who placed the order. If
an order does not have a corresponding customer, still include the order with NULL for the
customer name.
Explanation
To solve this:
• Use a LEFT JOIN to include all orders, even if there is no matching customer.
• Match Order.CustomerID with Customer.CustomerID to get the corresponding customer
name.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE
);
Learnings
• LEFT JOIN: Includes all rows from the left table and matching rows from the right table,
returning NULL if no match is found.
• Join condition: The condition for matching CustomerID in both tables.
Solutions
• - PostgreSQL and MySQL solution
SELECT Orders.OrderID, Customers.CustomerName
FROM Orders
LEFT JOIN Customers
ON Orders.CustomerID = Customers.CustomerID;
• Q.397
Problem statement
Write an SQL query to find all customers who have placed more than one order. List the
customer name and the total number of orders they have placed.
Explanation
To solve this:
Learnings
• INNER JOIN: Returns only rows with matching values from both tables.
• GROUP BY: Groups results based on a specific column, useful for aggregation.
• HAVING: Filters groups after aggregation (e.g., count > 1).
Solutions
• - PostgreSQL and MySQL solution
SELECT Customers.CustomerName, COUNT(Orders.OrderID) AS TotalOrders
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID = Customers.CustomerID
GROUP BY Customers.CustomerID, Customers.CustomerName
HAVING COUNT(Orders.OrderID) > 1;
• Q.398
Problem statement
Write an SQL query to find the top 3 customers who have placed the most number of orders
in January 2024. Return the customer name, number of orders, and their rank based on the
total number of orders.
Explanation
To solve this:
• Use an INNER JOIN between Customers and Orders.
• Filter orders placed in January 2024 using the WHERE clause.
• Use COUNT() to count the number of orders per customer.
• Use RANK() to rank customers based on the number of orders placed.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE
);
Learnings
• INNER JOIN: To join data from both tables based on matching customer IDs.
• COUNT(): To count the number of orders per customer.
• RANK(): To assign ranks based on the number of orders placed.
• WHERE clause: To filter orders based on a specific date range.
Solutions
• - PostgreSQL and MySQL solution
WITH OrderCount AS (
SELECT Customers.CustomerName, COUNT(Orders.OrderID) AS TotalOrders
FROM Orders
INNER JOIN Customers
ON Orders.CustomerID = Customers.CustomerID
WHERE Orders.OrderDate BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY Customers.CustomerName
),
RankedCustomers AS (
SELECT CustomerName, TotalOrders,
RANK() OVER (ORDER BY TotalOrders DESC) AS OrderRank
FROM OrderCount
)
SELECT CustomerName, TotalOrders, OrderRank
FROM RankedCustomers
WHERE OrderRank <= 3
ORDER BY OrderRank;
Key Concepts:
• JOIN types (INNER, LEFT): Use to combine rows from multiple tables.
• GROUP BY: Group data based on specific column(s) for aggregation.
• COUNT(): Aggregate function to count rows.
• RANK(): Assigns a rank to each row based on an ordering condition.
• Q.399
Problem Statement
You are tasked with analyzing the bookings for an online platform across multiple countries.
The platform tracks bookings for properties, and you need to find the top 3 countries based
on the total booking amount for the year 2024, along with the average booking amount per
property in each country.
Additionally, you need to calculate the percentage contribution of each country to the total
booking amount for 2024.
Explanation
To solve this problem:
• Sum the total booking amount for each country for the year 2024.
• Calculate the average booking amount per property for each country.
• Calculate the percentage contribution of each country to the total booking amount for
2024.
• Rank the countries based on the total booking amount and return the top 3 countries.
You will use JOINs to link the booking data with country information, GROUP BY to
aggregate the results by country, and WINDOW FUNCTIONS to calculate percentage
contribution and rankings.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Bookings (
BookingID INT PRIMARY KEY,
PropertyID INT,
CountryID INT,
BookingDate DATE,
TotalAmount DECIMAL(10, 2)
);
--
INSERT INTO Bookings (BookingID, PropertyID, CountryID, BookingDate, TotalAmount)
VALUES
--
INSERT INTO Properties (PropertyID, PropertyName, CountryID)
VALUES
(101, 'Beach House', 1),
(102, 'Mountain Cabin', 1),
(103, 'City Apartment', 2),
(104, 'Lakefront Villa', 2),
(105, 'Country Inn', 3),
(106, 'Luxury Condo', 3),
(107, 'Eco Retreat', 1),
(108, 'Grand Resort', 4),
(109, 'Beachfront Villa', 4),
(110, 'Downtown Loft', 2),
(111, 'Mountain Lodge', 1),
(112, 'Seaside Cottage', 3),
(113, 'Vineyard Estate', 4),
(114, 'Forest Bungalow', 3),
(115, 'Skyline Penthouse', 1);
--
INSERT INTO Countries (CountryID, CountryName)
VALUES
(1, 'USA'),
(2, 'Canada'),
(3, 'Mexico'),
(4, 'France');
Learnings
• JOIN: You need to join the Bookings, Properties, and Countries tables on CountryID
and PropertyID.
• GROUP BY: To aggregate the total booking amount and average booking amount per
country.
• SUM(): To calculate the total booking amount per country.
• AVG(): To calculate the average booking amount per property in each country.
• RANK(): To rank countries based on the total booking amount.
• WINDOW FUNCTION: Use SUM() OVER to calculate the total bookings for percentage
contribution.
Solutions
PostgreSQL and MySQL solution
WITH CountryBookings AS (
SELECT
c.CountryName,
SUM(b.TotalAmount) AS TotalBookingAmount,
COUNT(DISTINCT p.PropertyID) AS TotalProperties,
AVG(b.TotalAmount) AS AvgBookingAmountPerProperty
FROM Bookings b
JOIN Properties p ON b.PropertyID = p.PropertyID
Explanation:
• CountryBookings CTE (Common Table Expression):
• Joins Bookings, Properties, and Countries tables to get the total booking amount for
each country.
• It calculates the TotalBookingAmount (sum of TotalAmount) and the
AvgBookingAmountPerProperty (average of TotalAmount per property).
• Filters bookings for the year 2024 using EXTRACT(YEAR FROM b.BookingDate) = 2024.
• Groups the result by CountryName.
• TotalRevenue CTE:
• This calculates the TotalRevenueFor2024 (sum of all booking amounts for the year 2024)
to compute the percentage contribution of each country.
• Main Query:
• Joins the CountryBookings CTE with TotalRevenue to get the TotalBookingAmount
for each country and calculate the PercentageContribution of each country to the total
revenue for 2024.
• RANK() OVER: Ranks the countries based on their total booking amounts in descending
order.
• WHERE Rank <= 3: Filters to get the top 3 countries based on the booking amount.
• ROUND(): Rounds the percentage contribution to two decimal places.
• ORDER BY Rank: Orders the results by the rank in ascending order.
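Putting the pieces described above together, the full query can be sketched as follows. Everything after the first join is reconstructed from the explanation rather than copied from the book, so treat it as an assumption-laden sketch. The join to Countries on b.CountryID, the RankedCountries CTE, and the CountryRank alias are illustrative choices. The ranking is computed in its own CTE because a window function's result cannot be filtered in the WHERE clause of the same SELECT, and the alias avoids Rank, which is a reserved word in MySQL 8.0 and later.

```sql
WITH CountryBookings AS (
    SELECT c.CountryName,
           SUM(b.TotalAmount) AS TotalBookingAmount,
           COUNT(DISTINCT p.PropertyID) AS TotalProperties,
           AVG(b.TotalAmount) AS AvgBookingAmountPerProperty
    FROM Bookings b
    JOIN Properties p ON b.PropertyID = p.PropertyID
    JOIN Countries c ON b.CountryID = c.CountryID      -- assumed join column
    WHERE EXTRACT(YEAR FROM b.BookingDate) = 2024
    GROUP BY c.CountryName
),
TotalRevenue AS (
    -- Sum over all countries gives the 2024 grand total.
    SELECT SUM(TotalBookingAmount) AS TotalRevenueFor2024
    FROM CountryBookings
),
RankedCountries AS (
    SELECT cb.CountryName,
           cb.TotalBookingAmount,
           cb.AvgBookingAmountPerProperty,
           ROUND(cb.TotalBookingAmount * 100.0 / tr.TotalRevenueFor2024, 2) AS PercentageContribution,
           RANK() OVER (ORDER BY cb.TotalBookingAmount DESC) AS CountryRank
    FROM CountryBookings cb
    CROSS JOIN TotalRevenue tr                          -- single-row CTE, so no row multiplication
)
SELECT CountryName, TotalBookingAmount, AvgBookingAmountPerProperty,
       PercentageContribution, CountryRank
FROM RankedCountries
WHERE CountryRank <= 3
ORDER BY CountryRank;
```

Note that AVG(b.TotalAmount) averages over bookings. If the figure should be revenue per distinct property, TotalBookingAmount / TotalProperties is the expression to use instead.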
Key Concepts:
• JOINs: Combining data from multiple tables (Bookings, Properties, Countries).
• SUM() and AVG(): Aggregate functions to calculate total revenue and average booking
amounts.
• RANK(): Ranking countries based on total booking amounts.
• EXTRACT(): Function to extract the year from a date for filtering.
• Window functions: Used for ranking and percentage calculations.
• Q.400
Problem Statement
You are tasked with analyzing the cancellations of bookings in an online property platform.
The platform tracks bookings and cancellations, and you need to find the top 3 countries
where cancellations are most frequent. Along with this, you need to calculate the
cancellation rate for each country in the year 2024, based on the ratio of cancelled bookings
to total bookings for each country.
Additionally, the result should include the total cancellations, total bookings, and
cancellation rate for each country, and rank the countries by the cancellation rate in
descending order.
Explanation
To solve this:
• Identify the cancellations: Track bookings where cancellations are marked (there will be
a flag or status indicating cancellation).
• Count the total bookings and cancellations for each country in 2024.
• Calculate the cancellation rate as the ratio of cancellations to total bookings.
• Rank the countries by their cancellation rate and return the top 3.
You will need to use JOINs to link the booking data with cancellation information and
GROUP BY to aggregate results by country. Window functions will be used for ranking
countries based on cancellation rates.
--
--
INSERT INTO Properties (PropertyID, PropertyName, CountryID)
VALUES
(101, 'Beach House', 1),
(102, 'Mountain Cabin', 1),
(103, 'City Apartment', 2),
(104, 'Lakefront Villa', 2),
(105, 'Country Inn', 3),
(106, 'Luxury Condo', 3),
(107, 'Eco Retreat', 1),
(108, 'Grand Resort', 4),
(109, 'Beachfront Villa', 4),
(110, 'Downtown Loft', 2),
(111, 'Mountain Lodge', 1),
(112, 'Seaside Cottage', 3),
(113, 'Vineyard Estate', 4),
(114, 'Forest Bungalow', 3),
(115, 'Skyline Penthouse', 1);
--
INSERT INTO Countries (CountryID, CountryName)
VALUES
(1, 'USA'),
(2, 'Canada'),
(3, 'Mexico'),
(4, 'France');
Learnings
• JOIN: Joining multiple tables (Bookings, Properties, and Countries) based on CountryID
and PropertyID.
• COUNT(): Used to count the total number of bookings and cancellations.
• SUM(): To calculate the total cancellations in each country.
• GROUP BY: To aggregate the results by country.
• CASE WHEN: To count cancellations and non-cancellations.
• RANK(): Window function to rank countries by cancellation rate.
Solutions
PostgreSQL and MySQL Solution
WITH CountryCancellationStats AS (
SELECT
c.CountryName,
COUNT(b.BookingID) AS TotalBookings,
SUM(CASE WHEN b.IsCancelled = TRUE THEN 1 ELSE 0 END) AS TotalCancellations,
Explanation:
• CountryCancellationStats CTE:
• Joins Bookings, Properties, and Countries tables to get the cancellation statistics for
each country in the year 2024.
• COUNT(b.BookingID) counts the total number of bookings for each country.
• SUM(CASE WHEN b.IsCancelled = TRUE THEN 1 ELSE 0 END) calculates the total
cancellations for each country.
• ROUND(SUM(CASE WHEN b.IsCancelled = TRUE THEN 1 ELSE 0 END) * 100.0 /
COUNT(b.BookingID), 2) calculates the cancellation rate (percentage of cancellations).
• TotalBookings CTE:
• This is used to get the total number of bookings for 2024 across all countries (though not
used in the final output, it could be used for a more detailed analysis of total bookings).
• Main Query:
• The RANK() window function is used to rank countries based on their cancellation rate in
descending order.
• Filters countries with cancellation rates greater than 0 to focus only on countries with
actual cancellations.
• The LIMIT 3 clause is used to return the top 3 countries with the highest cancellation
rates.
• Output:
• The output will include the CountryName, TotalBookings, TotalCancellations,
CancellationRate, and the Rank based on the cancellation rate.
USA 10 4 40.00 1
France 5 2 40.00 1
Mexico 6 1 16.67 3
This query will return the top 3 countries with the highest cancellation rates for the year
2024, along with the total bookings, total cancellations, and cancellation rates.
Key Concepts:
• JOIN: To combine data from multiple tables (Bookings, Properties, Countries).
• COUNT() and SUM(): Aggregate functions to calculate total bookings and cancellations.
• CASE WHEN: For conditional counting (for cancellations).
• RANK(): To rank countries by their cancellation rate.
• EXTRACT(): Used to filter bookings from the year 2024.
Microsoft
• Q.401
Question
Find the top 3 customers who have spent the most on products from the 'Electronics'
category, but only count purchases made within the last 30 days. Consider both the price of
the product and the quantity purchased.
Explanation
You need to join the "customers," "orders," and "products" tables, filter for the "Electronics"
category, and calculate the total spend for each customer within the last 30 days. Rank the
customers based on their total spend and retrieve the top 3.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100)
);
Learnings
• Filtering data based on DATE conditions (last 30 days).
• Using JOINs to combine multiple tables.
• SUM() and GROUP BY to calculate the total spend for each customer.
• Using ROW_NUMBER() or RANK() to rank customers by their total spend.
Solutions
• - PostgreSQL solution
WITH recent_purchases AS (
SELECT o.customer_id, SUM(oi.quantity * p.price) AS total_spend
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE p.category = 'Electronics'
AND o.order_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY o.customer_id
)
SELECT customer_id, total_spend
FROM recent_purchases
ORDER BY total_spend DESC
LIMIT 3;
• - MySQL solution
WITH recent_purchases AS (
SELECT o.customer_id, SUM(oi.quantity * p.price) AS total_spend
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
WHERE p.category = 'Electronics'
AND o.order_date >= CURDATE() - INTERVAL 30 DAY
GROUP BY o.customer_id
)
SELECT customer_id, total_spend
FROM recent_purchases
ORDER BY total_spend DESC
LIMIT 3;
• Q.402
Question
Write an SQL query to find the team size of each employee from the Employee table.
Explanation
The task is to count the number of employees in each team and return that count for every
employee. This can be achieved by using a COUNT() aggregate function with a JOIN or a
GROUP BY clause.
Learnings
• Use of COUNT() to calculate team size.
• Understanding of JOIN or GROUP BY to aggregate data.
• Familiarity with how to display aggregated results for each row.
Solutions
• - PostgreSQL solution
SELECT e.employee_id, COUNT(*) AS team_size
FROM Employee e
JOIN Employee e2 ON e.team_id = e2.team_id
GROUP BY e.employee_id;
• - MySQL solution
SELECT e.employee_id, COUNT(*) AS team_size
FROM Employee e
JOIN Employee e2 ON e.team_id = e2.team_id
GROUP BY e.employee_id;
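As an alternative to the self-join, the same team size can be attached to every row with a window function. The sketch below is only an illustration: it runs the query through Python's built-in sqlite3 module (assuming a bundled SQLite 3.25 or newer for window functions), and the sample rows are hypothetical. The query text itself should also be valid in PostgreSQL and MySQL 8.0+.

```python
import sqlite3

# Hypothetical sample rows; the columns follow the queries above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (employee_id INT PRIMARY KEY, team_id INT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, 8), (2, 8), (3, 8), (4, 7), (5, 9), (6, 9)],
)

# COUNT(*) OVER (PARTITION BY team_id) attaches the team's size to every
# member row without collapsing rows, so no self-join or GROUP BY is needed.
team_sizes = conn.execute(
    """
    SELECT employee_id,
           COUNT(*) OVER (PARTITION BY team_id) AS team_size
    FROM Employee
    ORDER BY employee_id
    """
).fetchall()

for row in team_sizes:
    print(row)
```

The window version reads the table once, whereas the self-join produces one intermediate row per pair of teammates before grouping.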
• Q.403
Question
Write an SQL query to find the customer_id from the Customer table who bought all the
products in the Product table.
Explanation
The query needs to find customers who have purchased every product listed in the Product
table. This can be done by counting the products each customer has bought and comparing it
with the total number of products in the Product table.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Customer (
customer_id INT,
product_key INT,
FOREIGN KEY (product_key) REFERENCES Product(product_key)
);
);
Learnings
• Use of COUNT() to check the number of products each customer bought.
• Understanding of GROUP BY and HAVING to filter customers who meet a certain condition.
• Subquery to compare counts and ensure all products are bought by the customer.
Solutions
• - PostgreSQL solution
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM Product);
• - MySQL solution
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM Product);
• Q.404
Question
Write an SQL query to find the person_name of the last person who can board the bus
without exceeding the weight limit of 1000 kilograms.
Explanation
The goal is to compute the cumulative weight of people boarding the bus in their given order
(defined by the turn column). The first person to exceed the weight limit should not be
included. To solve this, we can use a window function or a running total to check the
cumulative weight.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Queue (
person_id INT PRIMARY KEY,
person_name VARCHAR(100),
weight INT,
turn INT
);
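The running-total approach described in the explanation can be sketched as follows. This is a minimal illustration, not the book's own solution: the rows are hypothetical, the query runs in SQLite through Python's sqlite3 module (assuming SQLite 3.25+ for window functions), and "without exceeding" is read as a cumulative weight of at most 1000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Queue ("
    "person_id INT PRIMARY KEY, person_name VARCHAR(100), weight INT, turn INT)"
)
# Hypothetical rows, deliberately inserted out of boarding order.
conn.executemany(
    "INSERT INTO Queue VALUES (?, ?, ?, ?)",
    [
        (5, "Ana", 250, 1),
        (4, "Ben", 175, 5),
        (3, "Cy", 350, 2),
        (6, "Dee", 400, 3),
        (1, "Eli", 500, 6),
        (2, "Fay", 200, 4),
    ],
)

# SUM(weight) OVER (ORDER BY turn) is the cumulative weight after each person
# boards. Weights are positive, so the total only grows; the last row that is
# still within the limit is the last person who can board.
last_person = conn.execute(
    """
    WITH running AS (
        SELECT person_name, turn,
               SUM(weight) OVER (ORDER BY turn) AS total_weight
        FROM Queue
    )
    SELECT person_name
    FROM running
    WHERE total_weight <= 1000
    ORDER BY turn DESC
    LIMIT 1
    """
).fetchone()[0]

print(last_person)
```

With these rows the cumulative weights are 250, 600, 1000, 1200, 1375, and 1875, so the third person in line is the last one to board.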
Learnings
• Use of window functions like SUM() to calculate running totals.
Learnings
• Use of window functions like SUM() to calculate a running total across rows.
• Understanding how to order rows with ORDER BY within window functions.
Learnings
• Use of JOIN to match rows from two tables based on common columns (id and year).
• Understanding of primary keys and how they help in efficiently joining tables.
• Basic knowledge of how to filter and return results from multiple tables.
Solutions
• - PostgreSQL solution
SELECT q.id, q.year, COALESCE(n.npv, 0) AS npv
FROM Queries q
LEFT JOIN NPV n ON q.id = n.id AND q.year = n.year;
• - MySQL solution
SELECT q.id, q.year, COALESCE(n.npv, 0) AS npv
FROM Queries q
LEFT JOIN NPV n ON q.id = n.id AND q.year = n.year;
• Q.407
Question
Write an SQL query to find the names of all activities that have neither the maximum nor the
minimum number of participants.
Explanation
The task is to identify activities that do not have the highest or lowest number of participants.
This can be achieved by first counting the number of participants for each activity, and then
filtering out those with the maximum and minimum participant counts.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Friends (
id INT PRIMARY KEY,
name VARCHAR(100),
activity VARCHAR(100)
);
Learnings
• Using COUNT() to count participants per activity.
• Filtering results based on aggregate values (maximum and minimum).
• Understanding how to exclude specific conditions with NOT IN.
Solutions
• - PostgreSQL solution
WITH activity_counts AS (
SELECT activity, COUNT(*) AS participant_count
FROM Friends
GROUP BY activity
),
min_max_counts AS (
SELECT MIN(participant_count) AS min_participants,
MAX(participant_count) AS max_participants
FROM activity_counts
)
SELECT ac.activity
FROM activity_counts ac
JOIN min_max_counts mm ON ac.participant_count NOT IN (mm.min_participants,
mm.max_participants);
• - MySQL solution
WITH activity_counts AS (
SELECT activity, COUNT(*) AS participant_count
FROM Friends
GROUP BY activity
),
min_max_counts AS (
SELECT MIN(participant_count) AS min_participants,
MAX(participant_count) AS max_participants
FROM activity_counts
)
SELECT ac.activity
FROM activity_counts ac
JOIN min_max_counts mm ON ac.participant_count NOT IN (mm.min_participants,
mm.max_participants);
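The same filter can also be written with two scalar subqueries instead of a join on NOT IN. The sketch below is an illustration only: the rows are hypothetical and the query is run in SQLite through Python's sqlite3 module, although the SQL itself uses nothing dialect-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Friends (id INT PRIMARY KEY, name VARCHAR(100), activity VARCHAR(100))"
)
# Hypothetical rows: Eating has 3 participants, Singing 2, Hiking 1.
conn.executemany(
    "INSERT INTO Friends VALUES (?, ?, ?)",
    [
        (1, "Ana", "Eating"), (2, "Ben", "Eating"), (3, "Cy", "Eating"),
        (4, "Dee", "Singing"), (5, "Eli", "Singing"),
        (6, "Fay", "Hiking"),
    ],
)

# Keep activities whose count lies strictly between the smallest and the
# largest per-activity count.
middle_activities = [
    row[0]
    for row in conn.execute(
        """
        WITH activity_counts AS (
            SELECT activity, COUNT(*) AS participant_count
            FROM Friends
            GROUP BY activity
        )
        SELECT activity
        FROM activity_counts
        WHERE participant_count > (SELECT MIN(participant_count) FROM activity_counts)
          AND participant_count < (SELECT MAX(participant_count) FROM activity_counts)
        ORDER BY activity
        """
    )
]

print(middle_activities)
```

If every activity has the same number of participants, the minimum equals the maximum and the result is empty, which matches the wording of the question.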
• Q.408
Question
Write an SQL query to find the countries where the average call duration is strictly greater
than the global average call duration.
Explanation
To solve this problem:
• Calculate the global average call duration from the Calls table.
• Calculate the average call duration for each country, using the country code derived from
the caller's phone number.
• Keep only the countries whose average call duration is strictly greater than the global average.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Person (
id INT PRIMARY KEY,
name VARCHAR(100),
phone_number VARCHAR(15)
);
Learnings
• The use of JOIN to combine multiple tables for related information.
• Aggregating data with AVG() to compute the average call duration for each country.
• Filtering results using subqueries and comparison operators (>) to compare with the global
average.
Solutions
• - PostgreSQL solution
WITH country_avg AS (
SELECT c.name AS country_name,
AVG(call.duration) AS avg_duration
FROM Calls call
JOIN Person p1 ON call.caller_id = p1.id
JOIN Person p2 ON call.callee_id = p2.id
JOIN Country c ON SUBSTRING(p1.phone_number FROM 1 FOR 3) = c.country_code
GROUP BY c.name
),
global_avg AS (
SELECT AVG(duration) AS avg_duration
FROM Calls
)
SELECT ca.country_name
FROM country_avg ca, global_avg ga
WHERE ca.avg_duration > ga.avg_duration;
• - MySQL solution
WITH country_avg AS (
SELECT c.name AS country_name,
AVG(cl.duration) AS avg_duration
-- CALL is a reserved word in MySQL, so the Calls table is aliased as cl
FROM Calls cl
JOIN Person p1 ON cl.caller_id = p1.id
JOIN Person p2 ON cl.callee_id = p2.id
JOIN Country c ON SUBSTRING(p1.phone_number, 1, 3) = c.country_code
GROUP BY c.name
),
global_avg AS (
SELECT AVG(duration) AS avg_duration
FROM Calls
)
SELECT ca.country_name
FROM country_avg ca, global_avg ga
WHERE ca.avg_duration > ga.avg_duration;
Learnings
• Using ROW_NUMBER() or RANK() window functions to assign ranks to rows.
• Calculating median by identifying the middle row(s) based on row numbers.
• Handling even vs. odd cases for median calculation.
Solutions
• - PostgreSQL solution
WITH ranked_countries AS (
SELECT population,
ROW_NUMBER() OVER (ORDER BY population) AS row_num,
COUNT(*) OVER () AS total_rows
FROM Country
)
SELECT AVG(population) AS median_population
FROM ranked_countries
WHERE row_num IN (
(total_rows + 1) / 2,
(total_rows + 2) / 2
);
• - MySQL solution
WITH ranked_countries AS (
SELECT population,
ROW_NUMBER() OVER (ORDER BY population) AS row_num,
COUNT(*) OVER () AS total_rows
FROM Country
)
SELECT AVG(population) AS median_population
FROM ranked_countries
WHERE row_num IN (
-- DIV is integer division; in MySQL, / returns a decimal such as 2.5
(total_rows + 1) DIV 2,
(total_rows + 2) DIV 2
);
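The odd and even cases of the row-number median can be checked with a small sketch. It runs the query in SQLite through Python's sqlite3 module (assuming SQLite 3.25+ for window functions); the helper median_population and the population values are hypothetical. SQLite, like PostgreSQL, performs integer division when both operands of / are integers, which the two middle-row expressions rely on.

```python
import sqlite3

MEDIAN_SQL = """
WITH ranked AS (
    SELECT population,
           ROW_NUMBER() OVER (ORDER BY population) AS row_num,
           COUNT(*) OVER () AS total_rows
    FROM Country
)
SELECT AVG(population)
FROM ranked
WHERE row_num IN ((total_rows + 1) / 2, (total_rows + 2) / 2)
"""


def median_population(populations):
    """Load the values into a throwaway Country table and return the median."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Country (population INT)")
    conn.executemany("INSERT INTO Country VALUES (?)", [(p,) for p in populations])
    return conn.execute(MEDIAN_SQL).fetchone()[0]


# Odd count: both expressions point at row 2, so the median is that row.
odd_median = median_population([10, 30, 20])
# Even count: the expressions point at rows 2 and 3, which are averaged.
even_median = median_population([10, 40, 20, 30])

print(odd_median, even_median)
```

For three rows both expressions evaluate to 2, and for four rows they evaluate to 2 and 3, which is why a single IN list covers both cases.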
• Q.410
Question
Write an SQL query to convert the number of users from thousands to millions, rounding the
result to two decimal places, and append "M" to the result.
Example Output:
India 2.54M
China 13.90M
Indonesia 2.71M
Brazil 2.12M
Explanation
You need to convert the "Number of Users" from thousands to millions by dividing by 1,000,
then round the result to two decimal places, and finally append "M" to indicate millions.
Datasets
Table creation and sample data:
CREATE TABLE Users (
Country VARCHAR(100),
NumberOfUsers INT
);
Learnings
• Use of SQL functions for mathematical calculations (division, rounding).
• String concatenation to append units (e.g., "M").
• Rounding numbers to a specified decimal place in SQL.
Solutions
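MySQL solution:
-- NumberOfUsers is stored in thousands, so dividing by 1000 gives millions
SELECT Country,
CONCAT(ROUND(NumberOfUsers / 1000, 2), 'M') AS UsersInMillions
FROM Users;
PostgreSQL solution:
-- 1000.0 forces numeric division; an INT divided by 1000 would be truncated
SELECT Country,
ROUND(NumberOfUsers / 1000.0, 2) || 'M' AS UsersInMillions
FROM Users;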
• Q.411
Question
Find the top 3 highest-paid employees in each department from the Employee table.
Explanation
To solve this, you need to rank employees within each department based on their salary and
return the top 3 for each department. This can be achieved using window functions like
ROW_NUMBER() or RANK().
Datasets
Table creation and sample data:
CREATE TABLE Employee (
id INT,
name VARCHAR(50),
department VARCHAR(50),
salary INT
);
Learnings
• Use of window functions (ROW_NUMBER() or RANK()) to rank results.
• Partitioning results based on a group (e.g., department).
• Filtering top N results using window functions.
Solutions
MySQL solution:
WITH RankedEmployees AS (
SELECT id, name, department, salary,
-- RANK is a reserved word in MySQL 8.0, so the alias is salary_rank
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM Employee
)
SELECT id, name, department, salary
FROM RankedEmployees
WHERE salary_rank <= 3;
PostgreSQL solution:
WITH RankedEmployees AS (
SELECT id, name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM Employee
)
SELECT id, name, department, salary
FROM RankedEmployees
WHERE rank <= 3;
• Q.412
Question
Write an SQL query to calculate the cumulative salary total for each department,
ordered by employee salary.
Explanation
You need to calculate the running total of salary for each department, ordered by the salary of
employees. This can be done using the window function SUM() with the OVER() clause.
Datasets
Table creation and sample data:
CREATE TABLE Employee (
id INT,
name VARCHAR(50),
department VARCHAR(50),
salary INT
);
Learnings
• Use of window functions to calculate cumulative totals (SUM()).
• Ordering and partitioning results to apply calculations across different groups.
Solutions
MySQL solution:
SELECT id, name, department, salary,
SUM(salary) OVER (PARTITION BY department ORDER BY salary) AS cumulative_salary
FROM Employee;
PostgreSQL solution:
SELECT id, name, department, salary,
SUM(salary) OVER (PARTITION BY department ORDER BY salary) AS cumulative_salary
FROM Employee;
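One detail worth knowing about this running total is how ties are handled. With only ORDER BY salary, the default window frame includes all rows that share the current salary, so tied employees receive the same cumulative value. The sketch below illustrates the difference with an explicit ROWS frame; it uses hypothetical rows and runs in SQLite through Python's sqlite3 module (assuming SQLite 3.25+). Adding id as a tie-breaker is an assumption made here to get a deterministic order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (id INT, name VARCHAR(50), department VARCHAR(50), salary INT)"
)
# Hypothetical rows: two employees in the same department share a salary.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [(1, "Ana", "IT", 100), (2, "Ben", "IT", 100), (3, "Cy", "IT", 200)],
)

running_totals = conn.execute(
    """
    SELECT id,
           -- Default frame: tied salaries are peers and are summed together.
           SUM(salary) OVER (PARTITION BY department ORDER BY salary) AS default_frame,
           -- Explicit ROWS frame with a tie-breaker: strictly row by row.
           SUM(salary) OVER (PARTITION BY department ORDER BY salary, id
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS row_by_row
    FROM Employee
    ORDER BY id
    """
).fetchall()

for row in running_totals:
    print(row)
```

Both columns end at the same department total; they differ only on the tied rows, so the choice depends on whether tied salaries should share a cumulative value.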
• Q.413
Question
Write an SQL query to find the number of employees in each department who have a
salary greater than the average salary of their department.
Explanation
You need to compare each employee's salary with the average salary of their department.
This can be done using a subquery to first calculate the average salary per department and
then comparing each employee’s salary to this average.
Datasets
Table creation and sample data:
CREATE TABLE Employee (
id INT,
name VARCHAR(50),
department VARCHAR(50),
salary INT
);
Learnings
• Use of subqueries to calculate aggregates (like average salary).
• Filtering based on comparison with aggregates.
• Grouping results by department.
Solutions
MySQL solution:
SELECT department, COUNT(*) AS num_employees_above_avg
FROM Employee e
WHERE salary > (SELECT AVG(salary) FROM Employee WHERE department = e.department)
GROUP BY department;
PostgreSQL solution:
SELECT department, COUNT(*) AS num_employees_above_avg
FROM Employee e
WHERE salary > (SELECT AVG(salary) FROM Employee WHERE department = e.department)
GROUP BY department;
• Q.414
Question
Find the second most expensive product in each category from the Products table.
Explanation
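To find the second most expensive product in each category, you can use a subquery or
window functions like RANK() to rank products within each category by price and then keep
only the products ranked second. Note that with RANK(), a tie for the highest price leaves
no product at rank 2; DENSE_RANK() avoids that gap.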
Datasets
Table creation and sample data:
CREATE TABLE Products (
product_id INT,
category VARCHAR(50),
product_name VARCHAR(100),
price DECIMAL(10, 2)
);
Learnings
• Use of window functions like RANK() or ROW_NUMBER() to rank products within categories.
• Subqueries to filter based on rank or price comparison.
Solutions
MySQL solution:
WITH RankedProducts AS (
SELECT product_id, category, product_name, price,
-- RANK is a reserved word in MySQL 8.0, so the alias is price_rank
RANK() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
FROM Products
)
SELECT product_id, category, product_name, price
FROM RankedProducts
WHERE price_rank = 2;
PostgreSQL solution:
WITH RankedProducts AS (
SELECT product_id, category, product_name, price,
RANK() OVER (PARTITION BY category ORDER BY price DESC) AS rank
FROM Products
)
SELECT product_id, category, product_name, price
FROM RankedProducts
WHERE rank = 2;
• Q.415
Question
Write an SQL query to count the number of orders placed by each customer, including
those who haven't placed any orders.
Explanation
To count the number of orders for each customer, including customers without any orders,
you will need to perform a LEFT JOIN between the Customers and Orders tables and then
count the orders per customer.
Datasets
Table creation and sample data:
CREATE TABLE Customers (
customer_id INT,
customer_name VARCHAR(100)
);
Learnings
• LEFT JOIN to include rows even when there’s no matching data.
• Using COUNT() with GROUP BY to aggregate data.
Solutions
MySQL solution:
SELECT c.customer_id, c.customer_name, COUNT(o.order_id) AS order_count
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name;
PostgreSQL solution:
SELECT c.customer_id, c.customer_name, COUNT(o.order_id) AS order_count
FROM Customers c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name;
• Q.416
Question
Write an SQL query to find the employees who have the highest salary in their
department but are not the highest paid overall.
Explanation
To solve this, you need to find the highest salary in each department, then compare it to the
highest salary overall to filter out the top earner.
Datasets
Table creation and sample data:
CREATE TABLE Employees (
employee_id INT,
department VARCHAR(50),
employee_name VARCHAR(100),
salary DECIMAL(10, 2)
);
Learnings
• Subqueries to find the maximum salary in a department.
• Using MAX() to find the top salary overall and then applying conditions.
Solutions
MySQL solution:
PostgreSQL solution:
SELECT employee_id, department, employee_name, salary
FROM Employees e
WHERE salary = (SELECT MAX(salary) FROM Employees WHERE department = e.department)
AND salary < (SELECT MAX(salary) FROM Employees);
• Q.417
Question
Write an SQL query to report the latest login for all users in the year 2020. Do not include
users who did not log in in 2020.
The query result should look like this:
Explanation
You need to keep only the logins from the year 2020, then for each user find the latest
login timestamp within that year. The result should be formatted in the desired output
format, with a Z appended to the timestamp for the UTC timezone.
Datasets
Table creation and sample data:
CREATE TABLE Logins (
user_id INT,
time_stamp DATETIME,
PRIMARY KEY (user_id, time_stamp)
);
Learnings
• Using YEAR() function to filter rows by year.
• Using GROUP BY to group data by user_id.
• Using MAX() to select the latest timestamp.
• Formatting the datetime output.
Solutions
MySQL solution:
SELECT user_id,
PostgreSQL solution:
SELECT user_id,
TO_CHAR(MAX(time_stamp), 'YYYY-MM-DD"T"HH24:MI:SS"Z"') AS last_stamp
FROM Logins
WHERE EXTRACT(YEAR FROM time_stamp) = 2020
GROUP BY user_id;
• Q.418
Question
Write an SQL query to find managers with at least five direct reports.
Return the result table in any order.
Explanation
To solve this, we need to count how many direct reports each manager has. This can be done
by grouping by the managerId and counting how many employees are associated with each
managerId. Then, we filter out managers who have fewer than five direct reports.
Datasets
Table creation and sample data:
CREATE TABLE Employee (
id INT,
name VARCHAR(100),
department VARCHAR(50),
managerId INT,
PRIMARY KEY (id)
);
Learnings
• Grouping data by manager using managerId.
• Counting direct reports using COUNT().
• Filtering based on conditions using HAVING.
Solutions
MySQL solution:
SELECT e.name
FROM Employee e
PostgreSQL solution:
SELECT e.name
FROM Employee e
JOIN Employee sub ON e.id = sub.managerId
GROUP BY e.id
HAVING COUNT(sub.id) >= 5;
• Q.419
Question
Write an SQL query to find the confirmation rate of each user. The confirmation rate of a
user is calculated as the number of 'confirmed' actions divided by the total number of
confirmation requests (both 'confirmed' and 'timeout') for that user. If the user did not request
any confirmation messages, their confirmation rate should be 0. Round the confirmation rate
to two decimal places.
Explanation
To calculate the confirmation rate for each user, we need to:
• Count the total number of confirmation requests for each user (both 'confirmed' and
'timeout').
• Count the number of 'confirmed' requests for each user.
• Divide the number of 'confirmed' requests by the total number of confirmation requests.
• For users with no confirmation requests, the rate will be 0.
• Round the confirmation rate to two decimal places.
We'll join the Signups and Confirmations tables based on user_id, and use aggregate
functions to calculate the total and confirmed counts for each user.
Datasets
Table creation and sample data:
CREATE TABLE Signups (
user_id INT PRIMARY KEY,
time_stamp DATETIME
);
VALUES
(3, '2021-01-06 03:30:46', 'timeout'),
(3, '2021-07-14 14:00:00', 'timeout'),
(7, '2021-06-12 11:57:29', 'confirmed'),
(7, '2021-06-13 12:58:28', 'confirmed'),
(7, '2021-06-14 13:59:27', 'confirmed'),
(2, '2021-01-22 00:00:00', 'confirmed'),
(2, '2021-02-28 23:59:59', 'timeout');
Learnings
• Use of JOIN to combine data from multiple tables.
• Use of conditional aggregation with COUNT() to filter the confirmation actions.
• Handling division and conditional logic (e.g., dividing by zero when there are no requests).
• Rounding to a specified number of decimal places using SQL functions.
Solutions
MySQL solution:
SELECT s.user_id,
ROUND(
IFNULL(SUM(CASE WHEN c.action = 'confirmed' THEN 1 ELSE 0 END)
/ NULLIF(COUNT(c.action), 0), 0), 2) AS confirmation_rate
FROM Signups s
LEFT JOIN Confirmations c ON s.user_id = c.user_id
GROUP BY s.user_id;
Explanation:
• We join the Signups table with the Confirmations table using a LEFT JOIN to include
users who didn't request any confirmation messages.
• We count the number of 'confirmed' actions using a CASE expression inside the SUM()
function.
• We use COUNT(c.action) to calculate the total number of confirmation messages (both
'confirmed' and 'timeout'). COUNT() skips the NULLs produced by the LEFT JOIN, so it returns
0 (never NULL) for users without requests.
• NULLIF(COUNT(c.action), 0) turns that 0 into NULL, so the division yields NULL instead of
dividing by zero, and IFNULL() then replaces the NULL rate with 0.
• Finally, the ROUND() function is used to round the result to two decimal places.
PostgreSQL solution:
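SELECT s.user_id,
ROUND(
COALESCE(SUM(CASE WHEN c.action = 'confirmed' THEN 1 ELSE 0 END)::numeric
/ NULLIF(COUNT(c.action), 0), 0), 2) AS confirmation_rate
FROM Signups s
LEFT JOIN Confirmations c ON s.user_id = c.user_id
GROUP BY s.user_id;
Explanation:
• Similar to the MySQL solution, but uses COALESCE() instead of IFNULL() to handle null
values.
• The ::numeric cast avoids integer division, which would truncate every rate to 0 or 1, and
NULLIF() prevents a division-by-zero error for users without confirmation requests.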
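The division-by-zero case is the part of this problem that is easiest to get wrong, so a small runnable check may help. The sketch below is an illustration under stated assumptions: it runs in SQLite through Python's sqlite3 module, the Signups rows and their timestamps are hypothetical (user 6 has no confirmation requests), and the Confirmations rows are the ones shown in the sample data above. It guards the division with NULLIF() and COALESCE(), and uses 1.0 in the CASE expression to force non-integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Signups (user_id INT PRIMARY KEY, time_stamp TEXT)")
conn.execute("CREATE TABLE Confirmations (user_id INT, time_stamp TEXT, action TEXT)")

# Hypothetical signups; user 6 never requested a confirmation message.
conn.executemany(
    "INSERT INTO Signups VALUES (?, ?)",
    [
        (2, "2020-07-29 23:09:44"),
        (3, "2020-03-21 10:16:13"),
        (6, "2020-12-09 10:39:37"),
        (7, "2020-01-04 13:57:59"),
    ],
)
conn.executemany(
    "INSERT INTO Confirmations VALUES (?, ?, ?)",
    [
        (3, "2021-01-06 03:30:46", "timeout"),
        (3, "2021-07-14 14:00:00", "timeout"),
        (7, "2021-06-12 11:57:29", "confirmed"),
        (7, "2021-06-13 12:58:28", "confirmed"),
        (7, "2021-06-14 13:59:27", "confirmed"),
        (2, "2021-01-22 00:00:00", "confirmed"),
        (2, "2021-02-28 23:59:59", "timeout"),
    ],
)

# NULLIF turns a zero request count into NULL, the division then yields NULL,
# and COALESCE maps that NULL rate to 0.
rates = conn.execute(
    """
    SELECT s.user_id,
           ROUND(COALESCE(
               SUM(CASE WHEN c.action = 'confirmed' THEN 1.0 ELSE 0.0 END)
               / NULLIF(COUNT(c.action), 0), 0), 2) AS confirmation_rate
    FROM Signups s
    LEFT JOIN Confirmations c ON s.user_id = c.user_id
    GROUP BY s.user_id
    ORDER BY s.user_id
    """
).fetchall()

for row in rates:
    print(row)
```

User 6 is the row to watch: the LEFT JOIN keeps it with NULL confirmation columns, so its request count is 0 and the guarded division reports a rate of 0.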
• Q.420
Question
Write an SQL query to find all possible pairs of users from the Signups table where the
user_id of the first user is less than the user_id of the second user. For each pair, return the
user_id of both users along with the difference in their signup timestamps in seconds. Only
include pairs where the time difference between their signups is less than 1 year.
Explanation
This problem requires:
• Cross Join: To generate all possible combinations of users from the Signups table.
• Time Difference Calculation: To calculate the time difference between the two users'
signup timestamps in seconds.
• Filtering: To include only those pairs where the time difference is less than 1 year.
• Condition on user_id: Ensure that the first user_id is less than the second to avoid
duplicate pairs (e.g., (user_1, user_2) and (user_2, user_1)).
Datasets
Table creation and sample data:
CREATE TABLE Signups (
user_id INT PRIMARY KEY,
time_stamp DATETIME
);
Learnings
• Cross Join to generate all possible pairs of rows.
• Date Difference: Using TIMESTAMPDIFF() or equivalent functions to calculate the
difference between timestamps.
• Filtering pairs based on conditions.
• Order of user_id: Ensuring we only return pairs where user_id_1 < user_id_2.
Solutions
MySQL solution:
SELECT s1.user_id AS user_id_1,
s2.user_id AS user_id_2,
ABS(TIMESTAMPDIFF(SECOND, s1.time_stamp, s2.time_stamp)) AS time_diff_seconds
FROM Signups s1
CROSS JOIN Signups s2
WHERE s1.user_id < s2.user_id
AND ABS(TIMESTAMPDIFF(SECOND, s1.time_stamp, s2.time_stamp)) < 60 * 60 * 24 * 365
ORDER BY user_id_1, user_id_2;
Explanation:
• We use a CROSS JOIN to generate all possible pairs of users from the Signups table.
• The TIMESTAMPDIFF() function calculates the difference between the two time_stamp
values in seconds. ABS() makes the difference non-negative, because the user with the smaller
user_id may have signed up later; without it, any negative difference would pass the filter.
• The WHERE clause ensures:
• Only pairs where user_id_1 < user_id_2 are considered.
• Only pairs where the time difference is less than 1 year (365 days, expressed in seconds).
• The result is ordered by user_id_1 and user_id_2 to give a consistent output order.
PostgreSQL solution:
SELECT s1.user_id AS user_id_1,
s2.user_id AS user_id_2,
ABS(EXTRACT(EPOCH FROM s2.time_stamp - s1.time_stamp)) AS time_diff_seconds
FROM Signups s1
CROSS JOIN Signups s2
WHERE s1.user_id < s2.user_id
AND ABS(EXTRACT(EPOCH FROM s2.time_stamp - s1.time_stamp)) < 60 * 60 * 24 * 365
ORDER BY user_id_1, user_id_2;
Explanation:
• In PostgreSQL, we use EXTRACT(EPOCH FROM ...) to calculate the time difference
between the two timestamps in seconds.
• ABS() makes the difference non-negative, because the user with the smaller user_id may
have signed up later; without it, any negative difference would pass the filter.
• The rest of the query structure is similar to MySQL, ensuring the correct ordering and
filtering.
Meta
• Q.421
Question
Write an SQL query to calculate the overall acceptance rate of friend requests. The
acceptance rate is the number of accepted requests divided by the total number of requests.
Return the result rounded to two decimal places.
• The FriendRequest table contains the ID of the user who sent the request, the ID of the
user who received the request, and the date of the request.
• The RequestAccepted table contains the ID of the user who sent the request, the ID of the
user who received the request, and the date when the request was accepted.
• If there are no requests at all, return an acceptance rate of 0.00.
Explanation
To solve this:
• We need to count the total number of unique requests in the FriendRequest table. This
involves counting distinct pairs of sender and receiver (i.e., sender_id and send_to_id).
• We need to count the total number of unique accepted requests from the RequestAccepted
table, which is again based on distinct sender and receiver pairs (i.e., requester_id and
accepter_id).
• The acceptance rate is the ratio of accepted requests to total requests, and we return this
value rounded to two decimal places.
• If no requests exist, we return 0.00.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE FriendRequest (
sender_id INT,
send_to_id INT,
request_date DATE
);
Learnings
• COUNT(DISTINCT) to count unique combinations of sender and receiver, ensuring that
duplicates are not counted.
• JOIN between tables may be needed to match accepted requests and sent requests.
• ROUND() to round the final acceptance rate to two decimal places.
• Handling cases where there may be no requests at all by returning a default value of 0.00.
Solutions
• - MySQL solution
SELECT
ROUND(
IFNULL(
(SELECT COUNT(DISTINCT requester_id, accepter_id)
FROM RequestAccepted)
/
NULLIF(
(SELECT COUNT(DISTINCT sender_id, send_to_id)
FROM FriendRequest), 0),
0),
2) AS unique_accepted_request;
Explanation:
• COUNT(DISTINCT requester_id, accepter_id): This counts the number of unique
accepted requests, considering only distinct pairs of requesters and accepters.
• COUNT(DISTINCT sender_id, send_to_id): This counts the number of unique requests
in the FriendRequest table.
• NULLIF and IFNULL: COUNT() returns 0, not NULL, for an empty table. NULLIF(..., 0)
turns a zero request count into NULL so the division yields NULL instead of dividing by
zero, and IFNULL(..., 0) then returns 0 for that case.
• PostgreSQL: the multi-column COUNT(DISTINCT a, b) form and IFNULL() are MySQL syntax.
In PostgreSQL, use COALESCE(), write the pair as a row value such as
COUNT(DISTINCT (sender_id, send_to_id)), and cast one operand to numeric to avoid integer
division.
• ROUND(): The result is rounded to two decimal places to match the required output
format.
This query returns the acceptance rate of requests, rounded to two decimal places, and
handles the edge case where no requests exist.
• Q.422
Question
Write an SQL query to find the number of users who signed up each month in 2020. Return
the result with the month and the corresponding count of users.
Explanation
You need to group the users by their signup month and count the number of signups in each
month for the year 2020. Use the MONTH() function to extract the month from the
time_stamp field and filter only for the year 2020.
Datasets
Table creation and sample data:
CREATE TABLE Signups (
user_id INT PRIMARY KEY,
time_stamp DATETIME
);
Learnings
• Use of MONTH() and YEAR() functions to extract date parts.
• Grouping by extracted date parts to aggregate data.
• Filtering data by specific year.
Solutions
MySQL solution:
SELECT MONTH(time_stamp) AS month, COUNT(user_id) AS user_count
FROM Signups
WHERE YEAR(time_stamp) = 2020
GROUP BY MONTH(time_stamp)
ORDER BY month;
PostgreSQL solution:
SELECT EXTRACT(MONTH FROM time_stamp) AS month, COUNT(user_id) AS user_count
FROM Signups
WHERE EXTRACT(YEAR FROM time_stamp) = 2020
GROUP BY month
ORDER BY month;
• Q.423
Question
Write an SQL query to find the top 3 departments with the highest average salary in the
Employees table. If two departments have the same average salary, return both in the same
rank.
Explanation
You need to calculate the average salary for each department and rank them accordingly. In
the case of ties (same average salary), you should return both departments with the same
rank.
Datasets
Table creation and sample data:
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
Learnings
• Using AVG() for aggregation and ranking.
• Use of RANK() or DENSE_RANK() for handling ties.
• Ordering the results based on the calculated average salary.
Solutions
MySQL solution:
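WITH dept_avg AS (
SELECT department, AVG(salary) AS avg_salary,
DENSE_RANK() OVER (ORDER BY AVG(salary) DESC) AS salary_rank
FROM Employees
GROUP BY department
)
SELECT department, avg_salary, salary_rank
FROM dept_avg
WHERE salary_rank <= 3
ORDER BY salary_rank;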
PostgreSQL solution:
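WITH dept_avg AS (
SELECT department, AVG(salary) AS avg_salary,
DENSE_RANK() OVER (ORDER BY AVG(salary) DESC) AS salary_rank
FROM Employees
GROUP BY department
)
SELECT department, avg_salary, salary_rank
FROM dept_avg
WHERE salary_rank <= 3
ORDER BY salary_rank;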
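The tie requirement in the question (departments with the same average share a rank) can be illustrated with DENSE_RANK(). The sketch below is a hedged example: the rows are hypothetical, it runs in SQLite through Python's sqlite3 module (assuming SQLite 3.25+ for window functions), and it computes the averages in one CTE before ranking them in a second one to keep each step easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "employee_id INT PRIMARY KEY, name VARCHAR(100), "
    "department VARCHAR(50), salary DECIMAL(10, 2))"
)
# Hypothetical rows: Eng and Sales both average 8000.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "Eng", 9000), (2, "Ben", "Eng", 7000),
        (3, "Cy", "Sales", 8000),
        (4, "Dee", "Ops", 6000),
        (5, "Eli", "HR", 5000),
        (6, "Fay", "Support", 4000),
    ],
)

# DENSE_RANK gives tied averages the same rank and leaves no gap afterwards,
# so filtering on rank <= 3 can return more than three departments.
top_departments = conn.execute(
    """
    WITH dept_avg AS (
        SELECT department, AVG(salary) AS avg_salary
        FROM Employees
        GROUP BY department
    ),
    ranked AS (
        SELECT department, avg_salary,
               DENSE_RANK() OVER (ORDER BY avg_salary DESC) AS salary_rank
        FROM dept_avg
    )
    SELECT department, avg_salary, salary_rank
    FROM ranked
    WHERE salary_rank <= 3
    ORDER BY salary_rank, department
    """
).fetchall()

for row in top_departments:
    print(row)
```

With these rows four departments are returned, because the two tied departments share rank 1. A plain LIMIT 3 would instead cut the list at three rows regardless of ties.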
• Q.424
Question
Write an SQL query to find the top 5 users who logged in the most times in the last 30 days,
excluding users who did not log in during this period. Return the user_id and the total
number of logins for each user.
Explanation
To solve this:
• You need to count the number of logins for each user in the last 30 days.
• Exclude users who haven't logged in during the past 30 days.
• Sort by the number of logins and return the top 5 users.
Datasets
Table creation and sample data:
CREATE TABLE Logins (
user_id INT,
time_stamp DATETIME
);
Learnings
• Filtering data based on a date range using CURDATE() and INTERVAL.
• Using COUNT() to aggregate login data.
• Sorting and limiting the result set to get the top N users.
Solutions
MySQL solution:
SELECT user_id, COUNT(*) AS login_count
FROM Logins
WHERE time_stamp >= CURDATE() - INTERVAL 30 DAY
GROUP BY user_id
ORDER BY login_count DESC
LIMIT 5;
PostgreSQL solution:
SELECT user_id, COUNT(*) AS login_count
FROM Logins
WHERE time_stamp >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
ORDER BY login_count DESC
LIMIT 5;
• Q.425
Question
Write an SQL query to find the total number of employees in each department. Return the
department name and the corresponding count of employees.
Explanation
You need to group the employees by their department and count the number of employees in
each department. This can be achieved using the GROUP BY clause.
Datasets
Table creation and sample data:
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50)
);
Learnings
• Use of GROUP BY to aggregate data by department.
• Counting rows for each group using COUNT().
Solutions
MySQL solution:
SELECT department, COUNT(*) AS employee_count
FROM Employees
GROUP BY department;
PostgreSQL solution:
SELECT department, COUNT(*) AS employee_count
FROM Employees
GROUP BY department;
Medium Question
Question
Write an SQL query to find all employees who have the same manager as at least one other
employee. Return their employee_id, name, and managerId.
Explanation
You need to identify employees who share the same managerId. This requires joining the
table with itself to compare employees’ managerId and filtering out those who don’t share it
with anyone.
Datasets
Table creation and sample data:
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
managerId INT
);
Learnings
• Self-joins to compare rows within the same table.
• Filtering results using conditions on joined tables.
• Handling NULL values for employees without managers.
Solutions
MySQL solution:
SELECT DISTINCT e1.employee_id, e1.name, e1.managerId
FROM Employees e1
JOIN Employees e2
ON e1.managerId = e2.managerId
WHERE e1.employee_id != e2.employee_id;
PostgreSQL solution:
SELECT DISTINCT e1.employee_id, e1.name, e1.managerId
FROM Employees e1
JOIN Employees e2
ON e1.managerId = e2.managerId
WHERE e1.employee_id != e2.employee_id;
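A self-join produces one output row per matching pair, so an employee who shares a manager with two colleagues appears twice unless the result is deduplicated. The sketch below illustrates this with Python's sqlite3 module standing in for MySQL or PostgreSQL; the rows are made up. It also shows that employees with a NULL managerId never match, because NULL = NULL is not true in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (employee_id INT PRIMARY KEY, name TEXT, managerId INT);
INSERT INTO Employees VALUES
    (1, 'Ann', NULL), (2, 'Bob', 1), (3, 'Cid', 1), (4, 'Dee', 1),
    (5, 'Eli', 2), (6, 'Fay', NULL);
""")

# Same self-join, run with and without DISTINCT.
join_sql = """
    SELECT {distinct} e1.employee_id
    FROM Employees e1
    JOIN Employees e2 ON e1.managerId = e2.managerId
    WHERE e1.employee_id != e2.employee_id
    ORDER BY e1.employee_id
"""
plain = [r[0] for r in conn.execute(join_sql.format(distinct=""))]
deduped = [r[0] for r in conn.execute(join_sql.format(distinct="DISTINCT"))]

print(plain)    # each of employees 2, 3, 4 matches two colleagues
print(deduped)  # one row per employee
```

Employees 1 and 6 (no manager) and employee 5 (the only report of manager 2) are absent from both results.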
Hard Question
Question
Write an SQL query to find the employees who have been in the company the longest and the
shortest. Return their employee_id, name, department, and hire_date.
Explanation
• You need to find the employees with the minimum and maximum hire_date (the earliest and
latest hires).
• Use MIN() and MAX() in subqueries to find those dates, then filter the Employees table on
them to get the employees' details.
Datasets
Table creation and sample data:
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
hire_date DATE
);
Learnings
• Use of MIN() and MAX() functions to find the earliest and latest dates.
• Joining the table with the result of MIN() and MAX() to get full details.
• Handling hire_date as a date type for comparison.
Solutions
MySQL solution:
SELECT employee_id, name, department, hire_date
FROM Employees
WHERE hire_date = (SELECT MIN(hire_date) FROM Employees)
OR hire_date = (SELECT MAX(hire_date) FROM Employees);
PostgreSQL solution:
SELECT employee_id, name, department, hire_date
FROM Employees
WHERE hire_date = (SELECT MIN(hire_date) FROM Employees)
OR hire_date = (SELECT MAX(hire_date) FROM Employees);
• Q.426
Question
Write an SQL query to find the department with the highest average salary. If multiple
departments have the same highest average salary, return all of them.
Explanation
You need to calculate the average salary for each department, and then select the
department(s) with the highest average salary. To handle ties, you can use a subquery or a
HAVING clause.
Datasets
Table creation and sample data:
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
Learnings
• Use of AVG() for aggregating salary data by department.
• Sorting and filtering to find the department with the highest average salary.
• Handling ties in aggregate functions.
Solutions
MySQL solution:
SELECT department, AVG(salary) AS avg_salary
FROM Employees
GROUP BY department
HAVING AVG(salary) = (
    SELECT MAX(avg_salary)
    FROM (SELECT AVG(salary) AS avg_salary FROM Employees GROUP BY department) AS dept_avg
);
PostgreSQL solution:
SELECT department, AVG(salary) AS avg_salary
FROM Employees
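GROUP BY department
HAVING AVG(salary) = (
    SELECT MAX(avg_salary)
    FROM (SELECT AVG(salary) AS avg_salary FROM Employees GROUP BY department) AS dept_avg
);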
Explanation
You need to filter employees earning more than $80,000, group them by department, and
count the number of employees in each department. Afterward, you need to sort the result
and limit it to the top 3 departments based on the number of employees.
Datasets
Table creation and sample data:
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
Learnings
• Filtering data based on salary condition.
• Grouping and counting the number of employees within each department.
• Sorting results to find the top 3 departments.
Solutions
MySQL solution:
SELECT department, COUNT(*) AS department_count
FROM Employees
WHERE salary > 80000
GROUP BY department
ORDER BY department_count DESC
LIMIT 3;
PostgreSQL solution:
SELECT department, COUNT(*) AS department_count
FROM Employees
WHERE salary > 80000
GROUP BY department
ORDER BY department_count DESC
LIMIT 3;
Explanation
To solve this, you need to:
• Find the highest salary in each department.
• Exclude that salary and calculate the average salary for the remaining employees in the
same department.
Datasets
Table creation and sample data:
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
Learnings
• Use of MAX() to find the highest salary in each department.
• Subquery filtering to exclude top salaries.
• Calculating the average salary for the remaining employees in each department.
Solutions
MySQL solution:
SELECT department, AVG(salary) AS avg_salary
FROM Employees
WHERE salary < (SELECT MAX(salary) FROM Employees e2
                WHERE e2.department = Employees.department)
GROUP BY department;
PostgreSQL solution:
Question
Write an SQL query to classify employees into three categories based on their salary:
• "Low" for salary less than $60,000,
• "Medium" for salary between $60,000 and $100,000, and
• "High" for salary greater than $100,000.
Return the employee_id, name, and the salary classification.
Explanation
You need to use the CASE statement to create a salary classification for each employee. The
CASE statement helps to create conditional logic directly in the SQL query.
Datasets
Table creation and sample data:
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
salary DECIMAL(10, 2)
);
Learnings
• Use of CASE to implement conditional logic in SQL.
• Categorizing data into groups based on numeric conditions.
Solutions
MySQL solution:
SELECT employee_id, name,
CASE
WHEN salary < 60000 THEN 'Low'
WHEN salary BETWEEN 60000 AND 100000 THEN 'Medium'
ELSE 'High'
END AS salary_classification
FROM Employees;
PostgreSQL solution:
• Q.430
Question
Write an SQL query to calculate the total number of "likes" and classify them based on how
many likes each post has received. Classify as:
• "Low Engagement" for less than 50 likes,
• "Medium Engagement" for between 50 and 200 likes,
• "High Engagement" for more than 200 likes.
Return the post_id, likes_count, and the engagement classification.
Explanation
This query involves the use of the CASE statement to categorize the number of likes a post
received into "Low", "Medium", or "High" based on predefined thresholds.
Datasets
Table creation and sample data:
CREATE TABLE InstagramPosts (
post_id INT PRIMARY KEY,
likes_count INT
);
Learnings
• Applying CASE to categorize data into ranges.
• Aggregation using simple classification.
Solutions
MySQL solution:
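SELECT post_id, likes_count,
CASE
WHEN likes_count < 50 THEN 'Low Engagement'
WHEN likes_count BETWEEN 50 AND 200 THEN 'Medium Engagement'
ELSE 'High Engagement'
END AS engagement_classification
FROM InstagramPosts;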
PostgreSQL solution:
SELECT post_id, likes_count,
CASE
WHEN likes_count < 50 THEN 'Low Engagement'
WHEN likes_count BETWEEN 50 AND 200 THEN 'Medium Engagement'
ELSE 'High Engagement'
END AS engagement_classification
FROM InstagramPosts;
• Q.431
Question
Write an SQL query to calculate the number of users who either:
• "Followed" a post, based on the action follow,
• "Unfollowed" a post, based on the action unfollow.
Use the action field in the InstagramActions table to classify each action. Additionally, if
a user follows a post but hasn't unfollowed it, mark it as a "Net Follower" in a new column.
Return the post_id, follow_count, unfollow_count, and net_follower_count.
Explanation
• You need to classify the actions based on the action field.
• Then, you calculate the count of follows and unfollows.
• Use the CASE statement to classify users who followed a post but haven't unfollowed it, and
calculate the net number of followers.
Datasets
Table creation and sample data:
CREATE TABLE InstagramActions (
action_id INT PRIMARY KEY,
user_id INT,
post_id INT,
action ENUM('follow', 'unfollow'),
time_stamp DATETIME
);
Learnings
• Using CASE to classify and filter based on action types.
Solutions
MySQL solution:
SELECT post_id,
SUM(CASE WHEN action = 'follow' THEN 1 ELSE 0 END) AS follow_count,
SUM(CASE WHEN action = 'unfollow' THEN 1 ELSE 0 END) AS unfollow_count,
SUM(CASE WHEN action = 'follow' THEN 1 ELSE 0 END) -
SUM(CASE WHEN action = 'unfollow' THEN 1 ELSE 0 END) AS net_follower_count
FROM InstagramActions
GROUP BY post_id;
PostgreSQL solution:
SELECT post_id,
SUM(CASE WHEN action = 'follow' THEN 1 ELSE 0 END) AS follow_count,
SUM(CASE WHEN action = 'unfollow' THEN 1 ELSE 0 END) AS unfollow_count,
SUM(CASE WHEN action = 'follow' THEN 1 ELSE 0 END) -
SUM(CASE WHEN action = 'unfollow' THEN 1 ELSE 0 END) AS net_follower_count
FROM InstagramActions
GROUP BY post_id;
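The conditional aggregation above can be tried on a few rows. This is a minimal sketch using Python's sqlite3 module, with SQLite standing in for MySQL or PostgreSQL; the action column is plain TEXT here because SQLite has no ENUM type, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE InstagramActions (
    action_id INT PRIMARY KEY, user_id INT, post_id INT, action TEXT
);
INSERT INTO InstagramActions VALUES
    (1, 10, 1, 'follow'), (2, 11, 1, 'follow'), (3, 10, 1, 'unfollow'),
    (4, 12, 2, 'follow');
""")

# SUM(CASE ...) counts only the rows whose action matches; the net count
# is the difference of the two conditional sums.
rows = conn.execute("""
    SELECT post_id,
           SUM(CASE WHEN action = 'follow' THEN 1 ELSE 0 END) AS follow_count,
           SUM(CASE WHEN action = 'unfollow' THEN 1 ELSE 0 END) AS unfollow_count,
           SUM(CASE WHEN action = 'follow' THEN 1 ELSE 0 END)
             - SUM(CASE WHEN action = 'unfollow' THEN 1 ELSE 0 END) AS net_follower_count
    FROM InstagramActions
    GROUP BY post_id
    ORDER BY post_id
""").fetchall()

print(rows)
```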
• Q.432
Question
Write an SQL query to identify the users who have commented about "violence" in their last
3 comments, based on the presence of specific keywords related to violence such as
'violence', 'attack', 'fight', 'war', or 'blood'. The comment table contains a column
comment_text where comments are stored.
Return the user_id and the comment_id of the relevant comments.
Explanation
You need to:
• Filter the comments that contain certain violence-related keywords using LIKE or REGEXP
in SQL.
• Identify the last 3 comments for each user.
• Check if any of the last 3 comments contain violence-related words.
• Return the user_id and comment_id of the comments that match the criteria.
Datasets
Table creation and sample data:
CREATE TABLE Comments (
comment_id INT PRIMARY KEY,
user_id INT,
comment_text TEXT,
time_stamp DATETIME
);
(1, 101, 'The attack on the city was brutal and violent', '2021-08-01 10:00:00'),
(2, 101, 'I watched a bloody war on the news', '2021-08-02 12:00:00'),
(3, 101, 'This fight is going too far, people are getting hurt', '2021-08-03 14:00:00'),
(4, 102, 'The peaceful march was inspiring', '2021-08-01 15:00:00'),
(5, 102, 'Such a violent outburst in the city center', '2021-08-02 16:00:00'),
(6, 103, 'The new movie was amazing, no violence', '2021-08-02 17:00:00'),
(7, 103, 'Blood spilled in the streets during the riots', '2021-08-03 18:00:00'),
(8, 103, 'Fighting over the issue is senseless', '2021-08-04 19:00:00'),
(9, 104, 'Nothing violent about this situation', '2021-08-04 20:00:00');
Learnings
• Use of LIKE or REGEXP for pattern matching in comments.
• Handling multiple conditions to check for keywords related to violence.
• Ordering and filtering to select the last 3 comments per user.
• Combining text analysis with SQL filtering.
Solutions
MySQL solution:
SELECT user_id, comment_id
FROM (
    SELECT user_id, comment_id, comment_text,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY time_stamp DESC) AS rn
    FROM Comments
) AS recent
WHERE rn <= 3
  AND (comment_text LIKE '%violence%' OR comment_text LIKE '%attack%'
       OR comment_text LIKE '%fight%' OR comment_text LIKE '%war%'
       OR comment_text LIKE '%blood%');
PostgreSQL solution:
SELECT user_id, comment_id
FROM (
    SELECT user_id, comment_id, comment_text,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY time_stamp DESC) AS rn
    FROM Comments
) AS recent
WHERE rn <= 3
  AND comment_text ~* 'violence|attack|fight|war|blood';
Notes
• The ROW_NUMBER() function numbers each user's comments from newest to oldest, so rn <= 3
keeps the last 3 comments for each user.
• The keyword filter is applied in the outer query, after the ranking. Filtering inside the
subquery would instead return each user's last 3 matching comments, even when those are not
among the user's last 3 comments overall.
• LIKE or REGEXP is used to match specific words related to violence in the comments. For
MySQL, LIKE is used (case-insensitive under MySQL's default collations), and for PostgreSQL,
~* (case-insensitive regular expression match) is used.
• Both patterns are substring matches, so 'war' also matches words such as 'warm'.
• This query assumes that the comments table has enough data and that the comment_text
contains the necessary information to identify violence-related content.
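Whether the keyword filter runs before or after the ranking changes the answer. The sketch below is an illustration using Python's sqlite3 module (SQLite stands in for MySQL or PostgreSQL, and it needs a SQLite build with window functions, version 3.25 or later). The rows are made up and only two of the keywords are used to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Comments (
    comment_id INT PRIMARY KEY, user_id INT, comment_text TEXT, time_stamp TEXT
);
INSERT INTO Comments VALUES
    (1, 101, 'a war story',  '2021-08-01'),
    (2, 101, 'nice day',     '2021-08-02'),
    (3, 101, 'lunch',        '2021-08-03'),
    (4, 101, 'good night',   '2021-08-04'),
    (5, 102, 'what a fight', '2021-08-03'),
    (6, 102, 'hello',        '2021-08-04');
""")

keyword = "(comment_text LIKE '%war%' OR comment_text LIKE '%fight%')"
ranking = "ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY time_stamp DESC) AS rn"

# Filter first, then rank: finds the last 3 *matching* comments per user.
filter_first = [r[0] for r in conn.execute(f"""
    SELECT comment_id FROM (
        SELECT comment_id, {ranking} FROM Comments WHERE {keyword}
    ) WHERE rn <= 3 ORDER BY comment_id
""")]

# Rank first, then filter: finds matches among each user's last 3 comments.
rank_first = [r[0] for r in conn.execute(f"""
    SELECT comment_id FROM (
        SELECT comment_id, comment_text, {ranking} FROM Comments
    ) WHERE rn <= 3 AND {keyword} ORDER BY comment_id
""")]

print(filter_first)
print(rank_first)
```

Comment 1 is user 101's fourth most recent comment, so only the filter-first variant returns it.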
• Q.433
Question
Write an SQL query to combine WhatsApp group chat messages and individual chat
messages into a single result set. The GroupChats table stores messages from group chats,
and the IndividualChats table stores messages from one-on-one conversations.
Explanation
You need to use UNION to combine results from two different tables, ensuring you select the
necessary columns from both tables and label them accordingly. This query will return all
messages with a label to identify if they are from a group chat or individual chat.
Datasets
Table creation and sample data:
CREATE TABLE GroupChats (
message_id INT PRIMARY KEY,
group_id INT,
user_id INT,
message TEXT,
time_stamp DATETIME
);
Learnings
• Using UNION to combine data from multiple tables.
• Labeling the type of chat by adding a static value in the SELECT statement.
• Handling both group and individual messages in a single query.
Solutions
MySQL solution:
SELECT user_id, 'Group' AS chat_type, message, time_stamp
FROM GroupChats
UNION
SELECT sender_id AS user_id, 'Individual' AS chat_type, message, time_stamp
FROM IndividualChats
ORDER BY time_stamp;
PostgreSQL solution:
SELECT user_id, 'Group' AS chat_type, message, time_stamp
FROM GroupChats
UNION
SELECT sender_id AS user_id, 'Individual' AS chat_type, message, time_stamp
FROM IndividualChats
ORDER BY time_stamp;
• Q.434
Question
Write an SQL query to combine incoming and outgoing messages in WhatsApp chat logs into
one result set. The IncomingMessages table stores messages received by users, and the
OutgoingMessages table stores messages sent by users.
• Return the user_id, message_type (either "Incoming" or "Outgoing"), and the message.
• Combine both tables using UNION and display the result ordered by timestamp.
Explanation
You need to combine the incoming and outgoing messages using UNION. Additionally, label
the messages as either "Incoming" or "Outgoing" using a SELECT with static values.
Datasets
Table creation and sample data:
CREATE TABLE IncomingMessages (
message_id INT PRIMARY KEY,
user_id INT,
message TEXT,
time_stamp DATETIME
);
Learnings
• Using UNION to merge data from different sources (incoming vs. outgoing messages).
• Adding static labels to each query to differentiate between message types.
• Ensuring the correct order of the messages by timestamp.
Solutions
MySQL solution:
SELECT user_id, 'Incoming' AS message_type, message, time_stamp
FROM IncomingMessages
UNION
SELECT user_id, 'Outgoing' AS message_type, message, time_stamp
FROM OutgoingMessages
ORDER BY time_stamp;
PostgreSQL solution:
SELECT user_id, 'Incoming' AS message_type, message, time_stamp
FROM IncomingMessages
UNION
SELECT user_id, 'Outgoing' AS message_type, message, time_stamp
FROM OutgoingMessages
ORDER BY time_stamp;
Notes
• Both queries use UNION to combine messages from two different sources.
• The message_type column is added to label whether the message is incoming or outgoing.
• time_stamp is included in both SELECT lists because the ORDER BY of a UNION can only
reference columns of the combined result.
• The query results are ordered by the time_stamp to ensure the messages are returned in
chronological order.
• UNION removes exact duplicate rows; use UNION ALL if every message must be kept.
• Q.435
Question
Write an SQL query to find the prices of all products on 2019-08-16. Assume the price of all
products before any price change is 10.
Return the product_id and the price of each product as of 2019-08-16.
Explanation
You need to:
• For each product, find the most recent price change that occurred on or before 2019-08-
16.
• If no price change is found for a product before or on this date, the price will be assumed to
be the default value of 10.
Datasets
Table creation and sample data:
CREATE TABLE Products (
product_id INT,
new_price INT,
change_date DATE,
PRIMARY KEY (product_id, change_date)
);
Learnings
• Using LEFT JOIN to handle products with no price change before the specified date.
• Using MAX() and GROUP BY to find the most recent price change before or on the given
date.
• Handling default values when no data is available for certain conditions.
Solutions
MySQL solution:
SELECT p.product_id,
       IFNULL(pr.new_price, 10) AS price
FROM (
    SELECT DISTINCT product_id
    FROM Products
) p
LEFT JOIN Products pr
    ON pr.product_id = p.product_id
   AND pr.change_date = (
        SELECT MAX(x.change_date)
        FROM Products x
        WHERE x.product_id = p.product_id
          AND x.change_date <= '2019-08-16'
   );
PostgreSQL solution:
SELECT p.product_id,
       COALESCE(pr.new_price, 10) AS price
FROM (
    SELECT DISTINCT product_id
    FROM Products
) p
LEFT JOIN Products pr
    ON pr.product_id = p.product_id
   AND pr.change_date = (
        SELECT MAX(x.change_date)
        FROM Products x
        WHERE x.product_id = p.product_id
          AND x.change_date <= '2019-08-16'
   );
Notes
• The query selects all distinct product_id values first, ensuring that we handle all products.
• A LEFT JOIN is used to include products that have no price change by 2019-08-16,
ensuring those products have a default price of 10.
• The most recent price change is the row with the greatest change_date on or before
2019-08-16, found with MAX(change_date) per product. Taking MAX(new_price) instead would
return the highest price, not the latest one.
• The IFNULL (MySQL) and COALESCE (PostgreSQL) functions handle the case where a
product does not have any price change by returning the default value 10.
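The difference between the highest price and the latest price as of a date can be seen on a small example. The sketch below is an illustration using Python's sqlite3 module, with SQLite standing in for MySQL or PostgreSQL; the rows are made up so that product 1's price goes down on 2019-08-16 and product 2 only changes afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (
    product_id INT, new_price INT, change_date TEXT,
    PRIMARY KEY (product_id, change_date)
);
INSERT INTO Products VALUES
    (1, 20, '2019-08-14'), (1, 30, '2019-08-15'), (1, 15, '2019-08-16'),
    (1, 35, '2019-08-17'), (2, 50, '2019-08-17');
""")

# MAX(new_price): the highest price seen up to the date.
highest = conn.execute("""
    SELECT p.product_id, COALESCE(MAX(pr.new_price), 10) AS price
    FROM (SELECT DISTINCT product_id FROM Products) p
    LEFT JOIN Products pr
        ON pr.product_id = p.product_id AND pr.change_date <= '2019-08-16'
    GROUP BY p.product_id
    ORDER BY p.product_id
""").fetchall()

# MAX(change_date): the price from the most recent change up to the date.
latest = conn.execute("""
    SELECT p.product_id, COALESCE(pr.new_price, 10) AS price
    FROM (SELECT DISTINCT product_id FROM Products) p
    LEFT JOIN Products pr
        ON pr.product_id = p.product_id
       AND pr.change_date = (
            SELECT MAX(x.change_date) FROM Products x
            WHERE x.product_id = p.product_id AND x.change_date <= '2019-08-16'
       )
    ORDER BY p.product_id
""").fetchall()

print(highest)
print(latest)
```

Product 2 has no change on or before the date, so both variants fall back to the default price of 10.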
• Q.436
Question
Write an SQL query to find the name of the candidate who received the most votes. In case of
a tie, return the candidate with the lowest id among the tied candidates.
Explanation
• You need to count the votes for each candidate from the Vote table, grouped by
CandidateId.
• You should then join this result with the Candidate table to get the name of the candidate.
• Finally, return the candidate name who received the most votes.
Datasets
Table creation and sample data:
CREATE TABLE Candidate (
id INT PRIMARY KEY AUTO_INCREMENT,
Name VARCHAR(100)
);
Learnings
• Counting rows in a table using COUNT().
• Grouping results using GROUP BY.
• Handling ties and returning the candidate with the lowest id using ORDER BY and LIMIT.
Solutions
MySQL solution:
SELECT c.Name
FROM Candidate c
JOIN (
SELECT CandidateId, COUNT(*) AS vote_count
FROM Vote
GROUP BY CandidateId
) v ON c.id = v.CandidateId
ORDER BY v.vote_count DESC, c.id ASC
LIMIT 1;
PostgreSQL solution:
SELECT c.Name
FROM Candidate c
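JOIN (
SELECT CandidateId, COUNT(*) AS vote_count
FROM Vote
GROUP BY CandidateId
) v ON c.id = v.CandidateId
ORDER BY v.vote_count DESC, c.id ASC
LIMIT 1;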
Notes
• The subquery calculates the total number of votes each candidate has received by using
COUNT(*) and grouping by CandidateId.
• The outer query joins the Candidate table with the result of the subquery to fetch the
candidate's name.
• The query orders by vote_count in descending order to find the candidate with the most
votes.
• In case of a tie (multiple candidates with the same highest vote count), the query orders by
id in ascending order to select the candidate with the lowest id.
• Finally, LIMIT 1 ensures that only the top candidate is returned.
• Q.437
Question
Write an SQL query to find the names of all candidates who have received more votes than
the average number of votes per candidate.
Explanation
• First, calculate the average number of votes per candidate.
• Then, count the number of votes for each candidate.
• Finally, find all candidates whose vote count is greater than the average.
Datasets
Table creation and sample data:
CREATE TABLE Candidate (
id INT PRIMARY KEY AUTO_INCREMENT,
Name VARCHAR(100)
);
(4, 2),
(5, 5);
Learnings
• Using COUNT() to calculate the number of votes for each candidate.
• Using AVG() to calculate the average votes across all candidates.
• Using a subquery in the WHERE clause to filter the results based on aggregated values.
Solutions
MySQL solution:
SELECT c.Name
FROM Candidate c
JOIN (
SELECT CandidateId, COUNT(*) AS vote_count
FROM Vote
GROUP BY CandidateId
) v ON c.id = v.CandidateId
WHERE v.vote_count > (
SELECT AVG(vote_count)
FROM (
SELECT COUNT(*) AS vote_count
FROM Vote
GROUP BY CandidateId
) avg_votes
);
PostgreSQL solution:
SELECT c.Name
FROM Candidate c
JOIN (
SELECT CandidateId, COUNT(*) AS vote_count
FROM Vote
GROUP BY CandidateId
) v ON c.id = v.CandidateId
WHERE v.vote_count > (
SELECT AVG(vote_count)
FROM (
SELECT COUNT(*) AS vote_count
FROM Vote
GROUP BY CandidateId
) avg_votes
);
Notes
• The subquery inside the WHERE clause calculates the average number of votes per candidate
using AVG().
• The outer query counts the votes for each candidate using COUNT(*), then filters the results
by comparing each candidate's vote count with the calculated average.
• The JOIN ensures that we can access both the candidate name and their vote count in the
final result.
• Q.438
Question
Write an SQL query to find the product_id of products that are both low fat and recyclable.
Return the result table in any order.
Explanation
You need to:
• Filter products where the low_fats column is 'Y' (low fat) and the recyclable column is
'Y' (recyclable).
• Return the product_id of those products that satisfy both conditions.
Datasets
Table creation and sample data:
CREATE TABLE Products (
product_id INT PRIMARY KEY,
low_fats ENUM('Y', 'N'),
recyclable ENUM('Y', 'N')
);
Learnings
• Filtering results based on multiple conditions in the WHERE clause.
• Using the ENUM type for handling specific values (e.g., 'Y' and 'N').
• Selecting specific columns in the result table.
Solutions
MySQL solution:
SELECT product_id
FROM Products
WHERE low_fats = 'Y' AND recyclable = 'Y';
PostgreSQL solution:
SELECT product_id
FROM Products
WHERE low_fats = 'Y' AND recyclable = 'Y';
Notes
• The WHERE clause filters rows where both low_fats and recyclable are 'Y', meaning the
product is both low fat and recyclable.
• The result returns only the product_id of those products.
• Q.439
Question
Write an SQL query to find the product_id and launch_year of products that were
launched in the last 3 years but have not been sold in the last year.
Explanation
You need to:
• Filter products that were launched in the last 3 years.
• Check that these products have not been sold in the last year.
• Return the product_id and the launch_year of those products.
Datasets
Table creation and sample data:
CREATE TABLE Products (
product_id INT PRIMARY KEY,
launch_date DATE
);
Learnings
• Using DATE functions to filter records by specific time periods.
• Understanding how to compare dates across two tables.
• Using JOIN to combine product and sales data.
• Handling date calculations to determine a range (e.g., "last 3 years", "last year").
Solutions
MySQL solution:
SELECT p.product_id, YEAR(p.launch_date) AS launch_year
FROM Products p
LEFT JOIN Sales s
    ON p.product_id = s.product_id
   AND s.sale_date >= CURDATE() - INTERVAL 1 YEAR
WHERE p.launch_date >= CURDATE() - INTERVAL 3 YEAR
  AND s.product_id IS NULL;
PostgreSQL solution:
SELECT p.product_id, EXTRACT(YEAR FROM p.launch_date) AS launch_year
FROM Products p
LEFT JOIN Sales s
    ON p.product_id = s.product_id
   AND s.sale_date >= CURRENT_DATE - INTERVAL '1 year'
WHERE p.launch_date >= CURRENT_DATE - INTERVAL '3 years'
  AND s.product_id IS NULL;
Notes
• The query filters products launched in the last 3 years by comparing launch_date with
the current date minus a 3-year INTERVAL.
• The LEFT JOIN matches each product only to its sales from the last year, because the date
condition is part of the ON clause.
• The condition s.product_id IS NULL keeps the products with no such match, that is,
products that have not been sold in the last year (including products never sold).
• Filtering on s.sale_date in the WHERE clause instead would also return products that have
both an old sale and a recent one.
• The result includes the product_id and the launch_year of those products.
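The "not sold in the last year" condition is an anti-join, and where the date condition is placed matters. The sketch below is an illustration using Python's sqlite3 module, with SQLite standing in for MySQL or PostgreSQL. The rows and the fixed reference date 2025-01-01 are assumptions, chosen so that product 1 has both an old sale and a recent one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (product_id INT PRIMARY KEY, launch_date TEXT);
CREATE TABLE Sales (product_id INT, sale_date TEXT);
INSERT INTO Products VALUES
    (1, '2023-06-01'), (2, '2022-06-01'), (3, '2024-03-01'), (4, '2020-01-01');
INSERT INTO Sales VALUES
    (1, '2023-07-01'), (1, '2024-11-01'), (2, '2023-01-01');
""")
params = {"today": "2025-01-01"}  # fixed "current date" so the result is repeatable

# Date test in WHERE: any single old sale row lets the product through.
where_filter = [r[0] for r in conn.execute("""
    SELECT p.product_id
    FROM Products p
    LEFT JOIN Sales s ON p.product_id = s.product_id
    WHERE p.launch_date >= date(:today, '-3 years')
      AND (s.sale_date IS NULL OR s.sale_date < date(:today, '-1 year'))
    GROUP BY p.product_id
    ORDER BY p.product_id
""", params)]

# Date test in ON plus IS NULL: keeps products with no recent sale at all.
anti_join = [r[0] for r in conn.execute("""
    SELECT p.product_id
    FROM Products p
    LEFT JOIN Sales s
        ON p.product_id = s.product_id
       AND s.sale_date >= date(:today, '-1 year')
    WHERE p.launch_date >= date(:today, '-3 years')
      AND s.product_id IS NULL
    ORDER BY p.product_id
""", params)]

print(where_filter)
print(anti_join)
```

Product 1 was sold in November 2024, so only the anti-join form excludes it.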
• Q.440
Question
Write an SQL query to find the user_id who, over a period of time, has shown the largest
fluctuation in their engagement on a platform. Specifically, find the user whose engagement
(activity_score) has varied the most between any two consecutive days, based on the
activity data stored in the UserEngagement table.
The query should return:
• user_id
• The maximum difference in activity score between two consecutive days for that user
(calculated as the absolute difference between the scores).
• The start date of the period when this fluctuation started (the first day of the two
consecutive days with the largest difference).
• The end date of the period when this fluctuation ended (the second day of the two
consecutive days with the largest difference).
Explanation
You need to:
• Calculate the activity_score difference between consecutive days for each user.
• Find the user with the highest fluctuation in activity score between two consecutive days.
• Return the user_id, the maximum fluctuation, and the dates of the fluctuation.
The query should handle:
• Users who have multiple fluctuations and return the one with the largest absolute
difference.
• Ensure that the fluctuation calculation is based on consecutive days for each user.
Datasets
Table creation and sample data:
Learnings
• Using LAG() window function to calculate the previous day's activity score for each user.
• Calculating the absolute difference in scores between consecutive days.
• Finding the maximum difference and selecting the appropriate dates using window
functions and ORDER BY.
• Handling gaps in data and ensuring consecutive days are correctly identified.
Solutions
MySQL solution:
WITH ScoreDifferences AS (
    SELECT
        user_id,
        LAG(activity_date) OVER (PARTITION BY user_id ORDER BY activity_date) AS start_date,
        activity_date AS end_date,
        ABS(activity_score
            - LAG(activity_score) OVER (PARTITION BY user_id ORDER BY activity_date)) AS score_diff
    FROM UserEngagement
)
SELECT
    user_id,
    score_diff AS max_fluctuation,
    start_date,
    end_date
FROM ScoreDifferences
WHERE score_diff IS NOT NULL
ORDER BY score_diff DESC
LIMIT 1;
PostgreSQL solution:
WITH ScoreDifferences AS (
    SELECT
        user_id,
        LAG(activity_date) OVER (PARTITION BY user_id ORDER BY activity_date) AS start_date,
        activity_date AS end_date,
        ABS(activity_score
            - LAG(activity_score) OVER (PARTITION BY user_id ORDER BY activity_date)) AS score_diff
    FROM UserEngagement
)
SELECT
    user_id,
    score_diff AS max_fluctuation,
    start_date,
    end_date
FROM ScoreDifferences
WHERE score_diff IS NOT NULL
ORDER BY score_diff DESC
LIMIT 1;
Notes
• The LAG() window function is used to retrieve the previous day's activity_score and
activity_date for each user, partitioned by user_id and ordered by activity_date.
• The absolute difference is calculated using ABS() to find the fluctuation.
• We use a common table expression (CTE) to compute these differences, and the outer query
filters out cases where no previous score exists (i.e., the first day for each user). Both
LAG() calls run before that filter, so start_date is always the row just before end_date.
• ORDER BY score_diff DESC with LIMIT 1 returns the single row with the largest fluctuation
across all users, together with its own start_date and end_date. Grouping by user_id and
selecting start_date and end_date without aggregating them is rejected by PostgreSQL and by
MySQL's default ONLY_FULL_GROUP_BY mode.
• Each row is compared with the user's previous recorded day. If the data can contain gaps,
add a condition that start_date and end_date are exactly one day apart.
• The query returns the user_id, max_fluctuation, start_date, and end_date for the
user with the largest fluctuation in activity scores.
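The LAG() approach can be tried on a few rows. This is a minimal sketch using Python's sqlite3 module (SQLite stands in for MySQL or PostgreSQL and must support window functions, version 3.25 or later). The UserEngagement column types and the rows are assumptions, since only the column names appear in the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserEngagement (user_id INT, activity_date TEXT, activity_score INT);
INSERT INTO UserEngagement VALUES
    (1, '2025-01-01', 10), (1, '2025-01-02', 50), (1, '2025-01-03', 45),
    (2, '2025-01-01', 20), (2, '2025-01-02', 25), (2, '2025-01-03', 95);
""")

# LAG() pairs each row with the same user's previous row; the first row of
# each user has no predecessor, so its score_diff is NULL and is filtered out.
row = conn.execute("""
    WITH ScoreDifferences AS (
        SELECT user_id,
               LAG(activity_date) OVER (
                   PARTITION BY user_id ORDER BY activity_date) AS start_date,
               activity_date AS end_date,
               ABS(activity_score - LAG(activity_score) OVER (
                   PARTITION BY user_id ORDER BY activity_date)) AS score_diff
        FROM UserEngagement
    )
    SELECT user_id, score_diff, start_date, end_date
    FROM ScoreDifferences
    WHERE score_diff IS NOT NULL
    ORDER BY score_diff DESC
    LIMIT 1
""").fetchone()

print(row)
```

User 1's largest jump is 40 and user 2's is 70, so the single returned row belongs to user 2.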
Netflix
• Q.441
Question
Write an SQL query to find the top 5 most-watched pieces of content in each region based
on total watch duration for the last month (December 2024).
Return the following columns:
• Region
• ContentID
• Total Watch Duration (sum of WatchDuration for each content in that region)
Explanation
You need to:
• Filter the ViewingHistory table to only include records from December 2024.
• Calculate the total watch duration for each content piece in each region.
• Rank the content within each region based on the total watch duration, and select the top 5
content for each region.
• Return the results ordered by region and total watch duration.
Datasets
Table creation and sample data:
CREATE TABLE ViewingHistory (
UserID INT,
ContentID INT,
WatchDate DATE,
WatchDuration INT,
Region VARCHAR(100)
);
Learnings
• Using the SUM() function to aggregate total watch duration.
• Using ROW_NUMBER() or RANK() to rank the top 5 content in each region.
• Filtering data based on date conditions (WHERE clause) for the last month.
• Window Functions (like ROW_NUMBER()) to rank content in each region.
Solutions
MySQL solution:
WITH RankedContent AS (
    SELECT
        Region,
        ContentID,
        SUM(WatchDuration) AS TotalWatchDuration,
        ROW_NUMBER() OVER (PARTITION BY Region ORDER BY SUM(WatchDuration) DESC) AS ContentRank
    FROM ViewingHistory
    WHERE WatchDate BETWEEN '2024-12-01' AND '2024-12-31'
    GROUP BY Region, ContentID
)
SELECT Region, ContentID, TotalWatchDuration
FROM RankedContent
WHERE ContentRank <= 5
ORDER BY Region, TotalWatchDuration DESC;
PostgreSQL solution:
WITH RankedContent AS (
    SELECT
        Region,
        ContentID,
        SUM(WatchDuration) AS TotalWatchDuration,
        ROW_NUMBER() OVER (PARTITION BY Region ORDER BY SUM(WatchDuration) DESC) AS ContentRank
    FROM ViewingHistory
    WHERE WatchDate BETWEEN '2024-12-01' AND '2024-12-31'
    GROUP BY Region, ContentID
)
SELECT Region, ContentID, TotalWatchDuration
FROM RankedContent
WHERE ContentRank <= 5
ORDER BY Region, TotalWatchDuration DESC;
Notes
• The query filters the data for December 2024 using the WHERE clause.
• The SUM() function is used to calculate the total watch duration for each content piece.
• The ROW_NUMBER() window function ranks the content within each region by total watch
duration in descending order. Ties are broken arbitrarily; use RANK() or DENSE_RANK() if tied
content should share a position.
• The alias is ContentRank rather than Rank because RANK is a reserved word in MySQL 8.0.
• Only the top 5 ranked content for each region are selected using WHERE ContentRank <= 5.
• The final result is ordered by Region and TotalWatchDuration in descending order to
show the most-watched content at the top.
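The top-N-per-group pattern can be checked on a small dataset. The sketch below is an illustration using Python's sqlite3 module (SQLite stands in for MySQL or PostgreSQL and must support window functions, version 3.25 or later). The rows are made up, and TOP_N is set to 2 instead of 5 so the cut-off is visible; the aggregation is done in a separate CTE before ranking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ViewingHistory (
    UserID INT, ContentID INT, WatchDate TEXT, WatchDuration INT, Region TEXT
);
INSERT INTO ViewingHistory VALUES
    (1, 1, '2024-12-01', 100, 'US'), (2, 1, '2024-12-05', 50, 'US'),
    (1, 2, '2024-12-02', 120, 'US'), (3, 3, '2024-12-03', 30, 'US'),
    (3, 3, '2024-11-30', 500, 'US'),
    (4, 1, '2024-12-10', 40, 'EU'), (5, 2, '2024-12-11', 60, 'EU');
""")

TOP_N = 2  # hypothetical cut-off for this small example

rows = conn.execute("""
    WITH Totals AS (
        -- December 2024 only; the November row is ignored.
        SELECT Region, ContentID, SUM(WatchDuration) AS TotalWatchDuration
        FROM ViewingHistory
        WHERE WatchDate BETWEEN '2024-12-01' AND '2024-12-31'
        GROUP BY Region, ContentID
    ),
    RankedContent AS (
        SELECT Region, ContentID, TotalWatchDuration,
               ROW_NUMBER() OVER (
                   PARTITION BY Region ORDER BY TotalWatchDuration DESC
               ) AS ContentRank
        FROM Totals
    )
    SELECT Region, ContentID, TotalWatchDuration
    FROM RankedContent
    WHERE ContentRank <= ?
    ORDER BY Region, TotalWatchDuration DESC
""", (TOP_N,)).fetchall()

print(rows)
```

Content 3 in the US region has its largest watch on 2024-11-30, which falls outside the date filter, so it does not make the top 2.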
• Q.442
Question
Write an SQL query to find, for each month and country, the following:
• The number of transactions.
• The total amount of all transactions.
• The number of approved transactions.
• The total amount of approved transactions.
Return the results in any order.
Explanation
You need to:
• Extract the month and year from the trans_date to group the data by month.
• Count the total number of transactions and calculate the total amount of transactions for
each country and month.
• For transactions with the state 'approved', calculate the count and total amount
separately.
• Return the results with the following columns:
• month (in YYYY-MM format)
• country
• trans_count (total number of transactions)
• approved_count (number of approved transactions)
• trans_total_amount (total amount of all transactions)
• approved_total_amount (total amount of approved transactions)
Datasets
Table creation and sample data:
CREATE TABLE Transactions (
id INT PRIMARY KEY,
country VARCHAR(50),
state ENUM('approved', 'declined'),
amount INT,
trans_date DATE
);
Learnings
• Using DATE_FORMAT (MySQL) or TO_CHAR (PostgreSQL) to extract the year and month
from the transaction date.
• Using GROUP BY to aggregate data by month and country.
• Using COUNT() to count the number of transactions and SUM() to calculate total amounts.
• Counting and summing approved transactions with CASE expressions inside the aggregates (a WHERE filter would remove the declined rows that the overall totals need).
Solutions
MySQL solution:
SELECT
DATE_FORMAT(trans_date, '%Y-%m') AS month,
country,
COUNT(*) AS trans_count,
SUM(CASE WHEN state = 'approved' THEN 1 ELSE 0 END) AS approved_count,
SUM(amount) AS trans_total_amount,
SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) AS approved_total_amount
FROM Transactions
GROUP BY month, country
ORDER BY month, country;
PostgreSQL solution:
SELECT
TO_CHAR(trans_date, 'YYYY-MM') AS month,
country,
COUNT(*) AS trans_count,
SUM(CASE WHEN state = 'approved' THEN 1 ELSE 0 END) AS approved_count,
SUM(amount) AS trans_total_amount,
SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) AS approved_total_amount
FROM Transactions
GROUP BY month, country
ORDER BY month, country;
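In PostgreSQL, the CASE expressions inside SUM() can also be written with the aggregate FILTER clause. The sketch below is one possible form, run against a few hypothetical rows (SampleTransactions is an illustrative name, not the table above).

```sql
-- Hypothetical sample rows; PostgreSQL only (MySQL has no FILTER clause).
WITH SampleTransactions (id, country, state, amount, trans_date) AS (
    VALUES
        (1, 'US', 'approved', 1000, DATE '2024-12-18'),
        (2, 'US', 'declined', 2000, DATE '2024-12-19'),
        (3, 'US', 'approved', 2000, DATE '2025-01-01'),
        (4, 'DE', 'approved', 2000, DATE '2025-01-07')
)
SELECT
    TO_CHAR(trans_date, 'YYYY-MM') AS month,
    country,
    COUNT(*) AS trans_count,
    COUNT(*) FILTER (WHERE state = 'approved') AS approved_count,
    SUM(amount) AS trans_total_amount,
    -- SUM over zero rows is NULL, so COALESCE keeps the column at 0
    COALESCE(SUM(amount) FILTER (WHERE state = 'approved'), 0) AS approved_total_amount
FROM SampleTransactions
GROUP BY TO_CHAR(trans_date, 'YYYY-MM'), country
ORDER BY month, country;
```

The CASE form remains the portable choice because it runs unchanged on both MySQL and PostgreSQL.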
Notes
• Q.443
Question
Write an SQL query to calculate the monthly retention rate of users over the past 6 months,
given the UserActivity table (UserID, LoginDate, ActivityType), where:
• Each row in the table represents an activity performed by a user on a specific date.
• The table contains data for all users over multiple months.
The monthly retention rate for a given month is defined as the percentage of users who
logged in during that month and were also active in the previous month.
Explanation
To calculate the monthly retention rate for each month:
• Identify the active users for each month (those who logged in that month).
• Find the users who were active in the previous month (i.e., logged in during the previous
month).
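• Calculate the retention rate for each month as:
Retention rate = (users active in both the current and the previous month / users active in the current month) × 100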
Datasets
Table creation and sample data:
CREATE TABLE UserActivity (
UserID INT,
LoginDate DATE,
ActivityType VARCHAR(50)
);
Learnings
• Using TO_CHAR (PostgreSQL) or DATE_FORMAT (MySQL) to extract the year and month from the login date.
• Using a self-join to link users who logged in during both the current month and the previous month.
• Counting distinct users with COUNT(DISTINCT ...) to calculate retention.
• Calculating the retention percentage by dividing the number of retained users by the total number of active users.
Solutions
MySQL solution:
WITH MonthlyActiveUsers AS (
    SELECT DISTINCT
        UserID,
        DATE_FORMAT(LoginDate, '%Y-%m') AS month
    FROM UserActivity
    WHERE LoginDate BETWEEN '2024-07-01' AND '2024-12-31'
),
Retention AS (
    SELECT
        curr.UserID,
        curr.month AS current_month,
        prev.month AS previous_month
    FROM MonthlyActiveUsers curr
    LEFT JOIN MonthlyActiveUsers prev
        ON curr.UserID = prev.UserID
        -- Append '-01' so the month string parses as a complete date
        AND prev.month = DATE_FORMAT(DATE_SUB(STR_TO_DATE(CONCAT(curr.month, '-01'), '%Y-%m-%d'), INTERVAL 1 MONTH), '%Y-%m')
)
SELECT
    current_month AS month,
    COUNT(DISTINCT UserID) AS active_users,
    COUNT(DISTINCT CASE WHEN previous_month IS NOT NULL THEN UserID END) AS retained_users,
    ROUND(COUNT(DISTINCT CASE WHEN previous_month IS NOT NULL THEN UserID END) * 100.0
        / COUNT(DISTINCT UserID), 2) AS retention_rate
FROM Retention
GROUP BY current_month
ORDER BY current_month DESC;
PostgreSQL solution:
WITH MonthlyActiveUsers AS (
    SELECT DISTINCT
        UserID,
        TO_CHAR(LoginDate, 'YYYY-MM') AS month
    FROM UserActivity
    WHERE LoginDate BETWEEN '2024-07-01' AND '2024-12-31'
),
Retention AS (
    SELECT
        curr.UserID,
        curr.month AS current_month,
        prev.month AS previous_month
    FROM MonthlyActiveUsers curr
    LEFT JOIN MonthlyActiveUsers prev
        ON curr.UserID = prev.UserID
        AND prev.month = TO_CHAR(TO_DATE(curr.month, 'YYYY-MM') - INTERVAL '1 month', 'YYYY-MM')
)
SELECT
    current_month AS month,
    COUNT(DISTINCT UserID) AS active_users,
    COUNT(DISTINCT CASE WHEN previous_month IS NOT NULL THEN UserID END) AS retained_users,
    -- Multiply by 100.0 first so the division is not integer division
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN previous_month IS NOT NULL THEN UserID END)
        / COUNT(DISTINCT UserID), 2) AS retention_rate
FROM Retention
GROUP BY current_month
ORDER BY current_month DESC;
Notes
• WITH Clauses are used to create two intermediate tables:
• MonthlyActiveUsers: Retrieves all distinct users who were active in each month between
July 2024 and December 2024.
• Retention: Joins the MonthlyActiveUsers table on itself to find users who were active in
both the current month and the previous month.
• COUNT(DISTINCT) is used to count distinct users to avoid duplicate counts for the same
user.
• The retention rate is calculated by dividing the number of retained users by the total
number of users active in the current month, multiplied by 100.
• DATE_FORMAT in MySQL and TO_CHAR in PostgreSQL are used to extract the
month and year from the LoginDate.
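• The result is ordered by the current month to show the most recent month first.
• Because the WHERE clause excludes logins before July 2024, July has no previous-month data to join to and always shows a retention rate of 0.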
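To make the retention arithmetic concrete, here is a small PostgreSQL sketch on hypothetical logins. It keeps the month as a date (month_start, an illustrative name) instead of a 'YYYY-MM' string, which is one alternative to the string round trip used in the solutions, not the author's method.

```sql
-- Hypothetical logins: users 1 and 2 in November, users 1 and 3 in December.
WITH UserActivity (UserID, LoginDate) AS (
    VALUES
        (1, DATE '2024-11-03'),
        (2, DATE '2024-11-10'),
        (1, DATE '2024-12-01'),
        (3, DATE '2024-12-05'),
        (1, DATE '2024-12-20')
),
MonthlyActiveUsers AS (
    -- One row per user per month, keyed by the first day of the month
    SELECT DISTINCT UserID, DATE_TRUNC('month', LoginDate)::date AS month_start
    FROM UserActivity
)
SELECT
    curr.month_start,
    COUNT(*) AS active_users,
    COUNT(prev.UserID) AS retained_users,   -- counts only rows with a previous-month match
    ROUND(100.0 * COUNT(prev.UserID) / COUNT(*), 2) AS retention_rate
FROM MonthlyActiveUsers curr
LEFT JOIN MonthlyActiveUsers prev
    ON prev.UserID = curr.UserID
   AND prev.month_start = (curr.month_start - INTERVAL '1 month')::date
GROUP BY curr.month_start
ORDER BY curr.month_start;
```

For December, two users are active and one of them (user 1) was also active in November, so the retention rate is 1 / 2 × 100 = 50.00. November has no earlier month in the sample and therefore shows 0.00.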
Explanation
You need to filter the data by country (USA, India, UK) and limit the result to the top 3
genres in each of these countries based on their popularity. The query will involve:
• Filtering by country.
• Grouping by genre and country.
• Ordering the results by the number of occurrences of each genre in the past quarter.
Learnings
• Use GROUP BY and COUNT() to aggregate data by genre and country.
• Use ROW_NUMBER() or RANK() to limit the result to top N genres per country.
• Understand how to filter data by date (the past quarter in this case).
• Use PARTITION BY with window functions for ranking.
Solutions
• - PostgreSQL solution
WITH GenreRank AS (
SELECT Genre, Country, COUNT(*) AS GenreCount,
ROW_NUMBER() OVER (PARTITION BY Country ORDER BY COUNT(*) DESC) AS rank
FROM ContentGenres
WHERE Country IN ('USA', 'India', 'UK')
AND WatchDate >= CURRENT_DATE - INTERVAL '3 months'
GROUP BY Genre, Country
)
SELECT Genre, Country, GenreCount
FROM GenreRank
WHERE rank <= 3;
• - MySQL solution
WITH GenreRank AS (
    SELECT Genre, Country, COUNT(*) AS GenreCount,
           -- RANK is a reserved word in MySQL 8.0, so the alias is genre_rank
           ROW_NUMBER() OVER (PARTITION BY Country ORDER BY COUNT(*) DESC) AS genre_rank
    FROM ContentGenres
    WHERE Country IN ('USA', 'India', 'UK')
      AND WatchDate >= CURDATE() - INTERVAL 3 MONTH
    GROUP BY Genre, Country
)
SELECT Genre, Country, GenreCount
FROM GenreRank
WHERE genre_rank <= 3;
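Both solutions read "the past quarter" as a rolling three-month window. If an interviewer means the previous calendar quarter instead, the date filter could look like the PostgreSQL sketch below. The ContentGenres columns (Genre, Country, WatchDate) are inferred from the queries above, and the two sample rows are hypothetical.

```sql
WITH ContentGenres (Genre, Country, WatchDate) AS (
    VALUES
        -- last day of the previous calendar quarter: included
        ('Drama',  'USA', (DATE_TRUNC('quarter', CURRENT_DATE) - INTERVAL '1 day')::date),
        -- today, which belongs to the current quarter: excluded
        ('Comedy', 'USA', CURRENT_DATE)
)
SELECT Genre, Country, COUNT(*) AS GenreCount
FROM ContentGenres
WHERE Country IN ('USA', 'India', 'UK')
  -- Half-open range: from the start of the previous quarter up to the start of the current one
  AND WatchDate >= DATE_TRUNC('quarter', CURRENT_DATE) - INTERVAL '3 months'
  AND WatchDate <  DATE_TRUNC('quarter', CURRENT_DATE)
GROUP BY Genre, Country;
```

Stating which interpretation you chose is usually worth a sentence in an interview, since the two windows can return different top genres.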
• Q.445
Question
Write a query to identify users who watched at least 5 episodes consecutively in a single
sitting in the past week from the ViewingPatterns table.
Explanation
You need to identify users who watched multiple episodes consecutively in a single sitting,
meaning the EndTime of one episode is very close to the StartTime of the next. The query
will:
• Filter data from the past week.
• Group by UserID and ContentID to identify consecutive episodes.
• Use window functions (e.g., LEAD or LAG) to determine consecutive episodes.
• Count how many consecutive episodes each user watched in a single sitting and return
users who watched at least 5.
Learnings
• Using window functions like LEAD() and LAG() to check for consecutive events.
• Filtering by date range (last week).
• Grouping data to aggregate viewing behavior.
• Detecting consecutive events based on time difference.
Solutions
• - PostgreSQL solution
WITH ConsecutiveEpisodes AS (
    SELECT UserID, ContentID, EpisodeID, StartTime, EndTime,
           LEAD(StartTime) OVER (PARTITION BY UserID, ContentID ORDER BY StartTime) AS NextStartTime
    FROM ViewingPatterns
    WHERE StartTime >= CURRENT_DATE - INTERVAL '7 days'
)
SELECT UserID, ContentID
FROM ConsecutiveEpisodes
-- A gap of at most 15 minutes (900 seconds) links an episode to the next one
WHERE EXTRACT(EPOCH FROM NextStartTime - EndTime) <= 900
GROUP BY UserID, ContentID
-- 5 episodes in a row produce 4 qualifying gaps
HAVING COUNT(*) >= 4;
• - MySQL solution
WITH ConsecutiveEpisodes AS (
    SELECT UserID, ContentID, EpisodeID, StartTime, EndTime,
           LEAD(StartTime) OVER (PARTITION BY UserID, ContentID ORDER BY StartTime) AS NextStartTime
    FROM ViewingPatterns
    WHERE StartTime >= CURDATE() - INTERVAL 7 DAY
)
SELECT UserID, ContentID
FROM ConsecutiveEpisodes
-- A gap of at most 15 minutes (900 seconds) links an episode to the next one
WHERE TIMESTAMPDIFF(SECOND, EndTime, NextStartTime) <= 900
GROUP BY UserID, ContentID
-- 5 episodes in a row produce 4 qualifying gaps
HAVING COUNT(*) >= 4;
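Counting short gaps per user and content does not guarantee that the episodes belong to one sitting, because short gaps from different days are added together. A stricter variant is the gaps-and-islands pattern, sketched here for PostgreSQL. The ViewingPatterns columns are taken from the solutions above, the sample rows are generated and hypothetical, and the names NewSitting and SittingID are illustrative.

```sql
WITH ViewingPatterns (UserID, ContentID, EpisodeID, StartTime, EndTime) AS (
    -- Hypothetical sample: user 1 watches five 40-minute episodes yesterday, 5 minutes apart
    SELECT 1, 10, n,
           CURRENT_DATE - INTERVAL '1 day' + (n - 1) * INTERVAL '45 minutes',
           CURRENT_DATE - INTERVAL '1 day' + (n - 1) * INTERVAL '45 minutes' + INTERVAL '40 minutes'
    FROM generate_series(1, 5) AS n
),
Ordered AS (
    SELECT UserID, ContentID, StartTime, EndTime,
           LAG(EndTime) OVER (PARTITION BY UserID, ContentID ORDER BY StartTime) AS PrevEndTime
    FROM ViewingPatterns
    WHERE StartTime >= CURRENT_DATE - INTERVAL '7 days'
),
Flagged AS (
    SELECT *,
           -- 1 marks the start of a new sitting: the first episode, or a break longer than 15 minutes
           CASE WHEN PrevEndTime IS NULL OR StartTime - PrevEndTime > INTERVAL '15 minutes'
                THEN 1 ELSE 0 END AS NewSitting
    FROM Ordered
),
Sittings AS (
    SELECT *,
           -- The running total of the flags gives every sitting its own number
           SUM(NewSitting) OVER (PARTITION BY UserID, ContentID ORDER BY StartTime) AS SittingID
    FROM Flagged
)
SELECT DISTINCT UserID
FROM Sittings
GROUP BY UserID, ContentID, SittingID
HAVING COUNT(*) >= 5;   -- at least 5 episodes inside one sitting
```

Each row of the final grouping is one sitting, so COUNT(*) is the number of episodes in that sitting rather than the number of gaps.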
• Q.446
Question
Write a query to calculate the percentage of users who canceled their subscription within 30
days of a billing cycle in the past year from the SubscriptionLogs table.
Explanation
To solve this, you need to:
• Filter the data to include only cancellations in the past year.
• Identify users who canceled within 30 days of a renewal action by comparing the
ActionDate of the cancellation to the ActionDate of the renewal.
• Calculate the percentage of these users relative to the total number of users who had a
renewal action within the past year.
Learnings
• Using JOIN to compare actions for the same user (e.g., renewal and cancellation).
• Filtering data by date range (the past year).
• Using date functions like DATEDIFF() or EXTRACT() to calculate date differences.
• Aggregating data to compute percentages.
Solutions
• - PostgreSQL solution
WITH RenewalAndCancellation AS (
SELECT r.UserID, r.ActionDate AS RenewalDate, c.ActionDate AS CancellationDate
FROM SubscriptionLogs r
Explanation
To solve this, you need to:
• Aggregate the data by UserID and ContentID to get the total WatchDuration for each
combination of user and content.
• The result should have UserID as rows, ContentID as columns, and the total
WatchDuration as the value in the matrix.
Learnings
• Using GROUP BY to aggregate data.
Solutions
• - PostgreSQL solution
SELECT
UserID,
SUM(CASE WHEN ContentID = 101 THEN WatchDuration ELSE 0 END) AS Content_101,
SUM(CASE WHEN ContentID = 102 THEN WatchDuration ELSE 0 END) AS Content_102,
SUM(CASE WHEN ContentID = 103 THEN WatchDuration ELSE 0 END) AS Content_103,
SUM(CASE WHEN ContentID = 104 THEN WatchDuration ELSE 0 END) AS Content_104
FROM ViewingHistory
GROUP BY UserID;
• - MySQL solution
SELECT
UserID,
SUM(CASE WHEN ContentID = 101 THEN WatchDuration ELSE 0 END) AS Content_101,
SUM(CASE WHEN ContentID = 102 THEN WatchDuration ELSE 0 END) AS Content_102,
SUM(CASE WHEN ContentID = 103 THEN WatchDuration ELSE 0 END) AS Content_103,
SUM(CASE WHEN ContentID = 104 THEN WatchDuration ELSE 0 END) AS Content_104
FROM ViewingHistory
GROUP BY UserID;
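The CASE-based pivot needs one hardcoded column per ContentID. When the list of content is not known in advance, one option in PostgreSQL is to aggregate in long format first and then fold each user's totals into a JSON object. This is a sketch on hypothetical rows, not a replacement for the solution above; WatchMatrixRow is an illustrative name.

```sql
-- Hypothetical rows standing in for ViewingHistory (only the columns used here).
WITH ViewingHistory (UserID, ContentID, WatchDuration) AS (
    VALUES (1, 101, 30), (1, 101, 15), (1, 102, 60), (2, 103, 45)
),
UserContentTotals AS (
    -- Long format: one row per (user, content) pair, no hardcoded IDs
    SELECT UserID, ContentID, SUM(WatchDuration) AS TotalWatchDuration
    FROM ViewingHistory
    GROUP BY UserID, ContentID
)
SELECT
    UserID,
    -- One JSON object per user, keyed by ContentID
    jsonb_object_agg(ContentID::text, TotalWatchDuration) AS WatchMatrixRow
FROM UserContentTotals
GROUP BY UserID
ORDER BY UserID;
```

For user 1 the object holds 45 under key "101" and 60 under key "102". Content the user never watched is simply absent, whereas the CASE pivot reports it as 0.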
• Q.448
Question
Write a query to calculate the accuracy of recommendations by country from the
Recommendations table, where UserID, RecommendedContentID, WatchStatus ('Watched',
'Skipped'), and Country are columns.
Explanation
To calculate the accuracy of recommendations, you need to:
• Group the data by Country.
• Count the number of Watched recommendations for each country.
• Calculate the accuracy as the ratio of Watched recommendations to the total number of
recommendations (Watched + Skipped) for each country.
Learnings
• Grouping data by Country to calculate aggregated statistics.
• Using COUNT() and SUM() to calculate the number of Watched recommendations.
• Calculating accuracy as a ratio of favorable outcomes (Watched) to total outcomes
(Watched + Skipped).
Solutions
• - PostgreSQL solution
SELECT
Country,
    ROUND(100.0 * SUM(CASE WHEN WatchStatus = 'Watched' THEN 1 ELSE 0 END) / COUNT(*), 2) AS Accuracy
FROM Recommendations
GROUP BY Country;
• - MySQL solution
SELECT
Country,
    ROUND(100.0 * SUM(CASE WHEN WatchStatus = 'Watched' THEN 1 ELSE 0 END) / COUNT(*), 2) AS Accuracy
FROM Recommendations
GROUP BY Country;
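The same ratio can be read as the average of a 0/1 flag, which some people find easier to explain in an interview. A minimal PostgreSQL sketch on hypothetical rows follows; the TotalRecommendations column is an addition for context and is not required by the question.

```sql
WITH Recommendations (UserID, RecommendedContentID, WatchStatus, Country) AS (
    VALUES
        (1, 101, 'Watched', 'USA'),
        (2, 102, 'Skipped', 'USA'),
        (3, 103, 'Watched', 'India'),
        (4, 104, 'Watched', 'USA')
)
SELECT
    Country,
    COUNT(*) AS TotalRecommendations,
    -- AVG of a 0/1 flag is the share of watched recommendations
    ROUND(100.0 * AVG(CASE WHEN WatchStatus = 'Watched' THEN 1 ELSE 0 END), 2) AS Accuracy
FROM Recommendations
GROUP BY Country
ORDER BY Country;
```

In this sample the USA has 2 watched out of 3 recommendations (66.67) and India has 1 out of 1 (100.00). Showing the row count next to the percentage helps flag countries where the accuracy rests on very few recommendations.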
• Q.449
Question
Write a query to find pairs of content with the same genre and similar average watch
durations (+/- 10 minutes) using tables ContentDetails and ViewingHistory.
Explanation
To solve this:
• Join the ContentDetails table with ViewingHistory on ContentID to get the genre and
watch duration.
• Calculate the average watch duration for each content.
• Find content pairs that belong to the same genre and have similar average watch durations
(within a 10-minute difference).
• Return the ContentID pairs along with their genre.
Learnings
• Using JOIN to combine data from multiple tables.
• Aggregating data with AVG() to calculate average watch durations.
• Using conditional filtering (e.g., ABS() function) to find similar durations.
• Self-joining on Genre with a.ContentID < b.ContentID so that each pair of content within a genre appears only once.
Solutions
• - PostgreSQL solution
WITH AverageDurations AS (
SELECT v.ContentID, c.Genre, AVG(v.WatchDuration) AS AvgWatchDuration
FROM ViewingHistory v
JOIN ContentDetails c ON v.ContentID = c.ContentID
GROUP BY v.ContentID, c.Genre
)
SELECT a.ContentID AS ContentID1, b.ContentID AS ContentID2, a.Genre, a.AvgWatchDuration
AS AvgDuration1, b.AvgWatchDuration AS AvgDuration2
FROM AverageDurations a
JOIN AverageDurations b ON a.Genre = b.Genre AND a.ContentID < b.ContentID
WHERE ABS(a.AvgWatchDuration - b.AvgWatchDuration) <= 10;
• - MySQL solution
WITH AverageDurations AS (
SELECT v.ContentID, c.Genre, AVG(v.WatchDuration) AS AvgWatchDuration
FROM ViewingHistory v
JOIN ContentDetails c ON v.ContentID = c.ContentID
GROUP BY v.ContentID, c.Genre
)
SELECT a.ContentID AS ContentID1, b.ContentID AS ContentID2, a.Genre, a.AvgWatchDuration
AS AvgDuration1, b.AvgWatchDuration AS AvgDuration2
FROM AverageDurations a
JOIN AverageDurations b ON a.Genre = b.Genre AND a.ContentID < b.ContentID
WHERE ABS(a.AvgWatchDuration - b.AvgWatchDuration) <= 10;
• Q.450
Question
Write a query to identify content that has become popular in India in the past 30 days but was
not in the top 100 a month ago from the ContentViewLogs table.
Explanation
To solve this:
• Get the content that has high view counts in India in the past 30 days.
• Check the view counts from a month ago for the same content.
• Compare the recent view counts with the previous month's view counts to determine if the
content wasn't in the top 100 a month ago.
• The result should list the content that has gained popularity in the last 30 days but wasn't
among the top 100 content a month ago.
Learnings
• Using JOIN or SUBQUERY to compare recent data with historical data.
• Aggregating data using GROUP BY to identify the top content based on view counts.
• Using date filtering to focus on content viewed in specific time ranges (last 30 days vs.
previous month).
• Using NOT IN or NOT EXISTS to exclude content from the previous month's top content.
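The last point can be sketched with NOT EXISTS and a ranking of the earlier window. This PostgreSQL example assumes ContentViewLogs has the columns ContentID, Country, ViewDate, and ViewCount (inferred from the question, since the schema is not shown here), and it uses hypothetical rows. PreviousRanked and PrevRank are illustrative names.

```sql
WITH ContentViewLogs (ContentID, Country, ViewDate, ViewCount) AS (
    VALUES
        (1, 'India', CURRENT_DATE - 45, 900),  -- already popular in the earlier window
        (1, 'India', CURRENT_DATE - 5,  800),
        (2, 'India', CURRENT_DATE - 5,  700)   -- no views in the earlier window
),
PreviousRanked AS (
    -- Rank content by total views in the window from 60 to 30 days ago
    SELECT ContentID,
           RANK() OVER (ORDER BY SUM(ViewCount) DESC) AS PrevRank
    FROM ContentViewLogs
    WHERE Country = 'India'
      AND ViewDate >= CURRENT_DATE - 60
      AND ViewDate <  CURRENT_DATE - 30
    GROUP BY ContentID
)
SELECT r.ContentID, SUM(r.ViewCount) AS TotalViewCount
FROM ContentViewLogs r
WHERE r.Country = 'India'
  AND r.ViewDate >= CURRENT_DATE - 30
  AND NOT EXISTS (
        SELECT 1
        FROM PreviousRanked p
        WHERE p.ContentID = r.ContentID
          AND p.PrevRank <= 100
  )
GROUP BY r.ContentID
ORDER BY TotalViewCount DESC;
```

Only content 2 is returned for this sample, because content 1 ranked first in the earlier window. Unlike NOT IN, NOT EXISTS is not affected by NULL values in the compared column, which makes it the safer default.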
Solutions
• - PostgreSQL solution
WITH RecentViews AS (
SELECT ContentID, SUM(ViewCount) AS TotalViewCount
FROM ContentViewLogs
WHERE Country = 'India' AND ViewDate >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY ContentID
),
PreviousTopViews AS (
SELECT ContentID
FROM ContentViewLogs
    WHERE Country = 'India'
      -- Half-open range so the day 30 days ago is not counted in both windows
      AND ViewDate >= CURRENT_DATE - INTERVAL '60 days'
      AND ViewDate < CURRENT_DATE - INTERVAL '30 days'
GROUP BY ContentID
ORDER BY SUM(ViewCount) DESC
LIMIT 100
)
SELECT r.ContentID, r.TotalViewCount
FROM RecentViews r
LEFT JOIN PreviousTopViews p ON r.ContentID = p.ContentID
WHERE p.ContentID IS NULL
ORDER BY r.TotalViewCount DESC;
• - MySQL solution
WITH RecentViews AS (
SELECT ContentID, SUM(ViewCount) AS TotalViewCount
FROM ContentViewLogs
WHERE Country = 'India' AND ViewDate >= CURDATE() - INTERVAL 30 DAY
GROUP BY ContentID
),
PreviousTopViews AS (
SELECT ContentID
FROM ContentViewLogs
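    WHERE Country = 'India'
      -- Half-open range so the day 30 days ago is not counted in both windows
      AND ViewDate >= CURDATE() - INTERVAL 60 DAY
      AND ViewDate < CURDATE() - INTERVAL 30 DAY
    GROUP BY ContentID
    ORDER BY SUM(ViewCount) DESC
    LIMIT 100
)
SELECT r.ContentID, r.TotalViewCount
FROM RecentViews r
LEFT JOIN PreviousTopViews p ON r.ContentID = p.ContentID
WHERE p.ContentID IS NULL
ORDER BY r.TotalViewCount DESC;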
Explanation
To solve this:
• Calculate the average watch time for each user per genre.
• Group users based on their watch patterns. A common approach to clustering in SQL
involves using basic similarity measures or aggregations (e.g., using CASE WHEN or GROUP BY
for clustering).
• For more sophisticated clustering (e.g., k-means), a database might not be ideal; however,
this can be approximated with categorical grouping based on watch time thresholds.
This query will return users grouped by their average watch time across different genres,
which can be treated as a simple form of clustering.
Learnings
• Using GROUP BY and AVG() to aggregate watch time by user and genre.
• Understanding how to group users based on their watch patterns.
• Applying conditional logic to classify users into broad "clusters" based on watch time.
Solutions
• - PostgreSQL solution
WITH AvgWatchTime AS (
SELECT UserID, Genre, AVG(TotalWatchTime) AS AvgWatchTime
FROM UserGenres
GROUP BY UserID, Genre
)
SELECT UserID,
CASE
WHEN AVG(AvgWatchTime) >= 150 THEN 'Heavy Watchers'
        WHEN AVG(AvgWatchTime) >= 100 THEN 'Moderate Watchers'  -- covers 100 up to, but not including, 150
ELSE 'Light Watchers'
END AS WatchCluster
FROM AvgWatchTime
GROUP BY UserID
ORDER BY UserID;
• - MySQL solution
WITH AvgWatchTime AS (
SELECT UserID, Genre, AVG(TotalWatchTime) AS AvgWatchTime
FROM UserGenres
GROUP BY UserID, Genre
)
SELECT UserID,
CASE
WHEN AVG(AvgWatchTime) >= 150 THEN 'Heavy Watchers'
        WHEN AVG(AvgWatchTime) >= 100 THEN 'Moderate Watchers'  -- covers 100 up to, but not including, 150
ELSE 'Light Watchers'
END AS WatchCluster
FROM AvgWatchTime
GROUP BY UserID
ORDER BY UserID;
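Fixed thresholds such as 100 and 150 have to be chosen by hand. A data-driven alternative is to split users into equal-sized groups with NTILE(). The PostgreSQL sketch below uses hypothetical UserGenres rows and averages TotalWatchTime directly per user; WatchTercile is an illustrative name, and three groups is an arbitrary choice.

```sql
WITH UserGenres (UserID, Genre, TotalWatchTime) AS (
    VALUES
        (1, 'Action', 200), (1, 'Drama', 180),
        (2, 'Action', 120), (2, 'Comedy', 110),
        (3, 'Drama', 40),   (3, 'Comedy', 20)
),
UserAvg AS (
    SELECT UserID, AVG(TotalWatchTime) AS AvgWatchTime
    FROM UserGenres
    GROUP BY UserID
)
SELECT
    UserID,
    AvgWatchTime,
    -- NTILE(3) splits users into three equal-sized groups; 1 holds the highest averages
    NTILE(3) OVER (ORDER BY AvgWatchTime DESC) AS WatchTercile
FROM UserAvg
ORDER BY WatchTercile, UserID;
```

With these rows, user 1 (average 190) lands in group 1, user 2 (115) in group 2, and user 3 (30) in group 3. The trade-off is that group boundaries move with the data, so the labels are relative rather than absolute.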
• Q.452
Question
Write a query to calculate the total revenue generated per region over the past 6 months from
the Revenue table.
Explanation
To solve this:
• Filter the Revenue table to include only data from the past 6 months using the
PaymentDate.
• Group the data by Region to calculate the total revenue per region.
• Sum the Amount for each Region and return the result.
Learnings
• Filtering data by date using WHERE clause and date functions.
• Using SUM() to calculate the total revenue for each region.
• Grouping data by Region to get regional insights.
Solutions
• - PostgreSQL solution
SELECT Region,
SUM(Amount) AS TotalRevenue
FROM Revenue
WHERE PaymentDate >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY Region
ORDER BY TotalRevenue DESC;
• - MySQL solution
SELECT Region,
SUM(Amount) AS TotalRevenue
FROM Revenue
WHERE PaymentDate >= CURDATE() - INTERVAL 6 MONTH
GROUP BY Region
ORDER BY TotalRevenue DESC;
• Q.453
Question
Write a query to find the percentage of users upgrading to higher-tier plans in the last year
from the SubscriptionChanges table, which has columns UserID, OldPlan, NewPlan, and
ChangeDate.
Explanation
To solve this:
• Filter the SubscriptionChanges table to include only records from the last year using
ChangeDate.
• Identify users who have upgraded to a higher-tier plan by comparing OldPlan and
NewPlan.
• Calculate the percentage of users who upgraded, relative to the total number of users who
had any plan change in the last year.
INSERT INTO SubscriptionChanges (UserID, OldPlan, NewPlan, ChangeDate)
VALUES
(1, 'Basic', 'Premium', '2024-06-15'),
(2, 'Standard', 'Premium', '2024-07-01'),
(3, 'Premium', 'Standard', '2024-08-10'),
(4, 'Basic', 'Standard', '2024-09-20'),
(5, 'Premium', 'Premium', '2024-10-05'),
(6, 'Basic', 'Premium', '2023-11-18'),
(7, 'Standard', 'Premium', '2023-12-12'),
(8, 'Standard', 'Basic', '2024-12-15');
Learnings
• Using CASE expressions to identify upgrades.
• Calculating the percentage by comparing counts of upgraded users versus total users.
• Using date functions (CURRENT_DATE, INTERVAL) to filter records for the past year.
Solutions
• - PostgreSQL solution
WITH UserChanges AS (
SELECT UserID, OldPlan, NewPlan
FROM SubscriptionChanges
WHERE ChangeDate >= CURRENT_DATE - INTERVAL '1 year'
)
SELECT
ROUND(100.0 * COUNT(DISTINCT CASE WHEN (OldPlan = 'Basic' AND NewPlan = 'Premium')
OR (OldPlan = 'Standard' AND NewPlan = 'Premium')
                                      OR (OldPlan = 'Basic' AND NewPlan = 'Standard')
THEN UserID END)
/ COUNT(DISTINCT UserID), 2) AS UpgradePercentage
FROM UserChanges;
• - MySQL solution
WITH UserChanges AS (
SELECT UserID, OldPlan, NewPlan
FROM SubscriptionChanges
WHERE ChangeDate >= CURDATE() - INTERVAL 1 YEAR
)
SELECT
ROUND(100.0 * COUNT(DISTINCT CASE WHEN (OldPlan = 'Basic' AND NewPlan = 'Premium')
OR (OldPlan = 'Standard' AND NewPlan = 'Premium')
                                      OR (OldPlan = 'Basic' AND NewPlan = 'Standard')
THEN UserID END)
/ COUNT(DISTINCT UserID), 2) AS UpgradePercentage
FROM UserChanges;
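Listing every upgrade pair by hand becomes error-prone as plans are added. One alternative is to map each plan to a numeric tier and compare the tiers. The PostgreSQL sketch below assumes the ordering Basic < Standard < Premium; the PlanTiers mapping and the sample rows are hypothetical.

```sql
WITH SubscriptionChanges (UserID, OldPlan, NewPlan, ChangeDate) AS (
    VALUES
        (1, 'Basic',    'Premium',  CURRENT_DATE - 30),
        (2, 'Premium',  'Standard', CURRENT_DATE - 60),
        (3, 'Basic',    'Standard', CURRENT_DATE - 90),
        (4, 'Standard', 'Basic',    CURRENT_DATE - 10)
),
PlanTiers (Plan, Tier) AS (
    -- Assumed ordering: Basic < Standard < Premium
    VALUES ('Basic', 1), ('Standard', 2), ('Premium', 3)
)
SELECT
    -- An upgrade is any change whose new tier is higher than the old tier
    ROUND(100.0 * COUNT(DISTINCT CASE WHEN nt.Tier > ot.Tier THEN sc.UserID END)
          / COUNT(DISTINCT sc.UserID), 2) AS UpgradePercentage
FROM SubscriptionChanges sc
JOIN PlanTiers ot ON ot.Plan = sc.OldPlan
JOIN PlanTiers nt ON nt.Plan = sc.NewPlan
WHERE sc.ChangeDate >= CURRENT_DATE - INTERVAL '1 year';
```

Users 1 and 3 upgraded out of four users with a change, so this sample returns 50.00. Adding a new plan then only requires one more row in the tier mapping.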
• Q.454
Question
Write a query to calculate the average revenue per user for those who used discount codes
versus those who didn’t from the DiscountUsage table, which has columns UserID,
DiscountCode, UsageDate, and Plan.
Explanation
To solve this:
• Identify users who used discount codes by checking for the presence of DiscountCode.
• Separate the users into two groups: those who used a discount code and those who did not.
• Calculate the average revenue per user for both groups.
• Join with the SubscriptionPlans table to map Plan to a revenue value (assuming the
Plan corresponds to specific revenue).
Learnings
• Using CASE statements to differentiate users who used a discount code and those who
didn’t.
• Calculating average revenue using AVG().
• Joins to map plans to revenue values.
Solutions
• - PostgreSQL solution
WITH RevenueData AS (
SELECT d.UserID,
d.DiscountCode,
sp.Revenue
FROM DiscountUsage d
JOIN SubscriptionPlans sp ON d.Plan = sp.Plan
)
SELECT
CASE
WHEN DiscountCode IS NOT NULL THEN 'With Discount'
ELSE 'Without Discount'
END AS DiscountGroup,
AVG(Revenue) AS AvgRevenuePerUser
FROM RevenueData
GROUP BY DiscountGroup;
• - MySQL solution
WITH RevenueData AS (
SELECT d.UserID,
d.DiscountCode,
sp.Revenue
FROM DiscountUsage d
JOIN SubscriptionPlans sp ON d.Plan = sp.Plan
)
SELECT
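    CASE
        WHEN DiscountCode IS NOT NULL THEN 'With Discount'
        ELSE 'Without Discount'
    END AS DiscountGroup,
    AVG(Revenue) AS AvgRevenuePerUser
FROM RevenueData
GROUP BY DiscountGroup;
• Q.455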
Question
Write a query to calculate the average lifetime value (LTV) of a Netflix subscriber using
tables UserRevenue (UserID, SubscriptionAmount, PaymentDate) and UserRetention
(UserID, RetentionDays).
Explanation
To calculate the average LTV:
• First, calculate the total revenue generated by each user. This is done by summing up the
SubscriptionAmount from UserRevenue for each UserID.
• Multiply the total revenue by the average retention period for that user from the
UserRetention table (in days).
• Calculate the average LTV across all users.
Learnings
• Using SUM() to aggregate subscription amounts per user.
• Calculating each user's LTV by multiplying their total revenue by RetentionDays.
• Using AVG() to find the average LTV across all users.
Solutions
• - PostgreSQL solution
WITH UserTotalRevenue AS (
SELECT ur.UserID,
SUM(ur.SubscriptionAmount) AS TotalRevenue,
uret.RetentionDays
FROM UserRevenue ur
JOIN UserRetention uret ON ur.UserID = uret.UserID
GROUP BY ur.UserID, uret.RetentionDays
)
SELECT AVG(TotalRevenue * RetentionDays) AS AverageLTV
FROM UserTotalRevenue;
• - MySQL solution
WITH UserTotalRevenue AS (
SELECT ur.UserID,
SUM(ur.SubscriptionAmount) AS TotalRevenue,
uret.RetentionDays
FROM UserRevenue ur
JOIN UserRetention uret ON ur.UserID = uret.UserID
GROUP BY ur.UserID, uret.RetentionDays
)
SELECT AVG(TotalRevenue * RetentionDays) AS AverageLTV
FROM UserTotalRevenue;
• Q.456
Question
Write a query to calculate the percentage of churned users who resubscribed within 6 months
from the ChurnLogs table, which has columns UserID, ChurnDate, ResubscriptionDate,
and Region.
Explanation
To solve this:
• Identify the users who have a ResubscriptionDate within 6 months of their ChurnDate.
• Calculate the total number of users who churned (i.e., the total number of unique UserIDs
with a ChurnDate).
• Calculate the number of users who churned and then resubscribed within 6 months.
• Compute the percentage of users who resubscribed within 6 months relative to the total
number of churned users.
Learnings
• Using DATEDIFF() (or equivalent) to calculate the difference in days between ChurnDate
and ResubscriptionDate.
• Identifying users who resubscribed within 6 months of churn using conditional logic.
• Calculating percentage values using COUNT() for the numerator and denominator.
Solutions
• - PostgreSQL solution
WITH ChurnedUsers AS (
SELECT UserID,
CASE
WHEN ResubscriptionDate IS NOT NULL AND ResubscriptionDate <= ChurnDate +
INTERVAL '6 months' THEN 1
ELSE 0
END AS ResubscribedWithin6Months
FROM ChurnLogs
)
SELECT
    ROUND(100.0 * COUNT(ResubscribedWithin6Months) FILTER (WHERE ResubscribedWithin6Months = 1)
        / COUNT(UserID), 2) AS ResubscriptionPercentage
FROM ChurnedUsers;
• - MySQL solution
WITH ChurnedUsers AS (
SELECT UserID,
CASE
            -- 6 months is approximated as 180 days here
            WHEN ResubscriptionDate IS NOT NULL AND DATEDIFF(ResubscriptionDate, ChurnDate) <= 180 THEN 1
ELSE 0
END AS ResubscribedWithin6Months
FROM ChurnLogs
)
SELECT
ROUND(100.0 * COUNT(CASE WHEN ResubscribedWithin6Months = 1 THEN 1 END)
/ COUNT(UserID), 2) AS ResubscriptionPercentage
FROM ChurnedUsers;
• Q.457
Question
Write a query to find content licenses expiring in the next 30 days and their total view count.
Explanation
The task is to identify content licenses that are expiring in the next 30 days. The query should
join the Licenses table with a Views table (assuming it exists) to calculate the total view
count for each content item. Use the CURRENT_DATE function to filter the licenses expiring
within the next 30 days.
-- Views
INSERT INTO Views (ViewID, ContentID, ViewCount, ViewDate)
VALUES
(1, 1, 50, '2024-12-01'),
(2, 1, 30, '2024-12-02'),
(3, 2, 100, '2024-12-01'),
(4, 3, 150, '2024-12-05');
Learnings
• Using CURRENT_DATE to filter based on a relative date range
• Joining tables to combine license data with view counts
• Using SUM() to aggregate view counts for each content
Solutions
• - PostgreSQL solution
SELECT l.ContentID,
COALESCE(SUM(v.ViewCount), 0) AS TotalViewCount
FROM Licenses l
LEFT JOIN Views v ON l.ContentID = v.ContentID
WHERE l.LicenseEndDate BETWEEN CURRENT_DATE AND CURRENT_DATE + INTERVAL '30 days'
GROUP BY l.ContentID;
• - MySQL solution
SELECT l.ContentID,
COALESCE(SUM(v.ViewCount), 0) AS TotalViewCount
FROM Licenses l
LEFT JOIN Views v ON l.ContentID = v.ContentID
WHERE l.LicenseEndDate BETWEEN CURDATE() AND CURDATE() + INTERVAL 30 DAY
GROUP BY l.ContentID;
• Q.458
Question
Write a query to find the top-performing content in each genre based on a combined metric of
total views and average watch duration.
Explanation
The goal is to identify the top-performing content in each genre, where the performance is
determined by a combination of total views and average watch duration. The query should
group the data by genre and calculate a weighted or combined score based on total views and
average watch duration, then return the content with the highest score in each genre.
• - Datasets
-- ContentPerformance
INSERT INTO ContentPerformance (ContentID, Genre, TotalViews, WatchDuration)
VALUES
(1, 'Action', 1000, 45.5),
(2, 'Action', 1200, 40.0),
(3, 'Comedy', 800, 25.0),
(4, 'Comedy', 950, 30.0),
(5, 'Drama', 600, 50.0),
(6, 'Drama', 1100, 48.0);
Learnings
• Grouping data by genre
• Calculating combined performance metric (e.g., TotalViews * WatchDuration)
• Using aggregation functions (SUM(), AVG()) for performance metrics
• Using ROW_NUMBER() or similar window functions to rank content per genre
Solutions
• - PostgreSQL solution
WITH RankedContent AS (
SELECT ContentID,
Genre,
TotalViews,
WatchDuration,
(TotalViews * WatchDuration) AS PerformanceScore,
           ROW_NUMBER() OVER (PARTITION BY Genre ORDER BY (TotalViews * WatchDuration) DESC) AS Rank
FROM ContentPerformance
)
SELECT ContentID, Genre, TotalViews, WatchDuration
FROM RankedContent
WHERE Rank = 1;
• - MySQL solution
WITH RankedContent AS (
    SELECT ContentID,
           Genre,
           TotalViews,
           WatchDuration,
           (TotalViews * WatchDuration) AS PerformanceScore,
           -- RANK is a reserved word in MySQL 8.0, so the alias is ContentRank
           ROW_NUMBER() OVER (PARTITION BY Genre ORDER BY (TotalViews * WatchDuration) DESC) AS ContentRank
    FROM ContentPerformance
)
SELECT ContentID, Genre, TotalViews, WatchDuration
FROM RankedContent
WHERE ContentRank = 1;
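The Explanation mentions a "weighted or combined score". Multiplying the two metrics lets the one with the larger numeric range dominate, so a possible alternative is to scale each metric by its genre maximum and then apply weights. The PostgreSQL sketch below uses the sample rows from the Datasets section; the 0.5 / 0.5 weights and the names WeightedScore and ScoreRank are assumptions for illustration.

```sql
WITH ContentPerformance (ContentID, Genre, TotalViews, WatchDuration) AS (
    VALUES
        (1, 'Action', 1000, 45.5),
        (2, 'Action', 1200, 40.0),
        (3, 'Comedy',  800, 25.0),
        (4, 'Comedy',  950, 30.0),
        (5, 'Drama',   600, 50.0),
        (6, 'Drama',  1100, 48.0)
),
Scored AS (
    SELECT *,
           -- Each metric is divided by its genre maximum, so both terms are on the same scale
           0.5 * TotalViews    / MAX(TotalViews)    OVER (PARTITION BY Genre)
         + 0.5 * WatchDuration / MAX(WatchDuration) OVER (PARTITION BY Genre) AS WeightedScore
    FROM ContentPerformance
)
SELECT ContentID, Genre, WeightedScore
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Genre ORDER BY WeightedScore DESC) AS ScoreRank
    FROM Scored
) ranked
WHERE ScoreRank = 1
ORDER BY Genre;
```

On this sample both scoring methods pick the same content (2, 4, and 6), but the weighted form makes the relative importance of views and watch duration an explicit, adjustable choice. It assumes every genre has at least one row with positive views and duration.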
• Q.459
Question
Write a SQL query to identify the top 10 VIP users for Netflix, based on the most watched
hours of content in the last month.
Explanation
The goal is to identify the top 10 users with the highest total watch time in the past month.
This involves:
• Joining the users table with the watching_activity table.
• Filtering the activity data for the past month.
• Summing the hours_watched for each user.
-- Watching Activity
INSERT INTO watching_activity (activity_id, user_id, date_time, show_id, hours_watched)
VALUES
(10355, 435, '2022-02-09 12:30:00', 12001, 2.5),
(14872, 278, '2022-02-10 14:15:00', 17285, 1.2),
(12293, 529, '2022-02-18 21:10:00', 12001, 4.3),
(16352, 692, '2022-02-20 19:00:00', 17285, 3.7),
(17485, 729, '2022-02-25 16:45:00', 17285, 1.9);
Learnings
• Using JOIN to combine data from multiple tables.
• Filtering data based on date ranges (last month).
• Using aggregation (SUM()) to calculate total hours watched.
• Sorting the result in descending order and limiting the number of results.
Solutions
• - PostgreSQL solution
SELECT users.user_id, SUM(watching_activity.hours_watched) AS total_hours_watched
FROM users
JOIN watching_activity ON users.user_id = watching_activity.user_id
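WHERE watching_activity.date_time >= date_trunc('month', CURRENT_DATE - INTERVAL '1 month')
  -- Half-open range: excludes activity at the first instant of the current month
  AND watching_activity.date_time < date_trunc('month', CURRENT_DATE)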
GROUP BY users.user_id
ORDER BY total_hours_watched DESC
LIMIT 10;
• - MySQL solution
SELECT users.user_id, SUM(watching_activity.hours_watched) AS total_hours_watched
FROM users
JOIN watching_activity ON users.user_id = watching_activity.user_id
WHERE watching_activity.date_time BETWEEN CURDATE() - INTERVAL 1 MONTH AND CURDATE()
GROUP BY users.user_id
ORDER BY total_hours_watched DESC
LIMIT 10;
• Q.460
Question
Write a SQL query to calculate the average rating for each show within a given month. The
results should be ordered by month and then by average rating in descending order.
Explanation
The task is to calculate the average rating for each show in each month. This involves:
• Extracting the month from the review_date.
• Grouping the data by show_id and month.
• Calculating the average rating (AVG(stars)).
• Sorting the results first by month and then by the average rating in descending order.
Learnings
• Using EXTRACT(MONTH FROM date) to extract the month part from a TIMESTAMP or DATE.
• Aggregating data with AVG() to calculate the average rating.
• Grouping data by multiple columns (show_id and month).
• Sorting results using ORDER BY.
Solutions
• - PostgreSQL solution
SELECT
EXTRACT(MONTH FROM review_date) AS mth,
show_id,
AVG(stars) AS avg_stars
FROM
show_reviews
GROUP BY
mth,
show_id
ORDER BY
mth,
avg_stars DESC;
• - MySQL solution
SELECT
MONTH(review_date) AS mth,
show_id,
AVG(stars) AS avg_stars
FROM
show_reviews
GROUP BY
mth,
show_id
ORDER BY
mth,
avg_stars DESC;
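One caveat with grouping on the month number alone: reviews from the same calendar month in different years land in the same group. A minimal PostgreSQL sketch, assuming show_reviews may span more than one year, groups on the truncated month instead (the alias review_month is illustrative and not part of the original solution):

```sql
-- Sketch (PostgreSQL): keep year and month together so that, for example,
-- June 2021 and June 2022 are reported as separate groups.
SELECT
    date_trunc('month', review_date) AS review_month,
    show_id,
    AVG(stars) AS avg_stars
FROM show_reviews
GROUP BY date_trunc('month', review_date), show_id
ORDER BY review_month, avg_stars DESC;
```

In MySQL, DATE_FORMAT(review_date, '%Y-%m') can play the same role as date_trunc here.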
Uber
• Q.461
Question
Write a query to find the total number of rides each user has taken.
Explanation
This query should count the number of rides per user. It involves using the COUNT() function
to aggregate the rides and grouping by user_id.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
user_id INT,
ride_date TIMESTAMP,
ride_distance DECIMAL(10, 2),
fare DECIMAL(10, 2)
);
• - Datasets
-- Rides
INSERT INTO rides (ride_id, user_id, ride_date, ride_distance, fare)
VALUES
(1, 101, '2023-01-01 08:00:00', 5.0, 15.50),
(2, 102, '2023-01-01 09:00:00', 10.0, 25.00),
(3, 101, '2023-01-02 10:00:00', 3.5, 10.00),
(4, 103, '2023-01-02 11:00:00', 7.0, 18.50),
(5, 101, '2023-01-03 12:00:00', 4.0, 12.00);
Learnings
• Using COUNT() for aggregation.
• Grouping by user_id to get counts per user.
• Basic JOIN operations (if needed).
Solutions
• - PostgreSQL solution
SELECT user_id, COUNT(ride_id) AS total_rides
FROM rides
GROUP BY user_id;
• - MySQL solution
SELECT user_id, COUNT(ride_id) AS total_rides
FROM rides
GROUP BY user_id;
• Q.462
Question
Write a query to find the total number of active users who have taken at least one ride in the
last 7 days.
Explanation
This query should filter users who have taken at least one ride in the last 7 days. It involves
using COUNT(DISTINCT user_id) to get the number of unique active users.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
user_id INT,
ride_date TIMESTAMP,
ride_distance DECIMAL(10, 2),
fare DECIMAL(10, 2)
);
• - Datasets
-- Rides
INSERT INTO rides (ride_id, user_id, ride_date, ride_distance, fare)
VALUES
(1, 101, '2023-01-01 08:00:00', 5.0, 15.50),
(2, 102, '2023-01-02 09:00:00', 10.0, 25.00),
(3, 103, '2023-01-07 10:00:00', 3.5, 12.00),
(4, 101, '2023-01-08 12:00:00', 7.0, 18.50),
(5, 104, '2023-01-09 14:00:00', 4.0, 14.00);
Learnings
• Using date functions to filter by the last 7 days (NOW() - INTERVAL '7 days').
• Aggregating with COUNT(DISTINCT user_id) for unique active users.
• Basic JOIN operations (if needed).
Solutions
• - PostgreSQL solution
SELECT COUNT(DISTINCT user_id) AS active_users
FROM rides
WHERE ride_date >= NOW() - INTERVAL '7 days';
• - MySQL solution
SELECT COUNT(DISTINCT user_id) AS active_users
FROM rides
WHERE ride_date >= NOW() - INTERVAL 7 DAY;
• Q.463
Question
Write a query to find the average fare for Uber rides taken during each hour of the day.
Explanation
This query should calculate the average fare per hour of the day, which requires extracting
the hour from the ride_date and grouping by that hour.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
user_id INT,
ride_date TIMESTAMP,
ride_distance DECIMAL(10, 2),
fare DECIMAL(10, 2)
);
• - Datasets
-- Rides
INSERT INTO rides (ride_id, user_id, ride_date, ride_distance, fare)
VALUES
(1, 101, '2023-01-01 08:15:00', 5.0, 15.50),
(2, 102, '2023-01-01 09:30:00', 10.0, 25.00),
Learnings
• Using EXTRACT(HOUR FROM ride_date) to get the hour part.
• Aggregating with AVG(fare) for average fare.
• Grouping by hour for analysis.
Solutions
• - PostgreSQL solution
SELECT EXTRACT(HOUR FROM ride_date) AS hour_of_day, AVG(fare) AS average_fare
FROM rides
GROUP BY hour_of_day
ORDER BY hour_of_day;
• - MySQL solution
SELECT HOUR(ride_date) AS hour_of_day, AVG(fare) AS average_fare
FROM rides
GROUP BY hour_of_day
ORDER BY hour_of_day;
• Q.464
Question
Write a query to calculate the total earnings for each driver in the past month. The total
earnings are the sum of the fare for all rides completed by each driver.
Explanation
This query involves filtering rides from the last month, grouping by driver_id, and
calculating the total earnings using the SUM() function.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
user_id INT,
driver_id INT,
ride_date TIMESTAMP,
fare DECIMAL(10, 2)
);
• - Datasets
-- Rides
INSERT INTO rides (ride_id, user_id, driver_id, ride_date, fare)
VALUES
(1, 101, 201, '2023-02-10 08:00:00', 15.50),
(2, 102, 202, '2023-02-15 09:00:00', 25.00),
(3, 103, 201, '2023-02-20 10:00:00', 18.50),
(4, 104, 201, '2023-03-01 11:00:00', 12.00),
(5, 105, 202, '2023-02-25 12:00:00', 14.00);
Learnings
• Using SUM() for aggregation.
• Filtering data for the last month.
• Grouping by driver_id.
Solutions
• - PostgreSQL solution
SELECT driver_id, SUM(fare) AS total_earnings
FROM rides
WHERE ride_date >= NOW() - INTERVAL '1 month'
GROUP BY driver_id;
Learnings
• Using SUM() to calculate the total distance traveled by each user.
• Sorting the results in descending order.
• Using LIMIT to get the top 5 users.
Solutions
• - PostgreSQL solution
SELECT user_id, SUM(ride_distance) AS total_distance
FROM rides
GROUP BY user_id
ORDER BY total_distance DESC
LIMIT 5;
• - MySQL solution
SELECT user_id, SUM(ride_distance) AS total_distance
FROM rides
GROUP BY user_id
ORDER BY total_distance DESC
LIMIT 5;
• Q.466
Question
Write a query to find the number of Uber rides taken on each day of the week. Display the
day of the week and the total number of rides taken on that day.
Explanation
This query should extract the day of the week from the ride_date and then count the number
of rides taken on each day of the week. The EXTRACT() function can be used to get the day of
the week.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
user_id INT,
ride_date TIMESTAMP,
ride_distance DECIMAL(10, 2),
fare DECIMAL(10, 2)
);
• - Datasets
-- Rides
INSERT INTO rides (ride_id, user_id, ride_date, ride_distance, fare)
VALUES
(1, 101, '2023-02-01 08:00:00', 5.0, 15.50),
(2, 102, '2023-02-02 09:00:00', 10.0, 25.00),
(3, 103, '2023-02-03 10:00:00', 7.0, 18.50),
(4, 104, '2023-02-03 11:00:00', 3.5, 12.00),
(5, 101, '2023-02-04 12:00:00', 6.0, 16.00),
(6, 105, '2023-02-05 13:00:00', 4.0, 14.00);
Learnings
• Using EXTRACT(DOW FROM ride_date) in PostgreSQL (0 = Sunday through 6 = Saturday) or
DAYOFWEEK(ride_date) in MySQL (1 = Sunday through 7 = Saturday) to get the day of the week.
• Aggregating using COUNT() to find the number of rides per day.
• Grouping by day of the week.
Solutions
• - PostgreSQL solution
SELECT EXTRACT(DOW FROM ride_date) AS day_of_week, COUNT(ride_id) AS total_rides
FROM rides
GROUP BY day_of_week
ORDER BY day_of_week;
• - MySQL solution
SELECT DAYOFWEEK(ride_date) AS day_of_week, COUNT(ride_id) AS total_rides
FROM rides
GROUP BY day_of_week
ORDER BY day_of_week;
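Numeric day codes are hard to read in a report. A hedged PostgreSQL sketch, assuming the same rides table, adds the day name with TO_CHAR (the alias day_name is illustrative):

```sql
-- Sketch (PostgreSQL): show the day name next to the numeric day of week.
-- The 'FMDay' template returns the full day name without trailing padding.
SELECT
    EXTRACT(DOW FROM ride_date) AS day_of_week,
    TO_CHAR(ride_date, 'FMDay') AS day_name,
    COUNT(ride_id) AS total_rides
FROM rides
GROUP BY EXTRACT(DOW FROM ride_date), TO_CHAR(ride_date, 'FMDay')
ORDER BY day_of_week;
```

In MySQL, DAYNAME(ride_date) returns the day name and can be grouped on in the same way.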
• Q.467
Question
Assume you are given the table below on Uber transactions made by users. Write a query to
obtain the third transaction of every user. Output the user id, spend, and transaction date.
Explanation
To get the third transaction for each user, we need to:
• Rank the transactions of each user based on the transaction_date.
• Filter to only include the third transaction for each user.
We can use the ROW_NUMBER() window function to rank transactions for each user and then keep
only the rows that are ranked 3.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
user_id INT,
spend DECIMAL(10, 2),
transaction_date TIMESTAMP
);
• - Datasets
-- Transactions
INSERT INTO transactions (user_id, spend, transaction_date)
VALUES
(111, 100.50, '2022-01-08 12:00:00'),
(111, 55.00, '2022-01-10 12:00:00'),
(121, 36.00, '2022-01-18 12:00:00'),
(145, 24.99, '2022-01-26 12:00:00'),
(111, 89.60, '2022-02-05 12:00:00');
Learnings
• Using window functions (ROW_NUMBER()) to rank data within partitions.
• Filtering data based on the row number to get specific entries, such as the third transaction.
Solutions
• - PostgreSQL solution
WITH ranked_transactions AS (
SELECT
user_id,
spend,
transaction_date,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY transaction_date) AS transaction_rank
FROM transactions
)
SELECT user_id, spend, transaction_date
FROM ranked_transactions
WHERE transaction_rank = 3;
• - MySQL solution
WITH ranked_transactions AS (
SELECT
user_id,
spend,
transaction_date,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY transaction_date) AS transaction_rank
FROM transactions
)
SELECT user_id, spend, transaction_date
FROM ranked_transactions
WHERE transaction_rank = 3;
Key Points:
• The ROW_NUMBER() function is used to assign a sequential rank to each transaction for each
user.
• PARTITION BY user_id ensures the ranking is reset for each user.
• WHERE transaction_rank = 3 filters for the third transaction in the sequence.
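ROW_NUMBER() orders tied rows arbitrarily, so if a user has two transactions with the same transaction_date, the row labelled "third" can vary between runs. A minimal sketch that makes the ordering deterministic; using spend as the tie-breaker is an assumption made purely for illustration:

```sql
-- Sketch: add a tie-breaker so the ordering is deterministic when a user
-- has two transactions with the same transaction_date.
-- Using spend as the tie-breaker is an illustrative assumption.
WITH ranked_transactions AS (
    SELECT
        user_id,
        spend,
        transaction_date,
        ROW_NUMBER() OVER (
            PARTITION BY user_id
            ORDER BY transaction_date, spend
        ) AS transaction_rank
    FROM transactions
)
SELECT user_id, spend, transaction_date
FROM ranked_transactions
WHERE transaction_rank = 3;
```

With the sample data above, only user 111 has three transactions, so the query returns a single row: user 111, spend 89.60, dated 2022-02-05 12:00:00.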
• Q.468
Question
As a data analyst at Uber, it's your job to report the latest metrics for specific groups of Uber
users. Some riders create their Uber account the same day they book their first ride; the rider
engagement team calls them "in-the-moment" users.
Uber wants to know the average delay between the day of user sign-up and the day of their
2nd ride. Write a query to pull the average 2nd ride delay for "in-the-moment" Uber users.
Round the answer to 2-decimal places.
Explanation
"In-the-moment" users are those whose registration date matches the date of their first ride.
To find the average delay between the day of sign-up and the day of their 2nd ride:
• We first identify "in-the-moment" users (where the sign-up date equals the first ride date).
• For each "in-the-moment" user, we calculate the delay between the registration date and
the date of their second ride.
• Finally, we compute the average delay across all such users.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE users (
user_id INT PRIMARY KEY,
registration_date DATE
);
-- Rides
INSERT INTO rides (ride_id, user_id, ride_date)
VALUES
(1, 1, '2022-08-15'),
(2, 1, '2022-08-16'),
(3, 2, '2022-09-20'),
(4, 2, '2022-09-23');
Learnings
• Using window functions to rank rides by date.
• Filtering for users who meet the "in-the-moment" criteria.
• Using DATEDIFF() or equivalent functions to calculate date differences.
Solutions
• - PostgreSQL solution
WITH in_the_moment_users AS (
SELECT u.user_id, u.registration_date,
(SELECT ride_date FROM rides r WHERE r.user_id = u.user_id ORDER BY ride_date
LIMIT 1) AS first_ride_date,
(SELECT ride_date FROM rides r WHERE r.user_id = u.user_id ORDER BY ride_date
OFFSET 1 LIMIT 1) AS second_ride_date
FROM users u
)
-- Cast to timestamp so the subtraction yields an interval, and cast the average
-- to numeric because PostgreSQL's two-argument ROUND() requires numeric input.
SELECT ROUND(AVG(DATE_PART('day', second_ride_date::timestamp - registration_date::timestamp))::numeric, 2)
AS avg_2nd_ride_delay
FROM in_the_moment_users
WHERE registration_date = first_ride_date::date;
• - MySQL solution
WITH in_the_moment_users AS (
SELECT u.user_id, u.registration_date,
Key Points:
• Identifying "In-the-moment" Users: These are users whose registration date matches the
date of their first ride.
• Ranking Rides: Using ORDER BY to select the first and second rides for each user.
• Date Difference: Calculating the delay between registration and the second ride using
DATE_PART (PostgreSQL) or DATEDIFF (MySQL).
• Rounding the Result: The ROUND() function ensures the delay is shown to 2 decimal
places.
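The Learnings mention window functions, while the solution above relies on correlated subqueries. A minimal PostgreSQL sketch of the window-function route, assuming the users and rides tables described above and that rides.ride_date can be cast to DATE (the CTE names numbered_rides and first_two are illustrative):

```sql
-- Sketch (PostgreSQL): number each user's rides once, then pivot the first
-- and second ride dates into columns with conditional aggregation.
-- Assumes rides.ride_date can be cast to DATE.
WITH numbered_rides AS (
    SELECT
        user_id,
        ride_date::date AS ride_day,
        ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY ride_date) AS ride_number
    FROM rides
),
first_two AS (
    SELECT
        user_id,
        MAX(CASE WHEN ride_number = 1 THEN ride_day END) AS first_ride_day,
        MAX(CASE WHEN ride_number = 2 THEN ride_day END) AS second_ride_day
    FROM numbered_rides
    WHERE ride_number <= 2
    GROUP BY user_id
)
-- Subtracting two DATE values yields a whole number of days.
SELECT ROUND(AVG(f.second_ride_day - u.registration_date), 2) AS avg_2nd_ride_delay
FROM users u
JOIN first_two f ON f.user_id = u.user_id
WHERE u.registration_date = f.first_ride_day
  AND f.second_ride_day IS NOT NULL;
```

This form scans rides once rather than running two subqueries per user, which may matter on a large table.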
• Q.469
Question
Uber has a diverse range of vehicles, from bikes and scooters to premium luxury cars. To
tailor its services better, Uber wants to understand its customers' preferences. The task is
to write a SQL query that identifies the vehicle type most used by Uber's customers in the
past year. To provide a more holistic view, the results should also exclude rides that were
cancelled by either the driver or the user.
Explanation
This query will:
• Filter out cancelled rides (cancelled = false).
• Limit the data to rides that occurred within the last year.
• Join the rides table with the vehicle_types table to fetch the vehicle type name.
• Count the number of rides for each vehicle type.
• Order the results by the total number of rides and return the vehicle type with the highest
usage.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
user_id INT,
vehicle_type_id INT,
start_time TIMESTAMP,
end_time TIMESTAMP,
cancelled BOOLEAN
);
-- Vehicle Types
INSERT INTO vehicle_types (type_id, vehicle_type)
VALUES
(1, 'Bike'),
(2, 'Car'),
(3, 'SUV'),
(4, 'Luxury Car'),
(5, 'Scooter');
Learnings
• Using JOIN to combine data from multiple tables.
• Filtering data based on conditions such as cancelled = false and start_time >=
NOW() - INTERVAL '1 year'.
• Grouping data by vehicle type and using COUNT() to find the most popular vehicle type.
• Sorting the result in descending order and limiting to the top result with LIMIT 1.
Solutions
• - PostgreSQL solution
SELECT v.vehicle_type, COUNT(*) AS total_rides
FROM rides r
JOIN vehicle_types v ON r.vehicle_type_id = v.type_id
WHERE r.cancelled = false
AND r.start_time >= (NOW() - INTERVAL '1 year')
GROUP BY v.vehicle_type
ORDER BY total_rides DESC
LIMIT 1;
• - MySQL solution
SELECT v.vehicle_type, COUNT(*) AS total_rides
FROM rides r
JOIN vehicle_types v ON r.vehicle_type_id = v.type_id
WHERE r.cancelled = false
AND r.start_time >= CURDATE() - INTERVAL 1 YEAR
GROUP BY v.vehicle_type
ORDER BY total_rides DESC
LIMIT 1;
Key Points:
• Filtering Out Cancelled Rides: We exclude cancelled rides by filtering with
r.cancelled = false.
• Limiting by Date: We restrict the data to rides that occurred in the last year using NOW() -
INTERVAL '1 year' for PostgreSQL or CURDATE() - INTERVAL 1 YEAR for MySQL.
• Using COUNT: We count the number of rides for each vehicle type.
• Sorting and Limiting: The query sorts by the number of rides (total_rides) and returns
the vehicle type with the most rides.
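LIMIT 1 returns a single row even when two vehicle types share the highest ride count. A hedged sketch that returns every tied vehicle type by ranking the counts instead (the names ride_counts, ranked_counts and usage_rank are illustrative):

```sql
-- Sketch: return every vehicle type that shares the highest ride count.
-- Uses the PostgreSQL date expression; in MySQL, replace it with
-- CURDATE() - INTERVAL 1 YEAR as in the MySQL solution.
WITH ride_counts AS (
    SELECT v.vehicle_type, COUNT(*) AS total_rides
    FROM rides r
    JOIN vehicle_types v ON r.vehicle_type_id = v.type_id
    WHERE r.cancelled = false
      AND r.start_time >= (NOW() - INTERVAL '1 year')
    GROUP BY v.vehicle_type
),
ranked_counts AS (
    SELECT
        vehicle_type,
        total_rides,
        RANK() OVER (ORDER BY total_rides DESC) AS usage_rank
    FROM ride_counts
)
SELECT vehicle_type, total_rides
FROM ranked_counts
WHERE usage_rank = 1;
```

RANK() assigns the same rank to equal counts, so all vehicle types tied for first place are kept.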
• Q.470
Question
Uber has a diverse range of vehicles, from bikes and scooters to premium luxury cars. To
tailor its services better, Uber wants to understand its customers' preferences. The task is
to write a SQL query that identifies the vehicle type most used by Uber's customers in the
past year. To provide a more holistic view, the results should also exclude rides that were
cancelled by either the driver or the user.
Explanation
To find the most used vehicle type in the past year:
• Filter out the cancelled rides (cancelled = false).
• Limit the data to rides that occurred within the past year.
• Join the rides table with the vehicle_types table to fetch vehicle names.
• Group by vehicle type and count the number of rides for each type.
• Sort the results by the total number of rides and return the most used vehicle type.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
user_id INT,
vehicle_type_id INT,
start_time TIMESTAMP,
end_time TIMESTAMP,
cancelled BOOLEAN
);
-- Vehicle Types
INSERT INTO vehicle_types (type_id, vehicle_type)
VALUES
(1, 'Bike'),
(2, 'Car'),
(3, 'SUV'),
(4, 'Luxury Car'),
(5, 'Scooter');
Learnings
• Joining Tables: The query uses JOIN to combine the rides and vehicle_types tables.
• Filtering Cancelled Rides: We exclude cancelled rides by checking r.cancelled =
false.
• Date Filtering: The query limits results to the past year by using NOW() - INTERVAL '1
year' for PostgreSQL or CURDATE() - INTERVAL 1 YEAR for MySQL.
• Counting Rides: The number of rides for each vehicle type is counted using COUNT(*).
• Sorting and Limiting: The results are sorted in descending order by the ride count, and
only the top result (most used vehicle type) is returned.
Solutions
• - PostgreSQL solution
SELECT v.vehicle_type, COUNT(*) AS total_rides
FROM rides r
JOIN vehicle_types v ON r.vehicle_type_id = v.type_id
WHERE r.cancelled = false
AND r.start_time >= (NOW() - INTERVAL '1 year')
GROUP BY v.vehicle_type
ORDER BY total_rides DESC
LIMIT 1;
• - MySQL solution
SELECT v.vehicle_type, COUNT(*) AS total_rides
FROM rides r
JOIN vehicle_types v ON r.vehicle_type_id = v.type_id
WHERE r.cancelled = false
AND r.start_time >= CURDATE() - INTERVAL 1 YEAR
GROUP BY v.vehicle_type
ORDER BY total_rides DESC
LIMIT 1;
Key Points:
• Filtering Out Cancelled Rides: By using r.cancelled = false, we ensure that only
completed rides are considered.
• Limiting by Date: We restrict the data to the past year using NOW() - INTERVAL '1
year' (PostgreSQL) or CURDATE() - INTERVAL 1 YEAR (MySQL).
• Counting Rides: The COUNT() function is used to calculate the number of rides for each
vehicle type.
• Sorting and Limiting: The results are ordered by the number of rides (total_rides), and
we use LIMIT 1 to get the vehicle type with the most rides.
• Q.471
Question
As a data analyst for Uber, you are asked to determine each driver's average ratings for each
city. This will help Uber monitor performance and perhaps highlight any problems that might
be arising in any specific city.
We have two tables, rides and ratings.
Explanation
This query will:
• Join the rides table with the ratings table using ride_id to link the corresponding ride
and its rating.
• Group the results by driver_id and city to calculate the average rating per driver per
city.
• Use the AVG() function to compute the average rating for each group.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
driver_id INT,
city VARCHAR(100),
fare_amount DECIMAL
);
-- Ratings
INSERT INTO ratings (ride_id, rating)
VALUES
(101, 4.3),
(102, 4.1),
(103, 4.8),
(104, 4.7),
(105, 3.9);
Learnings
• Joining Tables: Use INNER JOIN to combine data from the rides and ratings tables
based on the ride_id.
• Grouping Data: Use GROUP BY to aggregate the data by driver_id and city.
• Aggregating with AVG(): Use AVG() to calculate the average rating for each group of
driver_id and city.
Solutions
• - PostgreSQL solution
SELECT r.driver_id, r.city, AVG(rt.rating) AS avg_rating
FROM rides r
INNER JOIN ratings rt ON r.ride_id = rt.ride_id
GROUP BY r.driver_id, r.city;
• - MySQL solution
SELECT r.driver_id, r.city, AVG(rt.rating) AS avg_rating
FROM rides r
INNER JOIN ratings rt ON r.ride_id = rt.ride_id
GROUP BY r.driver_id, r.city;
Key Points:
• INNER JOIN: Combines rides and ratings based on the matching ride_id.
• Grouping: Aggregates data by both driver_id and city using GROUP BY.
• AVG(): Computes the average rating for each group.
• Q.472
Question
As an SQL analyst at Uber, you are asked to find the customers who have registered using
their Gmail IDs. You are given a table named 'users'. The records in this table contain
multiple email domains. You need to write an SQL query that returns only those records
where the 'email' field contains 'gmail.com'.
Explanation
To retrieve users with Gmail IDs:
• Use the LIKE operator with the pattern '%gmail.com' to match emails that end with 'gmail.com'.
• The % symbol is a wildcard that matches any number of characters before 'gmail.com'.
• Keep only the records whose email matches this pattern.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE users (
user_id INT PRIMARY KEY,
full_name VARCHAR(100),
registration_date DATE,
email VARCHAR(255)
);
• - Datasets
-- Users
INSERT INTO users (user_id, full_name, registration_date, email)
VALUES
(7162, 'John Doe', '2019-05-04', '[email protected]'),
(7625, 'Jane Smith', '2020-11-09', '[email protected]'),
(5273, 'Steve Johnson', '2018-06-20', '[email protected]'),
(6322, 'Emily Davis', '2021-08-14', '[email protected]'),
(4812, 'Olivia Brown', '2019-09-30', '[email protected]');
Learnings
• Using LIKE for Pattern Matching: The LIKE operator allows us to filter records based
on a pattern. Using %gmail.com ensures that only emails ending in 'gmail.com' are returned.
• Wildcard Matching: % is a wildcard that matches zero or more characters, allowing
flexible pattern matching.
Solutions
• - PostgreSQL solution
SELECT *
FROM users
WHERE email LIKE '%gmail.com';
• - MySQL solution
SELECT *
FROM users
WHERE email LIKE '%gmail.com';
Key Points:
• LIKE Operator: Used for matching email addresses containing specific patterns.
• Wildcard %: Matches any sequence of characters, ensuring flexibility in the search.
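The pattern '%gmail.com' also matches any other domain that happens to end in "gmail.com", and PostgreSQL's LIKE is case-sensitive. A minimal sketch of a stricter filter, assuming the same users table:

```sql
-- Sketch: match only addresses whose domain is exactly gmail.com,
-- regardless of letter case. Including the @ prevents matches on other
-- domains that merely end in "gmail.com".
SELECT *
FROM users
WHERE LOWER(email) LIKE '%@gmail.com';
```

Applying LOWER() to the column can prevent a plain index on email from being used, so this form trades some speed for correctness on mixed-case data.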
• Q.473
Question
Uber is conducting an analysis of its driver performance across various cities. Your task is to
develop a SQL query to identify the top-performing drivers based on their average rating.
Only drivers who have completed at least 6 trips should be considered for this analysis. The
query should provide the driver's name, city, and their average rating, sorted in descending
order of average rating.
Note: Round the average rating to 2 decimal points.
Explanation
This query will:
• Join the Drivers and Trips tables using DRIVER_ID to fetch the relevant driver
information and their ratings.
• Filter out drivers who have completed fewer than 6 trips.
• Calculate the average rating for each driver.
• Round the average rating to 2 decimal places.
• Sort the results in descending order of the average rating.
Datasets and SQL Schemas
• - Table creation
-- Trips
INSERT INTO Trips (TRIP_ID, DRIVER_ID, RATING)
VALUES
(21, 4, 5),
(22, 4, 4),
(23, 4, 5);
Learnings
• JOIN: The JOIN operation is used to combine driver information with trip ratings based on
DRIVER_ID.
• HAVING: The HAVING clause filters out drivers with fewer than 6 trips after grouping by
DRIVER_ID.
• AVG(): The AVG() function calculates the average rating for each driver.
• ROUND(): The ROUND() function rounds the average rating to 2 decimal places.
• Sorting: The ORDER BY clause is used to sort the results in descending order of average
rating.
Solutions
• - PostgreSQL solution
SELECT d.DRIVER_NAME, d.CITY, ROUND(AVG(t.RATING), 2) AS avg_rating
FROM Drivers d
JOIN Trips t ON d.DRIVER_ID = t.DRIVER_ID
GROUP BY d.DRIVER_ID, d.DRIVER_NAME, d.CITY
HAVING COUNT(t.TRIP_ID) >= 6
ORDER BY avg_rating DESC;
• - MySQL solution
SELECT d.DRIVER_NAME, d.CITY, ROUND(AVG(t.RATING), 2) AS avg_rating
FROM Drivers d
JOIN Trips t ON d.DRIVER_ID = t.DRIVER_ID
GROUP BY d.DRIVER_ID, d.DRIVER_NAME, d.CITY
HAVING COUNT(t.TRIP_ID) >= 6
ORDER BY avg_rating DESC;
Key Points:
• JOIN: Combines data from Drivers and Trips based on DRIVER_ID.
• HAVING: Filters drivers who have completed at least 6 trips.
• AVG(): Calculates the average rating per driver.
• ROUND(): Rounds the average rating to 2 decimal points.
• ORDER BY: Sorts the drivers based on their average rating in descending order.
• Q.474
Question
Write a query to find, for each user that has taken at least two trips with Uber, the time that
elapsed between the first trip and the second trip.
Explanation
This query:
• Identifies users who have taken at least two trips by counting the number of trips per rider.
• Uses the LAG() function to retrieve the timestamp of the previous trip for each rider,
ordered by the trip_timestamp.
• Calculates the time difference between each trip and the previous one (i.e., the time elapsed
between the first and second trip for each rider).
• Filters out riders who have fewer than two trips.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE sign_ups (
rider_id INT PRIMARY KEY,
signup_timestamp DATE
);
-- Trips
INSERT INTO trips (trip_id, rider_id, driver_id, trip_timestamp)
VALUES
(1, 1, 2, '2022-02-01'),
(2, 2, 2, '2022-03-11'),
(3, 1, 2, '2022-04-01'),
(4, 1, 2, '2022-05-21'),
(5, 2, 2, '2022-06-01'),
(6, 3, 2, '2022-07-31');
Learnings
• LAG() Window Function: The LAG() function allows us to access the previous row's
value, enabling us to calculate the time difference between consecutive trips.
• PARTITION BY: This ensures that the time difference is calculated separately for each
rider_id.
• Time Calculation: The query calculates the difference between timestamps and returns the
result in days.
• Filtering with HAVING: We filter out riders who have taken fewer than two trips using
HAVING COUNT(*) > 1.
Solutions
• - PostgreSQL solution
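SELECT rider_id,
       trip_timestamp,
       time_between_two_trip
FROM
  (SELECT rider_id,
          trip_timestamp,
          row_number() OVER (PARTITION BY rider_id
                             ORDER BY trip_timestamp) AS trip_number,
          trip_timestamp - lag(trip_timestamp, 1) OVER (PARTITION BY rider_id
                                                        ORDER BY trip_timestamp)
              AS time_between_two_trip
   FROM trips
   WHERE rider_id IN
     (SELECT rider_id
      FROM trips
      GROUP BY rider_id
      HAVING COUNT(*) > 1)) AS ranked_trips
-- Keep only the second trip, whose gap is measured from the first trip.
WHERE trip_number = 2;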
• - MySQL solution
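SELECT rider_id,
       trip_timestamp,
       time_between_two_trip
FROM
  (SELECT rider_id,
          trip_timestamp,
          ROW_NUMBER() OVER (PARTITION BY rider_id ORDER BY trip_timestamp) AS trip_number,
          TIMESTAMPDIFF(DAY,
                        lag(trip_timestamp, 1) OVER (PARTITION BY rider_id ORDER BY trip_timestamp),
                        trip_timestamp) AS time_between_two_trip
   FROM trips
   WHERE rider_id IN
     (SELECT rider_id
      FROM trips
      GROUP BY rider_id
      HAVING COUNT(*) > 1)) AS ranked_trips
-- Keep only the second trip, whose gap is measured from the first trip.
WHERE trip_number = 2;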
Key Points:
• LAG() Function: Retrieves the timestamp of the previous trip in the ordered sequence for
each rider.
• TIMESTAMPDIFF(): In MySQL, TIMESTAMPDIFF() is used to calculate the time
difference in days.
• PARTITION BY: Ensures calculations are done per rider.
• HAVING COUNT(*) > 1: Filters out riders with fewer than two trips.
• Q.475
Question
Write a query to find how many users placed their third order containing a product owned by
the ATG (holding company) on or after 9/21/22. Only consider orders containing at least one
ATG holding company product.
Explanation
This query:
• Keeps only orders that contain a product owned by the ATG holding company.
• Numbers each user's ATG orders in chronological order.
• Keeps the users whose third such order was placed on or after 9/21/22.
• Returns the number of users who meet these criteria.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE brands (
brand_id INT PRIMARY KEY,
brand_name VARCHAR(50),
holding_company_id INT,
holding_company_name VARCHAR(50)
);
-- Orders
INSERT INTO orders (order_id, user_id, product_id, brand_id, price, quantity, date, store_id)
VALUES
(1, 111, 3, 1, 100, 2, '2022-09-21', 12),
(2, 222, 7, 2, 123, 30, '2022-12-09', 34),
(3, 222, 9, 1, 14, 1, '2020-05-02', 435),
(4, 222, 11, 3, 140, 40, '2019-11-03', 23),
(5, 333, 13, 5, 120, 15, '2019-10-01', 45);
Learnings
• CTE (Common Table Expression): Using WITH to create an intermediate result
(atg_orders) that holds only the orders for ATG-owned brands.
• ROW_NUMBER(): Numbers each user's ATG orders by date so that the third order can be
identified.
• Filtering by Date: The date filter is applied to the third order only, after the orders
have been numbered; filtering earlier would change which order counts as the third.
Solution
WITH atg_orders AS (
SELECT o.user_id,
o.date AS order_date,
ROW_NUMBER() OVER (PARTITION BY o.user_id ORDER BY o.date) AS order_number
FROM orders o
JOIN brands b ON o.brand_id = b.brand_id
WHERE b.holding_company_name = 'ATG'
)
SELECT COUNT(DISTINCT user_id) AS user_count
FROM atg_orders
WHERE order_number = 3
AND order_date >= '2022-09-21';
• CTE (atg_orders):
• The JOIN between the orders table and the brands table allows us to keep only products
belonging to ATG (b.holding_company_name = 'ATG').
• ROW_NUMBER() numbers each user's ATG orders from oldest to newest. This assumes each row
of orders represents one order.
• Main Query:
• The main query keeps only each user's third ATG order (WHERE order_number = 3) and
checks that it was placed on or after 9/21/22.
• Final Output:
• The final result is the number of users whose third order containing an ATG product was
placed on or after 9/21/22.
Key Points:
• WITH (CTE): Used to isolate and number the ATG orders for each user.
• ROW_NUMBER(): Identifies the third ATG order per user.
• Date Filter: Applied to the third order, so only users whose third ATG order falls on or
after 9/21/22 are counted.
• Q.476
Question
Write a query to find the latest trip timestamp for each user who took at least one trip.
Explanation
The query should:
• Find the latest trip timestamp for each user from the trips table.
• Only consider users who have taken at least one trip.
• Sort the results by rider_id in ascending order.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE sign_ups (
rider_id INT PRIMARY KEY,
signup_timestamp DATE
);
-- Trips
INSERT INTO trips (trip_id, rider_id, driver_id, trip_timestamp)
VALUES
(1, 1, 2, '2022-02-01 00:00:00'),
(2, 2, 2, '2022-03-11 00:00:00'),
(3, 1, 2, '2022-04-01 00:00:00'),
(4, 1, 2, '2022-05-21 00:00:00'),
Learnings
• MAX() Function: The MAX() function is used to find the latest timestamp for each user.
• GROUP BY Clause: This groups the data by rider_id to get the latest trip for each user.
• ORDER BY Clause: Sorting the results by rider_id ensures that the output is ordered as
requested.
Solution
SELECT rider_id,
MAX(trip_timestamp) AS latest_trip_timestamp
FROM trips
GROUP BY rider_id
ORDER BY rider_id ASC;
Key Points:
• MAX(): Helps in getting the most recent trip timestamp.
• GROUP BY: Used to aggregate results by each unique user (rider_id).
• Sorting: The result is ordered by rider_id to comply with the prompt requirements.
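Grouping on trips alone already restricts the result to riders with at least one trip, because riders without trips have no rows there. For contrast, a minimal sketch that starts from the sign_ups table defined above and therefore lists every rider, including those who never took a trip:

```sql
-- Sketch: start from sign_ups to list every rider, including those with
-- no trips; for them latest_trip_timestamp is NULL.
SELECT s.rider_id,
       MAX(t.trip_timestamp) AS latest_trip_timestamp
FROM sign_ups s
LEFT JOIN trips t ON t.rider_id = s.rider_id
GROUP BY s.rider_id
ORDER BY s.rider_id ASC;
```

Changing LEFT JOIN to an inner JOIN gives the same riders as the solution above.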
• Q.477
Question
You are given a table of Uber rides that contains the mileage and the purpose for the business
expense. Your task is to find the top 3 business purposes by total mileage driven for
passengers that use Uber for their business transportation.
The query should:
• Calculate the total miles driven for each business purpose.
• Only include trips where the business purpose is categorized as "Business."
• Return the top 3 business purposes by total mileage.
CREATE TABLE my_uber_drives (
ride_id INT PRIMARY KEY,
user_id INT,
miles_driven DECIMAL(10, 2),
business_purpose VARCHAR(50),
ride_date DATE
);
Explanation:
• Table Structure:
• The my_uber_drives table contains the columns:
• ride_id: Unique identifier for each ride.
• user_id: The ID of the user who took the ride.
• miles_driven: The total miles driven for the ride.
• business_purpose: The purpose of the ride (e.g., 'Business', 'Leisure').
• ride_date: The date when the ride occurred.
• Query Breakdown:
• We filter for rides where the business_purpose is 'Business'.
• We then aggregate the data by business_purpose using SUM(miles_driven) to calculate
the total miles driven for each business purpose.
• We use the ORDER BY clause to sort the results in descending order by the total miles
driven.
• Finally, the LIMIT 3 ensures we only return the top 3 business purposes.
MySQL Solution:
SELECT business_purpose,
SUM(miles_driven) AS total_miles
FROM my_uber_drives
WHERE business_purpose = 'Business'
GROUP BY business_purpose
ORDER BY total_miles DESC
LIMIT 3;
PostgreSQL Solution:
SELECT business_purpose,
SUM(miles_driven) AS total_miles
FROM my_uber_drives
WHERE business_purpose = 'Business'
GROUP BY business_purpose
ORDER BY total_miles DESC
LIMIT 3;
Output:
Assuming the sample data provided in the my_uber_drives table, the output will show the
total mileage driven for each business purpose, ordered by total miles in descending order.
business_purpose total_miles
Business 95.0
In this case, only one business purpose category ('Business') is considered, so it returns that
with the sum of the miles driven for all rides categorized as 'Business'.
Key Points:
• SUM() Function: The SUM() function is used to calculate the total miles driven for each
business purpose.
• GROUP BY Clause: This is essential to group the data by business_purpose to aggregate
mileage for each category.
• Filtering Data: The query only considers rides where the business purpose is 'Business',
filtering out leisure-related rides.
• Limiting Results: The LIMIT 3 clause restricts the output to at most 3 business purposes. With
the 'Business' filter in place only one group exists, so LIMIT 3 does not change this result.
• Sorting: The query sorts the results by total_miles in descending order to ensure the
highest mileage is at the top.
• Q.478
Table Creation:
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
user_id INT,
fare_amount DECIMAL(10, 2),
ride_date DATE
);
Insert Data:
INSERT INTO rides (ride_id, user_id, fare_amount, ride_date)
VALUES
(1, 101, 20.50, '2023-07-01'),
(2, 102, 30.75, '2023-07-02'),
(3, 103, 25.00, '2023-07-03'),
(4, 101, 15.00, '2023-07-10'),
(5, 104, 18.00, '2023-07-11'),
(6, 102, 28.25, '2023-07-12'),
(7, 105, 12.50, '2023-07-15'),
(8, 103, 40.00, '2023-07-18');
Solution
PostgreSQL and MySQL:
SELECT user_id, SUM(fare_amount) AS total_spent
FROM rides
WHERE ride_date BETWEEN '2023-07-01' AND '2023-07-31'
GROUP BY user_id
HAVING COUNT(ride_id) > 0
ORDER BY total_spent DESC
LIMIT 5;
Explanation:
• SUM(fare_amount): Calculates the total fare amount spent by each user.
• WHERE ride_date BETWEEN '2023-07-01' AND '2023-07-31': Filters the rides for the
month of July 2023.
• HAVING COUNT(ride_id) > 0: Keeps only users who have at least one ride. GROUP BY never
produces a group without rows, so this condition is redundant and can be omitted.
• ORDER BY total_spent DESC: Orders the results by the total amount spent in descending
order.
• LIMIT 5: Limits the result to the top 5 users.
Learnings:
• Using Aggregation Functions: SUM() to calculate the total spending.
• Filtering by Date: Date operations using BETWEEN for a specific range.
• Grouping and Ordering: Using GROUP BY to group by user_id and ORDER BY to sort the
results.
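The query can be checked against the sample rows. The sketch below is a minimal example that loads them into an in-memory SQLite database through Python's sqlite3 module; SQLite is an assumption made only to keep the example self-contained (the solution above targets PostgreSQL and MySQL), and the function name top_spenders is hypothetical. The redundant HAVING clause is left out.

```python
import sqlite3

# Sample rows copied from the INSERT statement above.
RIDES = [
    (1, 101, 20.50, "2023-07-01"),
    (2, 102, 30.75, "2023-07-02"),
    (3, 103, 25.00, "2023-07-03"),
    (4, 101, 15.00, "2023-07-10"),
    (5, 104, 18.00, "2023-07-11"),
    (6, 102, 28.25, "2023-07-12"),
    (7, 105, 12.50, "2023-07-15"),
    (8, 103, 40.00, "2023-07-18"),
]


def top_spenders(limit=5):
    """Return (user_id, total_spent) for July 2023, highest spenders first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rides (ride_id INTEGER PRIMARY KEY, user_id INTEGER, "
        "fare_amount REAL, ride_date TEXT)"
    )
    conn.executemany("INSERT INTO rides VALUES (?, ?, ?, ?)", RIDES)
    rows = conn.execute(
        """
        SELECT user_id, SUM(fare_amount) AS total_spent
        FROM rides
        WHERE ride_date BETWEEN '2023-07-01' AND '2023-07-31'
        GROUP BY user_id
        ORDER BY total_spent DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    conn.close()
    return rows
```

With the sample rows, top_spenders() lists user 103 first (65.0), followed by users 102, 101, 104 and 105.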
• Q.479
Uber SQL Interview Question 2: Track Users Who Frequently Cancel Rides
Uber wants to understand which users are frequently canceling their rides. Write a query to
identify the top 3 users who have canceled the most rides in the last 30 days. The database
contains the rides table with the ride status (cancelled column) and user_id information.
Table Creation:
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
user_id INT,
cancelled BOOLEAN,
ride_date DATE
);
Insert Data:
INSERT INTO rides (ride_id, user_id, cancelled, ride_date)
VALUES
(1, 101, TRUE, '2023-06-01'),
(2, 102, FALSE, '2023-06-02'),
(3, 103, TRUE, '2023-06-05'),
(4, 101, FALSE, '2023-06-07'),
(5, 104, TRUE, '2023-06-10'),
(6, 105, FALSE, '2023-06-12'),
(7, 101, TRUE, '2023-06-15'),
(8, 103, TRUE, '2023-06-20');
Solution
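MySQL:
SELECT user_id, COUNT(*) AS cancelled_rides
FROM rides
WHERE cancelled = TRUE
AND ride_date >= CURDATE() - INTERVAL 30 DAY
GROUP BY user_id
ORDER BY cancelled_rides DESC
LIMIT 3;
PostgreSQL:
SELECT user_id, COUNT(*) AS cancelled_rides
FROM rides
WHERE cancelled = TRUE
AND ride_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
ORDER BY cancelled_rides DESC
LIMIT 3;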
Explanation:
• COUNT(*) AS cancelled_rides: Counts the number of canceled rides for each user.
• WHERE cancelled = TRUE: Filters to include only canceled rides.
• AND ride_date >= CURDATE() - INTERVAL 30 DAY: Filters for rides canceled within
the last 30 days. CURDATE() and this INTERVAL syntax are MySQL-specific; the PostgreSQL
equivalent is CURRENT_DATE - INTERVAL '30 days'.
• GROUP BY user_id: Groups the data by user_id to aggregate canceled rides per user.
• ORDER BY cancelled_rides DESC: Orders the users by the number of canceled rides in
descending order.
• LIMIT 3: Limits the results to the top 3 users.
Learnings:
• Filtering by Boolean: Handling boolean columns (cancelled = TRUE).
• Date Calculations: Using CURDATE() (MySQL) or CURRENT_DATE (PostgreSQL) with INTERVAL for dynamic date filtering.
• Grouping and Sorting: Combining COUNT() with GROUP BY and ORDER BY.
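Because the filter depends on the current date, the sample rows from June 2023 will not match when the query runs today. The sketch below is a minimal example that passes the reference date in as a parameter so the result is reproducible. It uses SQLite through Python's sqlite3 module (an assumption for self-containment; SQLite writes the date arithmetic as date(x, '-30 days')), stores the boolean as 1/0, and adds user_id as a tie-breaker. The function name top_cancellers is hypothetical.

```python
import sqlite3

# Sample rows copied from the INSERT statement above (TRUE/FALSE stored as 1/0).
RIDES = [
    (1, 101, 1, "2023-06-01"),
    (2, 102, 0, "2023-06-02"),
    (3, 103, 1, "2023-06-05"),
    (4, 101, 0, "2023-06-07"),
    (5, 104, 1, "2023-06-10"),
    (6, 105, 0, "2023-06-12"),
    (7, 101, 1, "2023-06-15"),
    (8, 103, 1, "2023-06-20"),
]


def top_cancellers(today, limit=3):
    """Return (user_id, cancelled_rides) for the 30 days ending at `today`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rides (ride_id INTEGER PRIMARY KEY, user_id INTEGER, "
        "cancelled INTEGER, ride_date TEXT)"
    )
    conn.executemany("INSERT INTO rides VALUES (?, ?, ?, ?)", RIDES)
    rows = conn.execute(
        """
        SELECT user_id, COUNT(*) AS cancelled_rides
        FROM rides
        WHERE cancelled = 1
          AND ride_date >= date(?, '-30 days')  -- fixed reference date
        GROUP BY user_id
        ORDER BY cancelled_rides DESC, user_id  -- user_id breaks ties
        LIMIT ?
        """,
        (today, limit),
    ).fetchall()
    conn.close()
    return rows
```

Calling top_cancellers("2023-06-30") covers every sample row, while a later date such as "2023-07-10" drops the cancellations from early June.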
• Q.480
Find the Average Ride Time for Each Driver
Write a query to find the average ride duration for each driver, measured in minutes. The
rides table contains ride start and end times for each ride. Return the driver ID, their name
(from the drivers table), and their average ride duration (rounded to 2 decimal places). Only
include drivers who have completed at least 5 rides.
Insert Data:
INSERT INTO rides (ride_id, driver_id, start_time, end_time)
VALUES
(1, 201, '2023-06-01 08:00:00', '2023-06-01 08:30:00'),
(2, 202, '2023-06-02 09:00:00', '2023-06-02 09:15:00'),
(3, 201, '2023-06-03 10:00:00', '2023-06-03 10:45:00'),
(4, 202, '2023-06-04 12:00:00', '2023-06-04 12:30:00'),
(5, 201, '2023-06-05 13:00:00', '2023-06-05 13:25:00'),
(6, 203, '2023-06-06 14:00:00', '2023-06-06 14:20:00');
Solution
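MySQL:
SELECT d.driver_name,
r.driver_id,
ROUND(AVG(TIMESTAMPDIFF(MINUTE, r.start_time, r.end_time)), 2) AS avg_ride_duration
FROM rides r
JOIN drivers d ON r.driver_id = d.driver_id
GROUP BY r.driver_id, d.driver_name
HAVING COUNT(r.ride_id) >= 5;
PostgreSQL:
SELECT d.driver_name,
r.driver_id,
ROUND(AVG(EXTRACT(EPOCH FROM (r.end_time - r.start_time)) / 60)::numeric, 2) AS avg_ride_duration
FROM rides r
JOIN drivers d ON r.driver_id = d.driver_id
GROUP BY r.driver_id, d.driver_name
HAVING COUNT(r.ride_id) >= 5;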
Explanation:
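• TIMESTAMPDIFF(MINUTE, r.start_time, r.end_time): Calculates the difference in
minutes between the start_time and end_time for each ride. TIMESTAMPDIFF() is MySQL-specific;
PostgreSQL can subtract the timestamps and convert the interval with EXTRACT(EPOCH FROM ...) / 60.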
• ROUND(..., 2): Rounds the average duration to 2 decimal places.
• JOIN drivers d ON r.driver_id = d.driver_id: Joins the rides table with the
drivers table to fetch the driver's name.
• GROUP BY r.driver_id: Groups the results by driver_id to calculate average ride
duration per driver.
• HAVING COUNT(r.ride_id) >= 5: Filters the results to only include drivers with at least 5
rides.
Learnings:
• Calculating Time Differences: Using TIMESTAMPDIFF() (MySQL) or timestamp subtraction
(PostgreSQL) to calculate the time difference in minutes.
• Rounding Results: Using ROUND() to limit the precision of floating-point values.
• Joining Multiple Tables: Combining data from the rides and drivers tables.
• Filtering with HAVING: Using HAVING to filter results after aggregation (when using
COUNT()).
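In the sample data no driver reaches 5 rides, so the HAVING filter removes every group. The sketch below is a minimal example that makes the threshold a parameter so the effect is visible. It assumes SQLite via Python's sqlite3 module, where the minute difference is computed from strftime('%s', ...) epoch seconds instead of TIMESTAMPDIFF(). The drivers table is not shown in the text, so its rows and the function name average_ride_minutes are hypothetical.

```python
import sqlite3

# Ride rows copied from the sample data above.
RIDES = [
    (1, 201, "2023-06-01 08:00:00", "2023-06-01 08:30:00"),
    (2, 202, "2023-06-02 09:00:00", "2023-06-02 09:15:00"),
    (3, 201, "2023-06-03 10:00:00", "2023-06-03 10:45:00"),
    (4, 202, "2023-06-04 12:00:00", "2023-06-04 12:30:00"),
    (5, 201, "2023-06-05 13:00:00", "2023-06-05 13:25:00"),
    (6, 203, "2023-06-06 14:00:00", "2023-06-06 14:20:00"),
]
# Hypothetical driver names; the drivers table is not shown in the text.
DRIVERS = [(201, "Driver A"), (202, "Driver B"), (203, "Driver C")]


def average_ride_minutes(min_rides=5):
    """Return (driver_name, driver_id, avg_minutes) for drivers with enough rides."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE drivers (driver_id INTEGER PRIMARY KEY, driver_name TEXT)")
    conn.execute(
        "CREATE TABLE rides (ride_id INTEGER PRIMARY KEY, driver_id INTEGER, "
        "start_time TEXT, end_time TEXT)"
    )
    conn.executemany("INSERT INTO drivers VALUES (?, ?)", DRIVERS)
    conn.executemany("INSERT INTO rides VALUES (?, ?, ?, ?)", RIDES)
    rows = conn.execute(
        """
        SELECT d.driver_name,
               r.driver_id,
               -- epoch seconds difference divided by 60 gives minutes
               ROUND(AVG((strftime('%s', r.end_time)
                          - strftime('%s', r.start_time)) / 60.0), 2) AS avg_ride_duration
        FROM rides r
        JOIN drivers d ON r.driver_id = d.driver_id
        GROUP BY r.driver_id, d.driver_name
        HAVING COUNT(r.ride_id) >= ?
        ORDER BY r.driver_id
        """,
        (min_rides,),
    ).fetchall()
    conn.close()
    return rows
```

With the default threshold of 5 the function returns an empty list; lowering it to 2 returns drivers 201 and 202 only, since driver 203 has a single ride.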
Summary of Key Concepts:
• Filtering by Date and Time: Using WHERE and BETWEEN to restrict data based on date
ranges.
• Aggregation Functions: Using SUM(), COUNT(), AVG() for aggregation and analysis.
• Grouping Data: Applying GROUP BY to perform analysis on groups of data.
• Joins: Combining data from multiple tables using JOIN to enhance results.
• Time Calculations: Working with timestamps and calculating time differences using
TIMESTAMPDIFF() (MySQL) or timestamp subtraction (PostgreSQL).
PayPal
• Q.481
Question:
Write an SQL query to report the fraction of players that logged in again on the day after the
day they first logged in, rounded to 2 decimal places.
In other words, count the number of players that logged in for at least two consecutive days
starting from their first login date, then divide that number by the total number of players.
Explanation:
• Identify first login date for each player.
• Check for consecutive logins: For each player, verify if they logged in on the day after
their first login date.
• Count the players who logged in consecutively starting from their first login date.
• Calculate the fraction of players who logged in on consecutive days by dividing the count
of players with consecutive logins by the total number of distinct players.
Learnings:
• Aggregation and Window Functions: Using MIN() to capture each player's first login date; a
window function such as LEAD() is an alternative way to check for consecutive days.
• Date Arithmetic: Using date functions to compare dates and check for consecutive login
days.
• Aggregation: Using COUNT() and DISTINCT to count players who meet the condition.
• Rounding: Using the ROUND() function to round the result to 2 decimal places.
Solutions
• - PostgreSQL solution
WITH FirstLogin AS (
SELECT player_id, MIN(event_date) AS first_login_date
FROM Activity
GROUP BY player_id
), ConsecutiveLogins AS (
SELECT a.player_id
FROM Activity a
JOIN FirstLogin f ON a.player_id = f.player_id
WHERE a.event_date = f.first_login_date + INTERVAL '1 day'
)
SELECT ROUND(COUNT(DISTINCT c.player_id) * 1.0 / (SELECT COUNT(DISTINCT player_id) FROM
Activity), 2) AS fraction
FROM ConsecutiveLogins c;
• - MySQL solution
WITH FirstLogin AS (
SELECT player_id, MIN(event_date) AS first_login_date
FROM Activity
GROUP BY player_id
), ConsecutiveLogins AS (
SELECT a.player_id
FROM Activity a
JOIN FirstLogin f ON a.player_id = f.player_id
WHERE a.event_date = DATE_ADD(f.first_login_date, INTERVAL 1 DAY)
)
SELECT ROUND(COUNT(DISTINCT c.player_id) / COUNT(DISTINCT a.player_id), 2) AS fraction
FROM Activity a
LEFT JOIN ConsecutiveLogins c ON a.player_id = c.player_id;
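The steps above can be sketched on a small data set. The example below is a minimal, hedged illustration using SQLite through Python's sqlite3 module (an assumption for self-containment; SQLite writes "first login plus one day" as date(x, '+1 day')). The Activity rows and the function name day_one_return_fraction are hypothetical; only the player_id and event_date columns used by the solutions are modelled.

```python
import sqlite3

# Hypothetical sample rows: (player_id, event_date).
ACTIVITY = [
    (1, "2016-03-01"),
    (1, "2016-03-02"),  # player 1 returns the day after the first login
    (2, "2017-06-25"),
    (3, "2016-03-02"),
    (3, "2018-07-03"),
]


def day_one_return_fraction(rows):
    """Fraction of players who logged in again the day after their first login."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Activity (player_id INTEGER, event_date TEXT)")
    conn.executemany("INSERT INTO Activity VALUES (?, ?)", rows)
    (fraction,) = conn.execute(
        """
        WITH FirstLogin AS (
            SELECT player_id, MIN(event_date) AS first_login_date
            FROM Activity
            GROUP BY player_id
        )
        SELECT ROUND(
            COUNT(DISTINCT a.player_id) * 1.0
            / (SELECT COUNT(DISTINCT player_id) FROM Activity), 2)
        FROM FirstLogin f
        JOIN Activity a
          ON a.player_id = f.player_id
         AND a.event_date = date(f.first_login_date, '+1 day')
        """
    ).fetchone()
    conn.close()
    return fraction
```

Here one of three players qualifies, so the fraction rounds to 0.33. Multiplying by 1.0 avoids integer division, which would otherwise truncate the ratio to 0 in SQLite and PostgreSQL.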
• Q.482
Question:
Write an SQL query to select the product ID, year, quantity, and price for the first year of
every product sold.
Explanation:
• Identify the first year for each product by selecting the minimum year from the sales
records of each product.
• Join the result with the Sales table to get the details for that first year (product_id, year,
quantity, price).
• Return the results for all products in any order.
Learnings:
• Aggregation: Use of MIN() to identify the first year of sales for each product.
• Joins: How to join Sales to a derived table so that only each product's first-year rows are returned.
• Subquery: Using subquery to find the minimum year of sale for each product.
Solutions
• - PostgreSQL solution
SELECT s.product_id, s.year, s.quantity, s.price
FROM Sales s
JOIN (
SELECT product_id, MIN(year) AS first_year
FROM Sales
GROUP BY product_id
) first_sales ON s.product_id = first_sales.product_id
AND s.year = first_sales.first_year;
• - MySQL solution
SELECT s.product_id, s.year, s.quantity, s.price
FROM Sales s
JOIN (
SELECT product_id, MIN(year) AS first_year
FROM Sales
GROUP BY product_id
) first_sales ON s.product_id = first_sales.product_id
AND s.year = first_sales.first_year;
• Q.483
Question:
Write an SQL query to report the customer IDs from the Customer table that bought all the
products in the Product table.
Explanation:
• We need to find customers who have purchased every product listed in the Product table.
• To achieve this, we can count the distinct product_key for each customer and compare it
with the total number of distinct products in the Product table.
• If the customer has bought all the products, the count of distinct products they bought will
match the total number of distinct products in the Product table.
);
Learnings:
• Aggregation: Use of COUNT(DISTINCT ...) to count unique products bought by each
customer.
• Subquery: Using a subquery to get the total number of distinct products.
• Grouping: Grouping by customer_id to evaluate each customer individually.
Solutions
• - PostgreSQL solution
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(DISTINCT product_key) FROM Product);
• - MySQL solution
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(DISTINCT product_key) FROM Product);
• Q.484
Question:
Write an SQL query to report the customer IDs from the Customer table that bought all the
products in the Product table.
Explanation:
• We need to identify customers who have purchased every product listed in the Product
table.
• This can be done by comparing the count of distinct products a customer has bought to the
total count of distinct products in the Product table.
• If the customer has bought all products, the count of distinct products they bought will
match the total count of products in the Product table.
product_key INT,
PRIMARY KEY (product_key)
);
• - Sample data for Customer and Product tables
INSERT INTO Product (product_key)
VALUES (5), (6);
Learnings:
• Aggregation: Using COUNT(DISTINCT ...) to count the unique products bought by each
customer.
• Subquery: A subquery is used to calculate the total number of distinct products.
• Grouping: Grouping by customer_id ensures that we evaluate each customer
individually.
Solutions
• - PostgreSQL solution
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(DISTINCT product_key) FROM Product);
• - MySQL solution
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(DISTINCT product_key) FROM Product);
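The count comparison can be tried on a small data set. The sketch below is a minimal example using SQLite through Python's sqlite3 module (an assumption for self-containment). The product keys 5 and 6 come from the sample data above; the Customer rows and the function name customers_with_all_products are hypothetical.

```python
import sqlite3

PRODUCT_KEYS = [5, 6]  # from the Product sample data above
# Hypothetical purchases: (customer_id, product_key).
CUSTOMER_ROWS = [(1, 5), (2, 6), (3, 5), (3, 6), (1, 6)]


def customers_with_all_products(customer_rows, product_keys):
    """Return the ids of customers who bought every product in Product."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Product (product_key INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE Customer (customer_id INTEGER, product_key INTEGER)")
    conn.executemany("INSERT INTO Product VALUES (?)", [(k,) for k in product_keys])
    conn.executemany("INSERT INTO Customer VALUES (?, ?)", customer_rows)
    rows = conn.execute(
        """
        SELECT customer_id
        FROM Customer
        GROUP BY customer_id
        HAVING COUNT(DISTINCT product_key) =
               (SELECT COUNT(DISTINCT product_key) FROM Product)
        ORDER BY customer_id
        """
    ).fetchall()
    conn.close()
    return [customer_id for (customer_id,) in rows]
```

COUNT(DISTINCT ...) matters here: a customer who buys product 5 twice has two rows but only one distinct product, and must not be reported.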
• Q.485
Question:
Given a table containing information about bank deposits and withdrawals made using
PayPal, write a query to retrieve the final account balance for each account, taking into
account all the transactions recorded in the table, with the assumption that there are no
missing transactions.
Explanation:
• For each account, we need to calculate the final balance by considering both deposits and
withdrawals.
• If the transaction type is 'Deposit', the amount is added to the balance, and if the transaction
type is 'Withdrawal', the amount is subtracted from the balance.
• The final balance for each account is the sum of the amounts, adjusted by the type of
transaction.
transaction_type VARCHAR
);
• - Sample data for transactions table
INSERT INTO transactions (transaction_id, account_id, amount, transaction_type)
VALUES
(123, 101, 10.00, 'Deposit'),
(124, 101, 20.00, 'Deposit'),
(125, 101, 5.00, 'Withdrawal'),
(126, 201, 20.00, 'Deposit'),
(128, 201, 10.00, 'Withdrawal');
Learnings:
• Conditional Aggregation: Using a CASE statement to adjust the sign of the amount based
on the transaction type.
• Grouping: Grouping by account_id to aggregate all transactions for each account.
• Arithmetic Operations: Calculating the final balance by considering deposits as positive
and withdrawals as negative values.
Solutions
• - PostgreSQL solution
SELECT
account_id,
SUM(CASE
WHEN transaction_type = 'Deposit' THEN amount
ELSE -amount
END) AS final_balance
FROM transactions
GROUP BY account_id;
• - MySQL solution
SELECT
account_id,
SUM(CASE
WHEN transaction_type = 'Deposit' THEN amount
ELSE -amount
END) AS final_balance
FROM transactions
GROUP BY account_id;
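The conditional aggregation can be verified against the sample rows. The sketch below is a minimal example using SQLite through Python's sqlite3 module (an assumption for self-containment); the function name final_balances and the ORDER BY clause are additions for a deterministic result.

```python
import sqlite3

# Sample rows copied from the INSERT statement above.
TRANSACTIONS = [
    (123, 101, 10.00, "Deposit"),
    (124, 101, 20.00, "Deposit"),
    (125, 101, 5.00, "Withdrawal"),
    (126, 201, 20.00, "Deposit"),
    (128, 201, 10.00, "Withdrawal"),
]


def final_balances():
    """Return (account_id, final_balance) with withdrawals counted as negative."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions (transaction_id INTEGER PRIMARY KEY, "
        "account_id INTEGER, amount REAL, transaction_type TEXT)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", TRANSACTIONS)
    rows = conn.execute(
        """
        SELECT account_id,
               SUM(CASE
                       WHEN transaction_type = 'Deposit' THEN amount
                       ELSE -amount
                   END) AS final_balance
        FROM transactions
        GROUP BY account_id
        ORDER BY account_id
        """
    ).fetchall()
    conn.close()
    return rows
```

Account 101 ends at 10 + 20 - 5 = 25 and account 201 at 20 - 10 = 10. Note that the ELSE branch treats any type other than 'Deposit' as a withdrawal.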
• Q.486
Question
Calculate the Average Transaction Amount per User
As a data scientist at PayPal, you have been asked to write a SQL query to analyze the
transaction history of PayPal users. Specifically, management wants to know the average
transaction amount for each user, and how they rank based on their averages. For this task:
• Calculate the average transaction amount for every user
• Rank the users by their average transaction amount in descending order
Note: When the same average transaction amount is found for multiple users, they should
have the same rank. The next rank should be consecutive.
Explanation
• Calculate the average transaction amount per user using the AVG() function with GROUP BY user_id.
• Rank users by their average transaction amounts using the DENSE_RANK() window function,
ordering in descending order.
• DENSE_RANK() gives tied users the same rank and keeps the following rank consecutive;
RANK() would leave a gap after a tie.
Learnings
• Using AVG() to calculate the average transaction amount.
• Window ranking functions order rows by specific criteria (e.g., descending average).
• Handling ties: DENSE_RANK() keeps ranks consecutive after equal values, whereas RANK() skips ranks.
Solutions
• - PostgreSQL solution
WITH user_average AS (
SELECT
user_id,
AVG(amount) AS avg_transaction
FROM transactions
GROUP BY user_id)
SELECT
user_id,
avg_transaction,
DENSE_RANK() OVER (ORDER BY avg_transaction DESC) AS user_rank
FROM user_average
ORDER BY user_rank;
• - MySQL solution
WITH user_average AS (
SELECT
user_id,
AVG(amount) AS avg_transaction
FROM transactions
GROUP BY user_id)
SELECT
user_id,
avg_transaction,
DENSE_RANK() OVER (ORDER BY avg_transaction DESC) AS user_rank
FROM user_average
ORDER BY user_rank;
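The note in the question asks for tied users to share a rank and for the next rank to be consecutive. The sketch below is a minimal example that puts RANK() and DENSE_RANK() side by side on tied averages. It assumes SQLite 3.25 or newer (the first release with window functions) via Python's sqlite3 module; the transactions rows, the column aliases, and the function name ranked_averages are hypothetical.

```python
import sqlite3

# Hypothetical rows: (user_id, amount). Users 1 and 2 both average 150.
TRANSACTIONS = [(1, 100.0), (1, 200.0), (2, 150.0), (3, 50.0)]


def ranked_averages(rows):
    """Return (user_id, avg, rank_with_gaps, dense_rank_value) per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (user_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    result = conn.execute(
        """
        WITH user_average AS (
            SELECT user_id, AVG(amount) AS avg_transaction
            FROM transactions
            GROUP BY user_id
        )
        SELECT user_id,
               avg_transaction,
               RANK() OVER (ORDER BY avg_transaction DESC) AS rank_with_gaps,
               DENSE_RANK() OVER (ORDER BY avg_transaction DESC) AS dense_rank_value
        FROM user_average
        ORDER BY dense_rank_value, user_id
        """
    ).fetchall()
    conn.close()
    return result
```

For user 3, RANK() yields 3 (a gap after the two-way tie) while DENSE_RANK() yields 2, which is the consecutive numbering the question asks for.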
• Q.487
Question
Unique Money Transfer Relationships
You are given a table of PayPal payments showing the payer, the recipient, and the amount
paid. A two-way unique relationship is established when two people send money back and
forth. Write a query to find the number of two-way unique relationships in this data.
Explanation
• A unique relationship occurs when two people send money to each other, i.e., there is an
inverse pair of transactions.
• Use INTERSECT to find mutual payment pairs where the payer and recipient are reversed
between two records.
• Divide the count by 2 to avoid double-counting each relationship.
Learnings
• Using INTERSECT to find the rows common to two result sets (here, the pairs and their reversed pairs).
• Avoiding double counting by dividing by 2 when counting relationships.
• Identifying unique bidirectional relationships between pairs.
Solutions
• - PostgreSQL solution
SELECT COUNT(payer_id) / 2 AS unique_relationships
FROM (
SELECT payer_id, recipient_id
FROM payments
INTERSECT
SELECT recipient_id, payer_id
FROM payments) AS relationships;
• - MySQL solution
SELECT COUNT(payer_id) DIV 2 AS unique_relationships
FROM (
SELECT DISTINCT payer_id, recipient_id
FROM payments
WHERE (payer_id, recipient_id) IN (
SELECT recipient_id, payer_id
FROM payments)) AS relationships;
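The INTERSECT approach can be tried on a small data set that includes a repeated payment and a one-way payment. The sketch below is a minimal example using SQLite through Python's sqlite3 module (an assumption for self-containment); the payments rows and the function name two_way_relationships are hypothetical.

```python
import sqlite3

# Hypothetical payments: (payer_id, recipient_id, amount).
PAYMENTS = [
    (101, 201, 30.0),
    (201, 101, 10.0),
    (101, 301, 20.0),
    (301, 101, 80.0),
    (201, 301, 70.0),  # never reciprocated
    (101, 201, 5.0),   # repeat of an existing pair
]


def two_way_relationships(rows):
    """Count pairs of users who have each paid the other at least once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments (payer_id INTEGER, recipient_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    (count,) = conn.execute(
        """
        SELECT COUNT(*) / 2
        FROM (
            SELECT payer_id, recipient_id FROM payments
            INTERSECT  -- also removes duplicate pairs
            SELECT recipient_id, payer_id FROM payments
        ) AS relationships
        """
    ).fetchone()
    conn.close()
    return count
```

INTERSECT returns distinct rows, so the repeated payment from 101 to 201 is counted once; each relationship then appears in both directions, which is why the count is halved.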
• Q.488
Question
Determining High-Value Customers
You are a data analyst at PayPal, and you have been asked to create a report that identifies all
users who have sent payments of more than 1000 or have received payments of more than
5000 in the last month. Additionally, you must filter out any users whose account is flagged
as "fraudulent".
Explanation
• Join the Transactions table with the User table on user_id.
• Filter for transactions within the last month using the transaction_date.
• Filter for users who sent payments greater than 1000 or received payments greater than
5000.
• Exclude users marked as "fraudulent".
• Group by user_id and username to avoid duplicate records.
Learnings
• Using JOIN to combine data from multiple tables based on a common column (user_id).
• Filtering data based on date ranges using CURRENT_DATE - INTERVAL.
• Applying multiple conditions with AND and OR for complex filtering.
• Excluding specific users using boolean flags.
Solutions
• - PostgreSQL solution
SELECT u.user_id, u.username
FROM Transactions t
JOIN "User" u ON t.user_id = u.user_id -- USER is a reserved word in PostgreSQL, so the table name is double-quoted
WHERE t.transaction_date > (CURRENT_DATE - INTERVAL '1 month')
AND ((t.transaction_type = 'Sent' AND t.amount > 1000)
OR (t.transaction_type = 'Received' AND t.amount > 5000))
AND u.is_fraudulent = false
GROUP BY u.user_id, u.username;
• - MySQL solution
SELECT u.user_id, u.username
FROM Transactions t
JOIN User u ON t.user_id = u.user_id
WHERE t.transaction_date > (CURDATE() - INTERVAL 1 MONTH)
AND ((t.transaction_type = 'Sent' AND t.amount > 1000)
OR (t.transaction_type = 'Received' AND t.amount > 5000))
AND u.is_fraudulent = false
GROUP BY u.user_id, u.username;
• Q.489
Question
Calculate Click-Through Conversion Rate For PayPal
Given a hypothetical situation where PayPal runs several online marketing campaigns, they
want to monitor the click-through conversion rate for their campaigns. The click-through
conversion rate is the number of users who click on the advertisement and proceed to add a
product (in this case, setting up a new PayPal account) divided by the total number of users
who have clicked the ad.
Calculate the daily click-through conversion rate for the first week of September 2022.
Explanation
• Use the ad_clicks table to identify users who clicked on ads.
• Use the account_setup table to track users who successfully set up accounts after clicking
on ads.
• Calculate the daily click-through conversion rate as:
\[ \text{Click-through conversion rate} = \frac{\text{Total users who set up accounts}}{\text{Total users who clicked the ad}} \]
• Apply a LEFT JOIN to combine the two tables and filter the data for the first week of
September 2022.
Learnings
• Using LEFT JOIN to combine tables while preserving all records from the ad_clicks
table.
• Using COUNT(DISTINCT ...) to ensure unique user counts for clicks and account setups.
• Filtering by date ranges using DATE() and BETWEEN.
• Calculating ratios for conversion rates.
Solutions
• - PostgreSQL solution
SELECT
DATE(ac.click_time) AS day,
COUNT(DISTINCT ac.user_id) AS total_clicks,
COUNT(DISTINCT acs.user_id) AS total_setups,
COUNT(DISTINCT acs.user_id)::float / COUNT(DISTINCT ac.user_id) AS click_through_conversion_rate
FROM
ad_clicks AS ac
LEFT JOIN
account_setup AS acs ON ac.user_id = acs.user_id -- AS is a reserved word, so the alias is acs
WHERE
DATE(ac.click_time) BETWEEN '2022-09-01' AND '2022-09-07'
GROUP BY
DATE(ac.click_time)
ORDER BY
day;
• - MySQL solution
SELECT
DATE(ac.click_time) AS day,
COUNT(DISTINCT ac.user_id) AS total_clicks,
COUNT(DISTINCT acs.user_id) AS total_setups,
COUNT(DISTINCT acs.user_id) / COUNT(DISTINCT ac.user_id) AS click_through_conversion_rate
FROM
ad_clicks AS ac
LEFT JOIN
account_setup AS acs ON ac.user_id = acs.user_id -- AS is a reserved word, so the alias is acs
WHERE
DATE(ac.click_time) BETWEEN '2022-09-01' AND '2022-09-07'
GROUP BY
DATE(ac.click_time)
ORDER BY
day;
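The daily ratio can be tried on a few rows. The sketch below is a minimal example using SQLite through Python's sqlite3 module (an assumption for self-containment). The table schemas are not shown in the text, so only the user_id and click_time columns used by the query are modelled; the rows, the alias acs, and the function name daily_conversion are hypothetical.

```python
import sqlite3

# Hypothetical clicks: (user_id, click_time).
AD_CLICKS = [
    (1, "2022-09-01 10:00:00"),
    (2, "2022-09-01 11:00:00"),
    (3, "2022-09-02 09:00:00"),
    (4, "2022-09-10 09:00:00"),  # outside the first week, filtered out
]
# Hypothetical users who completed account setup.
ACCOUNT_SETUP = [(1,), (3,)]


def daily_conversion():
    """Return (day, total_clicks, total_setups, rate) for 1-7 September 2022."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ad_clicks (user_id INTEGER, click_time TEXT)")
    conn.execute("CREATE TABLE account_setup (user_id INTEGER)")
    conn.executemany("INSERT INTO ad_clicks VALUES (?, ?)", AD_CLICKS)
    conn.executemany("INSERT INTO account_setup VALUES (?)", ACCOUNT_SETUP)
    rows = conn.execute(
        """
        SELECT date(ac.click_time) AS day,
               COUNT(DISTINCT ac.user_id) AS total_clicks,
               COUNT(DISTINCT acs.user_id) AS total_setups,
               COUNT(DISTINCT acs.user_id) * 1.0
                   / COUNT(DISTINCT ac.user_id) AS click_through_conversion_rate
        FROM ad_clicks AS ac
        LEFT JOIN account_setup AS acs ON ac.user_id = acs.user_id
        WHERE date(ac.click_time) BETWEEN '2022-09-01' AND '2022-09-07'
        GROUP BY date(ac.click_time)
        ORDER BY day
        """
    ).fetchall()
    conn.close()
    return rows
```

The LEFT JOIN keeps clicks without a setup (user 2), whose acs.user_id is NULL and is therefore ignored by COUNT(DISTINCT acs.user_id) but still counted as a click.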
• Q.490
Calculate the Total Revenue per User
You are working as a data analyst at PayPal and have been tasked with calculating the total
revenue generated by each user. The transactions table contains details of all transactions,
and you need to calculate the total revenue per user, summing the amount of all their
transactions.
Explanation
• Use the SUM() function to calculate the total revenue for each user.
• Group the results by user_id to calculate the sum for each individual user.
Learnings
• Using SUM() to calculate the total revenue.
• Grouping results by user_id to aggregate transaction amounts for each user.
• Working with date and amount data types.
Solutions
• - PostgreSQL solution
SELECT
user_id,
SUM(amount) AS total_revenue
FROM
transactions
GROUP BY
user_id
ORDER BY
total_revenue DESC;
• - MySQL solution
SELECT
user_id,
SUM(amount) AS total_revenue
FROM
transactions
GROUP BY
user_id
ORDER BY
total_revenue DESC;
• Q.491
You have been asked to identify the top 3 users who have spent the most money on PayPal
transactions over the last month. The transactions table contains the details of all
transactions made by users. Calculate the total amount spent by each user and list the top 3
users.
Explanation
• Sum the transaction amounts for each user to calculate the total spent.
• Sort the results by total amount in descending order.
• Use LIMIT to return only the top 3 users.
Learnings
• Using SUM() to calculate the total amount spent.
• Sorting results using ORDER BY in descending order.
• Using LIMIT to restrict the output to top N records.
Solutions
• - PostgreSQL solution
SELECT
user_id,
SUM(amount) AS total_spent
FROM
transactions
WHERE
transaction_date > CURRENT_DATE - INTERVAL '1 month'
GROUP BY
user_id
ORDER BY
total_spent DESC
LIMIT 3;
• - MySQL solution
SELECT
user_id,
SUM(amount) AS total_spent
FROM
transactions
WHERE
transaction_date > CURDATE() - INTERVAL 1 MONTH
GROUP BY
user_id
ORDER BY
total_spent DESC
LIMIT 3;
• Q.492
Identify Users with Inactive Accounts (No Transactions in 3 Months)
You need to identify users who have not made any transactions in the last 3 months. The
transactions table tracks all user transactions, and you need to find those who have been
inactive for 3 months or longer.
Explanation
• Use the MAX() function to find the most recent transaction date for each user.
• Filter users whose most recent transaction is older than 3 months.
Learnings
• Using MAX() to find the latest transaction date.
• Filtering data based on time intervals using CURRENT_DATE and INTERVAL.
• Identifying inactive users based on transaction history.
Solutions
• - PostgreSQL solution
SELECT
user_id
FROM
transactions
GROUP BY
user_id
HAVING
MAX(transaction_date) < CURRENT_DATE - INTERVAL '3 months';
• - MySQL solution
SELECT
user_id
FROM
transactions
GROUP BY
user_id
HAVING
MAX(transaction_date) < CURDATE() - INTERVAL 3 MONTH;
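The MAX() and HAVING combination can be checked with a fixed reference date in place of the current date. The sketch below is a minimal example using SQLite through Python's sqlite3 module (an assumption for self-containment; SQLite writes the cutoff as date(x, '-3 months')). The rows and the function name inactive_users are hypothetical.

```python
import sqlite3

# Hypothetical rows: (user_id, transaction_date).
TRANSACTIONS = [
    (1, "2024-01-05"),
    (1, "2024-05-20"),
    (2, "2024-01-15"),
    (3, "2024-02-28"),
    (3, "2024-03-01"),
]


def inactive_users(rows, today):
    """Return users whose latest transaction is more than 3 months before `today`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (user_id INTEGER, transaction_date TEXT)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT user_id
        FROM transactions
        GROUP BY user_id
        HAVING MAX(transaction_date) < date(?, '-3 months')
        ORDER BY user_id
        """,
        (today,),
    ).fetchall()
    conn.close()
    return [user_id for (user_id,) in result]
```

With "2024-06-01" as the reference date the cutoff is 2024-03-01, so only user 2 is reported; user 3's last transaction falls exactly on the cutoff and is not strictly older. Users with no rows in transactions at all never appear in this result.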
• Q.493
Question
Identify the Highest Revenue-Generating Products of PayPal
As a data analyst at PayPal, your task is to identify the products which generate the highest
total revenue for each month. Assume that each transaction on PayPal relates to a product
purchased, and the revenue generated is the transaction amount. Each transaction is
timestamped, and the product ID is also recorded.
Explanation
• Use EXTRACT(MONTH FROM transaction_date) to extract the month from the transaction
date.
• Calculate the total revenue for each product by summing the transaction_amount.
• Group by month and product_id to aggregate the total revenue for each product per
month.
• Sort the result by total_revenue in descending order to identify the highest revenue-
generating products.
Learnings
• Using EXTRACT() to extract parts of a date, such as the month.
• Using SUM() to aggregate transaction amounts for each product.
• Grouping data by multiple fields (month and product).
• Sorting results in descending order to identify top-performing products.
Solutions
• - PostgreSQL solution
SELECT
EXTRACT(MONTH FROM transaction_date) AS month,
product_id AS product,
SUM(transaction_amount) AS total_revenue
FROM
transactions
GROUP BY
Explanation
• Use the SUM() function to calculate the total transaction amount for each user.
• Use the AVG() function to calculate the average transaction amount for each user.
• Use COUNT() in the HAVING clause to filter users who have made at least two transactions.
• Group the results by user_id to aggregate the data for each user.
Learnings
• Using SUM() and AVG() to calculate total and average values.
• Filtering results with HAVING to ensure only users with at least two transactions are
included.
• Grouping data by user_id for aggregation.
Solutions
• - PostgreSQL solution
SELECT
t.user_id,
SUM(t.transaction_amount) AS total_amount,
AVG(t.transaction_amount) AS average_amount
FROM
Transactions t
GROUP BY
t.user_id
HAVING
COUNT(t.transaction_id) >= 2;
• - MySQL solution
SELECT
t.user_id,
SUM(t.transaction_amount) AS total_amount,
AVG(t.transaction_amount) AS average_amount
FROM
Transactions t
GROUP BY
t.user_id
HAVING
COUNT(t.transaction_id) >= 2;
• Q.495
Question
Finding Unused Coupons:
Using a table of transactions (transaction_id, user_id, coupon_code, amount) and a table of
coupons (coupon_code, discount_amount, expiration_date), write a query to identify the
coupons that have been created but have never been used in a transaction.
Explanation
• The goal is to find coupons that are in the coupons table but have not appeared in the
transactions table.
• Use a LEFT JOIN to combine the two tables and identify coupons without matching entries
in the transactions table (i.e., where the coupon_code is NULL in the transactions table).
• Only include coupons that exist in the coupons table but not in any transaction.
Learnings
• Using LEFT JOIN to identify records with no matching data from the right table.
• Filtering for NULL values in a LEFT JOIN to find unused coupons.
• Identifying records that exist in one table but not the other.
Solutions
• - PostgreSQL solution
SELECT
c.coupon_code
FROM
coupons c
LEFT JOIN
transactions t ON c.coupon_code = t.coupon_code
WHERE
t.transaction_id IS NULL;
• - MySQL solution
SELECT
c.coupon_code
FROM
coupons c
LEFT JOIN
transactions t ON c.coupon_code = t.coupon_code
WHERE
t.transaction_id IS NULL;
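The anti-join pattern can be tried on a few rows. The sketch below is a minimal example using SQLite through Python's sqlite3 module (an assumption for self-containment). The column names follow the question; the rows, the ORDER BY clause, and the function name unused_coupons are hypothetical.

```python
import sqlite3

# Hypothetical coupons: (coupon_code, discount_amount, expiration_date).
COUPONS = [
    ("SAVE10", 10.0, "2024-12-31"),
    ("SAVE20", 20.0, "2024-12-31"),
    ("FREESHIP", 5.0, "2024-06-30"),
]
# Hypothetical transactions: (transaction_id, user_id, coupon_code, amount).
TRANSACTIONS = [
    (1, 101, "SAVE10", 50.0),
    (2, 102, None, 30.0),  # paid without a coupon
    (3, 103, "SAVE10", 20.0),
]


def unused_coupons():
    """Return coupon codes that never appear in any transaction."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE coupons (coupon_code TEXT PRIMARY KEY, "
        "discount_amount REAL, expiration_date TEXT)"
    )
    conn.execute(
        "CREATE TABLE transactions (transaction_id INTEGER PRIMARY KEY, "
        "user_id INTEGER, coupon_code TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO coupons VALUES (?, ?, ?)", COUPONS)
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", TRANSACTIONS)
    rows = conn.execute(
        """
        SELECT c.coupon_code
        FROM coupons c
        LEFT JOIN transactions t ON c.coupon_code = t.coupon_code
        WHERE t.transaction_id IS NULL  -- no matching transaction row
        ORDER BY c.coupon_code
        """
    ).fetchall()
    conn.close()
    return [code for (code,) in rows]
```

Testing t.transaction_id rather than t.coupon_code for NULL is a deliberate choice: the primary key of the right-hand table can only be NULL when the LEFT JOIN found no match.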
• Q.496
Question
Analyzing Payment Failures:
Using a table of payments (payment_id, user_id, payment_method, amount, payment_status,
payment_date), write a query to calculate the failure rate of payments, broken down by
payment method (e.g., PayPal, credit card), for the last month. payment_status can be
'success', 'failed', or 'pending'.
Explanation
• The failure rate is calculated as the number of failed payments divided by the total number
of payments for each payment method.
• We will filter the payments for the last month using the payment_date field.
• Use COUNT() to count the total and failed payments, and then calculate the failure rate by
dividing the count of failed payments by the total count of payments for each method.
• Group the results by payment_method to break down the failure rate by method.
Learnings
• Using COUNT() for counting occurrences of specific conditions (e.g., success and failed
payments).
• Filtering results by date ranges using WHERE and CURRENT_DATE - INTERVAL '1 month'.
• Grouping results by payment_method to calculate the failure rate for each method.
Solutions
• - PostgreSQL solution
SELECT
payment_method,
COUNT(CASE WHEN payment_status = 'failed' THEN 1 END) * 1.0 / COUNT(payment_id) AS failure_rate
FROM
payments
WHERE
payment_date > CURRENT_DATE - INTERVAL '1 month'
GROUP BY
payment_method;
• - MySQL solution
SELECT
payment_method,
COUNT(CASE WHEN payment_status = 'failed' THEN 1 END) * 1.0 / COUNT(payment_id) AS failure_rate
FROM
payments
WHERE
payment_date > CURDATE() - INTERVAL 1 MONTH
GROUP BY
payment_method;
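The failure-rate calculation can be checked with a fixed reference date in place of the current date. The sketch below is a minimal example using SQLite through Python's sqlite3 module (an assumption for self-containment; SQLite writes the cutoff as date(x, '-1 month')). The column names follow the question; the rows, the ORDER BY clause, and the function name failure_rates are hypothetical.

```python
import sqlite3

# Hypothetical payments:
# (payment_id, user_id, payment_method, amount, payment_status, payment_date).
PAYMENTS = [
    (1, 101, "PayPal", 10.0, "success", "2024-06-05"),
    (2, 102, "PayPal", 20.0, "failed", "2024-06-10"),
    (3, 103, "credit card", 30.0, "failed", "2024-06-12"),
    (4, 104, "credit card", 40.0, "success", "2024-06-15"),
    (5, 105, "credit card", 50.0, "pending", "2024-06-20"),
    (6, 106, "credit card", 60.0, "success", "2024-06-25"),
    (7, 107, "PayPal", 70.0, "failed", "2024-04-01"),  # older than one month
]


def failure_rates(rows, today):
    """Return (payment_method, failure_rate) for the month ending at `today`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments (payment_id INTEGER PRIMARY KEY, user_id INTEGER, "
        "payment_method TEXT, amount REAL, payment_status TEXT, payment_date TEXT)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT payment_method,
               -- CASE without ELSE yields NULL, which COUNT() skips
               COUNT(CASE WHEN payment_status = 'failed' THEN 1 END) * 1.0
                   / COUNT(payment_id) AS failure_rate
        FROM payments
        WHERE payment_date > date(?, '-1 month')
        GROUP BY payment_method
        ORDER BY payment_method
        """,
        (today,),
    ).fetchall()
    conn.close()
    return result
```

With "2024-06-30" as the reference date, PayPal has one failure in two payments and credit card one in four; pending payments count toward the denominator because the rate is defined over all payments.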
• Q.497
Question
Identifying Users Who Have Never Made a Payment:
Given a table of users (user_id, registration_date) and a table of transactions (transaction_id,
user_id, amount, transaction_date), write a query to find all users who have registered but
have never made a payment.
Explanation
• To find users who have never made a payment, we need to identify users in the users table
who do not have a corresponding record in the transactions table.
• We can achieve this by using a LEFT JOIN between the users table and the transactions
table, and then filtering for rows where no transaction exists (i.e., the transaction_id is
NULL).
• This will give us a list of users who have registered but have not made any payments.
Learnings
• Using LEFT JOIN to include all records from the users table, and matching records from
the transactions table.
• Filtering results with WHERE transaction_id IS NULL to identify users with no
transactions.
• Understanding how LEFT JOIN works when looking for non-matching records.
Solutions
• - PostgreSQL solution
SELECT
u.user_id
FROM
users u
LEFT JOIN
transactions t ON u.user_id = t.user_id
WHERE
t.transaction_id IS NULL;
• - MySQL solution
SELECT
u.user_id
FROM
users u
LEFT JOIN
transactions t ON u.user_id = t.user_id
WHERE
t.transaction_id IS NULL;
• Q.498
Question
Detecting Fraudulent Transactions:
Using a table of transactions (transaction_id, user_id, amount, transaction_date), write a
query to detect users who have made more than 5 transactions in a single day, each exceeding
$500.
Explanation
• To detect potentially fraudulent transactions, we need to identify users who have made
more than 5 transactions on the same day where each transaction exceeds $500.
• Use COUNT() to count the number of transactions for each user per day where the
transaction amount is greater than $500.
• Use GROUP BY to group the data by user_id and transaction_date.
• Filter the results with a HAVING clause to include only users who made more than 5
transactions on the same day.
Learnings
• Using COUNT() and HAVING to filter groups based on the count of qualifying records (in
this case, transactions greater than $500).
• Grouping data by both user_id and transaction_date to analyze transactions by day for
each user.
• Understanding how to filter for users who meet specific conditions (e.g., more than 5
qualifying transactions).
Solutions
• - PostgreSQL solution
SELECT
user_id,
transaction_date,
COUNT(transaction_id) AS num_transactions
FROM
transactions
WHERE
amount > 500
GROUP BY
user_id, transaction_date
HAVING
COUNT(transaction_id) > 5;
• - MySQL solution
SELECT
user_id,
transaction_date,
COUNT(transaction_id) AS num_transactions
FROM
transactions
WHERE
amount > 500
GROUP BY
user_id, transaction_date
HAVING
COUNT(transaction_id) > 5;
• Q.500
Question
Detecting Payment Spikes
Given a table of payments (payment_id, user_id, payment_method, amount, payment_date),
write a query to detect users who have made more than 3 payments in a single day with a
cumulative amount greater than $10,000.
Explanation
• The goal is to identify users who made multiple payments in a single day, and the total
amount of those payments exceeds $10,000.
• Use SUM() to calculate the total payment amount for each user per day.
• Use COUNT() to ensure that more than 3 payments are made.
• Group by user_id and payment_date and filter for groups where the total payment
amount exceeds $10,000 and the count of payments exceeds 3.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE payments (
payment_id INT,
user_id INT,
payment_method VARCHAR(50),
amount DECIMAL(10, 2),
payment_date DATE
);
• - Datasets
INSERT INTO payments (payment_id, user_id, payment_method, amount, payment_date)
VALUES
(1, 101, 'credit card', 3000.00, '2022-08-01'),
(2, 101, 'PayPal', 4000.00, '2022-08-01'),
(3, 101, 'credit card', 5000.00, '2022-08-01'),
(4, 101, 'PayPal', 2000.00, '2022-08-01'),
(5, 102, 'credit card', 1000.00, '2022-08-01'),
(6, 103, 'credit card', 15000.00, '2022-08-01'),
(7, 103, 'PayPal', 2000.00, '2022-08-02');
Learnings
• Using SUM() to calculate the total amount of payments.
• Using COUNT() to filter for users who made more than 3 payments.
• Grouping by both user_id and payment_date to aggregate payments by day.
Solutions
• - PostgreSQL solution
SELECT
user_id,
payment_date,
COUNT(payment_id) AS num_payments,
SUM(amount) AS total_amount
FROM
payments
GROUP BY
user_id, payment_date
HAVING
COUNT(payment_id) > 3 AND SUM(amount) > 10000;
• - MySQL solution
SELECT
user_id,
payment_date,
COUNT(payment_id) AS num_payments,
SUM(amount) AS total_amount
FROM
payments
GROUP BY
user_id, payment_date
HAVING
COUNT(payment_id) > 3 AND SUM(amount) > 10000;
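To check the query against the dataset above, here is a minimal sketch that loads the same rows into an in-memory SQLite database. SQLite and the simplified column types are assumptions made only so the example runs anywhere; the SQL itself is the same as in the solutions.

```python
import sqlite3

# SQLite stands in for PostgreSQL/MySQL here; column types are simplified.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (payment_id INTEGER, user_id INTEGER, "
    "payment_method TEXT, amount REAL, payment_date TEXT)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?, ?)",
    [
        (1, 101, "credit card", 3000.00, "2022-08-01"),
        (2, 101, "PayPal", 4000.00, "2022-08-01"),
        (3, 101, "credit card", 5000.00, "2022-08-01"),
        (4, 101, "PayPal", 2000.00, "2022-08-01"),
        (5, 102, "credit card", 1000.00, "2022-08-01"),
        (6, 103, "credit card", 15000.00, "2022-08-01"),
        (7, 103, "PayPal", 2000.00, "2022-08-02"),
    ],
)

# Both HAVING conditions must hold for the same (user, day) group.
spikes = conn.execute(
    """
    SELECT user_id, payment_date, COUNT(payment_id) AS num_payments,
           SUM(amount) AS total_amount
    FROM payments
    GROUP BY user_id, payment_date
    HAVING COUNT(payment_id) > 3 AND SUM(amount) > 10000
    """
).fetchall()
print(spikes)
```

Only user 101 qualifies: four payments on 2022-08-01 totalling 14,000. User 103 paid 15,000 that day, but in a single payment, so the count condition filters that group out.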
PwC
• Q.501
Question
Given a table of products where each row indicates a price change for a product on a specific
date, write a SQL query to find the prices of all products on 2019-08-16. Assume that the
price of all products before any change is 10.
Explanation
• We need to track the price changes for each product and determine the price on the specific
date 2019-08-16.
• If a product had multiple price changes on or before 2019-08-16, the most recent of those
changes determines the price.
• If a product had no price change on or before that date, it should have the default price of 10.
• The task requires filtering the products based on the change date and joining the records to
get the price for the given date.
Learnings
• Using LEFT JOIN to find the most recent price change on or before a specific date.
• Applying COALESCE() to handle products that have no price change by using the default
price (10).
• Using MAX(change_date) to pick out the most recent change for each product.
Solutions
• - PostgreSQL solution
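SELECT
p.product_id,
COALESCE(pr.new_price, 10) AS price_on_2019_08_16
FROM
(SELECT DISTINCT product_id FROM Products) p
LEFT JOIN
Products pr ON p.product_id = pr.product_id
-- keep only the latest change on or before the target date
AND pr.change_date = (
SELECT MAX(latest.change_date)
FROM Products latest
WHERE latest.product_id = p.product_id
AND latest.change_date <= '2019-08-16'
);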
• - MySQL solution
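SELECT
p.product_id,
COALESCE(pr.new_price, 10) AS price_on_2019_08_16
FROM
(SELECT DISTINCT product_id FROM Products) p
LEFT JOIN
Products pr ON p.product_id = pr.product_id
-- keep only the latest change on or before the target date
AND pr.change_date = (
SELECT MAX(latest.change_date)
FROM Products latest
WHERE latest.product_id = p.product_id
AND latest.change_date <= '2019-08-16'
);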
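The price on a given date is the price set by the latest change on or before that date, which is not always the highest price seen so far. The sketch below contrasts the two on a few hypothetical rows in an in-memory SQLite database; the sample data is invented for illustration, and each product is assumed to have at most one change per date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (product_id INTEGER, new_price INTEGER, change_date TEXT)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [
        (1, 30, "2019-08-14"),
        (1, 25, "2019-08-16"),  # price drop: the latest price is not the highest
        (2, 50, "2019-08-14"),
        (2, 65, "2019-08-17"),  # after the target date, must be ignored
        (3, 20, "2019-08-18"),  # no change yet on the target date -> default 10
    ],
)

# Price set by the most recent change on or before the target date.
latest = conn.execute(
    """
    SELECT p.product_id, COALESCE(pr.new_price, 10) AS price
    FROM (SELECT DISTINCT product_id FROM Products) p
    LEFT JOIN Products pr
      ON pr.product_id = p.product_id
     AND pr.change_date = (
         SELECT MAX(l.change_date) FROM Products l
         WHERE l.product_id = p.product_id AND l.change_date <= '2019-08-16'
     )
    ORDER BY p.product_id
    """
).fetchall()

# Highest price among changes on or before the target date (a different question).
highest = conn.execute(
    """
    SELECT p.product_id, COALESCE(MAX(pr.new_price), 10) AS price
    FROM (SELECT DISTINCT product_id FROM Products) p
    LEFT JOIN Products pr
      ON pr.product_id = p.product_id AND pr.change_date <= '2019-08-16'
    GROUP BY p.product_id
    ORDER BY p.product_id
    """
).fetchall()

print(latest)
print(highest)
```

The two results differ only for product 1, whose price dropped from 30 to 25: taking MAX(new_price) reports 30, while the latest change reports 25.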
• Q.502
Question
Given a table Queue where each row represents a person waiting to board a bus, and the bus
has a weight limit of 1000 kilograms, write a SQL query to find the name of the last person
that can board the bus without exceeding the weight limit. The turn column determines the
boarding order, and the weight column contains the weight of each person. The test cases are
generated such that the first person does not exceed the weight limit.
Explanation
• We need to iterate through the people in the queue and calculate the cumulative weight as
each person boards the bus.
• We stop once adding the next person would exceed the bus weight limit of 1000 kilograms.
• We must identify the last person who can board the bus without exceeding the weight
limit.
• This requires keeping track of the cumulative weight and finding the last valid person in
the sequence.
Learnings
• Using SUM() with OVER() to calculate the cumulative sum of weights for each person.
• Filtering based on the cumulative sum to ensure that we don't exceed the weight limit of
1000 kg.
• Identifying the last person who can board using window functions or conditional
aggregation.
Solutions
• - PostgreSQL solution
WITH Cumulative_Weight AS (
SELECT
person_name,
weight,
turn,
SUM(weight) OVER (ORDER BY turn) AS total_weight
FROM Queue
)
SELECT
person_name
FROM
Cumulative_Weight
WHERE
total_weight <= 1000
ORDER BY
turn DESC
LIMIT 1;
• - MySQL solution
WITH Cumulative_Weight AS (
SELECT
person_name,
weight,
turn,
SUM(weight) OVER (ORDER BY turn) AS total_weight
FROM Queue
)
SELECT
person_name
FROM
Cumulative_Weight
WHERE
total_weight <= 1000
ORDER BY
turn DESC
LIMIT 1;
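The running total can be checked on a small queue. This is a sketch on hypothetical names and weights, using an in-memory SQLite database (window functions need SQLite 3.25 or newer, which is an assumption about the local installation).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Queue (person_name TEXT, weight INTEGER, turn INTEGER)")
# Rows are inserted out of order on purpose: only the turn column defines boarding order.
conn.executemany(
    "INSERT INTO Queue VALUES (?, ?, ?)",
    [("Dee", 200, 4), ("Ana", 250, 1), ("Eli", 175, 5), ("Cy", 400, 3), ("Ben", 350, 2)],
)

last_person = conn.execute(
    """
    WITH Cumulative_Weight AS (
        SELECT person_name, turn,
               SUM(weight) OVER (ORDER BY turn) AS total_weight
        FROM Queue
    )
    SELECT person_name
    FROM Cumulative_Weight
    WHERE total_weight <= 1000
    ORDER BY turn DESC
    LIMIT 1
    """
).fetchone()[0]
print(last_person)
```

The cumulative weights by turn are 250, 600, 1000, 1200, and 1375, so the third person is the last one who fits within the 1000 kg limit.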
• Q.503
Question
Given a table Accounts with the columns account_id and income, write a SQL query to
calculate the number of bank accounts for each salary category. The salary categories are as
follows:
• "Low Salary": All salaries strictly less than $20,000.
• "Average Salary": All salaries in the inclusive range [$20,000, $50,000].
• "High Salary": All salaries strictly greater than $50,000.
The result table must contain all three categories, even if some categories have zero accounts.
Explanation
• We need to categorize the bank accounts into three salary ranges.
• We count how many accounts fall into each category and return the count for each.
• If a category has no accounts, we should still return 0 for that category.
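• This problem can be solved with conditional aggregation or, as in the solutions here, with
one COUNT(*) query per category combined with UNION ALL. An aggregate query without
GROUP BY always returns exactly one row, so an empty category still reports 0.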
Learnings
Solutions
• - PostgreSQL solution
SELECT 'Low Salary' AS category, COUNT(*) AS accounts_count
FROM Accounts
WHERE income < 20000
UNION ALL
SELECT 'Average Salary' AS category, COUNT(*) AS accounts_count
FROM Accounts
WHERE income BETWEEN 20000 AND 50000
UNION ALL
SELECT 'High Salary' AS category, COUNT(*) AS accounts_count
FROM Accounts
WHERE income > 50000;
• - MySQL solution
SELECT 'Low Salary' AS category, COUNT(*) AS accounts_count
FROM Accounts
WHERE income < 20000
UNION ALL
SELECT 'Average Salary' AS category, COUNT(*) AS accounts_count
FROM Accounts
WHERE income BETWEEN 20000 AND 50000
UNION ALL
SELECT 'High Salary' AS category, COUNT(*) AS accounts_count
FROM Accounts
WHERE income > 50000;
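A quick way to see why every category appears, even with zero accounts, is to run the query on data that leaves one range empty. The accounts below are hypothetical, and SQLite is used only as a convenient stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (account_id INTEGER, income INTEGER)")
# No account falls in the [20000, 50000] range.
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?)",
    [(1, 12000), (2, 8000), (3, 75000), (4, 120000)],
)

rows = conn.execute(
    """
    SELECT 'Low Salary' AS category, COUNT(*) AS accounts_count
    FROM Accounts WHERE income < 20000
    UNION ALL
    SELECT 'Average Salary', COUNT(*)
    FROM Accounts WHERE income BETWEEN 20000 AND 50000
    UNION ALL
    SELECT 'High Salary', COUNT(*)
    FROM Accounts WHERE income > 50000
    """
).fetchall()
counts = dict(rows)
print(counts)
```

Each branch is an aggregate without GROUP BY, so it yields one row even when its WHERE clause matches nothing; that is what produces the 0 for "Average Salary".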
• Q.504
Question
Given a table employee with the columns employee_id, first_name, last_name,
department, and salary, write a PostgreSQL query to rank employees within their
respective departments based on their salary, in descending order. The employee with the
highest salary in a department should have a rank of 1.
Explanation
• We need to rank employees within each department based on their salary, where the
highest salary in each department should get rank 1.
• This can be accomplished using the RANK() window function, partitioned by department
and ordered by salary in descending order.
• Employees within the same department who have the same salary should receive the same
rank.
);
• - Datasets
INSERT INTO employee (employee_id, first_name, last_name, department, salary)
VALUES
(101, 'John', 'Doe', 'IT', 50000),
(102, 'Jane', 'Smith', 'Accounting', 60000),
(103, 'Mary', 'Johnson', 'Marketing', 55000),
(104, 'James', 'Brown', 'IT', 70000),
(105, 'Patricia', 'Jones', 'Accounting', 65000);
Learnings
• Using the RANK() window function to assign ranks based on sorting criteria.
• Using PARTITION BY to calculate ranks within each department separately.
• Handling ties in salary with RANK(), ensuring employees with equal salaries receive the
same rank.
Solutions
• - PostgreSQL & MySQL solution
SELECT
employee_id,
first_name,
last_name,
department,
salary,
RANK() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS salary_rank -- RANK is a reserved word in MySQL 8, so avoid "rank" as an alias
FROM employee;
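The tie behaviour of RANK() is easiest to see with two equal salaries. The sketch below loads the dataset above into in-memory SQLite (3.25 or newer is assumed for window functions) and adds one hypothetical employee, id 106, who ties for the top IT salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER, first_name TEXT, "
    "last_name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
    [
        (101, "John", "Doe", "IT", 50000),
        (102, "Jane", "Smith", "Accounting", 60000),
        (103, "Mary", "Johnson", "Marketing", 55000),
        (104, "James", "Brown", "IT", 70000),
        (105, "Patricia", "Jones", "Accounting", 65000),
        (106, "Sam", "Lee", "IT", 70000),  # hypothetical row, added to create a tie
    ],
)

ranked = conn.execute(
    """
    SELECT department, first_name, salary,
           RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
    FROM employee
    ORDER BY department, salary_rank, employee_id
    """
).fetchall()
for row in ranked:
    print(row)
```

James and Sam share rank 1 in IT, and John gets rank 3 rather than 2, because RANK() leaves a gap after ties. DENSE_RANK() would give John rank 2 instead.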
• Q.505
Question
Given a table of PwC employee salary information, write a SQL query to find the top 3
highest-paid employees in each department.
Explanation
• We need to rank employees within each department based on their salary, selecting the top
3 highest-paid employees for each department.
• This can be achieved using the ROW_NUMBER() window function, which will assign a
unique rank to each employee based on their salary within each department.
• By filtering out employees with a rank greater than 3, we can retrieve only the top 3
employees for each department.
Learnings
• Using the ROW_NUMBER() window function to rank employees within each department
based on salary.
• Using PARTITION BY to create ranks within each department.
• Filtering results to show only the top 3 employees by using the rank condition.
Solutions
• - PostgreSQL solution
WITH RankedEmployees AS (
SELECT
e.employee_id,
e.name,
e.salary,
e.department_id,
ROW_NUMBER() OVER (
PARTITION BY e.department_id
ORDER BY e.salary DESC
) AS rank
FROM employee e
)
SELECT
re.employee_id,
re.name,
re.salary,
d.department_name
FROM RankedEmployees re
JOIN department d ON re.department_id = d.department_id
WHERE re.rank <= 3
ORDER BY re.department_id, re.rank;
• Q.506
Question
How can you determine which records in one table are not present in another table?
Explanation
To find records in one table that are not present in another, there are a few SQL methods you
can use:
• Using LEFT JOIN: You can perform a LEFT JOIN between the two tables and check for
NULL values in the right-side table. If a record in the left table does not have a match in the
right table, it will show NULL for the right-side columns.
• Using EXCEPT: The EXCEPT operator returns the rows that are in the first query result but
not in the second. It is supported by PostgreSQL, SQL Server, SQLite, and MySQL from
version 8.0.31; Oracle provides the equivalent MINUS operator.
Learnings
• Using LEFT JOIN to find unmatched records from one table.
• Using EXCEPT to return rows that exist in one result set but not the other.
• Understanding that LEFT JOIN requires checking for NULL values to identify missing
records.
• EXCEPT is a set operation that compares two result sets directly and removes duplicate rows
from its output.
Solutions
• - Using LEFT JOIN (Works in most databases)
SELECT *
FROM pwc_employees e
LEFT JOIN pwc_managers m
ON e.id = m.id
WHERE m.id IS NULL;
This query will return all records from the pwc_employees table where there is no matching
record in the pwc_managers table based on the id column.
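Both methods described above can be compared side by side. This is a minimal sketch on hypothetical rows, run in in-memory SQLite (which supports EXCEPT); the name column is an assumption, since only id appears in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pwc_employees (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE pwc_managers (id INTEGER)")
conn.executemany(
    "INSERT INTO pwc_employees VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben"), (3, "Cy"), (4, "Dee")],
)
conn.executemany("INSERT INTO pwc_managers VALUES (?)", [(2,), (4,)])

# Anti-join: keep left rows whose right side came back NULL.
via_left_join = [
    row[0]
    for row in conn.execute(
        """
        SELECT e.id
        FROM pwc_employees e
        LEFT JOIN pwc_managers m ON e.id = m.id
        WHERE m.id IS NULL
        ORDER BY e.id
        """
    )
]

# Set difference: both SELECT lists must have the same number of columns.
via_except = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM pwc_employees
        EXCEPT
        SELECT id FROM pwc_managers
        ORDER BY id
        """
    )
]
print(via_left_join, via_except)
```

The LEFT JOIN form can return any columns of the left table, while EXCEPT compares whole rows of the two SELECT lists, so it is usually applied to the key columns only.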
• Q.507
Question
You have two tables: orders (order_id, order_date, customer_id, total_amount) and
payments (payment_id, order_id, payment_date, payment_amount). Write a query to find
orders that have not been fully paid, i.e., where the total_amount in the orders table does
not match the sum of payment_amount in the payments table.
Explanation
• To find orders that have not been fully paid, we need to compare the total_amount from
the orders table with the sum of payment_amount from the payments table.
• We can achieve this by performing a LEFT JOIN from the orders table to the payments table
on order_id, so that orders with no payments at all are kept, and then using a GROUP BY to
aggregate payments by order_id.
• Finally, we keep the orders where the sum of payment_amount (treated as 0 when there are
no payments) does not equal the total_amount.
Learnings
• Using JOIN to combine data from two tables based on a common column (order_id).
• Using SUM() with GROUP BY to aggregate the payment amounts.
• Filtering data to identify where the sum of payments does not match the total order
amount.
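The approach described above can be sketched as follows. This is an illustrative query run in in-memory SQLite on hypothetical rows, not a reproduction of the book's listing; it assumes a LEFT JOIN so that orders with no payments are reported as unpaid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, order_date TEXT, "
    "customer_id INTEGER, total_amount INTEGER)"
)
conn.execute(
    "CREATE TABLE payments (payment_id INTEGER, order_id INTEGER, "
    "payment_date TEXT, payment_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "2024-01-01", 10, 100),  # paid in two instalments
        (2, "2024-01-02", 11, 200),  # partially paid
        (3, "2024-01-03", 12, 75),   # no payments at all
    ],
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?)",
    [(1, 1, "2024-01-05", 60), (2, 1, "2024-01-09", 40), (3, 2, "2024-01-06", 50)],
)

not_fully_paid = conn.execute(
    """
    SELECT o.order_id, o.total_amount,
           COALESCE(SUM(p.payment_amount), 0) AS total_paid
    FROM orders o
    LEFT JOIN payments p ON p.order_id = o.order_id
    GROUP BY o.order_id, o.total_amount
    HAVING COALESCE(SUM(p.payment_amount), 0) <> o.total_amount
    ORDER BY o.order_id
    """
).fetchall()
print(not_fully_paid)
```

With an INNER JOIN, order 3 would disappear from the result because it has no matching payment rows; COALESCE turns its NULL sum into 0 so the comparison still works.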
Solutions
• - PostgreSQL / MySQL Solution
Explanation
To find duplicate orders, we need to identify rows in the orders table where the same
customer has placed the same order for the same product on the same day more than once.
This can be achieved by:
• Grouping the records by customer_id, product_id, and order_date.
• Counting how many times each combination occurs using COUNT().
• Filtering the results to show only those combinations where the count is greater than 1,
indicating a duplicate order.
Learnings
• Using GROUP BY to aggregate data by multiple columns (customer_id, product_id, and
order_date).
• Using HAVING to filter groups based on a condition (e.g., count > 1 for duplicates).
Solutions
• - PostgreSQL / MySQL Solution
SELECT customer_id, product_id, order_date, COUNT(*) AS duplicate_count
FROM orders
GROUP BY customer_id, product_id, order_date
HAVING COUNT(*) > 1;
Explanation
To solve this problem, we need to:
• Join the employees table with the departments table to get the department manager
information.
• Identify the highest-paid employee within each department using the MAX() function along
with GROUP BY.
• Calculate the salary difference between the highest-paid employee and their department
manager.
• Ensure that the employee with the highest salary is associated with their department
manager's salary to compute the difference.
Learnings
• Using JOIN to combine related data from two tables based on common columns
(manager_id).
• Using GROUP BY with aggregation functions like MAX() to find the highest-paid employee
in each department.
• Calculating differences in salary using simple arithmetic.
Solutions
• - PostgreSQL / MySQL Solution
SELECT d.department_name,
e.employee_id AS highest_paid_employee_id,
e.salary AS highest_salary,
m.salary AS manager_salary,
e.salary - m.salary AS salary_difference
FROM departments d
JOIN employees e ON d.manager_id = e.manager_id
JOIN employees m ON d.manager_id = m.employee_id
WHERE e.salary = (
SELECT MAX(salary)
FROM employees
WHERE manager_id = d.manager_id)
ORDER BY d.department_name;
Explanation:
• We perform two JOIN operations:
• The first JOIN connects the departments table with the employees who report to the
department's manager (e.manager_id = d.manager_id).
• The second JOIN brings in the manager's own row (m.employee_id = d.manager_id) so
that the manager's salary is available for the salary difference.
• We use a WHERE clause with a correlated subquery to keep only the highest-paid employee
for each department:
• The subquery SELECT MAX(salary) returns the highest salary among the employees who
report to that department's manager.
• The difference between the highest-paid employee's salary and the manager's salary is
calculated using simple arithmetic: e.salary - m.salary.
• Finally, the results are ordered by the department name.
• Q.510
Question
Given a table of customer purchases (purchase_id, customer_id, purchase_amount,
purchase_date), write a query to rank customers based on their total purchases over the last
6 months. Exclude customers who made fewer than 5 purchases during this period.
Additionally, calculate the rank within each segment of customers who made purchases
above $500.
Explanation
To solve this problem, we need to:
• Filter the purchases made within the last 6 months.
• Count the number of purchases for each customer within this period and exclude customers
with fewer than 5 purchases.
• Calculate the total purchase amount for each customer over the last 6 months.
• Rank the customers based on their total purchase amount, and also rank the customers who
made purchases greater than $500 within their segment.
• Use RANK() or ROW_NUMBER() for the rankings and PARTITION BY to calculate the rank for
customers who made purchases over $500.
Learnings
• Using DATE_SUB() or CURRENT_DATE to filter records within a specific time range (last 6
months).
• Using COUNT() with GROUP BY to filter customers with fewer than 5 purchases.
• Using RANK() to rank customers based on their total purchases.
• Using PARTITION BY to rank within specific segments (those with purchases above $500).
Solutions
• - PostgreSQL / MySQL Solution
WITH filtered_purchases AS (
SELECT
customer_id,
SUM(purchase_amount) AS total_spent,
COUNT(purchase_id) AS num_purchases
FROM
customer_purchases
WHERE
purchase_date >= CURRENT_DATE - INTERVAL '6 months' -- MySQL: CURRENT_DATE - INTERVAL 6 MONTH
GROUP BY
customer_id
HAVING
COUNT(purchase_id) >= 5
),
ranked_customers AS (
SELECT
customer_id,
total_spent,
RANK() OVER (ORDER BY total_spent DESC) AS overall_rank,
RANK() OVER (PARTITION BY CASE WHEN total_spent > 500 THEN 1 ELSE 0 END ORDER BY
total_spent DESC) AS segment_rank
FROM
filtered_purchases
)
SELECT
customer_id,
total_spent,
overall_rank,
segment_rank
FROM
ranked_customers;
Explanation:
• filtered_purchases CTE:
• This Common Table Expression (CTE) filters purchases made in the last 6 months using
the condition purchase_date >= CURRENT_DATE - INTERVAL '6 months'.
• It aggregates the total purchase amount (SUM(purchase_amount)) and counts the number
of purchases (COUNT(purchase_id)) for each customer.
• The HAVING COUNT(purchase_id) >= 5 clause ensures that only customers with 5 or
more purchases are included.
• ranked_customers CTE:
• This CTE uses RANK() to rank the customers based on their total spending (total_spent)
in descending order.
• The RANK() function calculates the overall_rank for all customers.
• For the segment_rank, customers who have spent more than $500 are grouped using the
PARTITION BY clause. This allows ranking customers based on their spending within this
segment, with those who have spent above $500 being ranked separately from others.
• Final SELECT:
• The final query selects the customer_id, total_spent, overall_rank, and
segment_rank for each customer.
• Q.511
Calculate the Average Number of Clients per Consultant
You are given two tables: consultants (consultant_id, consultant_name) and clients
(client_id, client_name, consultant_id). Write a SQL query to calculate the average number
of clients assigned to each consultant.
Explanation
• We need to count the number of clients assigned to each consultant and calculate the
average number of clients for all consultants.
Learnings
• Using COUNT() to count rows within a GROUP BY clause.
• Calculating average values using AVG().
Solutions
• - PostgreSQL / MySQL Solution
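SELECT
AVG(client_count) AS average_clients_per_consultant
FROM (
SELECT
c.consultant_id,
COUNT(cl.client_id) AS client_count
FROM consultants c
-- LEFT JOIN keeps consultants with no clients (counted as 0)
LEFT JOIN clients cl ON cl.consultant_id = c.consultant_id
GROUP BY c.consultant_id
) AS consultant_clients;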
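Whether consultants with no clients count toward the average changes the answer. The sketch below computes both versions in in-memory SQLite on hypothetical rows, where one of three consultants has no clients.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE consultants (consultant_id INTEGER, consultant_name TEXT)")
conn.execute(
    "CREATE TABLE clients (client_id INTEGER, client_name TEXT, consultant_id INTEGER)"
)
conn.executemany(
    "INSERT INTO consultants VALUES (?, ?)", [(1, "Asha"), (2, "Ben"), (3, "Chen")]
)
conn.executemany(
    "INSERT INTO clients VALUES (?, ?, ?)",
    [(1, "Acme", 1), (2, "Bolt", 1), (3, "Core", 2)],
)

# Average over every consultant, including those with zero clients.
avg_all = conn.execute(
    """
    SELECT AVG(client_count) FROM (
        SELECT c.consultant_id, COUNT(cl.client_id) AS client_count
        FROM consultants c
        LEFT JOIN clients cl ON cl.consultant_id = c.consultant_id
        GROUP BY c.consultant_id
    ) AS per_consultant
    """
).fetchone()[0]

# Average over only the consultants that appear in the clients table.
avg_with_clients = conn.execute(
    """
    SELECT AVG(client_count) FROM (
        SELECT consultant_id, COUNT(client_id) AS client_count
        FROM clients
        GROUP BY consultant_id
    ) AS per_consultant
    """
).fetchone()[0]
print(avg_all, avg_with_clients)
```

The first figure divides 3 clients by 3 consultants, the second by the 2 consultants who have at least one client. In an interview it is worth stating which of the two the question intends.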
• Q.512
Find the Department with the Most Clients
You are given two tables: consultants (consultant_id, department_id) and clients
(client_id, consultant_id). Write a query to find the department with the most clients. If there
is a tie, return all departments with the same number of clients.
Explanation
• Join the two tables to get the number of clients for each department.
• Use COUNT() and GROUP BY to calculate the number of clients per department.
• Compare each department's count with the maximum count in a HAVING clause so that all
tied departments are returned; ORDER BY then sorts the output.
Learnings
• Using JOIN to combine data from multiple tables.
• Using COUNT() to aggregate data per group.
• Comparing against the MAX() of the per-department counts to keep every department tied
for the top.
Solutions
• - PostgreSQL / MySQL Solution
SELECT department_id, COUNT(client_id) AS num_clients
FROM clients
JOIN consultants ON clients.consultant_id = consultants.consultant_id
GROUP BY department_id
HAVING COUNT(client_id) = (
SELECT MAX(client_count)
FROM (
SELECT department_id, COUNT(client_id) AS client_count
FROM clients
JOIN consultants ON clients.consultant_id = consultants.consultant_id
GROUP BY department_id
) AS department_counts
)
ORDER BY num_clients DESC;
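To confirm that ties are returned, the query can be run on data where two departments share the highest count. The rows below are hypothetical and SQLite is used only as a stand-in; a department_id tiebreaker is added to ORDER BY so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE consultants (consultant_id INTEGER, department_id INTEGER)")
conn.execute("CREATE TABLE clients (client_id INTEGER, consultant_id INTEGER)")
conn.executemany(
    "INSERT INTO consultants VALUES (?, ?)", [(1, 10), (2, 10), (3, 20), (4, 30)]
)
# Departments 10 and 20 end up with two clients each, department 30 with one.
conn.executemany(
    "INSERT INTO clients VALUES (?, ?)", [(1, 1), (2, 2), (3, 3), (4, 3), (5, 4)]
)

top_departments = conn.execute(
    """
    SELECT department_id, COUNT(client_id) AS num_clients
    FROM clients
    JOIN consultants ON clients.consultant_id = consultants.consultant_id
    GROUP BY department_id
    HAVING COUNT(client_id) = (
        SELECT MAX(client_count)
        FROM (
            SELECT department_id, COUNT(client_id) AS client_count
            FROM clients
            JOIN consultants ON clients.consultant_id = consultants.consultant_id
            GROUP BY department_id
        ) AS department_counts
    )
    ORDER BY num_clients DESC, department_id
    """
).fetchall()
print(top_departments)
```

An ORDER BY ... LIMIT 1 version would return only one of the two tied departments, which is why the comparison with MAX() is used here.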
• Q.513
Identify Clients Without Consultants
You are given two tables: consultants (consultant_id, consultant_name) and clients
(client_id, consultant_id). Write a SQL query to identify clients who do not have a consultant
assigned.
Explanation
• Use a LEFT JOIN between clients and consultants on consultant_id.
• Filter the results to show only clients without a consultant assigned by checking for NULL in
the consultant_id from the consultants table.
Learnings
• Using LEFT JOIN to get all records from the left table and matching records from the right.
• Filtering for NULL values to identify unmatched records.
Solutions
• - PostgreSQL / MySQL Solution
SELECT client_id, client_name
FROM clients
LEFT JOIN consultants ON clients.consultant_id = consultants.consultant_id
WHERE consultants.consultant_id IS NULL;
• Q.514
Count Clients Assigned to Multiple Consultants
You are given two tables: consultants (consultant_id, consultant_name) and clients
(client_id, consultant_id). Write a SQL query to count how many clients are assigned to more
than one consultant.
Explanation
• Use GROUP BY on client_id and HAVING COUNT(DISTINCT consultant_id) > 1 to find
clients who are assigned to multiple consultants.
Learnings
• Using GROUP BY to group results by specific columns.
• Using HAVING to filter groups based on a condition.
• Using COUNT(DISTINCT) to count unique values.
Solutions
• - PostgreSQL / MySQL Solution
SELECT COUNT(*) AS clients_assigned_to_multiple_consultants
FROM (
SELECT client_id
FROM clients
GROUP BY client_id
HAVING COUNT(DISTINCT consultant_id) > 1
) AS multi_consultant_clients;
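Grouping by client_id produces one row per qualifying client, so a single total needs an outer COUNT(*) over those groups. The sketch below shows the difference in in-memory SQLite on hypothetical assignment rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (client_id INTEGER, consultant_id INTEGER)")
conn.executemany(
    "INSERT INTO clients VALUES (?, ?)",
    [
        (1, 10), (1, 11),           # client 1: two different consultants
        (2, 10),                    # client 2: one consultant
        (3, 12), (3, 13), (3, 12),  # client 3: two distinct consultants, one repeated
    ],
)

# One row per qualifying group; each row just says "1 distinct client in this group".
per_group = conn.execute(
    """
    SELECT COUNT(DISTINCT client_id)
    FROM clients
    GROUP BY client_id
    HAVING COUNT(DISTINCT consultant_id) > 1
    """
).fetchall()

# Wrapping the grouped query gives the single total the question asks for.
total = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT client_id
        FROM clients
        GROUP BY client_id
        HAVING COUNT(DISTINCT consultant_id) > 1
    ) AS multi_consultant_clients
    """
).fetchone()[0]
print(per_group, total)
```

COUNT(DISTINCT consultant_id) matters for client 3: its three rows reference only two distinct consultants, and a plain COUNT(*) > 1 would also flag a client listed twice under the same consultant.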
• Q.515
Find the Most Recent Client Assignment
You are given two tables: consultants (consultant_id, consultant_name) and clients
(client_id, consultant_id, assignment_date). Write a SQL query to find the most recent
assignment for each client.
Explanation
• Use ROW_NUMBER() to assign a rank to each client's assignment based on the
assignment_date.
• Filter the result to select only the most recent assignment for each client.
Learnings
• Using ROW_NUMBER() to rank records within partitions.
• Filtering for the most recent record per client.
Solutions
• - PostgreSQL / MySQL Solution
WITH ranked_assignments AS (
SELECT
client_id,
consultant_id,
assignment_date,
ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY assignment_date DESC) AS rn
FROM clients
)
SELECT client_id, consultant_id, assignment_date
FROM ranked_assignments
WHERE rn = 1;
• Q.516
Calculate the Total Project Budget by Department
You are given two tables: projects (project_id, project_name, department_id, budget) and
departments (department_id, department_name). Write a SQL query to calculate the total
budget for each department, including departments that have no projects.
Explanation
• Use LEFT JOIN to include all departments, even if they have no projects.
• Use SUM() to calculate the total budget for each department.
Learnings
• Using LEFT JOIN to ensure all departments are included.
• Aggregating with SUM().
Solutions
• - PostgreSQL / MySQL Solution
SELECT d.department_name, COALESCE(SUM(p.budget), 0) AS total_budget
FROM departments d
LEFT JOIN projects p ON d.department_id = p.department_id
GROUP BY d.department_name;
• Q.517
Identify the Clients Who Have Not Been Assigned a Consultant
You are given two tables: clients (client_id, client_name, consultant_id) and consultants
(consultant_id, consultant_name). Write a SQL query to find clients who are not currently
assigned to any consultant.
Explanation
• Use a LEFT JOIN between clients and consultants, and filter for NULL in the
consultant_id column to find clients without an assigned consultant.
Learnings
• Using LEFT JOIN and filtering for NULL values.
Solutions
• - PostgreSQL / MySQL Solution
SELECT c.client_id, c.client_name
FROM clients c
LEFT JOIN consultants con ON c.consultant_id = con.consultant_id
WHERE con.consultant_id IS NULL;
• Q.518
Find Projects That Exceed the Average Budget of Their Department
You are given two tables: projects (project_id, project_name, department_id, budget) and
departments (department_id, department_name). Write a SQL query to find all projects that
exceed the average budget for their respective department.
Explanation
• Calculate the average budget per department using a subquery or a JOIN.
• Compare each project's budget against the department's average budget.
Learnings
• Using a subquery to calculate the average budget per department.
• Filtering based on comparison.
Solutions
• - PostgreSQL / MySQL Solution
SELECT p.project_name, p.budget, d.department_name
FROM projects p
JOIN departments d ON p.department_id = d.department_id
WHERE p.budget > (
SELECT AVG(budget)
FROM projects
WHERE department_id = p.department_id
);
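The correlated subquery is re-evaluated for each project's department, which is what makes the comparison department-specific. Here is a minimal sketch in in-memory SQLite; the departments and projects are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (department_id INTEGER, department_name TEXT)")
conn.execute(
    "CREATE TABLE projects (project_id INTEGER, project_name TEXT, "
    "department_id INTEGER, budget INTEGER)"
)
conn.executemany("INSERT INTO departments VALUES (?, ?)", [(1, "Audit"), (2, "Tax")])
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?, ?)",
    [
        (1, "Alpha", 1, 100),
        (2, "Beta", 1, 300),   # Audit average is 200, so Beta exceeds it
        (3, "Gamma", 2, 500),
        (4, "Delta", 2, 500),  # Tax average is 500; equal is not "exceeds"
    ],
)

above_average = conn.execute(
    """
    SELECT p.project_name, p.budget, d.department_name
    FROM projects p
    JOIN departments d ON p.department_id = d.department_id
    WHERE p.budget > (
        SELECT AVG(budget)
        FROM projects
        WHERE department_id = p.department_id
    )
    ORDER BY p.project_id
    """
).fetchall()
print(above_average)
```

Inside the subquery, the unqualified department_id refers to the inner projects table, while p.department_id refers to the outer row being tested.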
• Q.519
Calculate the Total Spending by Clients in Each Project
You are given two tables: clients (client_id, client_name) and purchases (purchase_id,
client_id, project_id, amount_spent). Write a SQL query to calculate the total amount spent
by clients in each project.
Explanation
• The per-project total can be computed from the purchases table alone; join the clients table
only if client names are also needed in the output.
• Group by project and sum the total spending.
Learnings
• Using JOIN to combine data from multiple tables.
• Using SUM() and GROUP BY to calculate the total spending per project.
Solutions
• - PostgreSQL / MySQL Solution
SELECT p.project_id, SUM(p.amount_spent) AS total_spending
FROM purchases p
GROUP BY p.project_id;
• Q.520
Find the Consultants with the Highest Number of Clients
You are given two tables: consultants (consultant_id, consultant_name) and clients
(client_id, consultant_id). Write a SQL query to find the consultant(s) with the highest
number of assigned clients.
Explanation
• Use GROUP BY to count the number of clients per consultant.
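• Use ORDER BY and LIMIT to find the consultant with the maximum client count. Because
LIMIT 1 returns a single row, ranking the counts with RANK() (or comparing each count with
the maximum) is needed to return every consultant tied for the top.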
Learnings
• Using GROUP BY to aggregate data by consultant.
• Sorting with LIMIT, or ranking with RANK(), to find the top consultant(s).
Solutions
• - PostgreSQL / MySQL Solution
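-- RANK() keeps every consultant tied for the highest count; LIMIT 1 would return only one
SELECT consultant_name, num_clients
FROM (
SELECT c.consultant_name, COUNT(cl.client_id) AS num_clients,
RANK() OVER (ORDER BY COUNT(cl.client_id) DESC) AS client_rank
FROM consultants c
JOIN clients cl ON c.consultant_id = cl.consultant_id
GROUP BY c.consultant_id, c.consultant_name
) AS ranked_consultants
WHERE client_rank = 1;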
Cisco
• Q.521
Question
Calculate customer product scores over time
Cisco Systems is interested in how their different network components' quality ratings vary
over time by clients. Write a SQL query to calculate the average star ratings for each product
per month.
Explanation
You need to calculate the average star ratings for each product by month. Use date_trunc()
to truncate the submit_date to the month level and AVG() to calculate the average stars.
Group the results by the truncated month and product.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE product_reviews (
review_id INT,
client_id INT,
submit_date DATE,
product_id VARCHAR(50),
stars INT
);
• - Datasets
INSERT INTO product_reviews (review_id, client_id, submit_date, product_id, stars)
VALUES
(6171, 123, '2022-06-08', 'RTR-901', 4),
(7802, 265, '2022-06-10', 'SWT-1050', 4),
(5293, 362, '2022-06-18', 'RTR-901', 3),
(6352, 192, '2022-07-26', 'SWT-1050', 3),
(4517, 981, '2022-07-05', 'SWT-1050', 2);
Learnings
• date_trunc() function to truncate a date to the first day of its month
• AVG() function to calculate the average
• Grouping results using GROUP BY
• Ordering results with ORDER BY
Solutions
• - PostgreSQL solution
SELECT
date_trunc('month', submit_date) AS month,
product_id AS product,
AVG(stars) AS avg_stars
FROM
product_reviews
GROUP BY
month,
product
ORDER BY
month,
product;
• - MySQL solution
SELECT
DATE_FORMAT(submit_date, '%Y-%m-01') AS month,
product_id AS product,
AVG(stars) AS avg_stars
FROM
product_reviews
GROUP BY
month,
product
ORDER BY
month,
product;
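The monthly grouping can be verified on the dataset above. SQLite has neither date_trunc() nor DATE_FORMAT(), so this sketch swaps in strftime('%Y-%m-01', ...) as the month bucket; that substitution is an assumption of the example, not part of the solutions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_reviews (review_id INTEGER, client_id INTEGER, "
    "submit_date TEXT, product_id TEXT, stars INTEGER)"
)
conn.executemany(
    "INSERT INTO product_reviews VALUES (?, ?, ?, ?, ?)",
    [
        (6171, 123, "2022-06-08", "RTR-901", 4),
        (7802, 265, "2022-06-10", "SWT-1050", 4),
        (5293, 362, "2022-06-18", "RTR-901", 3),
        (6352, 192, "2022-07-26", "SWT-1050", 3),
        (4517, 981, "2022-07-05", "SWT-1050", 2),
    ],
)

# strftime maps every date to the first day of its month, like date_trunc('month', ...).
monthly = conn.execute(
    """
    SELECT strftime('%Y-%m-01', submit_date) AS month,
           product_id AS product,
           AVG(stars) AS avg_stars
    FROM product_reviews
    GROUP BY month, product
    ORDER BY month, product
    """
).fetchall()
for row in monthly:
    print(row)
```

RTR-901 averages 3.5 in June from ratings of 4 and 3, and SWT-1050 averages 2.5 in July from ratings of 3 and 2.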
• Q.522
Question
Find all patients who consulted both Apollo and Fortis hospitals in the past year.
Explanation
To find patients who have visited both Apollo and Fortis hospitals in the past year, use a JOIN
between the two tables on the patient_id field and filter the results to include only records
from the past year using CURRENT_DATE. Ensure that the patients appear in both tables.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE apollo_patients (
patient_id INT,
patient_name VARCHAR(100),
consultation_date DATE
);
CREATE TABLE fortis_patients (
patient_id INT,
patient_name VARCHAR(100),
consultation_date DATE
);
• - Datasets
INSERT INTO apollo_patients VALUES
(1, 'Ravi Joshi', '2023-05-15'),
(2, 'Sanya Kapoor', '2023-03-12'),
(3, 'Anuj Sethi', '2023-02-10');
Learnings
• Using JOIN to combine data from two tables based on a common field
• Filtering records with CURRENT_DATE for the past year using DATE_SUB() or equivalent
• Ensuring the same patient exists in both tables
Solutions
• - PostgreSQL solution
SELECT DISTINCT ap.patient_id, ap.patient_name
FROM apollo_patients ap
JOIN fortis_patients fp
ON ap.patient_id = fp.patient_id
WHERE ap.consultation_date >= CURRENT_DATE - INTERVAL '1 year'
AND fp.consultation_date >= CURRENT_DATE - INTERVAL '1 year';
• - MySQL solution
SELECT DISTINCT ap.patient_id, ap.patient_name
FROM apollo_patients ap
JOIN fortis_patients fp
ON ap.patient_id = fp.patient_id
WHERE ap.consultation_date >= CURDATE() - INTERVAL 1 YEAR
AND fp.consultation_date >= CURDATE() - INTERVAL 1 YEAR;
• Q.523
Question
Find the top 5 cities with the highest number of tech startups that have received funding from
both angel investors and venture capitalists.
Explanation
To solve this, you need to identify startups that have received investments from both angel
investors and venture capitalists. Then, count the number of startups per city and select the
top 5 cities based on this count. Use JOIN operations to match startups with investments from
both sources, group by city_id, and sort by the number of startups.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE cities (
city_id INT,
city_name VARCHAR(100)
);
startup_id INT,
amount DECIMAL(10, 2)
);
Learnings
• Using JOIN to combine data from multiple tables based on common keys
• Filtering startups that have investments from both angel investors and venture capitalists
• Counting the number of startups per city and sorting the results
• Using GROUP BY and ORDER BY to aggregate and rank the results
Solutions
• - PostgreSQL solution
SELECT c.city_name, COUNT(DISTINCT s.startup_id) AS startup_count
FROM cities c
JOIN startups s ON c.city_id = s.city_id
JOIN angel_investments ai ON s.startup_id = ai.startup_id
JOIN venture_capital_investments vci ON s.startup_id = vci.startup_id
GROUP BY c.city_name
ORDER BY startup_count DESC
LIMIT 5;
• - MySQL solution
SELECT c.city_name, COUNT(DISTINCT s.startup_id) AS startup_count
FROM cities c
JOIN startups s ON c.city_id = s.city_id
JOIN angel_investments ai ON s.startup_id = ai.startup_id
JOIN venture_capital_investments vci ON s.startup_id = vci.startup_id
GROUP BY c.city_name
ORDER BY startup_count DESC
LIMIT 5;
• Q.524
Question
List all products sold in different categories that have received reviews from at least 3 unique
customers across both online and physical stores.
Explanation
You need to identify products that have received reviews from at least 3 unique customers
across both online and physical stores. First, join the online_reviews and
physical_reviews tables with the products table. Then, count the distinct customers per
product and filter out those with fewer than 3 unique customers. Finally, list the products
along with their categories.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE products (
product_id INT,
product_name VARCHAR(100),
category_id INT
);
Learnings
• Combining data from multiple tables using JOIN
• Counting unique customers using COUNT(DISTINCT ...)
• Filtering products based on a condition (HAVING clause)
• Grouping data with GROUP BY and joining with categories to display the category name
Solutions
• - PostgreSQL solution
SELECT p.product_name, c.category_name
FROM products p
JOIN categories c ON p.category_id = c.category_id
LEFT JOIN (
SELECT product_id, customer_id
FROM online_reviews
UNION
SELECT product_id, customer_id
FROM physical_reviews
) r ON p.product_id = r.product_id
GROUP BY p.product_name, c.category_name
HAVING COUNT(DISTINCT r.customer_id) >= 3;
• - MySQL solution
SELECT p.product_name, c.category_name
FROM products p
JOIN categories c ON p.category_id = c.category_id
LEFT JOIN (
SELECT product_id, customer_id
FROM online_reviews
UNION
SELECT product_id, customer_id
FROM physical_reviews
) r ON p.product_id = r.product_id
GROUP BY p.product_name, c.category_name
HAVING COUNT(DISTINCT r.customer_id) >= 3;
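To see why UNION (rather than UNION ALL) matters when counting reviewers across both channels, here is a minimal sketch using Python's built-in sqlite3 module. The rows are hypothetical, and SQLite is only a stand-in for PostgreSQL or MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE online_reviews (product_id INTEGER, customer_id INTEGER);
CREATE TABLE physical_reviews (product_id INTEGER, customer_id INTEGER);
-- Hypothetical rows: customer 10 reviewed product 1 in both channels.
INSERT INTO online_reviews VALUES (1, 10), (1, 11), (2, 10);
INSERT INTO physical_reviews VALUES (1, 10), (1, 12), (2, 13);
""")

# UNION removes the duplicate (1, 10) pair before the customers are counted.
query = """
SELECT product_id, COUNT(DISTINCT customer_id) AS reviewers
FROM (
    SELECT product_id, customer_id FROM online_reviews
    UNION
    SELECT product_id, customer_id FROM physical_reviews
) r
GROUP BY product_id
HAVING COUNT(DISTINCT customer_id) >= 3
"""
qualifying = conn.execute(query).fetchall()
print(qualifying)  # [(1, 3)]
```

Product 1 qualifies with customers 10, 11 and 12; product 2 has only two distinct reviewers and is filtered out by HAVING.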
• Q.525
Question
Identify the products with sales above the average for their category.
Explanation
You need to calculate the average sales for each product category and then find the products
whose sales are above the average for their respective categories. This can be done by first
calculating the average sales per category and then comparing each product's sales with that
average.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE products (
product_id INT,
product_name VARCHAR(100),
category_name VARCHAR(100)
);
(4, 4, 20000.00);
Learnings
• Using JOIN to combine data from multiple tables
• Using a subquery to calculate average sales by category
• Filtering products that have sales above the average using a correlated subquery in the WHERE clause
Solutions
• - PostgreSQL solution
SELECT p.product_name, p.category_name, s.sales_amount
FROM products p
JOIN sales s ON p.product_id = s.product_id
WHERE s.sales_amount > (
SELECT AVG(sales_amount)
FROM sales s2
JOIN products p2 ON s2.product_id = p2.product_id
WHERE p2.category_name = p.category_name
)
ORDER BY p.category_name, s.sales_amount DESC;
• - MySQL solution
SELECT p.product_name, p.category_name, s.sales_amount
FROM products p
JOIN sales s ON p.product_id = s.product_id
WHERE s.sales_amount > (
SELECT AVG(sales_amount)
FROM sales s2
JOIN products p2 ON s2.product_id = p2.product_id
WHERE p2.category_name = p.category_name
)
ORDER BY p.category_name, s.sales_amount DESC;
• Q.526
Question
Identify all customers who made purchases in both January and February.
Explanation
You need to find customers who have made purchases in both January and February. This
can be achieved by filtering the sales data for each customer and checking if there are
purchases in both months. You can use GROUP BY to group by customer_id and HAVING to
ensure that both January and February sales are present for each customer.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE sales (
sale_id INT PRIMARY KEY,
customer_id INT,
sale_date DATE
);
• - Datasets
INSERT INTO sales VALUES
(1, 1, '2023-01-15'),
(2, 2, '2023-02-20'),
(3, 1, '2023-02-28'),
(4, 3, '2023-01-10'),
(5, 3, '2023-02-05'),
(6, 4, '2023-01-22'),
(7, 5, '2023-02-18'),
(8, 6, '2023-01-05'),
(9, 6, '2023-02-10'),
(10, 7, '2023-01-14'),
(11, 7, '2023-02-25'),
(12, 8, '2023-01-30'),
(13, 9, '2023-02-01'),
(14, 10, '2023-01-05'),
(15, 11, '2023-02-15'),
(16, 11, '2023-01-28');
Learnings
• Filtering data by specific months using EXTRACT(MONTH FROM ...) or MONTH()
• Grouping data by customer_id
• Using HAVING to filter customers who have purchases in both months
Solutions
• - PostgreSQL solution
SELECT customer_id
FROM sales
WHERE EXTRACT(MONTH FROM sale_date) IN (1, 2)
GROUP BY customer_id
HAVING COUNT(DISTINCT EXTRACT(MONTH FROM sale_date)) = 2;
• - MySQL solution
SELECT customer_id
FROM sales
WHERE MONTH(sale_date) IN (1, 2)
GROUP BY customer_id
HAVING COUNT(DISTINCT MONTH(sale_date)) = 2;
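To check the query against the sample rows above, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made for portability: it has neither EXTRACT nor MONTH(), so strftime('%m', ...) stands in for them.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL/MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, sale_date TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, 1, "2023-01-15"), (2, 2, "2023-02-20"), (3, 1, "2023-02-28"),
        (4, 3, "2023-01-10"), (5, 3, "2023-02-05"), (6, 4, "2023-01-22"),
        (7, 5, "2023-02-18"), (8, 6, "2023-01-05"), (9, 6, "2023-02-10"),
        (10, 7, "2023-01-14"), (11, 7, "2023-02-25"), (12, 8, "2023-01-30"),
        (13, 9, "2023-02-01"), (14, 10, "2023-01-05"), (15, 11, "2023-02-15"),
        (16, 11, "2023-01-28"),
    ],
)

# strftime('%m', ...) returns the month as a two-character string such as '01'.
query = """
SELECT customer_id
FROM sales
WHERE strftime('%m', sale_date) IN ('01', '02')
GROUP BY customer_id
HAVING COUNT(DISTINCT strftime('%m', sale_date)) = 2
ORDER BY customer_id
"""
both_months = [row[0] for row in conn.execute(query)]
print(both_months)  # [1, 3, 6, 7, 11]
```

Because the filter looks only at the month, a January purchase in one year paired with a February purchase in another year would also qualify. Add a year condition if the data spans several years.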
• Q.527
Question
List the airlines British Airways customers traveled with in addition to British Airways.
Explanation
You need to find customers who have traveled with British Airways and then identify the
other airlines they have traveled with. Use a JOIN to link the customers and flights tables,
filtering for flights where the airline is not British Airways and where the customer has also
traveled with British Airways.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100)
);
Learnings
• Using JOIN to combine customer and flight data
• Filtering data with WHERE to exclude British Airways
• Identifying customers who traveled with British Airways and other airlines
Solutions
• - PostgreSQL solution
SELECT DISTINCT f.airline
FROM flights f
JOIN customers c ON f.customer_id = c.customer_id
WHERE c.customer_id IN (
SELECT customer_id
FROM flights
WHERE airline = 'British Airways'
) AND f.airline != 'British Airways';
• - MySQL solution
SELECT DISTINCT f.airline
FROM flights f
JOIN customers c ON f.customer_id = c.customer_id
WHERE c.customer_id IN (
SELECT customer_id
FROM flights
WHERE airline = 'British Airways'
) AND f.airline != 'British Airways';
• Q.528
Question
Identify the top 5 products that have been returned the most and the reason for their return.
Explanation
To solve this, you need to count the number of returns for each product and then list the top 5
products with the highest return count. You should also include the reason for the returns.
This can be achieved using GROUP BY and COUNT() to aggregate the data, and ORDER BY to
rank the products by their return frequency.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE products (
product_id INT,
product_name VARCHAR(100)
);
Learnings
• Using JOIN to combine product and return data
Learnings
• Using JOIN to combine data from flights and hotel_bookings tables
• Filtering data using WHERE for specific conditions (e.g., airline and platform)
• Identifying customers who meet both criteria (flights with British Airways and bookings
with Booking.com)
Solutions
• - PostgreSQL solution
SELECT DISTINCT c.customer_name
FROM customers c
JOIN flights f ON c.customer_id = f.customer_id
JOIN hotel_bookings hb ON c.customer_id = hb.customer_id
WHERE f.airline = 'British Airways'
AND hb.platform = 'Booking.com';
• - MySQL solution
SELECT DISTINCT c.customer_name
FROM customers c
JOIN flights f ON c.customer_id = f.customer_id
JOIN hotel_bookings hb ON c.customer_id = hb.customer_id
WHERE f.airline = 'British Airways'
AND hb.platform = 'Booking.com';
• Q.530
Question
Write a SQL query to find the customer who made the most recent order.
Explanation
You need to identify the customer who placed the most recent order based on the
order_date. You can achieve this by selecting the order with the latest order_date and then
joining it with the customers table to retrieve the corresponding customer information.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100)
);
Learnings
• Selecting the most recent record using MAX() or ORDER BY with LIMIT
• Joining orders with customers to get customer details
• Using ORDER BY and LIMIT to pick the latest order
Solutions
• - PostgreSQL Solution
SELECT c.customer_name
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
ORDER BY o.order_date DESC
LIMIT 1;
• - MySQL Solution
SELECT c.customer_name
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
ORDER BY o.order_date DESC
LIMIT 1;
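ORDER BY with LIMIT 1 returns a single row even when several customers share the latest order date. The MAX() approach mentioned in the Learnings keeps every tied customer. A minimal sketch follows, using Python's built-in sqlite3 module with hypothetical rows; the orders columns shown here are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
-- Hypothetical rows: two customers share the latest order date.
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chitra');
INSERT INTO orders VALUES
    (1, 1, '2024-03-01'), (2, 2, '2024-03-09'), (3, 3, '2024-03-09');
""")

# Compare each order with the overall latest date instead of using LIMIT 1,
# so every customer tied on that date is returned.
query = """
SELECT DISTINCT c.customer_name
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date = (SELECT MAX(order_date) FROM orders)
ORDER BY c.customer_name
"""
latest_customers = [row[0] for row in conn.execute(query)]
print(latest_customers)  # ['Ben', 'Chitra']
```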
Learnings
• Aggregating sales by product and month using GROUP BY
• Calculating total sales for each month using a subquery or window function
• Calculating percentage by dividing product sales by total sales for each month
Solutions
• - PostgreSQL Solution
-- Select and group by the subquery's month column so the query satisfies PostgreSQL's GROUP BY rules
SELECT
total_sales.month,
product_id,
SUM(sales_amount) AS product_sales,
(SUM(sales_amount) / total_sales.total_sales) * 100 AS sales_percentage
FROM
sales,
(SELECT EXTRACT(MONTH FROM sale_date) AS month, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY EXTRACT(MONTH FROM sale_date)) AS total_sales
WHERE EXTRACT(MONTH FROM sale_date) = total_sales.month
GROUP BY total_sales.month, product_id, total_sales.total_sales
ORDER BY total_sales.month, product_id;
• - MySQL Solution
-- Select and group by the subquery's month column so the query also runs with ONLY_FULL_GROUP_BY enabled
SELECT
total_sales.month,
product_id,
SUM(sales_amount) AS product_sales,
(SUM(sales_amount) / total_sales.total_sales) * 100 AS sales_percentage
FROM
sales,
(SELECT MONTH(sale_date) AS month, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY MONTH(sale_date)) AS total_sales
WHERE MONTH(sale_date) = total_sales.month
GROUP BY total_sales.month, product_id, total_sales.total_sales
ORDER BY total_sales.month, product_id;
Explanation
• The subquery calculates the total sales per month.
• The main query calculates the sales amount per product for each month.
• The WHERE clause ensures that the total sales are correctly matched to each product's sales
by month.
• The sales percentage is calculated by dividing the product's sales by the total sales for the
month, then multiplying by 100.
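The Learnings mention a window function as an alternative to the subquery. Here is a minimal sketch of that variant, run through Python's built-in sqlite3 module. The rows are hypothetical, strftime replaces EXTRACT/MONTH(), and window functions assume SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER,
                    sale_date TEXT, sales_amount REAL);
-- Hypothetical rows.
INSERT INTO sales VALUES
    (1, 101, '2024-01-05', 300.0),
    (2, 102, '2024-01-20', 100.0),
    (3, 101, '2024-02-03', 50.0),
    (4, 102, '2024-02-14', 150.0);
""")

# Step 1 (CTE): total sales per product and month.
# Step 2: SUM(...) OVER (PARTITION BY month) supplies the monthly total on every row.
query = """
WITH product_month AS (
    SELECT CAST(strftime('%m', sale_date) AS INTEGER) AS month,
           product_id,
           SUM(sales_amount) AS product_sales
    FROM sales
    GROUP BY month, product_id
)
SELECT month,
       product_id,
       product_sales,
       product_sales * 100.0 / SUM(product_sales) OVER (PARTITION BY month) AS sales_percentage
FROM product_month
ORDER BY month, product_id
"""
shares = conn.execute(query).fetchall()
for row in shares:
    print(row)
```

With these rows, product 101 holds 75% of January sales and 25% of February sales, and product 102 holds the remainder.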
• Q.532
Question
Find the most popular product in each category based on total sales.
Explanation
To identify the most popular product in each category based on total sales, you need to:
• Aggregate the total sales for each product.
• Group the products by category.
• Select the product with the highest total sales for each category.
You can achieve this by using GROUP BY to calculate the total sales for each product and then
applying a JOIN with the products table to group by category. Finally, you can use a
subquery or window function to select the product with the highest total sales in each
category.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE sales (
sale_id INT PRIMARY KEY,
product_id INT,
revenue DECIMAL(10, 2)
);
Learnings
• Using GROUP BY to aggregate data and calculate total sales per product
• Using JOIN to bring product details into the query
• Selecting the highest sales product per category using aggregation or window functions
Solutions
• - PostgreSQL Solution
WITH total_sales AS (
SELECT
s.product_id,
SUM(s.revenue) AS total_revenue
FROM sales s
GROUP BY s.product_id
)
SELECT p.category,
p.product_name,
ts.total_revenue
FROM total_sales ts
JOIN products p ON ts.product_id = p.product_id
WHERE (p.category, ts.total_revenue) IN (
SELECT category, MAX(total_revenue)
FROM total_sales ts
JOIN products p ON ts.product_id = p.product_id
GROUP BY category
)
ORDER BY p.category;
• - MySQL Solution
WITH total_sales AS (
SELECT
s.product_id,
SUM(s.revenue) AS total_revenue
FROM sales s
GROUP BY s.product_id
)
SELECT p.category,
p.product_name,
ts.total_revenue
FROM total_sales ts
JOIN products p ON ts.product_id = p.product_id
WHERE (p.category, ts.total_revenue) IN (
SELECT category, MAX(total_revenue)
FROM total_sales ts
JOIN products p ON ts.product_id = p.product_id
GROUP BY category
)
ORDER BY p.category;
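The Explanation also allows a window function in place of the IN subquery. The sketch below shows one way to do that with RANK(), run through Python's built-in sqlite3 module. The rows are hypothetical, and window functions assume SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, revenue REAL);
-- Hypothetical rows.
INSERT INTO products VALUES
    (1, 'Laptop', 'Electronics'), (2, 'Phone', 'Electronics'),
    (3, 'Desk', 'Furniture'), (4, 'Chair', 'Furniture');
INSERT INTO sales VALUES
    (1, 1, 900.0), (2, 2, 700.0), (3, 2, 400.0),
    (4, 3, 250.0), (5, 4, 150.0), (6, 4, 50.0);
""")

# RANK() numbers products within each category by total revenue;
# rank 1 is the best seller, and ties all receive rank 1.
query = """
WITH total_sales AS (
    SELECT product_id, SUM(revenue) AS total_revenue
    FROM sales
    GROUP BY product_id
),
ranked AS (
    SELECT p.category, p.product_name, ts.total_revenue,
           RANK() OVER (PARTITION BY p.category
                        ORDER BY ts.total_revenue DESC) AS revenue_rank
    FROM total_sales ts
    JOIN products p ON ts.product_id = p.product_id
)
SELECT category, product_name, total_revenue
FROM ranked
WHERE revenue_rank = 1
ORDER BY category
"""
top_products = conn.execute(query).fetchall()
print(top_products)  # [('Electronics', 'Phone', 1100.0), ('Furniture', 'Desk', 250.0)]
```

Like the IN (category, MAX(total_revenue)) form, RANK() returns every product tied for first place in a category. Use ROW_NUMBER() instead if exactly one row per category is required.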
• - Table creation
CREATE TABLE apollo_patients (
patient_id INT,
patient_name VARCHAR(100),
treatment_date DATE
);
Learnings
• Using JOIN to combine records from multiple tables based on common fields (e.g.,
patient_id).
• Filtering records for a date range (last year) to identify recent treatments.
• Identifying common records in both tables using INTERSECT or a combination of INNER
JOIN and filtering.
Solutions
• - PostgreSQL Solution
SELECT DISTINCT a.patient_name
FROM apollo_patients a
JOIN max_patients m ON a.patient_id = m.patient_id
WHERE a.treatment_date >= CURRENT_DATE - INTERVAL '1 year'
AND m.treatment_date >= CURRENT_DATE - INTERVAL '1 year';
• - MySQL Solution
SELECT DISTINCT a.patient_name
FROM apollo_patients a
JOIN max_patients m ON a.patient_id = m.patient_id
WHERE a.treatment_date >= CURDATE() - INTERVAL 1 YEAR
AND m.treatment_date >= CURDATE() - INTERVAL 1 YEAR;
Explanation
To retrieve the customer details along with their most recent order date, you need to:
• Join the customers table with the orders table based on customer_id.
• Use the MAX function on the order_date to get the most recent order for each customer.
• Group the result by customer_id and customer_name to get one row per customer.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE customers (
customer_id INT,
customer_name VARCHAR(100)
);
Learnings
• Using JOIN to combine data from multiple tables.
• Using MAX() to identify the most recent order date.
• Grouping by customer to ensure each customer is listed only once.
Solutions
• - PostgreSQL Solution
SELECT c.customer_id, c.customer_name, MAX(o.order_date) AS last_order_date
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name;
• - MySQL Solution
SELECT c.customer_id, c.customer_name, MAX(o.order_date) AS last_order_date
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name;
Learnings
• Filtering data with WHERE to check conditions on specific columns (e.g., sector and
co_founders).
• Using the AND operator to combine multiple conditions in a WHERE clause.
Solutions
• - PostgreSQL Solution
SELECT startup_name
FROM startups
WHERE sector = 'Fintech'
AND co_founders >= 2;
• - MySQL Solution
SELECT startup_name
FROM startups
WHERE sector = 'Fintech'
AND co_founders >= 2;
Learnings
• Using GROUP BY to aggregate data by customer.
• Using HAVING to filter results based on aggregate values.
• Counting the number of orders for each customer.
Solutions
Learnings
• Using INNER JOIN to combine data from two tables based on a common column (in this
case, hotel_id).
• Filtering hotels that have reviews from both sources.
Solutions
• - PostgreSQL / MySQL Solution
Learnings
• Using LEFT JOIN to find unmatched rows between two tables.
• Filtering results based on a specific condition (sale_date not in 2023).
• Identifying customers without sales in a given year.
Solutions
• - PostgreSQL / MySQL Solution
SELECT c.customer_name
FROM customers c
LEFT JOIN sales s ON c.customer_id = s.customer_id AND EXTRACT(YEAR FROM s.sale_date) = 2023
WHERE s.sale_id IS NULL;
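The year filter sits in the ON clause for a reason. The following sketch, using Python's built-in sqlite3 module with hypothetical rows, compares it with the same filter placed in WHERE. SQLite has no EXTRACT, so strftime('%Y', ...) is used instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, sale_date TEXT);
-- Hypothetical rows: Ben bought only in 2022, Chitra never bought.
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chitra');
INSERT INTO sales VALUES (1, 1, '2023-04-02'), (2, 2, '2022-11-19');
""")

# Year filter inside ON: non-2023 sales fail to match, so the customer
# survives the join with NULLs and passes the IS NULL test.
in_on = [row[0] for row in conn.execute("""
    SELECT c.customer_name
    FROM customers c
    LEFT JOIN sales s
           ON c.customer_id = s.customer_id
          AND strftime('%Y', s.sale_date) = '2023'
    WHERE s.sale_id IS NULL
    ORDER BY c.customer_name
""")]

# Year filter moved to WHERE: it contradicts the IS NULL test, so no row passes.
in_where = [row[0] for row in conn.execute("""
    SELECT c.customer_name
    FROM customers c
    LEFT JOIN sales s ON c.customer_id = s.customer_id
    WHERE strftime('%Y', s.sale_date) = '2023'
      AND s.sale_id IS NULL
""")]

print(in_on)     # ['Ben', 'Chitra']
print(in_where)  # []
```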
Learnings
• Using EXTRACT(YEAR FROM sale_date) and EXTRACT(MONTH FROM sale_date) to break
down the date.
• Using JOIN to match sales data from the current and previous months.
• Calculating differences in sales for each product using subtraction.
Solutions
• - PostgreSQL/MySQL Solution
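-- EXTRACT works in both PostgreSQL and MySQL (MONTH() and YEAR() are MySQL-only).
-- Subtracting one month from CURRENT_DATE also handles January, where the previous
-- month falls in the previous year.
SELECT
p.product_name,
SUM(CASE
WHEN EXTRACT(YEAR FROM s.sale_date) = EXTRACT(YEAR FROM CURRENT_DATE)
AND EXTRACT(MONTH FROM s.sale_date) = EXTRACT(MONTH FROM CURRENT_DATE)
THEN s.sales_amount
ELSE 0
END) AS current_month_sales,
SUM(CASE
WHEN EXTRACT(YEAR FROM s.sale_date) = EXTRACT(YEAR FROM CURRENT_DATE - INTERVAL '1' MONTH)
AND EXTRACT(MONTH FROM s.sale_date) = EXTRACT(MONTH FROM CURRENT_DATE - INTERVAL '1' MONTH)
THEN s.sales_amount
ELSE 0
END) AS previous_month_sales,
(SUM(CASE
WHEN EXTRACT(YEAR FROM s.sale_date) = EXTRACT(YEAR FROM CURRENT_DATE)
AND EXTRACT(MONTH FROM s.sale_date) = EXTRACT(MONTH FROM CURRENT_DATE)
THEN s.sales_amount
ELSE 0
END) - SUM(CASE
WHEN EXTRACT(YEAR FROM s.sale_date) = EXTRACT(YEAR FROM CURRENT_DATE - INTERVAL '1' MONTH)
AND EXTRACT(MONTH FROM s.sale_date) = EXTRACT(MONTH FROM CURRENT_DATE - INTERVAL '1' MONTH)
THEN s.sales_amount
ELSE 0
END)) AS sales_difference
FROM sales s
JOIN products p ON s.product_id = p.product_id
GROUP BY p.product_name;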
Explanation:
• CTE (sales_monthly): We first aggregate the total sales for each product by year and
month.
• LEFT JOIN: We join the sales data for the current month (current) with the previous
month (previous). We match on product_id, sale_year, and sale_month (with the
condition current.sale_month = previous.sale_month + 1 to get the previous month).
• COALESCE(previous.total_sales, 0): This handles the case where there are no sales in
the previous month, defaulting the sales to 0.
• sales_diff: The difference in sales between the current and previous months is calculated
by subtracting previous_month_sales from current_month_sales.
• Q.540
Question
List all projects that started in 2024 using a date function.
Explanation
The task requires extracting projects that started in the year 2024. This can be done by using a
date function to filter records based on the year of the start_date column.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE projects (
project_id INT PRIMARY KEY,
project_name VARCHAR(100),
start_date DATE
);
• - Datasets
INSERT INTO projects VALUES
(1, 'Project Alpha', '2024-02-15'),
(2, 'Project Beta', '2023-11-30'),
(3, 'Project Gamma', '2024-05-10'),
(4, 'Project Delta', '2024-01-20'),
(5, 'Project Epsilon', '2024-08-01'),
(6, 'Project Zeta', '2023-12-25'),
(7, 'Project Eta', '2024-04-18'),
(8, 'Project Theta', '2024-07-12');
Learnings
• Use of date functions to extract year from a DATE type column.
• Filtering records based on year in SQL queries.
Solutions
• - PostgreSQL solution
SELECT project_id, project_name, start_date
FROM projects
WHERE EXTRACT(YEAR FROM start_date) = 2024;
• - MySQL solution
SELECT project_id, project_name, start_date
FROM projects
WHERE YEAR(start_date) = 2024;
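To check the result against the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption for portability, so strftime('%Y', ...) replaces EXTRACT/YEAR(). A range comparison is shown alongside for contrast: it is not a date function, but it is a form that an index on start_date can typically use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (project_id INTEGER PRIMARY KEY, project_name TEXT, start_date TEXT)"
)
conn.executemany("INSERT INTO projects VALUES (?, ?, ?)", [
    (1, "Project Alpha", "2024-02-15"), (2, "Project Beta", "2023-11-30"),
    (3, "Project Gamma", "2024-05-10"), (4, "Project Delta", "2024-01-20"),
    (5, "Project Epsilon", "2024-08-01"), (6, "Project Zeta", "2023-12-25"),
    (7, "Project Eta", "2024-04-18"), (8, "Project Theta", "2024-07-12"),
])

# Date-function form: extract the year and compare it.
by_function = [row[0] for row in conn.execute(
    "SELECT project_id FROM projects "
    "WHERE strftime('%Y', start_date) = '2024' ORDER BY project_id"
)]

# Range form: a half-open interval covering all of 2024.
by_range = [row[0] for row in conn.execute(
    "SELECT project_id FROM projects "
    "WHERE start_date >= '2024-01-01' AND start_date < '2025-01-01' ORDER BY project_id"
)]

print(by_function)  # [1, 3, 4, 5, 7, 8]
print(by_range)     # [1, 3, 4, 5, 7, 8]
```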
Zomato
• Q.541
Question
Write a SQL query to find all customers who never ordered anything, using the Customers
and Orders tables. Return the names of customers who do not appear in the Orders table.
Explanation
To solve this problem, we need to identify customers who are present in the Customers table
but do not have any corresponding entries in the Orders table. This can be achieved by
performing a LEFT JOIN or using a NOT EXISTS or NOT IN condition to check for
customers without matching order records.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Customers (
Id INT PRIMARY KEY,
NameCust VARCHAR(100)
);
Learnings
• Identifying customers without orders using LEFT JOIN and IS NULL condition.
• Using NOT EXISTS or NOT IN to filter out customers with existing orders.
• JOIN operations to combine data from multiple tables.
Solutions
• - PostgreSQL and MySQL solution
SELECT c.NameCust AS Customers
FROM Customers c
LEFT JOIN Orders o ON c.Id = o.CustomerId
WHERE o.CustomerId IS NULL;
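The Explanation also names NOT EXISTS and NOT IN as options. The sketch below compares the two using Python's built-in sqlite3 module. The rows and the Orders columns are hypothetical, and one order deliberately has a NULL CustomerId to show where NOT IN goes wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (Id INTEGER PRIMARY KEY, NameCust TEXT);
CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER);
-- Hypothetical rows; order 3 has a NULL CustomerId on purpose.
INSERT INTO Customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chitra');
INSERT INTO Orders VALUES (1, 1), (2, 1), (3, NULL);
""")

not_exists = [row[0] for row in conn.execute("""
    SELECT c.NameCust
    FROM Customers c
    WHERE NOT EXISTS (SELECT 1 FROM Orders o WHERE o.CustomerId = c.Id)
    ORDER BY c.NameCust
""")]

# NOT IN compares against every value in the list; a single NULL makes the
# comparison unknown for every non-matching customer, so no row passes.
not_in = [row[0] for row in conn.execute("""
    SELECT c.NameCust
    FROM Customers c
    WHERE c.Id NOT IN (SELECT CustomerId FROM Orders)
    ORDER BY c.NameCust
""")]

print(not_exists)  # ['Ben', 'Chitra']
print(not_in)      # []
```

If NOT IN is preferred, adding WHERE CustomerId IS NOT NULL inside the subquery avoids the empty result.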
Learnings
• Filtering records by City and date range (last month).
• Using JOINs to combine data from Orders and OrderDetails.
• GROUP BY to aggregate the total quantities per item.
• ORDER BY to sort the items by quantity in descending order.
Learnings
• Using LEFT JOIN to identify customers who have no orders in the last 30 days.
• Filtering records using CURRENT_DATE or CURDATE() for accurate date comparison.
• GROUP BY to aggregate inactive customers by City.
• Using COUNT to calculate the number of inactive customers.
Solutions
• - PostgreSQL and MySQL solution
SELECT c.City, COUNT(DISTINCT c.CustomerID) AS InactiveCustomers
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID AND o.OrderDate > CURRENT_DATE - INTERVAL '30' DAY
WHERE o.OrderID IS NULL
GROUP BY c.City;
Explanation:
• LEFT JOIN: We join Customers with Orders, but only include orders placed in the last
30 days (using o.OrderDate > CURRENT_DATE - INTERVAL '30' DAY, a form that both
PostgreSQL and MySQL accept).
• WHERE o.OrderID IS NULL: After joining, a customer with no recent orders has no matching
Orders row, so OrderID is NULL. Filtering per customer here, rather than per city with
HAVING COUNT(o.OrderID) = 0, keeps cities that mix active and inactive customers.
• GROUP BY c.City: We group the results by city to count inactive customers for each city.
• COUNT(DISTINCT c.CustomerID): This counts the unique customers who have not
placed an order in the last 30 days.
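To see how a per-city inactive count behaves when one city mixes active and inactive customers, here is a minimal sketch using Python's built-in sqlite3 module. The rows are hypothetical, a fixed reference date replaces CURRENT_DATE so the result is reproducible, and SQLite's date(..., '-30 days') stands in for the INTERVAL arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, City TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, OrderDate TEXT);
-- Hypothetical rows: Delhi mixes one active and one inactive customer.
INSERT INTO Customers VALUES (1, 'Delhi'), (2, 'Delhi'), (3, 'Mumbai'), (4, 'Pune');
INSERT INTO Orders VALUES
    (1, 1, '2024-06-25'),   -- recent
    (2, 2, '2024-03-01'),   -- older than 30 days
    (3, 4, '2024-06-20');   -- recent
""")

# Fixed reference date for reproducibility; a live query would use the current date.
today = "2024-06-30"

# Customers without a recent order have no matching Orders row, so
# o.OrderID is NULL for them after the LEFT JOIN.
query = """
SELECT c.City, COUNT(DISTINCT c.CustomerID) AS InactiveCustomers
FROM Customers c
LEFT JOIN Orders o
       ON c.CustomerID = o.CustomerID
      AND o.OrderDate > date(?, '-30 days')
WHERE o.OrderID IS NULL
GROUP BY c.City
ORDER BY c.City
"""
inactive = conn.execute(query, (today,)).fetchall()
print(inactive)  # [('Delhi', 1), ('Mumbai', 1)]
```

Delhi still reports its one inactive customer even though another Delhi customer ordered recently, and Pune is absent because its only customer is active.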
• Q.544
Question
Write a SQL query to calculate the percentage of total orders from each city for food and
grocery deliveries in the past month. The table involved is Orders.
Explanation
To solve this:
• Filter Orders to include only food and grocery deliveries in the past month.
• Calculate the total number of orders and the number of orders from each city.
• Calculate the percentage of orders from each city by dividing the city-specific orders by
the total orders.
• Use GROUP BY to get results per city and filter for the last month using the OrderDate.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Cities (
CityID INT PRIMARY KEY,
CityName VARCHAR(50)
);
);
Learnings
• Filtering by date range to consider only the last month's orders.
• Using GROUP BY to aggregate orders by city.
• COUNT() to calculate the number of orders from each city.
• Calculating the percentage using simple arithmetic.
• JOINs (if necessary) to reference the Cities table, though in this case, the City is directly
in the Orders table.
Solutions
• - PostgreSQL and MySQL solution
SELECT
o.City,
COUNT(o.OrderID) AS CityOrders,
(COUNT(o.OrderID) * 100.0 / (
SELECT COUNT(*)
FROM Orders
WHERE OrderDate >= CURRENT_DATE - INTERVAL '1 month'
AND (OrderType = 'Food' OR OrderType = 'Grocery')
)) AS PercentageOfTotalOrders
FROM Orders o
WHERE o.OrderDate >= CURRENT_DATE - INTERVAL '1 month'
AND (o.OrderType = 'Food' OR o.OrderType = 'Grocery')
GROUP BY o.City
ORDER BY PercentageOfTotalOrders DESC;
Explanation:
• WHERE: Filters orders for the last month (OrderDate >= CURRENT_DATE - INTERVAL
'1 month') and for either food or grocery (OrderType = 'Food' OR OrderType =
'Grocery').
• COUNT(o.OrderID): Counts the total orders from each city in the last month.
• Percentage Calculation: The formula (COUNT(o.OrderID) * 100.0) / total_orders
calculates the percentage of orders from each city. The subquery (SELECT COUNT(*) FROM
Orders WHERE OrderDate >= CURRENT_DATE - INTERVAL '1 month' AND (OrderType
= 'Food' OR OrderType = 'Grocery')) computes the total orders of food and grocery
in the last month.
• GROUP BY o.City: Groups the results by City to calculate the number of orders per city.
• ORDER BY PercentageOfTotalOrders DESC: Orders the cities by the percentage of
orders in descending order.
Notes:
• CURRENT_DATE is available in both PostgreSQL and MySQL.
• INTERVAL '1 month' is PostgreSQL syntax. MySQL expects INTERVAL 1 MONTH; the
SQL-standard form INTERVAL '1' MONTH is accepted by both databases.
• Q.545
Question
Write a SQL query to find the most searched item on Blinkit in 2024 and its search volume
(i.e., how many times it was searched). The tables involved are SearchRecords and Items.
Explanation
To find the most searched item:
• We need to join the SearchRecords table (which logs item searches) with the Items table
(which contains details of the items).
• Filter the records to consider only searches made in 2024.
• Group the results by ItemID and calculate the search volume for each item.
• Order the results by search volume in descending order and return the top item.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Items (
ItemID INT PRIMARY KEY,
ItemName VARCHAR(100)
);
(7, 2, '2024-02-10'),
(8, 6, '2024-02-11'),
(9, 3, '2024-02-15'),
(10, 7, '2024-02-18'),
(11, 8, '2024-03-01'),
(12, 9, '2024-03-03'),
(13, 4, '2024-03-07'),
(14, 1, '2024-03-10'),
(15, 2, '2024-03-12'),
(16, 5, '2024-03-15'),
(17, 6, '2024-03-17'),
(18, 7, '2024-03-20'),
(19, 8, '2024-03-22'),
(20, 9, '2024-03-25'),
(21, 10, '2024-03-28'),
(22, 1, '2024-04-01'),
(23, 2, '2024-04-02'),
(24, 3, '2024-04-03'),
(25, 4, '2024-04-04'),
(26, 5, '2024-04-05'),
(27, 6, '2024-04-06'),
(28, 7, '2024-04-07'),
(29, 8, '2024-04-08'),
(30, 9, '2024-04-09'),
(31, 10, '2024-04-10'),
(32, 1, '2024-04-11'),
(33, 2, '2024-04-12'),
(34, 3, '2024-04-13'),
(35, 4, '2024-04-14');
Learnings
• Filtering records based on date range (2024 in this case).
• Using JOIN to combine SearchRecords with Items to get item names.
• GROUP BY to aggregate search volume by ItemID.
• COUNT() function to calculate the search volume for each item.
• Sorting the result by search volume to identify the most searched item.
Solutions
• - PostgreSQL and MySQL solution
SELECT i.ItemName, COUNT(sr.SearchID) AS SearchVolume
FROM SearchRecords sr
JOIN Items i ON sr.ItemID = i.ItemID
WHERE sr.SearchDate BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY i.ItemName
ORDER BY SearchVolume DESC
LIMIT 1;
Explanation:
• JOIN: The SearchRecords table is joined with the Items table on ItemID to get the name
of the item.
• WHERE: The SearchDate is filtered for the year 2024 using the BETWEEN '2024-01-01'
AND '2024-12-31' condition.
• COUNT(sr.SearchID): The COUNT() function is used to calculate how many times each
item was searched in 2024.
• GROUP BY i.ItemName: We group by ItemName to aggregate the search counts for
each item.
• ORDER BY SearchVolume DESC: The results are ordered by the search volume in
descending order to get the most searched item at the top.
• LIMIT 1: Only the top 1 item (most searched) is returned.
Notes:
• You can modify the LIMIT 1 if you want the top N most searched items.
• Q.546
Question
Write a SQL query to find the top 3 most searched items for Blinkit in 2024, along with
their search volumes. The tables involved are SearchRecords and Items.
Explanation
To find the top 3 most searched items:
• Join the SearchRecords table with the Items table to get item names.
• Filter the records for the year 2024.
• Group by ItemID to calculate the search volume for each item.
• Order by the search volume in descending order to get the most searched items.
• Limit the results to top 3 items.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Items (
ItemID INT PRIMARY KEY,
ItemName VARCHAR(100)
);
(17, 6, '2024-03-17'),
(18, 7, '2024-03-20'),
(19, 8, '2024-03-22'),
(20, 9, '2024-03-25'),
(21, 10, '2024-03-28'),
(22, 1, '2024-04-01'),
(23, 2, '2024-04-02'),
(24, 3, '2024-04-03'),
(25, 4, '2024-04-04'),
(26, 5, '2024-04-05'),
(27, 6, '2024-04-06'),
(28, 7, '2024-04-07'),
(29, 8, '2024-04-08'),
(30, 9, '2024-04-09'),
(31, 10, '2024-04-10'),
(32, 1, '2024-04-11'),
(33, 2, '2024-04-12'),
(34, 3, '2024-04-13'),
(35, 4, '2024-04-14');
Learnings
• Filtering by date range for 2024.
• JOIN to combine the SearchRecords and Items tables based on ItemID.
• COUNT() to calculate search volume for each item.
• GROUP BY to aggregate by item.
• LIMIT to return only the top N results.
Solutions
• - PostgreSQL and MySQL solution
SELECT
i.ItemName,
COUNT(sr.SearchID) AS SearchVolume
FROM SearchRecords sr
JOIN Items i ON sr.ItemID = i.ItemID
WHERE sr.SearchDate BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY i.ItemName
ORDER BY SearchVolume DESC
LIMIT 3;
Explanation:
• JOIN: Combines SearchRecords with Items on ItemID to get the item names.
• WHERE: Filters search records to only include those from 2024.
• COUNT(sr.SearchID): Counts how many times each item was searched.
• GROUP BY i.ItemName: Aggregates results by ItemName.
• ORDER BY SearchVolume DESC: Orders the items by the number of searches in
descending order.
• LIMIT 3: Returns the top 3 most searched items.
Notes:
• You can adjust the LIMIT value to return more or fewer items based on your needs.
• Q.547
Question
Write a SQL query to find the top 5 most frequently ordered items from Blinkit in 2024,
along with their total order quantity, considering both food and grocery categories, and
excluding items ordered less than 10 times in total. The tables involved are Orders and
OrderDetails.
Explanation
To solve this:
• Join the Orders table with the OrderDetails table to get the details of items ordered.
• Filter for orders in 2024 and for food and grocery categories.
• Calculate the total order quantity for each item in the year.
• Exclude items with a total order quantity of less than 10.
• Order the items by their total order quantity in descending order and return the top 5.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
OrderType VARCHAR(50),
OrderDate DATE
);
(13, 7, 2, 5),
(14, 8, 1, 4),
(15, 9, 2, 3),
(16, 9, 3, 5),
(17, 10, 1, 9),
(18, 10, 2, 3),
(19, 11, 3, 6),
(20, 12, 4, 2),
(21, 13, 1, 7),
(22, 14, 3, 4),
(23, 14, 1, 3),
(24, 15, 2, 6),
(25, 15, 1, 2),
(26, 16, 3, 4),
(27, 17, 4, 5),
(28, 17, 2, 7),
(29, 18, 1, 6);
Learnings
• JOIN operation between Orders and OrderDetails to gather order details and item
quantities.
• Using GROUP BY to aggregate the total quantity of each item.
• HAVING clause to filter out items with a total order quantity less than 10.
• Calculating the total order quantity for each item over 2024.
• Sorting the results in descending order of quantity and limiting the results to the top 5
items.
Solutions
• - PostgreSQL and MySQL solution
SELECT i.ItemName, SUM(od.Quantity) AS TotalQuantity
FROM OrderDetails od
JOIN Orders o ON od.OrderID = o.OrderID
JOIN Items i ON od.ItemID = i.ItemID
WHERE o.OrderDate BETWEEN '2024-01-01' AND '2024-12-31'
AND (o.OrderType = 'Food' OR o.OrderType = 'Grocery')
GROUP BY i.ItemName
HAVING SUM(od.Quantity) >= 10
ORDER BY TotalQuantity DESC
LIMIT 5;
Explanation:
• JOIN: We join OrderDetails with Orders on OrderID and Items on ItemID to get the
item names.
• WHERE: Filters orders for food and grocery and limits the date range to 2024.
• SUM(od.Quantity): This calculates the total quantity ordered for each item across all
orders.
• GROUP BY i.ItemName: We group by ItemName to aggregate the results for each item.
• HAVING: Filters out items with a total quantity less than 10.
• ORDER BY TotalQuantity DESC: Sorts the items in descending order of total quantity
ordered.
• LIMIT 5: Returns only the top 5 most ordered items.
Notes:
• Ensure that all necessary JOINs are performed to combine the OrderDetails, Orders, and
Items tables.
• The HAVING clause keeps only items whose total quantity is at least 10; change the threshold there if the requirement changes.
• Q.548
Question
Write a SQL query to find the top 5 most ordered food items on Zomato in 2024, along
with their total order quantities for each item. The tables involved are Orders,
OrderDetails, and FoodItems.
Explanation
To find the top 5 most ordered food items:
• Join the Orders table with the OrderDetails table to get item-level details.
• Filter the orders to include only food items (i.e., OrderType = 'Food').
• Filter the records to only include orders from 2024.
• Group the results by ItemID to calculate the total order quantity for each food item.
• Order the results by the total order quantity in descending order and return the top 5 most
ordered items.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
OrderType VARCHAR(50),
OrderDate DATE
);
--
INSERT INTO Orders (OrderID, OrderType, OrderDate)
VALUES
(1, 'Food', '2024-01-10'),
(2, 'Food', '2024-01-12'),
(3, 'Food', '2024-02-01'),
(4, 'Food', '2024-02-15'),
(5, 'Food', '2024-03-05'),
(6, 'Food', '2024-03-10'),
(7, 'Food', '2024-04-01'),
(8, 'Food', '2024-04-05'),
(9, 'Food', '2024-04-15'),
(10, 'Food', '2024-05-10');
--
INSERT INTO OrderDetails (OrderDetailID, OrderID, ItemID, Quantity)
VALUES
(1, 1, 1, 5),
(2, 1, 2, 2),
(3, 2, 3, 3),
(4, 2, 4, 1),
(5, 3, 1, 6),
(6, 3, 4, 2),
(7, 4, 3, 4),
(8, 4, 2, 5),
(9, 5, 1, 8),
(10, 5, 3, 3),
(11, 6, 4, 2),
(12, 6, 1, 7),
(13, 7, 2, 5),
(14, 8, 1, 4),
(15, 9, 2, 3),
(16, 9, 3, 5),
(17, 10, 1, 9),
(18, 10, 2, 3);
--
INSERT INTO FoodItems (ItemID, ItemName)
VALUES
(1, 'Pizza'),
(2, 'Burger'),
(3, 'Pasta'),
(4, 'Fries'),
(5, 'Sandwich');
Learnings
• JOIN to combine OrderDetails with Orders on OrderID and with FoodItems on ItemID.
• Filtering records for food orders only.
• Using GROUP BY to aggregate the total order quantity for each food item.
• ORDER BY to sort the results by total order quantity.
• LIMIT to get the top N items.
Solutions
• - PostgreSQL and MySQL solution
SELECT fi.ItemName, SUM(od.Quantity) AS TotalQuantity
FROM OrderDetails od
JOIN Orders o ON od.OrderID = o.OrderID
JOIN FoodItems fi ON od.ItemID = fi.ItemID
WHERE o.OrderType = 'Food'
AND o.OrderDate BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY fi.ItemName
ORDER BY TotalQuantity DESC
LIMIT 5;
Explanation:
• JOIN: Joins OrderDetails with Orders on OrderID and FoodItems on ItemID to get
item names.
• WHERE: Filters only food orders and restricts to the year 2024 using the BETWEEN clause.
• SUM(od.Quantity): Computes the total quantity ordered for each food item.
• GROUP BY fi.ItemName: Aggregates the data by ItemName.
• ORDER BY TotalQuantity DESC: Sorts by the total quantity ordered in descending
order.
• LIMIT 5: Returns only the top 5 most ordered food items.
Notes:
• Adjust the LIMIT if you need more or fewer top items.
• The WHERE clause filters to ensure only food items are considered for analysis.
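BETWEEN '2024-01-01' AND '2024-12-31' is safe here because OrderDate is a DATE. If the column were ever a TIMESTAMP, orders placed after midnight on 31 December would be missed. A minimal sketch of the same query with a half-open date range; the ItemName tie-breaker in ORDER BY is an assumption added only to make the output order deterministic:

```sql
SELECT fi.ItemName, SUM(od.Quantity) AS TotalQuantity
FROM OrderDetails od
JOIN Orders o ON od.OrderID = o.OrderID
JOIN FoodItems fi ON od.ItemID = fi.ItemID
WHERE o.OrderType = 'Food'
  AND o.OrderDate >= '2024-01-01'   -- inclusive lower bound
  AND o.OrderDate <  '2025-01-01'   -- exclusive upper bound
GROUP BY fi.ItemName
ORDER BY TotalQuantity DESC, fi.ItemName   -- assumed tie-breaker
LIMIT 5;
```

On the sample rows above this returns Pizza (39), Burger (18), Pasta (15), and Fries (5). Sandwich has no order lines, so the inner join leaves only four rows.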
• Q.549
Question
Write a SQL query to find the best-selling category on Zomato in 2024, based on the total
order quantity of food items in each category. The tables involved are Orders,
OrderDetails, FoodItems, and Categories.
Explanation
To find the best-selling category:
• Join the Orders table with the OrderDetails table to get the details of items ordered.
• Join the FoodItems table with the Categories table to categorize the items.
• Filter the records to include only food orders in 2024.
• Group the results by CategoryName and calculate the total order quantity for each
category.
• Order the results by total order quantity in descending order and return the top category.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
OrderType VARCHAR(50),
OrderDate DATE
);
--
INSERT INTO Orders (OrderID, OrderType, OrderDate)
VALUES
(1, 'Food', '2024-01-10'),
(2, 'Food', '2024-01-12'),
(3, 'Food', '2024-02-01'),
(4, 'Food', '2024-02-15'),
(5, 'Food', '2024-03-05'),
(6, 'Food', '2024-03-10'),
(7, 'Food', '2024-04-01'),
(8, 'Food', '2024-04-05'),
(9, 'Food', '2024-04-15'),
(10, 'Food', '2024-05-10');
--
INSERT INTO OrderDetails (OrderDetailID, OrderID, ItemID, Quantity)
VALUES
(1, 1, 1, 5),
(2, 1, 2, 2),
(3, 2, 3, 3),
(4, 2, 4, 1),
(5, 3, 1, 6),
(6, 3, 4, 2),
(7, 4, 3, 4),
(8, 4, 2, 5),
(9, 5, 1, 8),
(10, 5, 3, 3),
(11, 6, 4, 2),
(12, 6, 1, 7),
(13, 7, 2, 5),
(14, 8, 1, 4),
(15, 9, 2, 3),
(16, 9, 3, 5),
(17, 10, 1, 9),
--
INSERT INTO Categories (CategoryID, CategoryName)
VALUES
(1, 'Pizza'),
(2, 'Burgers'),
(3, 'Pasta'),
(4, 'Fries'),
(5, 'Sandwiches');
--
INSERT INTO FoodItems (ItemID, ItemName, CategoryID)
VALUES
(1, 'Margherita Pizza', 1),
(2, 'Cheese Burger', 2),
(3, 'Spaghetti Carbonara', 3),
(4, 'French Fries', 4),
(5, 'Veg Sandwich', 5),
(6, 'Pepperoni Pizza', 1),
(7, 'Chicken Burger', 2),
(8, 'Penne Alfredo', 3),
(9, 'Curly Fries', 4),
(10, 'Grilled Cheese Sandwich', 5);
Learnings
• JOIN operation between OrderDetails, Orders, FoodItems, and Categories to combine
item and category information.
• Filtering orders for food types only.
• GROUP BY to calculate total order quantities for each category.
• ORDER BY to sort categories by the total order quantity in descending order.
• LIMIT 1 to find the best-selling category.
Solutions
• - PostgreSQL and MySQL solution
SELECT c.CategoryName, SUM(od.Quantity) AS TotalQuantity
FROM OrderDetails od
JOIN Orders o ON od.OrderID = o.OrderID
JOIN FoodItems fi ON od.ItemID = fi.ItemID
JOIN Categories c ON fi.CategoryID = c.CategoryID
WHERE o.OrderType = 'Food'
AND o.OrderDate BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY c.CategoryName
ORDER BY TotalQuantity DESC
LIMIT 1;
Explanation:
• JOIN: We join OrderDetails with Orders on OrderID, FoodItems on ItemID, and
Categories on CategoryID to get category names.
• WHERE: Filters to include only food orders within the year 2024.
• SUM(od.Quantity): Calculates the total quantity of items ordered in each category.
• GROUP BY c.CategoryName: Aggregates the total quantity by CategoryName.
• ORDER BY TotalQuantity DESC: Orders the categories by the total quantity in
descending order.
• LIMIT 1: Returns the category with the highest total order quantity.
Notes:
• This query returns the best-selling category by total order quantity. Adjust the LIMIT if
you want the top N categories.
• The JOIN operation is critical to combine the information from the multiple tables.
• Q.550
Question
Write a SQL query to identify the highest ordered restaurant in each city on Zomato in
2024, based on the total quantity of items ordered. The tables involved are Orders,
OrderDetails, Restaurants, and Cities.
Explanation
To find the highest ordered restaurant in each city:
• Join the Orders table with the OrderDetails table to get the details of items ordered.
• Join the Restaurants table to link each order to a specific restaurant.
• Join the Cities table to associate each restaurant with a city.
• Filter the records to only include orders from 2024.
• Group the results by CityName and RestaurantID to calculate the total order quantity
for each restaurant.
• Identify the restaurant with the highest total order quantity in each city.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Cities (
CityID INT PRIMARY KEY,
CityName VARCHAR(100)
);
--
INSERT INTO Cities (CityID, CityName)
VALUES
(1, 'Delhi'),
(2, 'Mumbai'),
(3, 'Bangalore'),
(4, 'Chennai');
--
INSERT INTO Restaurants (RestaurantID, RestaurantName, CityID)
VALUES
(1, 'Restaurant A', 1),
(2, 'Restaurant B', 1),
(3, 'Restaurant C', 1),
--
INSERT INTO Orders (OrderID, OrderDate, RestaurantID, OrderType)
VALUES
(1, '2024-01-10', 1, 'Food'),
(2, '2024-01-12', 2, 'Food'),
(3, '2024-02-01', 3, 'Food'),
(4, '2024-02-15', 4, 'Food'),
(5, '2024-03-05', 5, 'Food'),
(6, '2024-03-10', 6, 'Food'),
(7, '2024-04-01', 7, 'Food'),
(8, '2024-04-05', 8, 'Food'),
(9, '2024-04-15', 9, 'Food'),
(10, '2024-05-10', 10, 'Food'),
(11, '2024-05-12', 11, 'Food'),
(12, '2024-06-01', 12, 'Food'),
(13, '2024-06-05', 13, 'Food'),
(14, '2024-07-01', 14, 'Food'),
(15, '2024-07-10', 15, 'Food'),
(16, '2024-08-01', 16, 'Food'),
(17, '2024-09-01', 17, 'Food'),
(18, '2024-09-15', 18, 'Food'),
(19, '2024-10-05', 19, 'Food'),
(20, '2024-10-15', 20, 'Food');
--
INSERT INTO OrderDetails (OrderDetailID, OrderID, ItemID, Quantity)
VALUES
(1, 1, 1, 5),
(2, 1, 2, 2),
(3, 2, 3, 3),
(4, 2, 4, 1),
(5, 3, 1, 6),
(6, 3, 4, 2),
(7, 4, 3, 4),
(8, 4, 2, 5),
(9, 5, 1, 8),
(10, 5, 3, 3),
(11, 6, 4, 2),
(12, 6, 1, 7),
(13, 7, 2, 5),
(14, 8, 1, 4),
(15, 9, 2, 3),
(16, 9, 3, 5),
(17, 10, 1, 9),
(18, 10, 2, 3),
(19, 11, 3, 6),
(20, 12, 4, 2),
(21, 13, 1, 7),
(22, 14, 3, 4),
(23, 14, 1, 3),
(24, 15, 2, 6),
(25, 15, 1, 2),
(26, 16, 3, 4),
(27, 17, 4, 5),
Learnings
• JOIN operations between Orders, OrderDetails, Restaurants, and Cities to link order
data with restaurant and city information.
• GROUP BY to aggregate total order quantities for each restaurant within each city.
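• MAX() with a per-city GROUP BY to find the highest total quantity in each city.
• Row-value comparison, (CityID, TotalQuantity) IN (...), to keep only the restaurant(s)
matching that maximum; a window function such as ROW_NUMBER() or RANK() is a common
alternative.
• ORDER BY to sort the final result by city name.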
Solutions
• - PostgreSQL and MySQL solution
WITH RestaurantTotalOrders AS (
SELECT r.CityID, r.RestaurantID, r.RestaurantName, SUM(od.Quantity) AS TotalQuantity
FROM OrderDetails od
JOIN Orders o ON od.OrderID = o.OrderID
JOIN Restaurants r ON o.RestaurantID = r.RestaurantID
WHERE o.OrderDate BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY r.CityID, r.RestaurantID, r.RestaurantName
)
SELECT c.CityName, r.RestaurantName, r.TotalQuantity
FROM RestaurantTotalOrders r
JOIN Cities c ON r.CityID = c.CityID
WHERE (r.CityID, r.TotalQuantity) IN (
SELECT CityID, MAX(TotalQuantity)
FROM RestaurantTotalOrders
GROUP BY CityID
)
ORDER BY c.CityName;
Explanation:
• WITH: The RestaurantTotalOrders Common Table Expression (CTE) aggregates the
total quantity ordered by each restaurant in each city in 2024.
• JOIN: Joins RestaurantTotalOrders with Cities to get the city name.
• WHERE: Filters the results to get only the restaurant with the highest total quantity in
each city.
• MAX(TotalQuantity): Identifies the restaurant with the highest total quantity in each city
using the MAX function.
• ORDER BY: Orders the final result by CityName for clarity.
Notes:
• This query returns the highest ordered restaurant in each city by total quantity of items
ordered.
• The WITH clause helps to pre-aggregate the total order quantities for restaurants in each
city.
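The same result can also be reached with a window function instead of the MAX() subquery. A hedged sketch, assuming PostgreSQL or MySQL 8.0+ (earlier MySQL versions have no window functions); the CTE name Ranked and the column rnk are illustrative names, not part of the original solution:

```sql
WITH RestaurantTotalOrders AS (
    SELECT r.CityID, r.RestaurantID, r.RestaurantName,
           SUM(od.Quantity) AS TotalQuantity
    FROM OrderDetails od
    JOIN Orders o ON od.OrderID = o.OrderID
    JOIN Restaurants r ON o.RestaurantID = r.RestaurantID
    WHERE o.OrderDate BETWEEN '2024-01-01' AND '2024-12-31'
    GROUP BY r.CityID, r.RestaurantID, r.RestaurantName
),
Ranked AS (
    -- Rank restaurants inside each city by total quantity, highest first.
    SELECT rto.*,
           RANK() OVER (PARTITION BY rto.CityID
                        ORDER BY rto.TotalQuantity DESC) AS rnk
    FROM RestaurantTotalOrders rto
)
SELECT c.CityName, k.RestaurantName, k.TotalQuantity
FROM Ranked k
JOIN Cities c ON k.CityID = c.CityID
WHERE k.rnk = 1
ORDER BY c.CityName;
```

RANK() keeps every restaurant tied for first place, which matches the behaviour of the MAX() version. Swapping in ROW_NUMBER() would return exactly one restaurant per city, chosen arbitrarily among ties.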
• Q.551
Question
Write a SQL query to select the name and bonus of all employees whose bonus is less than
1000. If an employee does not have a bonus, the result should show NULL for the bonus.
Explanation
To solve this:
• Perform a LEFT JOIN between the Employee and Bonus tables, using empId as the
common key.
• Filter the results to only include employees whose bonus is less than 1000 or where the
bonus is NULL.
• Select the name and bonus of the employees.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Employee (
empId INT PRIMARY KEY,
name VARCHAR(50),
supervisor INT,
salary INT
);
Learnings
• LEFT JOIN to include all employees from the Employee table, even if they don't have a
bonus in the Bonus table.
• Use of WHERE clause to filter employees with a bonus less than 1000 or NULL.
• NULL handling to ensure employees with no bonus are included with a NULL value for
bonus.
Solutions
• - PostgreSQL and MySQL solution
SELECT e.name, b.bonus
FROM Employee e
LEFT JOIN Bonus b ON e.empId = b.empId
WHERE (b.bonus < 1000 OR b.bonus IS NULL);
Explanation:
• LEFT JOIN ensures all employees are included even if they don't have a corresponding
record in the Bonus table.
• The WHERE clause filters the employees whose bonus is either less than 1000 or NULL.
• The query returns the name and the bonus of those employees, including NULL for those
who don't have a bonus or have no matching record in the Bonus table.
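The difference between filtering in WHERE and filtering in the join condition is easy to miss. A minimal sketch, assuming a Bonus table with columns empId and bonus (its definition is not shown above, so the DDL and all sample rows below are hypothetical) and the Employee table created earlier:

```sql
-- Hypothetical Bonus table and sample rows (not from the original text).
CREATE TABLE Bonus (
    empId INT PRIMARY KEY,
    bonus INT
);

INSERT INTO Employee (empId, name, supervisor, salary) VALUES
(1, 'John', 3, 1000),
(2, 'Dan', 3, 2000),
(3, 'Brad', NULL, 4000),
(4, 'Thomas', 3, 4000);

INSERT INTO Bonus (empId, bonus) VALUES
(2, 500),
(4, 2000);

-- Filter after the join: returns John (NULL), Dan (500), Brad (NULL).
SELECT e.name, b.bonus
FROM Employee e
LEFT JOIN Bonus b ON e.empId = b.empId
WHERE b.bonus < 1000 OR b.bonus IS NULL;

-- Common mistake: the filter sits in the ON clause. Thomas's bonus of 2000
-- fails the join condition, so he is kept with a NULL bonus instead of
-- being dropped from the result.
SELECT e.name, b.bonus
FROM Employee e
LEFT JOIN Bonus b ON e.empId = b.empId AND b.bonus < 1000;
```

Without an ORDER BY, the row order of either result is not guaranteed.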
• Q.552
Question
Write an SQL query to find the customer_number for the customer who has placed the
largest number of orders in the Orders table. It is guaranteed that exactly one customer will
have placed more orders than any other customer.
Explanation
To solve this:
• Group the Orders table by customer_number and count the number of orders placed by
each customer.
• Sort the customers by the number of orders in descending order.
• Select the customer with the highest order count.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Orders (
order_number INT PRIMARY KEY,
customer_number INT
);
Learnings
• GROUP BY to aggregate the number of orders for each customer.
• COUNT() function to count the number of orders per customer.
• ORDER BY to sort the customers by order count in descending order.
• LIMIT 1 to select the customer with the most orders.
Solutions
• - PostgreSQL and MySQL solution
SELECT customer_number
FROM Orders
GROUP BY customer_number
ORDER BY COUNT(order_number) DESC
LIMIT 1;
Explanation:
• GROUP BY: This groups the orders by customer_number.
• COUNT(order_number): This counts the number of orders for each customer.
• ORDER BY COUNT(order_number) DESC: Sorts the customers by the total number of
orders in descending order.
• LIMIT 1: Ensures only the customer with the most orders is returned.
This query returns the customer_number of the customer who has placed the most orders.
• Q.553
Question
Write an SQL query to report the movies from the Cinema table that have an odd-numbered
id and a description that is not "boring". Return the result sorted by rating in descending
order.
Explanation
To solve this:
• Filter movies where the id is odd using the modulo operation (id % 2 != 0).
• Ensure the description is not "boring" by using a WHERE clause (description !=
'boring').
• Sort the results in descending order of the rating.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Cinema (
id INT PRIMARY KEY,
movie VARCHAR(100),
description VARCHAR(255),
rating FLOAT
);
Learnings
• Use the modulo operator (%) to check for odd numbers (id % 2 != 0).
• Filter rows using the WHERE clause for conditions like description not being "boring".
• Sort the result with ORDER BY rating DESC to get the movies with the highest rating
first.
Solutions
• - PostgreSQL and MySQL solution
SELECT id, movie, description, rating
FROM Cinema
WHERE id % 2 != 0
AND description != 'boring'
ORDER BY rating DESC;
Explanation:
• WHERE id % 2 != 0: This filters out movies with an even id, leaving only those with odd
id values.
• AND description != 'boring': Ensures that the description is not "boring".
• ORDER BY rating DESC: Sorts the movies by rating in descending order, so movies
with higher ratings appear first.
This query will return the list of movies that meet the criteria, sorted by their rating in
descending order.
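One subtlety: description != 'boring' evaluates to NULL for rows whose description is NULL, so those rows are silently dropped. Whether a movie with no description should count as "not boring" is an assumption the question does not settle. If it should, a sketch of a NULL-tolerant variant looks like this:

```sql
SELECT id, movie, description, rating
FROM Cinema
WHERE MOD(id, 2) <> 0                                   -- odd id; MOD() exists in both engines
  AND (description IS NULL OR description <> 'boring')  -- keep rows with no description
ORDER BY rating DESC;
```

MOD(id, 2) <> 0 is used rather than = 1 so that negative odd ids, for which the remainder is -1, would still qualify.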
• Q.554
Question
Write an SQL query to find the top-performing rider(s) for Zomato, based on the number of
deliveries completed in the last 30 days. The rider(s) who made the most deliveries during
this period should be considered top performers. Return the rider_id and the total_deliveries
for each top-performing rider.
Explanation
To solve this:
• Calculate the total number of deliveries made by each rider in the last 30 days. This
requires filtering the data for the last 30 days based on the delivery_date.
• Count the number of deliveries per rider using the COUNT() function.
• Find the maximum number of deliveries made by any rider in this period.
• Keep only the riders whose delivery count equals that maximum.
• Return the rider_id and total_deliveries for the top-performing rider(s).
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Deliveries (
delivery_id INT PRIMARY KEY,
rider_id INT,
delivery_date DATE
);
Learnings
• Use the WHERE clause to filter deliveries within the last 30 days.
• Use COUNT() to count the number of deliveries per rider.
• MAX() to find the highest number of deliveries.
• Use GROUP BY to group the results by rider_id and HAVING to keep only the riders whose
delivery count equals the maximum.
Solutions
• - PostgreSQL and MySQL solution
SELECT rider_id, COUNT(delivery_id) AS total_deliveries
FROM Deliveries
-- The quoted form INTERVAL '30' DAY is accepted by both MySQL and PostgreSQL;
-- the unquoted INTERVAL 30 DAY is MySQL-only.
WHERE delivery_date >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY rider_id
HAVING COUNT(delivery_id) = (
SELECT MAX(delivery_count)
FROM (
SELECT rider_id, COUNT(delivery_id) AS delivery_count
FROM Deliveries
WHERE delivery_date >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY rider_id
) AS rider_counts
);
Explanation:
• WHERE delivery_date >= CURRENT_DATE - INTERVAL '30' DAY: Filters
deliveries in the last 30 days.
• COUNT(delivery_id): Counts the number of deliveries made by each rider.
• GROUP BY rider_id: Groups the results by rider_id to count deliveries per rider.
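The per-rider count is computed twice in the solution above. A sketch of an equivalent form that computes it once in a CTE, assuming PostgreSQL or MySQL 8.0+ (older MySQL versions do not support WITH); the CTE name rider_counts is reused from the subquery alias:

```sql
WITH rider_counts AS (
    -- One row per rider with their delivery count over the last 30 days.
    SELECT rider_id, COUNT(delivery_id) AS total_deliveries
    FROM Deliveries
    WHERE delivery_date >= CURRENT_DATE - INTERVAL '30' DAY  -- quoted form works in both engines
    GROUP BY rider_id
)
SELECT rider_id, total_deliveries
FROM rider_counts
WHERE total_deliveries = (SELECT MAX(total_deliveries) FROM rider_counts);
```

Because the comparison is against the maximum, every rider tied for the top count is returned, which is what the question asks for with "rider(s)".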
Learnings
• COUNT() function to count successful and unsuccessful deliveries.
• CASE WHEN to differentiate between successful and unsuccessful deliveries.
• SUM() function to calculate total earnings.
• ORDER BY to sort based on earnings in descending order.
• LIMIT 3 to select only the top 3 riders.
Solutions
• - PostgreSQL and MySQL solution
SELECT rider_id,
COUNT(CASE WHEN status = 'successful' THEN 1 END) AS total_successful_deliveries,
COUNT(CASE WHEN status = 'unsuccessful' THEN 1 END) AS total_unsuccessful_deliveries,
COUNT(CASE WHEN status = 'successful' THEN 1 END) * 5 AS total_earnings
FROM Deliveries
-- The quoted form INTERVAL '30' DAY is accepted by both MySQL and PostgreSQL;
-- the unquoted INTERVAL 30 DAY is MySQL-only.
WHERE delivery_date >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY rider_id
ORDER BY total_earnings DESC
LIMIT 3;
Explanation:
• COUNT(CASE WHEN status = 'successful' THEN 1 END): This counts the number of
successful deliveries for each rider.
• COUNT(CASE WHEN status = 'unsuccessful' THEN 1 END): This counts the number
of unsuccessful deliveries for each rider.
• COUNT(CASE WHEN status = 'successful' THEN 1 END) * 5: This calculates the
total earnings, where each successful delivery earns $5.
• WHERE delivery_date >= CURRENT_DATE - INTERVAL '30' DAY: Filters
deliveries within the last 30 days.
• GROUP BY rider_id: Groups the results by rider to count the deliveries and earnings per
rider.
• ORDER BY total_earnings DESC: Sorts riders by total earnings in descending order.
• LIMIT 3: Selects only the top 3 riders based on their earnings.
This query will return the rider_id, total_successful_deliveries,
total_unsuccessful_deliveries, and total_earnings for the top 3 riders who made the most
successful deliveries in the last 30 days.
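In PostgreSQL the conditional counts can also be written with the aggregate FILTER clause, which some readers find easier to scan than COUNT(CASE ...). A sketch, PostgreSQL only (MySQL does not support FILTER); the rider_id tie-breaker in ORDER BY is an assumption added to make the top 3 deterministic:

```sql
-- PostgreSQL only: FILTER is not supported by MySQL.
SELECT rider_id,
       COUNT(*) FILTER (WHERE status = 'successful')     AS total_successful_deliveries,
       COUNT(*) FILTER (WHERE status = 'unsuccessful')   AS total_unsuccessful_deliveries,
       COUNT(*) FILTER (WHERE status = 'successful') * 5 AS total_earnings
FROM Deliveries
WHERE delivery_date >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY rider_id
ORDER BY total_earnings DESC, rider_id   -- assumed tie-breaker
LIMIT 3;
```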
• Q.557
Question
Write an SQL query to find the average rider working hours per day based on the order
delivery times from the OrderDelivery table. The table contains the rider_id, order_id,
and delivery_time (in timestamp format). The goal is to calculate the average working hours
each rider spends delivering orders per day.
Explanation
To solve this:
• Extract the working hours for each rider per day from the delivery_time.
Learnings
• DATE() or CAST() to extract the date part from a timestamp.
• COUNT() to calculate the number of delivery days for each rider.
• TIMESTAMPDIFF() or EXTRACT() to calculate the difference between two
timestamps (start and end times) to get working hours.
• AVG() to calculate the average working hours per day for each rider.
Solutions
• - MySQL solution (TIMESTAMPDIFF() is MySQL-specific)
SELECT rider_id,
AVG(TIMESTAMPDIFF(HOUR, delivery_time, NOW())) AS avg_working_hours_per_day
FROM OrderDelivery
GROUP BY rider_id
ORDER BY avg_working_hours_per_day DESC;
Explanation:
• TIMESTAMPDIFF(HOUR, delivery_time, NOW()): This calculates the difference
between the delivery_time and the current time in hours. For a proper "working hour per
day," you might need two timestamps (e.g., start and end time) for deliveries, but if it's a
simple timestamp column representing delivery time, this will give the hours from the
delivery time to the current time.
• AVG(): To calculate the average of these hourly differences for each rider.
• GROUP BY rider_id: This groups the results by rider_id to calculate the working hours
per rider.
• ORDER BY avg_working_hours_per_day DESC: This orders the results by average
working hours per day in descending order.
This query will return the rider_id along with their average working hours per day. You
can adjust the logic if there are explicit start and end times for each delivery.
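With only a single delivery_time column, one workable interpretation is to treat a rider's working time on a given day as the span between their first and last delivery that day, then average those spans across days. This is an assumption about what "working hours" means, not something the table definition confirms. A sketch in MySQL 8.0+ syntax, with the CTE name daily_span and the columns work_date and hours_worked chosen for illustration:

```sql
WITH daily_span AS (
    -- One row per rider per calendar day: hours between first and last delivery.
    SELECT rider_id,
           DATE(delivery_time) AS work_date,
           TIMESTAMPDIFF(MINUTE, MIN(delivery_time), MAX(delivery_time)) / 60.0 AS hours_worked
    FROM OrderDelivery
    GROUP BY rider_id, DATE(delivery_time)
)
SELECT rider_id,
       AVG(hours_worked) AS avg_working_hours_per_day
FROM daily_span
GROUP BY rider_id
ORDER BY avg_working_hours_per_day DESC;
-- PostgreSQL: replace the TIMESTAMPDIFF expression with
--   EXTRACT(EPOCH FROM (MAX(delivery_time) - MIN(delivery_time))) / 3600.0
```

A day with a single delivery contributes 0 hours under this definition, which pulls the average down. Filter such days out if that is not the intended behaviour.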
• Q.558
Question
Write an SQL query to calculate the overall idle time for each rider on 31st December 2024.
The idle time for each rider is defined as the time between the first order's delivery time
and the next order's receive time. Only consider the orders that occurred on 31st December
2024. Return the rider_id along with their overall idle time in hours.
Explanation
To solve this:
• Identify the first order: The first order of the day for each rider on 31st December 2024.
• Find the next order: The order placed right after the first order for each rider.
• Calculate idle time: The idle time is the difference between the first order's delivery
time and the next order's receive time.
• Summing idle times: We need to calculate the total idle time for each rider based on all
orders on 31st December 2024.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE OrderDelivery (
order_id INT PRIMARY KEY,
rider_id INT,
order_time TIMESTAMP
);
Learnings
• LEAD(): To get the time of the rider's next order after the current one.
• DATE(): To filter only orders that occurred on 31st December 2024.
• TIMESTAMPDIFF(): To calculate the time difference between two timestamps in hours.
• SUM(): To get the total idle time for each rider.
Solutions
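• - MySQL solution
WITH OrderWithNext AS (
SELECT rider_id,
order_time,
LEAD(order_time) OVER (PARTITION BY rider_id ORDER BY order_time) AS next_order_time
FROM OrderDelivery
WHERE DATE(order_time) = '2024-12-31'
)
-- TIMESTAMPDIFF() is MySQL-specific; in PostgreSQL use
-- EXTRACT(EPOCH FROM (next_order_time - order_time)) / 3600 instead.
SELECT rider_id,
SUM(TIMESTAMPDIFF(HOUR, order_time, next_order_time)) AS overall_idle_time
FROM OrderWithNext
WHERE next_order_time IS NOT NULL
GROUP BY rider_id
ORDER BY overall_idle_time DESC;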
Explanation:
• LEAD(order_time) OVER (PARTITION BY rider_id ORDER BY order_time): This
window function returns the next order time for each rider based on the order_time ordered
in ascending order.
• It partitions the data by rider_id, meaning that for each rider, it will look at their orders
and get the time of the next order for the rider.
• WHERE DATE(order_time) = '2024-12-31': Filters the data to only consider orders on
31st December 2024.
• TIMESTAMPDIFF(HOUR, order_time, next_order_time): This function calculates the
time difference (in hours) between the order_time and the next_order_time (i.e., idle time
between orders).
• SUM(TIMESTAMPDIFF(HOUR, order_time, next_order_time)): This sums up the
idle time for all orders for each rider, i.e., the total idle time for each rider on 31st December
2024.
• GROUP BY rider_id: Groups the results by rider_id to calculate the total idle time for
each rider.
• WHERE next_order_time IS NOT NULL: Ensures that only records with a next order
(i.e., there is a subsequent order to calculate idle time) are included.
• ORDER BY overall_idle_time DESC: Orders the results by the total idle time for each
rider in descending order.
This query will return the rider_id and their overall idle time in hours for 31st December
2024, sorted in descending order by idle time.
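TIMESTAMPDIFF(HOUR, ...) returns whole hours, so each gap is truncated before it is summed, and the function itself does not exist in PostgreSQL. A hedged PostgreSQL sketch of the same logic that keeps fractional hours and filters with a half-open timestamp range instead of DATE(); it uses only the OrderDelivery columns shown in the schema above:

```sql
-- PostgreSQL sketch: idle time per rider on 31 December 2024, in fractional hours.
WITH OrderWithNext AS (
    SELECT rider_id,
           order_time,
           LEAD(order_time) OVER (PARTITION BY rider_id ORDER BY order_time) AS next_order_time
    FROM OrderDelivery
    WHERE order_time >= '2024-12-31'
      AND order_time <  '2025-01-01'
)
SELECT rider_id,
       SUM(EXTRACT(EPOCH FROM (next_order_time - order_time))) / 3600.0 AS overall_idle_time
FROM OrderWithNext
WHERE next_order_time IS NOT NULL   -- the last order of the day has no successor
GROUP BY rider_id
ORDER BY overall_idle_time DESC;
```

Riders with only one order on that day have no gap to measure and therefore do not appear in the result.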
• Q.559
Problem statement
Write an SQL query to calculate the revenue contribution of food delivery vs. grocery
delivery for the top 5 revenue-generating cities in the last quarter.
Explanation
To solve this:
• Filter the data to consider only the orders from the last quarter.
• Sum the revenue for food delivery and grocery delivery orders separately.
• Identify the top 5 cities based on total revenue.
• Calculate the contribution percentage for food delivery and grocery delivery in these
cities.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Revenue (
OrderID INT PRIMARY KEY,
OrderType VARCHAR(50), -- 'food' or 'grocery'
City VARCHAR(100),
TotalAmount DECIMAL(10, 2),
OrderDate DATE
);
INSERT INTO Revenue (OrderID, OrderType, City, TotalAmount, OrderDate)
VALUES
(1, 'food', 'New York', 100.00, '2024-10-01'),
(2, 'grocery', 'New York', 50.00, '2024-10-05'),
(3, 'food', 'Los Angeles', 120.00, '2024-11-15'),
(4, 'grocery', 'Los Angeles', 75.00, '2024-11-17'),
(5, 'food', 'San Francisco', 150.00, '2024-12-20'),
(6, 'grocery', 'San Francisco', 40.00, '2024-12-22'),
(7, 'food', 'Chicago', 80.00, '2024-10-10'),
(8, 'grocery', 'Chicago', 60.00, '2024-10-12'),
(9, 'food', 'Houston', 90.00, '2024-11-05'),
(10, 'grocery', 'Houston', 100.00, '2024-11-08'),
(11, 'food', 'Phoenix', 110.00, '2024-12-01'),
(12, 'grocery', 'Phoenix', 120.00, '2024-12-05');
Learnings
• Filtering by date: Use the WHERE clause to filter data for the last quarter.
• Aggregating by order type: Use SUM() to get total revenue for food and grocery.
• Ranking cities: Use GROUP BY and ORDER BY to get top cities based on total revenue.
• Calculating percentages: The revenue contribution is calculated as (revenue from food
/ total revenue) * 100 and similarly for grocery.
Solutions
• - PostgreSQL and MySQL solution
WITH CityRevenue AS (
SELECT City,
SUM(CASE WHEN OrderType = 'food' THEN TotalAmount ELSE 0 END) AS FoodRevenue,
SUM(CASE WHEN OrderType = 'grocery' THEN TotalAmount ELSE 0 END) AS GroceryRevenue,
SUM(TotalAmount) AS TotalRevenue
FROM Revenue
WHERE OrderDate >= '2024-10-01' AND OrderDate <= '2024-12-31' -- Last Quarter (Q4 2024)
GROUP BY City
ORDER BY TotalRevenue DESC
LIMIT 5
)
SELECT City,
FoodRevenue,
GroceryRevenue,
TotalRevenue,
(FoodRevenue / TotalRevenue) * 100 AS FoodRevenueContribution,
(GroceryRevenue / TotalRevenue) * 100 AS GroceryRevenueContribution
FROM CityRevenue;
Explanation:
• CityRevenue CTE:
• This common table expression (CTE) filters the data to the last quarter (from October 1,
2024, to December 31, 2024).
• It calculates the FoodRevenue, GroceryRevenue, and TotalRevenue for each city using
the SUM() function.
• We use CASE inside SUM() to sum the amounts conditionally based on OrderType (food or
grocery).
• After grouping by city, we order the cities by TotalRevenue in descending order and limit
the result to the top 5 cities.
• Final SELECT:
• The outer query selects the city and calculates the percentage contributions of food and
grocery revenues by dividing the respective revenues by the total revenue for each city.
• We multiply by 100 to express the contributions as percentages.
• WHERE OrderDate >= '2024-10-01' AND OrderDate <= '2024-12-31': Filters the
orders that are placed during the last quarter of 2024 (October, November, December).
• LIMIT 5: Restricts the result to only the top 5 cities based on total revenue.
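Expected Output (percentages shown rounded to two decimals):
City           FoodRevenue  GroceryRevenue  TotalRevenue  FoodRevenueContribution  GroceryRevenueContribution
Phoenix        110.00       120.00          230.00        47.83%                   52.17%
Los Angeles    120.00       75.00           195.00        61.54%                   38.46%
San Francisco  150.00       40.00           190.00        78.95%                   21.05%
Houston        90.00        100.00          190.00        47.37%                   52.63%
New York       100.00       50.00           150.00        66.67%                   33.33%
Phoenix has the highest total revenue (230.00), so Chicago (140.00) falls outside the top 5.
San Francisco and Houston tie on total revenue, so their relative order is not guaranteed.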
This query calculates the revenue contribution of food vs. grocery orders in the top 5
revenue-generating cities in the last quarter of 2024.
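The query above returns unrounded ratios and leaves the final row order unspecified. A sketch of a variant that rounds the percentages, guards against a zero total, and orders the output explicitly; the City tie-breaker and the half-open date range are assumptions added for determinism and are not part of the original solution:

```sql
WITH CityRevenue AS (
    SELECT City,
           SUM(CASE WHEN OrderType = 'food' THEN TotalAmount ELSE 0 END) AS FoodRevenue,
           SUM(CASE WHEN OrderType = 'grocery' THEN TotalAmount ELSE 0 END) AS GroceryRevenue,
           SUM(TotalAmount) AS TotalRevenue
    FROM Revenue
    WHERE OrderDate >= '2024-10-01' AND OrderDate < '2025-01-01'   -- Q4 2024
    GROUP BY City
    ORDER BY TotalRevenue DESC, City   -- assumed tie-breaker
    LIMIT 5
)
SELECT City,
       FoodRevenue,
       GroceryRevenue,
       TotalRevenue,
       ROUND(100.0 * FoodRevenue / NULLIF(TotalRevenue, 0), 2) AS FoodRevenueContribution,
       ROUND(100.0 * GroceryRevenue / NULLIF(TotalRevenue, 0), 2) AS GroceryRevenueContribution
FROM CityRevenue
ORDER BY TotalRevenue DESC, City;
```

NULLIF(TotalRevenue, 0) turns a zero total into NULL, so the division yields NULL rather than an error for a city whose amounts sum to zero.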
• Q.560
Question
Write an SQL query to find the average order amount for each OrderType (food or
grocery) in the last 30 days for all cities.
Explanation
To solve this:
• Filter the orders from the last 30 days.
• Group by OrderType to calculate the average order amount separately for food and
grocery.
• Return the results with the average order amount for each order type.
Datasets and SQL Schemas
• - Table creation and sample data
CREATE TABLE Revenue (
OrderID INT PRIMARY KEY,
OrderType VARCHAR(50), -- 'food' or 'grocery'
City VARCHAR(100),
TotalAmount DECIMAL(10, 2),
OrderDate DATE
);
Learnings
• DATE subtraction: Use the CURRENT_DATE function to subtract 30 days and filter the
orders in the last 30 days.
• GROUP BY and AVG(): Use GROUP BY to group the data by OrderType and AVG() to
calculate the average order amount for each type.
Solutions
• - PostgreSQL and MySQL solution
SELECT OrderType,
AVG(TotalAmount) AS AverageOrderAmount
FROM Revenue
WHERE OrderDate >= CURRENT_DATE - INTERVAL '30' DAY
GROUP BY OrderType;
Explanation:
• WHERE OrderDate >= CURRENT_DATE - INTERVAL '30' DAY: Filters the orders
to include only those that have been placed in the last 30 days from the current date. The
quoted interval form is accepted by both MySQL and PostgreSQL; the unquoted
INTERVAL 30 DAY is MySQL-only.
• GROUP BY OrderType: Groups the data by OrderType (food or grocery) so we can
calculate the average separately for each type.
• AVG(TotalAmount): Calculates the average of the TotalAmount for each group of orders
(food or grocery).
Swiggy
• Q.561
Question
Find city-wise customer count who have placed more than three orders in November 2023.
Explanation
To solve this, we need to:
• Filter orders placed in November 2023.
• Group by customer_id to count the number of orders for each customer.
• Keep only customers who have placed more than three orders.
• Group by city to count the number of customers in each city.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE orders(
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
price FLOAT,
city VARCHAR(25)
);
• - Datasets
INSERT INTO orders (order_id, customer_id, order_date, price, city) VALUES
(1, 101, '2023-11-01', 150.50, 'Mumbai'),
(2, 102, '2023-11-05', 200.75, 'Delhi'),
(3, 103, '2023-11-10', 180.25, 'Mumbai'),
(4, 104, '2023-11-15', 120.90, 'Delhi'),
(5, 105, '2023-11-20', 250.00, 'Mumbai'),
(6, 108, '2023-11-25', 180.75, 'Gurgoan'),
(7, 107, '2023-12-30', 300.25, 'Delhi'),
(8, 108, '2023-12-02', 220.50, 'Gurgoan'),
(9, 109, '2023-11-08', 170.00, 'Mumbai'),
(10, 110, '2023-10-12', 190.75, 'Delhi'),
(11, 108, '2023-10-18', 210.25, 'Gurgoan'),
(12, 112, '2023-11-24', 280.50, 'Mumbai'),
(13, 113, '2023-10-29', 150.00, 'Mumbai'),
(14, 103, '2023-11-03', 200.75, 'Mumbai'),
(15, 115, '2023-10-07', 230.90, 'Delhi'),
(16, 116, '2023-11-11', 260.00, 'Mumbai'),
(17, 117, '2023-11-16', 180.75, 'Mumbai'),
(18, 102, '2023-11-22', 320.25, 'Delhi'),
(19, 103, '2023-11-27', 170.50, 'Mumbai'),
(20, 102, '2023-11-05', 220.75, 'Delhi'),
(21, 103, '2023-11-09', 300.25, 'Mumbai'),
(22, 101, '2023-11-15', 180.50, 'Mumbai'),
(23, 104, '2023-11-18', 250.75, 'Delhi'),
(24, 102, '2023-11-20', 280.25, 'Delhi'),
(25, 117, '2023-11-16', 180.75, 'Mumbai'),
(26, 117, '2023-11-16', 180.75, 'Mumbai'),
(27, 117, '2023-11-16', 180.75, 'Mumbai'),
(28, 117, '2023-11-16', 180.75, 'Mumbai');
Learnings
• Use of GROUP BY to aggregate data.
• Using HAVING to filter on aggregated results.
• Date filtering with WHERE clause.
Solutions
• - PostgreSQL solution
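-- Inner query: customers with more than three orders in November 2023.
-- Outer query: count those customers per city.
SELECT city, COUNT(*) AS customer_count
FROM (
SELECT customer_id, city
FROM orders
WHERE order_date >= '2023-11-01' AND order_date <= '2023-11-30'
GROUP BY customer_id, city
HAVING COUNT(order_id) > 3
) AS frequent_customers
GROUP BY city;
• - MySQL solution
-- Inner query: customers with more than three orders in November 2023.
-- Outer query: count those customers per city.
SELECT city, COUNT(*) AS customer_count
FROM (
SELECT customer_id, city
FROM orders
WHERE order_date BETWEEN '2023-11-01' AND '2023-11-30'
GROUP BY customer_id, city
HAVING COUNT(order_id) > 3
) AS frequent_customers
GROUP BY city;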
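To check the result against the sample data, it helps to run the per-customer step from the Explanation on its own. A small sketch; the alias november_orders is illustrative:

```sql
-- Step 1 on its own: customers with more than three orders in November 2023.
SELECT customer_id, city, COUNT(order_id) AS november_orders
FROM orders
WHERE order_date >= '2023-11-01' AND order_date < '2023-12-01'
GROUP BY customer_id, city
HAVING COUNT(order_id) > 3
ORDER BY customer_id;
-- On the sample data this returns:
--   102 | Delhi  | 4
--   103 | Mumbai | 4
--   117 | Mumbai | 5
```

Counting these rows per city gives Delhi = 1 and Mumbai = 2. A query that applies HAVING COUNT(order_id) > 3 directly to the city groups tests whether the city has more than three orders, which is a different question.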
• Q.562
Question
Count the delayed orders for each delivery partner based on predicted and actual delivery
times.
Explanation
The task is to count the number of delayed orders for each delivery partner. An order is
considered delayed if the actual delivery time is later than the predicted delivery time. We
need to:
Learnings
• Use of WHERE clause for filtering data based on time comparison.
• GROUP BY to aggregate data by each delivery partner.
• Counting filtered results with COUNT().
Solutions
• - PostgreSQL solution
SELECT del_partner, COUNT(order_id) AS delayed_orders
FROM order_details
WHERE delivery_time > predicted_time
GROUP BY del_partner;
• - MySQL solution
SELECT del_partner, COUNT(order_id) AS delayed_orders
FROM order_details
WHERE delivery_time > predicted_time
GROUP BY del_partner;
• Q.563
Question
Calculate the bad experience rate for new users who signed up in June 2022 during their first
14 days on the platform. The output should include the percentage of bad experiences,
rounded to 2 decimal places. A bad experience is defined as orders that were either completed
incorrectly, never received, or delivered late (i.e., delivery was more than 30 minutes later
than the estimated delivery time).
Explanation
To calculate the bad experience rate, follow these steps:
• Identify the customers who signed up in June 2022.
• Filter the orders placed by these customers within their first 14 days.
• Join the orders with the trips table to check for late deliveries (actual delivery time >
estimated delivery time by more than 30 minutes).
• Count the number of "bad experiences" (completed incorrectly, never received, or late
deliveries).
• Calculate the percentage of bad experiences based on the total number of orders within the
first 14 days.
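The steps above can be sketched end to end. This is a minimal sketch that loads this question's sample rows into an in-memory SQLite database through Python's sqlite3 module; SQLite, the function name bad_experience_pct, and the use of conditional aggregation (SUM over a CASE expression) are assumptions made for illustration, and the date functions differ from the PostgreSQL and MySQL syntax used in the solutions.

```python
import sqlite3

def bad_experience_pct():
    """Bad experience rate for June 2022 signups, first 14 days (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, signup_timestamp TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                             trip_id INTEGER, status TEXT, order_timestamp TEXT);
        CREATE TABLE trips (dasher_id INTEGER, trip_id INTEGER PRIMARY KEY,
                            estimated_delivery_timestamp TEXT,
                            actual_delivery_timestamp TEXT);
        INSERT INTO customers VALUES
            (8472, '2022-05-30 00:00:00'), (2341, '2022-06-01 00:00:00'),
            (1314, '2022-06-03 00:00:00'), (1435, '2022-06-05 00:00:00'),
            (5421, '2022-06-07 00:00:00');
        INSERT INTO orders VALUES
            (727424, 8472, 100463, 'completed successfully', '2022-06-05 09:12:00'),
            (242513, 2341, 100482, 'completed incorrectly', '2022-06-05 14:40:00'),
            (141367, 1314, 100362, 'completed incorrectly', '2022-06-07 15:03:00'),
            (582193, 5421, 100657, 'never_received', '2022-07-07 15:22:00'),
            (253613, 1314, 100213, 'completed successfully', '2022-06-12 13:43:00');
        INSERT INTO trips VALUES
            (101, 100463, '2022-06-05 09:42:00', '2022-06-05 09:38:00'),
            (102, 100482, '2022-06-05 15:10:00', '2022-06-05 15:46:00'),
            (101, 100362, '2022-06-07 15:33:00', '2022-06-07 16:45:00'),
            (102, 100657, '2022-07-07 15:52:00', NULL),
            (103, 100213, '2022-06-12 14:13:00', '2022-06-12 14:10:00');
    """)
    query = """
        SELECT ROUND(100.0 * SUM(
                   CASE WHEN o.status IN ('completed incorrectly', 'never_received')
                          OR t.actual_delivery_timestamp >
                             datetime(t.estimated_delivery_timestamp, '+30 minutes')
                        THEN 1 ELSE 0 END) / COUNT(*), 2)
        FROM customers c
        JOIN orders o ON o.customer_id = c.customer_id
        LEFT JOIN trips t ON t.trip_id = o.trip_id
        -- June 2022 signups only, orders within 14 days of signup
        WHERE strftime('%Y-%m', c.signup_timestamp) = '2022-06'
          AND o.order_timestamp BETWEEN c.signup_timestamp
                                    AND datetime(c.signup_timestamp, '+14 days')
    """
    return conn.execute(query).fetchone()[0]

print(bad_experience_pct())  # 2 of the 3 qualifying orders are bad
```

Counting bad orders inside the same pass as the total avoids a second scan of the filtered orders, which is one reason to prefer conditional aggregation here.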
Datasets and SQL Schemas
• - customers Table
CREATE TABLE customers (
customer_id INTEGER PRIMARY KEY,
signup_timestamp TIMESTAMP
);
• - orders Table
CREATE TABLE orders (
order_id INTEGER PRIMARY KEY,
customer_id INTEGER,
trip_id INTEGER,
status VARCHAR(255),
order_timestamp TIMESTAMP
);
• - trips Table
CREATE TABLE trips (
dasher_id INTEGER,
trip_id INTEGER PRIMARY KEY,
estimated_delivery_timestamp TIMESTAMP,
actual_delivery_timestamp TIMESTAMP
);
• - Datasets
INSERT INTO customers (customer_id, signup_timestamp) VALUES
(8472, '2022-05-30 00:00:00'),
(2341, '2022-06-01 00:00:00'),
(1314, '2022-06-03 00:00:00'),
(1435, '2022-06-05 00:00:00'),
(5421, '2022-06-07 00:00:00');
INSERT INTO orders (order_id, customer_id, trip_id, status, order_timestamp) VALUES
(727424, 8472, 100463, 'completed successfully', '2022-06-05 09:12:00'),
(242513, 2341, 100482, 'completed incorrectly', '2022-06-05 14:40:00'),
(141367, 1314, 100362, 'completed incorrectly', '2022-06-07 15:03:00'),
(582193, 5421, 100657, 'never_received', '2022-07-07 15:22:00'),
(253613, 1314, 100213, 'completed successfully', '2022-06-12 13:43:00');
INSERT INTO trips (dasher_id, trip_id, estimated_delivery_timestamp, actual_delivery_timestamp) VALUES
(101, 100463, '2022-06-05 09:42:00', '2022-06-05 09:38:00'),
(102, 100482, '2022-06-05 15:10:00', '2022-06-05 15:46:00'),
(101, 100362, '2022-06-07 15:33:00', '2022-06-07 16:45:00'),
(102, 100657, '2022-07-07 15:52:00', NULL),
(103, 100213, '2022-06-12 14:13:00', '2022-06-12 14:10:00');
Learnings
• Use of EXTRACT function to filter users by signup date.
• Filtering orders based on order timestamps to ensure they are within the first 14 days of
signup.
• Understanding the use of INTERVAL to calculate date ranges.
• Use of JOIN between multiple tables to aggregate data from different sources.
• Applying aggregation and conditional counting with COUNT() and WHERE clauses.
Solutions
• - PostgreSQL solution
WITH june22_cte AS (
SELECT
orders.order_id,
orders.trip_id,
orders.status
FROM customers
INNER JOIN orders
ON customers.customer_id = orders.customer_id
WHERE EXTRACT(MONTH FROM customers.signup_timestamp) = 6
AND EXTRACT(YEAR FROM customers.signup_timestamp) = 2022
AND orders.order_timestamp BETWEEN customers.signup_timestamp
AND customers.signup_timestamp + INTERVAL '14 DAYS'
)
SELECT
ROUND(
100.0 * COUNT(june22.order_id) / (SELECT COUNT(order_id) FROM june22_cte),
2) AS bad_experience_pct
FROM june22_cte AS june22
INNER JOIN trips
ON june22.trip_id = trips.trip_id
WHERE june22.status IN ('completed incorrectly', 'never_received')
OR trips.actual_delivery_timestamp > trips.estimated_delivery_timestamp + INTERVAL
'30 MINUTE';
• - MySQL solution
WITH june22_cte AS (
SELECT
orders.order_id,
orders.trip_id,
orders.status
FROM customers
INNER JOIN orders
ON customers.customer_id = orders.customer_id
WHERE MONTH(customers.signup_timestamp) = 6
AND YEAR(customers.signup_timestamp) = 2022
AND orders.order_timestamp BETWEEN customers.signup_timestamp
AND DATE_ADD(customers.signup_timestamp, INTERVAL 14 DAY)
)
SELECT
ROUND(
100.0 * COUNT(june22.order_id) / (SELECT COUNT(order_id) FROM june22_cte),
2) AS bad_experience_pct
FROM june22_cte AS june22
INNER JOIN trips
ON june22.trip_id = trips.trip_id
WHERE june22.status IN ('completed incorrectly', 'never_received')
OR trips.actual_delivery_timestamp > DATE_ADD(trips.estimated_delivery_timestamp,
INTERVAL 30 MINUTE);
• Q.564
Question
As a Data Analyst at Swiggy, compute the average delivery duration for each driver
on each day, the rank of each driver's daily average delivery duration, and the overall average
delivery duration per driver.
Explanation
The task requires calculating:
• The average delivery duration for each driver on a specific day.
• Ranking drivers by their daily average delivery duration.
• The overall average delivery duration for each driver across all their deliveries.
To achieve this, we need to:
• Use EXTRACT(EPOCH FROM ...) to calculate the delivery duration in minutes for each
delivery.
• Use window functions (RANK() and AVG()) to rank the drivers and compute the overall
average.
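The three calculations can be sketched on a small example. This is a hedged sketch only: the deliveries rows are hypothetical, SQLite (via Python's sqlite3) stands in for PostgreSQL/MySQL, and the rank is interpreted as ranking drivers against each other within each day, which is one possible reading of the question. The daily averages are computed with GROUP BY first, then window functions add the rank and the per-driver overall average across all deliveries.

```python
import sqlite3

def driver_daily_stats():
    """Daily average, rank within day, and overall average per driver (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical rows: delivery_id, driver_id, start, end
        CREATE TABLE deliveries (delivery_id INTEGER PRIMARY KEY, driver_id INTEGER,
                                 delivery_start_time TEXT, delivery_end_time TEXT);
        INSERT INTO deliveries VALUES
            (1, 1, '2024-01-01 10:00:00', '2024-01-01 10:30:00'),
            (2, 1, '2024-01-01 12:00:00', '2024-01-01 12:50:00'),
            (3, 1, '2024-01-02 09:00:00', '2024-01-02 09:20:00'),
            (4, 2, '2024-01-01 11:00:00', '2024-01-01 11:25:00'),
            (5, 2, '2024-01-02 13:00:00', '2024-01-02 13:35:00'),
            (6, 2, '2024-01-02 15:00:00', '2024-01-02 15:45:00');
    """)
    query = """
        WITH per_delivery AS (
            SELECT driver_id,
                   date(delivery_start_time) AS day,
                   -- seconds between end and start, converted to minutes
                   (strftime('%s', delivery_end_time)
                    - strftime('%s', delivery_start_time)) / 60.0 AS minutes
            FROM deliveries
        ),
        daily AS (
            SELECT driver_id, day,
                   AVG(minutes) AS avg_minutes,
                   SUM(minutes) AS total_minutes,
                   COUNT(*) AS n
            FROM per_delivery
            GROUP BY driver_id, day
        )
        SELECT driver_id, day, avg_minutes,
               RANK() OVER (PARTITION BY day ORDER BY avg_minutes) AS day_rank,
               ROUND(SUM(total_minutes) OVER (PARTITION BY driver_id)
                     / SUM(n) OVER (PARTITION BY driver_id), 2) AS overall_avg
        FROM daily
        ORDER BY day, day_rank
    """
    return conn.execute(query).fetchall()

for row in driver_daily_stats():
    print(row)
```

Dividing the summed minutes by the summed delivery counts gives the average over all of a driver's deliveries; averaging the daily averages instead would weight a quiet day the same as a busy one.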
Datasets and SQL Schemas
• - deliveries Table
Learnings
• Use of EXTRACT(EPOCH FROM ...) to calculate delivery durations in minutes.
• Use of RANK() to assign ranks based on daily average delivery duration.
• Use of AVG() window function to calculate the overall average delivery duration for each
driver.
• Use of PARTITION BY to perform calculations within partitions of data, such as per driver
and per day.
Solutions
• - PostgreSQL solution
SELECT
driver_id,
day,
avg_delivery_duration,
RANK() OVER (PARTITION BY driver_id ORDER BY avg_delivery_duration) AS rank,
AVG(avg_delivery_duration) OVER (PARTITION BY driver_id) AS overall_avg_delivery_duration
FROM
(SELECT
driver_id,
DATE(delivery_start_time) AS day,
AVG(EXTRACT(EPOCH FROM (delivery_end_time - delivery_start_time)) / 60)
OVER (PARTITION BY driver_id, DATE(delivery_start_time)) AS avg_delivery_duration
FROM
deliveries) subquery;
• - MySQL solution
SELECT
driver_id,
day,
avg_delivery_duration,
-- RANK is a reserved word in MySQL 8.0+, so the alias is delivery_rank
RANK() OVER (PARTITION BY driver_id ORDER BY avg_delivery_duration) AS delivery_rank,
AVG(avg_delivery_duration) OVER (PARTITION BY driver_id) AS overall_avg_delivery_duration
FROM
(SELECT
driver_id,
DATE(delivery_start_time) AS day,
AVG(TIMESTAMPDIFF(SECOND, delivery_start_time, delivery_end_time) / 60)
OVER (PARTITION BY driver_id, DATE(delivery_start_time)) AS avg_delivery_duration
FROM
deliveries) subquery;
• Q.565
Question
Identify the top 5 restaurants with the most orders in the last month from a database that
contains restaurants, users, and orders tables.
Explanation
To solve this, we need to:
• Join the orders table with the restaurants table on the restaurant_id.
• Filter the orders to only include those within the last month using NOW() - INTERVAL '1
month'.
• Group the results by the restaurant name to count the number of orders for each restaurant.
• Sort the results by the order count in descending order to identify the top 5 restaurants.
• Use LIMIT 5 to restrict the results to the top 5 restaurants.
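The steps above can be sketched as follows. This is a minimal sketch under stated assumptions: it uses SQLite through Python's sqlite3 rather than PostgreSQL or MySQL, it takes a fixed reference date (the hypothetical parameter today) instead of NOW() so the result is reproducible, and it breaks ties by restaurant name, which the question does not specify. The rows are this question's sample data.

```python
import sqlite3

def top_restaurants(today="2022-11-05"):
    """Top 5 restaurants by order count in the month before `today` (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE restaurants (restaurant_id TEXT PRIMARY KEY, restaurant_name TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER,
                             restaurant_id TEXT, order_date TEXT);
        INSERT INTO restaurants VALUES
            ('001', 'Burger King'), ('002', 'KFC'), ('003', 'McDonald''s'),
            ('004', 'Pizza Hut'), ('005', 'Starbucks');
        INSERT INTO orders VALUES
            (2001, 101, '001', '2022-10-01'), (2002, 102, '002', '2022-10-02'),
            (2003, 101, '003', '2022-10-03'), (2004, 103, '002', '2022-10-04'),
            (2005, 102, '001', '2022-10-05'), (2006, 104, '004', '2022-10-06'),
            (2007, 105, '005', '2022-10-07'), (2008, 101, '001', '2022-10-08'),
            (2009, 102, '002', '2022-10-09'), (2010, 104, '005', '2022-10-10');
    """)
    query = """
        SELECT r.restaurant_name, COUNT(o.order_id) AS order_count
        FROM orders o
        JOIN restaurants r ON r.restaurant_id = o.restaurant_id
        -- keep only orders from the month ending at the reference date
        WHERE o.order_date >= date(:today, '-1 month')
          AND o.order_date <= :today
        GROUP BY r.restaurant_name
        ORDER BY order_count DESC, r.restaurant_name
        LIMIT 5
    """
    return conn.execute(query, {"today": today}).fetchall()

print(top_restaurants())
```

With the default reference date only orders from 2022-10-05 onward qualify, so the restaurant whose single order falls on 2022-10-03 drops out of the result.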
Datasets and SQL Schemas
• - restaurants Table
CREATE TABLE restaurants (
restaurant_id VARCHAR(10) PRIMARY KEY,
restaurant_name VARCHAR(100)
);
• - users Table
CREATE TABLE users (
user_id INT PRIMARY KEY,
user_name VARCHAR(100)
);
• - orders Table
CREATE TABLE orders (
order_id INT PRIMARY KEY,
user_id INT,
restaurant_id VARCHAR(10),
order_date TIMESTAMP
);
• - Datasets
INSERT INTO restaurants (restaurant_id, restaurant_name) VALUES
('001', 'Burger King'),
('002', 'KFC'),
('003', 'McDonald''s'),
('004', 'Pizza Hut'),
('005', 'Starbucks');
INSERT INTO users (user_id, user_name) VALUES
(101, 'John Doe'),
(102, 'Jane Smith'),
(103, 'Bob Johnson'),
(104, 'Alice Anderson'),
(105, 'Emma Wilson');
INSERT INTO orders (order_id, user_id, restaurant_id, order_date) VALUES
(2001, 101, '001', '2022-10-01'),
(2002, 102, '002', '2022-10-02'),
(2003, 101, '003', '2022-10-03'),
(2004, 103, '002', '2022-10-04'),
(2005, 102, '001', '2022-10-05'),
(2006, 104, '004', '2022-10-06'),
(2007, 105, '005', '2022-10-07'),
(2008, 101, '001', '2022-10-08'),
(2009, 102, '002', '2022-10-09'),
(2010, 104, '005', '2022-10-10');
Learnings
• Use of JOIN to combine data from multiple tables based on a common key.
• Filtering data by date range with NOW() and INTERVAL.
• Grouping and aggregation using COUNT() to calculate the number of orders per restaurant.
• Sorting results with ORDER BY and limiting the output with LIMIT.
Solutions
• - PostgreSQL solution
Learnings
• You can use ROW_NUMBER() or MOD() functions to handle alternating rows and determine
which rows to swap.
• Use CASE or conditional logic to handle the odd number of rows.
• Sorting by id ensures the final result is in ascending order.
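The CASE idea from the learnings can be sketched without any subquery per row. This is an illustrative sketch only: the Seat rows are hypothetical, ids are assumed to be consecutive starting at 1, and SQLite (via Python's sqlite3) is used in place of PostgreSQL/MySQL. Instead of fetching the neighbour's name, it computes the new id for each student: odd ids move up by one, even ids move down by one, and the last odd id stays where it is.

```python
import sqlite3

def swap_adjacent_seats():
    """Swap every pair of adjacent seats; the last seat of an odd count is unchanged."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical rows with consecutive ids starting at 1
        CREATE TABLE Seat (id INTEGER PRIMARY KEY, student TEXT);
        INSERT INTO Seat VALUES
            (1, 'Asha'), (2, 'Bilal'), (3, 'Chen'), (4, 'Divya'), (5, 'Emeka');
    """)
    query = """
        SELECT CASE
                 -- last row of an odd-sized table keeps its seat
                 WHEN id % 2 = 1 AND id = (SELECT MAX(id) FROM Seat) THEN id
                 WHEN id % 2 = 1 THEN id + 1
                 ELSE id - 1
               END AS new_id,
               student
        FROM Seat
        ORDER BY new_id
    """
    return conn.execute(query).fetchall()

print(swap_adjacent_seats())
```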
Solutions
• - PostgreSQL/MySQL solution
WITH swapped_seats AS (
SELECT
id,
student,
ROW_NUMBER() OVER (ORDER BY id) AS row_num
FROM Seat
)
SELECT
CASE
-- Odd row that has a following row: take the next student.
WHEN s.row_num % 2 = 1 AND s.row_num < (SELECT COUNT(*) FROM Seat) THEN
(SELECT n.student FROM swapped_seats n WHERE n.row_num = s.row_num + 1)
-- Even row: take the previous student.
WHEN s.row_num % 2 = 0 THEN
(SELECT p.student FROM swapped_seats p WHERE p.row_num = s.row_num - 1)
-- Last row of an odd-sized table stays in place.
ELSE s.student
END AS student,
s.id
FROM swapped_seats s
ORDER BY s.id;
(1, 'Inception'),
(2, 'Titanic'),
(3, 'Avatar'),
(4, 'The Dark Knight');
Learnings
• Use aggregation (COUNT(), AVG()) to summarize data.
• Handle ties by sorting the result lexicographically (ORDER BY with LIMIT).
• Filtering for a specific date range (WHERE clause with date conditions).
Solutions
1. Find the user who has rated the greatest number of movies
SELECT u.name
FROM Users u
JOIN MovieRating mr ON u.user_id = mr.user_id
GROUP BY u.name
ORDER BY COUNT(mr.movie_id) DESC, u.name ASC
LIMIT 1;
Explanation:
• JOIN the Users table with the MovieRating table on user_id.
• GROUP BY the user's name to count the number of movies rated by each user.
• ORDER BY first by the count of ratings in descending order, then by the name
lexicographically in ascending order to break ties.
• Q.568
Question
Find the movie name with the highest average rating in February 2020. In case of a tie, return
the lexicographically smaller movie name.
Explanation
To solve this problem:
• We need to find the movie with the highest average rating in February 2020. If multiple
movies have the same average rating, we return the movie with the lexicographically smallest
name.
Datasets and SQL Schemas
• - Movies Table
CREATE TABLE Movies (
movie_id INT PRIMARY KEY,
title VARCHAR(100)
);
• - Users Table
CREATE TABLE Users (
user_id INT PRIMARY KEY,
name VARCHAR(100)
);
• - MovieRating Table
CREATE TABLE MovieRating (
movie_id INT,
user_id INT,
rating INT,
created_at DATE,
PRIMARY KEY (movie_id, user_id)
);
• - Example Data
-- Insert into Movies table
INSERT INTO Movies (movie_id, title) VALUES
(1, 'Inception'),
(2, 'Titanic'),
(3, 'Avatar'),
(4, 'The Dark Knight');
Learnings
• Use aggregation (COUNT(), AVG()) to summarize data.
• Handle ties by sorting the result lexicographically (ORDER BY with LIMIT).
• Filtering for a specific date range (WHERE clause with date conditions).
Solutions
Find the movie with the highest average rating in February 2020
SELECT m.title
FROM Movies m
JOIN MovieRating mr ON m.movie_id = mr.movie_id
WHERE mr.created_at BETWEEN '2020-02-01' AND '2020-02-29'
GROUP BY m.title
ORDER BY AVG(mr.rating) DESC, m.title ASC
LIMIT 1;
Explanation:
• JOIN the Movies table with the MovieRating table on movie_id.
• WHERE filter the ratings to only include those created in February 2020 (from 2020-02-01
to 2020-02-29).
• GROUP BY movie title to calculate the average rating for each movie.
• ORDER BY first by the average rating in descending order, then by the movie title
lexicographically in ascending order to break ties.
• LIMIT 1 to return the movie with the highest average rating (and lexicographically
smallest title in case of a tie).
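The tie-breaking behaviour described above can be checked on a small example. This sketch keeps the Movies rows from this question but the MovieRating rows are hypothetical, chosen so that two movies tie on their February average. SQLite (via Python's sqlite3) is an assumption, and a half-open date range is used instead of BETWEEN so the filter does not depend on the length of February.

```python
import sqlite3

def best_movie_feb_2020():
    """Highest average rating in February 2020, ties broken by title (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Movies (movie_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE MovieRating (movie_id INTEGER, user_id INTEGER, rating INTEGER,
                                  created_at TEXT, PRIMARY KEY (movie_id, user_id));
        INSERT INTO Movies VALUES
            (1, 'Inception'), (2, 'Titanic'), (3, 'Avatar'), (4, 'The Dark Knight');
        -- Hypothetical ratings: Inception and Avatar both average 4.5 in February
        INSERT INTO MovieRating VALUES
            (1, 1, 5, '2020-02-10'), (1, 2, 4, '2020-02-12'),
            (2, 1, 4, '2020-02-05'),
            (3, 1, 5, '2020-02-17'), (3, 2, 4, '2020-02-20'),
            (4, 1, 5, '2020-03-01');
    """)
    query = """
        SELECT m.title
        FROM Movies m
        JOIN MovieRating mr ON mr.movie_id = m.movie_id
        WHERE mr.created_at >= '2020-02-01' AND mr.created_at < '2020-03-01'
        GROUP BY m.movie_id, m.title
        ORDER BY AVG(mr.rating) DESC, m.title ASC
        LIMIT 1
    """
    return conn.execute(query).fetchone()[0]

print(best_movie_feb_2020())  # Avatar wins the tie with Inception alphabetically
```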
• Q.569
Question
Find the top 3 delivery partners who have made the highest number of deliveries in the last
30 days (consider today as ‘5th Feb 2024’). In case of a tie, return the delivery partners in
lexicographical order.
Explanation
• You need to count the number of deliveries made by each delivery partner in the last 30
days.
• Sort the result by the delivery count in descending order. In case of ties, order the delivery
partners lexicographically.
• Return only the top 3 delivery partners.
• Today’s date is given as ‘5th February 2024’ to ensure consistent results.
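The steps above can be sketched against this question's sample rows. The sketch assumes SQLite through Python's sqlite3 (so date(..., '-30 days') replaces the MySQL INTERVAL syntax) and a hypothetical today parameter; it also bounds the range at today, so a delivery dated after the reference date is not counted.

```python
import sqlite3

def top_delivery_partners(today="2024-02-05"):
    """Top 3 partners by deliveries in the 30 days up to `today` (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Deliveries (delivery_id INTEGER PRIMARY KEY,
                                 delivery_partner TEXT, delivery_date TEXT);
        INSERT INTO Deliveries VALUES
            (1, 'Partner A', '2024-01-01'), (2, 'Partner B', '2024-01-02'),
            (3, 'Partner A', '2024-01-05'), (4, 'Partner C', '2024-01-10'),
            (5, 'Partner B', '2024-01-12'), (6, 'Partner A', '2024-01-15'),
            (7, 'Partner C', '2024-01-20'), (8, 'Partner B', '2024-01-25'),
            (9, 'Partner D', '2024-10-29'), (10, 'Partner A', '2024-02-01');
    """)
    query = """
        SELECT delivery_partner, COUNT(*) AS delivery_count
        FROM Deliveries
        -- window is [today - 30 days, today]
        WHERE delivery_date BETWEEN date(:today, '-30 days') AND :today
        GROUP BY delivery_partner
        ORDER BY delivery_count DESC, delivery_partner ASC
        LIMIT 3
    """
    return conn.execute(query, {"today": today}).fetchall()

print(top_delivery_partners())
```

On the sample rows every qualifying partner has two deliveries in the window, so the alphabetical tie-breaker alone decides the order.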
Datasets and SQL Schemas
• Deliveries Table
CREATE TABLE Deliveries (
delivery_id INT PRIMARY KEY,
delivery_partner VARCHAR(100),
delivery_date DATE
);
• Datasets
INSERT INTO Deliveries (delivery_id, delivery_partner, delivery_date) VALUES
(1, 'Partner A', '2024-01-01'),
(2, 'Partner B', '2024-01-02'),
(3, 'Partner A', '2024-01-05'),
(4, 'Partner C', '2024-01-10'),
(5, 'Partner B', '2024-01-12'),
(6, 'Partner A', '2024-01-15'),
(7, 'Partner C', '2024-01-20'),
(8, 'Partner B', '2024-01-25'),
(9, 'Partner D', '2024-10-29'),
(10, 'Partner A','2024-02-01');
Learnings
• Using COUNT() for aggregation.
• Sorting by both numerical and alphabetical order.
• Handling time-based filtering with DATE and INTERVAL.
• Using an explicit date for filtering the last 30 days.
Solution
SELECT delivery_partner, COUNT(delivery_id) AS delivery_count
FROM Deliveries
WHERE delivery_date >= '2024-02-05' - INTERVAL 30 DAY
AND delivery_date <= '2024-02-05'
GROUP BY delivery_partner
ORDER BY delivery_count DESC, delivery_partner ASC
LIMIT 3;
Explanation:
• Explicit Date Handling:
• CURDATE() would make the result depend on the day the query is run. Since the question
specifically asks to consider ‘5th February 2024’ as today's date, the query uses the explicit
date '2024-02-05'.
• INTERVAL 30 DAY:
• We subtract 30 days from '2024-02-05' to get the date range for the last 30 days (i.e.,
from 2024-01-06 to 2024-02-05).
• Q.570
Question
List all the restaurants that have received more than 10 orders in the month of December
2023.
Explanation
• You need to count how many orders each restaurant received in December 2023.
• Return only those restaurants that have more than 10 orders.
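The two steps can be sketched on this question's sample rows. The sketch assumes SQLite through Python's sqlite3 and adds a hypothetical min_orders parameter so the HAVING threshold can be varied; with the sample data no restaurant exceeds 10 December orders, so a lower threshold is also shown to make the filter visible.

```python
import sqlite3

def busy_restaurants(min_orders=10):
    """Restaurants with more than `min_orders` orders in December 2023 (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, restaurant_id INTEGER,
                             order_date TEXT);
        INSERT INTO Orders VALUES
            (1, 101, '2023-12-01'), (2, 101, '2023-12-03'), (3, 102, '2023-12-04'),
            (4, 102, '2023-12-05'), (5, 101, '2023-12-06'), (6, 103, '2023-12-07'),
            (7, 101, '2023-12-08'), (8, 102, '2023-12-09'), (9, 101, '2023-12-10'),
            (10, 104, '2023-12-11'), (11, 101, '2023-12-12'), (12, 101, '2023-12-13'),
            (13, 102, '2023-12-14'), (14, 103, '2023-12-15'), (15, 101, '2023-12-16'),
            (16, 104, '2023-12-17'), (17, 101, '2023-12-18');
    """)
    query = """
        SELECT restaurant_id, COUNT(*) AS order_count
        FROM Orders
        WHERE order_date >= '2023-12-01' AND order_date < '2024-01-01'
        GROUP BY restaurant_id
        -- HAVING filters groups after aggregation; WHERE cannot see COUNT(*)
        HAVING COUNT(*) > ?
        ORDER BY restaurant_id
    """
    return conn.execute(query, (min_orders,)).fetchall()

print(busy_restaurants())    # threshold from the question
print(busy_restaurants(5))   # lower, hypothetical threshold
```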
Datasets and SQL Schemas
• - Orders Table
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
restaurant_id INT,
order_date DATE
);
• - Datasets
INSERT INTO Orders (order_id, restaurant_id, order_date) VALUES
(1, 101, '2023-12-01'),
(2, 101, '2023-12-03'),
(3, 102, '2023-12-04'),
(4, 102, '2023-12-05'),
(5, 101, '2023-12-06'),
(6, 103, '2023-12-07'),
(7, 101, '2023-12-08'),
(8, 102, '2023-12-09'),
(9, 101, '2023-12-10'),
(10, 104, '2023-12-11'),
(11, 101, '2023-12-12'),
(12, 101, '2023-12-13'),
(13, 102, '2023-12-14'),
(14, 103, '2023-12-15'),
(15, 101, '2023-12-16'),
(16, 104, '2023-12-17'),
(17, 101, '2023-12-18');
Learnings
• Counting rows with a GROUP BY.
• Filtering based on date conditions.
• Using HAVING for conditions after GROUP BY.
Solution
SELECT restaurant_id
FROM Orders
WHERE order_date BETWEEN '2023-12-01' AND '2023-12-31'
GROUP BY restaurant_id
HAVING COUNT(order_id) > 10;
• Q.571
Question
Calculate the average order value (price per order) for each restaurant for the month of
December 2023. In case a restaurant did not receive any orders in December, show the
restaurant's name as well, but with a NULL for the average order value.
Explanation
• You need to join the Orders and Restaurants tables to get the average order value for each
restaurant.
• If the restaurant did not receive any orders in December, return NULL for the average order
value.
• Handle restaurants with zero orders in December as well.
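The NULL case is easier to see with a restaurant that has no orders at all. The sketch below loads this question's sample rows and adds one hypothetical restaurant (105, 'Subway') that never appears in Orders; SQLite through Python's sqlite3 and the ROUND to two decimals are assumptions for illustration. The date filter sits in the ON clause, so the LEFT JOIN keeps the restaurant and AVG() over its zero matching rows yields NULL (None in Python).

```python
import sqlite3

def avg_order_values():
    """Average December 2023 order value per restaurant, NULL if none (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, restaurant_id INTEGER,
                             order_date TEXT, price REAL);
        CREATE TABLE Restaurants (restaurant_id INTEGER PRIMARY KEY, restaurant_name TEXT);
        INSERT INTO Orders VALUES
            (1, 101, '2023-12-01', 250.00), (2, 101, '2023-12-03', 300.00),
            (3, 102, '2023-12-04', 450.00), (4, 101, '2023-12-06', 275.00),
            (5, 103, '2023-12-07', 150.00), (6, 101, '2023-12-08', 400.00),
            (7, 104, '2023-12-09', 350.00), (8, 101, '2023-12-10', 500.00),
            (9, 102, '2023-12-15', 400.00), (10, 101, '2023-12-16', 350.00),
            (11, 101, '2023-12-18', 450.00);
        INSERT INTO Restaurants VALUES
            (101, 'Burger King'), (102, 'McDonalds'), (103, 'Pizza Hut'),
            (104, 'Starbucks'),
            (105, 'Subway');  -- hypothetical restaurant with no orders
    """)
    query = """
        SELECT r.restaurant_name, ROUND(AVG(o.price), 2) AS avg_order_value
        FROM Restaurants r
        LEFT JOIN Orders o
               ON o.restaurant_id = r.restaurant_id
              AND o.order_date BETWEEN '2023-12-01' AND '2023-12-31'
        GROUP BY r.restaurant_id, r.restaurant_name
        ORDER BY r.restaurant_name
    """
    return conn.execute(query).fetchall()

for row in avg_order_values():
    print(row)
```

Moving the date condition into a WHERE clause would turn the LEFT JOIN into an effective inner join and drop the restaurant without orders.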
Datasets and SQL Schemas
• - Orders Table
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
restaurant_id INT,
order_date DATE,
price DECIMAL(10,2)
);
• - Restaurants Table
CREATE TABLE Restaurants (
restaurant_id INT PRIMARY KEY,
restaurant_name VARCHAR(100)
);
• - Datasets
INSERT INTO Orders (order_id, restaurant_id, order_date, price) VALUES
(1, 101, '2023-12-01', 250.00),
(2, 101, '2023-12-03', 300.00),
(3, 102, '2023-12-04', 450.00),
(4, 101, '2023-12-06', 275.00),
(5, 103, '2023-12-07', 150.00),
(6, 101, '2023-12-08', 400.00),
(7, 104, '2023-12-09', 350.00),
(8, 101, '2023-12-10', 500.00),
(9, 102, '2023-12-15', 400.00),
(10, 101, '2023-12-16', 350.00),
(11, 101, '2023-12-18', 450.00);
• - Restaurants Table
INSERT INTO Restaurants (restaurant_id, restaurant_name) VALUES
(101, 'Burger King'),
(102, 'McDonalds'),
(103, 'Pizza Hut'),
(104, 'Starbucks');
Learnings
• Handling missing data (using LEFT JOIN to include all restaurants).
• Grouping and calculating averages with AVG().
• Filtering in the ON clause of the LEFT JOIN so that AVG() returns NULL for restaurants
without December orders.
Solution
SELECT r.restaurant_name,
AVG(o.price) AS avg_order_value
FROM Restaurants r
LEFT JOIN Orders o
ON r.restaurant_id = o.restaurant_id
AND o.order_date BETWEEN '2023-12-01' AND '2023-12-31'
GROUP BY r.restaurant_name;
• Q.572
Question
Find the number of orders placed by each customer, and categorize them into three groups:
• 'Frequent' (more than 10 orders),
• 'Regular' (between 5 and 10 orders),
Learnings
• Using COUNT() to aggregate data.
• Using CASE to categorize data based on conditions.
• Using GROUP BY for grouping by customer.
Solution
SELECT customer_id,
COUNT(order_id) AS total_orders,
CASE
WHEN COUNT(order_id) > 10 THEN 'Frequent'
WHEN COUNT(order_id) BETWEEN 5 AND 10 THEN 'Regular'
ELSE 'Occasional'
END AS order_category
FROM Orders
GROUP BY customer_id;
• You need to calculate the average delivery time for each delivery partner using AVG() and
the TIMESTAMPDIFF function.
• Use RANK() as a window function to rank delivery partners based on their average delivery
time.
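The two steps can be sketched on this question's sample rows. SQLite has no TIMESTAMPDIFF, so this sketch (SQLite through Python's sqlite3, an assumption) derives minutes from epoch seconds with strftime('%s', ...); the averages are computed in a CTE and ranked afterwards, and the alias delivery_rank is a name chosen for illustration.

```python
import sqlite3

def rank_delivery_partners():
    """Average delivery minutes per partner with a rank, fastest first (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Deliveries (delivery_id INTEGER PRIMARY KEY, delivery_partner TEXT,
                                 delivery_start_time TEXT, delivery_end_time TEXT);
        INSERT INTO Deliveries VALUES
            (1, 'Partner A', '2024-01-01 10:00:00', '2024-01-01 10:45:00'),
            (2, 'Partner B', '2024-01-01 10:30:00', '2024-01-01 11:00:00'),
            (3, 'Partner A', '2024-01-02 11:00:00', '2024-01-02 11:40:00'),
            (4, 'Partner B', '2024-01-02 14:00:00', '2024-01-02 14:30:00'),
            (5, 'Partner C', '2024-01-03 08:30:00', '2024-01-03 09:00:00'),
            (6, 'Partner C', '2024-01-03 10:00:00', '2024-01-03 10:30:00');
    """)
    query = """
        WITH partner_avg AS (
            SELECT delivery_partner,
                   AVG((strftime('%s', delivery_end_time)
                        - strftime('%s', delivery_start_time)) / 60.0) AS avg_minutes
            FROM Deliveries
            GROUP BY delivery_partner
        )
        SELECT delivery_partner, avg_minutes,
               RANK() OVER (ORDER BY avg_minutes) AS delivery_rank
        FROM partner_avg
        ORDER BY delivery_rank, delivery_partner
    """
    return conn.execute(query).fetchall()

for row in rank_delivery_partners():
    print(row)
```

Because RANK() leaves gaps after ties, two partners sharing rank 1 are followed by rank 3; DENSE_RANK() would give 2 instead.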
Datasets and SQL Schemas
• Deliveries Table
CREATE TABLE Deliveries (
delivery_id INT PRIMARY KEY,
delivery_partner VARCHAR(100),
delivery_start_time TIMESTAMP,
delivery_end_time TIMESTAMP
);
• Datasets
INSERT INTO Deliveries (delivery_id, delivery_partner, delivery_start_time, delivery_end_time) VALUES
(1, 'Partner A', '2024-01-01 10:00:00', '2024-01-01 10:45:00'),
(2, 'Partner B', '2024-01-01 10:30:00', '2024-01-01 11:00:00'),
(3, 'Partner A', '2024-01-02 11:00:00', '2024-01-02 11:40:00'),
(4, 'Partner B', '2024-01-02 14:00:00', '2024-01-02 14:30:00'),
(5, 'Partner C', '2024-01-03 08:30:00', '2024-01-03 09:00:00'),
(6, 'Partner C', '2024-01-03 10:00:00', '2024-01-03 10:30:00');
Learnings
• Using AVG() to calculate the average delivery time.
• Using TIMESTAMPDIFF to calculate the delivery duration in minutes.
• Using RANK() to rank the delivery partners based on their performance.
Solution
SELECT delivery_partner,
AVG(TIMESTAMPDIFF(MINUTE, delivery_start_time, delivery_end_time)) AS avg_delivery_time,
-- RANK is a reserved word in MySQL 8.0+, so the alias is delivery_rank
RANK() OVER (ORDER BY AVG(TIMESTAMPDIFF(MINUTE, delivery_start_time, delivery_end_time))) AS delivery_rank
FROM Deliveries
GROUP BY delivery_partner;
• Q.574
Question
Identify the top 5 customers with the highest total spending in the last 30 days, and calculate
their rank based on the total amount spent. Additionally, show the percentage of their total
spending relative to the entire spending of all customers in the last 30 days.
Explanation
• You need to calculate the total spending of each customer in the last 30 days.
• Rank customers based on the total amount spent.
• Calculate the percentage of each customer’s spending relative to the total spending of all
customers in the last 30 days.
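The three calculations can be sketched together. The Orders rows here are hypothetical, SQLite (through Python's sqlite3) stands in for MySQL, and a fixed today parameter replaces CURDATE() so the output is reproducible. The share of total spending uses SUM(...) OVER (), a window over all customers, which is evaluated before LIMIT and therefore still reflects every customer even when only five rows are returned.

```python
import sqlite3

def top_spenders(today="2024-02-05"):
    """Top 5 customers by 30-day spending, with rank and share of total (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical rows: order_id, customer_id, order_date, total_amount
        CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                             order_date TEXT, total_amount REAL);
        INSERT INTO Orders VALUES
            (1, 1, '2024-01-10', 100.0), (2, 1, '2024-01-20', 200.0),
            (3, 2, '2024-01-15', 500.0), (4, 3, '2024-02-01', 200.0),
            (5, 3, '2023-12-01', 999.0);  -- outside the 30-day window
    """)
    query = """
        WITH spending AS (
            SELECT customer_id, SUM(total_amount) AS customer_spending
            FROM Orders
            WHERE order_date BETWEEN date(:today, '-30 days') AND :today
            GROUP BY customer_id
        )
        SELECT customer_id,
               customer_spending,
               RANK() OVER (ORDER BY customer_spending DESC) AS spending_rank,
               ROUND(100.0 * customer_spending
                     / SUM(customer_spending) OVER (), 2) AS spending_pct
        FROM spending
        ORDER BY spending_rank, customer_id
        LIMIT 5
    """
    return conn.execute(query, {"today": today}).fetchall()

for row in top_spenders():
    print(row)
```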
Datasets and SQL Schemas
• Orders Table
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10, 2)
);
• Datasets
Learnings
• Using SUM() to calculate the total spending per customer.
• Using RANK() to rank the customers based on total spending.
• Calculating the percentage of each customer’s spending relative to the total spending of all
customers.
Solution
WITH TotalSpending AS (
SELECT customer_id,
SUM(total_amount) AS customer_spending
FROM Orders
WHERE order_date >= CURDATE() - INTERVAL 30 DAY
GROUP BY customer_id
),
TotalSpendingAll AS (
SELECT SUM(total_amount) AS total_spending
FROM Orders
WHERE order_date >= CURDATE() - INTERVAL 30 DAY
)
SELECT t.customer_id,
t.customer_spending,
-- RANK is a reserved word in MySQL 8.0+, so the alias is spending_rank
RANK() OVER (ORDER BY t.customer_spending DESC) AS spending_rank,
ROUND(100.0 * t.customer_spending / tsa.total_spending, 2) AS spending_percentage
FROM TotalSpending t, TotalSpendingAll tsa
ORDER BY t.customer_spending DESC
LIMIT 5;
• Q.575
Question
Compute the moving average of how much the customer paid in a seven-day window, where
the moving average is calculated for the current day and the 6 days before. The result should
be rounded to two decimal places and ordered by visited_on.
Explanation
You need to calculate a rolling average for each day over the past 7 days (current day + 6
previous days) for the amount column. This involves using window functions to compute the
moving average for each day and ensuring the result is ordered by visited_on.
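The rolling window can be sketched on a small example. The Customer rows are hypothetical and SQLite (through Python's sqlite3) is an assumption. Daily totals are built first, then AVG() runs over a frame of the current row and the 6 preceding rows; this row-based frame equals a seven-day window only when every day has exactly one row in the daily totals, with no missing dates.

```python
import sqlite3

def seven_day_moving_average():
    """Daily totals with a 7-row trailing average, ordered by date (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical rows: customer_id, visited_on, amount (two visits on day 1)
        CREATE TABLE Customer (customer_id INTEGER, visited_on TEXT, amount INTEGER);
        INSERT INTO Customer VALUES
            (1, '2024-01-01', 5),  (2, '2024-01-01', 5),
            (3, '2024-01-02', 20), (4, '2024-01-03', 30),
            (5, '2024-01-04', 40), (6, '2024-01-05', 50),
            (7, '2024-01-06', 60), (8, '2024-01-07', 70),
            (9, '2024-01-08', 80);
    """)
    query = """
        WITH daily AS (
            SELECT visited_on, SUM(amount) AS amount
            FROM Customer
            GROUP BY visited_on
        )
        SELECT visited_on,
               amount,
               ROUND(AVG(amount) OVER (
                         ORDER BY visited_on
                         ROWS BETWEEN 6 PRECEDING AND CURRENT ROW), 2) AS average_amount
        FROM daily
        ORDER BY visited_on
    """
    return conn.execute(query).fetchall()

for row in seven_day_moving_average():
    print(row)
```

For the first six days the frame holds fewer than seven rows, so those averages cover only the days seen so far.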
Learnings
• Using window functions like AVG() with OVER clause for moving averages.
• Understanding date ranges and rolling windows for aggregate calculations.
• How to use ORDER BY and date filters in SQL.
Solutions
PostgreSQL solution
SELECT
visited_on,
SUM(amount) AS amount,
-- Average the daily totals; a bare AVG(amount) is not valid alongside GROUP BY visited_on
ROUND(AVG(SUM(amount)) OVER (ORDER BY visited_on ROWS BETWEEN 6 PRECEDING AND CURRENT ROW), 2) AS average_amount
FROM
Customer
GROUP BY
visited_on
ORDER BY
visited_on;
MySQL solution
SELECT
visited_on,
SUM(amount) AS amount,
-- Average the daily totals; a bare AVG(amount) is rejected under ONLY_FULL_GROUP_BY
ROUND(AVG(SUM(amount)) OVER (ORDER BY visited_on ROWS BETWEEN 6 PRECEDING AND CURRENT ROW), 2) AS average_amount
FROM
Customer
GROUP BY
visited_on
ORDER BY
visited_on;
• Q.576
Question
Calculate the total amount of food wasted due to order cancellations for each day. The food
waste is equal to the total value of the canceled orders. Return the result ordered by
order_date in ascending order.
Explanation
You need to compute the total amount of food waste per day based on the canceled orders.
This involves summing up the amount of canceled orders for each order_date. Ensure the
result is ordered by the order_date in ascending order.
-- Table creation
CREATE TABLE Orders (
order_id INT,
customer_id INT,
order_date DATE,
amount INT,
status VARCHAR(20),
PRIMARY KEY (order_id)
);
Learnings
• Using SUM() with a WHERE clause to total only the orders with a specific status.
• Understanding how to aggregate data for each date.
• Ensuring data is properly filtered and ordered.
Solutions
PostgreSQL solution
SELECT
order_date,
SUM(amount) AS total_food_waste
FROM
Orders
WHERE
status = 'canceled'
GROUP BY
order_date
ORDER BY
order_date;
MySQL solution
SELECT
order_date,
SUM(amount) AS total_food_waste
FROM
Orders
WHERE
status = 'canceled'
GROUP BY
order_date
ORDER BY
order_date;
• Q.577
Customer Retention Analysis
Question
Calculate the customer retention rate for each month. The retention rate is defined as the
percentage of customers who made at least one order in a given month and also made an
order in the previous month.
Explanation
To calculate customer retention:
• Identify customers who placed orders in a given month.
• Identify customers who placed orders in the previous month.
• Calculate the percentage of customers who ordered in both the given month and the
previous month.
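The steps above can be sketched on a small example. The orders rows are hypothetical and SQLite (through Python's sqlite3) is an assumption. Each month is turned into a running index (year * 12 + month) so that "previous month" is a simple index - 1, which also covers the December to January boundary; the month_idx column is a name introduced for this sketch.

```python
import sqlite3

def monthly_retention():
    """Share of each month's customers who also ordered in the previous month."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical rows: order_id, customer_id, order_date
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id TEXT,
                             order_date TEXT);
        INSERT INTO orders VALUES
            (1, 'B', '2023-12-15'),
            (2, 'A', '2024-01-05'), (3, 'A', '2024-01-20'), (4, 'B', '2024-01-09'),
            (5, 'A', '2024-02-02'), (6, 'B', '2024-02-11'), (7, 'C', '2024-02-25');
    """)
    query = """
        WITH monthly AS (
            SELECT DISTINCT
                   customer_id,
                   strftime('%Y-%m', order_date) AS ym,
                   CAST(strftime('%Y', order_date) AS INTEGER) * 12
                     + CAST(strftime('%m', order_date) AS INTEGER) AS month_idx
            FROM orders
        )
        SELECT a.ym,
               COUNT(DISTINCT a.customer_id) AS customers_ordered_in_month,
               COUNT(DISTINCT b.customer_id) AS customers_retained,
               ROUND(100.0 * COUNT(DISTINCT b.customer_id)
                     / COUNT(DISTINCT a.customer_id), 2) AS retention_rate
        FROM monthly a
        LEFT JOIN monthly b
               ON b.customer_id = a.customer_id
              AND b.month_idx = a.month_idx - 1   -- b is the previous month
        GROUP BY a.ym
        ORDER BY a.ym
    """
    return conn.execute(query).fetchall()

for row in monthly_retention():
    print(row)
```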
Datasets and SQL Schemas
-- Table creation
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id VARCHAR(20),
order_date DATE,
total_amount DECIMAL(10, 2)
);
Learnings
• Using EXTRACT(MONTH FROM date) to group by month.
• Using JOIN to find customers who ordered in two consecutive months.
• Calculating percentages and aggregating data.
Solutions
PostgreSQL solution
WITH Retention AS (
SELECT
EXTRACT(MONTH FROM order_date) AS order_month,
EXTRACT(YEAR FROM order_date) AS order_year,
customer_id
FROM orders
GROUP BY customer_id, EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
)
SELECT
a.order_year,
a.order_month,
COUNT(DISTINCT a.customer_id) AS customers_ordered_in_month,
COUNT(DISTINCT b.customer_id) AS customers_retained,
ROUND((COUNT(DISTINCT b.customer_id)::DECIMAL / COUNT(DISTINCT a.customer_id)) * 100, 2) AS retention_rate
FROM
Retention a
LEFT JOIN
Retention b ON a.customer_id = b.customer_id
-- b is the month immediately before a, including December -> January
AND a.order_year * 12 + a.order_month = b.order_year * 12 + b.order_month + 1
GROUP BY
a.order_year, a.order_month
ORDER BY
a.order_year, a.order_month;
MySQL solution
WITH Retention AS (
SELECT
YEAR(order_date) AS order_year,
MONTH(order_date) AS order_month,
customer_id
FROM orders
GROUP BY customer_id, YEAR(order_date), MONTH(order_date)
)
SELECT
a.order_year,
a.order_month,
COUNT(DISTINCT a.customer_id) AS customers_ordered_in_month,
COUNT(DISTINCT b.customer_id) AS customers_retained,
ROUND((COUNT(DISTINCT b.customer_id) / COUNT(DISTINCT a.customer_id)) * 100, 2) AS retention_rate
FROM
Retention a
LEFT JOIN
Retention b ON a.customer_id = b.customer_id
-- b is the month immediately before a, including December -> January
AND a.order_year * 12 + a.order_month = b.order_year * 12 + b.order_month + 1
GROUP BY
a.order_year, a.order_month
ORDER BY
a.order_year, a.order_month;
• Q.578
Top-selling Products in Each Region
Question
Identify the top 3 selling products in each region for the last quarter of 2023 (October,
November, December). The total sales are calculated by summing up the total_amount for
each product. Return the region, product_id, and total sales for the top 3 products in each
region, ordered by sales in descending order.
Explanation
• Filter orders from the last quarter (October, November, December) of 2023.
• Sum the total_amount for each product_id by region.
• Rank the products within each region by total sales.
• Select the top 3 products per region.
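The four steps form a standard top-N-per-group pattern, sketched below. The orders rows are hypothetical, SQLite (through Python's sqlite3) is an assumption, and top_n is a parameter added for illustration. ROW_NUMBER() is used, so ties at the cut-off are broken arbitrarily; RANK() would keep all tied products instead.

```python
import sqlite3

def top_products_per_region(top_n=3):
    """Top `top_n` products by Q4 2023 sales in each region (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical rows: order_id, region, product_id, total_amount, order_date
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, product_id TEXT,
                             total_amount REAL, order_date TEXT);
        INSERT INTO orders VALUES
            (1, 'North', 'P1', 100.0, '2023-10-05'), (2, 'North', 'P1', 50.0, '2023-11-05'),
            (3, 'North', 'P2', 120.0, '2023-12-01'), (4, 'North', 'P3', 90.0, '2023-10-20'),
            (5, 'North', 'P4', 80.0, '2023-12-31'),  (6, 'South', 'P1', 30.0, '2023-11-11'),
            (7, 'South', 'P2', 60.0, '2023-10-02'),
            (8, 'North', 'P4', 500.0, '2023-09-30');  -- before the quarter starts
    """)
    query = """
        WITH product_sales AS (
            SELECT region, product_id, SUM(total_amount) AS total_sales
            FROM orders
            WHERE order_date BETWEEN '2023-10-01' AND '2023-12-31'
            GROUP BY region, product_id
        ),
        ranked AS (
            SELECT region, product_id, total_sales,
                   ROW_NUMBER() OVER (PARTITION BY region
                                      ORDER BY total_sales DESC) AS sales_rank
            FROM product_sales
        )
        SELECT region, product_id, total_sales
        FROM ranked
        WHERE sales_rank <= ?
        ORDER BY region, total_sales DESC
    """
    return conn.execute(query, (top_n,)).fetchall()

for row in top_products_per_region():
    print(row)
```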
Datasets and SQL Schemas
-- Table creation
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id VARCHAR(20),
product_id VARCHAR(20),
total_amount DECIMAL(10, 2),
order_date DATE,
region VARCHAR(50)
);
Learnings
• Using SUM() for aggregating sales.
• Filtering based on a specific quarter and using EXTRACT(MONTH FROM date) for date
filtering.
• Using RANK() or ROW_NUMBER() to rank products by sales within each region.
Solutions
PostgreSQL solution
WITH ProductSales AS (
SELECT
region,
product_id,
SUM(total_amount) AS total_sales
FROM orders
WHERE order_date BETWEEN '2023-10-01' AND '2023-12-31'
GROUP BY region, product_id
),
RankedProducts AS (
SELECT
region,
product_id,
total_sales,
ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_sales DESC) AS rank
FROM ProductSales
)
SELECT
region,
product_id,
total_sales
FROM RankedProducts
WHERE rank <= 3
ORDER BY region, total_sales DESC;
MySQL solution
WITH ProductSales AS (
SELECT
region,
product_id,
SUM(total_amount) AS total_sales
FROM orders
WHERE order_date BETWEEN '2023-10-01' AND '2023-12-31'
GROUP BY region, product_id
),
RankedProducts AS (
SELECT
region,
product_id,
total_sales,
-- CTEs require MySQL 8.0+, which also supports window functions.
-- RANK is a reserved word there, so the alias is sales_rank.
ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_sales DESC) AS sales_rank
FROM ProductSales
)
SELECT
region,
product_id,
total_sales
FROM RankedProducts
WHERE sales_rank <= 3
ORDER BY region, total_sales DESC;
• Q.579
Question
Calculate the year-over-year growth rate of the amount spent by each customer. The growth
rate is calculated as the difference in the total amount spent between two consecutive years
divided by the amount spent in the earlier year.
Explanation
To calculate the year-over-year growth rate:
• First, aggregate the total amount spent by each customer for each year.
• Then, calculate the difference in the total amount spent between two consecutive years for
each customer.
• Finally, compute the growth rate as the difference divided by the amount spent in the
earlier year.
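The three steps can be sketched on a small example. The customer_purchases rows are hypothetical and SQLite (through Python's sqlite3) is an assumption. Yearly totals are self-joined so that each year is paired with the year before it; a customer's first year, or a year that follows a gap, has no partner row and therefore no growth rate. The rate is expressed as a percentage.

```python
import sqlite3

def yoy_growth():
    """Year-over-year spending growth (%) per customer via a self-join (SQLite sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical rows: customer_id, purchase_date, purchase_amount
        CREATE TABLE customer_purchases (customer_id INTEGER, purchase_date TEXT,
                                         purchase_amount REAL);
        INSERT INTO customer_purchases VALUES
            (1, '2022-03-01', 60.0), (1, '2022-09-01', 40.0),
            (1, '2023-05-01', 150.0), (1, '2024-01-15', 120.0),
            (2, '2023-07-01', 200.0);
    """)
    query = """
        WITH yearly AS (
            SELECT customer_id,
                   CAST(strftime('%Y', purchase_date) AS INTEGER) AS purchase_year,
                   SUM(purchase_amount) AS total_spent
            FROM customer_purchases
            GROUP BY customer_id, CAST(strftime('%Y', purchase_date) AS INTEGER)
        )
        SELECT cur.customer_id,
               cur.purchase_year,
               ROUND((cur.total_spent - prev.total_spent)
                     / prev.total_spent * 100, 2) AS yoy_growth_rate
        FROM yearly cur
        JOIN yearly prev
          ON prev.customer_id = cur.customer_id
         AND prev.purchase_year = cur.purchase_year - 1
        ORDER BY cur.customer_id, cur.purchase_year
    """
    return conn.execute(query).fetchall()

for row in yoy_growth():
    print(row)
```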
Learnings
• Aggregating data by year using YEAR() function.
• Calculating the year-over-year growth rate.
• Using JOIN to align data from consecutive years.
Solutions
PostgreSQL solution
WITH YearlySpend AS (
SELECT
Customer_id,
EXTRACT(YEAR FROM Purchase_date) AS Purchase_year,
SUM(Purchase_amount) AS total_spent
FROM
customer_purchases
GROUP BY
Customer_id, EXTRACT(YEAR FROM Purchase_date)
)
SELECT
a.Customer_id,
a.Purchase_year AS year,
ROUND(((a.total_spent - b.total_spent) / b.total_spent) * 100, 2) AS yoy_growth_rate
FROM
YearlySpend a
JOIN
YearlySpend b ON a.Customer_id = b.Customer_id
AND a.Purchase_year = b.Purchase_year + 1
ORDER BY
a.Customer_id, a.Purchase_year;
MySQL solution
WITH YearlySpend AS (
SELECT
Customer_id,
YEAR(Purchase_date) AS Purchase_year,
SUM(Purchase_amount) AS total_spent
FROM
customer_purchases
GROUP BY
Customer_id, YEAR(Purchase_date)
)
SELECT
a.Customer_id,
a.Purchase_year AS year,
ROUND(((a.total_spent - b.total_spent) / b.total_spent) * 100, 2) AS yoy_growth_rate
FROM
YearlySpend a
JOIN
YearlySpend b ON a.Customer_id = b.Customer_id
AND a.Purchase_year = b.Purchase_year + 1
ORDER BY
a.Customer_id, a.Purchase_year;
• Q.580
Question
Select the Supplier_id, Product_id, and the start date of the period when the stock quantity
was below 50 units for more than two consecutive days.
Explanation
To solve this:
• We need to identify periods where the Stock_quantity was below 50 for more than two
consecutive days.
• For this, we can use a window function or self-join to check if a product's stock remained
below 50 units for at least three consecutive days.
• Return the Supplier_id, Product_id, and the start date of that period.
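The window-function approach can be sketched on a small example. The supplier_inventory rows and column types are hypothetical, and SQLite (through Python's sqlite3) is an assumption. Only low-stock days are kept before LAG() and LEAD() are applied, so the neighbouring dates are themselves low-stock days; a row whose neighbours are exactly one day away on both sides is the middle of a run of at least three days, and its previous date marks where that run starts.

```python
import sqlite3

def low_stock_periods():
    """Start date of the first run of 3+ consecutive days with stock below 50."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical rows: supplier_id, product_id, record_date, stock_quantity
        CREATE TABLE supplier_inventory (supplier_id INTEGER, product_id INTEGER,
                                         record_date TEXT, stock_quantity INTEGER);
        INSERT INTO supplier_inventory VALUES
            (1, 10, '2024-01-01', 40), (1, 10, '2024-01-02', 45),
            (1, 10, '2024-01-03', 30), (1, 10, '2024-01-04', 60),
            (1, 10, '2024-01-05', 20), (1, 10, '2024-01-06', 25),
            (1, 20, '2024-01-01', 10), (1, 20, '2024-01-02', 70),
            (1, 20, '2024-01-03', 10), (1, 20, '2024-01-04', 10);
    """)
    query = """
        WITH low_stock AS (
            SELECT supplier_id, product_id, record_date,
                   LAG(record_date) OVER (PARTITION BY supplier_id, product_id
                                          ORDER BY record_date) AS prev_date,
                   LEAD(record_date) OVER (PARTITION BY supplier_id, product_id
                                           ORDER BY record_date) AS next_date
            FROM supplier_inventory
            WHERE stock_quantity < 50
        )
        SELECT supplier_id, product_id, MIN(prev_date) AS start_date
        FROM low_stock
        -- middle day of a run: both neighbours are exactly one day away
        WHERE prev_date = date(record_date, '-1 day')
          AND next_date = date(record_date, '+1 day')
        GROUP BY supplier_id, product_id
        ORDER BY supplier_id, product_id
    """
    return conn.execute(query).fetchall()

print(low_stock_periods())
```

In the sample rows product 20 is low on three days, but they are not consecutive, so only product 10 is reported.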
Learnings
• Using self-joins or window functions to identify consecutive dates.
• Filtering on Stock_quantity values that are below 50 for more than two consecutive
days.
• Handling date sequences for consecutive periods.
Solutions
PostgreSQL solution
WITH ConsecutiveDays AS (
SELECT
Supplier_id,
Product_id,
Record_date,
LAG(Record_date, 1) OVER (PARTITION BY Supplier_id, Product_id ORDER BY Record_d
ate) AS prev_date,
LEAD(Record_date, 1) OVER (PARTITION BY Supplier_id, Product_id ORDER BY Record_
date) AS next_date,
Stock_quantity
FROM
supplier_inventory
)
SELECT
Supplier_id,
Product_id,
MIN(Record_date) AS start_date
FROM
ConsecutiveDays
WHERE
Stock_quantity < 50
AND prev_date IS NOT NULL
AND next_date IS NOT NULL
AND (Record_date - prev_date = INTERVAL '1 day' AND next_date - Record_date = INTERV
AL '1 day')
GROUP BY
Supplier_id, Product_id
ORDER BY
Supplier_id, Product_id, start_date;
MySQL solution
WITH ConsecutiveDays AS (
SELECT
Supplier_id,
Product_id,
Record_date,
LAG(Record_date, 1) OVER (PARTITION BY Supplier_id, Product_id ORDER BY Record_date) AS prev_date,
LEAD(Record_date, 1) OVER (PARTITION BY Supplier_id, Product_id ORDER BY Record_date) AS next_date
FROM
supplier_inventory
-- Keep only low-stock days so that prev_date and next_date are low-stock days too
WHERE
Stock_quantity < 50
)
SELECT
Supplier_id,
Product_id,
MIN(prev_date) AS start_date -- first day of the earliest run of three low-stock days
FROM
ConsecutiveDays
WHERE
prev_date IS NOT NULL
AND next_date IS NOT NULL
AND DATEDIFF(Record_date, prev_date) = 1
AND DATEDIFF(next_date, Record_date) = 1
GROUP BY
Supplier_id, Product_id
ORDER BY
Supplier_id, Product_id, start_date;
Key Concepts
• LAG() and LEAD(): These window functions help identify the previous and next dates to
detect consecutive days.
• MIN(): Returns the start date of the earliest qualifying period for each supplier and product.
• DATEDIFF(): Used in MySQL to calculate the difference between two dates to check if the
dates are consecutive.
• Consecutive period logic: Ensure the gap between the current date and the previous and
next dates is exactly one day to confirm the consecutive sequence.
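An alternative to comparing each row with its neighbours is the "gaps and islands" pattern: among the low-stock rows only, subtracting a row number from the date yields a value that stays constant within a run of consecutive days. The sketch below is illustrative rather than the book's solution. It runs the SQL through Python's built-in sqlite3 module, assumes an SQLite build with window-function support (3.25 or later) and one row per product per day, and uses invented sample rows; julianday() is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier_inventory (
    Supplier_id INTEGER, Product_id INTEGER, Record_date TEXT, Stock_quantity INTEGER
);
-- Hypothetical sample rows, not taken from the book
INSERT INTO supplier_inventory VALUES
    (1, 10, '2024-01-01', 40),
    (1, 10, '2024-01-02', 45),
    (1, 10, '2024-01-03', 30),
    (1, 10, '2024-01-04', 80),
    (1, 10, '2024-01-05', 20),
    (1, 10, '2024-01-06', 25),
    (2, 20, '2024-01-01', 10),
    (2, 20, '2024-01-02', 90),
    (2, 20, '2024-01-03', 10);
""")

query = """
WITH low AS (
    -- Day number minus row number is constant inside a run of consecutive days
    SELECT Supplier_id, Product_id, Record_date,
           julianday(Record_date)
             - ROW_NUMBER() OVER (PARTITION BY Supplier_id, Product_id
                                  ORDER BY Record_date) AS grp
    FROM supplier_inventory
    WHERE Stock_quantity < 50
)
SELECT Supplier_id, Product_id, MIN(Record_date) AS start_date, COUNT(*) AS run_days
FROM low
GROUP BY Supplier_id, Product_id, grp
HAVING COUNT(*) >= 3          -- "more than two" consecutive days
ORDER BY Supplier_id, Product_id, start_date
"""
low_stock_runs = conn.execute(query).fetchall()
print(low_stock_runs)
```

Unlike the neighbour comparison, this form reports every qualifying run (not just the earliest one) along with its length.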
Tesla
• Q.581
Question
Write a query to identify the average charging duration at different Tesla charging stations,
and rank the stations by the duration of charging time for each model of Tesla vehicle.
Explanation
To solve this:
• Group the data by charging_station_id and vehicle_model.
• Calculate the average charging duration for each charging station and model combination.
• Rank the stations for each model based on the average charging duration.
• Return the charging_station_id, vehicle_model, average charging duration, and rank.
Learnings
• Use of TIMESTAMPDIFF() or EXTRACT() to calculate charging duration.
• Grouping by charging_station_id and vehicle_model for aggregation.
• Using RANK() to rank stations by the charging duration.
• Handling date and time calculations for durations.
Solutions
PostgreSQL solution
WITH ChargingDurations AS (
SELECT
charging_station_id,
vehicle_model,
EXTRACT(EPOCH FROM (charging_end_time - charging_start_time)) / 60 AS charging_duration_minutes
FROM tesla_charging_data
),
AverageChargingDurations AS (
SELECT
charging_station_id,
vehicle_model,
AVG(charging_duration_minutes) AS avg_charging_duration
FROM ChargingDurations
GROUP BY charging_station_id, vehicle_model
)
SELECT
charging_station_id,
vehicle_model,
avg_charging_duration,
RANK() OVER (PARTITION BY vehicle_model ORDER BY avg_charging_duration DESC) AS station_rank
FROM AverageChargingDurations
ORDER BY vehicle_model, station_rank;
MySQL solution
WITH ChargingDurations AS (
SELECT
charging_station_id,
vehicle_model,
TIMESTAMPDIFF(MINUTE, charging_start_time, charging_end_time) AS charging_duration_minutes
FROM tesla_charging_data
),
AverageChargingDurations AS (
SELECT
charging_station_id,
vehicle_model,
AVG(charging_duration_minutes) AS avg_charging_duration
FROM ChargingDurations
GROUP BY charging_station_id, vehicle_model
)
SELECT
charging_station_id,
vehicle_model,
avg_charging_duration,
RANK() OVER (PARTITION BY vehicle_model ORDER BY avg_charging_duration DESC) AS station_rank
FROM AverageChargingDurations
ORDER BY vehicle_model, station_rank;
Key Concepts
• Charging Duration: Calculated as the difference between charging_end_time and
charging_start_time, expressed in minutes.
• AVG(): Computes the average charging duration for each station and model combination.
• RANK(): Ranks the stations within each model category based on the average charging
duration in descending order.
• Grouping: Use of GROUP BY to aggregate data by charging_station_id and
vehicle_model.
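To see how RANK() behaves when two stations tie, here is a small runnable sketch using Python's built-in sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the table mirrors the column names used above, but the sample rows are invented, and strftime('%s', ...) is the SQLite way of getting epoch seconds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tesla_charging_data (
    charging_station_id TEXT,
    vehicle_model TEXT,
    charging_start_time TEXT,
    charging_end_time TEXT
);
-- Hypothetical sample sessions
INSERT INTO tesla_charging_data VALUES
    ('A', 'Model S', '2024-01-01 10:00:00', '2024-01-01 10:30:00'),
    ('A', 'Model S', '2024-01-01 11:00:00', '2024-01-01 11:50:00'),
    ('B', 'Model S', '2024-01-01 09:00:00', '2024-01-01 10:00:00'),
    ('A', 'Model 3', '2024-01-01 08:00:00', '2024-01-01 08:20:00'),
    ('B', 'Model 3', '2024-01-01 08:00:00', '2024-01-01 08:20:00'),
    ('C', 'Model 3', '2024-01-01 12:00:00', '2024-01-01 12:10:00');
""")

query = """
WITH durations AS (
    -- Session length in minutes from epoch seconds
    SELECT charging_station_id, vehicle_model,
           (CAST(strftime('%s', charging_end_time) AS INTEGER)
            - CAST(strftime('%s', charging_start_time) AS INTEGER)) / 60.0 AS minutes
    FROM tesla_charging_data
),
averages AS (
    SELECT charging_station_id, vehicle_model, AVG(minutes) AS avg_minutes
    FROM durations
    GROUP BY charging_station_id, vehicle_model
)
SELECT charging_station_id, vehicle_model, avg_minutes,
       RANK() OVER (PARTITION BY vehicle_model ORDER BY avg_minutes DESC) AS station_rank
FROM averages
ORDER BY vehicle_model, station_rank, charging_station_id
"""
station_ranks = conn.execute(query).fetchall()
for row in station_ranks:
    print(row)
```

Stations A and B tie for Model 3, so both receive rank 1 and station C receives rank 3; DENSE_RANK() would give C rank 2 instead.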
• Q.582
Question
Calculate the sum of tiv_2016 for all policyholders who have the same tiv_2015 value as
one or more other policyholders, and are not located in the same city as any other
policyholder (i.e., the (lat, lon) pairs must be unique). Round the tiv_2016 values to two
decimal places.
Explanation
• First, identify the policyholders who have the same tiv_2015 value as one or more other
policyholders.
• Then, ensure that the policyholders are located in unique cities by ensuring that (lat,
lon) pairs are not repeated.
• Finally, sum the tiv_2016 values for those policyholders and round the result to two
decimal places.
Learnings
• Use of JOIN or GROUP BY to identify common values (tiv_2015).
• Use of HAVING or filtering conditions to ensure unique cities based on (lat, lon).
• Aggregating the filtered rows with SUM() and rounding the result with ROUND().
Solutions
PostgreSQL solution
SELECT
ROUND(SUM(tiv_2016)::numeric, 2) AS tiv_2016 -- cast needed if tiv_2016 is a floating-point column
FROM
Insurance i
WHERE
tiv_2015 IN (
SELECT tiv_2015
FROM Insurance
GROUP BY tiv_2015
HAVING COUNT(pid) > 1
)
AND (lat, lon) IN (
SELECT lat, lon
FROM Insurance
GROUP BY lat, lon
HAVING COUNT(pid) = 1
);
MySQL solution
SELECT
ROUND(SUM(tiv_2016), 2) AS tiv_2016
FROM
Insurance i
WHERE
tiv_2015 IN (
SELECT tiv_2015
FROM Insurance
GROUP BY tiv_2015
HAVING COUNT(pid) > 1
)
AND (lat, lon) IN (
SELECT lat, lon
FROM Insurance
GROUP BY lat, lon
HAVING COUNT(pid) = 1
);
• Q.583
Question
Given two tables, one for Tesla vehicle production (with columns model_id,
production_date) and one for vehicle deliveries (with columns delivery_id, model_id,
delivery_date), write a query to calculate the average time it takes from production to
delivery for each vehicle model.
Explanation
To solve this:
• We need to join the production table with the deliveries table on model_id.
• For each vehicle model, calculate the difference between the delivery_date and
production_date.
• Calculate the average time between production and delivery for each model.
• Return the model ID and the average delivery time (in days).
Learnings
• Joining tables: The query requires a JOIN between two tables using the model_id as the
common key.
• Date difference calculation: We calculate the difference between delivery_date and
production_date using date functions.
• Aggregation: Using AVG() to calculate the average time between production and delivery.
• Grouping: Group the results by model_id to calculate the average time per vehicle model.
Solutions
PostgreSQL solution
SELECT
p.model_id,
ROUND(AVG(EXTRACT(EPOCH FROM (d.delivery_date - p.production_date)) / 86400), 2) AS avg_delivery_time_days
FROM
vehicle_production p
JOIN
vehicle_deliveries d
ON
p.model_id = d.model_id
GROUP BY
p.model_id
ORDER BY
p.model_id;
MySQL solution
SELECT
p.model_id,
ROUND(AVG(DATEDIFF(d.delivery_date, p.production_date)), 2) AS avg_delivery_time_days
FROM
vehicle_production p
JOIN
vehicle_deliveries d
ON
p.model_id = d.model_id
GROUP BY
p.model_id
ORDER BY
p.model_id;
Key Concepts
• JOIN: Used to combine data from the vehicle_production and vehicle_deliveries
tables.
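• Date difference: In PostgreSQL, EXTRACT(EPOCH FROM (ts1 - ts2)) / 86400 gives the difference in days when the columns are timestamps; for DATE columns, date1 - date2 already returns a whole number of days. In MySQL, DATEDIFF() returns the difference in days directly.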
• AVG(): Aggregates the delivery times for each vehicle model to calculate the average.
• ROUND(): Rounds the result to two decimal places for readability.
• Q.584
Question
Write a query to calculate the month-on-month sales growth percentage for a specific Tesla
model, and display the sales figures, revenue, and growth percentages over the last 12
months.
Explanation
To solve this:
• Filter the sales data for the specific Tesla model.
• Group the data by month, calculate the total number of units sold (sales) and the total
revenue (revenue) for each month.
• Calculate the month-on-month growth in sales and revenue.
• Display the sales, revenue, and growth percentage for each of the last 12 months.
Steps:
• Use GROUP BY to aggregate sales and revenue for each month.
• Use LAG() function to calculate the sales and revenue for the previous month to compute
growth percentages.
• Calculate growth percentages for both sales and revenue.
• Filter data to show only the last 12 months (use date filtering).
Learnings
• GROUP BY: Group sales and revenue by month to aggregate data.
• LAG(): To calculate previous month's sales/revenue for growth calculation.
• Date filtering: Using WHERE to filter the data for the last 12 months.
• Growth calculation: The percentage growth formula:
\[ \text{Growth Percentage} = \left(\frac{\text{Current Month} - \text{Previous Month}}{\text{Previous Month}}\right) \times 100 \]
• Date formatting: Using DATE_TRUNC() or MONTH() for extracting the month part of the
date.
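The growth calculation has one common pitfall: when the monthly totals are integers, dividing before multiplying truncates the ratio to zero in engines that use integer division. The sketch below illustrates this with Python's built-in sqlite3 module (SQLite also divides integers this way). It assumes an SQLite build with window-function support (3.25 or later), a simplified tesla_sales table, and invented sample rows; strftime('%Y-%m', ...) is the SQLite counterpart of DATE_TRUNC() or DATE_FORMAT().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tesla_sales (model_id TEXT, sale_date TEXT, units_sold INTEGER);
-- Hypothetical sample rows
INSERT INTO tesla_sales VALUES
    ('Model S', '2024-01-05', 4),
    ('Model S', '2024-01-20', 6),
    ('Model S', '2024-02-10', 15),
    ('Model S', '2024-03-15', 12);
""")

query = """
WITH monthly AS (
    SELECT strftime('%Y-%m', sale_date) AS sale_month,
           SUM(units_sold) AS units
    FROM tesla_sales
    WHERE model_id = 'Model S'
    GROUP BY sale_month
),
with_prev AS (
    SELECT sale_month, units,
           LAG(units) OVER (ORDER BY sale_month) AS prev_units
    FROM monthly
)
SELECT sale_month, units,
       -- Integer division happens first, so the ratio is truncated to 0
       (units - prev_units) / NULLIF(prev_units, 0) * 100 AS truncated_growth,
       -- Multiplying by 100.0 first forces decimal arithmetic
       ROUND((units - prev_units) * 100.0 / NULLIF(prev_units, 0), 2) AS growth_pct
FROM with_prev
ORDER BY sale_month
"""
growth_rows = conn.execute(query).fetchall()
for row in growth_rows:
    print(row)
```

The first month has no predecessor, so LAG() returns NULL and both growth columns come back as None in Python.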
Solutions
PostgreSQL solution
WITH MonthlySales AS (
SELECT
model_id,
DATE_TRUNC('month', sale_date) AS sale_month,
SUM(units_sold) AS total_units_sold,
SUM(revenue) AS total_revenue
FROM tesla_sales
WHERE model_id = 'Model S'
AND sale_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY model_id, DATE_TRUNC('month', sale_date)
),
GrowthCalculation AS (
SELECT
model_id,
sale_month,
total_units_sold,
total_revenue,
LAG(total_units_sold) OVER (ORDER BY sale_month) AS previous_month_units,
LAG(total_revenue) OVER (ORDER BY sale_month) AS previous_month_revenue
FROM MonthlySales
)
SELECT
model_id,
sale_month,
total_units_sold AS sales,
total_revenue AS revenue,
-- Multiply by 100.0 before dividing to avoid integer division on whole-number totals
ROUND((total_units_sold - previous_month_units) * 100.0 / NULLIF(previous_month_units, 0), 2) AS sales_growth_percentage,
ROUND((total_revenue - previous_month_revenue) * 100.0 / NULLIF(previous_month_revenue, 0), 2) AS revenue_growth_percentage
FROM GrowthCalculation
ORDER BY sale_month DESC
LIMIT 12;
MySQL solution
WITH MonthlySales AS (
SELECT
model_id,
DATE_FORMAT(sale_date, '%Y-%m-01') AS sale_month,
SUM(units_sold) AS total_units_sold,
SUM(revenue) AS total_revenue
FROM tesla_sales
WHERE model_id = 'Model S'
AND sale_date >= CURDATE() - INTERVAL 1 YEAR
GROUP BY model_id, DATE_FORMAT(sale_date, '%Y-%m-01')
),
GrowthCalculation AS (
SELECT
model_id,
sale_month,
total_units_sold,
total_revenue,
LAG(total_units_sold) OVER (ORDER BY sale_month) AS previous_month_units,
LAG(total_revenue) OVER (ORDER BY sale_month) AS previous_month_revenue
FROM MonthlySales
)
SELECT
model_id,
sale_month,
total_units_sold AS sales,
total_revenue AS revenue,
ROUND(((total_units_sold - previous_month_units) / NULLIF(previous_month_units, 0)) * 100, 2) AS sales_growth_percentage,
ROUND(((total_revenue - previous_month_revenue) / NULLIF(previous_month_revenue, 0)) * 100, 2) AS revenue_growth_percentage
FROM GrowthCalculation
ORDER BY sale_month DESC
LIMIT 12;
Key Concepts
• Q.585
Explanation
To solve this:
• Group the data by charging_station_id to aggregate the total energy dispensed (in
kWh) and the total charging time (in hours).
• Calculate the efficiency by dividing the total kWh by the total charging time (in hours).
• Order the stations based on efficiency in descending order.
Steps:
• Sum of Energy Dispensed: Calculate the total kWh dispensed at each station.
• Sum of Charging Time: Calculate the total charging time in hours.
• Efficiency Calculation: The formula for efficiency is:
\[ \text{Efficiency} = \frac{\text{Total kWh}}{\text{Total Charging Time (hours)}} \]
• Sorting: Sort the results by efficiency in descending order.
Learnings
• Date/Time calculations: We need to calculate the charging time (in hours) by finding the
difference between charging_end_time and charging_start_time.
• Aggregation: Use SUM() to calculate total energy dispensed and total charging time for
each station.
• Efficiency formula: The efficiency of a station is calculated as the ratio of energy
dispensed (kWh) to charging time (hours).
• Ordering: The results are ordered by efficiency from highest to lowest.
Solutions
PostgreSQL solution
SELECT
charging_station_id,
ROUND(SUM(energy_dispensed_kWh) / SUM(EXTRACT(EPOCH FROM (charging_end_time - charging_start_time)) / 3600), 2) AS efficiency
FROM
ev_charging_sessions
GROUP BY
charging_station_id
ORDER BY
efficiency DESC;
MySQL solution
SELECT
charging_station_id,
ROUND(SUM(energy_dispensed_kWh) / SUM(TIMESTAMPDIFF(SECOND, charging_start_time, charging_end_time) / 3600), 2) AS efficiency
FROM
ev_charging_sessions
GROUP BY
charging_station_id
ORDER BY
efficiency DESC;
Key Concepts
• Energy Dispensed: The total energy dispensed at each station is aggregated using
SUM(energy_dispensed_kWh).
• Charging Time Calculation: In PostgreSQL, EXTRACT(EPOCH FROM
(charging_end_time - charging_start_time)) / 3600 gives the time in hours, while
in MySQL, TIMESTAMPDIFF(SECOND, charging_start_time, charging_end_time) /
3600 calculates the time in hours.
• Efficiency Calculation: The efficiency is calculated by dividing the total kWh by the total
charging time (in hours).
• ROUND(): Used to round the efficiency to two decimal places for better readability.
• Ordering: The stations are ordered by efficiency in descending order, showing the most
efficient stations first.
• Q.586
Question
Given a dataset of Tesla charging stations, we'd like to analyze the usage pattern. The dataset
captures when a Tesla car starts charging, finishes charging, and the charging station used.
Calculate the total charging time at each station and compare it with the previous day.
Explanation
To solve this:
• Calculate the Total Charging Time: For each charging session, compute the difference
between end_time and start_time and convert this to hours. We then sum these values to
get the total charging time per day for each station.
• Compare with Previous Day: Use LAG() to calculate the difference in charging time
between the current day and the previous day for each station.
• Group and Order Data: Group the data by station_id and the truncated date of
start_time (to get the day), then order the results by station_id and date.
Learnings
• Date Truncation: Using date_trunc() to truncate the timestamp to the day ensures that
we aggregate the data by day.
• Time Calculation: Use EXTRACT(EPOCH FROM (end_time - start_time)) / 3600 to
calculate the difference between end_time and start_time in hours.
• LAG() Window Function: The LAG() function is used to access the value from the
previous day, allowing us to compute the difference in total charging hours between
consecutive days.
• Grouping and Ordering: The query groups by station_id and the truncated date
(charge_day), then orders by station_id and charge_day to ensure the correct sequence.
Solutions
PostgreSQL solution
SELECT
station_id,
date_trunc('day', start_time) AS charge_day,
SUM(EXTRACT(EPOCH FROM (end_time - start_time))/3600) AS total_charge_hours,
(SUM(EXTRACT(EPOCH FROM (end_time - start_time))/3600)
- LAG(SUM(EXTRACT(EPOCH FROM (end_time - start_time))/3600), 1, 0)
OVER (PARTITION BY station_id ORDER BY date_trunc('day', start_time))
) AS diff_prev_day_hours
FROM
charging_data
GROUP BY
station_id, charge_day
ORDER BY
station_id, charge_day;
MySQL solution
SELECT
station_id,
DATE(DATE_FORMAT(start_time, '%Y-%m-%d')) AS charge_day,
Key Concepts
• LAG(): This window function allows us to retrieve the value from the previous row (in this
case, the previous day's total charging hours) in a partitioned and ordered dataset.
• Time Difference: We calculate the difference between start_time and end_time using
EXTRACT(EPOCH FROM (end_time - start_time)) for PostgreSQL and TIMESTAMPDIFF()
in MySQL.
• Grouping and Aggregation: By grouping by station_id and charge_day, we calculate
the total charging time for each station on each day.
• Handling NULLs: The COALESCE() function in MySQL ensures that if there is no
previous day's data (i.e., the first day), we substitute NULL with 0.
Output
The query will output the following columns:
• station_id: The ID of the charging station.
• charge_day: The day when the charging took place (truncated from start_time).
• total_charge_hours: The total charging hours for that station on that day.
• diff_prev_day_hours: The difference in charging hours compared to the previous day for
that station.
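The four output columns can be reproduced on a toy dataset. The sketch below is a hedged illustration using Python's built-in sqlite3 module, not the book's PostgreSQL or MySQL query: it assumes an SQLite build with window-function support (3.25 or later), uses SQLite's date() and strftime('%s', ...) in place of date_trunc() and EXTRACT(EPOCH ...), and the sample sessions are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE charging_data (station_id TEXT, start_time TEXT, end_time TEXT);
-- Hypothetical sample sessions
INSERT INTO charging_data VALUES
    ('S1', '2024-03-01 08:00:00', '2024-03-01 10:00:00'),
    ('S1', '2024-03-01 12:00:00', '2024-03-01 13:00:00'),
    ('S1', '2024-03-02 09:00:00', '2024-03-02 13:00:00'),
    ('S2', '2024-03-01 07:00:00', '2024-03-01 08:30:00');
""")

query = """
WITH daily AS (
    -- Total charging hours per station per day
    SELECT station_id,
           date(start_time) AS charge_day,
           SUM(CAST(strftime('%s', end_time) AS INTEGER)
               - CAST(strftime('%s', start_time) AS INTEGER)) / 3600.0 AS total_charge_hours
    FROM charging_data
    GROUP BY station_id, charge_day
)
SELECT station_id, charge_day, total_charge_hours,
       total_charge_hours
         - LAG(total_charge_hours, 1, 0)
             OVER (PARTITION BY station_id ORDER BY charge_day) AS diff_prev_day_hours
FROM daily
ORDER BY station_id, charge_day
"""
daily_diffs = conn.execute(query).fetchall()
for row in daily_diffs:
    print(row)
```

Note that LAG() looks at the previous row for the station, which is the previous calendar day only if the station has sessions every day; with gaps, the comparison is against the most recent day that has data.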
• Q.587
Question
Write a SQL query that determines which parts have begun the assembly process but are not
yet finished.
Explanation
To solve this problem:
• Identify unfinished parts: We assume that the finish_date being NULL indicates that the
part is still in the assembly process.
• Query the table: We need to filter the records where finish_date is NULL to find the
parts that haven't been finished yet.
• Return the part and assembly step: The query should return the part name (or ID) and
the corresponding assembly step.
assembly_step VARCHAR(50),
start_date DATE,
finish_date DATE
);
Learnings
• Filtering with NULL: We use the condition finish_date IS NULL to find parts that are
unfinished.
• Basic SELECT and WHERE: This problem involves basic SQL queries that use SELECT,
FROM, and WHERE to filter data.
• No JOINs: This problem does not require any joins or complex operations since it only
involves filtering one table based on a condition.
Solutions
PostgreSQL solution
SELECT part, assembly_step
FROM parts_assembly
WHERE finish_date IS NULL;
MySQL solution
SELECT part, assembly_step
FROM parts_assembly
WHERE finish_date IS NULL;
Key Concepts
• NULL Handling: In SQL, NULL represents missing or undefined values. Here,
finish_date IS NULL helps identify rows where the part is still in the assembly process.
• Simple Query: The query involves a straightforward SELECT with a WHERE clause to filter
out the finished parts.
• Q.588
Question
Write a query to calculate the 3-day weighted moving average of sales for each product using
weights: 0.5 (current day), 0.3 (previous day), 0.2 (two days ago).
Explanation
To solve this:
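• Identify the Weighted Moving Average (WMA): The formula for the 3-day weighted
moving average is:
\[ \text{WMA} = 0.5 \times \text{Sales}_{t} + 0.3 \times \text{Sales}_{t-1} + 0.2 \times \text{Sales}_{t-2} \]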
• Use Window Functions: We can use LAG() to get sales from the previous day and two
days ago.
734
1000+ SQL Interview Questions & Answers | By Zero Analyst
• Calculate Weighted Average: Apply the weights (0.5, 0.3, 0.2) on the respective sales
figures for the current day, previous day, and two days ago.
Learnings
• Window Functions: The LAG() function is useful to get previous values (sales amounts
from previous days).
• Weighted Calculations: Use basic arithmetic to calculate the weighted moving average.
• Handling NULLs: For the first two days, the LAG() function will return NULL for missing
values, which can be handled appropriately.
Solutions
PostgreSQL solution
SELECT
product_id,
sale_date,
ROUND(
0.5 * sales_amount +
0.3 * COALESCE(LAG(sales_amount, 1) OVER (PARTITION BY product_id ORDER BY sale_date), 0) +
0.2 * COALESCE(LAG(sales_amount, 2) OVER (PARTITION BY product_id ORDER BY sale_date), 0),
2) AS weighted_moving_avg
FROM
product_sales
ORDER BY
product_id, sale_date;
MySQL solution
SELECT
product_id,
sale_date,
ROUND(
0.5 * sales_amount +
0.3 * COALESCE(LAG(sales_amount, 1) OVER (PARTITION BY product_id ORDER BY sale_date), 0) +
0.2 * COALESCE(LAG(sales_amount, 2) OVER (PARTITION BY product_id ORDER BY sale_date), 0),
2) AS weighted_moving_avg
FROM
product_sales
ORDER BY
product_id, sale_date;
Key Concepts
• LAG() Window Function: LAG() allows us to access data from the previous rows within
the same partition, which is essential for calculating the moving average.
• COALESCE(): This function handles NULL values that may arise from the LAG() function,
substituting them with 0 for missing sales data on the first two days.
• Weighted Moving Average: The moving average is calculated by applying specific
weights to the sales amounts of the current day and the previous two days.
• Rounding: We use ROUND() to format the result to two decimal places, as required.
Output
This query will return the product_id, sale_date, and the 3-day weighted moving average
of sales (weighted_moving_avg) for each product, ordered by product_id and sale_date.
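A quick way to check the arithmetic is to run the same expression on three days of data. This is a minimal sketch using Python's built-in sqlite3 module, assuming an SQLite build with window-function support (3.25 or later) and invented sample rows for a single product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_sales (product_id INTEGER, sale_date TEXT, sales_amount REAL);
-- Hypothetical sample rows
INSERT INTO product_sales VALUES
    (1, '2024-01-01', 100),
    (1, '2024-01-02', 200),
    (1, '2024-01-03', 300);
""")

query = """
SELECT product_id, sale_date,
       ROUND(0.5 * sales_amount
             + 0.3 * COALESCE(LAG(sales_amount, 1)
                     OVER (PARTITION BY product_id ORDER BY sale_date), 0)
             + 0.2 * COALESCE(LAG(sales_amount, 2)
                     OVER (PARTITION BY product_id ORDER BY sale_date), 0),
             2) AS weighted_moving_avg
FROM product_sales
ORDER BY product_id, sale_date
"""
wma_rows = conn.execute(query).fetchall()
for row in wma_rows:
    print(row)
# Day 3: 0.5 * 300 + 0.3 * 200 + 0.2 * 100 = 230
```

Because missing days are replaced with 0, the first two rows use weights that sum to less than 1 (0.5 and 0.8), so they understate the average; whether that is acceptable depends on the reporting requirement.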
• Q.589
Question
Write an SQL query to identify the top 10 states with the highest Tesla sales for the past year,
grouped by vehicle model.
Explanation
To solve this:
• Filter Sales for the Last Year: We need to filter the data to include only sales that
occurred in the past year. This can be done by comparing the sale_date with the current
date minus one year.
• Group by State and Vehicle Model: We group the results by state and vehicle_model
to aggregate sales for each combination.
• Sum the Sales: For each state and vehicle model, we need to calculate the total sales,
typically by summing up a sales amount column (e.g., sales_amount).
• Rank the States: Using the RANK() or ROW_NUMBER() function, we can rank states based
on their sales, and then select the top 10 based on these ranks.
Learnings
• Date Filtering: Use CURRENT_DATE or NOW() to filter records for the past year.
• Aggregating Data: Group the data by state and vehicle_model to calculate the total
sales.
• Window Functions: Use RANK() or ROW_NUMBER() to rank states by their total sales for
each model.
• Handling Sales: The sales_amount column is summed to get the total sales for each state
and model.
Solutions
PostgreSQL solution
WITH ranked_sales AS (
SELECT
state,
vehicle_model,
SUM(sales_amount) AS total_sales,
RANK() OVER (PARTITION BY vehicle_model ORDER BY SUM(sales_amount) DESC) AS sales_rank
FROM tesla_sales
WHERE sale_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY state, vehicle_model
)
SELECT state, vehicle_model, total_sales
FROM ranked_sales
WHERE sales_rank <= 10
ORDER BY vehicle_model, total_sales DESC;
MySQL solution
WITH ranked_sales AS (
SELECT
state,
vehicle_model,
SUM(sales_amount) AS total_sales,
RANK() OVER (PARTITION BY vehicle_model ORDER BY SUM(sales_amount) DESC) AS sales_rank
FROM tesla_sales
WHERE sale_date >= CURDATE() - INTERVAL 1 YEAR
GROUP BY state, vehicle_model
)
SELECT state, vehicle_model, total_sales
FROM ranked_sales
WHERE sales_rank <= 10
ORDER BY vehicle_model, total_sales DESC;
Key Concepts
• RANK() Window Function: This function ranks each state by the total sales for each
vehicle model. States with the highest sales for a model get a rank of 1, 2, etc. We filter the
results to only include the top 10 states based on sales for each model.
• Date Filtering: We filter the sales data for the last year using CURRENT_DATE - INTERVAL
'1 year' (PostgreSQL) or CURDATE() - INTERVAL 1 YEAR (MySQL).
• SUM() Aggregation: The SUM(sales_amount) calculates the total sales for each
combination of state and vehicle model.
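The choice between RANK() and ROW_NUMBER() matters when states tie. The sketch below, run through Python's built-in sqlite3 module, compares the two on a tiny invented dataset; it assumes an SQLite build with window-function support (3.25 or later), omits the date filter for brevity, and uses a "top 1" cutoff so the effect of a tie is visible with only three states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tesla_sales (state TEXT, vehicle_model TEXT, sales_amount INTEGER);
-- Hypothetical sample rows: CA and TX tie at 100 in total
INSERT INTO tesla_sales VALUES
    ('CA', 'Model Y', 60), ('CA', 'Model Y', 40),
    ('TX', 'Model Y', 100),
    ('NY', 'Model Y', 50);
""")

query = """
WITH totals AS (
    SELECT state, vehicle_model, SUM(sales_amount) AS total_sales
    FROM tesla_sales
    GROUP BY state, vehicle_model
)
SELECT state, vehicle_model, total_sales,
       RANK() OVER (PARTITION BY vehicle_model
                    ORDER BY total_sales DESC) AS sales_rank,
       ROW_NUMBER() OVER (PARTITION BY vehicle_model
                          ORDER BY total_sales DESC, state) AS sales_row
FROM totals
ORDER BY vehicle_model, sales_row
"""
ranked = conn.execute(query).fetchall()

# Filtering on RANK() keeps every tied state; ROW_NUMBER() keeps exactly N rows
top_by_rank = [r[0] for r in ranked if r[3] <= 1]
top_by_row = [r[0] for r in ranked if r[4] <= 1]
print(top_by_rank, top_by_row)
```

With RANK(), a "top 10" filter can return more than 10 states per model when totals tie at the cutoff; ROW_NUMBER() returns exactly 10 but needs an explicit tie-breaker (here, the state name) to be deterministic.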
• Q.590
Question
Write an SQL query to calculate the click-through conversion rates for Tesla's digital ads,
from viewing a digital ad to adding a product (vehicle model) to the cart.
Explanation
To calculate the click-through conversion rate:
• Group Click Data: First, we need to count how many times each product was clicked
within each ad campaign by grouping the ad_clicks table by ad_campaign and
product_model.
• Group Add Data: Next, count how many times a product was added to the cart from the
add_to_carts table, grouping by product_model.
• Calculate Conversion Rate: For each ad campaign and product, we calculate the
conversion rate as:
\[ \text{Conversion Rate} = \frac{\text{Number of Adds}}{\text{Number of Clicks}} \times 100 \]
where "Number of Adds" refers to the number of times a product was added to the cart, and
"Number of Clicks" refers to the number of times a user clicked the ad.
• Join the Tables: The ad_clicks and add_to_carts tables are joined on product_model
to get the relevant click and add data for each product within each campaign.
Learnings
• Aggregation: Use COUNT() with GROUP BY to aggregate the number of clicks and adds for
each product.
• Joining: Join the tables on the product_model to combine the data from both tables (click
and add actions).
• Conversion Rate Calculation: The conversion rate is calculated by dividing the number
of adds by the number of clicks, then multiplying by 100 to get the percentage.
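The aggregation, join, and rate calculation can be tried end to end on a few rows. This sketch uses Python's built-in sqlite3 module with invented data and simplified ad_clicks and add_to_carts tables. As a deliberate variation, it uses a LEFT JOIN with COALESCE() so that a product with clicks but no adds shows a 0% rate rather than disappearing from the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ad_clicks (ad_campaign TEXT, product_model TEXT);
CREATE TABLE add_to_carts (product_model TEXT);
-- Hypothetical sample rows
INSERT INTO ad_clicks VALUES
    ('Campaign1', 'Model S'), ('Campaign1', 'Model S'),
    ('Campaign1', 'Model S'), ('Campaign1', 'Model S'),
    ('Campaign1', 'Model 3'), ('Campaign1', 'Model 3'),
    ('Campaign1', 'Model X');
INSERT INTO add_to_carts VALUES ('Model S'), ('Model 3');
""")

query = """
WITH clicks AS (
    SELECT ad_campaign, product_model, COUNT(*) AS num_clicks
    FROM ad_clicks
    GROUP BY ad_campaign, product_model
),
adds AS (
    SELECT product_model, COUNT(*) AS num_adds
    FROM add_to_carts
    GROUP BY product_model
)
SELECT clicks.ad_campaign,
       clicks.product_model,
       clicks.num_clicks,
       COALESCE(adds.num_adds, 0) AS num_adds,
       -- 100.0 forces decimal division instead of integer division
       COALESCE(adds.num_adds, 0) * 100.0 / clicks.num_clicks AS conversion_rate
FROM clicks
LEFT JOIN adds ON clicks.product_model = adds.product_model
ORDER BY clicks.product_model
"""
conversion_rows = conn.execute(query).fetchall()
for row in conversion_rows:
    print(row)
```

Because adds are counted per product rather than per campaign, the same add count would be attributed to every campaign that advertises that product; attributing adds to a specific campaign would require a campaign column in add_to_carts, which is not assumed here.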
Solutions
PostgreSQL Solution
WITH clicks AS (
SELECT ad_campaign, product_model, COUNT(*) AS num_clicks
FROM ad_clicks
GROUP BY ad_campaign, product_model
),
adds AS (
SELECT product_model, COUNT(*) AS num_adds
FROM add_to_carts
GROUP BY product_model
)
SELECT clicks.ad_campaign,
clicks.product_model,
clicks.num_clicks,
adds.num_adds,
(adds.num_adds::DECIMAL / clicks.num_clicks) * 100 AS conversion_rate
FROM clicks
JOIN adds ON clicks.product_model = adds.product_model;
MySQL Solution
WITH clicks AS (
Output
The result will return the following columns:
• ad_campaign: The name of the ad campaign (e.g., 'Campaign1').
• product_model: The Tesla vehicle model (e.g., 'Model S').
• num_clicks: The total number of clicks for the product in the ad campaign.
• num_adds: The total number of times the product was added to the cart.
• conversion_rate: The click-through conversion rate for the ad campaign and product.
• Q.591
Question
Write an SQL query to identify the top 5 Tesla models with the highest average revenue per
sale in 2024. The query should return the model name, total number of sales, and the average
sale price for that model.
Explanation
• Filter for 2024 Sales: The sales data should be filtered for the year 2024.
• Aggregate by Model: For each model, calculate the total sales and average revenue.
• Sort and Limit: Sort the models by average revenue per sale in descending order and
return only the top 5 models.
Learnings
• Date Filtering: Use YEAR() or EXTRACT() to filter sales data for the year 2024.
• Aggregation: Use COUNT() for total sales and AVG() for average price per sale.
• Sorting and Limiting: Use ORDER BY and LIMIT to sort the results and restrict it to the top
5.
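Besides YEAR() and EXTRACT(), a year can be selected with a half-open date range, which leaves the sale_date column unwrapped and so generally lets the engine use an ordinary index on it. The sketch below shows the range form with Python's built-in sqlite3 module; the sales table is simplified, the rows are invented, dates are stored as ISO-8601 text, and LIMIT 2 stands in for LIMIT 5 because the sample has only three models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (model_name TEXT, sale_date TEXT, price REAL);
-- Hypothetical sample rows; one sale falls outside 2024
INSERT INTO sales VALUES
    ('Model X', '2024-02-01', 90000),
    ('Model X', '2024-06-15', 94000),
    ('Model S', '2024-03-10', 80000),
    ('Model S', '2023-12-31', 70000),
    ('Model 3', '2024-12-31', 35000);
""")

query = """
SELECT model_name, COUNT(*) AS total_sales, AVG(price) AS avg_price
FROM sales
-- Half-open range: includes all of 2024, excludes 2025-01-01 onward
WHERE sale_date >= '2024-01-01' AND sale_date < '2025-01-01'
GROUP BY model_name
ORDER BY avg_price DESC
LIMIT 2
"""
top_models = conn.execute(query).fetchall()
for row in top_models:
    print(row)
```

The 2023 sale of the Model S is excluded, so its 2024 average is based on a single sale.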
Solutions
PostgreSQL Solution
SELECT model_name, COUNT(*) AS total_sales, AVG(price) AS avg_price
FROM sales
WHERE EXTRACT(YEAR FROM sale_date) = 2024
GROUP BY model_name
ORDER BY avg_price DESC
LIMIT 5;
MySQL Solution
SELECT model_name, COUNT(*) AS total_sales, AVG(price) AS avg_price
FROM sales
WHERE YEAR(sale_date) = 2024
GROUP BY model_name
ORDER BY avg_price DESC
LIMIT 5;
Output
model_name total_sales avg_price
Model X 3 92000
Model S 3 78333.33
Model Y 3 52000
Cybertruck 2 40500
Model 3 3 35000
• Q.592
Question
Write an SQL query to calculate the total miles driven and average power consumption for
each Tesla model in 2024. The result should return the model name, total miles driven, and
the average power consumed for that model.
Explanation
• Filter for 2024 Service Data: The service data should be filtered for the year 2024.
• Aggregation: Calculate the total distance driven and average power consumption for each
model.
• Group by Model: Group the results by model_name.
Learnings
• Date Filtering: Use EXTRACT() or YEAR() to filter records for the year 2024.
• Aggregation: Use SUM() for total distance driven and AVG() for average power
consumption.
• Grouping: Group the results by model_name to summarize data by Tesla model.
Solutions
PostgreSQL Solution
SELECT model_name,
SUM(distance_driven) AS total_miles,
AVG(power_consumed) AS avg_power_consumed
FROM service_data s
JOIN vehicles v ON s.vehicle_id = v.vehicle_id
WHERE EXTRACT(YEAR FROM service_date) = 2024
GROUP BY model_name
ORDER BY total_miles DESC;
MySQL Solution
SELECT model_name,
SUM(distance_driven) AS total_miles,
AVG(power_consumed) AS avg_power_consumed
FROM service_data s
JOIN vehicles v ON s.vehicle_id = v.vehicle_id
WHERE YEAR(service_date) = 2024
GROUP BY model_name
ORDER BY total_miles DESC;
• Q.593
Identify Tesla Models with the Highest Number of Service Records in 2024
Problem Statement
You are tasked with identifying the Tesla models that have had the highest number of service
records in 2024. The result should return the model name and the total number of service
records for each model. The query should be sorted in descending order based on the number
of service records, and the model with the most service records should be displayed.
service_data Table
-- Table definition for service data
CREATE TABLE service_data (
record_id INT PRIMARY KEY,
vehicle_id VARCHAR(20),
service_type VARCHAR(50),
service_date DATE
);
Solution Query
The following query retrieves the Tesla model with the highest number of service records in
2024:
PostgreSQL Solution
SELECT v.model_name, COUNT(s.record_id) AS total_services
FROM service_data s
JOIN vehicles v ON s.vehicle_id = v.vehicle_id
WHERE EXTRACT(YEAR FROM service_date) = 2024
GROUP BY v.model_name
ORDER BY total_services DESC
LIMIT 1;
MySQL Solution
SELECT v.model_name, COUNT(s.record_id) AS total_services
FROM service_data s
JOIN vehicles v ON s.vehicle_id = v.vehicle_id
WHERE YEAR(service_date) = 2024
GROUP BY v.model_name
ORDER BY total_services DESC
LIMIT 1;
• Q.594
Question
Write an SQL query to identify the Tesla model with the highest number of customer
complaints in the UK for the year 2024. The query should return the model name and the
total number of complaints for that model.
Explanation
• Filter Complaints for the Year 2024: The complaints should be filtered for the year 2024.
• Filter for UK: The complaints should be specifically filtered for customers located in the
UK (you will likely need to filter by the country or similar field).
• Aggregation: The number of complaints for each model in 2024 should be aggregated.
• Identify the Model with Most Complaints: Use ORDER BY to sort the models by the
number of complaints in descending order, and use LIMIT to get the model with the highest
number of complaints.
Learnings
• Filtering by Year: Use YEAR() or EXTRACT(YEAR FROM complaint_date) to filter
complaints for the year 2024.
• Filtering by Country: Filter complaints based on the country column (specifically 'UK').
• Aggregation: Use COUNT() to count the number of complaints for each model.
• Sorting: Sort the results by the number of complaints in descending order and limit the
result to one row.
Solutions
PostgreSQL Solution
SELECT model_name, COUNT(*) AS total_complaints
FROM customer_complaints
WHERE EXTRACT(YEAR FROM complaint_date) = 2024
AND country = 'UK'
GROUP BY model_name
ORDER BY total_complaints DESC
LIMIT 1;
MySQL Solution
SELECT model_name, COUNT(*) AS total_complaints
FROM customer_complaints
WHERE YEAR(complaint_date) = 2024
AND country = 'UK'
GROUP BY model_name
ORDER BY total_complaints DESC
LIMIT 1;
Output
The output will return the Tesla model with the highest number of complaints in the UK for
2024, along with the total number of complaints.
model_name total_complaints
Model 3 4
• Q.595
Question
Write an SQL query to identify the Tesla model with the highest number of sales in the year
2024. The result should return the model name and the total number of units sold in that year.
Explanation
• Filter Sales for 2024: The sales data should be filtered for the year 2024.
• Aggregation: You need to count the total number of sales for each model in 2024.
• Identify Highest Sales: Use the ORDER BY clause and LIMIT to identify the model with the
highest sales.
Learnings
• Filtering by Year: Use YEAR() or EXTRACT(YEAR FROM sale_date) to filter sales for the
year 2024.
• Aggregation: Use COUNT() to aggregate the number of sales for each model.
• Sorting: Use ORDER BY to sort by the number of sales in descending order and retrieve the
top model.
Solutions
PostgreSQL Solution
SELECT model_name, COUNT(*) AS total_sales
FROM sales
WHERE EXTRACT(YEAR FROM sale_date) = 2024
GROUP BY model_name
ORDER BY total_sales DESC
LIMIT 1;
MySQL Solution
SELECT model_name, COUNT(*) AS total_sales
FROM sales
WHERE YEAR(sale_date) = 2024
GROUP BY model_name
ORDER BY total_sales DESC
LIMIT 1;
Output
The output will return the Tesla model with the highest sales in 2024, along with the number
of units sold.
model_name total_sales
Model 3 4
• Q.596
Question
Write an SQL query to identify the total sales of the Tesla Cybertruck in the month it was
launched. The query should calculate the total sales in terms of the number of units sold and
the total revenue generated (assuming sales price is included in the sales data).
Explanation
• Identifying the Launch Month: First, you need to identify the launch month of the Tesla
Cybertruck. This can be done using the sale_date column in the sales table.
• Filter Sales for Cybertruck: The model_name will be used to filter for Cybertruck sales.
• Calculate Total Sales: Use aggregation to calculate the total number of units sold and total
revenue in that launch month.
Learnings
• Date Filtering: Use MONTH() and YEAR() to filter by the specific launch month.
• Aggregation: Use COUNT() for total units sold and SUM() for total revenue.
• Filtering by Model: Use the model_name column to filter the data for Cybertruck sales
only.
Solutions
PostgreSQL Solution
SELECT
model_name,
EXTRACT(MONTH FROM sale_date) AS launch_month,
EXTRACT(YEAR FROM sale_date) AS launch_year,
COUNT(*) AS total_sales,
SUM(price) AS total_revenue
FROM
sales
WHERE
model_name = 'Cybertruck'
AND EXTRACT(MONTH FROM sale_date) = 11 -- Month of launch (November)
AND EXTRACT(YEAR FROM sale_date) = 2021 -- Launch year
GROUP BY
model_name, launch_month, launch_year;
MySQL Solution
SELECT
model_name,
MONTH(sale_date) AS launch_month,
YEAR(sale_date) AS launch_year,
COUNT(*) AS total_sales,
SUM(price) AS total_revenue
FROM
sales
WHERE
model_name = 'Cybertruck'
AND MONTH(sale_date) = 11 -- Month of launch (November)
AND YEAR(sale_date) = 2021 -- Launch year
GROUP BY
model_name, launch_month, launch_year;
Output
The output will return the total number of Cybertruck units sold and the total revenue for the
launch month (November 2021).
• Q.597
Question
Write an SQL query to produce a report summarizing the average distance driven and
average power consumed by each Tesla vehicle model, grouped by the year of manufacture.
The report should include the model name, manufacture year, average distance driven (in
miles), and average power consumed (in kilowatt-hour).
Explanation
• Join Operation: You need to join the vehicles and service_data tables on the
vehicle_id to combine the data of each vehicle with its service records.
• Aggregation: Use the AVG() function to calculate the average distance driven and average
power consumed for each vehicle model grouped by the year of manufacture.
• Group By: The results should be grouped by model_name and manufacture_year to get
the average statistics for each model and year.
• Ordering: Order the result by model_name and manufacture_year to maintain a logical
order in the output.
owner_id INT
);
Learnings
• Join: You need to join multiple tables on a common column (e.g., vehicle_id).
• Aggregation: Use AVG() to calculate averages for a set of values.
• Grouping: Group by columns to aggregate data at a higher level (in this case, by
model_name and manufacture_year).
• Ordering: Use ORDER BY to sort the results in a readable manner.
Solutions
PostgreSQL Solution
SELECT
v.model_name,
v.manufacture_year,
AVG(s.distance_driven) AS average_distance,
AVG(s.power_consumed) AS average_power
FROM
vehicles v
JOIN
service_data s
ON
v.vehicle_id = s.vehicle_id
GROUP BY
v.model_name,
v.manufacture_year
ORDER BY
v.model_name,
v.manufacture_year;
MySQL Solution
SELECT
v.model_name,
v.manufacture_year,
AVG(s.distance_driven) AS average_distance,
AVG(s.power_consumed) AS average_power
FROM
vehicles v
JOIN
service_data s
ON
v.vehicle_id = s.vehicle_id
GROUP BY
v.model_name,
v.manufacture_year
ORDER BY
v.model_name,
v.manufacture_year;
Output
The output will return a table summarizing the average distance driven and average power
consumed for each Tesla model in each year.
• Q.598
Question
Write an SQL query to compute the battery performance index for each test run based on the
formula:
performance_index = ABS(CHARGE - DISCHARGE) / SQRT(DAYS)
Where:
• CHARGE is the energy used to fully charge the battery (in kWh).
• DISCHARGE is the energy recovered from the battery (in kWh).
• DAYS is the runtime of the test, which is calculated as the difference between end_date
and start_date in days.
You need to round off the performance index to two decimal places.
Explanation
• Date Calculation: The number of days (DAYS) is computed by subtracting start_date
from end_date and adding 1 to account for both start and end dates.
• Performance Index Formula: The formula provided for performance_index involves
calculating the absolute difference between charge_energy and discharge_energy, and
then dividing it by the square root of the runtime in days.
• Rounding: The result of the formula is rounded to two decimal places.
VALUES
(1, 'Model S', '2021-07-31', '2021-08-05', 100, 98),
(2, 'Model S', '2021-08-10', '2021-08-12', 102, 99),
(3, 'Model 3', '2021-09-01', '2021-09-04', 105, 103),
(4, 'Model X', '2021-10-01', '2021-10-10', 110, 107),
(5, 'Model 3', '2021-11-01', '2021-11-03', 100, 95);
Learnings
• Date Difference: Use date subtraction to calculate the number of days between
start_date and end_date.
• Mathematical Operations: Use ABS() to compute the absolute value and SQRT() to
compute the square root.
• Rounding: Use ROUND() to round the result to two decimal places.
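As a cross-check of the arithmetic, the same calculation can be sketched in plain Python. This is only an illustration: the rows are copied from the sample VALUES list, and the column order (run_id, battery_model, start_date, end_date, charge_energy, discharge_energy) is an assumption, because the INSERT column list is not shown.

```python
from datetime import date
from math import sqrt

# Sample rows; the column order is assumed (see the note above).
runs = [
    (1, "Model S", date(2021, 7, 31), date(2021, 8, 5), 100, 98),
    (2, "Model S", date(2021, 8, 10), date(2021, 8, 12), 102, 99),
    (3, "Model 3", date(2021, 9, 1), date(2021, 9, 4), 105, 103),
    (4, "Model X", date(2021, 10, 1), date(2021, 10, 10), 110, 107),
    (5, "Model 3", date(2021, 11, 1), date(2021, 11, 3), 100, 95),
]


def performance_index(start, end, charge, discharge):
    """ABS(charge - discharge) / SQRT(days), rounded to two decimals."""
    days = (end - start).days + 1  # +1 counts both the start and the end date
    return round(abs(charge - discharge) / sqrt(days), 2)


indexes = {
    run_id: performance_index(start, end, charge, discharge)
    for run_id, _model, start, end, charge, discharge in runs
}

for run_id, value in indexes.items():
    print(run_id, value)
```

Running a small check like this against a handful of rows is a quick way to confirm that the SQL date arithmetic (inclusive day count) matches what the formula intends.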
Solutions
PostgreSQL Solution
SELECT
run_id,
battery_model,
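-- SQRT() returns double precision; cast to NUMERIC because ROUND(value, 2) requires NUMERIC
ROUND((ABS(charge_energy - discharge_energy) / SQRT(end_date - start_date + 1))::NUMERIC, 2) AS performance_index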
FROM
battery_runs;
MySQL Solution
SELECT
run_id,
battery_model,
ROUND(ABS(charge_energy - discharge_energy) / SQRT(DATEDIFF(end_date, start_date) + 1), 2) AS performance_index
FROM
battery_runs;
Output
The result will return the following columns:
• run_id: The unique identifier for each battery test run.
• battery_model: The model of the battery being tested (e.g., 'Model S', 'Model 3').
• performance_index: The calculated battery performance index, rounded to two decimal
places.
Example Output
1 Model S 0.82
2 Model S 1.73
3 Model 3 1.00
4 Model X 0.95
5 Model 3 2.89
This output shows the performance index for each battery model in each test run, helping to
evaluate the efficiency of the batteries.
• Q.599
Question
Write an SQL query to calculate the average selling price per Tesla car model for each year.
The result should show the year, model, and average price.
Explanation
To calculate the average selling price per model for each year:
• Extract Year: We will extract the year from the sale_date column using the EXTRACT()
function.
• Group by Year and Model: We will group the data by model_id and the extracted year
to compute the average price for each model within each year.
• Average Calculation: We will use the AVG() function to calculate the average price of
each model for the respective year.
Learnings
• Date Extraction: Use the EXTRACT() function to get specific parts of a date (in this case,
the year).
• Aggregation: Use AVG() to calculate the average value of a column.
• Grouping: Group by both the year and the car model to get the average price for each
model per year.
Solutions
PostgreSQL Solution
SELECT EXTRACT(YEAR FROM sale_date) AS year,
model_id AS model,
AVG(price) AS average_price
FROM sales
GROUP BY EXTRACT(YEAR FROM sale_date), model_id
ORDER BY year, model;
MySQL Solution
SELECT YEAR(sale_date) AS year,
model_id AS model,
AVG(price) AS average_price
FROM sales
GROUP BY year, model
ORDER BY year, model;
Output
The result will return the following columns:
• year: The year of the sale.
• model: The car model (e.g., 'ModelS', 'ModelX').
• average_price: The average selling price of that model for that year.
• Q.600
Question
Find the Second-Highest Price for Each Tesla Car Model
Write an SQL query to find the second-highest price for each Tesla car model from the Cars
table. If there is no second-highest price (e.g., only one car model in the table), return NULL
for that model.
Explanation
Use the ROW_NUMBER() or RANK() window function to rank car prices within each car model.
Then, filter the results to get the second-highest price for each model. If there is only one
price, return NULL for that car model.
Learnings
• Using window functions like ROW_NUMBER() and RANK() to rank items within each group.
• Handling cases where there may be no second-highest value by filtering based on rank.
• Partitioning results by a specific column (e.g., car model) while applying ranking.
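The points above can be tried out on a small, made-up table. The sketch below runs in Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer) and uses DENSE_RANK() together with a LEFT JOIN, which is one possible way to make single-price models come back with NULL; the rows and the price_rank alias are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cars (CarID INTEGER, ModelName TEXT, Price INTEGER)")
conn.executemany(
    "INSERT INTO Cars VALUES (?, ?, ?)",
    [
        (1, "Model S", 90000),
        (2, "Model S", 90000),     # tie for the highest price
        (3, "Model S", 85000),
        (4, "Model 3", 45000),
        (5, "Model 3", 42000),
        (6, "Cybertruck", 80000),  # only one price, so no second-highest
    ],
)

second_highest = conn.execute(
    """
    WITH RankedPrices AS (
        SELECT ModelName, Price,
               DENSE_RANK() OVER (PARTITION BY ModelName ORDER BY Price DESC) AS price_rank
        FROM Cars
    )
    SELECT m.ModelName, MAX(r.Price) AS second_highest_price
    FROM (SELECT DISTINCT ModelName FROM Cars) AS m
    LEFT JOIN RankedPrices AS r
           ON r.ModelName = m.ModelName AND r.price_rank = 2
    GROUP BY m.ModelName
    ORDER BY m.ModelName
    """
).fetchall()

for model, price in second_highest:
    print(model, price)
```

DENSE_RANK() is used here because RANK() would skip rank 2 when two cars tie for the top price, and the LEFT JOIN is what keeps a model with a single price in the result.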
Solutions
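PostgreSQL Solution
In PostgreSQL, the DENSE_RANK() window function can be used to rank the distinct prices within each model. A LEFT JOIN from the list of models keeps models that have no second-highest price and returns NULL for them.
WITH RankedPrices AS (
SELECT ModelName, Price,
DENSE_RANK() OVER (PARTITION BY ModelName ORDER BY Price DESC) AS price_rank
FROM Cars
)
SELECT m.ModelName, MAX(r.Price) AS second_highest_price
FROM (SELECT DISTINCT ModelName FROM Cars) AS m
LEFT JOIN RankedPrices r
ON r.ModelName = m.ModelName AND r.price_rank = 2
GROUP BY m.ModelName;
• DENSE_RANK() ranks the prices within each car model (PARTITION BY ModelName) without leaving gaps after ties, so rank 2 is always the second-highest distinct price. With RANK(), two cars tied at the top price would both get rank 1 and the next price would get rank 3.
• The LEFT JOIN keeps every model; a model with only one distinct price has no rank-2 row, so its second_highest_price is NULL.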
MySQL Solution
In MySQL 8.0 and later, the same approach works with the DENSE_RANK() window function. Note that RANK is a reserved word in MySQL 8.0, so the ranking column is aliased as price_rank instead of rank.
WITH RankedPrices AS (
SELECT ModelName, Price,
DENSE_RANK() OVER (PARTITION BY ModelName ORDER BY Price DESC) AS price_rank
FROM Cars
)
SELECT m.ModelName, MAX(r.Price) AS second_highest_price
FROM (SELECT DISTINCT ModelName FROM Cars) AS m
LEFT JOIN RankedPrices r
ON r.ModelName = m.ModelName AND r.price_rank = 2
GROUP BY m.ModelName;
• The logic is the same as in PostgreSQL: partition by ModelName, order by Price in descending order, and keep rank 2. Models without a second-highest price are returned with NULL.
TikTok
• Q.601
Calculate the Total Number of Likes for Each TikTok Video
Problem Statement
Write an SQL query to calculate the total number of likes for each TikTok video. The result
should return the video_id and the total_likes for each video, sorted by the video_id.
Explanation
• Aggregation: Count the total number of likes for each video.
• Grouping: Group the result by video_id to get the total likes for each individual video.
• Sorting: Sort the result by video_id in ascending order.
Solutions
PostgreSQL Solution
SELECT video_id, COUNT(like_id) AS total_likes
FROM video_likes
GROUP BY video_id
ORDER BY video_id;
MySQL Solution
SELECT video_id, COUNT(like_id) AS total_likes
FROM video_likes
GROUP BY video_id
ORDER BY video_id;
Learnings
• Aggregation: The COUNT() function is used to count the total number of likes for each
video.
• Grouping: GROUP BY helps in grouping the results by video_id so that the count is per
video.
• Sorting: ORDER BY video_id ensures the result is sorted by video IDs.
• Q.602
Find the Most Popular TikTok Hashtags
Problem Statement
Write an SQL query to find the top 5 most popular hashtags on TikTok. The result should
return the hashtag and the total_mentions, sorted by the total number of mentions in
descending order.
Explanation
• Count Mentions: Count the number of times each hashtag has been mentioned.
• Top 5: Limit the result to the top 5 most mentioned hashtags.
• Sorting: Sort by the total number of mentions in descending order to show the most
popular hashtags.
Solutions
PostgreSQL Solution
SELECT hashtag, COUNT(mention_id) AS total_mentions
FROM hashtag_mentions
GROUP BY hashtag
ORDER BY total_mentions DESC
LIMIT 5;
MySQL Solution
SELECT hashtag, COUNT(mention_id) AS total_mentions
FROM hashtag_mentions
GROUP BY hashtag
ORDER BY total_mentions DESC
LIMIT 5;
Learnings
• Counting: COUNT() is used to count how many times each hashtag has been mentioned.
• Grouping: GROUP BY groups the hashtags so we can aggregate the counts.
• Limiting: LIMIT 5 ensures that only the top 5 hashtags are returned.
• Sorting: Sorting by total_mentions in descending order to find the most popular
hashtags.
• Q.603
Identify TikTok Users with the Most Posts
Problem Statement
Write an SQL query to identify the top 3 TikTok users with the most posts. The result should
return the user_id and the total_posts, sorted by the total number of posts in descending
order.
Explanation
• Count Posts: Count the number of posts each user has made.
• Top 3: Limit the result to the top 3 users with the most posts.
• Sorting: Sort the results by the number of posts in descending order to identify the most
active users.
Solutions
PostgreSQL Solution
SELECT user_id, COUNT(post_id) AS total_posts
FROM user_posts
GROUP BY user_id
ORDER BY total_posts DESC
LIMIT 3;
MySQL Solution
SELECT user_id, COUNT(post_id) AS total_posts
FROM user_posts
GROUP BY user_id
ORDER BY total_posts DESC
LIMIT 3;
Learnings
• Counting: COUNT(post_id) counts the number of posts per user.
• Grouping: The GROUP BY user_id groups the posts by user, enabling the calculation of
total posts per user.
• Limiting: LIMIT 3 ensures the query returns the top 3 users.
• Sorting: Sorting the results by total_posts in descending order gives the users with the
most posts.
• Q.604
Write an SQL query to calculate the average duration of TikTok videos (in seconds) for each
video category. The result should return the category, the average_duration of the
videos, and the total number of likes for that category.
Explanation
• Group by Category: Group the result by category to calculate the average duration and
likes for each category.
• Aggregate Functions: Use AVG() to calculate the average video duration and SUM() to
calculate the total number of likes for each category.
• Join Tables: Join the video_details table (which includes video metadata) with the
video_likes table to count the total likes for each category.
Solutions
PostgreSQL Solution
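SELECT v.category,
AVG(v.video_duration) AS average_duration,
COALESCE(SUM(l.like_count), 0) AS total_likes
FROM video_details v
-- Count likes per video first so the join cannot repeat a video row once per like
LEFT JOIN (
SELECT video_id, COUNT(like_id) AS like_count
FROM video_likes
GROUP BY video_id
) l ON v.video_id = l.video_id
GROUP BY v.category;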
MySQL Solution
SELECT v.category,
AVG(v.video_duration) AS average_duration,
COALESCE(SUM(l.like_count), 0) AS total_likes
FROM video_details v
-- Count likes per video first so the join cannot repeat a video row once per like
LEFT JOIN (
SELECT video_id, COUNT(like_id) AS like_count
FROM video_likes
GROUP BY video_id
) l ON v.video_id = l.video_id
GROUP BY v.category;
Learnings
• Aggregating by Category: Using GROUP BY allows you to calculate the average duration
and total likes for each video category.
• Join Operations: By joining two tables, you can gather data from multiple sources (e.g.,
video metadata and likes).
• Use of Aggregate Functions: AVG() and COUNT() allow for calculating averages and totals
over groups of records.
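One thing worth checking when an average and a joined count are computed in the same query is whether the join repeats rows. The sketch below, run with Python's built-in sqlite3 module on made-up data, compares a direct join against a variant that counts likes per video before joining; table and column names follow this question, everything else is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE video_details (video_id INTEGER, category TEXT, video_duration INTEGER);
    CREATE TABLE video_likes (like_id INTEGER, video_id INTEGER);
    INSERT INTO video_details VALUES (1, 'Dance', 10), (2, 'Dance', 30);
    INSERT INTO video_likes VALUES (1, 1), (2, 1), (3, 1);  -- video 2 has no likes
    """
)

# Direct join: video 1 appears once per like and video 2 disappears entirely.
direct_join = conn.execute(
    """
    SELECT v.category, AVG(v.video_duration), COUNT(l.like_id)
    FROM video_details v
    JOIN video_likes l ON v.video_id = l.video_id
    GROUP BY v.category
    """
).fetchall()

# Pre-aggregated likes: each video contributes exactly one row to the average.
pre_aggregated = conn.execute(
    """
    SELECT v.category, AVG(v.video_duration), COALESCE(SUM(l.like_count), 0)
    FROM video_details v
    LEFT JOIN (
        SELECT video_id, COUNT(like_id) AS like_count
        FROM video_likes
        GROUP BY video_id
    ) l ON v.video_id = l.video_id
    GROUP BY v.category
    """
).fetchall()

print(direct_join)
print(pre_aggregated)
```

With this sample the like totals agree, but the direct join reports an average duration based only on the liked video, while the pre-aggregated form averages over both videos in the category.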
• Q.605
Identify the Top 3 Most Active TikTok Users in Terms of Content Creation
Problem Statement
Write an SQL query to identify the top 3 most active TikTok users based on the number of
videos they have uploaded. The result should return the user_id, total_videos_uploaded,
and the user_name (if available) sorted by the number of videos uploaded in descending
order.
Explanation
• Count Videos: Count the number of videos each user has uploaded.
• Limit the Results: Return the top 3 users with the most videos.
• Sort the Results: Sort the users by the number of videos uploaded, in descending order.
Solutions
PostgreSQL Solution
SELECT u.user_id,
COUNT(v.video_id) AS total_videos_uploaded,
u.user_name
FROM user_videos v
JOIN user_profiles u ON v.user_id = u.user_id
GROUP BY u.user_id, u.user_name
ORDER BY total_videos_uploaded DESC
LIMIT 3;
MySQL Solution
SELECT u.user_id,
COUNT(v.video_id) AS total_videos_uploaded,
u.user_name
FROM user_videos v
JOIN user_profiles u ON v.user_id = u.user_id
GROUP BY u.user_id, u.user_name
ORDER BY total_videos_uploaded DESC
LIMIT 3;
Learnings
• Counting User Contributions: COUNT() helps in identifying how many videos each user
has uploaded.
• Sorting and Limiting: Sorting by total_videos_uploaded and limiting the result helps
to identify the most active users.
• Join Operations: Joining the user_profiles and user_videos tables allows retrieving
user details alongside the count of videos.
• Q.606
Calculate the Average Number of Comments per Video Category
Problem Statement
Write an SQL query to calculate the average number of comments per video category for
TikTok videos. The result should return the category, the average_comments, and the
total_comments for each category.
Explanation
• Join Tables: Join the video_details table (containing video metadata) with the
video_comments table (containing comments for videos).
• Group by Category: Group the result by category to calculate the average number of
comments per category.
• Aggregation: Use COUNT() to get the total number of comments and AVG() to get the
average number of comments.
Solutions
PostgreSQL Solution
SELECT category,
AVG(comment_count) AS average_comments,
SUM(comment_count) AS total_comments
FROM (
-- Qualify video_id and category: both tables have video_id, and alias v only exists inside the subquery
SELECT v.video_id, v.category, COUNT(c.comment_id) AS comment_count
FROM video_details v
LEFT JOIN video_comments c ON v.video_id = c.video_id
GROUP BY v.video_id, v.category
) AS subquery
GROUP BY category;
MySQL Solution
SELECT category,
AVG(comment_count) AS average_comments,
SUM(comment_count) AS total_comments
FROM (
-- Qualify video_id and category: both tables have video_id, and alias v only exists inside the subquery
SELECT v.video_id, v.category, COUNT(c.comment_id) AS comment_count
FROM video_details v
LEFT JOIN video_comments c ON v.video_id = c.video_id
GROUP BY v.video_id, v.category
) AS subquery
GROUP BY category;
Learnings
• Left Join: Using LEFT JOIN ensures that videos without comments are still included in the
result with a count of 0.
• Aggregating Counts: COUNT() is used to get the number of comments per video, and
AVG() and SUM() help with calculating averages and totals at the category level.
• Subqueries: A subquery is used to aggregate the comment counts for each video, which is
then aggregated again by category.
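The two-level aggregation described above can be sketched on a tiny, invented data set. The example uses Python's built-in sqlite3 module as a stand-in database; the table and column names follow this question, while the rows and the per_video alias are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE video_details (video_id INTEGER, category TEXT);
    CREATE TABLE video_comments (comment_id INTEGER, video_id INTEGER);
    INSERT INTO video_details VALUES (1, 'Comedy'), (2, 'Comedy'), (3, 'Music');
    INSERT INTO video_comments VALUES (1, 1), (2, 1), (3, 1), (4, 2);  -- video 3 has none
    """
)

category_stats = conn.execute(
    """
    SELECT category,
           AVG(comment_count) AS average_comments,
           SUM(comment_count) AS total_comments
    FROM (
        -- Level 1: one row per video; LEFT JOIN keeps videos with zero comments
        SELECT v.video_id, v.category, COUNT(c.comment_id) AS comment_count
        FROM video_details v
        LEFT JOIN video_comments c ON v.video_id = c.video_id
        GROUP BY v.video_id, v.category
    ) AS per_video
    -- Level 2: roll the per-video counts up to the category
    GROUP BY category
    ORDER BY category
    """
).fetchall()

for row in category_stats:
    print(row)
```

The 'Music' category still appears, with an average and total of zero, because COUNT(c.comment_id) ignores the NULL produced by the LEFT JOIN for the uncommented video.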
• Q.607
Identify the Most Popular Time of Day for TikTok Video Uploads
Problem Statement
Write an SQL query to identify the most popular time of day (hour of the day) for video
uploads on TikTok. The result should return the hour of the day (upload_hour), the number
of videos uploaded during that hour (total_uploads), and the percentage of total uploads
that occurred during that hour.
Explanation
• Extract Hour: Use EXTRACT() or HOUR() to extract the hour from the upload_date.
• Count Uploads: Count the number of video uploads for each hour.
• Calculate Percentage: Calculate the percentage of uploads for each hour based on the
total number of uploads.
Solutions
PostgreSQL Solution
WITH hourly_uploads AS (
SELECT EXTRACT(HOUR FROM upload_date) AS upload_hour,
COUNT(video_id) AS total_uploads
FROM video_uploads
GROUP BY upload_hour
)
SELECT upload_hour,
total_uploads,
ROUND((total_uploads::DECIMAL / (SELECT COUNT(*) FROM video_uploads)) * 100, 2) AS upload_percentage
FROM hourly_uploads
ORDER BY total_uploads DESC;
MySQL Solution
WITH hourly_uploads AS (
SELECT HOUR(upload_date) AS upload_hour,
COUNT(video_id) AS total_uploads
FROM video_uploads
GROUP BY upload_hour
)
SELECT upload_hour,
total_uploads,
ROUND((total_uploads / (SELECT COUNT(*) FROM video_uploads)) * 100, 2) AS upload_percentage
FROM hourly_uploads
ORDER BY total_uploads DESC;
Learnings
• Extracting Date Components: Using EXTRACT() or HOUR() helps to break down the
timestamp into usable parts like the hour.
• Aggregation and Grouping: Aggregating the number of uploads per hour helps to
determine peak times.
• Calculating Percentages: Using a subquery to get the total number of uploads enables the
calculation of upload percentages.
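A minimal sketch of the hour-of-day breakdown, using Python's built-in sqlite3 module as a stand-in database. SQLite has no HOUR() or EXTRACT(), so strftime('%H', ...) is used instead; the four upload rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE video_uploads (video_id INTEGER, upload_date TEXT);
    INSERT INTO video_uploads VALUES
        (1, '2024-05-01 18:05:00'),
        (2, '2024-05-01 18:40:00'),
        (3, '2024-05-02 18:15:00'),
        (4, '2024-05-02 09:30:00');
    """
)

hourly = conn.execute(
    """
    WITH hourly_uploads AS (
        SELECT CAST(strftime('%H', upload_date) AS INTEGER) AS upload_hour,
               COUNT(video_id) AS total_uploads
        FROM video_uploads
        GROUP BY upload_hour
    )
    SELECT upload_hour,
           total_uploads,
           -- multiply by 100.0 so the division is not done on integers
           ROUND(total_uploads * 100.0 / (SELECT COUNT(*) FROM video_uploads), 2)
               AS upload_percentage
    FROM hourly_uploads
    ORDER BY total_uploads DESC
    """
).fetchall()

for row in hourly:
    print(row)
```

The percentages across all hours add up to 100 because the denominator is the count of every row in video_uploads.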
• Q.608
Explanation
• Track Follower Count: Use user_id to track the follower count for each influencer over
the last three months.
• Calculate Percentage Change: Calculate the percentage change in follower count
between the start and end of the 3-month period.
• Consider Date Filtering: Filter the records to only consider data within the last 3 months.
user_name VARCHAR(100),
follower_count INT,
profile_update_date DATE
);
Solutions
PostgreSQL Solution
WITH follower_changes AS (
-- DISTINCT collapses the per-update rows to one row per user
SELECT DISTINCT user_id, user_name,
FIRST_VALUE(follower_count) OVER (PARTITION BY user_id ORDER BY profile_update_date) AS start_follower_count,
LAST_VALUE(follower_count) OVER (PARTITION BY user_id ORDER BY profile_update_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS end_follower_count
FROM user_profiles
WHERE profile_update_date BETWEEN '2024-01-01' AND '2024-03-31'
)
SELECT user_id, user_name, start_follower_count, end_follower_count,
ROUND(((end_follower_count - start_follower_count)::DECIMAL / NULLIF(start_follower_count, 0)) * 100, 2) AS percentage_change
FROM follower_changes;
MySQL Solution
WITH follower_changes AS (
-- DISTINCT collapses the per-update rows to one row per user
SELECT DISTINCT user_id, user_name,
FIRST_VALUE(follower_count) OVER (PARTITION BY user_id ORDER BY profile_update_date) AS start_follower_count,
LAST_VALUE(follower_count) OVER (PARTITION BY user_id ORDER BY profile_update_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS end_follower_count
FROM user_profiles
WHERE profile_update_date BETWEEN '2024-01-01' AND '2024-03-31'
)
SELECT user_id, user_name, start_follower_count, end_follower_count,
ROUND(((end_follower_count - start_follower_count) / NULLIF(start_follower_count, 0)) * 100, 2) AS percentage_change
FROM follower_changes;
Learnings
• Date Filtering: By using the WHERE clause to filter the records based on date ranges, you
can track changes over a specific period.
• Window Functions: The use of FIRST_VALUE() and LAST_VALUE() allows you to fetch
the start and end follower counts for each user.
• Percentage Calculation: The percentage change formula helps in understanding the
growth of each influencer’s followers.
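The explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING frame on LAST_VALUE() matters because the default frame stops at the current row. The sketch below shows the difference using Python's built-in sqlite3 module (SQLite 3.25 or newer); the three profile rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_profiles (user_id INTEGER, follower_count INTEGER, profile_update_date TEXT);
    INSERT INTO user_profiles VALUES
        (1, 1000, '2024-01-05'),
        (1, 1200, '2024-02-10'),
        (1, 1500, '2024-03-20');
    """
)

frames = conn.execute(
    """
    SELECT profile_update_date,
           -- default frame: ends at the current row, so this just echoes each row
           LAST_VALUE(follower_count) OVER (
               PARTITION BY user_id ORDER BY profile_update_date
           ) AS default_frame,
           -- full frame: sees the whole partition, so this is the true last value
           LAST_VALUE(follower_count) OVER (
               PARTITION BY user_id ORDER BY profile_update_date
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS full_frame
    FROM user_profiles
    ORDER BY profile_update_date
    """
).fetchall()

for row in frames:
    print(row)
```

FIRST_VALUE() does not need the extended frame, since the first row of the partition is always inside the default frame.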
• Q.609
Identify the Most Popular Hashtags for TikTok Videos in 2024
Problem Statement
Write an SQL query to identify the most popular hashtags used in TikTok videos in 2024.
The query should return the hashtag, the number of times it was used, and the percentage of
total hashtag uses for that year.
Explanation
• Count Hashtags: Count the number of times each hashtag appears in the videos uploaded
in 2024.
• Calculate Percentage: Calculate the percentage of total hashtag usage for each hashtag.
• Filter by Year: Ensure that only hashtags used in 2024 are considered.
PostgreSQL Solution
WITH hashtag_counts AS (
SELECT h.hashtag, COUNT(h.hashtag_id) AS hashtag_count
FROM video_hashtags h
JOIN video_details v ON h.video_id = v.video_id
WHERE EXTRACT(YEAR FROM v.upload_date) = 2024
GROUP BY h.hashtag
)
SELECT hashtag, hashtag_count,
-- Divide by the 2024 total only, not by every row in video_hashtags
ROUND((hashtag_count::DECIMAL / (SELECT SUM(hashtag_count) FROM hashtag_counts)) * 100, 2) AS percentage
FROM hashtag_counts
ORDER BY hashtag_count DESC;
MySQL Solution
WITH hashtag_counts AS (
SELECT h.hashtag, COUNT(h.hashtag_id) AS hashtag_count
FROM video_hashtags h
JOIN video_details v ON h.video_id = v.video_id
WHERE YEAR(v.upload_date) = 2024
GROUP BY h.hashtag
)
SELECT hashtag, hashtag_count,
-- Divide by the 2024 total only, not by every row in video_hashtags
ROUND((hashtag_count / (SELECT SUM(hashtag_count) FROM hashtag_counts)) * 100, 2) AS percentage
FROM hashtag_counts
ORDER BY hashtag_count DESC;
Learnings
• Using JOIN: Joining video_hashtags and video_details allows us to count hashtags
from videos uploaded in a specific year.
• Date Filtering: The EXTRACT(YEAR FROM ...) or YEAR() function is used to ensure we
only consider videos uploaded in 2024.
• Percentage Calculation: The percentage gives an idea of how popular a particular hashtag
was relative to all hashtags.
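Since the question asks for the share of hashtag uses "for that year", the denominator of the percentage should cover the same rows as the numerator. The sketch below, run with Python's built-in sqlite3 module on invented rows, sums the filtered counts to get that denominator; strftime('%Y', ...) stands in for YEAR()/EXTRACT().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE video_details (video_id INTEGER, upload_date TEXT);
    CREATE TABLE video_hashtags (hashtag_id INTEGER, video_id INTEGER, hashtag TEXT);
    INSERT INTO video_details VALUES (1, '2024-02-01'), (2, '2024-07-15'), (3, '2023-11-30');
    INSERT INTO video_hashtags VALUES
        (1, 1, '#dance'), (2, 2, '#dance'), (3, 2, '#food'),
        (4, 3, '#dance');  -- attached to a 2023 video, must not count toward 2024
    """
)

shares = conn.execute(
    """
    WITH hashtag_counts AS (
        SELECT h.hashtag, COUNT(h.hashtag_id) AS hashtag_count
        FROM video_hashtags h
        JOIN video_details v ON h.video_id = v.video_id
        WHERE strftime('%Y', v.upload_date) = '2024'
        GROUP BY h.hashtag
    )
    SELECT hashtag, hashtag_count,
           -- denominator: total of the 2024 counts only
           ROUND(hashtag_count * 100.0 / (SELECT SUM(hashtag_count) FROM hashtag_counts), 2)
               AS percentage
    FROM hashtag_counts
    ORDER BY hashtag_count DESC
    """
).fetchall()

for row in shares:
    print(row)
```

If the denominator were COUNT(*) over the whole video_hashtags table, the 2023 row would be included and the 2024 percentages would no longer add up to 100.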
• Q.610
Analyze Video Engagement by Week
Problem Statement
Write an SQL query to analyze the total number of views and likes for TikTok videos
uploaded each week in 2024. The query should return the week number (week_of_year), the
total views, total likes, and the average views per like for each week.
Explanation
• Extract Week Information: Use EXTRACT(WEEK FROM ...) or WEEK() to extract the
week number of the year from the video upload date.
• Aggregate Data: Use SUM() to calculate total views and likes for each week.
• Calculate Views per Like: Calculate the average views per like by dividing the total
views by the total likes.
Solutions
PostgreSQL Solution
SELECT EXTRACT(WEEK FROM upload_date) AS week_of_year,
SUM(views) AS total_views,
SUM(likes) AS total_likes,
ROUND(SUM(views)::DECIMAL / NULLIF(SUM(likes), 0), 2) AS views_per_like
FROM video_details
WHERE EXTRACT(YEAR FROM upload_date) = 2024
GROUP BY week_of_year
ORDER BY week_of_year;
MySQL Solution
SELECT WEEK(upload_date) AS week_of_year,
SUM(views) AS total_views,
SUM(likes) AS total_likes,
ROUND(SUM(views) / NULLIF(SUM(likes), 0), 2) AS views_per_like
FROM video_details
WHERE YEAR(upload_date) = 2024
GROUP BY week_of_year
ORDER BY week_of_year;
Learnings
• Aggregating Data by Week: Using EXTRACT(WEEK FROM ...) or WEEK() helps group
data by week, allowing analysis of trends over time.
• Handling Division by Zero: The NULLIF function ensures that division by zero does not
occur when calculating views per like.
• Date Filtering: The WHERE clause ensures that only videos uploaded in 2024 are included
in the result.
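The NULLIF guard can be tried on a couple of made-up weeks. The sketch uses Python's built-in sqlite3 module and a hypothetical precomputed upload_week column so that week numbering does not distract from the point. One caveat: SQLite already returns NULL for division by zero, so this shows the NULL result of the guard rather than the division-by-zero error that PostgreSQL would raise without it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- upload_week is a hypothetical, already-extracted week number
    CREATE TABLE video_details (upload_week INTEGER, views INTEGER, likes INTEGER);
    INSERT INTO video_details VALUES (1, 500, 50), (1, 300, 30), (2, 120, 0);
    """
)

weekly = conn.execute(
    """
    SELECT upload_week,
           SUM(views) AS total_views,
           SUM(likes) AS total_likes,
           -- NULLIF turns a zero like total into NULL, so the ratio becomes NULL
           ROUND(SUM(views) * 1.0 / NULLIF(SUM(likes), 0), 2) AS views_per_like
    FROM video_details
    GROUP BY upload_week
    ORDER BY upload_week
    """
).fetchall()

for row in weekly:
    print(row)
```

A NULL ratio for a week with no likes is usually easier to handle downstream than a failed query, and it can be replaced with COALESCE if a default value is preferred.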
• Q.611
Identify Top Performing TikTok Influencers by Total Likes
Problem Statement
Write an SQL query to identify the top 5 TikTok influencers based on the total number of
likes received on their videos in 2024. The query should return the influencer's user_id,
user_name, and the total likes they have received, sorted by total likes in descending order.
Explanation
• Sum of Likes: Use SUM(likes) to calculate the total likes for each influencer.
• Sort and Limit: Sort the results by total likes in descending order and return only the top 5
influencers.
Solutions
PostgreSQL Solution
SELECT i.user_id, i.user_name, SUM(v.likes) AS total_likes
FROM influencer_profiles i
JOIN video_details v ON i.user_id = v.user_id
WHERE EXTRACT(YEAR FROM v.upload_date) = 2024
GROUP BY i.user_id, i.user_name
ORDER BY total_likes DESC
LIMIT 5;
MySQL Solution
SELECT i.user_id, i.user_name, SUM(v.likes) AS total_likes
FROM influencer_profiles i
JOIN video_details v ON i.user_id = v.user_id
WHERE YEAR(v.upload_date) = 2024
GROUP BY i.user_id, i.user_name
ORDER BY total_likes DESC
LIMIT 5;
Learnings
• JOIN Operation: Joining the influencer_profiles table with the video_details table
allows you to aggregate likes by influencer.
• LIMIT: Using LIMIT 5 ensures that only the top 5 influencers are returned.
• Date Filtering: The WHERE clause filters the data to include only videos uploaded in 2024.
• Q.612
Explanation
• Group by Genre: Group the data by video genre to aggregate total views for each genre.
• Calculate Percentage: Calculate the percentage of total views each genre represents.
• Filter by Year: Only consider videos uploaded in 2024.
PostgreSQL Solution
WITH genre_views AS (
SELECT video_genre, SUM(views) AS total_views
FROM video_details
WHERE EXTRACT(YEAR FROM upload_date) = 2024
GROUP BY video_genre
)
SELECT video_genre, total_views,
ROUND((total_views::DECIMAL / (SELECT SUM(views) FROM video_details WHERE EXTRACT(YEAR FROM upload_date) = 2024)) * 100, 2) AS percentage
FROM genre_views
ORDER BY total_views DESC;
MySQL Solution
WITH genre_views AS (
SELECT video_genre, SUM(views) AS total_views
FROM video_details
WHERE YEAR(upload_date) = 2024
GROUP BY video_genre
)
SELECT video_genre, total_views,
ROUND((total_views / (SELECT SUM(views) FROM video_details WHERE YEAR(upload_date) = 2024)) * 100, 2) AS percentage
FROM genre_views
ORDER BY total_views DESC;
Learnings
• Aggregating Data by Genre: Grouping by video genre allows for easy calculation of total
views per genre.
• Percentage Calculation: Calculating the percentage of views per genre provides insight
into which genres are the most popular.
• Using WITH Clause: The WITH clause simplifies the logic by creating intermediate views to
calculate total views by genre.
• Q.613
Calculate the Retention Rate of Users After Watching Ads
Problem Statement
Write an SQL query to calculate the retention rate of users who watched an ad campaign in
2024. The retention rate is defined as the percentage of users who performed an action (like
liking, commenting, or sharing a video) after interacting with an ad.
• A user is considered to have "retained" if they performed any action (like, comment, or
share) on a video after viewing an ad.
• You need to calculate the retention rate for each ad campaign in 2024.
Explanation
• Data Sources: You have two tables: ad_interactions (which records when a user clicks
on an ad) and user_actions (which records when a user performs an action like liking,
commenting, or sharing a video).
• Joins: You'll need to join these tables on user_id and calculate the retention rate for each
ad campaign.
• Retention Calculation: Retention rate is calculated as COUNT(DISTINCT
retained_users) / COUNT(DISTINCT total_users) * 100.
Solutions
PostgreSQL Solution
WITH retained_users AS (
SELECT DISTINCT ai.user_id, ai.ad_campaign
FROM ad_interactions ai
JOIN user_actions ua
ON ai.user_id = ua.user_id
WHERE ai.interaction_date <= ua.action_date
AND EXTRACT(YEAR FROM ai.interaction_date) = 2024
),
total_users AS (
SELECT DISTINCT user_id, ad_campaign
FROM ad_interactions
WHERE EXTRACT(YEAR FROM interaction_date) = 2024
)
SELECT tu.ad_campaign,
ROUND(COUNT(DISTINCT ru.user_id) * 100.0 / COUNT(DISTINCT tu.user_id), 2) AS retention_rate
FROM total_users tu
-- LEFT JOIN keeps users who were not retained, so they still count in the denominator
LEFT JOIN retained_users ru ON tu.user_id = ru.user_id AND tu.ad_campaign = ru.ad_campaign
GROUP BY tu.ad_campaign;
MySQL Solution
WITH retained_users AS (
SELECT DISTINCT ai.user_id, ai.ad_campaign
FROM ad_interactions ai
JOIN user_actions ua
ON ai.user_id = ua.user_id
WHERE ai.interaction_date <= ua.action_date
AND YEAR(ai.interaction_date) = 2024
),
total_users AS (
SELECT DISTINCT user_id, ad_campaign
FROM ad_interactions
WHERE YEAR(interaction_date) = 2024
)
SELECT tu.ad_campaign,
ROUND(COUNT(DISTINCT ru.user_id) * 100.0 / COUNT(DISTINCT tu.user_id), 2) AS retention_rate
FROM total_users tu
-- LEFT JOIN keeps users who were not retained, so they still count in the denominator
LEFT JOIN retained_users ru ON tu.user_id = ru.user_id AND tu.ad_campaign = ru.ad_campaign
GROUP BY tu.ad_campaign;
Learnings
• Complex Joins: Joining on multiple conditions and using subqueries to calculate retention
based on specific date constraints.
• Data Aggregation: Using COUNT(DISTINCT ...) to count unique users for each ad
campaign.
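The retention calculation can be sketched end to end on invented data. The example runs in Python's built-in sqlite3 module, with strftime('%Y', ...) standing in for YEAR()/EXTRACT(); table and column names follow this question, the rows are hypothetical, and the final step uses a LEFT JOIN so that users who never acted after the ad still count toward the denominator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ad_interactions (user_id INTEGER, ad_campaign TEXT, interaction_date TEXT);
    CREATE TABLE user_actions (user_id INTEGER, action_date TEXT);
    INSERT INTO ad_interactions VALUES
        (1, 'Spring', '2024-03-01'),
        (2, 'Spring', '2024-03-02'),
        (3, 'Spring', '2024-03-03'),
        (4, 'Summer', '2024-06-01');
    INSERT INTO user_actions VALUES
        (1, '2024-03-05'),   -- acted after the ad: retained
        (2, '2024-02-01');   -- acted before the ad: not retained
    """
)

retention = conn.execute(
    """
    WITH retained_users AS (
        SELECT DISTINCT ai.user_id, ai.ad_campaign
        FROM ad_interactions ai
        JOIN user_actions ua ON ai.user_id = ua.user_id
        WHERE ai.interaction_date <= ua.action_date
          AND strftime('%Y', ai.interaction_date) = '2024'
    ),
    total_users AS (
        SELECT DISTINCT user_id, ad_campaign
        FROM ad_interactions
        WHERE strftime('%Y', interaction_date) = '2024'
    )
    SELECT tu.ad_campaign,
           ROUND(COUNT(DISTINCT ru.user_id) * 100.0 / COUNT(DISTINCT tu.user_id), 2)
               AS retention_rate
    FROM total_users tu
    LEFT JOIN retained_users ru
           ON tu.user_id = ru.user_id AND tu.ad_campaign = ru.ad_campaign
    GROUP BY tu.ad_campaign
    ORDER BY tu.ad_campaign
    """
).fetchall()

for row in retention:
    print(row)
```

With an inner join in the last step, only retained users would survive the join, so every campaign that appears would report 100 percent and campaigns with no retained users would vanish from the result.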
Explanation
• Multiple Actions: Actions can be likes, comments, or shares, and need to be aggregated.
• Group by Genre: Calculate the total actions and the number of unique users per genre.
• Filter by Year: Only include data from 2024.
Solutions
PostgreSQL Solution
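SELECT vd.video_genre,
COUNT(ua.action_id) AS total_actions,
COUNT(DISTINCT ua.user_id) AS unique_users
FROM video_details vd
JOIN user_actions ua ON vd.video_id = ua.video_id
WHERE EXTRACT(YEAR FROM ua.action_date) = 2024
GROUP BY vd.video_genre
ORDER BY total_actions DESC;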
MySQL Solution
SELECT vd.video_genre,
COUNT(ua.action_id) AS total_actions,
COUNT(DISTINCT ua.user_id) AS unique_users
FROM video_details vd
JOIN user_actions ua ON vd.video_id = ua.video_id
WHERE YEAR(ua.action_date) = 2024
GROUP BY vd.video_genre
ORDER BY total_actions DESC;
Learnings
• Multi-Table Join: Combining data from both the video_details and user_actions
tables.
• Complex Aggregation: Calculating both the total number of actions and the number of
unique users for each genre.
• Grouping and Sorting: Grouping data by video genre and sorting by total actions to
identify the most engaging genres.
• Q.615
Identify Top 3 Users with the Highest Video Engagement Rate
Problem Statement
Write an SQL query to identify the top 3 users with the highest video engagement rate in
2024. The engagement rate is defined as the total number of actions (likes, comments, and
shares) divided by the total number of videos uploaded by the user.
• Only consider actions performed on videos uploaded in 2024.
• Sort the users by engagement rate in descending order.
Explanation
• Engagement Rate: For each user, calculate the total number of actions on their videos and
divide it by the number of videos they uploaded.
• Ranking: Sort the users by their engagement rate and return the top 3.
action_date DATETIME
);
Solutions
PostgreSQL Solution
WITH user_engagement AS (
SELECT vd.user_id,
COUNT(ua.action_id) AS total_actions,
COUNT(DISTINCT vd.video_id) AS total_videos
FROM video_details vd
LEFT JOIN user_actions ua ON vd.video_id = ua.video_id
WHERE EXTRACT(YEAR FROM vd.upload_date) = 2024
GROUP BY vd.user_id
)
SELECT user_id,
total_actions * 1.0 / NULLIF(total_videos, 0) AS engagement_rate
FROM user_engagement
ORDER BY engagement_rate DESC
LIMIT 3;
MySQL Solution
WITH user_engagement AS (
SELECT vd.user_id,
COUNT(ua.action_id) AS total_actions,
COUNT(DISTINCT vd.video_id) AS total_videos
FROM video_details vd
LEFT JOIN user_actions ua ON vd.video_id = ua.video_id
WHERE YEAR(vd.upload_date) = 2024
GROUP BY vd.user_id
)
SELECT user_id,
total_actions / NULLIF(total_videos, 0) AS engagement_rate
FROM user_engagement
ORDER BY engagement_rate DESC
LIMIT 3;
Learnings
• NULLIF: NULLIF(total_videos, 0) returns NULL when the video count is 0, so the division
yields NULL instead of raising a division-by-zero error.
• Engagement Rate Calculation: Understanding how to compute an engagement metric by
dividing total actions by the number of videos uploaded.
• Ranking: Sorting the results to get the top users based on engagement.
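One detail worth testing is the division itself: dividing one integer count by another truncates in PostgreSQL, while MySQL's / operator returns a decimal. The sketch below, run through Python's sqlite3 module (SQLite also truncates integer division), shows both forms side by side. The sample rows are invented, and the int_rate column exists only for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE video_details (video_id INT, user_id INT, upload_date TEXT);
CREATE TABLE user_actions (action_id INT, video_id INT);
INSERT INTO video_details VALUES
  (10, 1, '2024-03-01'), (11, 1, '2024-04-01'),
  (20, 2, '2024-05-01');
INSERT INTO user_actions VALUES
  (100, 10), (101, 10), (102, 11),   -- three actions on videos of user 1
  (103, 20);                         -- one action on the video of user 2
""")

query = """
WITH user_engagement AS (
  SELECT vd.user_id,
         COUNT(ua.action_id)         AS total_actions,
         COUNT(DISTINCT vd.video_id) AS total_videos
  FROM video_details vd
  LEFT JOIN user_actions ua ON vd.video_id = ua.video_id
  WHERE strftime('%Y', vd.upload_date) = '2024'
  GROUP BY vd.user_id
)
SELECT user_id,
       total_actions / NULLIF(total_videos, 0)       AS int_rate,
       total_actions * 1.0 / NULLIF(total_videos, 0) AS engagement_rate
FROM user_engagement
ORDER BY engagement_rate DESC
LIMIT 3;
"""
rows = conn.execute(query).fetchall()
print(rows)  # (user_id, truncated rate, fractional rate)
```

User 1 has 3 actions across 2 videos: the integer form reports 1, while multiplying by 1.0 first gives 1.5.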
• Q.616
Problem Statement
Write an SQL query to calculate the average watch time per user for each video genre in
2024. Watch time is recorded in seconds, and you need to calculate it for each genre, grouped
by the user's interactions with videos of that genre. Only include data for videos uploaded in
2024.
Explanation
• Data Sources: You have two tables: video_details (which holds information about
videos, including the genre and upload date) and video_interactions (which records the
watch time for each user).
• Watch Time Aggregation: For each genre, calculate the average watch time per user.
• Filter by Year: Only include interactions with videos uploaded in 2024.
Solutions
PostgreSQL Solution
SELECT vd.video_genre,
AVG(vi.watch_time_seconds) AS avg_watch_time
FROM video_details vd
JOIN video_interactions vi ON vd.video_id = vi.video_id
WHERE EXTRACT(YEAR FROM vi.interaction_date) = 2024
AND EXTRACT(YEAR FROM vd.upload_date) = 2024
GROUP BY vd.video_genre
ORDER BY avg_watch_time DESC;
MySQL Solution
SELECT vd.video_genre,
AVG(vi.watch_time_seconds) AS avg_watch_time
FROM video_details vd
JOIN video_interactions vi ON vd.video_id = vi.video_id
WHERE YEAR(vi.interaction_date) = 2024
AND YEAR(vd.upload_date) = 2024
GROUP BY vd.video_genre
ORDER BY avg_watch_time DESC;
Learnings
• JOIN with Aggregation: Combining data from the video_details and
video_interactions tables to calculate an average.
• Time-based Filtering: Using EXTRACT() or YEAR() to filter data by year.
• Aggregation: Calculating average watch time per genre.
• Q.617
Problem Statement
Write an SQL query to identify users who have shared videos more than 3 times within a
single week in 2024. The query should return the user_id, share_count, and the week
number for the share activity.
Explanation
• Data Sources: You have the user_actions table, which tracks user activities (like, share,
comment).
• Week Number Calculation: Use WEEK() in MySQL or EXTRACT(WEEK FROM ...) in
PostgreSQL to calculate the week number of the year.
• Group by Week and User: Count how many shares each user has performed in each week
and filter for users who shared more than 3 times.
Solutions
PostgreSQL Solution
SELECT user_id,
COUNT(*) AS share_count,
EXTRACT(WEEK FROM action_date) AS week_number
FROM user_actions
WHERE action_type = 'share'
AND EXTRACT(YEAR FROM action_date) = 2024
GROUP BY user_id, week_number
HAVING COUNT(*) > 3
ORDER BY share_count DESC;
MySQL Solution
SELECT user_id,
COUNT(*) AS share_count,
WEEK(action_date) AS week_number
FROM user_actions
WHERE action_type = 'share'
AND YEAR(action_date) = 2024
GROUP BY user_id, week_number
HAVING COUNT(*) > 3
ORDER BY share_count DESC;
Learnings
• Filtering Specific Actions: Using WHERE to focus only on 'share' actions.
• Week Calculation: Using EXTRACT(WEEK) or WEEK() to group by week numbers.
• HAVING: Applying filters after the aggregation to only include users with more than 3
shares in a week.
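The filter-then-group-then-HAVING pattern can be tried out with a short sketch in Python's sqlite3 module. SQLite has neither WEEK() nor EXTRACT, so strftime('%W', ...) is used here as an assumed stand-in for the week number; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_actions (action_id INT, user_id INT, action_type TEXT, action_date TEXT);
INSERT INTO user_actions VALUES
  (1, 1, 'share', '2024-01-08'), (2, 1, 'share', '2024-01-09'),
  (3, 1, 'share', '2024-01-10'), (4, 1, 'share', '2024-01-11'),
  (5, 1, 'like',  '2024-01-11'),
  (6, 2, 'share', '2024-01-08'), (7, 2, 'share', '2024-01-09'),
  (8, 2, 'share', '2024-01-10'),
  (9, 3, 'share', '2024-01-13'), (10, 3, 'share', '2024-01-14'),
  (11, 3, 'share', '2024-01-15'), (12, 3, 'share', '2024-01-16');
""")

query = """
SELECT user_id,
       COUNT(*) AS share_count,
       strftime('%W', action_date) AS week_number
FROM user_actions
WHERE action_type = 'share'
  AND strftime('%Y', action_date) = '2024'
GROUP BY user_id, week_number
HAVING COUNT(*) > 3
ORDER BY share_count DESC;
"""
rows = conn.execute(query).fetchall()
print(rows)  # (user_id, share_count, week_number) for qualifying weeks
```

Only user 1 qualifies: user 2 has three shares in the week, and user 3 has four shares split across two weeks. Week numbering conventions differ between engines, so the week_number values themselves should not be compared across databases.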
• Q.618
Problem Statement
Write an SQL query to identify the top 3 most commented TikTok videos in 2024. The query
should return the video ID, the number of comments, and the video genre. Sort the results by
the number of comments in descending order.
Explanation
• Data Sources: You have two tables: video_details (which includes the genre and video
ID) and user_actions (which tracks the comments).
• Grouping by Video: You need to count how many comments each video has in 2024 and
sort them.
• Filter by Year: Only consider comments made in 2024.
action_date DATETIME
);
Solutions
PostgreSQL Solution
SELECT vd.video_id,
COUNT(ua.action_id) AS comment_count,
vd.video_genre
FROM user_actions ua
JOIN video_details vd ON ua.video_id = vd.video_id
WHERE ua.action_type = 'comment'
AND EXTRACT(YEAR FROM ua.action_date) = 2024
GROUP BY vd.video_id, vd.video_genre
ORDER BY comment_count DESC
LIMIT 3;
MySQL Solution
SELECT vd.video_id,
COUNT(ua.action_id) AS comment_count,
vd.video_genre
FROM user_actions ua
JOIN video_details vd ON ua.video_id = vd.video_id
WHERE ua.action_type = 'comment'
AND YEAR(ua.action_date) = 2024
GROUP BY vd.video_id, vd.video_genre
ORDER BY comment_count DESC
LIMIT 3;
Learnings
• JOIN: Combining two tables (user_actions and video_details) to get the genre and
comment counts.
• Filtering Actions: Using WHERE to focus only on 'comment' actions.
• Top N Results: Using LIMIT to get the top 3 videos based on the comment count.
• Q.619
Problem Statement
You are tasked with identifying the user IDs of those who did not confirm their sign-up on
the first day but confirmed their sign-up on the second day after signing up.
Explanation
• Non-Confirmation on Day 1: We need to check if a user did not confirm on the same day
as their signup_date (i.e., where signup_action = 'Not Confirmed' and action_date is
the same as signup_date).
• Confirmation on Day 2: After a user has failed to confirm on Day 1, we need to check if
they confirmed on the next day, which is one day after their signup_date (i.e., where
signup_action = 'Confirmed' and action_date is exactly one day after the
signup_date).
Query Solution
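The goal is to identify users who did not confirm on the signup date but confirmed the
next day. The query uses MySQL's DATE_ADD; in PostgreSQL, the equivalent expression is
e.signup_date + INTERVAL '1 day'.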
SELECT e.user_id
FROM emails e
JOIN texts t1 ON e.email_id = t1.email_id
JOIN texts t2 ON e.email_id = t2.email_id
WHERE t1.signup_action = 'Not Confirmed'
AND t2.signup_action = 'Confirmed'
AND t1.action_date = e.signup_date
AND t2.action_date = DATE_ADD(e.signup_date, INTERVAL 1 DAY);
• Join the emails table with the texts table twice (as t1 and t2) to get both the "Not
Confirmed" and "Confirmed" actions.
• Condition 1: Ensure that on t1, the action is "Not Confirmed" and the action_date
matches the signup_date from the emails table.
• Condition 2: Ensure that on t2, the action is "Confirmed" and the action_date is exactly
one day after the signup_date.
• Final Output: The query returns the user_id of those who confirmed their account on the
second day after signing up.
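The double join described above can be exercised with a small sketch in Python's sqlite3 module. The emails and texts column layouts and the sample rows are assumptions for illustration, and SQLite's date(..., '+1 day') stands in for DATE_ADD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emails (email_id INT, user_id INT, signup_date TEXT);
CREATE TABLE texts (text_id INT, email_id INT, signup_action TEXT, action_date TEXT);
INSERT INTO emails VALUES
  (1, 1, '2024-06-01'),
  (2, 2, '2024-06-01'),
  (3, 3, '2024-06-30');
INSERT INTO texts VALUES
  (10, 1, 'Not Confirmed', '2024-06-01'),
  (11, 1, 'Confirmed',     '2024-06-02'),   -- confirmed on day 2
  (12, 2, 'Confirmed',     '2024-06-01'),   -- confirmed on day 1
  (13, 3, 'Not Confirmed', '2024-06-30'),
  (14, 3, 'Confirmed',     '2024-07-02');   -- confirmed two days later
""")

query = """
SELECT DISTINCT e.user_id
FROM emails e
JOIN texts t1 ON e.email_id = t1.email_id
JOIN texts t2 ON e.email_id = t2.email_id
WHERE t1.signup_action = 'Not Confirmed'
  AND t2.signup_action = 'Confirmed'
  AND t1.action_date = e.signup_date
  AND t2.action_date = date(e.signup_date, '+1 day')
ORDER BY e.user_id;
"""
rows = conn.execute(query).fetchall()
print(rows)  # user_ids that confirmed exactly one day after signup
```

Only user 1 satisfies both conditions in this sample.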
• Q.620
Analyzing User Behavior and Content Interactions on TikTok
Problem Statement
You are tasked with identifying the top 5 users who have uploaded videos that have received
the most likes on TikTok. The output should display the following:
• User ID
• The total number of videos uploaded by the user
• The total number of likes received by the videos uploaded by the user
Tables:
• Users Table:
• This table contains information about the TikTok users.
CREATE TABLE Users (
user_id INT PRIMARY KEY,
username VARCHAR(100),
country VARCHAR(100),
join_date DATE
);
• Videos Table:
• This table contains information about the videos uploaded by users, including the number
of likes each video has received.
CREATE TABLE Videos (
video_id INT PRIMARY KEY,
upload_date DATE,
user_id INT,
video_likes INT
);
Sample Data:
-- Sample data for Users
INSERT INTO Users (user_id, username, country, join_date)
VALUES
(1, 'user1', 'USA', '2021-01-01'),
(2, 'user2', 'Canada', '2021-02-01'),
(3, 'user3', 'UK', '2021-01-31'),
(4, 'user4', 'USA', '2021-01-30'),
(5, 'user5', 'Canada', '2021-01-15');
Requirements
• Join the Users and Videos tables on user_id to link each video to its respective user.
• Aggregate the data by user to calculate the total number of videos uploaded and the total
number of likes their videos received.
• Sort the result by the total likes in descending order to get the top users.
• Limit the result to the top 5 users.
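The requirements above translate into a single join with aggregation. The following is one possible sketch, not necessarily the intended solution, run through Python's sqlite3 module. The Users and Videos definitions follow the schemas above; the sixth user and all Videos rows are hypothetical, added so that LIMIT 5 has something to cut off.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (user_id INT PRIMARY KEY, username VARCHAR(100),
                    country VARCHAR(100), join_date DATE);
CREATE TABLE Videos (video_id INT PRIMARY KEY, upload_date DATE,
                     user_id INT, video_likes INT);
INSERT INTO Users VALUES
  (1, 'user1', 'USA', '2021-01-01'), (2, 'user2', 'Canada', '2021-02-01'),
  (3, 'user3', 'UK', '2021-01-31'),  (4, 'user4', 'USA', '2021-01-30'),
  (5, 'user5', 'Canada', '2021-01-15'),
  (6, 'user6', 'UK', '2021-03-01');          -- hypothetical extra user
INSERT INTO Videos VALUES                     -- hypothetical video rows
  (1, '2024-01-01', 1, 100), (2, '2024-01-02', 1, 50),
  (3, '2024-01-03', 2, 300), (4, '2024-01-04', 3, 20),
  (5, '2024-01-05', 4, 10),  (6, '2024-01-06', 5, 5),
  (7, '2024-01-07', 6, 1);
""")

query = """
SELECT u.user_id,
       COUNT(v.video_id)  AS total_videos,
       SUM(v.video_likes) AS total_likes
FROM Users u
JOIN Videos v ON u.user_id = v.user_id
GROUP BY u.user_id
ORDER BY total_likes DESC
LIMIT 5;
"""
rows = conn.execute(query).fetchall()
print(rows)  # (user_id, total_videos, total_likes), most-liked first
```

The same statement should run unchanged on PostgreSQL and MySQL, since it uses only JOIN, GROUP BY, ORDER BY, and LIMIT.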
Apple
• Q.621
Identify Top Customers by Total Purchase Amount in a Given Year
Problem Statement:
Apple needs to track its top customers based on the total purchase amount in a given year.
You need to write a SQL query that returns the top 5 customers who made the highest total
purchases in 2023. The output should show the customer ID and the total amount they spent.
Requirements:
• Filter the data for the year 2023 using YEAR(purchase_date).
• Sum the total amount spent by each customer during the year.
• Sort the results by total purchase amount in descending order.
• Limit the output to the top 5 customers.
Requirements:
• Group by product category to calculate the average purchase amount for each category.
• Calculate the average using the AVG() function.
• Filter for the year 2023 using YEAR(purchase_date).
Requirements:
• Group by customer_id to count the number of orders placed by each customer.
• Use the COUNT() function to calculate the number of orders.
• Show customer_id and order count.
Question
Apple has a trade-in program where customers can return their old iPhone device and receive
a trade-in payout in cash. For each store, write a query to calculate the total revenue from the
trade-in payouts. Order the result by total revenue in descending order.
Explanation
To solve this, you need to join the trade_in_transactions table with the
trade_in_payouts table using the model_id column. Then, for each store, calculate the
total trade-in revenue by multiplying the number of transactions for each model by its
respective payout amount. Finally, order the results by total revenue in descending order.
Datasets
-- Inserting data into trade_in_transactions
INSERT INTO trade_in_transactions (transaction_id, model_id, store_id, transaction_date)
VALUES
(1, 112, 512, '2022-01-01'),
(2, 113, 512, '2022-01-01');
Learnings
• Using JOIN to combine data from multiple tables based on common columns (model_id).
• Aggregating data with COUNT or SUM for calculating total revenues.
• Ordering results with ORDER BY in descending order to get the stores with the highest
trade-in payouts first.
Solutions
PostgreSQL Solution
SELECT t.store_id, SUM(p.payout_amount) AS total_revenue
FROM trade_in_transactions t
JOIN trade_in_payouts p ON t.model_id = p.model_id
GROUP BY t.store_id
ORDER BY total_revenue DESC;
MySQL Solution
SELECT t.store_id, SUM(p.payout_amount) AS total_revenue
FROM trade_in_transactions t
JOIN trade_in_payouts p ON t.model_id = p.model_id
GROUP BY t.store_id
ORDER BY total_revenue DESC;
• Q.625
Question
Write a query to determine the percentage of buyers who bought AirPods directly after they
bought iPhones, with no intermediate purchases in between. Round the answer to a whole
percentage (e.g., 20 for 20%, 50 for 50%).
Explanation
To solve this, you need to:
• Identify customers who bought iPhones and later bought AirPods.
• Ensure no intermediate purchases (e.g., iPads, etc.) occurred between buying an iPhone
and AirPods.
• Calculate the percentage of customers who bought AirPods after iPhones relative to the
total number of customers who bought iPhones.
Datasets
-- Inserting data into transactions
INSERT INTO transactions (transaction_id, customer_id, product_name, transaction_timestamp)
VALUES
(1, 101, 'iPhone', '2022-08-08 00:00:00'),
(2, 101, 'AirPods', '2022-08-08 00:00:00'),
(5, 301, 'iPhone', '2022-09-05 00:00:00'),
(6, 301, 'iPad', '2022-09-06 00:00:00'),
(7, 301, 'AirPods', '2022-09-07 00:00:00');
Learnings
• Using JOIN or SELF JOIN to track sequences of events for each customer.
• Filtering with conditions on timestamps to ensure correct order of purchases.
• Calculating percentage by dividing the count of desired events by the total and multiplying
by 100 for percentage.
Solutions
PostgreSQL Solution
WITH iPhone_buyers AS (
SELECT DISTINCT customer_id
FROM transactions
WHERE product_name = 'iPhone'
),
airpod_followers AS (
SELECT DISTINCT t1.customer_id
FROM transactions t1
JOIN transactions t2
ON t1.customer_id = t2.customer_id
AND t1.product_name = 'iPhone'
AND t2.product_name = 'AirPods'
AND t1.transaction_timestamp < t2.transaction_timestamp
WHERE NOT EXISTS (
SELECT 1
FROM transactions t3
WHERE t3.customer_id = t1.customer_id
AND t3.transaction_timestamp > t1.transaction_timestamp
AND t3.transaction_timestamp < t2.transaction_timestamp
AND t3.product_name != 'AirPods'
)
)
SELECT ROUND(100.0 * COUNT(DISTINCT af.customer_id) / COUNT(DISTINCT ib.customer_id)) AS percentage
FROM iPhone_buyers ib
LEFT JOIN airpod_followers af
ON ib.customer_id = af.customer_id;
MySQL Solution
WITH iPhone_buyers AS (
SELECT DISTINCT customer_id
FROM transactions
WHERE product_name = 'iPhone'
),
airpod_followers AS (
SELECT DISTINCT t1.customer_id
FROM transactions t1
JOIN transactions t2
ON t1.customer_id = t2.customer_id
AND t1.product_name = 'iPhone'
AND t2.product_name = 'AirPods'
AND t1.transaction_timestamp < t2.transaction_timestamp
WHERE NOT EXISTS (
SELECT 1
FROM transactions t3
WHERE t3.customer_id = t1.customer_id
AND t3.transaction_timestamp > t1.transaction_timestamp
AND t3.transaction_timestamp < t2.transaction_timestamp
AND t3.product_name != 'AirPods'
)
)
SELECT ROUND(100 * COUNT(DISTINCT af.customer_id) / COUNT(DISTINCT ib.customer_id)) AS percentage
FROM iPhone_buyers ib
LEFT JOIN airpod_followers af
ON ib.customer_id = af.customer_id;
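To see how the NOT EXISTS check rejects intermediate purchases, the query can be run against a handful of rows with Python's sqlite3 module. The rows below are invented for illustration and differ from the dataset above: customer 101 buys AirPods right after an iPhone, 201 buys only an iPhone, 301 buys an iPad in between, and 401 never buys an iPhone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (transaction_id INT, customer_id INT,
                           product_name TEXT, transaction_timestamp TEXT);
INSERT INTO transactions VALUES
  (1, 101, 'iPhone',  '2022-08-08 00:00:00'),
  (2, 101, 'AirPods', '2022-08-09 00:00:00'),
  (3, 201, 'iPhone',  '2022-08-10 00:00:00'),
  (4, 301, 'iPhone',  '2022-09-05 00:00:00'),
  (5, 301, 'iPad',    '2022-09-06 00:00:00'),
  (6, 301, 'AirPods', '2022-09-07 00:00:00'),
  (7, 401, 'iPad',    '2022-09-08 00:00:00');
""")

query = """
WITH iPhone_buyers AS (
  SELECT DISTINCT customer_id
  FROM transactions
  WHERE product_name = 'iPhone'
),
airpod_followers AS (
  SELECT DISTINCT t1.customer_id
  FROM transactions t1
  JOIN transactions t2
    ON t1.customer_id = t2.customer_id
   AND t1.product_name = 'iPhone'
   AND t2.product_name = 'AirPods'
   AND t1.transaction_timestamp < t2.transaction_timestamp
  WHERE NOT EXISTS (
    SELECT 1
    FROM transactions t3
    WHERE t3.customer_id = t1.customer_id
      AND t3.transaction_timestamp > t1.transaction_timestamp
      AND t3.transaction_timestamp < t2.transaction_timestamp
      AND t3.product_name != 'AirPods'
  )
)
SELECT ROUND(100.0 * COUNT(DISTINCT af.customer_id)
             / COUNT(DISTINCT ib.customer_id)) AS percentage
FROM iPhone_buyers ib
LEFT JOIN airpod_followers af ON ib.customer_id = af.customer_id;
"""
rows = conn.execute(query).fetchall()
print(rows)  # single row holding the rounded percentage
```

One of the three iPhone buyers qualifies, so the result rounds to 33. Because the comparison is a strict <, an iPhone and AirPods bought at the same timestamp would not count as a sequence.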
• Q.626
Question
Write a SQL query to calculate the monthly average rating for each Apple product based on
reviews submitted by users. The review table contains the following columns: review_id,
user_id, submit_date, product_id, and stars. For the purposes of this problem, assume
that the product_id corresponds to an Apple product.
Explanation
To solve this problem, you need to:
• Extract the month and year from the submit_date to group reviews by month.
• Calculate the average stars for each product within each month.
• Group by the extracted month and product to get the average rating for each Apple
product.
• Order the results by month and product.
Datasets
-- Inserting sample data into reviews
INSERT INTO reviews (review_id, user_id, submit_date, product_id, stars)
VALUES
(6171, 123, '2022-06-08 00:00:00', 50001, 4),
(7802, 265, '2022-06-10 00:00:00', 69852, 4),
(5293, 362, '2022-06-18 00:00:00', 50001, 3),
(6352, 192, '2022-07-26 00:00:00', 69852, 3),
(4517, 981, '2022-07-05 00:00:00', 69852, 2);
Learnings
• Using EXTRACT() or DATE_TRUNC() to isolate parts of a date (month, year).
• Aggregating data using AVG() to compute average ratings.
• Grouping data using GROUP BY and sorting results with ORDER BY.
Solutions
PostgreSQL Solution
SELECT EXTRACT(MONTH FROM submit_date) AS mth,
product_id AS product,
AVG(stars) AS avg_stars
FROM reviews
GROUP BY mth, product
ORDER BY mth, product;
MySQL Solution
SELECT MONTH(submit_date) AS mth,
product_id AS product,
AVG(stars) AS avg_stars
FROM reviews
GROUP BY mth, product
ORDER BY mth, product;
• Q.627
Question
Write a SQL query to compute the average quantity of each product sold per month for the
year 2021. You are given two tables: products and sales. The products table contains
information about Apple products, and the sales table contains data about product sales,
including quantity sold and the sale date.
Explanation
To solve this problem:
• Join the products and sales tables on the product_id column.
• Filter the sales for the year 2021 using the YEAR() function.
• Extract the month from the date_of_sale to group the sales by month.
• Calculate the average quantity sold for each product per month.
• Group by the extracted month and the product name to get the monthly average sales.
• Order the results by month and product.
Datasets
-- Inserting data into products table
INSERT INTO products (product_id, product_name)
VALUES
(1, 'iPhone 12'),
(2, 'Apple Watch'),
(3, 'MacBook Pro');
Learnings
• Using JOIN to combine data from multiple tables based on a common column
(product_id).
• Using YEAR() to filter data for a specific year.
• Using MONTH() to extract the month from a date for grouping purposes.
• Using AVG() to calculate the average of a numeric column.
• Grouping and ordering results with GROUP BY and ORDER BY.
Solutions
PostgreSQL Solution
SELECT EXTRACT(MONTH FROM s.date_of_sale) AS "Month",
p.product_name,
AVG(s.quantity_sold) AS "Average_Sold"
FROM sales s
JOIN products p ON s.product_id = p.product_id
WHERE EXTRACT(YEAR FROM s.date_of_sale) = 2021
GROUP BY EXTRACT(MONTH FROM s.date_of_sale), p.product_name
ORDER BY "Month", p.product_name;
MySQL Solution
SELECT MONTH(s.date_of_sale) AS `Month`,
p.product_name,
AVG(s.quantity_sold) AS `Average_Sold`
FROM sales s
JOIN products p ON s.product_id = p.product_id
WHERE YEAR(s.date_of_sale) = 2021
GROUP BY MONTH(s.date_of_sale), p.product_name
ORDER BY `Month`, p.product_name;
• Q.628
Question
Write a SQL query to calculate the Add-to-Bag Conversion Rate for each product in the
Apple Store. The conversion rate is defined as the number of users who add a product to their
shopping bag (cart) after clicking on the product listing, divided by the total number of clicks
for that product. The result should be broken down by product_id.
Explanation
To calculate the conversion rate:
• Join the clicks table with the bag_adds table on product_id and user_id, ensuring that
only users who clicked on a product and added it to the bag are considered.
• Count the total clicks for each product and the total number of successful adds-to-bag
(where add_id is not null).
• Calculate the conversion rate as the ratio of adds-to-bag to total clicks for each product.
• Group the result by product_id to get the conversion rate for each product.
Datasets
Learnings
• LEFT JOIN is used to ensure we include all clicks, even if there was no add-to-bag
action.
• Conditional aggregation (CASE WHEN) helps count only those records where an add
occurred.
• Aggregation with GROUP BY allows us to compute the conversion rate per product.
Solutions
PostgreSQL Solution
SELECT
c.product_id,
SUM(CASE WHEN a.add_id IS NOT NULL THEN 1 ELSE 0 END) * 1.0 / COUNT(c.click_id) AS conversion_rate
FROM
clicks c
LEFT JOIN bag_adds a ON a.product_id = c.product_id AND a.user_id = c.user_id
GROUP BY c.product_id;
MySQL Solution
SELECT
c.product_id,
SUM(CASE WHEN a.add_id IS NOT NULL THEN 1 ELSE 0 END) / COUNT(c.click_id) AS conversion_rate
FROM
clicks c
LEFT JOIN bag_adds a ON a.product_id = c.product_id AND a.user_id = c.user_id
GROUP BY c.product_id;
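A quick way to confirm that the LEFT JOIN keeps clicks with no matching add is to run the query on a few invented rows. The sketch below uses Python's sqlite3 module; the column layouts of clicks and bag_adds are assumptions based on the columns the query references, and the count is multiplied by 1.0 so the ratio is fractional.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clicks (click_id INT, product_id INT, user_id INT);
CREATE TABLE bag_adds (add_id INT, product_id INT, user_id INT);
INSERT INTO clicks VALUES
  (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4),   -- four clicks on product 1
  (5, 2, 1), (6, 2, 2);                         -- two clicks on product 2
INSERT INTO bag_adds VALUES
  (1, 1, 1),                                    -- one add for product 1
  (2, 2, 1), (3, 2, 2);                         -- two adds for product 2
""")

query = """
SELECT c.product_id,
       SUM(CASE WHEN a.add_id IS NOT NULL THEN 1 ELSE 0 END) * 1.0
         / COUNT(c.click_id) AS conversion_rate
FROM clicks c
LEFT JOIN bag_adds a
  ON a.product_id = c.product_id AND a.user_id = c.user_id
GROUP BY c.product_id
ORDER BY c.product_id;
"""
rows = conn.execute(query).fetchall()
print(rows)  # (product_id, conversion_rate)
```

Product 1 converts 1 of 4 clicks and product 2 converts 2 of 2. With an inner join instead, the three unconverted clicks on product 1 would disappear and its rate would be reported as 1.0.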
• Q.629
Question
Write a SQL query to find all users who have more than one type of device (e.g., both an
iPhone and a MacBook) and are using more than 50GB of total iCloud storage across all their
devices. The result should include the UserID, UserName, total number of devices, and total
storage used. Order the results by the total storage used in descending order.
Explanation
To solve this:
• Join the Users, Devices, and StorageUsage tables:
• Users table contains user details.
• Devices table contains device details, including the type of device.
• StorageUsage table contains the amount of iCloud storage used for each device.
• Count the distinct device types for each user to check if they have more than one type of
device.
• Sum the total storage used for each user across all their devices.
• Filter results using HAVING to ensure:
• The user has more than one device type.
• The total storage usage is greater than 50GB.
• Group the results by UserID and UserName to get the summary for each user.
• Order the results by total storage used in descending order.
Datasets
-- Inserting data into Users table
INSERT INTO Users (UserID, UserName, Email, Country)
VALUES
(1, 'John Doe', '[email protected]', 'USA'),
(2, 'Jane Smith', '[email protected]', 'Canada'),
(3, 'Alice Johnson', '[email protected]', 'UK');
Learnings
• JOIN: Combining data from multiple tables based on shared keys (UserID and DeviceID).
• GROUP BY: Grouping data by UserID and UserName to aggregate information at the user
level.
• HAVING: Filtering the grouped results to ensure users have multiple device types and
exceed the storage usage threshold.
• COUNT(DISTINCT): Counting distinct device types to ensure the user has more than one
type of device.
• SUM(): Summing the total storage used by each user across their devices.
Solutions
PostgreSQL Solution
SELECT
u.UserID,
u.UserName,
COUNT(DISTINCT d.DeviceType) AS TotalDevices,
SUM(s.StorageUsed) AS TotalStorageUsed
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
JOIN
StorageUsage s ON d.DeviceID = s.DeviceID
GROUP BY
u.UserID,
u.UserName
HAVING
COUNT(DISTINCT d.DeviceType) > 1
AND SUM(s.StorageUsed) > 50
ORDER BY
TotalStorageUsed DESC;
MySQL Solution
SELECT
u.UserID,
u.UserName,
COUNT(DISTINCT d.DeviceType) AS TotalDevices,
SUM(s.StorageUsed) AS TotalStorageUsed
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
JOIN
StorageUsage s ON d.DeviceID = s.DeviceID
GROUP BY
u.UserID,
u.UserName
HAVING
COUNT(DISTINCT d.DeviceType) > 1
AND SUM(s.StorageUsed) > 50
ORDER BY
TotalStorageUsed DESC;
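The two HAVING conditions can be checked independently with a small sketch in Python's sqlite3 module. The Devices and StorageUsage layouts and all device and storage rows are assumptions for illustration: one user passes both conditions, one fails the storage threshold, and one fails the device-type condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (UserID INT, UserName TEXT);
CREATE TABLE Devices (DeviceID INT, UserID INT, DeviceType TEXT);
CREATE TABLE StorageUsage (DeviceID INT, StorageUsed INT);
INSERT INTO Users VALUES (1, 'John Doe'), (2, 'Jane Smith'), (3, 'Alice Johnson');
INSERT INTO Devices VALUES
  (1, 1, 'iPhone'), (2, 1, 'MacBook'),   -- two types
  (3, 2, 'iPhone'), (4, 2, 'iPad'),      -- two types
  (5, 3, 'iPhone'), (6, 3, 'iPhone');    -- one type, two devices
INSERT INTO StorageUsage VALUES
  (1, 30), (2, 40),    -- user 1 total: 70 GB
  (3, 20), (4, 10),    -- user 2 total: 30 GB
  (5, 60), (6, 50);    -- user 3 total: 110 GB
""")

query = """
SELECT u.UserID,
       u.UserName,
       COUNT(DISTINCT d.DeviceType) AS TotalDevices,
       SUM(s.StorageUsed)           AS TotalStorageUsed
FROM Users u
JOIN Devices d      ON u.UserID = d.UserID
JOIN StorageUsage s ON d.DeviceID = s.DeviceID
GROUP BY u.UserID, u.UserName
HAVING COUNT(DISTINCT d.DeviceType) > 1
   AND SUM(s.StorageUsed) > 50
ORDER BY TotalStorageUsed DESC;
"""
rows = conn.execute(query).fetchall()
print(rows)  # users with several device types and more than 50 GB in total
```

Note that TotalDevices here counts distinct device types, matching the query above; counting physical devices would need COUNT(DISTINCT d.DeviceID) instead.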
• Q.630
Device Upgrade Frequency
Write a SQL query to calculate the average number of months between each user's device
purchases. Only consider users who have more than one device. The result should include the
UserID, UserName, and the average number of months between their device purchases. Order
the results by the average number of months in descending order.
Explanation
To solve this:
• Join the Users and Devices tables to get device purchase information for each user.
• Filter to include only users who have more than one device.
• Calculate the months between consecutive device purchases for each user.
• Aggregate the results to calculate the average number of months between device purchases
for each user.
• Group by UserID and UserName and order by the average number of months in
descending order.
Datasets and SQL Schemas
-- Creating Users table
CREATE TABLE Users (
UserID INT,
UserName VARCHAR(100),
Email VARCHAR(100),
Country VARCHAR(100)
);
Datasets
-- Inserting data into Users table
INSERT INTO Users (UserID, UserName, Email, Country)
VALUES
(1, 'John Doe', '[email protected]', 'USA'),
(2, 'Jane Smith', '[email protected]', 'Canada'),
(3, 'Alice Johnson', '[email protected]', 'UK');
Solutions
PostgreSQL Solution
-- LAG pairs each purchase with the same user's previous purchase
WITH purchase_gaps AS (
SELECT
u.UserID,
u.UserName,
d.PurchaseDate,
LAG(d.PurchaseDate) OVER (PARTITION BY u.UserID ORDER BY d.PurchaseDate) AS PrevPurchaseDate
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
)
SELECT
UserID,
UserName,
AVG(EXTRACT(YEAR FROM AGE(PurchaseDate, PrevPurchaseDate)) * 12
+ EXTRACT(MONTH FROM AGE(PurchaseDate, PrevPurchaseDate))) AS avg_months_between_purchases
FROM
purchase_gaps
WHERE
PrevPurchaseDate IS NOT NULL
GROUP BY
UserID, UserName
ORDER BY
avg_months_between_purchases DESC;
MySQL Solution
-- LAG pairs each purchase with the same user's previous purchase (MySQL 8.0+)
WITH purchase_gaps AS (
SELECT
u.UserID,
u.UserName,
d.PurchaseDate,
LAG(d.PurchaseDate) OVER (PARTITION BY u.UserID ORDER BY d.PurchaseDate) AS PrevPurchaseDate
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
)
SELECT
UserID,
UserName,
AVG(TIMESTAMPDIFF(MONTH, PrevPurchaseDate, PurchaseDate)) AS avg_months_between_purchases
FROM
purchase_gaps
WHERE
PrevPurchaseDate IS NOT NULL
GROUP BY
UserID, UserName
ORDER BY
avg_months_between_purchases DESC;
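The "consecutive purchases" step from the explanation can be sketched with the LAG window function, which pairs each purchase with the same user's previous one. The example below runs on Python's sqlite3 module (window functions need SQLite 3.25 or later). The Devices layout and rows are assumptions, and the month gap is approximated as a calendar-month difference that ignores the day of the month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (UserID INT, UserName TEXT);
CREATE TABLE Devices (DeviceID INT, UserID INT, PurchaseDate TEXT);
INSERT INTO Users VALUES (1, 'John Doe'), (2, 'Jane Smith'), (3, 'Alice Johnson');
INSERT INTO Devices VALUES
  (1, 1, '2021-01-10'), (2, 1, '2021-07-10'), (3, 1, '2022-07-10'),
  (4, 2, '2022-01-01'), (5, 2, '2022-04-01'),
  (6, 3, '2022-06-15');                      -- single device: no gap to average
""")

query = """
WITH purchase_gaps AS (
  SELECT u.UserID, u.UserName, d.PurchaseDate,
         LAG(d.PurchaseDate) OVER (
           PARTITION BY u.UserID ORDER BY d.PurchaseDate
         ) AS PrevPurchaseDate
  FROM Users u
  JOIN Devices d ON u.UserID = d.UserID
)
SELECT UserID, UserName,
       AVG((CAST(strftime('%Y', PurchaseDate) AS INTEGER)
            - CAST(strftime('%Y', PrevPurchaseDate) AS INTEGER)) * 12
         + (CAST(strftime('%m', PurchaseDate) AS INTEGER)
            - CAST(strftime('%m', PrevPurchaseDate) AS INTEGER))) AS avg_months
FROM purchase_gaps
WHERE PrevPurchaseDate IS NOT NULL   -- drops each user's first purchase
GROUP BY UserID, UserName
ORDER BY avg_months DESC;
"""
rows = conn.execute(query).fetchall()
print(rows)  # (UserID, UserName, average months between purchases)
```

John Doe's gaps are 6 and 12 months, so his average is 9; Jane Smith has a single 3-month gap; Alice Johnson owns one device and is excluded because her only row has no previous purchase.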
• Q.631
Device Types per User
Write a SQL query to calculate the number of distinct device types each user owns and filter
the results to show only those users who have purchased at least three different device types.
The output should include UserID, UserName, and the total number of distinct device types
they own, ordered by the total device types in descending order.
Explanation
To solve this:
• Join the Users and Devices tables to get the device type for each user.
• Count the distinct device types for each user using COUNT(DISTINCT).
• Filter users who have purchased at least three different device types.
• Group by UserID and UserName and order by the number of distinct device types in
descending order.
Datasets and SQL Schemas
-- Creating Users table
CREATE TABLE Users (
UserID INT,
UserName VARCHAR(100),
Email VARCHAR(100),
Country VARCHAR(100)
);
Datasets
-- Inserting data into Users table
INSERT INTO Users (UserID, UserName, Email, Country)
VALUES
(1, 'John Doe', '[email protected]', 'USA'),
(2, 'Jane Smith', '[email protected]', 'Canada'),
(3, 'Alice Johnson', '[email protected]', 'UK');
Solutions
PostgreSQL Solution
SELECT
u.UserID,
u.UserName,
COUNT(DISTINCT d.DeviceType) AS TotalDeviceTypes
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
GROUP BY
u.UserID, u.UserName
HAVING
COUNT(DISTINCT d.DeviceType) >= 3
ORDER BY
TotalDeviceTypes DESC;
MySQL Solution
SELECT
u.UserID,
u.UserName,
COUNT(DISTINCT d.DeviceType) AS TotalDeviceTypes
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
GROUP BY
u.UserID, u.UserName
HAVING
COUNT(DISTINCT d.DeviceType) >= 3
ORDER BY
TotalDeviceTypes DESC;
• Q.632
Total Device Storage per User
Write a SQL query to calculate the total iCloud storage used by each user across all their
devices. Only include users who have more than one device and filter for users who are using
more than 100GB of total storage. The result should include UserID, UserName, and
TotalStorageUsed, ordered by TotalStorageUsed in descending order.
Explanation
To solve this:
• Join the Users, Devices, and StorageUsage tables.
• Sum the total storage used for each user across all their devices.
• Filter to include only users who have more than one device and are using more than
100GB of storage.
• Group by UserID and UserName.
• Order the results by TotalStorageUsed in descending order.
Datasets and SQL Schemas
Datasets
-- Inserting data into Users table
INSERT INTO Users (UserID, UserName, Email, Country)
VALUES
(1, 'John Doe', '[email protected]', 'USA'),
(2, 'Jane Smith', '[email protected]', 'Canada'),
(3, 'Alice Johnson', '[email protected]', 'UK');
Solutions
PostgreSQL Solution
SELECT
u.UserID,
u.UserName,
SUM(s.StorageUsed) AS TotalStorageUsed
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
JOIN
StorageUsage s ON d.DeviceID = s.DeviceID
GROUP BY
u.UserID, u.UserName
HAVING
COUNT(d.DeviceID) > 1
AND SUM(s.StorageUsed) > 100
ORDER BY
TotalStorageUsed DESC;
MySQL Solution
SELECT
u.UserID,
u.UserName,
SUM(s.StorageUsed) AS TotalStorageUsed
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
JOIN
StorageUsage s ON d.DeviceID = s.DeviceID
GROUP BY
u.UserID, u.UserName
HAVING
COUNT(d.DeviceID) > 1
AND SUM(s.StorageUsed) > 100
ORDER BY
TotalStorageUsed DESC;
• Q.633
Most Popular Device in a Given Month
Write a SQL query to find the most popular device purchased in each month of 2022. The
popularity of a device is determined by the number of purchases (i.e., the total quantity sold)
in that month. The result should include the Month, DeviceType, and the
TotalQuantitySold, ordered by the Month and TotalQuantitySold in descending order.
Explanation
To solve this:
• Join the Devices and Sales tables to get the device type and the quantity sold for each
sale.
• Extract the month and year from the SaleDate to group the data by month.
• Sum the QuantitySold for each device in each month.
• Group the results by month and device type, and order by the Month and
TotalQuantitySold in descending order to show the most popular devices.
Datasets
-- Inserting data into Devices table
INSERT INTO Devices (DeviceID, DeviceType, PurchaseDate)
VALUES
(1, 'iPhone', '2021-01-10'),
(2, 'MacBook', '2022-05-20'),
(3, 'iPad', '2022-06-15'),
(4, 'AirPods', '2022-06-20'),
(5, 'Apple Watch', '2022-07-05');
Solutions
PostgreSQL Solution
SELECT
EXTRACT(MONTH FROM s.SaleDate) AS Month,
d.DeviceType,
SUM(s.QuantitySold) AS TotalQuantitySold
FROM
Sales s
JOIN
Devices d ON s.DeviceID = d.DeviceID
WHERE
EXTRACT(YEAR FROM s.SaleDate) = 2022
GROUP BY
Month, d.DeviceType
ORDER BY
Month, TotalQuantitySold DESC;
MySQL Solution
SELECT
MONTH(s.SaleDate) AS Month,
d.DeviceType,
SUM(s.QuantitySold) AS TotalQuantitySold
FROM
Sales s
JOIN
Devices d ON s.DeviceID = d.DeviceID
WHERE
YEAR(s.SaleDate) = 2022
GROUP BY
Month, d.DeviceType
ORDER BY
Month, TotalQuantitySold DESC;
• Q.634
Users with the Most Devices
Write a SQL query to find the top 3 users who own the most devices. The result should
include UserID, UserName, and the TotalDevices, ordered by TotalDevices in descending
order.
Explanation
To solve this:
• Join the Users and Devices tables to get device ownership details for each user.
• Count the total number of devices for each user.
• Order the result by the number of devices in descending order.
• Limit the result to show only the top 3 users with the most devices.
Datasets and SQL Schemas
-- Creating Users table
CREATE TABLE Users (
UserID INT,
UserName VARCHAR(100),
Email VARCHAR(100)
);
Datasets
-- Inserting data into Users table
INSERT INTO Users (UserID, UserName, Email)
VALUES
(1, 'John Doe', '[email protected]'),
(2, 'Jane Smith', '[email protected]'),
(3, 'Alice Johnson', '[email protected]'),
(4, 'Robert Brown', '[email protected]');
Solutions
PostgreSQL Solution
SELECT
u.UserID,
u.UserName,
COUNT(d.DeviceID) AS TotalDevices
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
GROUP BY
u.UserID, u.UserName
ORDER BY
TotalDevices DESC
LIMIT 3;
MySQL Solution
SELECT
u.UserID,
u.UserName,
COUNT(d.DeviceID) AS TotalDevices
FROM
Users u
JOIN
Devices d ON u.UserID = d.UserID
GROUP BY
u.UserID, u.UserName
ORDER BY
TotalDevices DESC
LIMIT 3;
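The inner join above only counts users who own at least one device. As a minimal sketch, the following uses Python's built-in sqlite3 module (SQLite syntax, not PostgreSQL or MySQL) to compare it with a LEFT JOIN. The Users rows come from this question; the Devices rows and their UserID column are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (UserID INT, UserName VARCHAR(100), Email VARCHAR(100));
CREATE TABLE Devices (DeviceID INT, UserID INT, DeviceType VARCHAR(50));
INSERT INTO Users (UserID, UserName) VALUES
  (1, 'John Doe'), (2, 'Jane Smith'), (3, 'Alice Johnson'), (4, 'Robert Brown');
-- Hypothetical ownership rows: user 4 deliberately owns nothing.
INSERT INTO Devices (DeviceID, UserID, DeviceType) VALUES
  (1, 1, 'iPhone'), (2, 1, 'iPad'), (3, 1, 'MacBook'),
  (4, 2, 'iPhone'), (5, 2, 'MacBook'),
  (6, 3, 'iPad');
""")

# Same aggregation with a pluggable join type; UserID breaks ties deterministically.
COUNT_QUERY = """
SELECT u.UserID, u.UserName, COUNT(d.DeviceID) AS TotalDevices
FROM Users u
{join} Devices d ON u.UserID = d.UserID
GROUP BY u.UserID, u.UserName
ORDER BY TotalDevices DESC, u.UserID
"""

inner_rows = conn.execute(COUNT_QUERY.format(join="JOIN")).fetchall()
left_rows = conn.execute(COUNT_QUERY.format(join="LEFT JOIN")).fetchall()
top3 = conn.execute(COUNT_QUERY.format(join="LEFT JOIN") + "LIMIT 3").fetchall()

print(top3)
print(left_rows[-1])  # the user with no devices appears with a count of 0
```

COUNT(d.DeviceID) skips the NULL produced by the LEFT JOIN, so a user without devices gets 0 rather than 1. For a top-3 list the two join types usually agree; the difference matters when fewer than three users own devices.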
• Q.635
Average Device Age by Device Type
Write a SQL query to calculate the average age of each device type as of January 1, 2023.
The result should include the DeviceType and the AverageDeviceAge, ordered by the
DeviceType.
Explanation
To solve this:
• Read the Devices table; no join is needed because it already holds DeviceType and PurchaseDate.
• Calculate the age of each device by subtracting the PurchaseDate from the fixed date
(January 1, 2023).
• Group the results by DeviceType.
• Calculate the average age for each device type.
Datasets and SQL Schemas
-- Creating Devices table
CREATE TABLE Devices (
DeviceID INT,
DeviceType VARCHAR(50),
PurchaseDate DATE
);
Datasets
-- Inserting data into Devices table
INSERT INTO Devices (DeviceID, DeviceType, PurchaseDate)
VALUES
(1, 'iPhone', '2021-01-10'),
(2, 'MacBook', '2022-05-20'),
(3, 'iPad', '2021-02-15'),
(4, 'AirPods', '2022-06-25'),
(5, 'Apple Watch', '2021-03-25');
Solutions
PostgreSQL Solution
SELECT
DeviceType,
AVG(EXTRACT(YEAR FROM AGE(DATE '2023-01-01', PurchaseDate))) AS AverageDeviceAge
FROM
Devices
GROUP BY
DeviceType
ORDER BY
DeviceType;
MySQL Solution
SELECT
DeviceType,
AVG(TIMESTAMPDIFF(YEAR, PurchaseDate, '2023-01-01')) AS AverageDeviceAge
FROM
Devices
GROUP BY
DeviceType
ORDER BY
DeviceType;
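To check the age arithmetic against the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has neither AGE() nor TIMESTAMPDIFF(), so the "completed years" logic is spelled out by hand; that expression is an illustration of the idea, not the PostgreSQL or MySQL syntax.

```python
import sqlite3

REFERENCE_DATE = "2023-01-01"  # the fixed "as of" date from the question

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Devices (DeviceID INT, DeviceType VARCHAR(50), PurchaseDate DATE);
INSERT INTO Devices (DeviceID, DeviceType, PurchaseDate) VALUES
  (1, 'iPhone', '2021-01-10'),
  (2, 'MacBook', '2022-05-20'),
  (3, 'iPad', '2021-02-15'),
  (4, 'AirPods', '2022-06-25'),
  (5, 'Apple Watch', '2021-03-25');
""")

# Completed years = year difference, minus 1 if the anniversary in the
# reference year has not been reached yet (the comparison yields 0 or 1).
AGE_QUERY = """
SELECT DeviceType,
       AVG(
         CAST(strftime('%Y', :ref) AS INTEGER)
         - CAST(strftime('%Y', PurchaseDate) AS INTEGER)
         - (strftime('%m-%d', :ref) < strftime('%m-%d', PurchaseDate))
       ) AS AverageDeviceAge
FROM Devices
GROUP BY DeviceType
ORDER BY DeviceType
"""

ages = conn.execute(AGE_QUERY, {"ref": REFERENCE_DATE}).fetchall()
for device_type, avg_age in ages:
    print(device_type, avg_age)
```

Counting completed years truncates heavily: a device bought on 2022-05-20 reports an age of 0. If fractional years are wanted, averaging the difference in days and dividing by 365.25 is a common alternative.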
• Q.636
Most Profitable Device by Region
Write a SQL query to find the most profitable device by region. Profitability is determined by
the total revenue generated from sales of each device, where revenue is calculated by
multiplying the quantity sold by the sale price. The result should include Region,
Datasets
-- Inserting data into Devices table
INSERT INTO Devices (DeviceID, DeviceType, SalePrice)
VALUES
(1, 'iPhone', 999.99),
(2, 'MacBook', 1999.99),
(3, 'iPad', 799.99),
(4, 'AirPods', 249.99),
(5, 'Apple Watch', 399.99);
Solutions
PostgreSQL Solution
SELECT
r.Region,
d.DeviceType,
SUM(s.QuantitySold * d.SalePrice) AS TotalRevenue
FROM
Sales s
JOIN
Devices d ON s.DeviceID = d.DeviceID
JOIN
Regions r ON s.SaleID = r.SaleID
GROUP BY
r.Region, d.DeviceType
ORDER BY
r.Region, TotalRevenue DESC;
MySQL Solution
SELECT
r.Region,
d.DeviceType,
SUM(s.QuantitySold * d.SalePrice) AS TotalRevenue
FROM
Sales s
JOIN
Devices d ON s.DeviceID = d.DeviceID
JOIN
Regions r ON s.SaleID = r.SaleID
GROUP BY
r.Region, d.DeviceType
ORDER BY
r.Region, TotalRevenue DESC;
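The queries above list every device's revenue within each region. If only the single most profitable device per region is wanted, one option is to rank the totals with a window function, which PostgreSQL, MySQL 8+ and SQLite 3.25+ all support. The sketch below runs on Python's built-in sqlite3 module. The Devices prices come from this question; the Sales and Regions tables and all of their rows are hypothetical, shaped only to match the join conditions used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Devices (DeviceID INT, DeviceType VARCHAR(50), SalePrice DECIMAL(10, 2));
CREATE TABLE Sales (SaleID INT, DeviceID INT, QuantitySold INT);   -- assumed schema
CREATE TABLE Regions (SaleID INT, Region VARCHAR(50));             -- assumed schema
INSERT INTO Devices VALUES
  (1, 'iPhone', 999.99), (2, 'MacBook', 1999.99), (3, 'iPad', 799.99),
  (4, 'AirPods', 249.99), (5, 'Apple Watch', 399.99);
-- Hypothetical sales, invented for this example only.
INSERT INTO Sales VALUES (1, 1, 2), (2, 2, 1), (3, 3, 3), (4, 4, 4);
INSERT INTO Regions VALUES (1, 'North'), (2, 'North'), (3, 'South'), (4, 'South');
""")

TOP_DEVICE_QUERY = """
WITH revenue AS (
  SELECT r.Region, d.DeviceType,
         SUM(s.QuantitySold * d.SalePrice) AS TotalRevenue
  FROM Sales s
  JOIN Devices d ON s.DeviceID = d.DeviceID
  JOIN Regions r ON s.SaleID = r.SaleID
  GROUP BY r.Region, d.DeviceType
),
ranked AS (
  SELECT Region, DeviceType, TotalRevenue,
         RANK() OVER (PARTITION BY Region ORDER BY TotalRevenue DESC) AS rnk
  FROM revenue
)
SELECT Region, DeviceType, TotalRevenue
FROM ranked
WHERE rnk = 1
ORDER BY Region
"""

top_by_region = conn.execute(TOP_DEVICE_QUERY).fetchall()
for region, device_type, total in top_by_region:
    print(region, device_type, round(total, 2))
```

RANK() keeps ties, so a region with two devices on equal revenue returns both; ROW_NUMBER() would pick one arbitrarily.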
• Q.637
Device Compatibility Check
Write a SQL query to find all pairs of users who own two devices that are compatible with
each other. Compatibility is determined by checking if a user owns both an iPhone and an
iPad. The query should return UserID1, UserID2, and the DevicePair as 'iPhone + iPad'.
Ensure that the result only contains unique pairs of users (i.e., (1, 2) and (2, 1) should not
appear twice).
Explanation
To solve this:
• Join the Devices table with itself to compare the devices owned by two users.
• Filter the devices to include only 'iPhone' and 'iPad'.
• Ensure that each pair is unique by using a condition where UserID1 < UserID2 to avoid
duplication.
Datasets and SQL Schemas
-- Creating Users table
CREATE TABLE Users (
UserID INT,
UserName VARCHAR(100),
Email VARCHAR(100)
);
Datasets
-- Inserting data into Users table
INSERT INTO Users (UserID, UserName, Email)
VALUES
(1, 'John Doe', '[email protected]'),
(2, 'Jane Smith', '[email protected]'),
(3, 'Alice Johnson', '[email protected]');
Solutions
PostgreSQL Solution
SELECT
DISTINCT LEAST(d1.UserID, d2.UserID) AS UserID1,
GREATEST(d1.UserID, d2.UserID) AS UserID2,
'iPhone + iPad' AS DevicePair
FROM
Devices d1
JOIN
Devices d2 ON d1.UserID < d2.UserID
WHERE
d1.DeviceType = 'iPhone' AND d2.DeviceType = 'iPad';
MySQL Solution
SELECT
DISTINCT LEAST(d1.UserID, d2.UserID) AS UserID1,
GREATEST(d1.UserID, d2.UserID) AS UserID2,
'iPhone + iPad' AS DevicePair
FROM
Devices d1
JOIN
Devices d2 ON d1.UserID < d2.UserID
WHERE
d1.DeviceType = 'iPhone' AND d2.DeviceType = 'iPad';
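The join condition d1.UserID < d2.UserID only pairs an iPhone owner with an iPad owner whose UserID is higher. Whether that is the intended reading is not settled by the question. As a hedged sketch using Python's built-in sqlite3 module, the code below compares it with a variant that accepts either order and normalises the pair. The Devices rows, including the UserID column, are assumed for illustration, and SQLite's two-argument min()/max() stand in for LEAST/GREATEST.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Devices (DeviceID INT, UserID INT, DeviceType VARCHAR(50));
-- Assumed sample rows for illustration.
INSERT INTO Devices VALUES
  (1, 1, 'iPhone'), (2, 1, 'iPad'), (3, 2, 'iPhone'),
  (4, 2, 'MacBook'), (5, 3, 'iPad');
""")

# Variant 1: the iPhone owner must have the lower UserID.
strict_pairs = conn.execute("""
SELECT DISTINCT d1.UserID AS UserID1, d2.UserID AS UserID2
FROM Devices d1
JOIN Devices d2 ON d1.UserID < d2.UserID
WHERE d1.DeviceType = 'iPhone' AND d2.DeviceType = 'iPad'
ORDER BY UserID1, UserID2
""").fetchall()

# Variant 2: either user may own the iPhone; min/max put the pair in a
# canonical order so (1, 2) and (2, 1) collapse under DISTINCT.
unordered_pairs = conn.execute("""
SELECT DISTINCT min(d1.UserID, d2.UserID) AS UserID1,
                max(d1.UserID, d2.UserID) AS UserID2
FROM Devices d1
JOIN Devices d2 ON d1.UserID <> d2.UserID
WHERE d1.DeviceType = 'iPhone' AND d2.DeviceType = 'iPad'
ORDER BY UserID1, UserID2
""").fetchall()

print(strict_pairs)
print(unordered_pairs)
```

With these rows the second variant also finds the pair (1, 2): user 2 owns the iPhone and user 1 owns the iPad. With the strict inequality, LEAST and GREATEST are redundant because d1.UserID is already the smaller value.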
• Q.638
UserID INT,
DeviceType VARCHAR(50),
PurchaseDate DATE
);
Datasets
-- Inserting data into Devices table
INSERT INTO Devices (DeviceID, UserID, DeviceType, PurchaseDate)
VALUES
(1, 1, 'iPhone', '2021-01-10'),
(2, 1, 'iPad', '2022-05-20'),
(3, 2, 'iPhone', '2021-06-15'),
(4, 2, 'MacBook', '2022-07-10'),
(5, 3, 'iPad', '2022-06-25');
Solutions
PostgreSQL Solution
SELECT
d1.UserID,
d1.DeviceType AS DeviceType1,
d2.DeviceType AS DeviceType2,
a.AppID
FROM
Devices d1
JOIN
Devices d2 ON d1.UserID = d2.UserID AND d1.DeviceID != d2.DeviceID
JOIN
Apps a ON a.DeviceID = d1.DeviceID
WHERE
d1.DeviceType != d2.DeviceType
AND EXISTS (
SELECT 1
FROM Apps a2
WHERE a2.DeviceID = d2.DeviceID AND a2.AppID = a.AppID
)
GROUP BY
d1.UserID, d1.DeviceType, d2.DeviceType, a.AppID;
MySQL Solution
SELECT
d1.UserID,
d1.DeviceType AS DeviceType1,
d2.DeviceType AS DeviceType2,
a.AppID
FROM
Devices d1
JOIN
Devices d2 ON d1.UserID = d2.UserID AND d1.DeviceID != d2.DeviceID
JOIN
Apps a ON a.DeviceID = d1.DeviceID
WHERE
d1.DeviceType != d2.DeviceType
AND EXISTS (
SELECT 1
FROM Apps a2
WHERE a2.DeviceID = d2.DeviceID AND a2.AppID = a.AppID
)
GROUP BY
d1.UserID, d1.DeviceType, d2.DeviceType, a.AppID;
• Q.639
Device Battery Health Check
Write a SQL query to find users who own multiple devices and have at least one device with
a battery health percentage of less than 30%. Return the UserID, UserName, the DeviceID,
DeviceType, and the BatteryHealth of the device with poor battery health.
Explanation
To solve this:
• Join the Devices table with the BatteryHealth table to track battery health percentages.
• Filter the users who own more than one device.
• Check if any device has a battery health of less than 30%.
• Return the UserID, UserName, DeviceID, DeviceType, and BatteryHealth of the devices with
poor battery health.
Datasets and SQL Schemas
-- Creating Devices table
CREATE TABLE Devices (
DeviceID INT,
UserID INT,
DeviceType VARCHAR(50),
PurchaseDate DATE
);
Datasets
-- Inserting data into Devices table
INSERT INTO Devices (DeviceID, UserID, DeviceType, PurchaseDate)
VALUES
(1, 1, 'iPhone', '2021-01-10'),
(2, 1, 'iPad', '2022-05-20'),
(3, 2, 'iPhone', '2021-06-15'),
(4, 2, 'MacBook', '2022-07-10'),
(5, 3, 'iPad', '2022-06-25');
Solutions
PostgreSQL Solution
SELECT
d.UserID,
u.UserName,
d.DeviceID,
d.DeviceType,
b.BatteryHealth
FROM
Devices d
JOIN
BatteryHealth b ON d.DeviceID = b.DeviceID
JOIN
Users u ON d.UserID = u.UserID
WHERE
b.BatteryHealth < 30
AND EXISTS (
SELECT 1
FROM Devices d2
WHERE d2.UserID = d.UserID AND d2.DeviceID != d.DeviceID
)
ORDER BY
d.UserID, d.DeviceID;
MySQL Solution
SELECT
d.UserID,
u.UserName,
d.DeviceID,
d.DeviceType,
b.BatteryHealth
FROM
Devices d
JOIN
BatteryHealth b ON d.DeviceID = b.DeviceID
JOIN
Users u ON d.UserID = u.UserID
WHERE
b.BatteryHealth < 30
AND EXISTS (
SELECT 1
FROM Devices d2
WHERE d2.UserID = d.UserID AND d2.DeviceID != d.DeviceID
)
ORDER BY
d.UserID, d.DeviceID;
• Q.640
Device Interaction Log
Write a SQL query to find the most frequent device interaction (e.g., pairing an iPhone with
an Apple Watch) that occurs for each user. An interaction is defined as a user pairing two
devices of different types (e.g., iPhone and Apple Watch). Return the UserID, DeviceType1,
DeviceType2, and the InteractionCount (number of pairings).
Explanation
To solve this:
• Join the Devices table with itself to track pairings between devices of different types for
the same user.
• Group by the UserID and the device types involved in the interaction.
• Count the interactions and return the most frequent interaction for each user.
Datasets and SQL Schemas
-- Creating Devices table
CREATE TABLE Devices (
DeviceID INT,
UserID INT,
DeviceType VARCHAR(50),
PurchaseDate DATE
);
Datasets
Solutions
PostgreSQL Solution
SELECT
d1.UserID,
d1.DeviceType AS DeviceType1,
d2.DeviceType AS DeviceType2,
COUNT(*) AS InteractionCount
FROM
Devices d1
JOIN
Devices d2 ON d1.UserID = d2.UserID
WHERE
d1.DeviceType != d2.DeviceType
GROUP BY
d1.UserID, d1.DeviceType, d2.DeviceType
ORDER BY
InteractionCount DESC;
MySQL Solution
SELECT
d1.UserID,
d1.DeviceType AS DeviceType1,
d2.DeviceType AS DeviceType2,
COUNT(*) AS InteractionCount
FROM
Devices d1
JOIN
Devices d2 ON d1.UserID = d2.UserID
WHERE
d1.DeviceType != d2.DeviceType
GROUP BY
d1.UserID, d1.DeviceType, d2.DeviceType
ORDER BY
InteractionCount DESC;
Adobe
• Q.641
Find the total number of visitors (distinct users) to the website for each day.
Explanation
This query requires counting the number of distinct users (using DISTINCT) who visited the
website on each specific day. The data is stored in a website_visits table with user visits,
and the goal is to group the data by date.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE website_visits (
visit_id INT,
user_id INT,
visit_date DATE
);
• - Datasets
Learnings
• Using COUNT(DISTINCT ...) to count unique visitors
• Grouping data by date using GROUP BY
Solutions
• - PostgreSQL solution
SELECT visit_date, COUNT(DISTINCT user_id) AS total_visitors
FROM website_visits
GROUP BY visit_date
ORDER BY visit_date;
• - MySQL solution
SELECT visit_date, COUNT(DISTINCT user_id) AS total_visitors
FROM website_visits
GROUP BY visit_date
ORDER BY visit_date;
• Q.642
Find the average time spent on the website per session for each user in the last 30 days.
Explanation
This question asks to calculate the average time spent per session for each user. We will need
a session_duration column that contains the duration (in seconds or minutes) of each
session. The query will filter sessions within the last 30 days, group by user, and calculate the
average session duration.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE user_sessions (
session_id INT,
user_id INT,
session_duration INT, -- Duration in seconds
session_date DATE
);
• - Datasets
INSERT INTO user_sessions (session_id, user_id, session_duration, session_date)
VALUES
(1, 101, 300, '2022-12-01'),
(2, 102, 400, '2022-12-02'),
(3, 101, 500, '2022-12-05'),
(4, 103, 250, '2022-12-05'),
(5, 101, 350, '2022-12-10');
Learnings
• Using AVG() to calculate the average session duration
• Filtering data using date ranges
• Grouping data by user to calculate individual averages
Solutions
• - PostgreSQL solution
SELECT user_id, AVG(session_duration) AS avg_session_duration
FROM user_sessions
WHERE session_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY user_id
ORDER BY avg_session_duration DESC;
• - MySQL solution
SELECT user_id, AVG(session_duration) AS avg_session_duration
FROM user_sessions
WHERE session_date >= CURDATE() - INTERVAL 30 DAY
GROUP BY user_id
ORDER BY avg_session_duration DESC;
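Because the sample rows are all from December 2022, a filter based on today's date returns nothing when run later. A minimal sketch, using Python's built-in sqlite3 module, passes the "as of" date as a parameter instead. The function name, the as_of parameter and SQLite's date() modifier syntax are choices made for this illustration, not part of the solutions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_sessions (
  session_id INT, user_id INT, session_duration INT, session_date DATE
);
INSERT INTO user_sessions VALUES
  (1, 101, 300, '2022-12-01'),
  (2, 102, 400, '2022-12-02'),
  (3, 101, 500, '2022-12-05'),
  (4, 103, 250, '2022-12-05'),
  (5, 101, 350, '2022-12-10');
""")


def avg_session_duration(connection, as_of, days=30):
    """Average session duration per user over the `days` days up to `as_of`."""
    return connection.execute(
        """
        SELECT user_id, ROUND(AVG(session_duration), 2) AS avg_session_duration
        FROM user_sessions
        WHERE session_date >= date(:as_of, :offset)
          AND session_date <= :as_of
        GROUP BY user_id
        ORDER BY avg_session_duration DESC
        """,
        {"as_of": as_of, "offset": f"-{days} days"},
    ).fetchall()


recent = avg_session_duration(conn, "2022-12-15")
stale = avg_session_duration(conn, "2023-06-01")
print(recent)
print(stale)
```

Pinning the reference date also makes the query reproducible in tests, which a call to CURRENT_DATE or CURDATE() is not.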
• Q.643
Identify the top 3 pages most visited in the last 7 days.
Explanation
This question asks to find the most popular pages on the website based on the number of
visits. The page_visits table records visits to various pages, and we need to filter data by
the last 7 days and count the number of visits per page.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE page_visits (
visit_id INT,
page_id INT,
visit_date DATE
);
• - Datasets
INSERT INTO page_visits (visit_id, page_id, visit_date)
VALUES
(1, 201, '2022-12-05'),
(2, 202, '2022-12-06'),
(3, 201, '2022-12-06'),
(4, 203, '2022-12-06'),
(5, 202, '2022-12-07');
Learnings
• Using COUNT() to count the number of visits
• Filtering data by date to consider only the last 7 days
• Using LIMIT to get the top results
Solutions
• - PostgreSQL solution
SELECT page_id, COUNT(*) AS total_visits
FROM page_visits
WHERE visit_date >= CURRENT_DATE - INTERVAL '7 days'
GROUP BY page_id
ORDER BY total_visits DESC
LIMIT 3;
• - MySQL solution
SELECT page_id, COUNT(*) AS total_visits
FROM page_visits
WHERE visit_date >= CURDATE() - INTERVAL 7 DAY
GROUP BY page_id
ORDER BY total_visits DESC
LIMIT 3;
• Q.644
Calculate the conversion rate of visitors who clicked on a campaign and then made a
purchase within 3 days.
Explanation
This query calculates the conversion rate for a specific campaign. It checks for users who
clicked on the campaign and then made a purchase within a 3-day window. The conversion
rate is calculated as the ratio of users who made a purchase to those who clicked on the
campaign.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE campaign_clicks (
click_id INT,
user_id INT,
campaign_id INT,
click_date DATE
);
• - Datasets
INSERT INTO campaign_clicks (click_id, user_id, campaign_id, click_date)
VALUES
(1, 101, 1, '2022-12-01'),
(2, 102, 1, '2022-12-02'),
(3, 103, 2, '2022-12-03'),
(4, 101, 1, '2022-12-05'),
(5, 104, 1, '2022-12-07');
• - Table creation
CREATE TABLE purchases (
purchase_id INT,
user_id INT,
purchase_date DATE
);
• - Datasets
INSERT INTO purchases (purchase_id, user_id, purchase_date)
VALUES
(1, 101, '2022-12-03'),
(2, 102, '2022-12-10'),
(3, 103, '2022-12-07'),
(4, 105, '2022-12-02');
Learnings
• Using JOIN to link clicks and purchases
• Filtering data based on time windows (3-day period)
• Calculating conversion rates
Solutions
• - PostgreSQL solution
WITH click_to_purchase AS (
SELECT c.user_id, c.campaign_id
FROM campaign_clicks c
JOIN purchases p ON c.user_id = p.user_id
WHERE p.purchase_date BETWEEN c.click_date AND c.click_date + INTERVAL '3 days'
)
SELECT ctp.campaign_id,
COUNT(DISTINCT ctp.user_id) AS total_conversions,
COUNT(DISTINCT ctp.user_id) * 100.0 / (SELECT COUNT(DISTINCT cc.user_id) FROM campaign_clicks cc WHERE cc.campaign_id = ctp.campaign_id) AS conversion_rate
FROM click_to_purchase ctp
GROUP BY ctp.campaign_id;
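To check the conversion-rate arithmetic on the sample rows, here is a minimal sketch using Python's built-in sqlite3 module (SQLite syntax). It takes a slightly different route from the CTE: a LEFT JOIN with the 3-day window in the ON clause, so campaigns with no conversions still appear with a rate of 0. Multiplying by 100.0 forces decimal division, since dividing two integer counts would truncate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaign_clicks (click_id INT, user_id INT, campaign_id INT, click_date DATE);
CREATE TABLE purchases (purchase_id INT, user_id INT, purchase_date DATE);
INSERT INTO campaign_clicks VALUES
  (1, 101, 1, '2022-12-01'),
  (2, 102, 1, '2022-12-02'),
  (3, 103, 2, '2022-12-03'),
  (4, 101, 1, '2022-12-05'),
  (5, 104, 1, '2022-12-07');
INSERT INTO purchases VALUES
  (1, 101, '2022-12-03'),
  (2, 102, '2022-12-10'),
  (3, 103, '2022-12-07'),
  (4, 105, '2022-12-02');
""")

# The window condition lives in ON, not WHERE, so unmatched clicks survive
# the LEFT JOIN with NULL purchase columns and still count as clickers.
RATE_QUERY = """
SELECT c.campaign_id,
       COUNT(DISTINCT c.user_id) AS clickers,
       COUNT(DISTINCT p.user_id) AS converters,
       ROUND(100.0 * COUNT(DISTINCT p.user_id) / COUNT(DISTINCT c.user_id), 2)
         AS conversion_rate
FROM campaign_clicks c
LEFT JOIN purchases p
  ON p.user_id = c.user_id
 AND p.purchase_date BETWEEN c.click_date AND date(c.click_date, '+3 days')
GROUP BY c.campaign_id
ORDER BY c.campaign_id
"""

rates = conn.execute(RATE_QUERY).fetchall()
for campaign_id, clickers, converters, rate in rates:
    print(campaign_id, clickers, converters, rate)
```

On this data only user 101 converts (click on 2022-12-01, purchase on 2022-12-03), giving campaign 1 one converter out of three clickers.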
• - MySQL solution
WITH click_to_purchase AS (
SELECT c.user_id, c.campaign_id
FROM campaign_clicks c
Learnings
• Using GROUP BY to group by user and month
• Using HAVING to filter users with purchases in multiple campaigns
• Date-based filtering using MONTH() and YEAR()
Solutions
• - PostgreSQL solution
SELECT user_id,
EXTRACT(MONTH FROM purchase_date) AS month,
EXTRACT(YEAR FROM purchase_date) AS year,
COUNT(DISTINCT campaign_id) AS campaign_count
FROM campaign_purchases
GROUP BY user_id, year, month
HAVING COUNT(DISTINCT campaign_id) >= 2;
• - MySQL solution
SELECT user_id,
MONTH(purchase_date) AS month,
YEAR(purchase_date) AS year,
COUNT(DISTINCT campaign_id) AS campaign_count
FROM campaign_purchases
GROUP BY user_id, year, month
HAVING COUNT(DISTINCT campaign_id) >= 2;
• Q.646
Find the average spend per user in each product category for the year 2022.
Explanation
This query calculates the average amount spent per user in each product category during the
year 2022. We need to aggregate the total spend per user per category and then compute the
average for each category.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE product_purchases (
purchase_id INT,
user_id INT,
category_id INT,
purchase_date DATE,
purchase_amount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO product_purchases (purchase_id, user_id, category_id, purchase_date, purchase_amount)
VALUES
(1, 101, 1, '2022-01-01', 100.00),
(2, 101, 2, '2022-02-01', 200.00),
(3, 102, 1, '2022-03-01', 150.00),
(4, 103, 2, '2022-04-01', 250.00),
(5, 101, 1, '2022-05-01', 120.00),
(6, 102, 1, '2022-06-01', 300.00);
Learnings
• Using AVG() to calculate average spend per user
• Filtering data for a specific year
• Grouping by category to calculate averages per category
Solutions
• - PostgreSQL solution
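SELECT category_id,
AVG(user_total) AS avg_spend_per_user
FROM (
SELECT category_id, user_id, SUM(purchase_amount) AS user_total
FROM product_purchases
WHERE EXTRACT(YEAR FROM purchase_date) = 2022
GROUP BY category_id, user_id
) AS user_totals
GROUP BY category_id;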
• - MySQL solution
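SELECT category_id,
AVG(user_total) AS avg_spend_per_user
FROM (
SELECT category_id, user_id, SUM(purchase_amount) AS user_total
FROM product_purchases
WHERE YEAR(purchase_date) = 2022
GROUP BY category_id, user_id
) AS user_totals
GROUP BY category_id;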
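The explanation asks for total spend per user per category before averaging, which is not the same as averaging individual purchases. A minimal sketch, using Python's built-in sqlite3 module (SQLite's strftime replaces EXTRACT/YEAR), runs both calculations on the sample rows to show the gap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_purchases (
  purchase_id INT, user_id INT, category_id INT,
  purchase_date DATE, purchase_amount DECIMAL(10, 2)
);
INSERT INTO product_purchases VALUES
  (1, 101, 1, '2022-01-01', 100.00),
  (2, 101, 2, '2022-02-01', 200.00),
  (3, 102, 1, '2022-03-01', 150.00),
  (4, 103, 2, '2022-04-01', 250.00),
  (5, 101, 1, '2022-05-01', 120.00),
  (6, 102, 1, '2022-06-01', 300.00);
""")

# Average of individual purchase amounts per category.
per_purchase = conn.execute("""
SELECT category_id, AVG(purchase_amount)
FROM product_purchases
WHERE strftime('%Y', purchase_date) = '2022'
GROUP BY category_id
ORDER BY category_id
""").fetchall()

# Average of each user's total spend per category (two-level aggregation).
per_user = conn.execute("""
SELECT category_id, AVG(user_total)
FROM (
  SELECT category_id, user_id, SUM(purchase_amount) AS user_total
  FROM product_purchases
  WHERE strftime('%Y', purchase_date) = '2022'
  GROUP BY category_id, user_id
) AS user_totals
GROUP BY category_id
ORDER BY category_id
""").fetchall()

print(per_purchase)
print(per_user)
```

In category 1, users 101 and 102 each bought twice (totals 220 and 450), so the per-user average is double the per-purchase average. In category 2 every user bought once, and the two figures coincide.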
• Q.647
Find the top 3 products with the highest total sales in the last quarter (3 months) of the
year.
Explanation
This query involves calculating the total sales for each product in the last quarter of the year
(October to December), then identifying the top 3 products with the highest sales. You will
need to filter the data for the relevant months and aggregate the sales by product.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE product_sales (
sale_id INT,
product_id INT,
sale_amount DECIMAL(10, 2),
sale_date DATE
);
• - Datasets
INSERT INTO product_sales (sale_id, product_id, sale_amount, sale_date)
VALUES
(1, 101, 300.00, '2022-10-15'),
(2, 102, 500.00, '2022-11-05'),
(3, 101, 200.00, '2022-11-10'),
(4, 103, 400.00, '2022-12-01'),
(5, 101, 150.00, '2022-12-20'),
(6, 104, 600.00, '2022-12-25');
Learnings
• Filtering data for a specific time period (last quarter)
• Aggregating sales by product
• Using ORDER BY and LIMIT to find the top products
Solutions
• - PostgreSQL solution
SELECT product_id,
SUM(sale_amount) AS total_sales
FROM product_sales
WHERE sale_date BETWEEN '2022-10-01' AND '2022-12-31'
GROUP BY product_id
ORDER BY total_sales DESC
LIMIT 3;
• - MySQL solution
SELECT product_id,
SUM(sale_amount) AS total_sales
FROM product_sales
WHERE sale_date BETWEEN '2022-10-01' AND '2022-12-31'
GROUP BY product_id
ORDER BY total_sales DESC
LIMIT 3;
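BETWEEN '2022-10-01' AND '2022-12-31' works here because sale_date is a DATE. If the column held timestamps, sales later on December 31 would be missed, so a half-open range is a common safer habit. The sketch below, on Python's built-in sqlite3 module, uses that range and takes the year as a parameter; the function name and the year parameter are illustrative additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_sales (
  sale_id INT, product_id INT, sale_amount DECIMAL(10, 2), sale_date DATE
);
INSERT INTO product_sales VALUES
  (1, 101, 300.00, '2022-10-15'),
  (2, 102, 500.00, '2022-11-05'),
  (3, 101, 200.00, '2022-11-10'),
  (4, 103, 400.00, '2022-12-01'),
  (5, 101, 150.00, '2022-12-20'),
  (6, 104, 600.00, '2022-12-25');
""")


def top_products_last_quarter(connection, year, limit=3):
    """Top products by total sales from October 1 up to (not including) January 1."""
    start = f"{year}-10-01"
    end = f"{year + 1}-01-01"  # exclusive upper bound
    return connection.execute(
        """
        SELECT product_id, SUM(sale_amount) AS total_sales
        FROM product_sales
        WHERE sale_date >= ? AND sale_date < ?
        GROUP BY product_id
        ORDER BY total_sales DESC, product_id
        LIMIT ?
        """,
        (start, end, limit),
    ).fetchall()


top3_q4 = top_products_last_quarter(conn, 2022)
print(top3_q4)
```

The extra product_id sort key only makes ties deterministic; it does not change the ranking on this data.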
• Q.648
Find the number of distinct users who purchased a specific product in each month of
2022.
Explanation
This query involves counting the distinct users who purchased a specific product (e.g.,
product_id = 101) in each month of 2022. The goal is to show the number of distinct users
by month for the chosen product. This is a useful metric for tracking engagement with a
product over time.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE product_purchases (
purchase_id INT,
user_id INT,
product_id INT,
purchase_date DATE
);
• - Datasets
INSERT INTO product_purchases (purchase_id, user_id, product_id, purchase_date)
VALUES
(1, 101, 101, '2022-01-15'),
(2, 102, 101, '2022-02-10'),
(3, 103, 101, '2022-02-20'),
(4, 104, 101, '2022-03-05'),
(5, 105, 101, '2022-04-10'),
(6, 101, 101, '2022-04-15');
Learnings
• Using COUNT(DISTINCT ...) to count unique users
• Grouping by month and year using EXTRACT() or MONTH()
• Filtering for a specific product
Solutions
• - PostgreSQL solution
SELECT EXTRACT(MONTH FROM purchase_date) AS month,
COUNT(DISTINCT user_id) AS distinct_users
FROM product_purchases
WHERE product_id = 101 AND EXTRACT(YEAR FROM purchase_date) = 2022
GROUP BY month
ORDER BY month;
• - MySQL solution
SELECT MONTH(purchase_date) AS month,
COUNT(DISTINCT user_id) AS distinct_users
FROM product_purchases
WHERE product_id = 101 AND YEAR(purchase_date) = 2022
GROUP BY month
ORDER BY month;
• Q.649
Identify the top 5 products that have the highest average sales amount per transaction
in 2022.
Explanation
In this query, we calculate the average sale amount for each product in 2022. The goal is to
identify the top 5 products that generated the highest average sales per transaction. The AVG()
function will help calculate the average sale amount for each product.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE product_sales (
sale_id INT,
product_id INT,
sale_amount DECIMAL(10, 2),
sale_date DATE
);
• - Datasets
INSERT INTO product_sales (sale_id, product_id, sale_amount, sale_date)
VALUES
(1, 101, 200.00, '2022-01-10'),
(2, 102, 500.00, '2022-03-15'),
(3, 103, 700.00, '2022-05-20'),
(4, 101, 300.00, '2022-07-12'),
(5, 104, 450.00, '2022-08-25'),
(6, 101, 350.00, '2022-10-30'),
(7, 102, 600.00, '2022-11-01');
Learnings
• Using AVG() to calculate the average sales amount
• Filtering data for a specific year
• Sorting the results to find the top products based on average sales
Solutions
• - PostgreSQL solution
SELECT product_id,
AVG(sale_amount) AS avg_sale_amount
FROM product_sales
Learnings
• Using JOIN to link clicks with conversions
• Filtering actions based on a specific time window (7 days)
• Aggregating data based on campaign performance
• Using COUNT(DISTINCT ...) to count unique users
Solutions
• - PostgreSQL solution
WITH campaign_conversions AS (
SELECT ci.campaign_id, ci.user_id
FROM campaign_interactions ci
JOIN campaign_interactions cii
ON ci.user_id = cii.user_id AND ci.campaign_id = cii.campaign_id
WHERE ci.interaction_type = 'Clicked'
AND cii.interaction_type = 'Converted'
AND cii.interaction_date BETWEEN ci.interaction_date AND ci.interaction_date + INTERVAL '7 days'
)
Learnings
• Using COUNT(*) to calculate the number of sessions
• Filtering data based on the year (2022)
• Grouping by product and month
• Using AVG() to calculate the average number of sessions per user
Solutions
• - PostgreSQL solution
SELECT product_id,
month,
AVG(session_count) AS avg_sessions_per_user
FROM (
SELECT user_id, product_id, EXTRACT(MONTH FROM session_date) AS month, COUNT(*) AS session_count
FROM product_sessions
WHERE EXTRACT(YEAR FROM session_date) = 2022
GROUP BY user_id, product_id, month
) AS monthly_sessions
GROUP BY product_id, month
ORDER BY product_id, month;
• - MySQL solution
SELECT product_id,
month,
AVG(session_count) AS avg_sessions_per_user
FROM (
SELECT user_id, product_id, MONTH(session_date) AS month, COUNT(*) AS session_count
FROM product_sessions
WHERE YEAR(session_date) = 2022
GROUP BY user_id, product_id, month
) AS monthly_sessions
GROUP BY product_id, month
ORDER BY product_id, month;
• Q.652
Find the correlation between user activity (clicks and views) and product adoption rates
for Adobe Creative Cloud (e.g., Photoshop, Illustrator) for a given campaign.
Explanation
This question involves calculating the relationship between user activities (clicks and views)
and product adoption rates (the number of users who subscribed or started using the product)
for a specific campaign. The correlation metric is needed to understand the effect of user
engagement on product adoption. You will need to join activity data with product adoption
data and calculate the correlation.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE campaign_activity (
activity_id INT,
user_id INT,
campaign_id INT,
activity_type VARCHAR(50), -- 'Clicked' or 'Viewed'
activity_date DATE
);
• - Datasets
INSERT INTO campaign_activity (activity_id, user_id, campaign_id, activity_type, activity_date)
VALUES
(1, 101, 1, 'Clicked', '2022-01-01'),
(2, 101, 1, 'Viewed', '2022-01-02'),
(3, 102, 1, 'Clicked', '2022-01-05'),
(4, 103, 1, 'Viewed', '2022-01-06'),
(5, 104, 1, 'Clicked', '2022-01-07');
• - Table creation
CREATE TABLE product_adoption (
adoption_id INT,
user_id INT,
product_id INT,
campaign_id INT,
adoption_date DATE
);
• - Datasets
INSERT INTO product_adoption (adoption_id, user_id, product_id, campaign_id, adoption_date)
VALUES
(1, 101, 1, 1, '2022-01-10'),
(2, 102, 2, 1, '2022-01-15'),
Learnings
• Joining activity data with adoption data
• Counting different activity types (clicks, views)
• Analyzing user engagement and adoption for correlation
• Correlation analysis techniques (PostgreSQL offers a corr() aggregate, but MySQL has no
built-in equivalent, so the queries here only prepare the counts for such analysis)
Solutions
• - PostgreSQL solution
SELECT ac.campaign_id,
COUNT(DISTINCT CASE WHEN ac.activity_type = 'Clicked' THEN ac.user_id END) AS clicks,
COUNT(DISTINCT CASE WHEN ac.activity_type = 'Viewed' THEN ac.user_id END) AS views,
COUNT(DISTINCT pa.user_id) AS product_adoptions
FROM campaign_activity ac
LEFT JOIN product_adoption pa ON ac.user_id = pa.user_id AND ac.campaign_id = pa.campaign_id
WHERE ac.campaign_id = 1
GROUP BY ac.campaign_id;
• - MySQL solution
SELECT ac.campaign_id,
COUNT(DISTINCT CASE WHEN ac.activity_type = 'Clicked' THEN ac.user_id END) AS clicks,
COUNT(DISTINCT CASE WHEN ac.activity_type = 'Viewed' THEN ac.user_id END) AS views,
COUNT(DISTINCT pa.user_id) AS product_adoptions
FROM campaign_activity ac
LEFT JOIN product_adoption pa ON ac.user_id = pa.user_id AND ac.campaign_id = pa.campaign_id
WHERE ac.campaign_id = 1
GROUP BY ac.campaign_id;
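The queries above return campaign-level counts, which amount to one data point per campaign; a correlation needs one observation per user. As a rough sketch, the code below builds per-user clicks, views and an adopted flag with Python's built-in sqlite3 module, then computes Pearson's r in plain Python. It uses only the two product_adoption rows shown in the dataset, so the numbers are purely illustrative. The pearson helper is written for this example.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaign_activity (
  activity_id INT, user_id INT, campaign_id INT,
  activity_type VARCHAR(50), activity_date DATE
);
CREATE TABLE product_adoption (
  adoption_id INT, user_id INT, product_id INT, campaign_id INT, adoption_date DATE
);
INSERT INTO campaign_activity VALUES
  (1, 101, 1, 'Clicked', '2022-01-01'),
  (2, 101, 1, 'Viewed', '2022-01-02'),
  (3, 102, 1, 'Clicked', '2022-01-05'),
  (4, 103, 1, 'Viewed', '2022-01-06'),
  (5, 104, 1, 'Clicked', '2022-01-07');
-- Only the adoption rows visible in the text; treat the result as illustrative.
INSERT INTO product_adoption VALUES
  (1, 101, 1, 1, '2022-01-10'),
  (2, 102, 2, 1, '2022-01-15');
""")

per_user = conn.execute("""
SELECT ac.user_id,
       SUM(CASE WHEN ac.activity_type = 'Clicked' THEN 1 ELSE 0 END) AS clicks,
       SUM(CASE WHEN ac.activity_type = 'Viewed' THEN 1 ELSE 0 END) AS views,
       EXISTS (SELECT 1 FROM product_adoption pa
               WHERE pa.user_id = ac.user_id
                 AND pa.campaign_id = ac.campaign_id) AS adopted
FROM campaign_activity ac
WHERE ac.campaign_id = 1
GROUP BY ac.user_id, ac.campaign_id
ORDER BY ac.user_id
""").fetchall()


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


clicks = [row[1] for row in per_user]
views = [row[2] for row in per_user]
adopted = [row[3] for row in per_user]
r_clicks = pearson(clicks, adopted)
r_views = pearson(views, adopted)
print(round(r_clicks, 4), round(r_views, 4))
```

In PostgreSQL the same per-user rows could be fed to the built-in corr() aggregate instead of the Python helper. With four users, neither figure carries any statistical weight.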
• Q.653
Learnings
Solutions
• - PostgreSQL solution
SELECT customer_id,
SUM(revenue) AS total_spent
FROM adobe_transactions
WHERE product != 'Photoshop'
AND customer_id IN (
SELECT DISTINCT customer_id
FROM adobe_transactions
WHERE product = 'Photoshop'
)
GROUP BY customer_id
ORDER BY customer_id;
• - MySQL solution
SELECT customer_id,
SUM(revenue) AS total_spent
FROM adobe_transactions
WHERE product != 'Photoshop'
AND customer_id IN (
SELECT DISTINCT customer_id
FROM adobe_transactions
WHERE product = 'Photoshop'
)
GROUP BY customer_id
ORDER BY customer_id;
• Q.654
Adobe User Behavior
Explanation
The task is to identify active users of Adobe products who have used any product more than
4 times in a month and have provided reviews with a rating of 4 stars or higher. The
solution should return the user_id and name of those users who meet both conditions.
• - Datasets
INSERT INTO usage (user_id, product, usage_date)
VALUES
(123, 'Photoshop', '2022-06-05'),
(123, 'Photoshop', '2022-06-06'),
(123, 'Photoshop', '2022-06-07'),
(123, 'Photoshop', '2022-06-08'),
(123, 'Photoshop', '2022-06-09'),
(265, 'Lightroom', '2022-07-10'),
(265, 'Lightroom', '2022-07-11'),
(265, 'Lightroom', '2022-07-12'),
(362, 'Illustrator', '2022-07-17'),
(362, 'Illustrator', '2022-07-18');
• - Table creation
CREATE TABLE reviews (
review_id INT,
user_id INT,
submit_date DATE,
product_id VARCHAR(100),
stars INT
);
• - Datasets
INSERT INTO reviews (review_id, user_id, submit_date, product_id, stars)
VALUES
(6171, 123, '2022-06-08', 'Photoshop', 4),
(5293, 362, '2022-07-18', 'Illustrator', 5),
(7802, 265, '2022-07-12', 'Lightroom', 4);
Learnings
• Using INNER JOIN to combine users and their activity data
• Using HAVING COUNT() to filter users based on activity frequency
• Using EXISTS to check for users who have relevant reviews
• Filtering data based on a dynamic month condition
• Aggregating usage data per user within a specific time frame
Solutions
• - PostgreSQL solution
SELECT u.user_id, u.name
FROM users u
INNER JOIN (
SELECT user_id
FROM usage
WHERE EXTRACT(MONTH FROM usage_date) = EXTRACT(MONTH FROM CURRENT_DATE)
GROUP BY user_id
HAVING COUNT(DISTINCT usage_date) > 4
) act ON u.user_id = act.user_id
WHERE EXISTS (
SELECT 1
FROM reviews r
WHERE r.user_id = u.user_id AND r.stars >= 4
);
• - MySQL solution
SELECT u.user_id, u.name
FROM users u
INNER JOIN (
SELECT user_id
FROM `usage` -- USAGE is a reserved word in MySQL, so the table name must be quoted
WHERE MONTH(usage_date) = MONTH(CURRENT_DATE)
GROUP BY user_id
HAVING COUNT(DISTINCT usage_date) > 4
) act ON u.user_id = act.user_id
WHERE EXISTS (
SELECT 1
FROM reviews r
WHERE r.user_id = u.user_id AND r.stars >= 4
);
• Q.655
Average Monthly Subscription Duration
Explanation
The task is to calculate the average subscription duration for each Adobe product in terms
of days. The subscription duration is the difference between the start_date and end_date
for each subscription. After calculating the duration for each subscription, the average
duration for each product should be computed.
Learnings
• Calculating the duration between two dates using DATEDIFF()
• Using AVG() to compute the average of the durations
• JOIN operation between subscriptions and products tables
• Grouping results by product name
Solutions
• - PostgreSQL solution
SELECT
P.product_name,
AVG(DATE_PART('day', S.end_date - S.start_date)) AS avg_subscription_days
FROM
subscriptions S
JOIN
• Q.656
Calculate the Average Spending of Customers
Explanation
The task is to calculate the average order value for each customer by joining the Customers
and Orders tables. The average order value for a customer is the average of the
total_amount from all their orders. The result should display each customer's customer_id,
first_name, last_name, and their corresponding average order value.
Learnings
• Using AVG() to calculate the average of a numeric column
• Joining two tables with JOIN based on a common column (customer_id)
Solutions
• - PostgreSQL solution
SELECT c.customer_id, c.first_name, c.last_name, AVG(o.total_amount) AS avg_order_value
FROM Customers c
JOIN Orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name;
• - MySQL solution
SELECT c.customer_id, c.first_name, c.last_name, AVG(o.total_amount) AS avg_order_value
FROM Customers c
JOIN Orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name;
• Q.657
Calculate Product Rating and Popularity Based on Review Data
Explanation
The task is to calculate the popularity and average rating for each product based on review
data. The popularity is defined as the square root of the number of reviews for a product,
rounded to the nearest whole number. However, if a product has fewer than 10 reviews, it
should not be included in the popularity calculation, but its average rating should still be
computed. Products should be ranked by popularity and then by average rating in descending
order.
Learnings
• Using COUNT(*) to count the number of reviews for each product
• Calculating popularity with the square root of the review count, using ROUND()
• Using AVG() to calculate the average rating for each product
• Applying conditions to handle products with fewer than 10 reviews
• Sorting the result by popularity and average rating
Solutions
• - PostgreSQL solution
SELECT product_id,
CASE
WHEN COUNT(*) >= 10 THEN ROUND(SQRT(COUNT(*)))
ELSE NULL
END AS popularity,
ROUND(AVG(stars)::numeric, 2) AS avg_rating
FROM reviews
GROUP BY product_id
ORDER BY popularity DESC NULLS LAST, avg_rating DESC;
• - MySQL solution
SELECT product_id,
CASE
WHEN COUNT(*) >= 10 THEN ROUND(SQRT(COUNT(*)))
ELSE NULL
END AS popularity,
ROUND(AVG(stars), 2) AS avg_rating
FROM reviews
GROUP BY product_id
ORDER BY popularity DESC, avg_rating DESC; -- MySQL has no NULLS LAST; DESC already sorts NULLs last
Explanation of Solution
• COUNT(*) is used to calculate the number of reviews for each product.
• SQRT(COUNT(*)) calculates the square root of the number of reviews, and ROUND() rounds
it to the nearest integer.
• AVG(stars) calculates the average rating for each product, rounded to two decimal places.
• The CASE condition ensures that popularity is only calculated if the product has at least 10
reviews; otherwise, it is set to NULL.
• The result is sorted first by popularity (in descending order), and then by average rating
in descending order for products with the same popularity.
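To see how the 10-review threshold and the rounding interact, the following PostgreSQL sketch applies the same CASE expression to a few hypothetical review counts. The inline values and the alias t(n) are illustrative and are not taken from the reviews table.

```sql
-- Hypothetical review counts; not data from the reviews table.
SELECT n AS review_count,
       CASE
           WHEN n >= 10 THEN ROUND(SQRT(n))  -- at least 10 reviews: rounded square root
           ELSE NULL                         -- fewer than 10 reviews: no popularity score
       END AS popularity
FROM (VALUES (4), (10), (50)) AS t(n);
```

A count of 4 yields NULL, 10 yields 3 (the square root is about 3.16), and 50 yields 7 (the square root is about 7.07).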
• Q.658
Analyzing AI-Based Product Usage Patterns
Explanation
Adobe uses AI to monitor and analyze how users engage with AI-driven tools across its
products. The task is to identify users who have used AI tools more than 5 times within a
given month and then categorize them by the product they used. Additionally, calculate the
total usage count and average session duration for these users.
Learnings
• Using COUNT() to calculate the number of times a tool has been used by a user
• Calculating the total usage count and average session duration for each user
• Grouping data by product and filtering for users with more than 5 tool uses within a month
• Sorting results to identify the most frequent users of AI tools
Solutions
• - PostgreSQL solution
SELECT u.user_id, p.product_id, COUNT(*) AS usage_count,
AVG(session_duration) AS avg_session_duration
FROM ai_usage u
JOIN products p ON u.product_id = p.product_id
WHERE u.usage_date BETWEEN '2023-10-01' AND '2023-10-31'
GROUP BY u.user_id, p.product_id
HAVING COUNT(*) > 5
ORDER BY usage_count DESC, avg_session_duration DESC;
• - MySQL solution
SELECT u.user_id, p.product_id, COUNT(*) AS usage_count,
AVG(session_duration) AS avg_session_duration
FROM ai_usage u
JOIN products p ON u.product_id = p.product_id
WHERE u.usage_date BETWEEN '2023-10-01' AND '2023-10-31'
GROUP BY u.user_id, p.product_id
HAVING COUNT(*) > 5
ORDER BY usage_count DESC, avg_session_duration DESC;
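If usage_date is stored as a timestamp rather than a date (the schema is not shown here, so this is an assumption), a BETWEEN filter ending at '2023-10-31' would miss usage recorded later on October 31. A half-open range is a common way to cover the whole month. A sketch using the same table and column names as the solutions above:

```sql
SELECT u.user_id,
       p.product_id,
       COUNT(*) AS usage_count,
       AVG(session_duration) AS avg_session_duration
FROM ai_usage u
JOIN products p ON u.product_id = p.product_id
WHERE u.usage_date >= '2023-10-01'   -- start of October, inclusive
  AND u.usage_date <  '2023-11-01'   -- start of November, exclusive
GROUP BY u.user_id, p.product_id
HAVING COUNT(*) > 5
ORDER BY usage_count DESC, avg_session_duration DESC;
```

When usage_date is a plain DATE column, both forms select the same rows.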
• Q.659
AI Tool Effectiveness in Adobe Products
Explanation
Adobe wants to evaluate how AI tools are impacting user performance across different
products. For this, we need to calculate the average rating of users who interacted with AI
tools and compare it against users who did not. The query should exclude users with fewer
than 3 AI interactions. Results should be ordered by the difference in ratings between AI and
non-AI users.
Learnings
• Differentiating users based on whether they used AI tools or not
• Counting AI tool interactions with COUNT()
• Filtering users with at least 3 AI interactions
• Using AVG() to calculate ratings for AI and non-AI users
• Comparing average ratings for both groups and sorting results
Solutions
• - PostgreSQL solution
SELECT
ur.product_id,
AVG(CASE WHEN ai.usage_id IS NOT NULL THEN ur.rating ELSE NULL END) AS ai_avg_rating,
AVG(CASE WHEN ai.usage_id IS NULL THEN ur.rating ELSE NULL END) AS non_ai_avg_rating,
(AVG(CASE WHEN ai.usage_id IS NOT NULL THEN ur.rating ELSE NULL END) -
AVG(CASE WHEN ai.usage_id IS NULL THEN ur.rating ELSE NULL END)) AS rating_difference
FROM user_ratings ur
LEFT JOIN ai_usage ai ON ur.user_id = ai.user_id
GROUP BY ur.product_id
HAVING COUNT(ai.usage_id) >= 3
ORDER BY rating_difference DESC;
• - MySQL solution
SELECT
ur.product_id,
AVG(CASE WHEN ai.usage_id IS NOT NULL THEN ur.rating ELSE NULL END) AS ai_avg_rating,
AVG(CASE WHEN ai.usage_id IS NULL THEN ur.rating ELSE NULL END) AS non_ai_avg_rating,
(AVG(CASE WHEN ai.usage_id IS NOT NULL THEN ur.rating ELSE NULL END) -
AVG(CASE WHEN ai.usage_id IS NULL THEN ur.rating ELSE NULL END)) AS rating_difference
FROM user_ratings ur
LEFT JOIN ai_usage ai ON ur.user_id = ai.user_id
GROUP BY ur.product_id
HAVING COUNT(ai.usage_id) >= 3
ORDER BY rating_difference DESC;
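The solutions above apply the 3-interaction threshold per product. If the requirement is read as a per-user threshold, one possible sketch is to count interactions per user first and then join that summary to the ratings. The CTE name ai_counts and its column ai_interactions are illustrative; the other table and column names follow the solutions above.

```sql
WITH ai_counts AS (
    -- One row per user who has any AI usage, with that user's interaction count.
    SELECT user_id, COUNT(*) AS ai_interactions
    FROM ai_usage
    GROUP BY user_id
)
SELECT
    ur.product_id,
    -- Ratings from users with at least 3 AI interactions.
    AVG(CASE WHEN ac.ai_interactions >= 3 THEN ur.rating END) AS ai_avg_rating,
    -- Ratings from users with no AI usage at all.
    AVG(CASE WHEN ac.user_id IS NULL THEN ur.rating END) AS non_ai_avg_rating,
    AVG(CASE WHEN ac.ai_interactions >= 3 THEN ur.rating END)
      - AVG(CASE WHEN ac.user_id IS NULL THEN ur.rating END) AS rating_difference
FROM user_ratings ur
LEFT JOIN ai_counts ac ON ur.user_id = ac.user_id
GROUP BY ur.product_id
ORDER BY rating_difference DESC;
```

In this variant, users with one or two AI interactions fall into neither average. Joining to a per-user summary also keeps each rating row from being repeated once per usage record.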
• Q.660
Find the user(s) who spent the most money in consecutive transactions (in a single month).
Return the user ID, the month, and the sum of the consecutive transactions, where the
consecutive transactions are defined as a period of two or more transactions with no gaps of
more than 1 day in between.
Explanation
This complex query requires identifying "consecutive" transactions, which means
transactions where the gap between them is no more than 1 day. We then need to find the
highest spend for a user in such consecutive transaction periods within each month. The
challenge is to identify groups of consecutive transactions and calculate their total spend.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id INT,
user_id INT,
transaction_date DATE,
transaction_amount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO transactions (transaction_id, user_id, transaction_date, transaction_amount)
VALUES
(1, 101, '2022-01-01', 100.00),
(2, 101, '2022-01-02', 150.00),
(3, 101, '2022-01-05', 200.00),
(4, 101, '2022-01-06', 50.00),
(5, 102, '2022-01-01', 300.00),
(6, 102, '2022-01-02', 400.00),
(7, 102, '2022-01-04', 500.00),
(8, 103, '2022-01-03', 250.00),
(9, 103, '2022-01-05', 350.00);
Learnings
• Use of window functions like LAG() and LEAD() to identify consecutive rows
• Use of date difference (DATEDIFF() or similar) to identify gaps
• Grouping consecutive transactions and calculating sums
• Dealing with transaction data that spans across multiple time periods
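As a first step toward the grouping described above, it can help to look at the gap between each transaction and the previous one for the same user. A minimal PostgreSQL sketch against the transactions table above (the aliases prev_date and gap_days and the window name w are illustrative):

```sql
SELECT user_id,
       transaction_date,
       transaction_amount,
       LAG(transaction_date) OVER w AS prev_date,
       -- Subtracting two DATE values in PostgreSQL gives the number of days between them.
       transaction_date - LAG(transaction_date) OVER w AS gap_days
FROM transactions
WINDOW w AS (PARTITION BY user_id ORDER BY transaction_date)
ORDER BY user_id, transaction_date;
```

With the sample data, user 101 has gaps of NULL, 1, 3, and 1 days, so the rows split into two runs: January 1 to 2 and January 5 to 6. In MySQL, DATEDIFF(transaction_date, prev_date) plays the role of the date subtraction.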
Solutions
• - PostgreSQL solution
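WITH flagged AS (
    -- Flag a row as the start of a new run when the same user's previous
    -- transaction is more than 1 day earlier (or there is no previous one).
    SELECT user_id, transaction_date, transaction_amount,
           CASE
               WHEN transaction_date - LAG(transaction_date) OVER (
                        PARTITION BY user_id ORDER BY transaction_date) <= 1
               THEN 0 ELSE 1
           END AS is_new_run
    FROM transactions
),
consecutive_transactions AS (
    -- A running total of the flags gives each run its own group number.
    SELECT user_id, transaction_date, transaction_amount,
           SUM(is_new_run) OVER (
               PARTITION BY user_id ORDER BY transaction_date) AS grp
    FROM flagged
)
SELECT user_id,
       EXTRACT(MONTH FROM transaction_date) AS month,
       SUM(transaction_amount) AS total_spent
FROM consecutive_transactions
GROUP BY user_id, month, grp
HAVING COUNT(*) >= 2  -- a run needs two or more transactions
ORDER BY total_spent DESC
LIMIT 1;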
• - MySQL solution
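WITH flagged AS (
    -- Flag a row as the start of a new run when the same user's previous
    -- transaction is more than 1 day earlier (or there is no previous one).
    -- Window functions (MySQL 8.0+) replace the unreliable user-variable approach.
    SELECT user_id, transaction_date, transaction_amount,
           CASE
               WHEN DATEDIFF(transaction_date, LAG(transaction_date) OVER (
                        PARTITION BY user_id ORDER BY transaction_date)) <= 1
               THEN 0 ELSE 1
           END AS is_new_run
    FROM transactions
),
consecutive_transactions AS (
    -- A running total of the flags gives each run its own group number.
    SELECT user_id, transaction_date, transaction_amount,
           SUM(is_new_run) OVER (
               PARTITION BY user_id ORDER BY transaction_date) AS grp
    FROM flagged
)
SELECT user_id,
       MONTH(transaction_date) AS month,
       SUM(transaction_amount) AS total_spent
FROM consecutive_transactions
GROUP BY user_id, month, grp
HAVING COUNT(*) >= 2  -- a run needs two or more transactions
ORDER BY total_spent DESC
LIMIT 1;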
Samsung
• Q.661
Question
Write a SQL query to calculate the average rating (stars) for each product per month. The
submit_date is saved in the 'YYYY-MM-DD HH:MI:SS' format.
Explanation
You need to:
• Extract the month and year from the submit_date.
• Calculate the average rating (stars) for each product per month.
• Group the results by product_id and the extracted month/year combination.
• Order the result by month/year and product_id to make it easy to interpret.
Datasets and SQL Schemas
• - Reviews table creation
CREATE TABLE reviews (
review_id INT,
user_id INT,
submit_date TIMESTAMP,
product_id INT,
stars INT
);
• - Insert sample data into reviews table
INSERT INTO reviews (review_id, user_id, submit_date, product_id, stars)
VALUES
(6171, 123, '2022-06-08 00:00:00', 50001, 4),
(7802, 265, '2022-06-10 00:00:00', 69852, 4),
(5293, 362, '2022-06-18 00:00:00', 50001, 3),
(6352, 192, '2022-07-26 00:00:00', 69852, 3),
(4517, 981, '2022-07-05 00:00:00', 69852, 2);
Learnings
• EXTRACT() (or the equivalent date_part() in PostgreSQL) and YEAR()/MONTH() in MySQL to
extract parts of a timestamp (e.g., month, year)
• AVG() to calculate the average of the ratings
• GROUP BY to group by product and time period
• ORDER BY to sort the results in chronological order by month and product
Solutions
• - PostgreSQL solution
SELECT
EXTRACT(YEAR FROM submit_date) AS year,
EXTRACT(MONTH FROM submit_date) AS month,
product_id,
AVG(stars) AS avg_stars
FROM
reviews
GROUP BY
year, month, product_id
ORDER BY
year, month, product_id;
• - MySQL solution
SELECT
YEAR(submit_date) AS year,
MONTH(submit_date) AS month,
product_id,
AVG(stars) AS avg_stars
FROM
reviews
GROUP BY
year, month, product_id
ORDER BY
year, month, product_id;
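Besides extracting year and month separately, PostgreSQL can keep them together in a single column with DATE_TRUNC, which avoids grouping on two columns. A sketch against the reviews table above (the alias review_month is illustrative):

```sql
SELECT
    DATE_TRUNC('month', submit_date) AS review_month,  -- first day of the review's month
    product_id,
    ROUND(AVG(stars), 2) AS avg_stars
FROM reviews
GROUP BY DATE_TRUNC('month', submit_date), product_id
ORDER BY review_month, product_id;
```

With the sample data this gives 3.50 for product 50001 in June 2022, 4.00 for product 69852 in June 2022, and 2.50 for product 69852 in July 2022.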
Learnings
• JOIN operations to link tables together based on related keys
• SUM() to calculate the total number of units sold
• GROUP BY to aggregate data by region and model
• ORDER BY to sort the results based on the total units sold per region
Solutions
• - PostgreSQL and MySQL solution
SELECT r.region_name,
m.model_name,
SUM(s.units_sold) AS total_sold
FROM sales s
JOIN regions r ON s.region_id = r.region_id
JOIN smartphone_models m ON s.model_id = m.model_id
GROUP BY r.region_name, m.model_name
ORDER BY r.region_name, total_sold DESC;
Explanation
You need to:
• Join the sales, smartphone_models, and products tables.
• Filter for sales in 2022.
• Calculate the average price for each Galaxy model by using the AVG() function.
• Make sure that only models containing "Galaxy" in their name are considered.
Learnings
• Joins: The query involves joining the sales, smartphone_models, and products tables to
link model information and product prices with sales data.
• Filtering by Year: Use the EXTRACT(YEAR FROM sale_date) = 2022 or
YEAR(sale_date) = 2022 condition to focus on sales that happened in 2022.
• Average Price Calculation: Use the AVG(price_usd) function to compute the average
purchase price for each Galaxy model.
• Filtering by Product Name: You can filter the smartphone_models table for models that
contain "Galaxy" in the model_name column.
Solutions
• - PostgreSQL solution
SELECT
sm.model_name,
AVG(p.price_usd) AS avg_purchase_price
FROM
sales s
JOIN
smartphone_models sm ON s.model_id = sm.model_id
JOIN
products p ON sm.model_id = p.model_id
WHERE
EXTRACT(YEAR FROM s.sale_date) = 2022
AND sm.model_name LIKE '%Galaxy%'
GROUP BY
sm.model_name
ORDER BY
avg_purchase_price DESC;
• - MySQL solution
SELECT
sm.model_name,
AVG(p.price_usd) AS avg_purchase_price
FROM
sales s
JOIN
smartphone_models sm ON s.model_id = sm.model_id
JOIN
products p ON sm.model_id = p.model_id
WHERE
YEAR(s.sale_date) = 2022
AND sm.model_name LIKE '%Galaxy%'
GROUP BY
sm.model_name
ORDER BY
avg_purchase_price DESC;
• Q.664
Question
Find the Top 3 Samsung Products Based on Customer Ratings
Write a SQL query to find the top 3 Samsung products based on average customer ratings.
Explanation
You need to:
• Join the reviews and products tables to link product names with customer ratings.
• Group the results by product_id to calculate the average rating (AVG(stars)).
• Sort the results by average rating in descending order to get the top-rated products.
• Limit the results to the top 3 products.
-- Reviews Data
INSERT INTO reviews (review_id, customer_id, product_id, stars)
VALUES
(1, 1, 101, 5), -- Galaxy S21 - 5 stars
(2, 2, 102, 4), -- Galaxy Note20 - 4 stars
(3, 3, 103, 5), -- Galaxy Z Fold3 - 5 stars
(4, 4, 104, 3), -- Galaxy S20 - 3 stars
(5, 5, 105, 4), -- Galaxy A52 - 4 stars
(6, 6, 106, 5), -- Galaxy Z Flip3 - 5 stars
(7, 7, 107, 4), -- Galaxy A72 - 4 stars
(8, 8, 108, 4), -- Galaxy S22 - 4 stars
(9, 9, 109, 2), -- Galaxy Note10 - 2 stars
(10, 10, 101, 5), -- Galaxy S21 - 5 stars
(11, 11, 102, 3), -- Galaxy Note20 - 3 stars
(12, 12, 103, 5), -- Galaxy Z Fold3 - 5 stars
(13, 13, 104, 4), -- Galaxy S20 - 4 stars
(14, 14, 105, 5), -- Galaxy A52 - 5 stars
(15, 15, 106, 4); -- Galaxy Z Flip3 - 4 stars
Learnings
• Joining Tables: The query involves joining the reviews and products tables on
product_id to link the product names with ratings.
• Grouping Data: The GROUP BY clause is used to group by product_id so that the average
rating can be calculated for each product.
• Aggregating Data: The AVG(stars) function is used to calculate the average rating for
each product.
• Sorting and Limiting Results: The ORDER BY clause sorts the products by their average
rating in descending order, and the LIMIT clause restricts the results to the top 3 products.
Solutions
• - PostgreSQL solution
SELECT
p.product_name,
AVG(r.stars) AS avg_rating
FROM
reviews r
JOIN
products p ON r.product_id = p.product_id
GROUP BY
p.product_name
ORDER BY
avg_rating DESC
LIMIT 3;
• - MySQL solution
SELECT
p.product_name,
AVG(r.stars) AS avg_rating
FROM
reviews r
JOIN
products p ON r.product_id = p.product_id
GROUP BY
p.product_name
ORDER BY
avg_rating DESC
LIMIT 3;
• Q.665
Question
Identify Customers Who Have Purchased More Than One Model in the Last Year
Write a SQL query to identify customers who have purchased more than one Samsung
Galaxy model in the last year.
Explanation
You need to:
• Join the customers and purchases tables to get customer information and their purchase
history.
• Filter for purchases made in the last year (i.e., the past 12 months).
• Use the HAVING clause to identify customers who have purchased more than one distinct
model.
• Return the customer's name, email, and the count of distinct Galaxy models purchased.
-- Purchases Data
INSERT INTO purchases (purchase_id, customer_id, product, purchase_date)
VALUES
(1, 1, 'Galaxy S21', '2023-01-15'),
(2, 1, 'Galaxy Z Flip3', '2023-02-10'),
(3, 2, 'Galaxy Note20', '2022-12-05'),
(4, 2, 'Galaxy A72', '2022-10-30'),
(5, 3, 'Galaxy S21', '2023-03-05'),
(6, 3, 'Galaxy Note20', '2023-04-01'),
(7, 4, 'Galaxy S20', '2023-06-25'),
(8, 4, 'Galaxy Z Fold3', '2023-07-15'),
(9, 5, 'Galaxy A52', '2022-09-10'),
(10, 5, 'Galaxy A72', '2022-11-20');
Learnings
• Joining Tables: The query involves joining the customers and purchases tables to get
the purchase history for each customer.
• Filtering by Date: The WHERE clause filters the purchases to include only those made in
the last year. You can use the CURRENT_DATE function and subtract INTERVAL 1 YEAR to get
the last 12 months.
• Grouping and Aggregating: The GROUP BY clause groups purchases by customer_id to
calculate the number of distinct products purchased per customer.
• Using HAVING: The HAVING clause ensures that only customers who purchased more
than one distinct product are selected.
Solutions
• - PostgreSQL solution
SELECT
c.name,
c.email,
COUNT(DISTINCT p.product) AS distinct_products_purchased
FROM
customers c
JOIN
purchases p ON c.customer_id = p.customer_id
WHERE
p.purchase_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY
c.customer_id, c.name, c.email
HAVING
COUNT(DISTINCT p.product) > 1;
• - MySQL solution
SELECT
c.name,
c.email,
COUNT(DISTINCT p.product) AS distinct_products_purchased
FROM
customers c
JOIN
purchases p ON c.customer_id = p.customer_id
WHERE
p.purchase_date >= CURDATE() - INTERVAL 1 YEAR
GROUP BY
c.customer_id, c.name, c.email
HAVING
COUNT(DISTINCT p.product) > 1;
• Q.666
Question
Calculate the Average Sale Price for Each Samsung Product
As a Data Analyst at Samsung, you are asked to analyze the sale data. For each product,
calculate the average sale price per month, for the year 2022. Assume today is July 31st,
2022.
Explanation
Calculate the average sale price for each product per month, filtering for sales from the year
2022. Group the results by month and product ID, then order by month and product ID.
Learnings
• Use of EXTRACT function to extract month and year from sale_date.
• Aggregation with AVG to calculate the average price.
• Grouping data by multiple columns (month and product_id).
• Filtering data using WHERE based on the year.
Solutions
• - PostgreSQL solution
SELECT
EXTRACT(MONTH FROM sale_date) AS month,
product_id,
AVG(price) AS avg_price
FROM
sales
WHERE
EXTRACT(YEAR FROM sale_date) = 2022
GROUP BY
month, product_id
ORDER BY 1, 2;
• - MySQL solution
SELECT
MONTH(sale_date) AS month,
product_id,
AVG(price) AS avg_price
FROM
sales
WHERE
YEAR(sale_date) = 2022
GROUP BY
month, product_id
ORDER BY 1, 2;
• Q.667
Question
Write a SQL query to filter the customers who purchased the Samsung Galaxy S21 in the
year 2022 and are signed up for the Samsung Members program. The query should return
their contact details for a promotional email campaign.
Explanation
You need to:
• Join the customers table with the purchases table on the customer_id field.
• Filter the data to include only customers who purchased the Galaxy S21 in the year 2022.
• Check if the customer is signed up for the Samsung Members program (where
user_signed_up = 1).
• Select the relevant contact details (name, email) from the customers table.
Datasets and SQL Schemas
• - Customers table creation
CREATE TABLE customers (
customer_id INT,
name VARCHAR(255),
email VARCHAR(255),
user_signed_up INT -- 1 for signed up, 0 for not signed up
);
• - Purchases table creation
CREATE TABLE purchases (
purchase_id INT,
customer_id INT,
product VARCHAR(255),
year INT
);
• - Insert sample data into customers table
INSERT INTO customers (customer_id, name, email, user_signed_up)
VALUES
(9615, 'James Smith', '[email protected]', 1),
(7021, 'Samantha Brown', '[email protected]', 1),
(8523, 'John Doe', '[email protected]', 0),
(6405, 'Anna Johnson', '[email protected]', 1),
(9347, 'Emma Black', '[email protected]', 0);
• - Insert sample data into purchases table
INSERT INTO purchases (purchase_id, customer_id, product, year)
VALUES
(5171, 9615, 'S21', 2022),
(7802, 7021, 'S21', 2022),
(8235, 8523, 'Note20', 2022),
(6320, 6405, 'S21', 2021),
(7395, 9347, 'S21', 2022);
Learnings
• JOIN operations to merge customer and purchase data
• WHERE clause for filtering based on multiple conditions
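A minimal sketch of the steps listed in the Explanation, using the customers and purchases tables defined above. It relies only on standard syntax and is expected to run in both PostgreSQL and MySQL.

```sql
SELECT c.name,
       c.email
FROM customers c
JOIN purchases p ON c.customer_id = p.customer_id
WHERE p.product = 'S21'        -- product name as stored in the sample data
  AND p.year = 2022            -- purchases made in 2022 only
  AND c.user_signed_up = 1;    -- Samsung Members only
```

With the sample data this returns James Smith and Samantha Brown. Anna Johnson bought the S21 in 2021, and Emma Black is not signed up.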
Question
Find the Most Popular Samsung Smartphone Model in Each Region
Write a SQL query to find the most popular Samsung smartphone model in each region based
on the total number of units sold. The sales table contains data on region_id, model_id, and
units_sold.
Explanation
You need to:
• Join the sales table with the regions and smartphone_models tables.
• Aggregate the total sales for each model by region.
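• Rank the models within each region by total units sold (for example with ROW_NUMBER() or
RANK() partitioned by region) and keep only the top-ranked model per region. A plain
ORDER BY with LIMIT 1 would return a single row overall, not one per region.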
• - Datasets
-- Regions Data
INSERT INTO regions (region_id, region_name)
VALUES
(1, 'North America'),
(2, 'Europe'),
(3, 'Asia'),
(4, 'Australia');
-- Sales Data
INSERT INTO sales (sale_id, region_id, model_id, units_sold)
VALUES
-- North America
(1, 1, 101, 1500), -- Galaxy S21
(2, 1, 102, 1200), -- Galaxy Note20
(3, 1, 104, 1800), -- Galaxy S20
(4, 1, 105, 2200), -- Galaxy A52
(5, 1, 106, 1400), -- Galaxy Z Flip3
(6, 1, 107, 1000), -- Galaxy A72
-- Europe
(7, 2, 101, 800), -- Galaxy S21
(8, 2, 103, 1500), -- Galaxy Z Fold3
(9, 2, 104, 900), -- Galaxy S20
(10, 2, 108, 1800), -- Galaxy S22
(11, 2, 109, 1300), -- Galaxy Note10
-- Asia
(12, 3, 101, 2000), -- Galaxy S21
(13, 3, 103, 1200), -- Galaxy Z Fold3
(14, 3, 105, 1600), -- Galaxy A52
(15, 3, 106, 1900), -- Galaxy Z Flip3
(16, 3, 108, 1700), -- Galaxy S22
-- Australia
(17, 4, 102, 1100), -- Galaxy Note20
(18, 4, 104, 950), -- Galaxy S20
(19, 4, 106, 800), -- Galaxy Z Flip3
(20, 4, 107, 1400), -- Galaxy A72
(21, 4, 108, 1300); -- Galaxy S22
Learnings
• Using JOIN to combine multiple tables based on common columns (region_id,
model_id).
• Aggregating sales data using SUM to calculate the total units sold.
• Ranking results within each region with a window function (ROW_NUMBER() or RANK() with
PARTITION BY) to select the top result per region.
• The use of grouping (GROUP BY) to calculate the total units sold per model and region.
Solutions
• - PostgreSQL solution
WITH RankedModels AS (
SELECT
r.region_name,
m.model_name,
SUM(s.units_sold) AS total_units_sold,
ROW_NUMBER() OVER (PARTITION BY r.region_name ORDER BY SUM(s.units_sold) DESC) AS rank
FROM
sales s
JOIN
regions r ON s.region_id = r.region_id
JOIN
smartphone_models m ON s.model_id = m.model_id
GROUP BY
r.region_name, m.model_name
)
SELECT
region_name,
model_name,
total_units_sold
FROM
RankedModels
WHERE
rank = 1;
• - MySQL solution
WITH RankedModels AS (
SELECT
r.region_name,
m.model_name,
SUM(s.units_sold) AS total_units_sold,
-- RANK is a reserved word in MySQL 8.0, so the alias is named sales_rank instead.
RANK() OVER (PARTITION BY r.region_name ORDER BY SUM(s.units_sold) DESC) AS sales_rank
FROM
sales s
JOIN
regions r ON s.region_id = r.region_id
JOIN
smartphone_models m ON s.model_id = m.model_id
GROUP BY
r.region_name, m.model_name
)
SELECT
region_name,
model_name,
total_units_sold
FROM
RankedModels
WHERE
sales_rank = 1;
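The PostgreSQL solution uses ROW_NUMBER() while the MySQL solution uses RANK(), and the two behave differently when models tie on units sold. The sketch below uses hypothetical totals (Model A, Model B, Model C are made-up names) to show the difference. It avoids dialect-specific syntax and is expected to run in both PostgreSQL and MySQL 8.0.

```sql
WITH totals AS (
    -- Hypothetical per-model totals with a tie at the top.
    SELECT 'Europe' AS region_name, 'Model A' AS model_name, 1800 AS total_units_sold
    UNION ALL SELECT 'Europe', 'Model B', 1800
    UNION ALL SELECT 'Europe', 'Model C', 900
)
SELECT region_name,
       model_name,
       total_units_sold,
       -- ROW_NUMBER gives tied rows different numbers, in an arbitrary order.
       ROW_NUMBER() OVER (PARTITION BY region_name ORDER BY total_units_sold DESC) AS row_num,
       -- RANK gives tied rows the same rank and skips the following rank.
       RANK() OVER (PARTITION BY region_name ORDER BY total_units_sold DESC) AS sales_rank
FROM totals;
```

Filtering on row_num = 1 returns exactly one of the two tied models, while filtering on sales_rank = 1 returns both. Which behavior is appropriate depends on whether ties should be reported.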
• Q.669
Question
Finding all customers who bought 'Galaxy' series
You are the Data Analyst at Samsung and your manager has asked you to find all customers
who have purchased from the 'Galaxy' series. Samsung has multiple product lines, but you
are only interested in customers who purchased any product with 'Galaxy' in its product
name. For this task, use the 'customers' and 'products' tables. The 'products' table has a
column called 'product_name' where all the product names are stored.
Explanation
To find customers who bought products from the 'Galaxy' series, use the SQL LIKE operator
with the % wildcard to match any product names containing 'Galaxy'. Join the 'customers' and
'products' tables on customer_id.
• - Table creation
CREATE TABLE customers (
customer_id INT,
first_name VARCHAR(50),
last_name VARCHAR(50)
);
Learnings
• Use of LIKE with wildcard % to search for partial matches in a string.
• SQL JOIN operation to combine data from multiple tables based on a related column
(customer_id).
• Filtering data based on conditions using WHERE.
Solutions
• - PostgreSQL solution
SELECT c.first_name, c.last_name, p.product_name
FROM customers c
JOIN products p ON c.customer_id = p.customer_id
WHERE p.product_name LIKE '%Galaxy%';
• - MySQL solution
SELECT c.first_name, c.last_name, p.product_name
FROM customers c
JOIN products p ON c.customer_id = p.customer_id
WHERE p.product_name LIKE '%Galaxy%';
• Q.670
Question
Calculate the Average Price of Samsung Products Sold in 2022 by Region
Write a SQL query to calculate the average price of Samsung products sold in 2022, grouped
by region.
Explanation
You need to:
• Join the sales, regions, and smartphone_models tables.
• Filter the sales data to only include records from 2022.
• Join the smartphone_models table to include product pricing information (assuming there
is a price column in the smartphone_models table).
• Calculate the average price of Samsung products sold per region by using the AVG()
function.
-- Europe
(7, 2, 101, 800, '2022-01-15'),
(8, 2, 103, 1500, '2022-02-25'),
(9, 2, 104, 900, '2022-04-10'),
(10, 2, 108, 1800, '2022-06-05'),
(11, 2, 109, 1300, '2022-07-25'),
-- Asia
(12, 3, 101, 2000, '2022-03-05'),
(13, 3, 103, 1200, '2022-05-15'),
-- Australia
(17, 4, 102, 1100, '2022-04-01'),
(18, 4, 104, 950, '2022-05-20'),
(19, 4, 106, 800, '2022-07-05'),
(20, 4, 107, 1400, '2022-06-10'),
(21, 4, 108, 1300, '2022-08-25');
Learnings
• Joining Multiple Tables: This query involves joining three tables: sales, regions, and
smartphone_models to get the relevant data.
• Filtering Data by Year: You filter the sales data to only include records for the year 2022.
• Aggregating Data: The AVG() function is used to calculate the average price of Samsung
products sold by region.
• Handling Price Data: The price information is retrieved from the smartphone_models
table, which is then used in the aggregation.
Solutions
• - PostgreSQL solution
SELECT
r.region_name,
AVG(m.price) AS avg_price
FROM
sales s
JOIN
regions r ON s.region_id = r.region_id
JOIN
smartphone_models m ON s.model_id = m.model_id
WHERE
EXTRACT(YEAR FROM s.sale_date) = 2022
GROUP BY
r.region_name
ORDER BY
r.region_name;
• - MySQL solution
SELECT
r.region_name,
AVG(m.price) AS avg_price
FROM
sales s
JOIN
regions r ON s.region_id = r.region_id
JOIN
smartphone_models m ON s.model_id = m.model_id
WHERE
YEAR(s.sale_date) = 2022
GROUP BY
r.region_name
ORDER BY
r.region_name;
This query calculates the average price of Samsung products sold in 2022, grouped by region.
• Q.671
Question
Calculate the Total Sales of Galaxy S21 in 2022 by Region
Write a SQL query to calculate the total units sold for the Galaxy S21 model across different
regions in 2022.
Explanation
You need to:
• Join the sales, regions, and smartphone_models tables.
• Filter the sales data for the Galaxy S21 model (identified by its model_name).
• Filter the sales by the year 2022.
• Aggregate the total units sold by region.
-- Europe
(7, 2, 101, 800, '2022-01-15'), -- Galaxy S21
(8, 2, 103, 1500, '2022-02-25'), -- Galaxy Z Fold3
(9, 2, 104, 900, '2022-04-10'), -- Galaxy S20
(10, 2, 108, 1800, '2022-06-05'), -- Galaxy S22
(11, 2, 109, 1300, '2022-07-25'), -- Galaxy Note10
-- Asia
(12, 3, 101, 2000, '2022-03-05'), -- Galaxy S21
(13, 3, 103, 1200, '2022-05-15'), -- Galaxy Z Fold3
(14, 3, 105, 1600, '2022-06-30'), -- Galaxy A52
(15, 3, 106, 1900, '2022-08-10'), -- Galaxy Z Flip3
(16, 3, 108, 1700, '2022-09-15'), -- Galaxy S22
-- Australia
(17, 4, 102, 1100, '2022-04-01'), -- Galaxy Note20
(18, 4, 104, 950, '2022-05-20'), -- Galaxy S20
(19, 4, 106, 800, '2022-07-05'), -- Galaxy Z Flip3
(20, 4, 107, 1400, '2022-06-10'), -- Galaxy A72
(21, 4, 108, 1300, '2022-08-25'); -- Galaxy S22
Learnings
• Filtering sales based on the product model name (Galaxy S21).
• Filtering by year using EXTRACT(YEAR FROM sale_date) = 2022 in PostgreSQL or
YEAR(sale_date) = 2022 in MySQL.
• Aggregating the total units sold by region using SUM.
• Using JOIN to combine relevant data from the sales, regions, and smartphone_models
tables.
Solutions
• - PostgreSQL solution
SELECT
r.region_name,
SUM(s.units_sold) AS total_units_sold
FROM
sales s
JOIN
regions r ON s.region_id = r.region_id
JOIN
smartphone_models m ON s.model_id = m.model_id
WHERE
m.model_name = 'Galaxy S21'
AND EXTRACT(YEAR FROM s.sale_date) = 2022
GROUP BY
r.region_name
ORDER BY
r.region_name;
• - MySQL solution
SELECT
r.region_name,
SUM(s.units_sold) AS total_units_sold
FROM
sales s
JOIN
regions r ON s.region_id = r.region_id
JOIN
smartphone_models m ON s.model_id = m.model_id
WHERE
m.model_name = 'Galaxy S21'
AND YEAR(s.sale_date) = 2022
GROUP BY
r.region_name
ORDER BY
r.region_name;
• Q.672
Question
Identify the Customers Who Bought the Galaxy Note20 and Rated it Below 3
Write a SQL query to find customers who purchased the Galaxy Note20 and gave a rating of
less than 3. Return their name, email, and rating.
Explanation
You need to:
• Join the customers, purchases, and reviews tables.
• Filter the records to include only those customers who purchased the Galaxy Note20.
• Filter for reviews where the rating (stars) is less than 3.
• Return the customer's name, email, and the review rating.
-- Purchases Data
INSERT INTO purchases (purchase_id, customer_id, product, year, units_sold)
VALUES
(1, 1, 'Galaxy Note20', 2022, 1),
(2, 2, 'Galaxy Note20', 2022, 2),
(3, 3, 'Galaxy S21', 2022, 1),
(4, 4, 'Galaxy Note20', 2022, 1),
(5, 5, 'Galaxy Note20', 2022, 1);
-- Reviews Data
INSERT INTO reviews (review_id, customer_id, product_id, stars)
VALUES
(1, 1, 102, 2), -- John Doe rated Galaxy Note20 with 2 stars
(2, 2, 102, 1), -- Sophia Brown rated Galaxy Note20 with 1 star
(3, 3, 101, 4), -- Liam Smith rated Galaxy S21 with 4 stars
(4, 4, 102, 3), -- Ava Johnson rated Galaxy Note20 with 3 stars
(5, 5, 102, 1); -- Noah Thompson rated Galaxy Note20 with 1 star
Learnings
• Joining Multiple Tables: You will need to join customers, purchases, and reviews to
get the relevant customer information and their corresponding reviews.
• Filtering Data: The filter conditions will include both the product ('Galaxy Note20')
and the review rating (stars < 3).
• Return Specific Columns: The output needs to return the customer's name, email, and
their rating for the Galaxy Note20.
Solutions
• - PostgreSQL solution
SELECT
c.name,
c.email,
r.stars AS rating
FROM
customers c
JOIN
purchases p ON c.customer_id = p.customer_id
JOIN
reviews r ON c.customer_id = r.customer_id
WHERE
p.product = 'Galaxy Note20'
AND r.stars < 3;
• - MySQL solution
SELECT
c.name,
c.email,
r.stars AS rating
FROM
customers c
JOIN
purchases p ON c.customer_id = p.customer_id
JOIN
reviews r ON c.customer_id = r.customer_id
WHERE
p.product = 'Galaxy Note20'
AND r.stars < 3;
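The joins above match any review written by a customer who bought the Galaxy Note20, including a review of a different product. A stricter sketch also ties the review to the purchased product. It assumes a products table with product_id and product_name columns, as used in Q.664; that table is not defined in this question, so treat it as an assumption.

```sql
SELECT c.name,
       c.email,
       r.stars AS rating
FROM customers c
JOIN purchases p ON c.customer_id = p.customer_id
JOIN reviews r   ON c.customer_id = r.customer_id
JOIN products pr ON r.product_id = pr.product_id
                AND pr.product_name = p.product  -- the review is for the purchased product
WHERE p.product = 'Galaxy Note20'
  AND r.stars < 3;
```

With the sample data, where each customer reviews only the product they bought, both forms return the same customers.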
• Q.673
Question
Identify the Customers Who Bought Galaxy Buds After Buying Samsung Galaxy S23
Ultra
Write a SQL query to identify customers who purchased Galaxy Buds after purchasing the
Samsung Galaxy S23 Ultra. Return the customer's name, email, and the purchase details for
both products.
Explanation
You need to:
• Join the customers and purchases tables to link customers with their purchase history.
• Use a self-join on the purchases table to match customers who bought both products,
ensuring that the purchase of Galaxy Buds occurs after the purchase of the Galaxy S23
Ultra.
• Filter based on the product names: Galaxy S23 Ultra first, and Galaxy Buds second.
• Return the relevant customer details, along with the product names and purchase dates.
• - Table creation
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100)
);
Learnings
• Self-Join: A self-join is required to match two different purchases of the same customer,
with one product bought before the other.
• Filtering Products: The query involves filtering for Galaxy S23 Ultra and Galaxy Buds.
• Ensuring Order of Purchases: The query ensures that Galaxy Buds are purchased after
the Galaxy S23 Ultra by comparing the purchase dates.
Solutions
• - PostgreSQL solution
SELECT
c.name,
c.email,
p1.product AS first_product,
p1.purchase_date AS first_purchase_date,
p2.product AS second_product,
p2.purchase_date AS second_purchase_date
FROM
customers c
JOIN
purchases p1 ON c.customer_id = p1.customer_id
JOIN
purchases p2 ON c.customer_id = p2.customer_id
WHERE
p1.product = 'Galaxy S23 Ultra'
AND p2.product = 'Galaxy Buds'
AND p1.purchase_date < p2.purchase_date;
• - MySQL solution
SELECT
c.name,
c.email,
p1.product AS first_product,
p1.purchase_date AS first_purchase_date,
p2.product AS second_product,
p2.purchase_date AS second_purchase_date
FROM
customers c
JOIN
purchases p1 ON c.customer_id = p1.customer_id
JOIN
purchases p2 ON c.customer_id = p2.customer_id
WHERE
p1.product = 'Galaxy S23 Ultra'
AND p2.product = 'Galaxy Buds'
AND p1.purchase_date < p2.purchase_date;
This query identifies the customers who bought Galaxy Buds after purchasing the Galaxy
S23 Ultra, along with the purchase details of both products.
• Q.674
Identify the Region with the Highest Total Sales of Samsung Galaxy S21 in 2022
Question
Write a SQL query to identify the region with the highest total sales (in units) of the
Samsung Galaxy S21 in 2022. Return the region name, total units sold, and the region's total
sales.
Explanation
• You need to join the sales, regions, and smartphone_models tables.
• Filter for the Samsung Galaxy S21 model and sales in 2022.
• Aggregate the total units sold per region.
• Return the region with the highest total units sold, along with the total units and sales.
Solutions
• - PostgreSQL solution
SELECT
r.region_name,
SUM(s.units_sold) AS total_units_sold,
SUM(s.units_sold * p.price_usd) AS total_sales
FROM
sales s
JOIN
regions r ON s.region_id = r.region_id
JOIN
smartphone_models sm ON s.model_id = sm.model_id
JOIN
products p ON sm.model_id = p.model_id
WHERE
sm.model_name = 'Galaxy S21'
AND EXTRACT(YEAR FROM s.sale_date) = 2022
GROUP BY
r.region_name
ORDER BY
total_units_sold DESC
LIMIT 1;
• - MySQL solution
SELECT
r.region_name,
SUM(s.units_sold) AS total_units_sold,
SUM(s.units_sold * p.price_usd) AS total_sales
FROM
sales s
JOIN
regions r ON s.region_id = r.region_id
JOIN
smartphone_models sm ON s.model_id = sm.model_id
JOIN
products p ON sm.model_id = p.model_id
WHERE
sm.model_name = 'Galaxy S21'
AND YEAR(s.sale_date) = 2022
GROUP BY
r.region_name
ORDER BY
total_units_sold DESC
LIMIT 1;
• Q.675
Find the Customers Who Have Given More Than One Review for the Same Product
Question
Write a SQL query to find all customers who have given more than one review for the same
Samsung product. Return the customer name, product name, and the count of reviews.
Explanation
• You need to join the customers, reviews, and products tables.
• Use GROUP BY to identify customers who have provided multiple reviews for the same
product.
• Filter for customers who have given more than one review for the same product.
Solutions
• - PostgreSQL solution
SELECT
c.name,
p.product_name,
COUNT(r.review_id) AS review_count
FROM
reviews r
JOIN
customers c ON r.customer_id = c.customer_id
JOIN
products p ON r.product_id = p.product_id
GROUP BY
c.customer_id, c.name, p.product_id, p.product_name
HAVING
COUNT(r.review_id) > 1;
• - MySQL solution
SELECT
c.name,
p.product_name,
COUNT(r.review_id) AS review_count
FROM
reviews r
JOIN
customers c ON r.customer_id = c.customer_id
JOIN
products p ON r.product_id = p.product_id
GROUP BY
c.customer_id, c.name, p.product_id, p.product_name
HAVING
COUNT(r.review_id) > 1;
• Q.676
Calculate the Total Revenue for Each Product in 2022
Question
Write a SQL query to calculate the total revenue for each Samsung product sold in 2022.
Revenue is calculated as the number of units sold multiplied by the product price.
Explanation
• Join the sales, products, and smartphone_models tables.
• Filter the sales data for 2022.
• Multiply the units sold by the product price to calculate total revenue.
• Return the total revenue for each product.
Solutions
• - PostgreSQL solution
SELECT
sm.model_name,
SUM(s.units_sold * p.price_usd) AS total_revenue
FROM
sales s
JOIN
smartphone_models sm ON s.model_id = sm.model_id
JOIN
products p ON sm.product_id = p.product_id
WHERE
EXTRACT(YEAR FROM s.sale_date) = 2022
GROUP BY
sm.model_name
ORDER BY
total_revenue DESC;
• - MySQL solution
SELECT
sm.model_name,
SUM(s.units_sold * p.price_usd) AS total_revenue
FROM
sales s
JOIN
smartphone_models sm ON s.model_id = sm.model_id
JOIN
products p ON sm.product_id = p.product_id
WHERE
YEAR(s.sale_date) = 2022
GROUP BY
sm.model_name
ORDER BY
total_revenue DESC;
• Q.677
Products Table
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50)
);
SQL Solution:
SELECT category, COUNT(*) AS total_products
FROM products
GROUP BY category;
• Q.678
Find Customers Who Have Bought More Than One Product
Question:
Write a SQL query to find customers who have purchased more than one product.
Explanation:
You need to:
• Join the customers and purchases tables.
• Group by customer and count the number of products they have bought.
• Filter customers who have purchased more than one product.
Customers Table
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100)
);
Purchases Table
CREATE TABLE purchases (
purchase_id INT PRIMARY KEY,
customer_id INT,
product_id INT,
purchase_date DATE
);
SQL Solution:
SELECT c.name, c.email
FROM customers c
JOIN purchases p ON c.customer_id = p.customer_id
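GROUP BY c.customer_id, c.name, c.email
-- DISTINCT counts different products, not repeat purchases of the same product
HAVING COUNT(DISTINCT p.product_id) > 1;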
Explanation:
To identify products that have received a rating below 3 for more than 50% of their reviews:
• Join the reviews and products tables on the product_id to get product details for each
review.
• Calculate the percentage of reviews with a rating below 3 for each product by using COUNT
with a CASE expression that matches only the reviews with stars below 3.
• Filter products where the percentage of reviews with a rating below 3 is greater than 50%.
Reviews Table
CREATE TABLE reviews (
review_id INT PRIMARY KEY,
customer_id INT,
product_id INT,
stars INT,
review_date DATE
);
Products Table
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100)
);
SQL Solution:
SELECT p.product_name
FROM products p
JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.product_name
HAVING COUNT(CASE WHEN r.stars < 3 THEN 1 END) * 1.0 / COUNT(r.review_id) > 0.5;
• The COUNT(CASE WHEN r.stars < 3 THEN 1 END) counts the number of reviews with a
rating below 3 for each product.
• The COUNT(r.review_id) counts the total number of reviews for each product.
• HAVING Clause:
• We calculate the ratio of reviews with a rating below 3 by dividing the count of low-star
reviews by the total number of reviews.
• The HAVING clause filters products where the ratio of low-star reviews is greater than 50%
(i.e., more than 0.5).
Key Takeaways:
• CASE WHEN is used to selectively count reviews that meet the condition of having a
rating below 3.
• HAVING is used after aggregation (GROUP BY) to filter based on the calculated ratio.
• JOIN ensures we link products with their corresponding reviews to analyze the ratings.
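The ratio check can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in database (SQLite accepts this query unchanged), and the sample rows are hypothetical: the first product has 2 of 3 reviews below 3 stars, the second has 1 of 3.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the book).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INT PRIMARY KEY, product_name VARCHAR(100));
CREATE TABLE reviews (
    review_id INT PRIMARY KEY, customer_id INT, product_id INT,
    stars INT, review_date DATE
);
INSERT INTO products VALUES (1, 'Galaxy Buds'), (2, 'Galaxy Watch');
INSERT INTO reviews VALUES
    (1, 10, 1, 1, '2023-01-01'),
    (2, 11, 1, 2, '2023-01-02'),
    (3, 12, 1, 5, '2023-01-03'),
    (4, 10, 2, 2, '2023-01-04'),
    (5, 11, 2, 4, '2023-01-05'),
    (6, 12, 2, 5, '2023-01-06');
""")

# CASE yields NULL for reviews with 3+ stars, and COUNT skips NULLs,
# so the numerator counts only the low-star reviews.
low_rated = conn.execute("""
SELECT p.product_name
FROM products p
JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.product_name
HAVING COUNT(CASE WHEN r.stars < 3 THEN 1 END) * 1.0 / COUNT(r.review_id) > 0.5
""").fetchall()

print(low_rated)
```

Multiplying by 1.0 matters here: without it, databases that use integer division would turn 2 / 3 into 0 and no product would pass the filter.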
• Q.680
Find Customers Who Have Never Bought a Galaxy Model
Explanation:
To identify customers who have never bought a Samsung Galaxy model:
• Use a LEFT JOIN between the customers and purchases tables, restricting the join to Galaxy
products.
• Keep only the customers with no matching Galaxy purchase by checking in the WHERE clause
that a column from the purchases side (here purchase_id) is NULL.
Customers Table
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100)
);
Purchases Table
CREATE TABLE purchases (
purchase_id INT PRIMARY KEY,
customer_id INT,
product VARCHAR(100),
purchase_date DATE
);
SQL Solution:
SELECT c.customer_id, c.name, c.email
FROM customers c
LEFT JOIN purchases p ON c.customer_id = p.customer_id AND p.product LIKE 'Galaxy%'
WHERE p.purchase_id IS NULL;
Key Takeaways:
• LEFT JOIN ensures that all customers are included in the result, even if they don't have
matching records in the purchases table.
• LIKE 'Galaxy%' is used to filter only Galaxy model purchases.
• NULL check (WHERE p.purchase_id IS NULL) identifies customers who have no
records for Galaxy products.
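The anti-join pattern can be checked with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in database, and the customers and purchases below are hypothetical; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the book).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INT PRIMARY KEY, name VARCHAR(100), email VARCHAR(100));
CREATE TABLE purchases (
    purchase_id INT PRIMARY KEY, customer_id INT,
    product VARCHAR(100), purchase_date DATE
);
INSERT INTO customers VALUES
    (1, 'Asha', 'asha@example.com'),
    (2, 'Ben', 'ben@example.com'),
    (3, 'Chen', 'chen@example.com');
INSERT INTO purchases VALUES
    (1, 1, 'Galaxy S23', '2023-02-01'),
    (2, 2, 'Pixel 8', '2023-02-03'),
    (3, 1, 'Galaxy Buds', '2023-03-10');
""")

# The Galaxy filter sits in the ON clause, so customers without a Galaxy
# purchase still appear once, with NULLs on the purchases side.
never_bought_galaxy = conn.execute("""
SELECT c.customer_id, c.name, c.email
FROM customers c
LEFT JOIN purchases p ON c.customer_id = p.customer_id AND p.product LIKE 'Galaxy%'
WHERE p.purchase_id IS NULL
ORDER BY c.customer_id
""").fetchall()

print(never_bought_galaxy)
```

Ben (who bought only a non-Galaxy product) and Chen (who bought nothing) are both returned. Moving the LIKE condition into the WHERE clause would instead drop both of them, because the NULL rows produced by the LEFT JOIN would fail the filter.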
IBM
• Q.681
Calculate Average Sales per Region
Explanation
IBM is analyzing sales data across multiple regions. The task is to calculate the average sales
for each region. You need to group the data by region and return the average sales for each
region.
sale_date DATE
);
• - Datasets
INSERT INTO sales (sale_id, region, sale_amount, sale_date)
VALUES
(1, 'North', 200.00, '2023-01-01'),
(2, 'South', 300.00, '2023-01-05'),
(3, 'East', 150.00, '2023-01-07'),
(4, 'North', 400.00, '2023-01-10'),
(5, 'South', 250.00, '2023-01-12');
Learnings
• Grouping data by region
• Calculating the average of a column using AVG()
• Summarizing sales data by region
Solutions
• - PostgreSQL solution
SELECT region, AVG(sale_amount) AS avg_sales
FROM sales
GROUP BY region
ORDER BY region;
• - MySQL solution
SELECT region, AVG(sale_amount) AS avg_sales
FROM sales
GROUP BY region
ORDER BY region;
• Q.682
Total Sales by Product
Explanation
IBM wants to know how much total sales were made for each product. The task is to sum the
sales for each product and return the total sales amount for each product.
Learnings
• Grouping data by product
• Calculating the sum of a column using SUM()
Solutions
• - PostgreSQL solution
SELECT product_name, SUM(sale_amount) AS total_sales
FROM product_sales
GROUP BY product_name
ORDER BY total_sales DESC;
• - MySQL solution
SELECT product_name, SUM(sale_amount) AS total_sales
FROM product_sales
GROUP BY product_name
ORDER BY total_sales DESC;
• Q.683
Counting Orders Above a Threshold
Explanation
IBM is analyzing customer order data and wants to count how many orders exceed a
specified amount (e.g., $500). The task is to count the number of orders above the given
threshold.
Learnings
• Using the COUNT() function to count rows
• Filtering data using WHERE clause with a threshold
• Aggregating data based on a condition
Solutions
• - PostgreSQL solution
SELECT COUNT(*) AS orders_above_500
FROM customer_orders
WHERE order_amount > 500;
• - MySQL solution
SELECT COUNT(*) AS orders_above_500
FROM customer_orders
WHERE order_amount > 500;
• Q.684
Learnings
• Using EXTRACT(MONTH FROM date) to get the month from a date
• Aggregating data by region and month
• Calculating the total revenue for each region per month
• Sorting the results by month and region
Solutions
• - PostgreSQL solution
SELECT region,
EXTRACT(MONTH FROM sale_date) AS month,
SUM(sale_amount) AS total_revenue
FROM sales
GROUP BY region, EXTRACT(MONTH FROM sale_date)
ORDER BY month, region;
• - MySQL solution
SELECT region,
MONTH(sale_date) AS month,
SUM(sale_amount) AS total_revenue
FROM sales
GROUP BY region, MONTH(sale_date)
ORDER BY month, region;
• Q.685
Identify Customers Who Have Made More Than 3 Orders in a Month
Explanation
IBM wants to identify customers who are highly engaged by placing more than 3 orders
within a given month. The task is to return the list of customer IDs who have made more than
3 orders in any given month.
Learnings
• Using COUNT() to count the number of orders per customer
• Grouping by customer_id and month
• Filtering customers who have made more than 3 orders in a month
• Using HAVING to apply conditions to aggregated results
Solutions
• - PostgreSQL solution
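-- Group by year as well as month so the same month in different years is not merged
SELECT customer_id,
EXTRACT(YEAR FROM order_date) AS order_year,
EXTRACT(MONTH FROM order_date) AS month,
COUNT(*) AS order_count
FROM customer_orders
GROUP BY customer_id, EXTRACT(YEAR FROM order_date), EXTRACT(MONTH FROM order_date)
HAVING COUNT(*) > 3
ORDER BY customer_id, order_year, month;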
• - MySQL solution
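-- Group by year as well as month so the same month in different years is not merged
SELECT customer_id,
YEAR(order_date) AS order_year,
MONTH(order_date) AS month,
COUNT(*) AS order_count
FROM customer_orders
GROUP BY customer_id, YEAR(order_date), MONTH(order_date)
HAVING COUNT(*) > 3
ORDER BY customer_id, order_year, month;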
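One thing to watch with month-based grouping is that a bare month number does not distinguish years. The sketch below illustrates the difference; it assumes Python's built-in sqlite3 module as a stand-in database, uses SQLite's strftime in place of EXTRACT or MONTH, and works on a hypothetical, minimal customer_orders table.

```python
import sqlite3

# Hypothetical data: customer 1 places 2 orders in Jan 2022 and 2 in Jan 2023;
# customer 2 places 4 orders in Mar 2023.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_orders (order_id INT PRIMARY KEY, customer_id INT, order_date DATE);
INSERT INTO customer_orders VALUES
    (1, 1, '2022-01-05'), (2, 1, '2022-01-20'),
    (3, 1, '2023-01-07'), (4, 1, '2023-01-25'),
    (5, 2, '2023-03-01'), (6, 2, '2023-03-09'),
    (7, 2, '2023-03-15'), (8, 2, '2023-03-28');
""")

# Month number only: January 2022 and January 2023 fall into the same group.
by_month_only = conn.execute("""
SELECT customer_id, strftime('%m', order_date) AS month, COUNT(*) AS order_count
FROM customer_orders
GROUP BY customer_id, strftime('%m', order_date)
HAVING COUNT(*) > 3
ORDER BY customer_id
""").fetchall()

# Year and month together: each calendar month is its own group.
by_year_and_month = conn.execute("""
SELECT customer_id, strftime('%Y-%m', order_date) AS year_month, COUNT(*) AS order_count
FROM customer_orders
GROUP BY customer_id, strftime('%Y-%m', order_date)
HAVING COUNT(*) > 3
ORDER BY customer_id
""").fetchall()

print(by_month_only)
print(by_year_and_month)
```

With the month number alone, customer 1 is reported with 4 orders even though no single calendar month has more than 2; grouping by year and month reports only customer 2.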
• Q.686
Product Sales Performance Based on Region and Category
Explanation
IBM wants to assess product performance by region and category. The task is to calculate the
total sales for each product category and region for the last quarter, ensuring to exclude
products with sales below a threshold (e.g., $500).
category VARCHAR(50),
region VARCHAR(50),
sale_amount DECIMAL(10, 2),
sale_date DATE
);
• - Datasets
INSERT INTO product_sales
(sale_id, product_id, product_name, category, region, sale_amount, sale_date)
VALUES
(1, 101, 'Laptop', 'Electronics', 'North', 1500.00, '2023-03-05'),
(2, 102, 'Smartphone', 'Electronics', 'South', 600.00, '2023-03-10'),
(3, 103, 'Tablet', 'Electronics', 'East', 200.00, '2023-03-15'),
(4, 104, 'Monitor', 'Electronics', 'North', 350.00, '2023-03-20'),
(5, 105, 'Headphones', 'Accessories', 'South', 700.00, '2023-03-25'),
(6, 106, 'Charger', 'Accessories', 'East', 100.00, '2023-03-30');
Learnings
• Aggregating sales by category and region
• Filtering products with sales above a threshold
• Using GROUP BY to calculate total sales per category and region
• Working with date ranges for the last quarter
Solutions
• - PostgreSQL solution
SELECT category, region, SUM(sale_amount) AS total_sales
FROM product_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-03-31'
GROUP BY category, region
HAVING SUM(sale_amount) > 500
ORDER BY total_sales DESC;
• - MySQL solution
SELECT category, region, SUM(sale_amount) AS total_sales
FROM product_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-03-31'
GROUP BY category, region
HAVING SUM(sale_amount) > 500
ORDER BY total_sales DESC;
• Q.687
Extract Product Code from Product Name
Explanation
IBM wants to clean up product data by extracting product codes from the product names,
which are formatted as "Product Name - [ProductCode]". The task is to extract only the
ProductCode from the product name, ensuring the product code is a 6-character
alphanumeric string.
Learnings
• Using REGEXP (or REGEXP_REPLACE) for pattern matching
• Extracting a substring with regular expressions
• Handling alphanumeric patterns with wildcards in regex
Solutions
• - PostgreSQL solution
SELECT product_id,
REGEXP_REPLACE(product_name, '^.* - ([A-Za-z0-9]{6})$', '\1') AS product_code
FROM products;
• - MySQL solution
SELECT product_id,
REGEXP_SUBSTR(product_name, '([A-Za-z0-9]{6})$') AS product_code
FROM products;
• Q.688
Find All Email Addresses with Specific Domain
Explanation
IBM wants to identify all the email addresses from a specific domain (e.g., @ibm.com). Write
a query that uses wildcards to find email addresses from the customers table that end with
@ibm.com.
Learnings
• Using wildcards (% in LIKE clause) for pattern matching
• Matching email addresses based on a domain name
• Using LIKE for simple text pattern matching
Solutions
• - PostgreSQL solution
SELECT customer_id, customer_name, email
FROM customers
WHERE email LIKE '%@ibm.com';
• - MySQL solution
SELECT customer_id, customer_name, email
FROM customers
WHERE email LIKE '%@ibm.com';
• Q.689
Masking Part of Credit Card Numbers
Explanation
IBM needs to ensure that only part of the credit card number is visible in reports. The goal is
to mask the middle digits of the credit card number, leaving the first 4 and last 4 digits
visible (e.g., 1234-****-****-5678).
Write a query to achieve this using text functions and regular expressions.
Learnings
• Using REGEXP_REPLACE to mask parts of a string
• Regular expressions to replace a range of characters
• Working with text manipulation functions in SQL
Solutions
• - PostgreSQL solution
SELECT payment_id, customer_id,
REGEXP_REPLACE(credit_card_number, '(\d{4})\d{8}(\d{4})', '\1-****-****-\2')
AS masked_card
FROM payments;
• - MySQL solution
SELECT payment_id, customer_id,
CONCAT(SUBSTRING(credit_card_number, 1, 4), '-****-****-',
SUBSTRING(credit_card_number, 13, 4)) AS masked_card
FROM payments;
• Masking sensitive data (e.g., credit card numbers) using regular expressions.
• Q.690
Calculate Discounted Price Based on Product Category
Explanation
IBM wants to calculate the discounted price for each product. The discount is applied based
on the product category as follows:
• Electronics: 10% discount
• Clothing: 20% discount
• Home Goods: 15% discount
• All other categories: no discount
Write a SQL query that calculates the discounted price for each product, using a CASE
statement based on the category.
Learnings
• Using the CASE statement for conditional logic
• Performing calculations based on categories
• Applying discounts based on different conditions
Solutions
• - PostgreSQL solution
SELECT product_id, product_name, category, price,
CASE
WHEN category = 'Electronics' THEN price * 0.90
WHEN category = 'Clothing' THEN price * 0.80
WHEN category = 'Home Goods' THEN price * 0.85
ELSE price
END AS discounted_price
FROM products;
• - MySQL solution
SELECT product_id, product_name, category, price,
CASE
WHEN category = 'Electronics' THEN price * 0.90
WHEN category = 'Clothing' THEN price * 0.80
WHEN category = 'Home Goods' THEN price * 0.85
ELSE price
END AS discounted_price
FROM products;
• Q.691
Find Customers Who Have Purchased More Than One Product in
a Single Transaction
Explanation
IBM wants to identify customers who have purchased more than one product in a single
transaction. The task is to join the orders and order_items tables and use a CASE statement
to flag whether a customer bought multiple products in the same transaction.
Learnings
• Using JOIN to combine data from related tables
• Using COUNT() and GROUP BY to count the number of products in each order
• Using CASE to flag when a customer buys more than one product in a transaction
Solutions
• - PostgreSQL solution
SELECT o.customer_id, o.order_id,
CASE
WHEN COUNT(oi.product_id) > 1 THEN 'Multiple Products'
ELSE 'Single Product'
END AS product_type
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY o.customer_id, o.order_id;
• - MySQL solution
SELECT o.customer_id, o.order_id,
CASE
Learnings
• Using CASE to create conditional categories based on sales vs target
• Performing comparison operations (>, <, >=, <=) within CASE statements
• Labeling employees based on their performance
Solutions
• - PostgreSQL solution
SELECT employee_id, employee_name, sales, target,
CASE
WHEN sales > target * 1.2 THEN 'Exceeded'
WHEN sales >= target THEN 'Met'
ELSE 'Below Target'
END AS performance
FROM employee_sales;
• - MySQL solution
SELECT employee_id, employee_name, sales, target,
CASE
WHEN sales > target * 1.2 THEN 'Exceeded'
WHEN sales >= target THEN 'Met'
ELSE 'Below Target'
END AS performance
FROM employee_sales;
Learnings
• Using aggregate functions (SUM) to calculate the total investment per investor.
• Using GROUP BY to group the data by country and get the highest investment.
• Using window functions like ROW_NUMBER() to rank investors and select the top investor
for each country.
Solutions
• - PostgreSQL solution
WITH ranked_investors AS (
SELECT
i.investor_id,
i.investor_name,
i.country,
SUM(inv.investment_amount) AS total_investment,
ROW_NUMBER() OVER (
PARTITION BY i.country ORDER BY SUM(inv.investment_amount) DESC
) AS rank
FROM
investors i
JOIN investments inv ON i.investor_id = inv.investor_id
GROUP BY
i.investor_id, i.investor_name, i.country
)
SELECT
investor_id,
investor_name,
country,
total_investment
FROM ranked_investors
WHERE rank = 1
ORDER BY country;
• - MySQL solution
WITH ranked_investors AS (
SELECT
i.investor_id,
i.investor_name,
i.country,
SUM(inv.investment_amount) AS total_investment,
-- RANK is a reserved word in MySQL 8.0, so the row number is aliased as rn
ROW_NUMBER() OVER (
PARTITION BY i.country ORDER BY SUM(inv.investment_amount) DESC
) AS rn
FROM
investors i
JOIN investments inv ON i.investor_id = inv.investor_id
GROUP BY
i.investor_id, i.investor_name, i.country
)
SELECT
investor_id,
investor_name,
country,
total_investment
FROM ranked_investors
WHERE rn = 1
ORDER BY country;
• Q.694
Find the Most Active Investors in IBM by Investment Growth Rate
Explanation
IBM wants to track investors who have shown the highest growth rate in their investments
over the past year. The growth rate is calculated as the percentage change in the total
investment made in the last 12 months compared to the previous 12 months.
The task is to calculate the investment growth rate for each investor and identify the top
investors by growth. The solution should take into account the following:
• The data needs to be grouped by each investor.
• For each investor, compare their investment in the last 12 months to the investment in the
previous 12 months.
• Return investors with a positive growth rate, sorted in descending order by growth.
Learnings
• Using conditional aggregation (SUM with CASE) to total investments across different time frames.
• Calculating percentage change in investment over time.
• Combining aggregation and time-based filtering (last 12 months, previous 12 months).
Solutions
• - PostgreSQL solution
WITH investment_growth AS (
SELECT
i.investor_id,
i.investor_name,
SUM(CASE WHEN inv.investment_date >= CURRENT_DATE - INTERVAL '1 year'
THEN inv.investment_amount ELSE 0 END) AS last_year_investment,
SUM(CASE WHEN inv.investment_date >= CURRENT_DATE - INTERVAL '2 year'
AND inv.investment_date < CURRENT_DATE - INTERVAL '1 year'
THEN inv.investment_amount ELSE 0 END) AS previous_year_investment
FROM
investors i
JOIN investments inv ON i.investor_id = inv.investor_id
GROUP BY
i.investor_id, i.investor_name
)
SELECT
investor_id,
investor_name,
last_year_investment,
previous_year_investment,
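ROUND(((last_year_investment - previous_year_investment) / previous_year_investment)
* 100, 2) AS growth_rate
FROM
investment_growth
WHERE
previous_year_investment > 0
-- Keep only investors whose growth rate is positive
AND last_year_investment > previous_year_investment
ORDER BY
growth_rate DESC;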
• - MySQL solution
WITH investment_growth AS (
SELECT
i.investor_id,
i.investor_name,
SUM(CASE WHEN inv.investment_date >= CURDATE() - INTERVAL 1 YEAR
THEN inv.investment_amount ELSE 0 END) AS last_year_investment,
SUM(CASE WHEN inv.investment_date >= CURDATE() - INTERVAL 2 YEAR
AND inv.investment_date < CURDATE() - INTERVAL 1 YEAR
THEN inv.investment_amount ELSE 0 END) AS previous_year_investment
FROM
investors i
JOIN investments inv ON i.investor_id = inv.investor_id
GROUP BY
i.investor_id, i.investor_name
)
SELECT
investor_id,
investor_name,
last_year_investment,
previous_year_investment,
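ROUND(((last_year_investment - previous_year_investment) / previous_year_investment)
* 100, 2) AS growth_rate
FROM
investment_growth
WHERE
previous_year_investment > 0
-- Keep only investors whose growth rate is positive
AND last_year_investment > previous_year_investment
ORDER BY
growth_rate DESC;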
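The growth-rate arithmetic can be verified on a few rows. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in database, SQLite's date() modifiers in place of INTERVAL arithmetic, a fixed reference date instead of the current date so the result is reproducible, and hypothetical investors and amounts.

```python
import sqlite3

# Hypothetical data, evaluated as of a fixed reference date (2024-01-01).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE investors (investor_id INT PRIMARY KEY, investor_name VARCHAR(100));
CREATE TABLE investments (
    investment_id INT PRIMARY KEY, investor_id INT,
    investment_amount REAL, investment_date DATE
);
INSERT INTO investors VALUES (1, 'Ana'), (2, 'Raj'), (3, 'Mei');
INSERT INTO investments VALUES
    (1, 1, 1000, '2022-06-01'),  -- Ana, previous 12 months
    (2, 1,  900, '2023-03-01'),  -- Ana, last 12 months
    (3, 1,  600, '2023-09-01'),  -- Ana, last 12 months
    (4, 2, 2000, '2022-05-01'),  -- Raj, previous 12 months
    (5, 2, 1000, '2023-05-01'),  -- Raj, last 12 months
    (6, 3,  500, '2023-07-01');  -- Mei, last 12 months only
""")

growth = conn.execute("""
WITH investment_growth AS (
    SELECT i.investor_id, i.investor_name,
           SUM(CASE WHEN inv.investment_date >= date(:ref, '-1 year')
                    THEN inv.investment_amount ELSE 0 END) AS last_year_investment,
           SUM(CASE WHEN inv.investment_date >= date(:ref, '-2 years')
                     AND inv.investment_date < date(:ref, '-1 year')
                    THEN inv.investment_amount ELSE 0 END) AS previous_year_investment
    FROM investors i
    JOIN investments inv ON i.investor_id = inv.investor_id
    GROUP BY i.investor_id, i.investor_name
)
SELECT investor_id, investor_name,
       ROUND((last_year_investment - previous_year_investment) * 100.0
             / previous_year_investment, 2) AS growth_rate
FROM investment_growth
WHERE previous_year_investment > 0                      -- avoid division by zero
  AND last_year_investment > previous_year_investment   -- positive growth only
ORDER BY growth_rate DESC
""", {"ref": "2024-01-01"}).fetchall()

print(growth)
```

Ana grows from 1000 to 1500, a rate of (1500 - 1000) / 1000 * 100 = 50%. Raj's total falls, so his growth is negative, and Mei has no investment in the previous period, so her rate is undefined; both are filtered out.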
• Q.695
Identify the Most Profitable Countries for IBM Investors
Explanation
IBM is interested in analyzing which countries bring the highest total investment profit. In
this case, the profit is calculated by comparing the total investments made by the investors
from each country and sorting by the total amount invested.
The task is to write a SQL query that calculates the total investment amount per country
and ranks countries by their total investment in IBM.
Learnings
• Aggregating data by country using SUM.
• Using GROUP BY and ORDER BY to rank countries by total investment.
• Combining JOINs with aggregate functions to calculate total investments.
Solutions
• - PostgreSQL solution
SELECT
i.country,
SUM(inv.investment_amount) AS total_investment
FROM
investors i
JOIN investments inv ON i.investor_id = inv.investor_id
GROUP BY
i.country
ORDER BY
total_investment DESC;
• - MySQL solution
SELECT
i.country,
SUM(inv.investment_amount) AS total_investment
FROM
investors i
JOIN investments inv ON i.investor_id = inv.investor_id
GROUP BY
i.country
ORDER BY
total_investment DESC;
• Q.696
Track Investors Who Have Increased Investment in AI Products Over Time
Explanation
IBM wants to identify investors who have shown increased interest in AI-related products
over time. The goal is to track whether an investor has increased their investment in
products like IBM AI or IBM Watson by comparing the current investment amount to the
investment amount from the same period last year.
The solution should:
• Identify investments in AI-related products.
• Calculate the year-over-year investment growth for each investor.
• Return investors who have increased their investments in AI products over the past year.
Learnings
• Using DATE functions to calculate year-over-year changes.
• Filtering specific product categories (AI-related products).
• Using self-joins to compare investment amounts over time.
Solutions
• - PostgreSQL solution
WITH current_year_investments AS (
SELECT
inv.investor_id,
inv.product_id,
SUM(inv.investment_amount) AS current_investment
FROM
investments inv
JOIN ai_products aip ON inv.product_id = aip.product_id
WHERE
inv.investment_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY
inv.investor_id, inv.product_id
),
previous_year_investments AS (
SELECT
inv.investor_id,
inv.product_id,
SUM(inv.investment_amount) AS previous_investment
FROM
investments inv
JOIN ai_products aip ON inv.product_id = aip.product_id
WHERE
generated by that product. The task is to calculate the ROI for each product and identify the
investors who have invested in the top 3 products with the highest ROI.
The solution should:
• Calculate the ROI for each product.
• Identify the top 3 products with the highest ROI.
• List the investors who have invested in these top 3 products, along with their total
investment amount.
Learnings
• Calculating ROI as the ratio of investment to revenue.
• Using JOINs to combine revenue and investment data.
• Filtering products based on top N values (highest ROI).
Solutions
• - PostgreSQL solution
WITH product_roi AS (
SELECT
inv.product_id,
SUM(inv.investment_amount) AS total_investment,
r.revenue,
SUM(inv.investment_amount) / r.revenue AS roi
FROM
investments inv
JOIN revenue r ON inv.product_id = r.product_id
GROUP BY
inv.product_id, r.revenue
ORDER BY
roi DESC
LIMIT 3
)
SELECT
i.investor_id,
i.investor_name,
SUM(inv.investment_amount) AS total_investment
FROM
investors i
JOIN investments inv ON i.investor_id = inv.investor_id
JOIN product_roi pr ON inv.product_id = pr.product_id
GROUP BY
i.investor_id, i.investor_name
ORDER BY
total_investment DESC;
• - MySQL solution
WITH product_roi AS (
SELECT
inv.product_id,
SUM(inv.investment_amount) AS total_investment,
r.revenue,
SUM(inv.investment_amount) / r.revenue AS roi
FROM
investments inv
JOIN revenue r ON inv.product_id = r.product_id
GROUP BY
inv.product_id, r.revenue
ORDER BY
roi DESC
LIMIT 3
)
SELECT
i.investor_id,
i.investor_name,
SUM(inv.investment_amount) AS total_investment
FROM
investors i
JOIN investments inv ON i.investor_id = inv.investor_id
JOIN product_roi pr ON inv.product_id = pr.product_id
GROUP BY
i.investor_id, i.investor_name
ORDER BY
total_investment DESC;
• Q.698
Calculate Project Hierarchy with Recursive CTE
Explanation
IBM has a project management database where each project can have sub-projects (child
projects). The task is to return the project hierarchy, showing each project along with its
level in the hierarchy and its parent project (if any).
For this, you’ll need to use a recursive CTE to find the hierarchy of projects and display
them with their parent-child relationships.
Learnings
• Using recursive CTE to navigate hierarchical structures
• Handling parent-child relationships in relational databases
Solutions
• - PostgreSQL solution
WITH RECURSIVE project_hierarchy AS (
-- Base case: Select the root projects
SELECT project_id, project_name, parent_project_id, 1 AS level
FROM projects
WHERE parent_project_id IS NULL
UNION ALL
-- Recursive case: Select child projects and increment the level
SELECT p.project_id, p.project_name, p.parent_project_id, ph.level + 1
FROM projects p
INNER JOIN project_hierarchy ph ON p.parent_project_id = ph.project_id
)
SELECT project_id, project_name, parent_project_id, level
FROM project_hierarchy
ORDER BY level, parent_project_id;
• - MySQL solution
WITH RECURSIVE project_hierarchy AS (
-- Base case: Select the root projects
SELECT project_id, project_name, parent_project_id, 1 AS level
FROM projects
WHERE parent_project_id IS NULL
UNION ALL
-- Recursive case: Select child projects and increment the level
SELECT p.project_id, p.project_name, p.parent_project_id, ph.level + 1
FROM projects p
INNER JOIN project_hierarchy ph ON p.parent_project_id = ph.project_id
)
SELECT project_id, project_name, parent_project_id, level
FROM project_hierarchy
ORDER BY level, parent_project_id;
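The recursive query can be run on a small tree to see how the level is built up. The sketch assumes Python's built-in sqlite3 module as a stand-in database (SQLite also supports WITH RECURSIVE); the project names are hypothetical, and project_id is added to the ORDER BY only to make the output deterministic.

```python
import sqlite3

# Hypothetical hierarchy: Cloud Platform -> (Storage -> Block Storage, Compute)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (
    project_id INT PRIMARY KEY, project_name VARCHAR(100), parent_project_id INT
);
INSERT INTO projects VALUES
    (1, 'Cloud Platform', NULL),
    (2, 'Storage', 1),
    (3, 'Compute', 1),
    (4, 'Block Storage', 2);
""")

hierarchy = conn.execute("""
WITH RECURSIVE project_hierarchy AS (
    -- Base case: root projects have no parent and sit at level 1
    SELECT project_id, project_name, parent_project_id, 1 AS level
    FROM projects
    WHERE parent_project_id IS NULL
    UNION ALL
    -- Recursive case: each child is one level below its parent
    SELECT p.project_id, p.project_name, p.parent_project_id, ph.level + 1
    FROM projects p
    INNER JOIN project_hierarchy ph ON p.parent_project_id = ph.project_id
)
SELECT project_id, project_name, parent_project_id, level
FROM project_hierarchy
ORDER BY level, parent_project_id, project_id
""").fetchall()

for row in hierarchy:
    print(row)
```

The recursion stops on its own once an iteration finds no further children, which holds as long as the parent links contain no cycle.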
• Q.699
Find Projects with Circular Dependencies
Explanation
IBM’s project database has a circular dependency problem, where projects might
mistakenly reference each other as parent-child in a cycle. The goal is to write a recursive
query that can detect and return projects that are part of a circular reference, i.e., projects
where one project is a descendant of itself.
Learnings
• Using recursive CTEs to detect cycles or circular dependencies
Solutions
• - PostgreSQL solution
WITH RECURSIVE project_paths AS (
-- Base case: start a walk from every project that has a parent
SELECT project_id AS start_id, project_name,
parent_project_id AS current_id, ARRAY[project_id] AS visited
FROM projects
WHERE parent_project_id IS NOT NULL
UNION ALL
-- Recursive case: step to the next ancestor; stop once a project repeats,
-- otherwise a cycle would make the recursion run forever
SELECT pp.start_id, pp.project_name, p.parent_project_id, pp.visited || p.project_id
FROM project_paths pp
INNER JOIN projects p ON p.project_id = pp.current_id
WHERE p.project_id <> ALL (pp.visited)
)
-- A project is part of a cycle if the walk arrives back at its starting project
SELECT DISTINCT start_id AS project_id, project_name
FROM project_paths
WHERE current_id = start_id;
• - MySQL solution
WITH RECURSIVE project_paths AS (
-- Base case: start a walk from every project that has a parent
SELECT project_id AS start_id, project_name,
parent_project_id AS current_id, CAST(project_id AS CHAR(1000)) AS visited
FROM projects
WHERE parent_project_id IS NOT NULL
UNION ALL
-- Recursive case: step to the next ancestor; stop once a project repeats,
-- otherwise a cycle would make the recursion run forever
SELECT pp.start_id, pp.project_name, p.parent_project_id,
CONCAT(pp.visited, ',', p.project_id)
FROM project_paths pp
INNER JOIN projects p ON p.project_id = pp.current_id
WHERE FIND_IN_SET(p.project_id, pp.visited) = 0
)
-- A project is part of a cycle if the walk arrives back at its starting project
SELECT DISTINCT start_id AS project_id, project_name
FROM project_paths
WHERE current_id = start_id;
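A recursive walk over data that contains a cycle only terminates if each walk remembers the projects it has already visited. The sketch below shows one way to do that; it assumes Python's built-in sqlite3 module as a stand-in database, stores the visited list as a comma-delimited string (an illustrative choice, since SQLite has no array type), and uses hypothetical projects in which 2, 3 and 4 form a cycle.

```python
import sqlite3

# Hypothetical data: B -> D -> C -> B is a cycle; F hangs off the cycle
# without being part of it; E has a normal parent.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (
    project_id INT PRIMARY KEY, project_name VARCHAR(100), parent_project_id INT
);
INSERT INTO projects VALUES
    (1, 'A', NULL),
    (2, 'B', 4),
    (3, 'C', 2),
    (4, 'D', 3),
    (5, 'E', 1),
    (6, 'F', 2);
""")

cyclic = conn.execute("""
WITH RECURSIVE project_paths AS (
    -- Base case: one walk per project, starting at its parent
    SELECT project_id AS start_id, project_name,
           parent_project_id AS current_id,
           ',' || project_id || ',' AS visited
    FROM projects
    WHERE parent_project_id IS NOT NULL
    UNION ALL
    -- Recursive case: move to the next ancestor unless it was already visited
    SELECT pp.start_id, pp.project_name, p.parent_project_id,
           pp.visited || p.project_id || ','
    FROM project_paths pp
    JOIN projects p ON p.project_id = pp.current_id
    WHERE instr(pp.visited, ',' || p.project_id || ',') = 0
)
-- A walk that arrives back at its starting project has found a cycle
SELECT DISTINCT start_id, project_name
FROM project_paths
WHERE current_id = start_id
ORDER BY start_id
""").fetchall()

print(cyclic)
```

Project F is not reported even though its ancestors loop, because its walk never returns to F itself; only B, C and D are descendants of themselves.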
• Q.700
Find Project Milestones Over Time
Explanation
IBM wants to analyze project milestones over time. Each project has a set of milestones, and
the task is to find all milestones that occurred in the last three months for each project,
ordered by the milestone date. This involves a recursive CTE to calculate dates relative to
the current date and joining with the milestones table.
Learnings
• Using recursive CTEs to calculate dates and manipulate time-based data
• Working with date functions in SQL
• Joining CTEs with actual tables to retrieve relevant data
Solutions
• - PostgreSQL solution
-- A plain date filter is sufficient here. A recursive step that re-joins
-- milestones on project_id would keep producing rows and never terminate.
WITH recent_milestones AS (
SELECT milestone_id, project_id, milestone_name, milestone_date
FROM milestones
WHERE milestone_date > CURRENT_DATE - INTERVAL '3 months'
)
SELECT project_id, milestone_name, milestone_date
FROM recent_milestones
ORDER BY milestone_date;
• - MySQL solution
-- A plain date filter is sufficient here. A recursive step that re-joins
-- milestones on project_id would keep producing rows and never terminate.
WITH recent_milestones AS (
SELECT milestone_id, project_id, milestone_name, milestone_date
FROM milestones
WHERE milestone_date > CURDATE() - INTERVAL 3 MONTH
)
SELECT project_id, milestone_name, milestone_date
FROM recent_milestones
ORDER BY milestone_date;
Dell
• Q.701
Question
Identifying Top Purchasing Customers
Dell needs to identify the top customers who have spent the most on their products over the
past year. This analysis will help the company identify its "power users" for special offers
and future product development.
Explanation
The task requires calculating the total amount spent by each customer over the past year. This
involves joining the orders and products tables, filtering orders made in the last year,
summing the total spent by each user, and retrieving the top 5 users based on the total spent.
Datasets and SQL Schemas
• - Orders table creation
CREATE TABLE orders (
order_id INT,
user_id INT,
order_date DATE,
product_id INT
);
• - Products table creation
CREATE TABLE products (
product_id INT,
product_name VARCHAR(255),
price_usd DECIMAL(10, 2)
);
• - Insert sample data into orders table
INSERT INTO orders (order_id, user_id, order_date, product_id)
VALUES
(2001, 456, '2021-04-09', 106),
(2002, 789, '2021-05-10', 105),
(2003, 456, '2021-06-15', 107),
(2004, 123, '2021-07-20', 105),
(2005, 789, '2021-08-25', 106);
• - Insert sample data into products table
INSERT INTO products (product_id, product_name, price_usd)
VALUES
(105, 'Inspiron Laptop', 500),
(106, 'Alienware Gaming Desktop', 2000),
(107, 'Dell Monitor', 200);
Learnings
• JOIN operation to combine data from multiple tables
• DATE functions for filtering records based on time (e.g., the last year)
• Aggregation with SUM() to calculate total spending
• Grouping by user to get a total for each customer
• Ordering and LIMIT for retrieving top results
Solutions
• - PostgreSQL solution
SELECT o.user_id, SUM(p.price_usd) AS total_spent
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY o.user_id
ORDER BY total_spent DESC
LIMIT 5;
• - MySQL solution
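SELECT o.user_id, SUM(p.price_usd) AS total_spent
FROM orders o
JOIN products p ON o.product_id = p.product_id
WHERE o.order_date >= CURDATE() - INTERVAL 1 YEAR
GROUP BY o.user_id
ORDER BY total_spent DESC
LIMIT 5;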
Learnings
• EXTRACT() function to extract specific date parts (e.g., month)
• AVG() function to calculate the average of numerical values
• GROUP BY to aggregate data by specific fields (month and product_id)
• ORDER BY for sorting results
Solutions
• - PostgreSQL solution
SELECT
EXTRACT(MONTH FROM submit_date) AS month,
product_id,
AVG(stars) AS avg_stars
FROM
reviews
GROUP BY
month, product_id
ORDER BY
product_id, month;
• - MySQL solution
SELECT
MONTH(submit_date) AS month,
product_id,
AVG(stars) AS avg_stars
FROM
reviews
GROUP BY
month, product_id
ORDER BY
product_id, month;
• Q.703
Product Sales by Category
Question
Dell wants to analyze the sales performance of its products by category. Write a SQL query
to calculate the total sales for each product category in the past quarter, including the total
revenue generated.
Explanation
You need to join the orders, products, and categories tables to calculate the total sales
for each category over the past quarter. The query should filter orders made in the last
quarter, group the results by category, and sum up the total revenue based on product prices.
Datasets and SQL Schemas
• - Orders table creation
CREATE TABLE orders (
order_id INT,
user_id INT,
order_date DATE,
product_id INT,
quantity INT
);
• - Products table creation
CREATE TABLE products (
product_id INT,
product_name VARCHAR(255),
price_usd DECIMAL(10, 2),
category_id INT
);
• - Categories table creation
CREATE TABLE categories (
category_id INT,
category_name VARCHAR(255)
);
• - Insert sample data into orders table
INSERT INTO orders (order_id, user_id, order_date, product_id, quantity)
VALUES
(1001, 123, '2023-09-12', 20001, 2),
(1002, 456, '2023-09-15', 20002, 1),
(1003, 789, '2023-09-25', 20001, 3);
• - Insert sample data into products table
INSERT INTO products (product_id, product_name, price_usd, category_id)
VALUES
(20001, 'Alienware Laptop', 1500, 1),
(20002, 'XPS Desktop', 1200, 2);
• - Insert sample data into categories table
INSERT INTO categories (category_id, category_name)
VALUES
(1, 'Laptops'),
(2, 'Desktops');
Learnings
• JOIN between multiple tables (orders, products, categories)
• DATE functions for filtering the past quarter's data
• Aggregation with SUM() and GROUP BY
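The join, filter, and aggregation described in the explanation can be sketched on the sample data as follows. This is a hedged illustration, not the book's solution: it assumes SQLite through Python's sqlite3 module, treats "past quarter" as the three months before a fixed REFERENCE_DATE (a hypothetical stand-in for CURRENT_DATE), and computes revenue as quantity times price_usd.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INT, user_id INT, order_date DATE,
                         product_id INT, quantity INT);
    CREATE TABLE products (product_id INT, product_name VARCHAR(255),
                           price_usd DECIMAL(10, 2), category_id INT);
    CREATE TABLE categories (category_id INT, category_name VARCHAR(255));
    INSERT INTO orders VALUES
        (1001, 123, '2023-09-12', 20001, 2),
        (1002, 456, '2023-09-15', 20002, 1),
        (1003, 789, '2023-09-25', 20001, 3);
    INSERT INTO products VALUES
        (20001, 'Alienware Laptop', 1500, 1),
        (20002, 'XPS Desktop', 1200, 2);
    INSERT INTO categories VALUES (1, 'Laptops'), (2, 'Desktops');
    """
)

REFERENCE_DATE = "2023-10-01"  # hypothetical stand-in for CURRENT_DATE

category_sales = conn.execute(
    """
    SELECT c.category_name,
           SUM(o.quantity) AS units_sold,
           SUM(o.quantity * p.price_usd) AS total_revenue
    FROM orders o
    JOIN products p ON o.product_id = p.product_id
    JOIN categories c ON p.category_id = c.category_id
    WHERE o.order_date >= date(:ref, '-3 months')  -- start of the window
      AND o.order_date < :ref                      -- end of the window
    GROUP BY c.category_name
    ORDER BY total_revenue DESC
    """,
    {"ref": REFERENCE_DATE},
).fetchall()

for name, units, revenue in category_sales:
    print(name, units, revenue)
```

In PostgreSQL the window start would typically be written as CURRENT_DATE - INTERVAL '3 months', and in MySQL as CURDATE() - INTERVAL 3 MONTH.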
Learnings
• COUNT() function for counting orders
• GROUP BY for summarizing results by customer
• DATE filtering for the last 6 months
Solutions
• - PostgreSQL solution
SELECT user_id, COUNT(order_id) AS order_count
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY user_id
ORDER BY order_count DESC;
• - MySQL solution
SELECT user_id, COUNT(order_id) AS order_count
FROM orders
WHERE order_date >= CURDATE() - INTERVAL 6 MONTH
GROUP BY user_id
ORDER BY order_count DESC;
• Q.705
Product Popularity by Region
Question
Dell wants to track the popularity of its products by region. Write a SQL query to calculate
the number of units sold for each product by region in the past month.
Explanation
You need to join the orders, products, and regions tables to calculate the number of units
sold per product by region for the past month. The query should filter orders by the last
month and group by product and region.
Datasets and SQL Schemas
• - Orders table creation
CREATE TABLE orders (
order_id INT,
user_id INT,
order_date DATE,
product_id INT,
quantity INT,
region_id INT
);
• - Products table creation
CREATE TABLE products (
product_id INT,
product_name VARCHAR(255)
);
• - Regions table creation
CREATE TABLE regions (
region_id INT,
region_name VARCHAR(255)
);
• - Insert sample data into orders table
INSERT INTO orders (order_id, user_id, order_date, product_id, quantity, region_id)
VALUES
(3001, 123, '2023-12-05', 1001, 2, 1),
(3002, 456, '2023-12-10', 1002, 3, 2),
(3003, 789, '2023-12-15', 1001, 1, 1),
(3004, 123, '2023-12-20', 1003, 4, 3);
• - Insert sample data into products table
INSERT INTO products (product_id, product_name)
VALUES
(1001, 'Alienware Laptop'),
(1002, 'XPS Desktop'),
(1003, 'Inspiron Laptop');
• - Insert sample data into regions table
INSERT INTO regions (region_id, region_name)
VALUES
(1, 'North America'),
(2, 'Europe'),
(3, 'Asia');
Learnings
• JOIN across multiple tables (orders, products, regions)
• Aggregation for counting units sold
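One way to sketch the approach from the explanation on the sample data is shown below. It is an illustration under stated assumptions rather than the book's solution: SQLite through Python's sqlite3 module is used so the code runs without a server, and "past month" is taken as the month before a fixed REFERENCE_DATE, a hypothetical stand-in for CURRENT_DATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INT, user_id INT, order_date DATE,
                         product_id INT, quantity INT, region_id INT);
    CREATE TABLE products (product_id INT, product_name VARCHAR(255));
    CREATE TABLE regions (region_id INT, region_name VARCHAR(255));
    INSERT INTO orders VALUES
        (3001, 123, '2023-12-05', 1001, 2, 1),
        (3002, 456, '2023-12-10', 1002, 3, 2),
        (3003, 789, '2023-12-15', 1001, 1, 1),
        (3004, 123, '2023-12-20', 1003, 4, 3);
    INSERT INTO products VALUES
        (1001, 'Alienware Laptop'), (1002, 'XPS Desktop'), (1003, 'Inspiron Laptop');
    INSERT INTO regions VALUES (1, 'North America'), (2, 'Europe'), (3, 'Asia');
    """
)

REFERENCE_DATE = "2024-01-01"  # hypothetical stand-in for CURRENT_DATE

units_by_region = conn.execute(
    """
    SELECT p.product_name, r.region_name, SUM(o.quantity) AS units_sold
    FROM orders o
    JOIN products p ON o.product_id = p.product_id
    JOIN regions r ON o.region_id = r.region_id
    WHERE o.order_date >= date(:ref, '-1 month')
      AND o.order_date < :ref
    GROUP BY p.product_name, r.region_name
    ORDER BY p.product_name, r.region_name
    """,
    {"ref": REFERENCE_DATE},
).fetchall()

for row in units_by_region:
    print(row)
```

SUM(o.quantity) is used rather than COUNT(*) because one order row can cover several units.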
• Q.706
Count of Complaints by Product
Question
Write a SQL query to count the number of complaints for each product.
Explanation
This task involves calculating the total number of complaints for each product by grouping
the data by product_id and counting the occurrences of complaints.
Datasets and SQL Schemas
• - Complaints table creation
CREATE TABLE complaints (
complaint_id INT,
user_id INT,
product_id INT,
complaint_date DATE,
complaint_text VARCHAR(255)
);
• - Insert sample data into complaints table
INSERT INTO complaints (complaint_id, user_id, product_id, complaint_date, complaint_text)
VALUES
(101, 123, 50001, '2023-08-10', 'Broken screen'),
(102, 456, 50002, '2023-09-05', 'Not working as expected'),
(103, 789, 50001, '2023-09-12', 'Battery draining too fast'),
(104, 101, 50001, '2023-09-15', 'Overheating issue');
Learnings
• COUNT() for counting occurrences
• GROUP BY to aggregate data by product
• Basic filtering without advanced logic
Solutions
• - PostgreSQL solution
SELECT product_id, COUNT(*) AS complaint_count
FROM complaints
GROUP BY product_id
• Q.707
Average Rating of Customer Feedback
Question
Write a SQL query to calculate the average rating given by customers for their feedback.
Explanation
You need to calculate the average of the rating column across the feedback entries. The query
uses the AVG() function and groups by product_id to report the average rating per product.
Datasets and SQL Schemas
• - Feedback table creation
CREATE TABLE feedback (
feedback_id INT,
user_id INT,
product_id INT,
feedback_date DATE,
rating INT
);
• - Insert sample data into feedback table
INSERT INTO feedback (feedback_id, user_id, product_id, feedback_date, rating)
VALUES
(201, 123, 50001, '2023-09-10', 4),
(202, 456, 50002, '2023-09-15', 5),
(203, 789, 50001, '2023-09-20', 3),
(204, 101, 50001, '2023-09-25', 2);
Learnings
• AVG() to calculate the average value of a column
• GROUP BY to compute averages by product or customer
• Filtering based on feedback dates or other criteria
Solutions
• - PostgreSQL solution
SELECT product_id, AVG(rating) AS avg_rating
FROM feedback
GROUP BY product_id
ORDER BY avg_rating DESC;
• - MySQL solution
SELECT product_id, AVG(rating) AS avg_rating
FROM feedback
GROUP BY product_id
ORDER BY avg_rating DESC;
• Q.708
Number of Service Requests in the Last 30 Days
Question
Write a SQL query to calculate the number of service requests made in the last 30 days.
Explanation
You need to filter the service_requests table to count how many requests were made in the
last 30 days, using the current date.
Datasets and SQL Schemas
• - Service Requests table creation
CREATE TABLE service_requests (
request_id INT,
user_id INT,
service_type VARCHAR(100),
request_date DATE,
request_status VARCHAR(50)
);
• - Insert sample data into service_requests table
INSERT INTO service_requests (request_id, user_id, service_type, request_date, request_status)
VALUES
(301, 123, 'Warranty', '2023-09-10', 'Completed'),
(302, 456, 'Technical Support', '2023-09-05', 'Pending'),
(303, 789, 'Return', '2023-08-28', 'Completed'),
(304, 101, 'Warranty', '2023-09-20', 'Pending');
Learnings
• COUNT() to count rows
• DATE filtering using CURRENT_DATE
• Basic date arithmetic for filtering recent entries
Solutions
• - PostgreSQL solution
SELECT COUNT(*) AS service_request_count
FROM service_requests
WHERE request_date >= CURRENT_DATE - INTERVAL '30 days';
• - MySQL solution
SELECT COUNT(*) AS service_request_count
FROM service_requests
WHERE request_date >= CURDATE() - INTERVAL 30 DAY;
• Q.709
Identify Dell Products with Replacement Requests Within 6 Months of Purchase
Question
Write a SQL query to identify the Dell products where customers requested a replacement
within 6 months of their purchase.
Explanation
You need to find the products for which replacement requests were made within 6 months of
the purchase date. This requires joining the orders table with the service_requests table,
filtering the data based on the request_type (assuming "Replacement" is a type) and
ensuring the request was made within 6 months of the order date.
Datasets and SQL Schemas
• - Orders table creation
CREATE TABLE orders (
order_id INT,
user_id INT,
order_date DATE,
product_id INT,
quantity INT
);
• - Service Requests table creation
Learnings
• JOIN operation to combine data from multiple tables
• DATE filtering to compare dates (e.g., 6 months from purchase date)
• Conditional filtering based on the request type (Replacement)
Solutions
• - PostgreSQL solution
SELECT DISTINCT o.product_id, p.product_name
FROM orders o
JOIN service_requests sr ON o.product_id = sr.product_id
JOIN products p ON o.product_id = p.product_id
WHERE sr.request_type = 'Replacement'
AND sr.request_date >= o.order_date -- request made on or after the purchase
AND sr.request_date <= o.order_date + INTERVAL '6 months'
ORDER BY o.product_id;
• - MySQL solution
SELECT DISTINCT o.product_id, p.product_name
FROM orders o
JOIN service_requests sr ON o.product_id = sr.product_id
JOIN products p ON o.product_id = p.product_id
WHERE sr.request_type = 'Replacement'
AND sr.request_date >= o.order_date -- request made on or after the purchase
AND sr.request_date <= DATE_ADD(o.order_date, INTERVAL 6 MONTH)
ORDER BY o.product_id;
Write a SQL query to find out the total sales (in terms of the number of units) for each
product in each region.
Explanation
This task requires joining the Sales, Products, and Regions tables. You need to group the
data by product name and region name, and then calculate the total number of units sold for
each product in each region. The SUM() function will be used to get the total number of units
sold.
Datasets and SQL Schemas
• - Products table creation
CREATE TABLE Products (
product_id INT,
name VARCHAR(255),
price DECIMAL(10, 2),
release_date DATE
);
• - Insert sample data into Products table
INSERT INTO Products (product_id, name, price, release_date)
VALUES
(1, 'Laptop A', 1200, '2020-01-01'),
(2, 'Laptop B', 1500, '2020-07-01'),
(3, 'Desktop A', 1700, '2021-02-01');
• - Regions table creation
CREATE TABLE Regions (
region_id INT,
name VARCHAR(255)
);
• - Insert sample data into Regions table
INSERT INTO Regions (region_id, name)
VALUES
(1, 'North America'),
(2, 'Europe'),
(3, 'Asia-Pacific');
• - Sales table creation
CREATE TABLE Sales (
sales_id INT,
product_id INT,
region_id INT,
sale_date DATE,
units_sold INT
);
• - Insert sample data into Sales table
INSERT INTO Sales (sales_id, product_id, region_id, sale_date, units_sold)
VALUES
(1, 1, 1, '2021-02-01', 3),
(2, 2, 2, '2021-03-01', 5),
(3, 3, 3, '2021-04-01', 2),
(4, 1, 2, '2021-05-01', 4),
(5, 2, 3, '2021-06-01', 1);
Learnings
• JOIN between multiple tables (Sales, Products, and Regions)
• GROUP BY to aggregate data by product and region
• SUM() to calculate the total units sold
• ORDER BY for sorting results by product and region
Solutions
• - PostgreSQL solution
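SELECT P.name AS "Product", R.name AS "Region", SUM(S.units_sold) AS "Total Units Sold"
FROM Sales S
JOIN Products P ON S.product_id = P.product_id
JOIN Regions R ON S.region_id = R.region_id
GROUP BY P.name, R.name
ORDER BY P.name, R.name;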
Learnings
• DATE filtering to get recent records (last year and last 6 months)
• JOINs between purchase and tech_support based on customer_id
• GROUP BY and HAVING to exclude customers with major support issues
• Filtering based on ratings and issue levels
Solutions
• - PostgreSQL solution
SELECT p.customer_id
FROM purchase p
LEFT JOIN tech_support ts ON p.customer_id = ts.customer_id
AND ts.support_date >= CURRENT_DATE - INTERVAL '6 months'
AND ts.issue_level >= 4 -- filtering for major issues
WHERE p.purchase_date >= CURRENT_DATE - INTERVAL '1 year'
AND p.rating >= 4 -- filtering for good ratings
GROUP BY p.customer_id
HAVING COUNT(ts.support_id) = 0; -- ensuring no major tech support issues
• - MySQL solution
SELECT p.customer_id
FROM purchase p
LEFT JOIN tech_support ts ON p.customer_id = ts.customer_id
AND ts.support_date >= CURDATE() - INTERVAL 6 MONTH
AND ts.issue_level >= 4 -- filtering for major issues
WHERE p.purchase_date >= CURDATE() - INTERVAL 1 YEAR
AND p.rating >= 4 -- filtering for good ratings
GROUP BY p.customer_id
HAVING COUNT(ts.support_id) = 0; -- ensuring no major tech support issues
• Use the AVG() function to compute the average units sold per month for each model.
• Ensure the month is extracted correctly from the sale date using the date_trunc()
function (in PostgreSQL) or other date extraction methods in MySQL.
Datasets and SQL Schemas
• - Sales table creation
CREATE TABLE sales (
id INT,
sale_date DATE,
laptop_model VARCHAR(255),
units_sold INT
);
• - Insert sample data into sales table
INSERT INTO sales (id, sale_date, laptop_model, units_sold)
VALUES
(1, '2022-05-01', 'Dell Inspiron', 100),
(2, '2022-05-01', 'Dell XPS', 150),
(3, '2022-05-02', 'Dell Inspiron', 120),
(4, '2022-05-02', 'Dell XPS', 140),
(5, '2022-05-03', 'Dell Inspiron', 110),
(6, '2022-05-03', 'Dell XPS', 160);
Learnings
• DATE extraction to group by month
• AVG() to calculate the average sales per group
• GROUP BY to group data by month and laptop model
• ORDER BY to sort the results by month and laptop model
Solutions
• - PostgreSQL solution
SELECT date_trunc('month', sale_date)::date AS Month,
laptop_model,
AVG(units_sold) AS average_units_sold
FROM sales
GROUP BY date_trunc('month', sale_date)::date, laptop_model
ORDER BY Month, laptop_model;
• - MySQL solution
SELECT DATE_FORMAT(sale_date, '%Y-%m-01') AS Month,
laptop_model,
AVG(units_sold) AS average_units_sold
FROM sales
GROUP BY DATE_FORMAT(sale_date, '%Y-%m-01'), laptop_model
ORDER BY Month, laptop_model;
• Q.713
Find the Average Customer Rating for Each Product
Question
Write a SQL query to find the average customer rating for each product based on customer
reviews.
Explanation
You need to calculate the average rating for each product by grouping the data by
product_id and then using the AVG() function on the rating column.
Datasets and SQL Schemas
• - Reviews table creation
CREATE TABLE reviews (
review_id INT,
product_id INT,
customer_id INT,
rating INT,
review_date DATE
);
• - Insert sample data into reviews table
INSERT INTO reviews (review_id, product_id, customer_id, rating, review_date)
VALUES
(1, 101, 1001, 4, '2023-01-15'),
(2, 101, 1002, 5, '2023-02-10'),
(3, 102, 1003, 3, '2023-01-20'),
(4, 103, 1004, 4, '2023-03-01'),
(5, 102, 1005, 2, '2023-02-25');
Learnings
• AVG() for calculating the average rating
• GROUP BY to aggregate by product
Solutions
• - PostgreSQL and MySQL solution
SELECT product_id, AVG(rating) AS avg_rating
FROM reviews
GROUP BY product_id
ORDER BY product_id;
• Q.714
Find Products with a Consistent Rating Over the Past Year
Question
Write a SQL query to identify products that have received a consistent customer rating over
the past year. The consistency is defined as products where the standard deviation of ratings
over the last year is less than or equal to 0.5.
Explanation
You need to:
• Calculate the standard deviation of ratings for each product within the last year.
• Filter products where the standard deviation is less than or equal to 0.5, indicating that
ratings for the product are relatively consistent.
Datasets and SQL Schemas
• - Reviews table creation
Learnings
• STDDEV() for calculating standard deviation
• Filtering based on conditions using HAVING
• DATE filtering to get records from the last year
Solutions
• - PostgreSQL and MySQL solution
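-- Note: STDDEV() is the sample standard deviation in PostgreSQL and the population
-- standard deviation in MySQL; use STDDEV_SAMP() or STDDEV_POP() for identical results
SELECT product_id, STDDEV(rating) AS rating_stddev
FROM reviews
WHERE review_date >= CURRENT_DATE - INTERVAL '1' YEAR -- form accepted by both PostgreSQL and MySQL
GROUP BY product_id
HAVING STDDEV(rating) <= 0.5
ORDER BY product_id;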
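The 0.5 threshold is sensitive to which standard deviation is used: the population form divides by n and the sample form by n - 1. The sketch below illustrates this on hypothetical ratings (the table contents are invented for the example). It assumes SQLite through Python's sqlite3 module; because SQLite has no built-in STDDEV(), the population variance is computed as AVG(x*x) - AVG(x)^2 and the square root is taken in Python.

```python
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (product_id INT, rating INT)")
conn.executemany(
    "INSERT INTO reviews (product_id, rating) VALUES (?, ?)",
    # Hypothetical ratings: product 101 is steady, product 102 is spread out.
    [(101, 4), (101, 4), (101, 5), (102, 1), (102, 5), (102, 3)],
)

# Population variance per product: E[x^2] - (E[x])^2.
rows = conn.execute(
    """
    SELECT product_id,
           AVG(rating * rating) - AVG(rating) * AVG(rating) AS var_pop
    FROM reviews
    GROUP BY product_id
    ORDER BY product_id
    """
).fetchall()

# max(..., 0.0) guards against tiny negative values from rounding.
population_std = {pid: max(var, 0.0) ** 0.5 for pid, var in rows}
consistent_products = [pid for pid, std in population_std.items() if std <= 0.5]

# Sample standard deviation of the same ratings for product 101.
sample_std_101 = statistics.stdev([4, 4, 5])

print(population_std, consistent_products, sample_std_101)
```

For product 101 the population value falls below 0.5 while the sample value is above it, so the same data can pass or fail the filter depending on the function chosen.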
• Q.715
Count the Number of Products with Each Rating
Question
Write a SQL query to count how many products have received each rating (1 to 5) from
customers.
Explanation
You need to count the occurrences of each rating for products in the reviews table. Use the
GROUP BY clause to group by rating and then use the COUNT() function to count how many
times each rating has been given.
Datasets and SQL Schemas
• - Reviews table creation
CREATE TABLE reviews (
review_id INT,
product_id INT,
customer_id INT,
rating INT,
review_date DATE
);
• - Insert sample data into reviews table
INSERT INTO reviews (review_id, product_id, customer_id, rating, review_date)
VALUES
(1, 101, 1001, 4, '2023-01-15'),
(2, 101, 1002, 5, '2023-02-10'),
(3, 102, 1003, 3, '2023-01-20'),
(4, 103, 1004, 5, '2023-03-01'),
(5, 102, 1005, 2, '2023-02-25'),
(6, 103, 1006, 4, '2023-03-15');
Learnings
• COUNT() to count occurrences
• GROUP BY to group by product rating
Solutions
• - PostgreSQL and MySQL solution
SELECT rating, COUNT(*) AS product_count
FROM reviews
GROUP BY rating
ORDER BY rating;
• Q.716
Identify Products with the Lowest Average Rating
Question
Write a SQL query to find the products with the lowest average customer rating (rating less
than 3).
Explanation
You need to calculate the average rating for each product, and then filter products that have
an average rating less than 3.
Datasets and SQL Schemas
• - Reviews table creation
CREATE TABLE reviews (
review_id INT,
product_id INT,
customer_id INT,
rating INT,
review_date DATE
);
• - Insert sample data into reviews table
INSERT INTO reviews (review_id, product_id, customer_id, rating, review_date)
VALUES
(1, 101, 1001, 4, '2023-01-15'),
(2, 101, 1002, 5, '2023-02-10'),
(3, 102, 1003, 2, '2023-01-20'),
(4, 103, 1004, 3, '2023-03-01'),
(5, 102, 1005, 1, '2023-02-25');
Learnings
• AVG() for calculating average ratings
• HAVING clause to filter products with an average rating less than 3
Solutions
• - PostgreSQL and MySQL solution
SELECT product_id, AVG(rating) AS avg_rating
FROM reviews
GROUP BY product_id
HAVING AVG(rating) < 3
ORDER BY avg_rating;
• Q.717
Find the Product with the Highest Variance in Customer Ratings Over Time
Question
Write a SQL query to find the product that has the highest variance in customer ratings over
time. Variance measures how spread out the ratings are, and a high variance means that
customer ratings for the product are highly inconsistent.
Explanation
You need to:
• Calculate the variance of ratings for each product.
• Use the VAR_POP() or VAR_SAMP() function (depending on the SQL dialect) to calculate
the variance of ratings for each product.
• Identify the product with the highest variance.
Datasets and SQL Schemas
• - Reviews table creation
CREATE TABLE reviews (
review_id INT,
product_id INT,
customer_id INT,
rating INT,
review_date DATE
);
• - Insert sample data into reviews table
INSERT INTO reviews (review_id, product_id, customer_id, rating, review_date)
VALUES
(1, 101, 1001, 4, '2023-01-15'),
(2, 101, 1002, 5, '2023-02-10'),
(3, 101, 1003, 3, '2023-03-01'),
(4, 102, 1004, 4, '2023-01-10'),
(5, 102, 1005, 2, '2023-02-15'),
(6, 103, 1006, 5, '2023-01-22'),
(7, 103, 1007, 1, '2023-03-12');
Learnings
• VAR_POP() or VAR_SAMP() for calculating variance
• GROUP BY to group by product
• ORDER BY to sort by the highest variance
Solutions
• - PostgreSQL and MySQL solution
SELECT product_id, VAR_POP(rating) AS rating_variance
FROM reviews
GROUP BY product_id
ORDER BY rating_variance DESC
LIMIT 1;
• Q.718
Find the Correlation Between Product Price and Customer Rating
Question
Write a SQL query to find the correlation between the price of the products and the customer
ratings. Assume there is a products table that contains the product_id and price, and a
reviews table that contains the product_id and rating. Calculate the Pearson correlation
coefficient between the product price and the customer rating.
Explanation
You need to:
• Join the reviews and products tables on product_id.
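• Calculate the Pearson correlation coefficient between price and rating. The formula for
the Pearson correlation coefficient is:
r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left(n \sum x^2 - (\sum x)^2\right)\left(n \sum y^2 - (\sum y)^2\right)}}
Where:
• x is the product price and y is the customer rating for each review
• n is the number of (price, rating) pairs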
Learnings
• JOIN between reviews and products tables
• Pearson Correlation formula
• Aggregating and Calculating sums using SQL functions
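The sum-based form of the Pearson coefficient, r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2)), maps directly onto SQL aggregates. The sketch below is an illustration on hypothetical data, not the book's solution. It assumes SQLite through Python's sqlite3 module, collects the six sums in one query, and does the final arithmetic in Python; the table contents and variable names are invented for the example.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (product_id INT, price REAL);
    CREATE TABLE reviews (review_id INT, product_id INT, rating INT);
    -- Hypothetical data for illustration only.
    INSERT INTO products VALUES (1, 100.0), (2, 200.0), (3, 300.0);
    INSERT INTO reviews VALUES (1, 1, 2), (2, 1, 4), (3, 2, 3), (4, 3, 5), (5, 3, 4);
    """
)

# One pass over the joined rows collects every sum the formula needs.
n, sx, sy, sxy, sxx, syy = conn.execute(
    """
    SELECT COUNT(*),
           SUM(p.price),
           SUM(r.rating),
           SUM(p.price * r.rating),
           SUM(p.price * p.price),
           SUM(r.rating * r.rating)
    FROM reviews r
    JOIN products p ON r.product_id = p.product_id
    """
).fetchone()

pearson_r = (n * sxy - sx * sy) / math.sqrt(
    (n * sxx - sx ** 2) * (n * syy - sy ** 2)
)
print(round(pearson_r, 4))
```

PostgreSQL also provides a built-in corr(Y, X) aggregate that returns the same coefficient; the manual form is useful in engines that lack it.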
Solutions
• - PostgreSQL solution
WITH correlation_data AS (
SELECT p.price, r.rating
FROM reviews r
Learnings
• Filtering based on rating using WHERE
• DATE functions to filter data from the last 6 months
• COUNT() for counting occurrences
• GROUP BY to group by product_id
Solutions
• - PostgreSQL and MySQL solution
SELECT product_id, COUNT(*) AS negative_feedback_count
FROM reviews
WHERE rating IN (1, 2)
AND review_date >= CURRENT_DATE - INTERVAL '6' MONTH -- form accepted by both PostgreSQL and MySQL
GROUP BY product_id
ORDER BY negative_feedback_count DESC;
• Q.720
Track the Trend of Customer Ratings Over Time for a Specific Product
Question
Write a SQL query to track the trend of customer ratings (average rating per month) for a
specific product over the last year.
Explanation
You need to:
• Group the reviews by month for a specific product_id.
• Calculate the average rating per month.
• Filter reviews from the last year.
• Display the results in chronological order.
Datasets and SQL Schemas
• - Reviews table creation
CREATE TABLE reviews (
review_id INT,
product_id INT,
customer_id INT,
rating INT,
review_date DATE
);
• - Insert sample data into reviews table
INSERT INTO reviews (review_id, product_id, customer_id, rating, review_date)
VALUES
(1, 101, 1001, 5, '2023-01-10'),
(2, 101, 1002, 4, '2023-02-15'),
(3, 101, 1003, 3, '2023-03-01'),
(4, 101, 1004, 4, '2023-04-20'),
(5, 101, 1005, 5, '2023-05-10'),
(6, 101, 1006, 4, '2023-06-15');
Learnings
• EXTRACT(MONTH FROM date) to extract the month
• AVG() to calculate the average rating per month
• GROUP BY for monthly grouping
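The four steps in the explanation can be sketched on the sample data as follows. This is a hedged illustration rather than the book's solution: it assumes SQLite through Python's sqlite3 module, uses a fixed REFERENCE_DATE as a hypothetical stand-in for CURRENT_DATE, and groups on a year-month key so that the same month in two different years is not merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE reviews (review_id INT, product_id INT, customer_id INT,
                          rating INT, review_date DATE);
    INSERT INTO reviews VALUES
        (1, 101, 1001, 5, '2023-01-10'),
        (2, 101, 1002, 4, '2023-02-15'),
        (3, 101, 1003, 3, '2023-03-01'),
        (4, 101, 1004, 4, '2023-04-20'),
        (5, 101, 1005, 5, '2023-05-10'),
        (6, 101, 1006, 4, '2023-06-15');
    """
)

REFERENCE_DATE = "2023-07-01"  # hypothetical stand-in for CURRENT_DATE
PRODUCT_ID = 101               # the specific product being tracked

monthly_trend = conn.execute(
    """
    SELECT strftime('%Y-%m', review_date) AS review_month,
           AVG(rating) AS avg_rating
    FROM reviews
    WHERE product_id = :pid
      AND review_date >= date(:ref, '-1 year')
    GROUP BY review_month
    ORDER BY review_month          -- chronological order
    """,
    {"pid": PRODUCT_ID, "ref": REFERENCE_DATE},
).fetchall()

for month, avg_rating in monthly_trend:
    print(month, avg_rating)
```

In PostgreSQL the grouping key could be date_trunc('month', review_date); in MySQL, DATE_FORMAT(review_date, '%Y-%m').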
American Express
• Q.721
Calculate Average Transaction Value per Customer
Explanation
You are asked to calculate the average transaction value for each customer in the American
Express database. The transaction data is stored in a transactions table. Calculate the
average transaction amount for each customer and order the results by the highest average
transaction value.
Datasets:
-- Insert data into transactions
INSERT INTO transactions (transaction_id, customer_id, transaction_date, amount)
VALUES
(1, 101, '2022-10-01 00:00:00', 150.00),
(2, 102, '2022-10-01 00:00:00', 200.00),
(3, 101, '2022-10-02 00:00:00', 50.00),
(4, 103, '2022-10-02 00:00:00', 300.00),
(5, 102, '2022-10-03 00:00:00', 75.00);
Learnings
• Aggregations: Using AVG() to calculate average values.
• GROUP BY: Grouping by customer ID to calculate per-customer statistics.
• Ordering: Sorting results by average transaction value.
Solutions
PostgreSQL Solution:
SELECT
customer_id,
AVG(amount) AS average_transaction_value
FROM
transactions
GROUP BY
customer_id
ORDER BY
average_transaction_value DESC;
MySQL Solution:
SELECT
customer_id,
AVG(amount) AS average_transaction_value
FROM
transactions
GROUP BY
customer_id
ORDER BY
average_transaction_value DESC;
• Q.722
Find Top 5 Customers by Total Spending
Explanation
You are tasked with identifying the top 5 customers with the highest total spending. The
spending is recorded in the transactions table. Calculate the total amount spent by each
customer and display the top 5 customers ordered by total spending.
Datasets:
-- Insert data into transactions
INSERT INTO transactions (transaction_id, customer_id, transaction_date, amount)
VALUES
(1, 101, '2022-10-01 00:00:00', 250.00),
(2, 102, '2022-10-01 00:00:00', 400.00),
(3, 101, '2022-10-02 00:00:00', 150.00),
(4, 103, '2022-10-02 00:00:00', 500.00),
(5, 102, '2022-10-03 00:00:00', 200.00);
Learnings
• SUM(): Aggregating the total amount spent by each customer.
• GROUP BY: Grouping by customer to calculate total spending.
• LIMIT: Using LIMIT to restrict the result to top N records.
Solutions
PostgreSQL Solution:
SELECT
customer_id,
SUM(amount) AS total_spending
FROM
transactions
GROUP BY
customer_id
ORDER BY
total_spending DESC
LIMIT 5;
MySQL Solution:
SELECT
customer_id,
SUM(amount) AS total_spending
FROM
transactions
GROUP BY
customer_id
ORDER BY
total_spending DESC
LIMIT 5;
• Q.723
Identify Customers Who Made More Than 3 Transactions in a Month
Explanation
You need to identify customers who have made more than three transactions in any given
month. The data is stored in the transactions table. Count the number of transactions per
customer per month and return customers who have made more than 3 transactions in a
month.
Datasets:
-- Insert data into transactions
INSERT INTO transactions (transaction_id, customer_id, transaction_date, amount)
VALUES
(1, 101, '2022-10-01 00:00:00', 150.00),
(2, 101, '2022-10-02 00:00:00', 200.00),
(3, 101, '2022-10-03 00:00:00', 100.00),
(4, 102, '2022-10-01 00:00:00', 50.00),
(5, 102, '2022-10-01 00:00:00', 75.00),
(6, 102, '2022-10-03 00:00:00', 100.00);
Learnings
• COUNT(): Counting transactions for each customer per month.
• EXTRACT(): Extracting the month from the transaction date.
• HAVING: Filtering groups based on aggregate conditions (transactions > 3).
Solutions
PostgreSQL Solution:
SELECT
customer_id,
EXTRACT(YEAR FROM transaction_date) AS year,
EXTRACT(MONTH FROM transaction_date) AS month,
COUNT(*) AS transaction_count
FROM
transactions
GROUP BY
customer_id, EXTRACT(YEAR FROM transaction_date), EXTRACT(MONTH FROM transaction_date)
HAVING
COUNT(*) > 3;
MySQL Solution:
SELECT
customer_id,
YEAR(transaction_date) AS year,
MONTH(transaction_date) AS month,
COUNT(*) AS transaction_count
FROM
transactions
GROUP BY
customer_id, YEAR(transaction_date), MONTH(transaction_date)
HAVING
COUNT(*) > 3;
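Grouping by the month number alone merges the same calendar month from different years, which can push a customer over the threshold by accident. The sketch below shows the difference on hypothetical rows (the data is invented for the example, since the sample above contains no customer with more than three transactions in a month). It assumes SQLite through Python's sqlite3 module, with strftime() standing in for EXTRACT() or MONTH().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (transaction_id INT, customer_id INT, "
    "transaction_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        # Customer 101: four transactions in October 2022.
        (1, 101, "2022-10-01 09:00:00", 150.0),
        (2, 101, "2022-10-05 09:00:00", 200.0),
        (3, 101, "2022-10-12 09:00:00", 100.0),
        (4, 101, "2022-10-20 09:00:00", 80.0),
        # Customer 102: two in October 2022 and two in October 2023.
        (5, 102, "2022-10-02 09:00:00", 50.0),
        (6, 102, "2022-10-09 09:00:00", 75.0),
        (7, 102, "2023-10-03 09:00:00", 100.0),
        (8, 102, "2023-10-17 09:00:00", 60.0),
    ],
)

QUERY = """
    SELECT customer_id, strftime('{fmt}', transaction_date) AS period, COUNT(*)
    FROM transactions
    GROUP BY customer_id, period
    HAVING COUNT(*) > 3
    ORDER BY customer_id
"""

# Year and month together: only genuine single-month bursts qualify.
by_year_month = conn.execute(QUERY.format(fmt="%Y-%m")).fetchall()
# Month number only: October 2022 and October 2023 are merged.
by_month_only = conn.execute(QUERY.format(fmt="%m")).fetchall()

print(by_year_month)
print(by_month_only)
```

Customer 102 appears only in the month-only result, which is the false positive that including the year avoids.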
• Q.724
Calculate the Monthly Average Spend Per Customer (Excluding Refunds)
Explanation
In the transactions table, some transactions are refunds. You are required to calculate the
monthly average spend per customer, but excluding refunds. A refund transaction is
indicated by a negative amount. For each customer, calculate the total spend for each month,
then find the average spend across all months for each customer.
Datasets:
-- Insert data into transactions
INSERT INTO transactions (transaction_id, customer_id, transaction_date, amount)
VALUES
(1, 101, '2022-10-01 00:00:00', 200.00),
(2, 101, '2022-10-02 00:00:00', -50.00), -- Refund
(3, 101, '2022-11-01 00:00:00', 100.00),
(4, 102, '2022-10-01 00:00:00', 150.00),
(5, 102, '2022-11-01 00:00:00', 300.00),
(6, 103, '2022-10-01 00:00:00', 500.00),
(7, 103, '2022-10-15 00:00:00', -100.00); -- Refund
Learnings
• Filtering: Excluding transactions with negative amounts.
• Date functions: Extracting month and year from transaction_date.
• Aggregation: Calculating total spend per month, then averaging it over time.
• Handling Refunds: Ensuring refunds are excluded from the total spend.
Solutions
PostgreSQL Solution:
SELECT
customer_id,
AVG(monthly_spend) AS average_monthly_spend
FROM (
SELECT
customer_id,
EXTRACT(MONTH FROM transaction_date) AS month,
EXTRACT(YEAR FROM transaction_date) AS year,
SUM(CASE WHEN amount > 0 THEN amount ELSE 0 END) AS monthly_spend
FROM
transactions
GROUP BY
customer_id, EXTRACT(MONTH FROM transaction_date), EXTRACT(YEAR FROM transaction_date)
) AS monthly_data
GROUP BY
customer_id;
MySQL Solution:
SELECT
customer_id,
AVG(monthly_spend) AS average_monthly_spend
FROM (
SELECT
customer_id,
MONTH(transaction_date) AS month,
YEAR(transaction_date) AS year,
SUM(CASE WHEN amount > 0 THEN amount ELSE 0 END) AS monthly_spend
FROM
transactions
GROUP BY
customer_id, MONTH(transaction_date), YEAR(transaction_date)
) AS monthly_data
GROUP BY
customer_id;
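To see the two aggregation levels at work, the query can be run on the sample data. The sketch below assumes SQLite through Python's sqlite3 module, so strftime('%Y-%m', ...) replaces the separate month and year expressions; the logic is otherwise the same as in the solutions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (transaction_id INT, customer_id INT, "
    "transaction_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        (1, 101, "2022-10-01 00:00:00", 200.00),
        (2, 101, "2022-10-02 00:00:00", -50.00),   # refund
        (3, 101, "2022-11-01 00:00:00", 100.00),
        (4, 102, "2022-10-01 00:00:00", 150.00),
        (5, 102, "2022-11-01 00:00:00", 300.00),
        (6, 103, "2022-10-01 00:00:00", 500.00),
        (7, 103, "2022-10-15 00:00:00", -100.00),  # refund
    ],
)

average_monthly_spend = conn.execute(
    """
    SELECT customer_id, AVG(monthly_spend) AS average_monthly_spend
    FROM (
        -- Inner level: one row per customer and month, refunds counted as 0.
        SELECT customer_id,
               strftime('%Y-%m', transaction_date) AS year_month,
               SUM(CASE WHEN amount > 0 THEN amount ELSE 0 END) AS monthly_spend
        FROM transactions
        GROUP BY customer_id, year_month
    ) AS monthly_data
    -- Outer level: average the monthly totals per customer.
    GROUP BY customer_id
    ORDER BY customer_id
    """
).fetchall()

for customer_id, avg_spend in average_monthly_spend:
    print(customer_id, avg_spend)
```

Customer 101 spends 200 in October (the -50 refund is ignored) and 100 in November, giving an average of 150. Note that a month containing only refunds still counts as a month with a spend of 0; filtering with WHERE amount > 0 in the inner query would drop such months instead.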
• Q.725
Find Customers Who Increased Their Spending by More Than 50% Between Consecutive
Months
Explanation
You need to identify customers who increased their total spending by more than 50%
between two consecutive months. Use the transactions table to calculate monthly spending
and compare the current month's total to the previous month's total for each customer.
Datasets:
-- Insert data into transactions
INSERT INTO transactions (transaction_id, customer_id, transaction_date, amount)
VALUES
(1, 101, '2022-10-01 00:00:00', 200.00),
(2, 101, '2022-10-15 00:00:00', 100.00),
(3, 101, '2022-11-01 00:00:00', 300.00),
Learnings
• Window Functions: Using LAG() to calculate the previous month's spending.
• Filtering: Identifying customers with more than a 50% increase in spending.
• Date Functions: Grouping by month and year.
Solutions
PostgreSQL Solution:
WITH monthly_spend AS (
SELECT
customer_id,
EXTRACT(MONTH FROM transaction_date) AS month,
EXTRACT(YEAR FROM transaction_date) AS year,
SUM(amount) AS total_spend
FROM
transactions
GROUP BY
customer_id, EXTRACT(MONTH FROM transaction_date), EXTRACT(YEAR FROM transaction_date)
)
SELECT
current.customer_id,
current.year,
current.month,
current.total_spend AS current_month_spend,
previous.total_spend AS previous_month_spend
FROM
monthly_spend current
JOIN
monthly_spend previous ON current.customer_id = previous.customer_id
AND current.year * 12 + current.month = previous.year * 12 + previous.month + 1 -- also matches December to January
WHERE
current.total_spend > previous.total_spend * 1.5;
MySQL Solution:
WITH monthly_spend AS (
SELECT
customer_id,
MONTH(transaction_date) AS month,
YEAR(transaction_date) AS year,
SUM(amount) AS total_spend
FROM
transactions
GROUP BY
customer_id, MONTH(transaction_date), YEAR(transaction_date)
)
SELECT
current.customer_id,
current.year,
current.month,
current.total_spend AS current_month_spend,
previous.total_spend AS previous_month_spend
FROM
monthly_spend current
JOIN
monthly_spend previous ON current.customer_id = previous.customer_id
-- Compare a continuous month count so that December -> January also matches
AND current.year * 12 + current.month = previous.year * 12 + previous.month + 1
WHERE
current.total_spend > previous.total_spend * 1.5;
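The LAG() approach named in the Learnings can be sketched as follows. This is a minimal illustration, not the book's solution: it runs on SQLite through Python's sqlite3 module as a stand-in for PostgreSQL/MySQL, so strftime() replaces EXTRACT()/MONTH(). The function name spending_jumps and the month_index column are assumptions made for the example. A join on year and month + 1 alone would miss a December-to-January pair, so the sketch compares a continuous month counter instead.

```python
import sqlite3


def spending_jumps(rows, factor=1.5):
    """Return (customer_id, month, current_spend, previous_spend) for every month
    whose total exceeds `factor` times the total of the month directly before it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "transaction_id INTEGER, customer_id INTEGER, "
        "transaction_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH monthly_spend AS (
            SELECT customer_id,
                   strftime('%Y-%m', transaction_date) AS month,
                   -- Continuous month counter, so Dec -> Jan is still a step of 1.
                   CAST(strftime('%Y', transaction_date) AS INTEGER) * 12
                     + CAST(strftime('%m', transaction_date) AS INTEGER) AS month_index,
                   SUM(amount) AS total_spend
            FROM transactions
            GROUP BY customer_id, month, month_index
        ),
        with_previous AS (
            SELECT customer_id, month, month_index, total_spend,
                   LAG(total_spend) OVER (PARTITION BY customer_id
                                          ORDER BY month_index) AS previous_spend,
                   LAG(month_index) OVER (PARTITION BY customer_id
                                          ORDER BY month_index) AS previous_index
            FROM monthly_spend
        )
        SELECT customer_id, month, total_spend, previous_spend
        FROM with_previous
        WHERE previous_index = month_index - 1  -- previous row must be the adjacent month
          AND total_spend > previous_spend * ?
        ORDER BY customer_id, month
    """
    result = conn.execute(query, (factor,)).fetchall()
    conn.close()
    return result
```

Because LAG() returns the previous row that exists, not the previous calendar month, the previous_index check is what keeps a gap (for example October followed by December) from being treated as consecutive.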
• Q.726
Identify Customers Who Have Made Purchases in 3 Different Years
Explanation
You need to identify customers who have made at least one purchase in 3 different years.
This involves counting the distinct years in which a customer made a transaction.
Datasets:
-- Insert data into transactions
INSERT INTO transactions (transaction_id, customer_id, transaction_date, amount)
VALUES
(1, 101, '2020-01-01 00:00:00', 100.00),
(2, 101, '2021-05-01 00:00:00', 200.00),
(3, 101, '2022-07-01 00:00:00', 300.00),
(4, 102, '2020-03-01 00:00:00', 150.00),
(5, 102, '2021-09-01 00:00:00', 180.00),
(6, 103, '2021-06-01 00:00:00', 500.00);
Learnings
• Distinct: Using DISTINCT to count distinct years.
• GROUP BY: Grouping by customer and year.
• HAVING: Filtering customers who have made purchases in three distinct years.
Solutions
PostgreSQL Solution:
SELECT
customer_id
FROM
(SELECT
customer_id,
EXTRACT(YEAR FROM transaction_date) AS year
FROM
transactions
GROUP BY
customer_id, EXTRACT(YEAR FROM transaction_date)) AS customer_years
GROUP BY
customer_id
HAVING
COUNT(DISTINCT year) = 3;
MySQL Solution:
SELECT
customer_id
FROM
(SELECT
customer_id,
YEAR(transaction_date) AS year
FROM
transactions
GROUP BY
customer_id, YEAR(transaction_date)) AS customer_years
GROUP BY
customer_id
HAVING
COUNT(DISTINCT year) = 3;
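As a runnable check of the idea, the sketch below counts distinct years per customer directly with COUNT(DISTINCT ...), without the inner subquery. It uses SQLite via Python's sqlite3 module as a stand-in for PostgreSQL/MySQL (strftime('%Y', ...) plays the role of EXTRACT(YEAR ...) or YEAR()); the function name and the n_years parameter are assumptions for illustration.

```python
import sqlite3


def customers_with_n_years(rows, n_years=3):
    """Return customer_ids whose transactions fall in exactly `n_years` distinct years."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "transaction_id INTEGER, customer_id INTEGER, "
        "transaction_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT customer_id
        FROM transactions
        GROUP BY customer_id
        -- COUNT(DISTINCT ...) already ignores repeat purchases within one year
        HAVING COUNT(DISTINCT strftime('%Y', transaction_date)) = ?
        ORDER BY customer_id
    """
    result = [row[0] for row in conn.execute(query, (n_years,))]
    conn.close()
    return result


# Sample rows copied from the dataset above.
SAMPLE_ROWS = [
    (1, 101, "2020-01-01 00:00:00", 100.00),
    (2, 101, "2021-05-01 00:00:00", 200.00),
    (3, 101, "2022-07-01 00:00:00", 300.00),
    (4, 102, "2020-03-01 00:00:00", 150.00),
    (5, 102, "2021-09-01 00:00:00", 180.00),
    (6, 103, "2021-06-01 00:00:00", 500.00),
]
```

If "3 different years" is meant as "at least 3", the HAVING comparison would become >= instead of =.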
Explanation
Credit card transactions are tracked in a transactions table, and you need to identify credit
cards that are at high risk of fraudulent activity. A credit card is considered high-risk if it has
made transactions exceeding $1000 in a single day three or more times within the last 30
days. You need to calculate how many high-risk days each card has, and return the cards with
three or more high-risk days.
Datasets:
-- Insert data into transactions
INSERT INTO transactions (transaction_id, card_id, transaction_date, amount)
VALUES
(1, 201, '2022-12-01 00:00:00', 500.00),
(2, 201, '2022-12-01 00:00:00', 600.00),
(3, 201, '2022-12-02 00:00:00', 1200.00),
(4, 202, '2022-12-02 00:00:00', 800.00),
(5, 202, '2022-12-02 00:00:00', 200.00),
(6, 203, '2022-12-03 00:00:00', 2500.00),
(7, 203, '2022-12-03 00:00:00', 300.00),
(8, 203, '2022-12-05 00:00:00', 1200.00),
(9, 203, '2022-12-06 00:00:00', 500.00),
(10, 204, '2022-12-01 00:00:00', 150.00);
Learnings
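• Date filtering: Subtracting an interval from the current date (CURRENT_DATE - INTERVAL
'30 days' in PostgreSQL; CURDATE() - INTERVAL 30 DAY or DATE_SUB() in MySQL) to keep only
the last 30 days.
• GROUP BY: Grouping by card_id and transaction date.
• COUNT: Counting the number of days on which a card's total transactions exceed $1000
(high-risk days).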
• HAVING: Filtering cards that have three or more high-risk days.
Solutions
PostgreSQL Solution:
WITH high_risk_days AS (
SELECT
card_id,
DATE(transaction_date) AS transaction_day,
SUM(amount) AS total_amount
FROM
transactions
WHERE
transaction_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY
card_id, DATE(transaction_date)
HAVING
SUM(amount) > 1000
)
SELECT
card_id,
COUNT(transaction_day) AS high_risk_day_count
FROM
high_risk_days
GROUP BY
card_id
HAVING
COUNT(transaction_day) >= 3;
MySQL Solution:
WITH high_risk_days AS (
SELECT
card_id,
DATE(transaction_date) AS transaction_day,
SUM(amount) AS total_amount
FROM
transactions
WHERE
transaction_date >= CURDATE() - INTERVAL 30 DAY
GROUP BY
card_id, DATE(transaction_date)
HAVING
SUM(amount) > 1000
)
SELECT
card_id,
COUNT(transaction_day) AS high_risk_day_count
FROM
high_risk_days
GROUP BY
card_id
HAVING
COUNT(transaction_day) >= 3;
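One practical point when testing these queries: sample rows dated 2022 fall outside any 30-day window measured from today, so CURRENT_DATE or CURDATE() would return nothing. The sketch below passes the reference date in as a parameter instead. It is an illustration on SQLite (via Python's sqlite3), not the book's solution; the function name and the as_of, min_days, daily_limit and window_days parameters are assumptions.

```python
import sqlite3


def high_risk_cards(rows, as_of, min_days=3, daily_limit=1000, window_days=30):
    """Return (card_id, high_risk_day_count) for cards with at least `min_days`
    days whose daily total exceeds `daily_limit`, within `window_days` of `as_of`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "transaction_id INTEGER, card_id INTEGER, "
        "transaction_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH high_risk_days AS (
            SELECT card_id,
                   DATE(transaction_date) AS transaction_day,
                   SUM(amount) AS total_amount
            FROM transactions
            -- as_of replaces CURRENT_DATE so the result is reproducible
            WHERE DATE(transaction_date) >= DATE(:as_of, :offset)
              AND DATE(transaction_date) <= DATE(:as_of)
            GROUP BY card_id, DATE(transaction_date)
            HAVING SUM(amount) > :daily_limit
        )
        SELECT card_id, COUNT(*) AS high_risk_day_count
        FROM high_risk_days
        GROUP BY card_id
        HAVING COUNT(*) >= :min_days
        ORDER BY card_id
    """
    params = {
        "as_of": as_of,
        "offset": f"-{window_days} days",
        "daily_limit": daily_limit,
        "min_days": min_days,
    }
    result = conn.execute(query, params).fetchall()
    conn.close()
    return result


# Sample rows copied from the dataset above.
SAMPLE_ROWS = [
    (1, 201, "2022-12-01 00:00:00", 500.00),
    (2, 201, "2022-12-01 00:00:00", 600.00),
    (3, 201, "2022-12-02 00:00:00", 1200.00),
    (4, 202, "2022-12-02 00:00:00", 800.00),
    (5, 202, "2022-12-02 00:00:00", 200.00),
    (6, 203, "2022-12-03 00:00:00", 2500.00),
    (7, 203, "2022-12-03 00:00:00", 300.00),
    (8, 203, "2022-12-05 00:00:00", 1200.00),
    (9, 203, "2022-12-06 00:00:00", 500.00),
    (10, 204, "2022-12-01 00:00:00", 150.00),
]
```

With the sample rows, no card reaches three high-risk days, so the default call returns an empty list; lowering min_days to 2 is one way to see the query produce rows.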
• Q.728
Identify Top 3 Cards with the Highest Monthly Spend
Explanation
You need to identify the top 3 credit cards by total spending in a given month (e.g.,
2022-10). The transactions table records all the card transactions. For each card, sum the
amounts that fall within that month, then return the 3 cards with the highest totals.
Datasets:
-- Insert data into transactions
INSERT INTO transactions (transaction_id, card_id, transaction_date, amount)
VALUES
(1, 201, '2022-10-01 00:00:00', 250.00),
(2, 202, '2022-10-02 00:00:00', 500.00),
(3, 203, '2022-10-03 00:00:00', 1000.00),
(4, 201, '2022-10-10 00:00:00', 750.00),
(5, 202, '2022-10-15 00:00:00', 300.00),
(6, 203, '2022-10-18 00:00:00', 1200.00),
(7, 204, '2022-10-20 00:00:00', 100.00);
Learnings
• Grouping by Month: Extracting month and year from transaction_date.
• SUM(): Aggregating total amount spent for each card.
• LIMIT: Limiting the result to top N cards based on the total spend.
Solutions
PostgreSQL Solution:
SELECT
card_id,
SUM(amount) AS total_spent
FROM
transactions
WHERE
transaction_date >= '2022-10-01' AND transaction_date < '2022-11-01'
GROUP BY
card_id
ORDER BY
total_spent DESC
LIMIT 3;
MySQL Solution:
SELECT
card_id,
SUM(amount) AS total_spent
FROM
transactions
WHERE
transaction_date >= '2022-10-01' AND transaction_date < '2022-11-01'
GROUP BY
card_id
ORDER BY
total_spent DESC
LIMIT 3;
• Q.729
Detect Credit Cards with Suspicious Spending Patterns (High Frequency in Short Time)
Explanation
You are tasked with detecting suspicious spending patterns in the transactions table. A
suspicious pattern is defined as a credit card making more than 5 transactions in a 1-hour
period with a total amount greater than $500. You need to identify all card IDs that have
exhibited this behavior in the last 7 days.
Datasets:
-- Insert data into transactions
INSERT INTO transactions (transaction_id, card_id, transaction_date, amount)
VALUES
(1, 201, '2022-12-01 10:00:00', 100.00),
(2, 201, '2022-12-01 10:30:00', 200.00),
(3, 201, '2022-12-01 11:00:00', 150.00),
(4, 201, '2022-12-01 11:15:00', 60.00),
(5, 201, '2022-12-01 11:30:00', 80.00),
(6, 202, '2022-12-01 10:00:00', 500.00),
(7, 202, '2022-12-01 10:40:00', 60.00),
(8, 203, '2022-12-01 14:00:00', 200.00),
(9, 203, '2022-12-01 14:30:00', 100.00),
(10, 203, '2022-12-01 14:40:00', 150.00);
Learnings
• Window Functions: Using COUNT() and SUM() as window functions over a sliding frame to
check whether multiple transactions occurred within a short time span.
• Time-based frames: A RANGE frame ordered by transaction_date (e.g., the preceding 1 hour)
counts and sums only the transactions inside that window.
• Filtering Suspicious Patterns: Checking for patterns where the total transaction amount is
greater than a threshold within a given time period.
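A sliding one-hour window can be sketched as below. This is an illustration on SQLite (via Python's sqlite3), where the timestamp is converted to epoch seconds so that a numeric RANGE frame of 3600 seconds can be used; it assumes a SQLite build with window-function RANGE offsets (3.28 or later). The function name and the as_of, min_txns and min_total parameters are assumptions made for the example.

```python
import sqlite3


def suspicious_cards(rows, as_of, min_txns=5, min_total=500.0):
    """Return card_ids with more than `min_txns` transactions totalling more than
    `min_total` inside any 1-hour window, within the 7 days before `as_of`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "transaction_id INTEGER, card_id INTEGER, "
        "transaction_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH windowed AS (
            SELECT card_id,
                   -- Frame = this transaction plus everything in the 3600 s before it
                   COUNT(*) OVER (
                       PARTITION BY card_id
                       ORDER BY CAST(strftime('%s', transaction_date) AS INTEGER)
                       RANGE BETWEEN 3600 PRECEDING AND CURRENT ROW
                   ) AS transactions_in_hour,
                   SUM(amount) OVER (
                       PARTITION BY card_id
                       ORDER BY CAST(strftime('%s', transaction_date) AS INTEGER)
                       RANGE BETWEEN 3600 PRECEDING AND CURRENT ROW
                   ) AS total_in_hour
            FROM transactions
            WHERE DATE(transaction_date) >= DATE(:as_of, '-7 days')
        )
        SELECT DISTINCT card_id
        FROM windowed
        WHERE transactions_in_hour > :min_txns
          AND total_in_hour > :min_total
        ORDER BY card_id
    """
    params = {"as_of": as_of, "min_txns": min_txns, "min_total": min_total}
    result = [row[0] for row in conn.execute(query, params)]
    conn.close()
    return result


# Sample rows copied from the dataset above.
SAMPLE_ROWS = [
    (1, 201, "2022-12-01 10:00:00", 100.00),
    (2, 201, "2022-12-01 10:30:00", 200.00),
    (3, 201, "2022-12-01 11:00:00", 150.00),
    (4, 201, "2022-12-01 11:15:00", 60.00),
    (5, 201, "2022-12-01 11:30:00", 80.00),
    (6, 202, "2022-12-01 10:00:00", 500.00),
    (7, 202, "2022-12-01 10:40:00", 60.00),
    (8, 203, "2022-12-01 14:00:00", 200.00),
    (9, 203, "2022-12-01 14:30:00", 100.00),
    (10, 203, "2022-12-01 14:40:00", 150.00),
]
```

A frame of ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW would count every earlier transaction of the card, not only those in the last hour, which is why the frame here is expressed as a RANGE over time. With the sample rows, the busiest hour for any card holds four transactions, so the stated thresholds return no cards.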
Solutions
PostgreSQL Solution:
WITH suspicious_transactions AS (
SELECT
card_id,
transaction_date,
amount,
COUNT(*) OVER (PARTITION BY card_id ORDER BY transaction_date
RANGE BETWEEN INTERVAL '1 hour' PRECEDING AND CURRENT ROW) AS transactions_in_hour,
SUM(amount) OVER (PARTITION BY card_id ORDER BY transaction_date
RANGE BETWEEN INTERVAL '1 hour' PRECEDING AND CURRENT ROW) AS total_in_hour
FROM
transactions
WHERE
transaction_date >= CURRENT_DATE - INTERVAL '7 days'
)
SELECT DISTINCT card_id
FROM
suspicious_transactions
WHERE
transactions_in_hour > 5 AND total_in_hour > 500;
MySQL Solution:
WITH suspicious_transactions AS (
SELECT
card_id,
transaction_date,
amount,
COUNT(*) OVER (PARTITION BY card_id ORDER BY transaction_date
RANGE BETWEEN INTERVAL 1 HOUR PRECEDING AND CURRENT ROW) AS transactions_in_hour,
• Q.730
Question
Identify the VIP Customers for American Express
For American Express, identify customers with a high frequency of large transactions ($5000
or more). These 'Whale' users are customers who have made multiple transactions at or above
the $5000 threshold.
Explanation
The task is to find customers who have made multiple transactions with amounts greater than
or equal to $5000. This can be achieved by counting the transactions per customer and
applying a filter for the transaction count.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id VARCHAR(50),
customer_id VARCHAR(50),
transaction_date TIMESTAMP,
transaction_amount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO transactions (transaction_id, customer_id, transaction_date, transaction_amount)
VALUES
('001', 'C001', '2022-08-01 00:00:00', 4800),
('002', 'C002', '2022-08-02 00:00:00', 6800),
('003', 'C003', '2022-08-03 00:00:00', 5000),
('004', 'C001', '2022-08-10 00:00:00', 5200),
('005', 'C002', '2022-08-22 00:00:00', 7000);
Learnings
• Use of COUNT() function to count transactions
• Filtering with HAVING for aggregate conditions
• Grouping by customer to aggregate data
• Transaction filtering with WHERE
Solutions
• - PostgreSQL solution
SELECT customer_id, COUNT(*) AS transaction_count
FROM transactions
WHERE transaction_amount >= 5000
GROUP BY customer_id
HAVING COUNT(*) > 1;
• - MySQL solution
SELECT customer_id, COUNT(*) AS transaction_count
FROM transactions
WHERE transaction_amount >= 5000
GROUP BY customer_id
HAVING COUNT(*) > 1;
• Q.731
Question
Identify all different types of product and revenue for American Express
Given a table with product information and transaction data, write a SQL query to identify all
different product types along with the total revenue generated for each product.
Explanation
The task is to group the transactions by product type and sum the transaction amounts for
each product. This will give the total revenue per product.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id VARCHAR(50),
customer_id VARCHAR(50),
transaction_date TIMESTAMP,
transaction_amount DECIMAL(10, 2),
product_type VARCHAR(100)
);
• - Datasets
INSERT INTO transactions (transaction_id, customer_id, transaction_date, transaction_amount, product_type)
VALUES
('001', 'C001', '2022-08-01 00:00:00', 4800, 'Credit Card'),
('002', 'C002', '2022-08-02 00:00:00', 6800, 'Loan'),
('003', 'C003', '2022-08-03 00:00:00', 5000, 'Credit Card'),
('004', 'C001', '2022-08-10 00:00:00', 5200, 'Insurance'),
('005', 'C002', '2022-08-22 00:00:00', 7000, 'Loan');
Learnings
• Use of SUM() to calculate revenue
• Grouping by product type to aggregate data
• Use of GROUP BY and SELECT to identify distinct values
Solutions
• - PostgreSQL solution
SELECT product_type, SUM(transaction_amount) AS total_revenue
FROM transactions
GROUP BY product_type;
• - MySQL solution
SELECT product_type, SUM(transaction_amount) AS total_revenue
FROM transactions
GROUP BY product_type;
• Q.732
Question
Find customers who are based in New York, have a credit score above 700, and have a
total transaction amount in the last year exceeding $5000. Return the customer IDs in
ascending order.
Explanation
The query needs to filter customers based on their location (New York), credit score (>700),
and their total transaction amount in the last year (greater than $5000). We need to join the
customer and transaction tables, aggregate the transactions for each customer, and apply
the required filters. Finally, sort the customer IDs in ascending order.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE customer (
customer_id INT,
name VARCHAR(100),
location VARCHAR(50),
credit_score INT
);
Learnings
• Using JOIN to combine multiple tables
• Applying WHERE filters for location and credit score
• Filtering transactions based on date with WHERE and DATE
• Using GROUP BY for aggregating transaction totals per customer
• Sorting results with ORDER BY
Solutions
• - PostgreSQL solution
SELECT c.customer_id
FROM customer c
JOIN transaction t ON c.customer_id = t.customer_id
WHERE c.location = 'New York'
AND c.credit_score > 700
AND t.transaction_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY c.customer_id
HAVING SUM(t.transaction_amount) > 5000
ORDER BY c.customer_id;
• - MySQL solution
SELECT c.customer_id
FROM customer c
JOIN transaction t ON c.customer_id = t.customer_id
WHERE c.location = 'New York'
AND c.credit_score > 700
AND t.transaction_date >= CURDATE() - INTERVAL 1 YEAR
GROUP BY c.customer_id
HAVING SUM(t.transaction_amount) > 5000
ORDER BY c.customer_id;
• Q.733
Question
Calculate the click-through-rate (CTR) for each campaign. The CTR is the ratio of the
number of customers who clicked on a campaign to the number of customers who viewed the
campaign. It is calculated as:
CTR = (Number of 'Clicked' Actions / Number of 'Viewed' Actions) * 100
Explanation
To calculate the CTR, we need to count how many times each campaign was viewed and
clicked. Then, calculate the ratio of clicks to views and multiply by 100 to get the percentage.
We use a LEFT JOIN to ensure we include all campaigns, even those with no clicks or views,
and then aggregate the actions using SUM and CASE statements.
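The calculation can be tried end to end with the sketch below, which runs on SQLite through Python's sqlite3 module as a stand-in for PostgreSQL/MySQL. The clicks table layout (campaign_id, action) and the function name are assumptions, since only the campaigns schema is shown here. One detail worth seeing in practice: in PostgreSQL and SQLite, dividing one integer count by another truncates toward zero, so the sketch multiplies by 100.0 before dividing.

```python
import sqlite3


def click_through_rates(campaigns, clicks):
    """Return (campaign_id, channel, ctr_percent) per campaign; ctr is None
    when a campaign has no 'Viewed' rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE campaigns (campaign_id INTEGER, channel TEXT, date TEXT)")
    # Assumed layout: one row per customer action on a campaign.
    conn.execute("CREATE TABLE clicks (campaign_id INTEGER, action TEXT)")
    conn.executemany("INSERT INTO campaigns VALUES (?, ?, ?)", campaigns)
    conn.executemany("INSERT INTO clicks VALUES (?, ?)", clicks)
    query = """
        SELECT c.campaign_id,
               c.channel,
               -- 100.0 forces decimal arithmetic; NULLIF avoids division by zero
               SUM(CASE WHEN ck.action = 'Clicked' THEN 1 ELSE 0 END) * 100.0
                 / NULLIF(SUM(CASE WHEN ck.action = 'Viewed' THEN 1 ELSE 0 END), 0)
                 AS click_through_rate
        FROM campaigns c
        LEFT JOIN clicks ck ON c.campaign_id = ck.campaign_id
        GROUP BY c.campaign_id, c.channel
        ORDER BY c.campaign_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


def integer_division_demo():
    """Show that integer / integer truncates in SQLite (as it does in PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    truncated, exact = conn.execute("SELECT 1 / 4, 1 * 100.0 / 4").fetchone()
    conn.close()
    return truncated, exact
```

A campaign with no rows in clicks still appears in the result because of the LEFT JOIN; its rate comes back as NULL (None in Python) rather than 0, which keeps "no data" distinguishable from "viewed but never clicked".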
Datasets and SQL Schemas
• - Table creation
CREATE TABLE campaigns (
campaign_id INT,
channel VARCHAR(50),
date DATE
);
Learnings
• Using LEFT JOIN to join tables and preserve all campaigns
• Using CASE statements within aggregation to count specific actions
• Calculating percentages with simple arithmetic after aggregation
• Grouping by campaign to calculate CTR for each campaign
Solutions
• - PostgreSQL solution
SELECT
c.campaign_id,
c.channel,
-- Multiply by 100.0 first: integer / integer would truncate to 0 in PostgreSQL
(SUM(CASE WHEN ck.action = 'Clicked' THEN 1 ELSE 0 END) * 100.0 /
NULLIF(SUM(CASE WHEN ck.action = 'Viewed' THEN 1 ELSE 0 END), 0)) AS click_through_rate
FROM
campaigns c
LEFT JOIN
clicks ck
ON
c.campaign_id = ck.campaign_id
GROUP BY
c.campaign_id,
c.channel;
• - MySQL solution
SELECT
c.campaign_id,
c.channel,
(SUM(CASE WHEN ck.action = 'Clicked' THEN 1 ELSE 0 END) /
NULLIF(SUM(CASE WHEN ck.action = 'Viewed' THEN 1 ELSE 0 END), 0)) * 100 AS click_through_rate
FROM
campaigns c
LEFT JOIN
clicks ck
ON
c.campaign_id = ck.campaign_id
GROUP BY
c.campaign_id,
c.channel;
• Q.734
Calculate the total spend of each user in the past month. The table holds both successful and
failed transactions; only successful ones count toward spend.
Explanation
To calculate the total spend, we need to sum the transaction_amount for each user where
the transaction occurred in the past month. The query should filter by transaction status to
include only successful transactions. We will also group by the user to get the total spend per
user.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id INT,
user_id INT,
transaction_date DATE,
transaction_amount DECIMAL(10, 2),
status VARCHAR(20)
);
• - Datasets
INSERT INTO transactions (transaction_id, user_id, transaction_date, transaction_amount, status)
VALUES
(1, 101, '2022-12-01', 150.00, 'Successful'),
(2, 102, '2022-12-02', 200.00, 'Failed'),
(3, 103, '2022-12-03', 250.00, 'Successful'),
(4, 101, '2022-12-04', 100.00, 'Successful'),
(5, 102, '2022-12-05', 300.00, 'Successful');
Learnings
• Use of SUM() for aggregating total spend
• Filtering by status to include only successful transactions
• Use of date filtering with WHERE for the past month
• Grouping by user to calculate individual spend
Solutions
• - PostgreSQL solution
SELECT user_id, SUM(transaction_amount) AS total_spend
FROM transactions
WHERE status = 'Successful'
AND transaction_date >= CURRENT_DATE - INTERVAL '1 month'
GROUP BY user_id;
• - MySQL solution
SELECT user_id, SUM(transaction_amount) AS total_spend
FROM transactions
WHERE status = 'Successful'
AND transaction_date >= CURDATE() - INTERVAL 1 MONTH
GROUP BY user_id;
• Q.735
Calculate the total charges incurred by each user in a given year, where charges are defined
as any transaction over $500.
Explanation
The task is to identify transactions with amounts over $500 within the given year (2022 in
the sample data) and calculate the total charges for each user. This requires filtering by
transaction amount and year, then grouping by user.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id INT,
user_id INT,
transaction_date DATE,
transaction_amount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO transactions (transaction_id, user_id, transaction_date, transaction_amount)
VALUES
(1, 101, '2022-01-15', 550.00),
(2, 102, '2022-03-10', 200.00),
(3, 103, '2022-07-23', 600.00),
(4, 101, '2022-08-12', 450.00),
(5, 102, '2022-11-05', 700.00);
Learnings
• Filtering transactions by amount using WHERE
• Aggregating charges using SUM()
• Grouping by user to calculate the total charges for each user
Solutions
• - PostgreSQL solution
SELECT user_id, SUM(transaction_amount) AS total_charges
FROM transactions
WHERE transaction_amount > 500
AND EXTRACT(YEAR FROM transaction_date) = 2022
GROUP BY user_id;
• - MySQL solution
SELECT user_id, SUM(transaction_amount) AS total_charges
FROM transactions
WHERE transaction_amount > 500
AND YEAR(transaction_date) = 2022
GROUP BY user_id;
• Q.736
Calculate the total rewards points accumulated by each user based on their transaction
history, assuming 1 point is awarded for every $10 spent.
Explanation
We need to calculate the total points for each user by dividing the transaction amount by 10
(as 1 point = $10 spent). This should be done for all transactions, and the result should be
aggregated by user.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id INT,
user_id INT,
transaction_date DATE,
transaction_amount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO transactions (transaction_id, user_id, transaction_date, transaction_amount)
VALUES
(1, 101, '2022-12-01', 150.00),
(2, 102, '2022-12-02', 200.00),
(3, 103, '2022-12-03', 250.00),
(4, 101, '2022-12-04', 100.00),
(5, 102, '2022-12-05', 300.00);
Learnings
• Use of arithmetic to calculate rewards points
• Aggregation with SUM() to calculate total points per user
• Grouping by user to calculate individual totals
Solutions
• - PostgreSQL solution
SELECT user_id, SUM(transaction_amount / 10) AS total_rewards_points
FROM transactions
GROUP BY user_id;
• - MySQL solution
SELECT user_id, SUM(transaction_amount / 10) AS total_rewards_points
FROM transactions
GROUP BY user_id;
• Q.737
Identify the top 3 users who spent the most money on transactions in the last 6 months.
Explanation
To find the top 3 users with the highest spend in the last 6 months, we need to sum the
transaction_amount for each user, filter the transactions by date (last 6 months), and then
order the users by the total spend in descending order. The LIMIT clause will be used to
restrict the result to the top 3.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id INT,
user_id INT,
transaction_date DATE,
transaction_amount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO transactions (transaction_id, user_id, transaction_date, transaction_amount)
VALUES
(1, 101, '2022-06-01', 150.00),
(2, 102, '2022-07-15', 300.00),
(3, 103, '2022-08-20', 400.00),
(4, 101, '2022-09-10', 500.00),
(5, 102, '2022-10-01', 200.00),
(6, 104, '2022-11-15', 700.00),
(7, 105, '2022-12-05', 250.00);
Learnings
• Aggregating total spend per user
• Filtering by date using WHERE and INTERVAL
• Ordering by sum and limiting results with LIMIT
• Working with date functions to calculate date ranges
Solutions
• - PostgreSQL solution
SELECT user_id, SUM(transaction_amount) AS total_spent
FROM transactions
WHERE transaction_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 3;
• - MySQL solution
SELECT user_id, SUM(transaction_amount) AS total_spent
FROM transactions
WHERE transaction_date >= CURDATE() - INTERVAL 6 MONTH
GROUP BY user_id
ORDER BY total_spent DESC
LIMIT 3;
• Q.738
Calculate the average transaction amount for each user during the month of December.
Explanation
To find the average transaction amount for each user in December, we need to filter
transactions based on the month and year (December 2022), group by user_id, and then
calculate the average transaction amount using the AVG() function.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id INT,
user_id INT,
transaction_date DATE,
transaction_amount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO transactions (transaction_id, user_id, transaction_date, transaction_amount)
VALUES
(1, 101, '2022-12-01', 150.00),
(2, 102, '2022-12-02', 200.00),
(3, 101, '2022-12-05', 350.00),
(4, 103, '2022-12-10', 400.00),
(5, 104, '2022-12-20', 500.00);
Learnings
• Filtering transactions by month using MONTH() and YEAR()
• Calculating average using AVG()
• Grouping by user_id to get average spend per user
Solutions
• - PostgreSQL solution
SELECT user_id, AVG(transaction_amount) AS average_transaction_amount
FROM transactions
WHERE EXTRACT(MONTH FROM transaction_date) = 12
AND EXTRACT(YEAR FROM transaction_date) = 2022
GROUP BY user_id;
• - MySQL solution
SELECT user_id, AVG(transaction_amount) AS average_transaction_amount
FROM transactions
WHERE MONTH(transaction_date) = 12
AND YEAR(transaction_date) = 2022
GROUP BY user_id;
• Q.739
Determine the total number of transactions and total amount spent for each user in the last
year.
Explanation
This task requires calculating the total number of transactions and the total transaction
amount for each user over the past year. We need to filter transactions within the last 12
months and use COUNT() to get the number of transactions and SUM() to calculate the total
amount spent.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id INT,
user_id INT,
transaction_date DATE,
transaction_amount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO transactions (transaction_id, user_id, transaction_date, transaction_amount)
VALUES
(1, 101, '2022-01-15', 150.00),
(2, 102, '2022-02-20', 200.00),
(3, 101, '2022-05-25', 250.00),
(4, 103, '2022-06-30', 300.00),
(5, 101, '2022-09-10', 100.00),
(6, 102, '2022-10-15', 350.00),
(7, 103, '2022-11-01', 500.00);
Learnings
• Using COUNT() to find the total number of transactions
• Using SUM() to find the total amount spent
• Filtering transactions by date (last 12 months)
• Grouping results by user for aggregated statistics
Solutions
• - PostgreSQL solution
SELECT user_id, COUNT(*) AS total_transactions, SUM(transaction_amount) AS total_spent
FROM transactions
WHERE transaction_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY user_id;
• - MySQL solution
SELECT user_id, COUNT(*) AS total_transactions, SUM(transaction_amount) AS total_spent
FROM transactions
WHERE transaction_date >= CURDATE() - INTERVAL 1 YEAR
GROUP BY user_id;
• Q.740
Find the users who spent more in the first half of the year (January to June) compared to the
second half (July to December) of the same year. Return their user IDs, the total amount
spent in both periods, and the difference.
Explanation
This question involves breaking down transactions into two distinct time periods (first half
and second half of the year) and comparing the total spend in each period for each user. We
need to calculate the total spend for each user in both periods and return users who spent
more in the first half. The query should also return the difference in the amounts spent
between the two periods.
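The comparison described above can be sketched with conditional aggregation inside a CTE, so each half-year total is written once and then reused for the difference and the filter. This is an illustration on SQLite (via Python's sqlite3) rather than the book's solution; the function name and the year parameter are assumptions, the latter added to honor "of the same year".

```python
import sqlite3


def first_half_heavy_spenders(rows, year):
    """Return (user_id, first_half, second_half, difference) for users who spent
    more in January-June than in July-December of `year`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions ("
        "transaction_id INTEGER, user_id INTEGER, "
        "transaction_date TEXT, transaction_amount REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH half_year_spend AS (
            SELECT user_id,
                   SUM(CASE WHEN CAST(strftime('%m', transaction_date) AS INTEGER) <= 6
                            THEN transaction_amount ELSE 0 END) AS first_half_spend,
                   SUM(CASE WHEN CAST(strftime('%m', transaction_date) AS INTEGER) >= 7
                            THEN transaction_amount ELSE 0 END) AS second_half_spend
            FROM transactions
            WHERE strftime('%Y', transaction_date) = ?  -- restrict to one calendar year
            GROUP BY user_id
        )
        SELECT user_id,
               first_half_spend,
               second_half_spend,
               first_half_spend - second_half_spend AS spend_difference
        FROM half_year_spend
        WHERE first_half_spend > second_half_spend
        ORDER BY user_id
    """
    result = conn.execute(query, (str(year),)).fetchall()
    conn.close()
    return result
```

Without the year filter, a table spanning several years would add January-June amounts from different years together, so the comparison would no longer be within the same year.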
Datasets and SQL Schemas
• - Table creation
CREATE TABLE transactions (
transaction_id INT,
user_id INT,
transaction_date DATE,
transaction_amount DECIMAL(10, 2)
);
• - Datasets
INSERT INTO transactions (transaction_id, user_id, transaction_date, transaction_amount)
VALUES
(1, 101, '2022-01-15', 150.00),
(2, 101, '2022-03-01', 200.00),
(3, 101, '2022-06-20', 300.00),
(4, 101, '2022-07-10', 400.00),
(5, 101, '2022-11-05', 500.00),
(6, 102, '2022-01-01', 250.00),
(7, 102, '2022-05-15', 300.00),
(8, 102, '2022-08-12', 200.00),
(9, 102, '2022-12-25', 150.00);
Learnings
• Use of conditional aggregation with CASE statements
• Date-based filtering using MONTH() or EXTRACT()
• Comparing two different groups of aggregated data
• Understanding how to calculate differences across two periods
Solutions
• - PostgreSQL solution
SELECT user_id,
SUM(CASE WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 1 AND 6 THEN transaction_amount ELSE 0 END) AS first_half_spend,
SUM(CASE WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 7 AND 12 THEN transaction_amount ELSE 0 END) AS second_half_spend,
SUM(CASE WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 1 AND 6 THEN transaction_amount ELSE 0 END) -
SUM(CASE WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 7 AND 12 THEN transaction_amount ELSE 0 END) AS spend_difference
FROM transactions
GROUP BY user_id
HAVING SUM(CASE WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 1 AND 6 THEN transaction_amount ELSE 0 END) >
SUM(CASE WHEN EXTRACT(MONTH FROM transaction_date) BETWEEN 7 AND 12 THEN transaction_amount ELSE 0 END);
• - MySQL solution
SELECT user_id,
SUM(CASE WHEN MONTH(transaction_date) BETWEEN 1 AND 6 THEN transaction_amount ELSE 0 END) AS first_half_spend,
EY
• Q.741
Explanation
To identify duplicate orders:
• Group the records by customer_id, product_id, and order_date.
• Use the HAVING clause to filter groups that appear more than once, which indicates
duplicates.
• Return the duplicate orders' details such as customer_id, product_id, and order_date.
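The grouping steps above can be tried on a small table, as in the sketch below (SQLite via Python's sqlite3, standing in for PostgreSQL/MySQL). The order_id column, the sample data used to exercise it, and both function names are assumptions. The second function swaps in a different technique, ROW_NUMBER(), to list the individual redundant rows rather than the duplicated groups.

```python
import sqlite3


def _load(rows):
    """Create an in-memory customer_orders table; order_id is an assumed key column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_orders ("
        "order_id INTEGER, customer_id INTEGER, product_id INTEGER, order_date TEXT)"
    )
    conn.executemany("INSERT INTO customer_orders VALUES (?, ?, ?, ?)", rows)
    return conn


def duplicate_groups(rows):
    """Return (customer_id, product_id, order_date, duplicate_count) for repeated orders."""
    conn = _load(rows)
    query = """
        SELECT customer_id, product_id, order_date, COUNT(*) AS duplicate_count
        FROM customer_orders
        GROUP BY customer_id, product_id, order_date
        HAVING COUNT(*) > 1
        ORDER BY customer_id, product_id, order_date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


def redundant_order_ids(rows):
    """Return order_ids of every copy after the first within each duplicate group."""
    conn = _load(rows)
    query = """
        SELECT order_id
        FROM (
            SELECT order_id,
                   -- Number the copies inside each (customer, product, date) group
                   ROW_NUMBER() OVER (
                       PARTITION BY customer_id, product_id, order_date
                       ORDER BY order_id
                   ) AS copy_number
            FROM customer_orders
        ) AS numbered
        WHERE copy_number > 1
        ORDER BY order_id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result
```

The GROUP BY form answers "which combinations are duplicated and how often"; the ROW_NUMBER() form answers "which specific rows would be removed if one copy of each is kept".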
Learnings
• Grouping records to identify duplicates.
• Using HAVING with COUNT to filter out duplicates.
• Self-join techniques to find specific records.
Solutions
PostgreSQL Solution
SELECT
customer_id,
product_id,
order_date,
COUNT(*) AS duplicate_count
FROM
customer_orders
GROUP BY
customer_id, product_id, order_date
HAVING
COUNT(*) > 1
ORDER BY
customer_id, product_id, order_date;
MySQL Solution
SELECT
customer_id,
product_id,
order_date,
COUNT(*) AS duplicate_count
FROM
customer_orders
GROUP BY
customer_id, product_id, order_date
HAVING
COUNT(*) > 1
ORDER BY
customer_id, product_id, order_date;
• Q.742
Question: Identifying Unmatched Employee Logins and Logouts
Using a table of employee login and logout events (event_id, employee_id, event_type,
event_timestamp), write a query to find employees who have a login event without a
corresponding logout event. Each event_type can either be login or logout, and each
employee should have only one login and one logout event at any given time. The output
should list the employee_id and the event_timestamp of their unmatched login event.
Explanation
To identify employees who have logged in but not logged out:
• Join the table with itself to match each login event with a logout event for the same
employee.
• Use a LEFT JOIN to include all login events, even those without a matching logout
event.
• Filter for records where the logout event is missing (NULL).
• Return the employee_id and event_timestamp of the unmatched login events.
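The steps above can be sketched as a runnable example on SQLite (via Python's sqlite3), used here as a stand-in for PostgreSQL/MySQL. The function name is an assumption, and timestamps are stored as 'YYYY-MM-DD HH:MM:SS' text so that string comparison matches chronological order.

```python
import sqlite3


def unmatched_logins(events):
    """Return (employee_id, login_timestamp) for logins with no later logout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_events ("
        "event_id INTEGER, employee_id INTEGER, "
        "event_type TEXT, event_timestamp TEXT)"
    )
    conn.executemany("INSERT INTO employee_events VALUES (?, ?, ?, ?)", events)
    query = """
        SELECT e1.employee_id,
               e1.event_timestamp AS unmatched_login_timestamp
        FROM employee_events e1
        LEFT JOIN employee_events e2
               ON e1.employee_id = e2.employee_id
              AND e2.event_type = 'logout'
              AND e1.event_timestamp < e2.event_timestamp  -- logout must come after login
        WHERE e1.event_type = 'login'
          AND e2.event_id IS NULL                          -- no such logout was found
        ORDER BY e1.employee_id, e1.event_timestamp
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Note that this pairs a login with any later logout by the same employee. If an employee can log in twice before logging out once, the earlier login would still count as matched; handling that case needs a stricter pairing rule than the one described here.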
event_timestamp DATETIME
);
Learnings
• Using JOIN to match related events in the same table.
• The use of LEFT JOIN to capture missing logout events.
• Filtering with IS NULL to detect unmatched events.
Solutions
PostgreSQL Solution
SELECT
e1.employee_id,
e1.event_timestamp AS unmatched_login_timestamp
FROM
employee_events e1
LEFT JOIN
employee_events e2 ON e1.employee_id = e2.employee_id
AND e1.event_type = 'login'
AND e2.event_type = 'logout'
AND e1.event_timestamp < e2.event_timestamp
WHERE
e1.event_type = 'login'
AND e2.event_id IS NULL
ORDER BY
e1.employee_id;
MySQL Solution
SELECT
e1.employee_id,
e1.event_timestamp AS unmatched_login_timestamp
FROM
employee_events e1
LEFT JOIN
employee_events e2 ON e1.employee_id = e2.employee_id
AND e1.event_type = 'login'
AND e2.event_type = 'logout'
AND e1.event_timestamp < e2.event_timestamp
WHERE
e1.event_type = 'login'
AND e2.event_id IS NULL
ORDER BY
e1.employee_id;
• Q.743
event_type, and event_timestamp. The events are either login or logout, and each
employee logs in and out once per day. The result should list the employee_id, the
first_day where the login duration was shorter than 8 hours, and the second_day where the
login duration was also shorter than 8 hours.
Explanation
To identify employees with shorter login durations for consecutive days:
• Pair up login and logout events for each employee on the same day.
• Calculate the duration between login and logout for each event.
• Filter the results to include only those where the duration is less than 8 hours.
• Use a self-join to find consecutive days where both days have a login duration of less than
8 hours.
• Return the employee_id, first_day, and second_day.
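The steps above can be sketched on SQLite (via Python's sqlite3) as follows. SQLite has neither EXTRACT(EPOCH ...) nor TIMESTAMPDIFF(), so the duration is computed from julianday() differences, which are in days and are multiplied by 24 to get hours. The function name and the max_hours parameter are assumptions made for the example.

```python
import sqlite3


def short_consecutive_days(events, max_hours=8):
    """Return (employee_id, first_day, second_day) where both consecutive days
    have a login-to-logout duration below `max_hours`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_events ("
        "event_id INTEGER, employee_id INTEGER, "
        "event_type TEXT, event_timestamp TEXT)"
    )
    conn.executemany("INSERT INTO employee_events VALUES (?, ?, ?, ?)", events)
    query = """
        WITH login_durations AS (
            SELECT employee_id,
                   DATE(event_timestamp) AS login_date,
                   -- julianday() differences are in days; * 24 converts to hours
                   (julianday(MAX(CASE WHEN event_type = 'logout' THEN event_timestamp END))
                    - julianday(MIN(CASE WHEN event_type = 'login' THEN event_timestamp END)))
                     * 24 AS duration_hours
            FROM employee_events
            WHERE event_type IN ('login', 'logout')
            GROUP BY employee_id, DATE(event_timestamp)
        )
        SELECT l1.employee_id,
               l1.login_date AS first_day,
               l2.login_date AS second_day
        FROM login_durations l1
        JOIN login_durations l2
          ON l1.employee_id = l2.employee_id
         AND l2.login_date = DATE(l1.login_date, '+1 day')  -- the very next calendar day
        WHERE l1.duration_hours < ?
          AND l2.duration_hours < ?
        ORDER BY l1.employee_id, l1.login_date
    """
    result = conn.execute(query, (max_hours, max_hours)).fetchall()
    conn.close()
    return result
```

A day with a login but no logout yields a NULL duration, and NULL < 8 is not true, so such days drop out of the result instead of being reported as short.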
Learnings
• Calculating time differences between login and logout events.
• Using JOINs and self-joins to find consecutive days.
• Filtering on time durations (less than 8 hours).
• Date manipulation to find consecutive events.
Solutions
PostgreSQL Solution
WITH LoginDurations AS (
SELECT
employee_id,
DATE(event_timestamp) AS login_date,
EXTRACT(EPOCH FROM (MAX(CASE WHEN event_type = 'logout' THEN event_timestamp END)
- MIN(CASE WHEN event_type = 'login' THEN event_timestamp END))) / 3600 AS duration_hours
FROM
employee_events
WHERE
event_type IN ('login', 'logout')
GROUP BY
employee_id, DATE(event_timestamp)
)
SELECT
l1.employee_id,
l1.login_date AS first_day,
l2.login_date AS second_day
FROM
LoginDurations l1
JOIN
LoginDurations l2 ON l1.employee_id = l2.employee_id
AND l1.login_date = l2.login_date - INTERVAL '1 day'
WHERE
l1.duration_hours < 8
AND l2.duration_hours < 8
ORDER BY
l1.employee_id, l1.login_date;
MySQL Solution
WITH LoginDurations AS (
SELECT
employee_id,
DATE(event_timestamp) AS login_date,
TIMESTAMPDIFF(SECOND,
MIN(CASE WHEN event_type = 'login' THEN event_timestamp END),
MAX(CASE WHEN event_type = 'logout' THEN event_timestamp END)) / 3600 AS duration_hours
FROM
employee_events
WHERE
event_type IN ('login', 'logout')
GROUP BY
employee_id, DATE(event_timestamp)
)
SELECT
l1.employee_id,
l1.login_date AS first_day,
l2.login_date AS second_day
FROM
LoginDurations l1
JOIN
LoginDurations l2 ON l1.employee_id = l2.employee_id
AND l1.login_date = DATE_SUB(l2.login_date, INTERVAL 1 DAY)
WHERE
l1.duration_hours < 8
AND l2.duration_hours < 8
ORDER BY
l1.employee_id, l1.login_date;
Explanation
To identify employees who worked more than 1 hour on each of two consecutive days:
• Calculate the duration for each login/logout pair for each employee on each day.
• Join the table with itself to find consecutive days for the same employee.
• Filter the records where the working time on each of the two days exceeds 1 hour.
• Raise a flag for employees meeting the condition.
Learnings
• Calculating time differences between login and logout.
• Self-joining the table to compare consecutive days.
• Filtering based on working duration (more than 1 hour).
• Using flags to mark employees who meet the criteria.
Solutions
PostgreSQL Solution
WITH WorkingDurations AS (
SELECT
employee_id,
DATE(event_timestamp) AS work_day,
MySQL Solution
WITH WorkingDurations AS (
SELECT
employee_id,
DATE(event_timestamp) AS work_day,
TIMESTAMPDIFF(SECOND,
MIN(CASE WHEN event_type = 'login' THEN event_timestamp END),
MAX(CASE WHEN event_type = 'logout' THEN event_timestamp END)) / 3600 AS working_hours
FROM
employee_events
WHERE
event_type IN ('login', 'logout')
GROUP BY
employee_id, DATE(event_timestamp)
)
SELECT
w1.employee_id,
w1.work_day AS first_day,
w2.work_day AS second_day,
'Flag' AS flag
FROM
WorkingDurations w1
JOIN
WorkingDurations w2 ON w1.employee_id = w2.employee_id
AND w1.work_day = DATE_SUB(w2.work_day, INTERVAL 1 DAY)
WHERE
w1.working_hours > 1
AND w2.working_hours > 1
ORDER BY
w1.employee_id, w1.work_day;
This query helps identify employees who are consistently working for more than 1 hour on
two consecutive days, which could be useful for monitoring employee engagement or
workload.
• Q.745
Employee Work-Life Balance Analysis
Question: Write a SQL query to identify employees who are working more than 10 hours on
average per day in the last 7 days. These employees should be flagged as potentially at risk
for burnout. The data is provided in the employee_work_hours table, which tracks the daily
work hours for employees. The result should show the employee_id, average_work_hours,
and a flag indicating that they are at risk.
Explanation
• Calculate the average work hours for each employee over the last 7 days.
• Flag those whose average exceeds 10 hours per day.
• Return the employee ID, their average_work_hours, and a flag indicating they are at risk.
Solution
SELECT
employee_id,
AVG(hours_worked) AS average_work_hours,
'At Risk' AS flag
FROM
employee_work_hours
WHERE
work_date >= CURDATE() - INTERVAL 7 DAY
GROUP BY
employee_id
HAVING
AVG(hours_worked) > 10
ORDER BY
average_work_hours DESC;
• Q.746
Explanation
• Calculate the total break time for each employee on each day.
• Identify employees who have less than 30 minutes of break on consecutive days.
• Return employee_id, start_date, end_date, and flag them as needing more breaks.
Solution
WITH BreakDurations AS (
SELECT
employee_id,
DATE(break_start) AS break_date,
SUM(TIMESTAMPDIFF(MINUTE, break_start, break_end)) AS total_break_time
FROM
employee_breaks
GROUP BY
employee_id, DATE(break_start)
)
SELECT
b1.employee_id,
b1.break_date AS start_date,
b2.break_date AS end_date,
'Insufficient Breaks' AS flag
FROM
BreakDurations b1
JOIN
BreakDurations b2 ON b1.employee_id = b2.employee_id
AND DATEDIFF(b2.break_date, b1.break_date) = 1
WHERE
b1.total_break_time < 30
AND b2.total_break_time < 30
ORDER BY
b1.employee_id, b1.break_date;
• Q.747
Explanation
• Calculate the total sick days taken by each employee in the last month.
• Identify employees who have taken more than 3 sick days.
• Return employee_id, sick_days_taken, and flag them as requiring follow-up.
Solution
SELECT
employee_id,
COUNT(sick_day_date) AS sick_days_taken,
'Follow-up Needed' AS flag
FROM
employee_sick_days
WHERE
sick_day_date >= CURDATE() - INTERVAL 1 MONTH
GROUP BY
employee_id
HAVING
COUNT(sick_day_date) > 3
ORDER BY
sick_days_taken DESC;
Key Learnings
• Identifying patterns in work hours to ensure employees maintain a healthy work-life
balance.
• Tracking break durations to ensure employees are taking enough rest.
• Monitoring sick leave patterns to ensure that employees are not over-utilizing sick days,
which could indicate burnout or other issues.
• Q.748
Tracking Employees with Consecutive Late Logins
Question: Write a SQL query to identify employees who have logged in late (after 9 AM) for
3 consecutive days in the past month. These employees might be exhibiting signs of poor
time management or work-life imbalance. The result should show the employee_id,
consecutive_late_days, and flag them as "Time Management Concern".
Explanation
• Track employees who log in after 9 AM.
• Identify employees who have consecutive late logins for 3 or more days.
• Flag these employees as having a time management concern.
Solution
WITH LateLogins AS (
SELECT
employee_id,
DATE(login_time) AS login_date,
IF(TIME(login_time) > '09:00:00', 1, 0) AS is_late
FROM
employee_logins
WHERE
login_time >= CURDATE() - INTERVAL 1 MONTH
)
SELECT
l1.employee_id,
COUNT(*) AS consecutive_late_days,
'Time Management Concern' AS flag
FROM
LateLogins l1
JOIN
LateLogins l2 ON l1.employee_id = l2.employee_id
AND DATEDIFF(l2.login_date, l1.login_date) = 1
JOIN
LateLogins l3 ON l2.employee_id = l3.employee_id
AND DATEDIFF(l3.login_date, l2.login_date) = 1
WHERE
l1.is_late = 1 AND l2.is_late = 1 AND l3.is_late = 1
GROUP BY
l1.employee_id
ORDER BY
consecutive_late_days DESC;
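The self-join above returns one match per three-day window, so COUNT(*) reflects the number of windows rather than the streak length, and several logins on the same day produce duplicate rows. A minimal sketch of an alternative for MySQL 8+, using the gaps-and-islands technique on the same employee_logins table. The CTE names and the streak_start and streak_key columns are illustrative assumptions, not part of the original solution:

```sql
-- Sketch (MySQL 8+): report the real length of each late-login streak.
-- Assumes employee_logins(employee_id, login_time) as in the solution above.
WITH DailyFirstLogin AS (
    -- One row per employee per day, keyed on the earliest login of that day
    SELECT
        employee_id,
        DATE(login_time) AS login_date,
        MIN(TIME(login_time)) AS first_login
    FROM employee_logins
    WHERE login_time >= CURDATE() - INTERVAL 1 MONTH
    GROUP BY employee_id, DATE(login_time)
),
LateDays AS (
    -- Keep late days only; consecutive dates share the same streak_key
    SELECT
        employee_id,
        login_date,
        DATEDIFF(login_date, '1970-01-01')
            - ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY login_date) AS streak_key
    FROM DailyFirstLogin
    WHERE first_login > '09:00:00'
)
SELECT
    employee_id,
    MIN(login_date) AS streak_start,
    COUNT(*) AS consecutive_late_days,
    'Time Management Concern' AS flag
FROM LateDays
GROUP BY employee_id, streak_key
HAVING COUNT(*) >= 3
ORDER BY consecutive_late_days DESC;
```

Each streak appears as its own row, so an employee with two separate late streaks in the month is listed twice.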
• Q.749
Flagging Employees for Long Working Hours Without Breaks
Question: Write a SQL query to flag employees who have worked for more than 9 hours
without taking any break. You are given the employee_work_hours table (with login and
logout timestamps) and the employee_breaks table (with break start and end times). Identify
employees who have worked continuously for more than 9 hours in a day without any break.
Explanation
• Calculate the total work time for each employee each day.
• Check if there is any break during the working hours.
• Flag employees who worked more than 9 hours without taking a break.
Solution
WITH WorkDurations AS (
SELECT
employee_id,
DATE(login_time) AS work_date,
TIMESTAMPDIFF(MINUTE, login_time, logout_time) / 60 AS total_work_hours -- minute precision; HOUR would truncate 9.5 hours to 9
FROM
employee_work_hours
),
Breaks AS (
SELECT
employee_id,
DATE(break_start) AS break_date,
SUM(TIMESTAMPDIFF(MINUTE, break_start, break_end)) / 60 AS total_break_hours
FROM
employee_breaks
GROUP BY
employee_id, DATE(break_start)
)
SELECT
w.employee_id,
w.work_date,
'No Breaks Taken' AS flag
FROM
WorkDurations w
LEFT JOIN
Breaks b ON w.employee_id = b.employee_id AND w.work_date = b.break_date
WHERE
w.total_work_hours > 9
AND (b.total_break_hours IS NULL OR b.total_break_hours = 0)
ORDER BY
w.employee_id, w.work_date;
• Q.750
Identifying Employees Working on Weekends
Question: Write a SQL query to identify employees who have worked on both Saturday and
Sunday during the last month. Working on weekends can indicate high stress or workload.
The result should show employee_id, weekend_work_days, and a flag indicating they
worked on weekends.
Explanation
• Track all the weekend days (Saturday and Sunday) worked by employees.
• Identify employees who have worked on both Saturday and Sunday in the last month.
• Flag those employees as weekend workers.
Solution
WITH WeekendWork AS (
SELECT
employee_id,
DAYOFWEEK(login_time) AS work_day,
DATE(login_time) AS work_date
FROM
employee_work_hours
WHERE
login_time >= CURDATE() - INTERVAL 1 MONTH
)
SELECT
employee_id,
COUNT(DISTINCT work_date) AS weekend_work_days,
'Weekend Worker' AS flag
FROM
WeekendWork
WHERE
work_day IN (1, 7) -- 1 = Sunday, 7 = Saturday
GROUP BY
employee_id
HAVING
COUNT(DISTINCT work_day) = 2 -- both a Saturday and a Sunday were worked
ORDER BY
employee_id;
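If the requirement is read as working both days of the same weekend, one possible variant pairs each Saturday login with a login on the following day. This is a sketch under that assumed interpretation, reusing employee_work_hours and login_time from the solution above; the aliases sat and sun and the output column names are illustrative:

```sql
-- Sketch (MySQL): employees who logged in on a Saturday and on the Sunday right after it.
SELECT DISTINCT
    sat.employee_id,
    DATE(sat.login_time) AS saturday,
    DATE(sun.login_time) AS sunday,
    'Weekend Worker' AS flag
FROM employee_work_hours sat
JOIN employee_work_hours sun
    ON sun.employee_id = sat.employee_id
   -- The day after a Saturday is necessarily a Sunday
   AND DATE(sun.login_time) = DATE(sat.login_time) + INTERVAL 1 DAY
WHERE DAYOFWEEK(sat.login_time) = 7          -- 7 = Saturday
  AND sat.login_time >= CURDATE() - INTERVAL 1 MONTH
ORDER BY sat.employee_id, saturday;
```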
Key Learnings
• Tracking consecutive behaviors such as late logins or long working hours can help
identify patterns related to employee wellbeing.
• Weekend working can indicate potential work overload or unhealthy work-life balance.
• Breaks and overtime tracking are crucial for identifying employees who might be
overworked or experiencing burnout.
These queries help HR and managers focus on employees who may need additional
support to maintain their wellbeing and work-life balance.
• Q.751
Question: Write a SQL query to identify employees who have been assigned more than 5
tasks per day for 3 consecutive days in the past month. This may indicate high stress or
workload imbalance. Return the employee_id, consecutive_days, and flag them as "High
Workload".
Explanation
• Track the number of tasks assigned to each employee each day.
• Identify employees who have been assigned more than 5 tasks per day for 3 consecutive
days.
• Flag these employees as having a high workload.
Solution
WITH TaskCounts AS (
SELECT
employee_id,
task_date,
COUNT(*) AS task_count
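FROM
employee_tasks
WHERE
task_date >= CURDATE() - INTERVAL 1 MONTH -- restrict to the past month, as the question requires
GROUP BY
employee_id, task_date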
)
SELECT
t1.employee_id,
-- A streak of n qualifying days yields n - 2 distinct start dates (assuming a single streak)
COUNT(DISTINCT t1.task_date) + 2 AS consecutive_days,
'High Workload' AS flag
FROM
TaskCounts t1
JOIN
TaskCounts t2 ON t1.employee_id = t2.employee_id
AND DATEDIFF(t2.task_date, t1.task_date) = 1
JOIN
TaskCounts t3 ON t2.employee_id = t3.employee_id
AND DATEDIFF(t3.task_date, t2.task_date) = 1
WHERE
t1.task_count > 5 AND t2.task_count > 5 AND t3.task_count > 5
GROUP BY
t1.employee_id
ORDER BY
consecutive_days DESC;
• Q.752
Flagging Employees with Excessive Overtime (More Than 12 Hours) on Weekdays
Question: Write a SQL query to identify employees who have worked more than 12 hours
on any weekday in the past month. These employees may be facing work overload or
burnout. Return the employee_id, work_date, and flag them as "Excessive Overtime".
Explanation
• Calculate the work duration for each employee per day.
• Identify days when employees have worked more than 12 hours on weekdays (Monday
to Friday).
• Flag employees who have worked excessive hours as at risk of burnout.
Solution
WITH WorkDurations AS (
SELECT
employee_id,
DATE(login_time) AS work_date,
TIMESTAMPDIFF(MINUTE, login_time, logout_time) / 60 AS work_hours, -- minute precision; HOUR would truncate partial hours
DAYOFWEEK(login_time) AS day_of_week
FROM
employee_work_hours
WHERE
login_time >= CURDATE() - INTERVAL 1 MONTH
)
SELECT
employee_id,
work_date,
'Excessive Overtime' AS flag
FROM
WorkDurations
WHERE
work_hours > 12
AND day_of_week BETWEEN 2 AND 6 -- Weekdays: Monday (2) to Friday (6)
ORDER BY
work_date DESC;
• Q.753
Detecting Employees with Consistently Low Engagement (No Activity for 7+ Days)
Question: Write a SQL query to detect employees who have not logged in or performed any
work-related activities (e.g., task assignments or project updates) for 7 or more consecutive
days. These employees may be experiencing disengagement, burnout, or personal issues.
Return the employee_id, first_inactive_date, and flag them as "Low Engagement".
Explanation
• Track employees who have not logged in or performed any work activities for 7+
consecutive days.
• Detect employees who might be disengaged or facing work-related burnout.
• Flag these employees as having low engagement.
Solution
WITH ActivityGaps AS (
SELECT
employee_id,
activity_date,
LEAD(activity_date) OVER (PARTITION BY employee_id ORDER BY activity_date) AS next_activity_date
FROM
employee_activities
)
SELECT
employee_id,
MIN(DATE_ADD(activity_date, INTERVAL 1 DAY)) AS first_inactive_date,
'Low Engagement' AS flag
FROM
ActivityGaps
WHERE
-- 7 or more inactive days between two activity dates; an open-ended gap is measured up to today
DATEDIFF(COALESCE(next_activity_date, CURDATE()), activity_date) > 7
GROUP BY
employee_id
ORDER BY
first_inactive_date DESC;
Key Learnings
• High task load or excessive overtime may indicate employee burnout or poor work-life
balance.
• Identifying disengagement or lack of activity can help address potential workforce
wellbeing issues early.
• Workload tracking and activity analysis help in identifying employees who may need
additional support to maintain optimal wellbeing.
• Q.754
Explanation
• Track task completion rate for each employee on a monthly basis.
• Identify employees who have completed more than 90% of their assigned tasks for 3
consecutive months.
• Flag these employees as "High Performers".
Solution
WITH TaskCompletionRate AS (
SELECT
employee_id,
EXTRACT(MONTH FROM assigned_date) AS task_month,
COUNT(*) AS total_tasks,
SUM(CASE WHEN task_status = 'Completed' THEN 1 ELSE 0 END) AS completed_tasks
FROM
employee_tasks
GROUP BY
employee_id, EXTRACT(MONTH FROM assigned_date)
),
HighPerformers AS (
SELECT
employee_id,
COUNT(*) AS consecutive_months
FROM
TaskCompletionRate
WHERE
completed_tasks * 1.0 / total_tasks > 0.9 -- 1.0 avoids integer division on engines such as PostgreSQL
GROUP BY
employee_id
)
SELECT
employee_id,
consecutive_months,
'High Performer' AS flag
FROM
HighPerformers
WHERE
consecutive_months >= 3
ORDER BY
consecutive_months DESC;
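The query above counts every month with a completion rate above 90%, whether or not those months are adjacent, and EXTRACT(MONTH ...) merges the same month from different years. A hedged sketch for MySQL 8+ that requires the months to be consecutive, assuming the same employee_tasks columns; month_index, streak_key, and the CTE names are introduced here for illustration:

```sql
-- Sketch (MySQL 8+): streaks of consecutive months with completion rate above 90%.
WITH MonthlyRate AS (
    SELECT
        employee_id,
        -- Year-aware month counter, so December and the following January are adjacent
        YEAR(assigned_date) * 12 + MONTH(assigned_date) AS month_index,
        SUM(CASE WHEN task_status = 'Completed' THEN 1 ELSE 0 END) / COUNT(*) AS completion_rate
    FROM employee_tasks
    GROUP BY employee_id, YEAR(assigned_date) * 12 + MONTH(assigned_date)
),
HighMonths AS (
    -- Consecutive month_index values share the same streak_key
    SELECT
        employee_id,
        month_index,
        month_index - ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY month_index) AS streak_key
    FROM MonthlyRate
    WHERE completion_rate > 0.9
)
SELECT
    employee_id,
    COUNT(*) AS consecutive_months,
    'High Performer' AS flag
FROM HighMonths
GROUP BY employee_id, streak_key
HAVING COUNT(*) >= 3
ORDER BY consecutive_months DESC;
```

An employee with more than one qualifying streak is returned once per streak.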
Explanation
• Calculate the average task completion rate for each department in a given quarter.
• Identify employees whose task completion rate exceeds the department's average for that
quarter.
• Flag these employees as "Above Average Performers".
Solution
WITH DepartmentCompletionRate AS (
SELECT
e.department_id,
EXTRACT(QUARTER FROM et.assigned_date) AS task_quarter,
AVG(CASE WHEN et.task_status = 'Completed' THEN 1 ELSE 0 END) AS avg_completion_rate
FROM
employees e
JOIN
Key Learnings
• Task completion rates are a useful metric for tracking employee performance,
highlighting high performers and underperformers.
• Consecutive months of high task completion or a decline in performance can be critical
indicators of employee motivation or burnout.
• Understanding employees' relative performance within their department helps in fostering
a competitive and motivating work environment.
These queries help to monitor employee productivity and engagement, ultimately aiding in
the improvement of overall organizational performance and ensuring employee well-
being.
• Q.755
Identifying Employees with Declining Performance Over the Past 6 Months
Question: Write a SQL query to identify employees whose task completion rate has
declined by more than 20% over the past 6 months. Return the employee_id,
previous_rate, current_rate, and flag them as "Declining Performance".
Explanation
• Track task completion rates for employees over the past 6 months.
• Calculate the percentage change in task completion rate.
• Flag employees whose performance has declined by more than 20%.
Solution
WITH MonthlyCompletionRates AS (
SELECT
employee_id,
EXTRACT(MONTH FROM assigned_date) AS task_month,
COUNT(*) AS total_tasks,
SUM(CASE WHEN task_status = 'Completed' THEN 1 ELSE 0 END) AS completed_tasks
FROM
employee_tasks
WHERE
assigned_date >= CURDATE() - INTERVAL 6 MONTH
GROUP BY
employee_id, EXTRACT(MONTH FROM assigned_date)
),
PerformanceChange AS (
SELECT
employee_id,
MAX(CASE WHEN task_month = EXTRACT(MONTH FROM CURDATE() - INTERVAL 1 MONTH) THEN
completed_tasks / total_tasks ELSE 0 END) AS previous_rate,
MAX(CASE WHEN task_month = EXTRACT(MONTH FROM CURDATE()) THEN completed_tasks /
total_tasks ELSE 0 END) AS current_rate
FROM
MonthlyCompletionRates
GROUP BY
employee_id
)
SELECT
employee_id,
previous_rate,
current_rate,
'Declining Performance' AS flag
FROM
PerformanceChange
WHERE
(previous_rate - current_rate) > 0.2
ORDER BY
current_rate ASC;
• Q.756
Question
Calculate the Click-Through Rate (CTR) for Ernst & Young Webinars and identify the
webinar with the highest CTR.
Explanation
You need to calculate the click-through rate (CTR), which is the ratio of users who clicked on
an ad to the users who saw the ad. Then, determine which webinar has the highest CTR. The
relevant data is stored across three tables: ad_impressions, ad_clicks, and
webinar_registrations. You need to join these tables and calculate the CTR for each webinar.
Datasets:
-- Insert data into ad_impressions
INSERT INTO ad_impressions (impression_id, webinar_id, date, user_id)
VALUES
(5621, 1011, '2022-10-01 00:00:00', 124),
(5622, 1920, '2022-10-01 00:00:00', 278),
(5623, 1011, '2022-10-01 00:00:00', 345),
(5624, 1011, '2022-10-01 00:00:00', 234),
(5625, 1920, '2022-10-01 00:00:00', 678);
Learnings
• Joins: Using JOIN to combine data from different tables based on common fields
(webinar_id and user_id).
• DISTINCT: Ensuring unique user counts for impressions, clicks, and registrations.
• Aggregations: Using COUNT() to aggregate impressions, clicks, and registrations.
• Subqueries: Using a WITH clause for intermediate calculations before the final result.
Solutions
PostgreSQL Solution:
WITH CTR AS (
SELECT
I.webinar_id,
COUNT(DISTINCT I.user_id) AS impressions,
COUNT(DISTINCT C.user_id) AS clicks,
COUNT(DISTINCT R.user_id) AS registrations
FROM
ad_impressions I
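-- LEFT JOINs keep impressions that never led to a click; inner joins would force the CTR to 100%
LEFT JOIN
ad_clicks C ON I.webinar_id = C.webinar_id AND I.user_id = C.user_id
AND I.date <= C.date
LEFT JOIN
webinar_registrations R ON C.webinar_id = R.webinar_id AND C.user_id = R.user_id
AND C.date <= R.date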
GROUP BY
I.webinar_id
)
SELECT
webinar_id,
impressions,
clicks,
registrations,
(clicks::float / impressions::float * 100) AS click_through_rate
FROM
CTR
ORDER BY
click_through_rate DESC;
MySQL Solution:
WITH CTR AS (
SELECT
I.webinar_id,
COUNT(DISTINCT I.user_id) AS impressions,
COUNT(DISTINCT C.user_id) AS clicks,
COUNT(DISTINCT R.user_id) AS registrations
FROM
ad_impressions I
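-- LEFT JOINs keep impressions that never led to a click; inner joins would force the CTR to 100%
LEFT JOIN
ad_clicks C ON I.webinar_id = C.webinar_id AND I.user_id = C.user_id
AND I.date <= C.date
LEFT JOIN
webinar_registrations R ON C.webinar_id = R.webinar_id AND C.user_id = R.user_id
AND C.date <= R.date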
GROUP BY
I.webinar_id
)
SELECT
webinar_id,
impressions,
clicks,
registrations,
(clicks / impressions * 100) AS click_through_rate
FROM
CTR
ORDER BY
click_through_rate DESC;
• Q.757
Identifying Project Delays and Resource Allocation Issues
Question: Write a SQL query to identify projects that have experienced a delay of more
than 15% in their original estimated duration. For each delayed project, calculate the
additional time it took beyond the estimated duration, and identify whether the delay was
due to under- or over-allocation of resources. Return the project_id, estimated_duration,
actual_duration, over_or_under_allocation, and flag them as "Delayed Project".
Explanation
• Track the estimated duration and actual duration of each project.
• Calculate the percentage of delay for each project.
• Determine if the delay is due to under-allocation (fewer resources than expected) or over-
allocation (more resources than expected).
• Flag projects that have experienced delays of more than 15%.
Solution
WITH DelayAnalysis AS (
SELECT
p.project_id,
p.estimated_duration,
p.actual_duration,
(p.actual_duration - p.estimated_duration) / p.estimated_duration * 100 AS delay_percentage,
SUM(CASE WHEN ra.allocated_hours < ra.actual_hours THEN 1 ELSE 0 END) AS under_allocation,
SUM(CASE WHEN ra.allocated_hours > ra.actual_hours THEN 1 ELSE 0 END) AS over_allocation
FROM
projects p
LEFT JOIN
resource_allocation ra ON p.project_id = ra.project_id
GROUP BY
p.project_id, p.estimated_duration, p.actual_duration
)
SELECT
da.project_id,
da.estimated_duration,
da.actual_duration,
CASE
WHEN da.under_allocation > da.over_allocation THEN 'Under-Allocation'
WHEN da.over_allocation > da.under_allocation THEN 'Over-Allocation'
ELSE 'Balanced'
END AS over_or_under_allocation,
'Delayed Project' AS flag
FROM
DelayAnalysis da
WHERE
da.delay_percentage > 15
ORDER BY
da.delay_percentage DESC;
Key Learnings
• Project delays and resource allocation issues often correlate with employee burnout and
performance declines. Tracking these issues is critical for maintaining a healthy work
environment.
• Monitoring employee efficiency helps identify potential overwork issues, ensuring that
employees are not being overloaded with tasks.
• Skewed resource allocation can lead to unbalanced workloads and performance
inefficiencies. Identifying and addressing this helps in redistributing work more evenly across
teams.
• Q.758
Tracking Employee Project Assignment Efficiency
Question: Write a SQL query to calculate the efficiency of each employee on projects,
defined as the ratio of actual hours worked to allocated hours on all their projects. Identify
employees whose efficiency is greater than 1.2 (i.e., they worked more than 120% of their
allocated hours). Return the employee_id, total_allocated_hours,
total_actual_hours, and flag them as "Overworked".
Explanation
• Calculate the total allocated hours and actual hours for each employee across all projects.
• Calculate the efficiency as the ratio of actual hours to allocated hours.
• Flag employees whose efficiency is greater than 1.2 as "Overworked".
Solution
SELECT
employee_id,
SUM(allocated_hours) AS total_allocated_hours,
SUM(actual_hours) AS total_actual_hours,
(SUM(actual_hours) / SUM(allocated_hours)) AS efficiency,
CASE
Question
Analyzing Consulting Project Performance
Ernst & Young (EY) is a multinational professional services firm and is one of the "Big
Four" accounting firms. EY is involved with a number of consulting projects with different
clients and the stakeholder wants to understand the consulting project performance. You are
provided with the following two tables:
• projects - Each row records information of a project involving a specific client.
• billing - Each row records information of billing to the clients for each project.
The stakeholder would like to know:
• The total billing amount for each project.
• The average monthly billing amount for each project.
Write a PostgreSQL query to help answer the stakeholder's questions.
Explanation
• Join the projects table with the billing table based on project_id to retrieve billing
information for each project.
• Calculate the total billing amount for each project using SUM().
• Calculate the average monthly billing for each project by dividing the total billing amount
by the number of months between the project_start_date and project_end_date.
• Use EXTRACT() to get the year and month part of the dates for calculating the total number
of months between the start and end date of each project.
• Group the results by project_id and project_name to ensure you get the summary for
each project.
billing_amount DECIMAL(10, 2)
);
Solutions
Solution (PostgreSQL)
SELECT
projects.project_id,
projects.project_name,
SUM(billing.billing_amount) AS total_billing,
-- NULLIF avoids a division-by-zero error for projects that start and end in the same month
SUM(billing.billing_amount) /
NULLIF(
(EXTRACT(YEAR FROM projects.project_end_date) - EXTRACT(YEAR FROM projects.project_start_date)) * 12 +
(EXTRACT(MONTH FROM projects.project_end_date) - EXTRACT(MONTH FROM projects.project_start_date)),
0) AS avg_monthly_billing
FROM
projects
INNER JOIN
billing ON projects.project_id = billing.project_id
GROUP BY
projects.project_id,
projects.project_name,
projects.project_start_date,
projects.project_end_date;
Learnings
• Joins: This problem involves an INNER JOIN between the projects and billing tables to
get the combined data for each project.
• Aggregation: The use of SUM() aggregates the total billing amount for each project.
• Date Calculations: The EXTRACT() function is used to calculate the number of months
between project_start_date and project_end_date. This allows the calculation of the
average monthly billing.
• Grouping: The query groups by project_id and project_name to get the results per
project.
• Q.760
Question
Identify Ernst & Young's Top Billing Clients
Ernst & Young (EY) is a multinational professional services network. It primarily provides
assurance (including financial audit), tax, consulting, and advisory services to its clients. For
EY, a VIP user or "whale" would be a client that has substantial expenditures with EY in
terms of billing amounts. You are given a database that contains two tables: a "clients" table
and a "billings" table.
Write a SQL query to identify EY's top 10 clients in terms of total billed amount in the past
year.
Explanation
• The task requires identifying the top 10 clients based on the total amount billed in the past
year.
• You will need to join the clients table with the billings table using the client_id
field.
• Filter the billings records to only include those within the past year.
• Sum the total billed amount for each client and order the results by the total billed amount
in descending order.
• Return the top 10 clients with the highest billed amounts.
Solutions
ON c.client_id = b.client_id
WHERE b.billing_date BETWEEN (CURRENT_DATE - INTERVAL '1 year') AND CURRENT_DATE
GROUP BY c.client_name
ORDER BY total_billed DESC
LIMIT 10;
Learnings
• Joins: This problem involves using a JOIN to combine data from two tables (clients and
billings) based on the client_id.
• Filtering: The query uses a date filter (WHERE b.billing_date BETWEEN ...) to ensure
only billing records from the last year are considered.
• Aggregation: The query uses SUM() to aggregate the billing amounts for each client.
• Ordering and Limiting: Sorting by total_billed in descending order helps identify the
top clients, and LIMIT 10 ensures we get the top 10 results.
Capgemini
• Q.761
Find Employees Who Report to More Than One Manager
Question:
Write a SQL query to find employees who report to more than one manager.
Explanation:
You need to:
• Identify employees with multiple manager IDs.
• Group the results by employee and use the HAVING clause to filter out employees with only
one manager.
Employees Table
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
manager_id INT
);
SQL Solution:
SELECT employee_id, name
FROM employees
GROUP BY employee_id, name
HAVING COUNT(DISTINCT manager_id) > 1;
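Because employee_id is the primary key of employees, each employee has exactly one row and therefore at most one manager_id, so the query above cannot return any rows against this schema. A sketch of one way to model multiple reporting lines, assuming a hypothetical employee_managers mapping table that is not part of the original dataset:

```sql
-- Hypothetical mapping table: one row per (employee, manager) reporting line
CREATE TABLE employee_managers (
    employee_id INT,
    manager_id INT,
    PRIMARY KEY (employee_id, manager_id)
);

-- Employees with more than one distinct manager
SELECT
    e.employee_id,
    e.name,
    COUNT(DISTINCT em.manager_id) AS manager_count
FROM employees e
JOIN employee_managers em ON e.employee_id = em.employee_id
GROUP BY e.employee_id, e.name
HAVING COUNT(DISTINCT em.manager_id) > 1;
```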
• Q.762
Employees Table
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(100)
);
SQL Solution:
SELECT name, department
FROM employees
WHERE department NOT IN (
SELECT department
FROM employees
GROUP BY department
HAVING COUNT(*) > 1
);
To solve this, you need to join the employees and usage_logs tables. For each client,
calculate the average usage per week, and filter the results to show only those clients with an
average usage greater than 5 times per week.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE employees (
employee_id INT,
client_name VARCHAR(255)
);
-- usage_logs data
INSERT INTO usage_logs (log_id, employee_id, login_date)
VALUES
(7381, 2431, '2022-07-05'),
(8127, 3716, '2022-07-06'),
(6743, 4293, '2022-07-06'),
(9823, 5632, '2022-07-07'),
(6257, 6427, '2022-07-07');
Learnings
• Joins: Combine data from multiple tables.
• Aggregation: Calculate average usage.
• Filtering: Use WHERE and HAVING for conditions.
• Date functions: Use date ranges to limit the data.
Solutions
-- PostgreSQL solution
SELECT e.client_name, AVG(u.usage_count) AS avg_usage
FROM employees e
LEFT JOIN (
SELECT employee_id, COUNT(*) AS usage_count
FROM usage_logs
WHERE login_date >= CURRENT_DATE - INTERVAL '1 week'
GROUP BY employee_id
) u ON e.employee_id = u.employee_id
GROUP BY e.client_name
HAVING AVG(u.usage_count) > 5;
-- MySQL solution
SELECT e.client_name, AVG(u.usage_count) AS avg_usage
FROM employees e
LEFT JOIN (
SELECT employee_id, COUNT(*) AS usage_count
FROM usage_logs
WHERE login_date >= CURDATE() - INTERVAL 1 WEEK
GROUP BY employee_id
) u ON e.employee_id = u.employee_id
GROUP BY e.client_name
HAVING AVG(u.usage_count) > 5;
• Q.764
Question
Write a SQL query to identify employees who have been absent for three or more
consecutive days from the "attendance" table.
Explanation
The query uses the ROW_NUMBER() function to assign sequential numbers to attendance
records, and then calculates a grp value to identify consecutive days. It groups by this grp
value and filters those groups where the count of absences is three or more.
Learnings
• Using ROW_NUMBER() to assign sequential numbers.
• Identifying consecutive sequences by combining ROW_NUMBER() with date arithmetic.
• Grouping by calculated values and applying conditions to filter results.
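The grouping idea can be seen on a small hand-made sample. The sketch below is PostgreSQL and uses made-up dates in an inline VALUES list (not data from the book). Subtracting the row number from the full date, rather than from the day of the month, keeps the group key stable when a run of absences crosses a month boundary:

```sql
-- Sketch (PostgreSQL): how date minus row number forms one key per consecutive run.
SELECT
    attendance_date,
    ROW_NUMBER() OVER (ORDER BY attendance_date) AS rn,
    attendance_date - (ROW_NUMBER() OVER (ORDER BY attendance_date))::int AS grp
FROM (VALUES
    (DATE '2024-01-30'),
    (DATE '2024-01-31'),
    (DATE '2024-02-01'),
    (DATE '2024-02-05')
) AS absences(attendance_date);
-- attendance_date | rn | grp
-- 2024-01-30      | 1  | 2024-01-29
-- 2024-01-31      | 2  | 2024-01-29
-- 2024-02-01      | 3  | 2024-01-29   <- same key: three consecutive days
-- 2024-02-05      | 4  | 2024-02-01   <- new key: the run was broken
```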
Solutions
-- PostgreSQL solution
SELECT employee_id, MIN(attendance_date) AS start_date, MAX(attendance_date) AS end_date
FROM (
SELECT
employee_id,
attendance_date,
-- Date minus row number is constant within a run of consecutive dates, even across months
attendance_date - (ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY attendance_date))::int AS grp
FROM attendance
WHERE status = 'absent'
) AS sub
GROUP BY employee_id, grp
HAVING COUNT(*) >= 3;
-- MySQL solution
SELECT employee_id, MIN(attendance_date) AS start_date, MAX(attendance_date) AS end_date
FROM (
SELECT
employee_id,
attendance_date,
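-- Day number minus row number is constant within a run of consecutive dates and stays non-negative
DATEDIFF(attendance_date, '1970-01-01')
- ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY attendance_date) AS grp
FROM attendance
WHERE status = 'absent'
) AS sub
GROUP BY employee_id, grp
HAVING COUNT(*) >= 3;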
Sales Table
CREATE TABLE sales (
sale_id INT PRIMARY KEY,
product_id INT,
customer_id INT,
sale_date DATE
);
Products Table
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100)
);
SQL Solution:
SELECT p.product_name
FROM products p
JOIN sales s ON p.product_id = s.product_id
WHERE s.sale_date BETWEEN '2023-09-01' AND '2023-09-30'
GROUP BY p.product_id, p.product_name
ORDER BY COUNT(DISTINCT s.customer_id) DESC
LIMIT 1;
• Q.766
Find Employees Who Have Worked in All Departments
Question:
Write a SQL query to find employees who have worked in all the departments at least once.
Assume there is a table that tracks which employees have worked in which department.
Explanation:
• You need to check if an employee has worked in all departments.
• This involves using a COUNT on the department column, comparing it with the total
number of distinct departments.
Employees Table
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100)
);
Employee_Departments Table
CREATE TABLE employee_departments (
employee_id INT,
department VARCHAR(100)
);
SQL Solution:
SELECT e.name
FROM employees e
JOIN employee_departments ed ON e.employee_id = ed.employee_id
GROUP BY e.employee_id, e.name
HAVING COUNT(DISTINCT ed.department) = (SELECT COUNT(DISTINCT department) FROM
employee_departments);
• Q.767
Find Employees Who Have Not Worked in Consecutive Months
Question:
Write a SQL query to find employees who have not worked in two consecutive months.
Explanation:
• Use a self-join on the employee_attendance table to compare each month's records for an
employee.
• Ensure there are no consecutive month entries for an employee.
Employee_Attendance Table
CREATE TABLE employee_attendance (
employee_id INT,
attendance_date DATE
);
SQL Solution:
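SELECT DISTINCT a.employee_id
FROM employee_attendance a
WHERE NOT EXISTS (
SELECT 1
FROM employee_attendance e1
JOIN employee_attendance e2
ON e1.employee_id = e2.employee_id
-- YEAR_MONTH periods such as 202312 and 202401 differ by exactly one month
AND PERIOD_DIFF(EXTRACT(YEAR_MONTH FROM e2.attendance_date),
EXTRACT(YEAR_MONTH FROM e1.attendance_date)) = 1
WHERE e1.employee_id = a.employee_id
);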
Key Takeaways:
• Self-Joins and Subqueries: Often used when comparing values within the same table or
when looking for relative differences.
• Window Functions: Helpful for ranking, counting, and partitioning data without needing
multiple joins.
• Advanced Aggregation: Sometimes, you need to aggregate data by comparing against
other aggregates (like counting the distinct departments or comparing counts across rows).
• Date and Time Functions: These can be tricky but are necessary when comparing dates or
months (e.g., MONTH() or DATEPART()).
• Q.768
Find the Employees Who Have Worked the Most Hours in Each Department
Question:
Write a SQL query to find the employee who has worked the most hours in each department.
Explanation:
You need to:
• Join the employee_hours table with the employees and departments tables.
• Group by department and employee to calculate the total hours worked.
• Use ROW_NUMBER() to rank employees within each department by their total hours and
return the employee with the most hours worked in each department.
Employee Table
Departments Table
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(100)
);
Employee_Hours Table
CREATE TABLE employee_hours (
employee_id INT,
department_id INT,
hours_worked INT,
work_date DATE
);
SQL Solution:
WITH RankedEmployees AS (
SELECT e.name, d.department_name,
SUM(eh.hours_worked) AS total_hours,
-- Alias rnk instead of rank, which is a reserved word in MySQL 8
ROW_NUMBER() OVER (PARTITION BY d.department_id ORDER BY SUM(eh.hours_worked) DESC) AS rnk
FROM employee_hours eh
JOIN employees e ON eh.employee_id = e.employee_id
JOIN departments d ON eh.department_id = d.department_id
GROUP BY e.employee_id, e.name, d.department_id, d.department_name
)
SELECT name, department_name, total_hours
FROM RankedEmployees
WHERE rnk = 1;
• Q.769
Question
Find all employees who make more money than their direct boss.
Explanation
To solve this, you need to compare the salary of each employee with that of their direct
manager. This can be done by self-joining the employees table, using the manager_id to
match employees with their managers, and filtering where the employee's salary is greater
than the manager's salary.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE employees (
employee_id INT,
name VARCHAR(255),
salary DECIMAL(10, 2),
department_id INT,
manager_id INT
);
• - Datasets
-- employees data
INSERT INTO employees (employee_id, name, salary, department_id, manager_id)
VALUES
(1, 'Emma Thompson', 3800, 1, NULL),
(2, 'Daniel Rodriguez', 2230, 1, 10),
(3, 'Olivia Smith', 8000, 1, 8),
(4, 'Noah Johnson', 6800, 2, 8),
(5, 'Sophia Martinez', 1750, 1, 10),
(8, 'William Davis', 7000, 2, NULL),
(10, 'James Anderson', 4000, 1, NULL);
Learnings
• Self-joins: Join a table with itself to compare records.
• Comparison: Compare employee and manager salaries.
• Filtering: Use conditions to filter employees who earn more than their managers.
Solutions
• - PostgreSQL solution
SELECT e.employee_id, e.name AS employee_name
FROM employees e
JOIN employees m ON e.manager_id = m.employee_id
WHERE e.salary > m.salary;
• - MySQL solution
SELECT e.employee_id, e.name AS employee_name
FROM employees e
JOIN employees m ON e.manager_id = m.employee_id
WHERE e.salary > m.salary;
• Q.770
Question
Find the quantity details of the first and last orders for each customer.
Explanation
To solve this, you need to identify each customer's first and last order by order_date and
then retrieve the quantities of those orders. This can be done by using a subquery or window
functions to get the first and last orders per customer, and then joining the results to retrieve
the order quantities.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE orders (
order_id INT,
customer_id INT,
order_date DATE,
quantity INT
);
• - Datasets
-- orders data
INSERT INTO orders (order_id, customer_id, order_date, quantity)
VALUES
(1, 101, '2022-01-05', 3),
(2, 101, '2022-02-15', 5),
(3, 102, '2022-03-10', 2),
(4, 102, '2022-04-01', 4),
(5, 103, '2022-05-15', 7);
Learnings
• Window functions or subqueries: To find first and last order per customer.
• Aggregation: Grouping data by customer to find first and last orders.
• Sorting: Use ORDER BY to determine the first and last order date.
Solutions
• - PostgreSQL solution
WITH ranked_orders AS (
SELECT
customer_id,
order_id,
quantity,
order_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date ASC) AS row_num_asc,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS row_num_desc
FROM orders
)
SELECT
customer_id,
MAX(CASE WHEN row_num_asc = 1 THEN quantity END) AS first_order_qty,
MAX(CASE WHEN row_num_desc = 1 THEN quantity END) AS last_order_qty
FROM ranked_orders
GROUP BY customer_id;
• - MySQL solution
SELECT
customer_id,
MAX(CASE WHEN order_date = (SELECT MIN(order_date) FROM orders o2 WHERE o2.customer_id = o1.customer_id) THEN quantity END) AS first_order_qty,
MAX(CASE WHEN order_date = (SELECT MAX(order_date) FROM orders o2 WHERE o2.customer_id = o1.customer_id) THEN quantity END) AS last_order_qty
FROM orders o1
GROUP BY customer_id;
• Q.771
Question
Capgemini has a large customer base, and you are required to filter the customers whose
names start with 'CAP'. Write a SQL query to find all records where CustomerName starts
with 'CAP'.
Explanation
The query filters records from the customers table where the CustomerName begins with
'CAP'. The % wildcard in the LIKE clause matches any characters following 'CAP'.
• - Datasets
INSERT INTO customers (CustomerID, CustomerName)
VALUES
(1, 'CAPITAL'),
(2, 'CAPRICE'),
(3, 'APPLE'),
(4, 'CAPITALIZE');
Learnings
• Usage of LIKE with wildcards (%) to filter string patterns.
• Filtering data based on specific string matches.
Solutions
• - PostgreSQL solution
SELECT CustomerID, CustomerName
FROM customers
WHERE CustomerName LIKE 'CAP%';
• - MySQL solution
SELECT CustomerID, CustomerName
FROM customers
WHERE CustomerName LIKE 'CAP%';
• Q.772
Find Employees Who Have Worked in All Departments in the Last 6 Months
Question:
Write a SQL query to find employees who have worked in every department in the last 6
months.
Explanation:
You need to:
• Identify the departments that exist in the organization.
• Check if employees have worked in all those departments in the last 6 months.
• Use HAVING to ensure that the number of distinct departments worked by the employee
matches the total number of departments.
Employee Table
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100)
);
Departments Table
CREATE TABLE departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(100)
);
Employee_Hours Table
CREATE TABLE employee_hours (
employee_id INT,
department_id INT,
hours_worked INT,
work_date DATE
);
SQL Solution:
SELECT e.name
FROM employees e
JOIN employee_hours eh ON e.employee_id = eh.employee_id
-- PostgreSQL interval syntax; in MySQL use CURDATE() - INTERVAL 6 MONTH
WHERE eh.work_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY e.employee_id, e.name
HAVING COUNT(DISTINCT eh.department_id) = (SELECT COUNT(*) FROM departments);
• Q.773
Question
Find the customers who have not made a purchase in the last year (2023) but have made a
purchase in the current year (2024).
Explanation
To solve this, you need to filter customers who made purchases in 2024 but did not make any
purchases in 2023. Use the YEAR() function to filter orders by year, then apply a condition to
exclude customers who made purchases in 2023.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE orders (
order_id INT,
customer_id INT,
order_date DATE,
quantity INT
);
• - Datasets
-- orders data
INSERT INTO orders (order_id, customer_id, order_date, quantity)
VALUES
(1, 101, '2024-01-05', 3),
(2, 102, '2023-06-15', 5),
(3, 103, '2024-02-10', 2),
(4, 101, '2024-08-15', 4),
(5, 102, '2023-03-01', 7);
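The approach in the explanation can be sketched as below. This is one possible query, not necessarily the author's: it filters on date ranges instead of YEAR() so the same statement should run on both PostgreSQL and MySQL, and it uses NOT EXISTS to exclude anyone with a 2023 order.

```sql
-- Customers with at least one order in 2024 and no order in 2023
SELECT DISTINCT o.customer_id
FROM orders o
WHERE o.order_date >= DATE '2024-01-01'
  AND o.order_date <  DATE '2025-01-01'
  AND NOT EXISTS (
      SELECT 1
      FROM orders p
      WHERE p.customer_id = o.customer_id
        AND p.order_date >= DATE '2023-01-01'
        AND p.order_date <  DATE '2024-01-01'
  );
```

With the sample rows above, customers 101 and 103 qualify; customer 102 ordered only in 2023.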
Learnings
• Q.774
Explanation
This query first identifies the direct friends of the user, then it finds potential friends by
looking for users who share mutual friends with the given user. It excludes direct friends
from the potential friends list and ranks them based on the number of mutual friends.
Learnings
• Using JOIN to find mutual relationships.
• Using WITH clauses (Common Table Expressions) to structure complex queries.
• Excluding direct friends using NOT IN or similar filtering.
• Aggregating mutual relationships using COUNT().
Solutions
• - PostgreSQL solution
WITH DirectFriends AS (
SELECT user_id, friend_id
FROM friendships
WHERE user_id = :user_id
),
MutualFriends AS (
SELECT f1.friend_id AS mutual_friend, f2.friend_id AS potential_friend
FROM friendships f1
JOIN friendships f2 ON f1.friend_id = f2.user_id
WHERE f1.user_id = :user_id AND f2.friend_id != :user_id
)
SELECT potential_friend, COUNT(*) AS mutual_count
FROM MutualFriends
WHERE potential_friend NOT IN (SELECT friend_id FROM DirectFriends)
GROUP BY potential_friend
ORDER BY mutual_count DESC;
• - MySQL solution
WITH DirectFriends AS (
SELECT user_id, friend_id
FROM friendships
WHERE user_id = :user_id
),
MutualFriends AS (
SELECT f1.friend_id AS mutual_friend, f2.friend_id AS potential_friend
FROM friendships f1
JOIN friendships f2 ON f1.friend_id = f2.user_id
WHERE f1.user_id = :user_id AND f2.friend_id != :user_id
)
SELECT potential_friend, COUNT(*) AS mutual_count
FROM MutualFriends
WHERE potential_friend NOT IN (SELECT friend_id FROM DirectFriends)
GROUP BY potential_friend
ORDER BY mutual_count DESC;
• Q.775
Question
Write a SQL query to find the number of customers who have called more than three times
between 3 PM and 6 PM.
Explanation
This query filters the calls made between 3 PM and 6 PM using EXTRACT(HOUR FROM
call_time), groups the results by customer, counts the number of calls per customer, and
then filters those customers with more than three calls.
Learnings
• Using EXTRACT() to filter based on specific hours in timestamps.
• Grouping results with GROUP BY to aggregate data by customer.
• Applying HAVING to filter the groups after aggregation.
Solutions
• - PostgreSQL solution
SELECT customer_id, COUNT(*) AS call_count
FROM calls
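WHERE EXTRACT(HOUR FROM call_time) BETWEEN 15 AND 17 -- 15:00:00 through 17:59:59; hour 18 would add calls up to 18:59:59
GROUP BY customer_id
HAVING COUNT(*) > 3;
• - MySQL solution
SELECT customer_id, COUNT(*) AS call_count
FROM calls
WHERE HOUR(call_time) BETWEEN 15 AND 17 -- 15:00:00 through 17:59:59; hour 18 would add calls up to 18:59:59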
GROUP BY customer_id
HAVING COUNT(*) > 3;
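The queries above list the qualifying customers. Because the question asks for the number of such customers, one option is to wrap the grouped query and count its rows. The sketch below assumes the same calls table and takes the window to be hours 15 through 17 (up to 17:59:59); EXTRACT(HOUR FROM ...) is available in both PostgreSQL and MySQL.

```sql
-- Count the customers that have more than three calls in the 3 PM to 6 PM window
SELECT COUNT(*) AS customer_count
FROM (
    SELECT customer_id
    FROM calls
    WHERE EXTRACT(HOUR FROM call_time) BETWEEN 15 AND 17  -- hours 15, 16 and 17
    GROUP BY customer_id
    HAVING COUNT(*) > 3
) AS frequent_callers;
```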
• Q.776
Question
Write a SQL query to calculate the median search frequency from a table of search logs.
Explanation
This query uses the PERCENTILE_CONT(0.5) function to calculate the median search
frequency for each search term. The function computes the 50th percentile (median) by
ordering the frequencies within each search term group.
Learnings
• Using PERCENTILE_CONT() to calculate percentiles or medians.
• Grouping data by a column to calculate aggregation per group.
• Working with ordered data using WITHIN GROUP in percentile functions.
Solutions
• - PostgreSQL solution
SELECT
search_term,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY frequency) AS median_frequency
FROM search_logs
GROUP BY search_term;
• - MySQL solution
MySQL does not support PERCENTILE_CONT, so the median has to be computed another way.
On MySQL 8.0 or later, one option is to number the rows within each search term using
window functions and average the middle one or two values:
SELECT search_term,
AVG(frequency) AS median_frequency
FROM (
SELECT search_term, frequency,
-- Position of the row within its search term, and the size of that group
ROW_NUMBER() OVER (PARTITION BY search_term ORDER BY frequency) AS rn,
COUNT(*) OVER (PARTITION BY search_term) AS cnt
FROM search_logs
) AS sorted_data
-- Odd cnt: both expressions pick the single middle row; even cnt: the two middle rows
WHERE rn IN (FLOOR((cnt + 1) / 2), FLOOR((cnt + 2) / 2))
GROUP BY search_term;
• Q.777
Question
Calculate the monthly average review score for each product. The output should be sorted in
ascending order of the month and then by product_id.
Explanation
To solve this, you need to group the reviews by month and product, calculate the average
review score for each group, and then order the results first by month and then by
product_id. The submit_date should be converted to extract the year and month.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE reviews (
review_id INT,
user_id INT,
submit_date TIMESTAMP,
product_id INT,
stars INT
);
• - Datasets
-- reviews data
INSERT INTO reviews (review_id, user_id, submit_date, product_id, stars)
VALUES
(6171, 123, '2022-06-08 00:00:00', 50001, 4),
(7802, 265, '2022-06-10 00:00:00', 69852, 4),
(5293, 362, '2022-06-18 00:00:00', 50001, 3),
(6352, 192, '2022-07-26 00:00:00', 69852, 3),
(4517, 981, '2022-07-05 00:00:00', 69852, 2);
Learnings
• Date functions: Use DATE_TRUNC() or EXTRACT() to group by month and year.
• Aggregation: Use AVG() to calculate the average score for each group.
• Sorting: Order the results by month and product.
Solutions
• - PostgreSQL solution
SELECT
EXTRACT(YEAR FROM submit_date) AS year,
EXTRACT(MONTH FROM submit_date) AS month,
product_id,
AVG(stars) AS avg_review_score
FROM reviews
GROUP BY year, month, product_id
ORDER BY year, month, product_id;
• - MySQL solution
SELECT
YEAR(submit_date) AS year,
MONTH(submit_date) AS month,
product_id,
AVG(stars) AS avg_review_score
FROM reviews
GROUP BY year, month, product_id
ORDER BY year, month, product_id;
• Q.778
Question
Given a table of transactions, write a SQL query to identify customers who have made the
same payment amount more than once.
Explanation
This query groups the transactions by both customer_id and amount, counts how many
times each combination occurs, and then filters out those with more than one occurrence,
indicating repeated payments.
Learnings
• Using GROUP BY to aggregate data based on multiple columns.
• Using COUNT() to identify duplicates or repeated entries.
• Filtering grouped data with HAVING to apply conditions on aggregated results.
Solutions
• - PostgreSQL solution
SELECT customer_id, amount, COUNT(*)
FROM transactions
GROUP BY customer_id, amount
HAVING COUNT(*) > 1;
• - MySQL solution
SELECT customer_id, amount, COUNT(*)
FROM transactions
GROUP BY customer_id, amount
HAVING COUNT(*) > 1;
• Q.779
Identify the Top 3 Customers Who Have Spent the Most on Products
Question:
Write a SQL query to find the top 3 customers who have spent the most on products,
including the total amount spent by each.
Explanation:
You need to:
• Join the customers, purchases, and products tables.
• Multiply the product price by the number of units purchased to calculate the total amount
spent by each customer.
• Use ORDER BY to sort customers based on the total amount spent and return the top 3.
Customers Table
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100)
);
Products Table
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
price DECIMAL(10, 2)
);
Purchases Table
CREATE TABLE purchases (
purchase_id INT PRIMARY KEY,
customer_id INT,
product_id INT,
units INT
);
SQL Solution:
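A minimal sketch of the steps listed in the explanation, using only the three tables defined above. It is one reasonable reading of the task rather than a definitive answer: LIMIT works in PostgreSQL and MySQL, and ties for third place are not handled.

```sql
-- Total spend per customer = sum of price * units over all of their purchases
SELECT c.customer_id,
       c.name,
       SUM(p.price * pu.units) AS total_spent
FROM customers c
JOIN purchases pu ON pu.customer_id = c.customer_id
JOIN products p ON p.product_id = pu.product_id
GROUP BY c.customer_id, c.name
ORDER BY total_spent DESC
LIMIT 3;
```

If tied customers should all be returned, ranking with DENSE_RANK() over total_spent and filtering on the rank would be an alternative.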
• Q.780
Solutions
• - PostgreSQL solution
WITH date_series AS (
SELECT generate_series('2025-01-01'::DATE, '2025-01-05'::DATE, '1 day'::INTERVAL) AS sale_date
)
SELECT ds.sale_date,
COALESCE(SUM(s.amount) OVER (ORDER BY ds.sale_date), 0) AS cumulative_sales
FROM date_series ds
LEFT JOIN sales s ON s.sale_date = ds.sale_date
ORDER BY ds.sale_date;
• - MySQL solution
WITH RECURSIVE date_series AS (
SELECT '2025-01-01' AS sale_date
UNION ALL
SELECT DATE_ADD(sale_date, INTERVAL 1 DAY)
FROM date_series
WHERE sale_date < '2025-01-05'
)
SELECT ds.sale_date,
COALESCE(SUM(s.amount) OVER (ORDER BY ds.sale_date), 0) AS cumulative_sales
FROM date_series ds
LEFT JOIN sales s ON s.sale_date = ds.sale_date
ORDER BY ds.sale_date;
• Q.781
Question:
Assign ranks to employees based on their salaries in descending order.
Explanation:
Use the RANK() window function to assign a rank to each employee based on their salary,
ordered from highest to lowest. The RANK() function will handle ties by giving the same rank
to employees with equal salaries, but will leave gaps in ranking.
Learnings:
• Using the RANK() window function to assign a rank based on an ordered set.
• Handling ties in ranking using RANK(), which leaves gaps between ranks.
• Window functions allow for more advanced analysis within a result set.
Solutions
• - PostgreSQL solution
SELECT employee_name, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;
• - MySQL solution
SELECT employee_name, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;
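To make the gap behaviour concrete, the three ranking functions can be compared side by side. The sketch assumes the same employees table; the values in the comments describe the case of two employees tied on the highest salary followed by one lower salary.

```sql
SELECT employee_name,
       salary,
       RANK()       OVER (ORDER BY salary DESC) AS rank_with_gaps,     -- 1, 1, 3
       DENSE_RANK() OVER (ORDER BY salary DESC) AS rank_without_gaps,  -- 1, 1, 2
       ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num             -- 1, 2, 3
FROM employees;
```

ROW_NUMBER() breaks the tie arbitrarily unless a second ORDER BY column is added.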
• Q.782
Question:
Write a query to identify employees who attended on three consecutive days.
Explanation:
To identify employees who attended for three consecutive days, you can use the LEAD() and
LAG() window functions to access previous and next rows. By comparing the dates of
consecutive rows, you can check if the difference between attendance dates is exactly one
day, and then filter those who meet the condition for three consecutive days.
Learnings:
• Using the LEAD() and LAG() window functions to access values from previous and next
rows.
• Calculating date differences to detect consecutive patterns.
• Filtering results based on specific conditions in a windowed result set.
Solutions
• - PostgreSQL solution
SELECT DISTINCT a.employee_id
FROM (
SELECT employee_id,
attendance_date,
LEAD(attendance_date) OVER (PARTITION BY employee_id ORDER BY attendance_date) AS next_date,
LAG(attendance_date) OVER (PARTITION BY employee_id ORDER BY attendance_date) AS prev_date
FROM attendance
) a
WHERE a.next_date = a.attendance_date + INTERVAL '1 day'
AND a.prev_date = a.attendance_date - INTERVAL '1 day';
• - MySQL solution
SELECT DISTINCT a.employee_id
FROM (
SELECT employee_id,
attendance_date,
LEAD(attendance_date) OVER (PARTITION BY employee_id ORDER BY attendance_date) AS next_date,
LAG(attendance_date) OVER (PARTITION BY employee_id ORDER BY attendance_date) AS prev_date
FROM attendance
) a
WHERE DATEDIFF(a.next_date, a.attendance_date) = 1
AND DATEDIFF(a.attendance_date, a.prev_date) = 1;
• Q.783
Question:
Write a query to detect duplicate orders in a table.
Explanation:
To identify duplicate orders, you can use aggregation with GROUP BY on the order-related
fields (such as order_id or customer_id). Then, filter the groups using HAVING to find those
with a count greater than 1, which indicates duplicates.
Learnings:
• Using GROUP BY to group records based on key columns (e.g., order_id or customer_id).
• Using HAVING to filter aggregated data and find duplicates.
• Identifying duplicate entries in a dataset based on specific criteria.
Solutions
• - PostgreSQL solution
SELECT order_id, customer_id, COUNT(*) AS duplicate_count
FROM orders
GROUP BY order_id, customer_id
HAVING COUNT(*) > 1;
• - MySQL solution
SELECT order_id, customer_id, COUNT(*) AS duplicate_count
FROM orders
GROUP BY order_id, customer_id
HAVING COUNT(*) > 1;
• Q.784
Question:
How do you implement auto-incrementing fields in SQL?
Explanation:
Auto-incrementing fields are used to automatically generate unique values for a primary key
column, typically for identifiers like ID. In SQL, different databases use different
mechanisms for this feature. For SQL Server, the IDENTITY property is used, whereas
MySQL uses AUTO_INCREMENT.
Learnings:
• SQL Server uses the IDENTITY property for auto-incrementing fields, where you can
specify a starting value and increment.
• MySQL uses the AUTO_INCREMENT keyword to automatically increment field values with
each new row insertion.
• Both methods allow you to automatically generate unique values for primary key columns.
Solutions
• - SQL Server solution
CREATE TABLE employees (
employee_id INT IDENTITY(1,1) PRIMARY KEY,
employee_name VARCHAR(100),
salary DECIMAL(10, 2)
);
• - MySQL solution
CREATE TABLE employees (
employee_id INT AUTO_INCREMENT PRIMARY KEY,
employee_name VARCHAR(100),
salary DECIMAL(10, 2)
);
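PostgreSQL is not covered above. A sketch of the equivalent there, assuming PostgreSQL 10 or later, uses a standard identity column; older versions typically use the SERIAL pseudo-type instead. The sample names are illustrative only.

```sql
-- PostgreSQL 10+: standard SQL identity column
CREATE TABLE employees (
    employee_id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    employee_name VARCHAR(100),
    salary DECIMAL(10, 2)
);

-- employee_id is omitted from the column list; the database assigns it
INSERT INTO employees (employee_name, salary)
VALUES ('Amit', 60000), ('Priya', 70000);
```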
• Q.785
Question:
How can you insert NULL values into a column during data insertion?
Explanation:
To insert NULL values into a column, ensure the column allows NULL by not specifying NOT
NULL when creating the table. Then, use the NULL keyword in the INSERT statement where
you want to insert a NULL value.
• - Table creation
CREATE TABLE employees (
employee_id INT,
employee_name VARCHAR(100),
department_id INT NULL, -- Allowing NULL values in department_id
salary DECIMAL(10, 2)
);
• - Datasets
INSERT INTO employees (employee_id, employee_name, department_id, salary)
VALUES
(1, 'Amit', NULL, 60000), -- Inserting NULL for department_id
(2, 'Priya', 101, 70000),
(3, 'Ravi', NULL, 75000); -- Inserting NULL for department_id
Learnings:
• A column must be defined to allow NULL values (i.e., no NOT NULL constraint).
• NULL can be explicitly inserted using the NULL keyword in the INSERT statement.
• NULL is different from an empty string or zero — it represents the absence of a value.
Solutions
• - PostgreSQL and MySQL solution
INSERT INTO employees (employee_id, employee_name, department_id, salary)
VALUES
(1, 'Amit', NULL, 60000), -- Insert NULL for department_id
(2, 'Priya', 101, 70000),
(3, 'Ravi', NULL, 75000); -- Insert NULL for department_id
• Q.786
Question:
What are temporary tables, and when would you use them?
Explanation:
Temporary tables are tables that exist only for the duration of a session or transaction. They
can store intermediate results and are automatically dropped when the session ends or when
explicitly dropped. You would use them when you need to perform multiple complex
operations and want to store intermediate results temporarily to simplify queries or improve
performance.
Learnings:
• Temporary Tables are useful for storing intermediate results that are needed only during
the session or transaction.
• They are automatically dropped at the end of the session or when the connection is closed,
saving storage and avoiding clutter.
• Use cases include complex reporting, batch processing, and multi-step data
transformations where intermediate data is needed but doesn’t need to be persisted.
Solutions
• - PostgreSQL / MySQL solution
-- Creating a temporary table
CREATE TEMPORARY TABLE temp_employees (
employee_id INT,
employee_name VARCHAR(100),
salary DECIMAL(10, 2)
);
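Once created, a temporary table is used like any other table. The sketch below assumes a permanent employees table with the same three columns (as in the earlier questions); the salary threshold is arbitrary.

```sql
-- Fill the temporary table with an intermediate result
INSERT INTO temp_employees (employee_id, employee_name, salary)
SELECT employee_id, employee_name, salary
FROM employees
WHERE salary > 65000;

-- Query it for the rest of the session
SELECT employee_name, salary
FROM temp_employees
ORDER BY salary DESC;

-- Optional: drop it early instead of waiting for the session to end
DROP TABLE temp_employees;
```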
• Q.787
Question:
Differentiate between clustered and non-clustered indexes.
Explanation:
Clustered and non-clustered indexes are two types of database indexing techniques used to
improve the performance of data retrieval. The key difference lies in how the data is
physically stored and accessed.
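• Clustered Index: The table's data is physically organized in the order of the clustered
index. There can only be one clustered index per table because the data can be sorted in only
one order. In MySQL (InnoDB) and SQL Server, the primary key becomes the clustered index by
default. PostgreSQL has no clustered indexes: tables are stored as unordered heaps, and every
index, including the one behind the primary key, is a separate structure.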
• Non-Clustered Index: A non-clustered index does not alter the physical order of the data.
Instead, it creates a separate structure that holds the indexed values and a pointer to the actual
data rows. Multiple non-clustered indexes can be created on a table.
Learnings:
• Clustered Index:
• Data is physically sorted based on the index.
• Only one clustered index can exist per table.
• Typically created by default on primary keys.
• Non-Clustered Index:
• Does not alter the physical data order.
• Can have multiple non-clustered indexes on a table.
• Indexes are stored separately with pointers to data rows.
Solutions
MySQL Solution:
-- Clustered Index (Default for PRIMARY KEY)
CREATE TABLE employees (
employee_id INT PRIMARY KEY, -- Clustered index automatically created
employee_name VARCHAR(100),
salary DECIMAL(10, 2)
);
-- Non-Clustered Index
CREATE INDEX idx_salary ON employees (salary); -- Non-clustered index
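PostgreSQL Solution:
-- PostgreSQL has no clustered indexes; PRIMARY KEY creates an ordinary unique B-tree index
CREATE TABLE employees (
employee_id INT PRIMARY KEY, -- Unique index; rows stay in an unordered heap
employee_name VARCHAR(100),
salary DECIMAL(10, 2)
);
-- Secondary index (all PostgreSQL indexes are non-clustered)
CREATE INDEX idx_salary ON employees (salary);
-- Optional one-time physical reorder by an index; the order is not maintained for later writes
CLUSTER employees USING idx_salary;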
Key Differences:
• Clustered Index:
• Directly affects the physical storage order of data.
• Only one per table.
• Fast access to rows in the order of the index.
• Non-Clustered Index:
• Does not affect physical storage.
• Can have multiple non-clustered indexes.
• Slower access than clustered, but can be more flexible for queries involving multiple
columns or conditions.
• Q.788
Question:
Explain the concept of transactions in SQL.
Explanation:
A transaction in SQL is a sequence of one or more SQL operations that are executed as a
single unit of work. The transaction ensures data integrity and consistency, following the
ACID properties. These properties ensure that transactions are processed reliably and
concurrently without causing data corruption.
• ACID Properties:
• Atomicity: A transaction is atomic; it either completes entirely or not at all. If an error
occurs, the changes are rolled back.
• Consistency: The database must transition from one consistent state to another after the
transaction.
• Isolation: Transactions are isolated from each other; changes made by one transaction are
not visible to others until committed.
• Durability: Once a transaction is committed, its changes are permanent, even if there’s a
system failure.
SQL transactions are controlled using the following commands:
• BEGIN: Starts a transaction.
• COMMIT: Commits the changes made by the transaction to the database.
• ROLLBACK: Rolls back (undoes) all changes made in the current transaction.
Learnings:
• Transactions ensure data consistency and integrity.
• The ACID properties are critical for maintaining the reliability of database operations.
• BEGIN, COMMIT, and ROLLBACK are essential SQL commands for controlling
transactions.
Solutions
• - Example of a Transaction
BEGIN;
Key Points:
• BEGIN starts a new transaction.
• COMMIT saves changes made during the transaction.
• ROLLBACK undoes changes if an error occurs or if you want to discard the transaction.
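The three commands can be seen together in a small transfer scenario. The accounts table and its values are hypothetical, introduced only for this illustration; the syntax shown works in PostgreSQL and MySQL (InnoDB).

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE accounts (
    account_id INT PRIMARY KEY,
    balance DECIMAL(10, 2)
);
INSERT INTO accounts (account_id, balance) VALUES (1, 500.00), (2, 100.00);

-- Transfer 200 from account 1 to account 2 as one unit of work
BEGIN;
UPDATE accounts SET balance = balance - 200 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 200 WHERE account_id = 2;
COMMIT;      -- both updates become permanent together

-- A change that is discarded
BEGIN;
UPDATE accounts SET balance = 0 WHERE account_id = 1;
ROLLBACK;    -- account 1 keeps the balance it had after the transfer (300.00)
```

If either UPDATE in the first transaction failed, issuing ROLLBACK instead of COMMIT would leave both balances untouched, which is the atomicity property in practice.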
• Q.789
Question:
Describe the different normalization forms and their purposes.
Explanation:
Normalization is the process of organizing a database to reduce redundancy and dependency.
The goal is to ensure that the data is logically stored, efficient, and easy to maintain. There
are several normal forms (NF), each addressing specific types of issues in database design.
• First Normal Form (1NF):
• Purpose: Ensures that the table only contains atomic (indivisible) values and that each
column has a unique name.
• Requirement: Each record must be unique, and each column must contain atomic values
(i.e., no repeating groups or arrays).
Example (1NF):
CREATE TABLE orders (
order_id INT,
customer_name VARCHAR(100),
product_names VARCHAR(255) -- This violates 1NF, as multiple product names are stored in one column.
);
Learnings:
• 1NF: Ensures atomicity of columns (no repeating groups).
• 2NF: Eliminates partial dependencies on a composite primary key.
Solutions:
• - SQL for Normalization Forms
-- 1NF Example (Corrected)
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_name VARCHAR(100),
product_name VARCHAR(100)
);
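The 2NF rule from the Learnings (no partial dependency on a composite primary key) can be sketched in the same way. The table names below are hypothetical, and the example assumes an order may contain several products, so the key of the line-item table is (order_id, product_name).

```sql
-- Violates 2NF: customer_name depends only on order_id, not on the whole key
CREATE TABLE order_items_unnormalized (
    order_id INT,
    product_name VARCHAR(100),
    customer_name VARCHAR(100),
    PRIMARY KEY (order_id, product_name)
);

-- 2NF: move the partially dependent column to a table keyed by order_id alone
CREATE TABLE orders_2nf (
    order_id INT PRIMARY KEY,
    customer_name VARCHAR(100)
);

CREATE TABLE order_items_2nf (
    order_id INT,
    product_name VARCHAR(100),
    PRIMARY KEY (order_id, product_name),
    FOREIGN KEY (order_id) REFERENCES orders_2nf (order_id)
);
```

After the split, a customer's name is stored once per order instead of once per product line, so it cannot become inconsistent between lines of the same order.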
• Q.790
Question:
What is a view in SQL, and how is it different from a table? Provide an example of how to
create and use a view.
Explanation:
A view in SQL is a virtual table defined by a stored query. It does not store data
physically; the query runs against the underlying tables each time the view is queried.
Views can simplify complex queries, provide a layer of security (by restricting access to
specific columns or rows), and present a customized perspective of the data.
• Difference from a Table: A table stores data physically, while a view stores the SQL
query to retrieve data.
• Use of Views: Views are typically used to simplify complex joins, aggregate data, or
present a specific subset of data to the user.
Learnings:
• A view is a stored query and does not store data itself.
• Views are useful for simplifying complex queries and enhancing security.
• Unlike tables, views can represent data from multiple tables or provide a filtered subset.
Solutions
• - Creating a View
CREATE VIEW employee_salary_view AS
SELECT e.employee_id, e.employee_name, d.department_name, e.salary
FROM employees e
JOIN departments d ON e.department_id = d.department_id;
• - Querying a View
SELECT * FROM employee_salary_view;
• - Dropping a View
DROP VIEW IF EXISTS employee_salary_view;
Data Engineer
• Q.791
Question
Design a database schema for a stand-alone fast food restaurant.
Follow-up: Write a query to find the top three items by revenue and the percentage of
customers who order drinks with their meals.
Explanation
The task is to design a database schema for a fast food restaurant with tables that include
customers, orders, items, and transactions. After the schema is designed, write a query that
identifies the top three items by revenue and calculates the percentage of customers who
order drinks along with their meals.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
email VARCHAR(100)
);
Learnings
• Understanding database normalization for restaurant-related entities (customers, orders,
menu items).
• Using aggregate functions like SUM() for revenue calculations.
• Utilizing JOIN operations to combine data from multiple tables.
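One way the rest of the design and the follow-up queries could look is sketched below. Only the Customers table appears in the text above; Items, Orders, OrderItems, their columns, and the 'Meal' / 'Drink' category labels are assumptions made for illustration. LIMIT is PostgreSQL / MySQL syntax.

```sql
-- Hypothetical tables; only Customers is defined in the text above
CREATE TABLE Items (
    item_id INT PRIMARY KEY,
    item_name VARCHAR(100),
    category VARCHAR(50),          -- assumed labels such as 'Meal', 'Drink', 'Side'
    price DECIMAL(10, 2)
);

CREATE TABLE Orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_time TIMESTAMP,
    FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);

CREATE TABLE OrderItems (
    order_id INT,
    item_id INT,
    quantity INT,
    PRIMARY KEY (order_id, item_id),
    FOREIGN KEY (order_id) REFERENCES Orders(order_id),
    FOREIGN KEY (item_id) REFERENCES Items(item_id)
);

-- Top three items by revenue
SELECT i.item_name,
       SUM(i.price * oi.quantity) AS revenue
FROM OrderItems oi
JOIN Items i ON i.item_id = oi.item_id
GROUP BY i.item_id, i.item_name
ORDER BY revenue DESC
LIMIT 3;

-- Percentage of ordering customers with at least one order containing a meal and a drink
SELECT 100.0 * COUNT(DISTINCT CASE WHEN f.has_meal = 1 AND f.has_drink = 1
                                   THEN f.customer_id END)
             / COUNT(DISTINCT f.customer_id) AS pct_customers_drink_with_meal
FROM (
    -- One row per order, flagged by what it contains
    SELECT o.customer_id,
           o.order_id,
           MAX(CASE WHEN i.category = 'Meal'  THEN 1 ELSE 0 END) AS has_meal,
           MAX(CASE WHEN i.category = 'Drink' THEN 1 ELSE 0 END) AS has_drink
    FROM Orders o
    JOIN OrderItems oi ON oi.order_id = o.order_id
    JOIN Items i ON i.item_id = oi.item_id
    GROUP BY o.customer_id, o.order_id
) AS f;
```

The percentage is taken over customers who have placed at least one order. An interviewer may instead want it per order or over all registered customers, which would change only the denominator.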
• Q.792
Learnings
• Using TIMESTAMP for tracking precise entry and exit times.
• Calculating time differences using EXTRACT(EPOCH FROM ...) in PostgreSQL or
TIMESTAMPDIFF() in MySQL.
• Using aggregate functions and ORDER BY to find the fastest car.
Solutions
• - PostgreSQL solution
-- Query to get the time of the fastest car on the current day
SELECT v.license_plate,
(EXTRACT(EPOCH FROM (bc.exit_time - bc.entry_time)) / 60) AS time_minutes
FROM BridgeCrossing bc
JOIN Vehicles v ON bc.vehicle_id = v.vehicle_id
WHERE bc.entry_time::DATE = CURRENT_DATE
ORDER BY time_minutes ASC
LIMIT 1;
• - MySQL solution
-- Query to get the time of the fastest car on the current day
SELECT v.license_plate,
(TIMESTAMPDIFF(SECOND, bc.entry_time, bc.exit_time) / 60) AS time_minutes
FROM BridgeCrossing bc
JOIN Vehicles v ON bc.vehicle_id = v.vehicle_id
WHERE DATE(bc.entry_time) = CURDATE()
ORDER BY time_minutes ASC
LIMIT 1;
• Q.793
Question:
Explain SQL injection and how to prevent it.
Explanation:
SQL injection is a security vulnerability that occurs when an attacker can manipulate a SQL
query by injecting malicious SQL code into user inputs. This can allow unauthorized access,
modification, or deletion of data in the database. To prevent SQL injection, always use
parameterized queries (prepared statements) to safely bind user inputs and validate inputs to
ensure they conform to expected formats.
Learnings:
• SQL injection exploits user inputs to manipulate SQL queries.
• Using parameterized queries ensures user inputs are treated as data, not executable code.
• Input validation helps ensure that only valid data is processed, preventing harmful inputs.
Solutions:
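A minimal sketch of a parameterized query at the SQL level, written in PostgreSQL syntax. The app_users table and its rows are hypothetical. In MySQL the equivalent is PREPARE ... FROM with ? placeholders, and in application code the database driver's parameter binding plays the same role.

```sql
-- Hypothetical table for illustration
CREATE TABLE app_users (
    user_id INT PRIMARY KEY,
    username VARCHAR(100)
);
INSERT INTO app_users (user_id, username) VALUES (1, 'alice'), (2, 'bob');

-- Vulnerable pattern (shown as a comment only): the application builds the statement
-- by string concatenation, so the input  ' OR '1'='1  changes the query's logic:
--   SELECT * FROM app_users WHERE username = '' OR '1'='1';

-- Safer: a prepared statement with a typed parameter
PREPARE find_user (VARCHAR) AS
    SELECT user_id, username
    FROM app_users
    WHERE username = $1;

-- The whole input is bound as a single value, so no username matches it
EXECUTE find_user(''' OR ''1''=''1');

-- A legitimate value returns the matching row
EXECUTE find_user('alice');

DEALLOCATE find_user;
```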
• Q.794
session_end TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES Users(user_id)
);
• - Sample datasets
-- Inserting sample data into Users table
INSERT INTO Users (user_id, user_agent, ip_address)
VALUES
(1, 'Mozilla/5.0', '192.168.1.1'),
(2, 'Chrome/91.0', '192.168.1.2');
Learnings
• Using foreign keys to establish relationships between users, pages, clicks, and sessions.
• Tracking detailed click data including the position of clicks on a page.
• Storing user agent and IP address information for session tracking.
• Utilizing TIMESTAMP for accurate recording of event times.
Solutions
• - PostgreSQL solution
-- Query to get the total number of clicks per page
SELECT p.page_url, COUNT(c.click_id) AS total_clicks
FROM Clicks c
JOIN Pages p ON c.page_id = p.page_id
GROUP BY p.page_url;
• - MySQL solution
-- Query to get the total number of clicks per page
SELECT p.page_url, COUNT(c.click_id) AS total_clicks
FROM Clicks c
JOIN Pages p ON c.page_id = p.page_id
GROUP BY p.page_url;
• Q.795
Question
Explain how you would perform an ETL (Extract, Transform, Load) process using SQL.
Explanation
The task is to describe how to implement an ETL process using SQL. The process involves
extracting data from a source, transforming it into the desired format, and then loading it into
the destination database. You will use SQL queries for each step to manipulate the data and
ensure it’s correctly loaded into the target system.
Datasets and SQL Schemas
• - Table creation
-- Source table (raw data)
CREATE TABLE source_data (
Learnings
• The Extract phase involves fetching data from a source, often a raw or transactional
system.
• The Transform phase includes cleaning and aggregating data, such as calculating totals or
grouping by categories.
• The Load phase moves the transformed data into a target table, ensuring the structure
matches the target schema.
• Using JOIN, GROUP BY, and INSERT INTO statements in SQL to manipulate and load data.
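The three phases described above can be sketched end to end with Python's built-in sqlite3 module. The raw_sales and sales_summary tables and their columns are assumptions made for this example, not the book's original schema; the WITH ... INSERT form shown also works in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical source (raw) and target tables
CREATE TABLE raw_sales (sale_id INTEGER, category TEXT, amount REAL);
CREATE TABLE sales_summary (category TEXT PRIMARY KEY, total_amount REAL);
INSERT INTO raw_sales VALUES
    (1, ' books ', 10.0),
    (2, 'Books', 15.0),
    (3, 'toys', NULL),
    (4, 'Toys', 7.5);
""")

conn.execute("""
WITH cleaned AS (
    -- Extract + Transform: read raw rows, normalize category text, drop NULL amounts
    SELECT LOWER(TRIM(category)) AS category, amount
    FROM raw_sales
    WHERE amount IS NOT NULL
)
-- Load: write the aggregated result into the target table
INSERT INTO sales_summary (category, total_amount)
SELECT category, SUM(amount)
FROM cleaned
GROUP BY category
""")

summary = conn.execute(
    "SELECT category, total_amount FROM sales_summary ORDER BY category"
).fetchall()
print(summary)
```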
Solutions
• - PostgreSQL solution
-- ETL Process in SQL
-- Transform: Clean and aggregate the data (already done in the CTE above)
Learnings
• DELETE: Removes rows one at a time and logs each deletion in the transaction log. Can
be rolled back, slower for large datasets.
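• TRUNCATE: Removes all rows in the table without logging individual row deletions.
Faster than DELETE. Whether it can be rolled back depends on the DBMS: PostgreSQL and
SQL Server allow it inside a transaction, while MySQL and Oracle commit it implicitly.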
• DELETE can have conditions (WHERE clause), whereas TRUNCATE removes all rows
from the table.
• TRUNCATE may reset identity columns, while DELETE does not.
• TRUNCATE is a DDL command, whereas DELETE is a DML command.
Solutions
• - PostgreSQL solution
-- DELETE example (removes specific rows)
DELETE FROM Employees WHERE department = 'HR';
What are the differences between OLAP (Online Analytical Processing) and OLTP (Online
Transaction Processing)?
Explanation
The task is to compare OLAP and OLTP systems, highlighting their primary use cases, data
models, and performance characteristics.
Learnings
• OLAP: Used for complex queries and analytics, typically in data warehouses. It supports
multidimensional analysis and is optimized for read-heavy operations (e.g., aggregations,
reporting).
• OLTP: Focused on transaction processing, handling day-to-day operations such as order
processing, inventory management, and user activities. It is optimized for fast, reliable insert,
update, and delete operations.
• OLAP systems typically involve star or snowflake schemas and deal with large amounts of
historical data, whereas OLTP systems use normalized schemas to ensure data integrity and
fast transaction processing.
• OLAP databases are usually read-heavy, while OLTP databases are write-heavy.
Key Differences
FROM sales_data
GROUP BY department
ORDER BY total_sales DESC;
• - OLTP query example (inserting a transaction)
INSERT INTO orders (order_id, customer_id, product_id, quantity, order_date)
VALUES (101, 5, 3, 2, '2025-01-15');
• Q.798
Question
Explain the ACID properties in database systems.
Explanation
The ACID properties ensure that database transactions are processed reliably and adhere to
specific rules that guarantee data integrity and consistency, even in the face of system failures
or errors. ACID stands for Atomicity, Consistency, Isolation, and Durability.
Learnings
• Atomicity: Guarantees that a transaction is all-or-nothing: either every operation in it
completes, or none takes effect. If any operation fails, the entire transaction is rolled back.
• Consistency: Ensures that a transaction brings the database from one valid state to another,
maintaining all rules, constraints, and triggers.
• Isolation: Ensures that transactions are executed independently of one another. Even if
multiple transactions are occurring concurrently, each one will be isolated from others until it
is complete.
• Durability: Once a transaction is committed, it is permanent. Even in the event of a system
crash, the results of the transaction are preserved.
Solutions
• Atomicity:
Example: In a banking system, if a transfer involves debiting one account and crediting
another, both operations must succeed. If one fails, the whole transaction is rolled back (i.e.,
neither account is modified).
• Consistency:
Example: A transaction that deducts money from one account and adds it to another must
maintain the integrity of the account balances. If the transaction violates a constraint (e.g.,
negative balance), it is rolled back.
• Isolation:
Example: If two customers try to withdraw money from the same account simultaneously, the
transactions are isolated to prevent one from affecting the other. One transaction may have to
wait until the other finishes, ensuring correctness.
• Durability:
Example: Once a transaction that updates an order status is committed, even if the system
crashes right after, the status update will persist once the system is restored.
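The banking example for atomicity can be sketched with Python's built-in sqlite3 module. The accounts table, the CHECK constraint, and the transfer helper are hypothetical names introduced for this illustration; the credit is applied first on purpose so that a failing debit shows the earlier step being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, from_id, to_id, amount):
    """Credit then debit in one transaction; undo both if either step fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
                (amount, to_id),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
                (amount, from_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK, so the credit is rolled back
balances = conn.execute(
    "SELECT account_id, balance FROM accounts ORDER BY account_id"
).fetchall()
print(ok, failed, balances)
```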
• Q.799
Question
What is Partitioning in databases, and why is it important for query performance?
Explanation
The task is to explain database partitioning, which involves splitting a large table into
smaller, more manageable pieces (partitions). Partitioning helps improve query performance,
data management, and scalability by allowing operations to target specific subsets of data.
Learnings
• Partitioning: Splitting a table into smaller, distinct pieces (partitions) based on certain key
columns (e.g., date, region, etc.).
• It improves query performance by limiting the amount of data to scan during queries (e.g.,
partitioning by date allows querying only the relevant time period).
• Partitioning enhances data management, enabling better load balancing, parallel
processing, and more efficient data access.
• Types of Partitioning:
• Range Partitioning: Partitioned by a range of values (e.g., date ranges).
• List Partitioning: Partitioned by a specific list of values (e.g., regions, countries).
• Hash Partitioning: Data is divided based on a hash function applied to a column value.
Example
A large sales table can be partitioned by year, so each year’s data is stored in a separate
partition, improving query performance when filtering by date.
Solutions
• - Example of Range Partitioning (PostgreSQL)
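CREATE TABLE sales (
sale_id INT,
sale_date DATE,
amount DECIMAL,
-- On a partitioned table, the primary key must include the partition key
PRIMARY KEY (sale_id, sale_date)
)
PARTITION BY RANGE (sale_date);
-- The upper bound (TO) is exclusive, so each partition ends at the next January 1
CREATE TABLE sales_2021 PARTITION OF sales FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');
CREATE TABLE sales_2022 PARTITION OF sales FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');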
• - Example of List Partitioning (MySQL)
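CREATE TABLE employees (
employee_id INT,
department VARCHAR(50),
salary DECIMAL(10, 2),
-- MySQL requires every unique key to include the partitioning column
PRIMARY KEY (employee_id, department)
)
-- LIST COLUMNS is needed to partition on a string column
PARTITION BY LIST COLUMNS (department) (
PARTITION hr VALUES IN ('HR'),
PARTITION engineering VALUES IN ('Engineering'),
PARTITION finance VALUES IN ('Finance')
);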
• Q.800
Question
How do you optimize SQL queries for better performance?
Explanation
The task is to explain how to optimize SQL queries to improve their execution speed, reduce
resource usage, and enhance overall database performance. Query optimization is crucial for
handling large datasets, complex queries, and ensuring faster response times.
Learnings
• Indexing: Create indexes on columns that are frequently queried (e.g., in WHERE, JOIN, or
ORDER BY clauses) to speed up data retrieval.
• Avoiding SELECT *: Select only the necessary columns rather than using SELECT * to
reduce the amount of data returned.
• Using Joins Efficiently: Use appropriate JOIN types (e.g., INNER JOIN, LEFT JOIN) and
ensure the join condition is indexed to minimize the number of rows processed.
• Query Refactoring: Rewrite complex queries to break them into smaller, more efficient
parts, using temporary tables or subqueries when necessary.
• Using LIMIT: When dealing with large result sets, use LIMIT to restrict the number of
rows returned.
• Proper Data Types: Use the smallest possible data types for your columns (e.g., using INT
instead of BIGINT when possible).
• Query Execution Plan: Use EXPLAIN (supported by both PostgreSQL and MySQL; EXPLAIN
PLAN FOR is the Oracle form) to analyze the query execution plan and identify bottlenecks.
• Avoiding N+1 Problem: Use JOIN or IN to avoid making multiple queries in loops (N+1
queries).
Solutions
• - Example of Using Indexing (PostgreSQL/MySQL)
-- Create an index on a frequently queried column
CREATE INDEX idx_employee_name ON employees (employee_name);
• - Refactoring a Complex Query to Use JOIN Instead of Subqueries
-- Inefficient query with subquery
SELECT name, (SELECT department FROM departments WHERE employee_id = employees.id) AS dept
FROM employees;
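One possible JOIN rewrite of the query above, sketched with Python's built-in sqlite3 module. The table layouts and rows are assumptions inferred from the column names in that query. The two forms return the same rows only when each employee has at most one row in departments; a LEFT JOIN is used so that employees without a department are kept, as in the subquery version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables matching the column names used in the query above
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE departments (employee_id INTEGER, department TEXT);
INSERT INTO employees VALUES (1, 'Ann'), (2, 'Raj'), (3, 'Li');
INSERT INTO departments VALUES (1, 'HR'), (2, 'Finance');
""")

# Original form: a correlated scalar subquery in the SELECT list
subquery_rows = conn.execute("""
    SELECT name,
           (SELECT department FROM departments WHERE employee_id = employees.id) AS dept
    FROM employees
    ORDER BY name
""").fetchall()

# Rewritten form: a single LEFT JOIN
join_rows = conn.execute("""
    SELECT e.name, d.department AS dept
    FROM employees e
    LEFT JOIN departments d ON d.employee_id = e.id
    ORDER BY e.name
""").fetchall()

print(join_rows)
```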
Key Takeaways
• Indexing improves data retrieval speed.
• Use JOIN instead of subqueries when possible.
• Limit the amount of data retrieved with LIMIT and specific column selection.
• Analyze the query execution plan to identify inefficiencies.
Business Analyst
• Q.801
Question
What is the difference between DELETE and TRUNCATE TABLE in SQL?
Explanation
DELETE (written as DELETE FROM table_name; SQL has no DELETE TABLE statement) and
TRUNCATE TABLE are both used to remove data from a table, but they differ in how they
operate, their impact on the transaction log, and their ability to be rolled back.
• DELETE removes specific rows based on a condition and can be rolled back if wrapped in
a transaction. It logs individual row deletions, which can make it slower for large datasets.
• TRUNCATE removes all rows from a table, does not log individual row deletions, and is
generally faster. It does not allow for row-specific conditions and may not be rolled back in
some database systems unless in a transaction.
Learnings
• DELETE can be rolled back (if not committed), supports WHERE clauses, and operates row-
by-row, generating a more extensive transaction log.
• TRUNCATE cannot selectively delete rows (no WHERE clause), removes all rows quickly,
and has minimal logging, making it faster but less flexible.
• Both statements keep the table structure. DELETE checks foreign keys and fires row-level
DELETE triggers for each row, while TRUNCATE can reset identity columns, typically skips
row-level DELETE triggers, and may be rejected when other tables reference the table,
depending on the DBMS.
Key Differences
• Foreign Key Constraints:
• DELETE: Does not affect foreign keys unless explicitly handled.
• TRUNCATE: Cannot be used if foreign keys exist unless temporarily removed.
Learnings
• The BETWEEN operator is used to filter data within a range (inclusive of the boundary
values).
• Alphabetical ranges with strings are based on the lexicographical (dictionary) order.
• In this case, the query selects all employees whose last names sort from 'Bailey' through
'Frederick', including both boundary values. A longer name such as 'Fredericks' sorts after
'Frederick' and is excluded.
Solution
• - SQL query to select records within the specified range
SELECT * FROM employees
WHERE last_name BETWEEN 'Bailey' AND 'Frederick';
Key Takeaways
• The BETWEEN operator can be used for both numerical and string ranges.
• When used with strings, BETWEEN compares lexicographical order (alphabetical order).
• This query will return all records where the last_name is between 'Bailey' and 'Frederick',
inclusive.
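To see how the string range behaves at its boundaries, here is a small sketch using Python's built-in sqlite3 module. The sample last names are made up for illustration, and the result assumes SQLite's default binary collation; other databases and collations can order strings differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (last_name TEXT)")
# Hypothetical sample names chosen to sit on and around the range boundaries
conn.executemany(
    "INSERT INTO employees VALUES (?)",
    [("Adams",), ("Bailey",), ("Carter",), ("Frederick",), ("Fredericks",), ("Zhang",)],
)

in_range = [
    row[0]
    for row in conn.execute(
        "SELECT last_name FROM employees "
        "WHERE last_name BETWEEN 'Bailey' AND 'Frederick' "
        "ORDER BY last_name"
    )
]
print(in_range)
```

Both boundary values are returned, while 'Fredericks' is not, because it sorts after 'Frederick'.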
• Q.803
Question
Write an SQL query to find the year from a YYYY-MM-DD date.
Explanation
This task involves extracting the year part from a DATE column that is in the format
YYYY-MM-DD. You can use built-in SQL date functions to extract specific parts of a date, such
as the YEAR() function in MySQL and SQL Server, or EXTRACT(YEAR FROM ...) in PostgreSQL
and Oracle.
Learnings
• SQL provides date functions like YEAR(), MONTH(), and DAY() to extract specific parts of a
date.
• Extracting the year is useful for time-based analysis, like grouping orders by year or
filtering by a specific year.
• The YEAR() function works for DATE, DATETIME, and TIMESTAMP columns.
Solution
• - SQL query to extract the year from a DATE
SELECT order_id, order_date, YEAR(order_date) AS order_year
FROM orders;
Key Takeaways
• The YEAR() function extracts the year from a DATE value.
• The query will return the year from each order_date in the orders table.
• The output column order_year will contain only the year part of the date (e.g., 2023,
2024, 2025).
• Q.804
Question
Write an SQL query to select the second highest salary in the engineering department.
Explanation
This task involves selecting the second highest salary from the employees table, specifically
for the "Engineering" department. You can use LIMIT with ORDER BY, a subquery with MAX(),
or a window function such as DENSE_RANK() (if supported by the database). DENSE_RANK()
handles tied salaries, whereas ROW_NUMBER() assigns tied rows different positions.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
first_name VARCHAR(100),
last_name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10, 2)
);
• - Sample datasets
INSERT INTO employees (employee_id, first_name, last_name, department, salary)
VALUES
(1, 'John', 'Doe', 'Engineering', 80000),
(2, 'Jane', 'Smith', 'Engineering', 95000),
(3, 'Alice', 'Johnson', 'Marketing', 70000),
(4, 'Bob', 'Brown', 'Engineering', 85000),
(5, 'Charlie', 'Davis', 'Engineering', 90000);
Learnings
• You can find the second highest salary using a subquery, DISTINCT, LIMIT, or window
functions (e.g., ROW_NUMBER()).
• The common method involves selecting the maximum salary from the set of salaries that
are less than the highest salary.
Solutions
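Three common approaches, sketched with Python's built-in sqlite3 module against the sample data above. This is an illustrative sketch rather than the book's original solution; the window-function variant assumes SQLite 3.25 or later, and the alias ranked is a name chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(100),
    last_name VARCHAR(100),
    department VARCHAR(50),
    salary DECIMAL(10, 2)
);
INSERT INTO employees (employee_id, first_name, last_name, department, salary)
VALUES
    (1, 'John', 'Doe', 'Engineering', 80000),
    (2, 'Jane', 'Smith', 'Engineering', 95000),
    (3, 'Alice', 'Johnson', 'Marketing', 70000),
    (4, 'Bob', 'Brown', 'Engineering', 85000),
    (5, 'Charlie', 'Davis', 'Engineering', 90000);
""")

# 1. Subquery: highest salary that is below the department maximum
via_subquery = conn.execute("""
    SELECT MAX(salary) FROM employees
    WHERE department = 'Engineering'
      AND salary < (SELECT MAX(salary) FROM employees WHERE department = 'Engineering')
""").fetchone()[0]

# 2. DISTINCT with LIMIT/OFFSET: skip the top distinct salary, take the next
via_offset = conn.execute("""
    SELECT DISTINCT salary FROM employees
    WHERE department = 'Engineering'
    ORDER BY salary DESC
    LIMIT 1 OFFSET 1
""").fetchone()[0]

# 3. Window function: DENSE_RANK gives tied salaries the same rank
via_window = conn.execute("""
    SELECT DISTINCT salary
    FROM (
        SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
        FROM employees
        WHERE department = 'Engineering'
    ) AS ranked
    WHERE rnk = 2
""").fetchone()[0]

print(via_subquery, via_offset, via_window)
```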
Key Takeaways
• The subquery approach is simple and effective but can be less efficient with larger datasets.
• Using ROW_NUMBER() is ideal for more complex use cases or large datasets, where you
need to rank rows.
• LIMIT with OFFSET is an easy solution for finding the second highest value in
MySQL/PostgreSQL.
• Q.805
Question
What is the PRIMARY KEY in SQL?
Explanation
A PRIMARY KEY in SQL is a constraint that uniquely identifies each record in a table. It
ensures that no two rows have the same value for the primary key columns and that the
primary key columns cannot contain NULL values. Each table can have only one PRIMARY
KEY, and this key can consist of a single column or a combination of multiple columns
(composite primary key).
Learnings
• A PRIMARY KEY must have unique values for each record in the table.
• It does not allow NULL values in any of the columns that are part of the primary key.
• Each table can only have one PRIMARY KEY, but the key can span multiple columns
(composite key).
• A PRIMARY KEY automatically creates a unique index on the column(s) to enforce
uniqueness and speed up queries.
• The primary key is often used in relationships (e.g., as a foreign key in other tables).
Solution
• - Table creation with a PRIMARY KEY on a single column
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
first_name VARCHAR(100),
last_name VARCHAR(100),
department VARCHAR(50)
);
• - Table creation with a composite PRIMARY KEY (multiple columns)
CREATE TABLE project_assignments (
project_id INT,
employee_id INT,
PRIMARY KEY (project_id, employee_id)
);
Key Takeaways
• The PRIMARY KEY constraint enforces uniqueness and prevents NULL values.
• A table can only have one PRIMARY KEY, but it can be made up of one or more columns.
• It is crucial for maintaining data integrity and is often used to establish relationships
between tables.
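A short sketch of the uniqueness rule in action, using Python's built-in sqlite3 module and the project_assignments table from the solution above. NOT NULL is spelled out explicitly here because SQLite, unlike most databases, does not always reject NULL in primary key columns; the inserted values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE project_assignments (
        project_id INT NOT NULL,
        employee_id INT NOT NULL,
        PRIMARY KEY (project_id, employee_id)
    )
""")

# The same project with different employees is allowed: the pair is what must be unique.
conn.executemany("INSERT INTO project_assignments VALUES (?, ?)", [(1, 10), (1, 11)])

try:
    # Repeating an existing (project_id, employee_id) pair violates the primary key.
    conn.execute("INSERT INTO project_assignments VALUES (?, ?)", (1, 10))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM project_assignments").fetchone()[0]
print(duplicate_rejected, row_count)
```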
• Q.806
Question
What is the difference between INNER JOIN and OUTER JOIN?
Explanation
The difference between INNER JOIN and OUTER JOIN lies in the way they handle unmatched
rows between the tables being joined.
• INNER JOIN: Returns only the rows where there is a match in both tables. If a row from
one table does not have a corresponding match in the other table, it is excluded from the
result set.
• OUTER JOIN: Returns all rows from one table (or from both, for a FULL OUTER JOIN), and the
matching rows from the other table. If there is no match, NULL values are returned for
columns of the table without a match.
There are three types of OUTER JOIN:
• LEFT OUTER JOIN (LEFT JOIN): Returns all rows from the left table and the matched
rows from the right table.
• RIGHT OUTER JOIN (RIGHT JOIN): Returns all rows from the right table and the
matched rows from the left table.
• FULL OUTER JOIN: Returns all rows from both tables, with matching rows where
available. If there is no match, NULL values are returned for the columns of the table without a
match.
Learnings
• INNER JOIN returns only the intersecting data between two tables.
• OUTER JOIN returns all data from one table, and for the unmatched rows from the other,
it fills in NULL.
• LEFT OUTER JOIN includes all rows from the left table, and only matching rows from
the right table.
• RIGHT OUTER JOIN includes all rows from the right table, and only matching rows
from the left table.
• FULL OUTER JOIN includes all rows from both tables, filling in NULL where no match
exists.
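The contrast between INNER JOIN and LEFT OUTER JOIN can be sketched with Python's built-in sqlite3 module. The customers and orders tables and their rows are hypothetical; FULL OUTER JOIN is left out because older SQLite versions do not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables: customer 3 has no orders
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Raj'), (3, 'Li');
INSERT INTO orders VALUES (100, 1), (101, 1), (102, 2);
""")

# INNER JOIN: only customers that have at least one matching order
inner_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.name, o.order_id
""").fetchall()

# LEFT JOIN: every customer; order_id is NULL (None) where no order matches
left_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.name, o.order_id
""").fetchall()

print(inner_rows)
print(left_rows)
```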
Key Differences
• Match Criteria:
• INNER JOIN: Only rows with matching values in both tables.
• OUTER JOIN: All rows from one table, and matched rows from the other.
Key Takeaways
• INNER JOIN is used when you only want the rows that exist in both tables.
• OUTER JOIN is used when you want to include all rows from one table, and matching
rows from the other table, with NULL for unmatched rows.
• The type of OUTER JOIN determines which table’s rows are always included (LEFT,
RIGHT, or FULL).
• Q.807
Question
How would you find the total sales for each product category?
Explanation
To find the total sales for each product category, you can use the SUM() aggregate function to
sum up the sales for each category. The GROUP BY clause is used to group the results by
product category, so the total sales are calculated for each individual category.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
price DECIMAL(10, 2)
);
Learnings
• The SUM() function is used to calculate the total sales for each category.
• The GROUP BY clause groups the data by category, so that the sum is computed for each
category individually.
• You can combine multiple tables with JOIN if sales data and product details are in different
tables.
Solution
SELECT p.category, SUM(s.quantity * p.price) AS total_sales
FROM sales s
JOIN products p ON s.product_id = p.product_id
GROUP BY p.category;
Key Takeaways
In this example, the CASE statement classifies employees into 'Low', 'Medium', or 'High'
salary categories based on their salary.
Here, the CASE statement updates the bonus for employees in the "Engineering" department
based on their performance rating.
FROM employees
ORDER BY
CASE
WHEN salary < 50000 THEN 1
WHEN salary BETWEEN 50000 AND 100000 THEN 2
WHEN salary > 100000 THEN 3
END;
In this case, the CASE statement is used in the ORDER BY clause to prioritize rows based on
salary ranges.
Key Takeaways
• The CASE statement provides conditional logic within SQL queries, helping transform or
classify data based on specific conditions.
• The CASE expression can be used in SELECT, UPDATE, and ORDER BY clauses.
• Simple CASE compares a column or expression with fixed values, while Searched CASE
evaluates multiple conditions.
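A searched CASE in a SELECT list can be sketched with Python's built-in sqlite3 module. The employees rows are made up, and the 50000 and 100000 thresholds are borrowed from the ORDER BY example above as an assumption about how the salary bands are defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT, salary INTEGER)"
)
# Hypothetical rows, one per salary band
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ann", 40000), (2, "Raj", 75000), (3, "Li", 120000)],
)

# Searched CASE: conditions are checked top to bottom; ELSE covers everything remaining
bands = conn.execute("""
    SELECT first_name,
           CASE
               WHEN salary < 50000 THEN 'Low'
               WHEN salary BETWEEN 50000 AND 100000 THEN 'Medium'
               ELSE 'High'
           END AS salary_category
    FROM employees
    ORDER BY employee_id
""").fetchall()
print(bands)
```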
• Q.809
Question
What is the difference between WHERE and HAVING clauses in SQL?
Explanation
The WHERE and HAVING clauses are both used to filter data, but they are used in different
contexts:
• WHERE: Filters rows before any grouping or aggregation is applied. It is used to filter
data at the individual row level and can be used with all types of queries, including those
without aggregation.
• HAVING: Filters data after aggregation or grouping has been performed. It is used to filter
groups of rows (i.e., after GROUP BY), based on aggregated values like COUNT(), SUM(),
AVG(), etc.
Learnings
• WHERE filters data before aggregation and can be used on individual rows.
• HAVING filters data after aggregation and is used to filter groups created by GROUP BY.
• The WHERE clause cannot be used with aggregate functions (like SUM(), AVG()), but HAVING
can.
Key Differences
In this example, the WHERE clause filters products with a price greater than 100 before any
aggregation.
Here, the HAVING clause filters categories where the average price is greater than 50, after
grouping the products by category.
Key Takeaways
• WHERE filters individual rows based on conditions and is applied before any aggregation.
• HAVING filters the result of aggregations and is used after GROUP BY.
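The two filters described above (price greater than 100 with WHERE, average price greater than 50 with HAVING) can be sketched with Python's built-in sqlite3 module. The products rows are invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT, category TEXT, price REAL)")
# Hypothetical sample rows
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("Laptop", "Electronics", 900),
        ("Cable", "Electronics", 10),
        ("Pen", "Stationery", 2),
        ("Notebook", "Stationery", 5),
        ("Desk", "Furniture", 150),
    ],
)

# WHERE: filters individual rows before any grouping
expensive = [
    row[0]
    for row in conn.execute(
        "SELECT product_name FROM products WHERE price > 100 ORDER BY product_name"
    )
]

# HAVING: filters whole groups after AVG() has been computed per category
pricey_categories = [
    row[0]
    for row in conn.execute("""
        SELECT category, AVG(price) AS avg_price
        FROM products
        GROUP BY category
        HAVING AVG(price) > 50
        ORDER BY category
    """)
]
print(expensive, pricey_categories)
```

Note that 'Cable' is dropped by the WHERE filter, yet its category still passes the HAVING filter because the group average is what is tested.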
• Q.810
Question
How can subqueries be used in SQL?
Explanation
A subquery is a query nested inside another query. It can be used to:
• Retrieve a single value or multiple values.
• Provide results to be used by the main query.
• Be used in various clauses, including SELECT, FROM, WHERE, and HAVING.
Subqueries can be categorized into:
• Scalar Subquery: Returns a single value (used in SELECT, WHERE, etc.).
• Correlated Subquery: Depends on values from the outer query and is evaluated once for
each row.
• Non-correlated Subquery: Independent of the outer query and is executed only once.
• Inline Subquery: Used in the FROM clause to act as a derived table or view.
Learnings
• Subqueries can be used in SELECT, WHERE, FROM, and HAVING clauses.
• Scalar subqueries return single values, while multi-row subqueries can return multiple
values for use in comparison.
• Correlated subqueries reference the outer query, whereas non-correlated subqueries do
not.
• Subqueries can simplify complex queries by breaking them down into smaller parts.
Solutions
In this case, the subquery (SELECT AVG(salary) FROM employees) calculates the average
salary, and the main query returns employees whose salary is greater than the calculated
average.
Here, the subquery (SELECT department, salary FROM employees) is used to filter data
before calculating the average salary for each department.
In this correlated subquery, the inner query references the outer query’s department value
(e.department) to calculate the average salary for that department.
Key Takeaways
• Subqueries are queries embedded within another query and can be used in various clauses
like SELECT, WHERE, FROM, and HAVING.
• Scalar subqueries return a single value, whereas multi-row subqueries return multiple
values.
• Correlated subqueries depend on the outer query and are evaluated for each row, while
non-correlated subqueries are independent and run once.
• Subqueries can make complex queries more readable by breaking them into smaller logical
parts.
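The three subquery styles discussed above (scalar, derived table in FROM, and correlated) can be sketched with Python's built-in sqlite3 module. The employees rows and the alias s are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
# Hypothetical sample rows
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ann", "HR", 50000),
        ("Raj", "HR", 70000),
        ("Li", "Engineering", 90000),
        ("Sam", "Engineering", 110000),
    ],
)

# Scalar, non-correlated subquery in WHERE: runs once and yields one value
above_overall_avg = [r[0] for r in conn.execute("""
    SELECT name FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
""")]

# Subquery in FROM (derived table), aggregated by the outer query
dept_avg = conn.execute("""
    SELECT s.department, AVG(s.salary)
    FROM (SELECT department, salary FROM employees) AS s
    GROUP BY s.department
    ORDER BY s.department
""").fetchall()

# Correlated subquery: re-evaluated for each outer row via e.department
above_dept_avg = [r[0] for r in conn.execute("""
    SELECT e.name FROM employees e
    WHERE e.salary > (SELECT AVG(salary) FROM employees WHERE department = e.department)
    ORDER BY e.name
""")]
print(above_overall_avg, dept_avg, above_dept_avg)
```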
SQL Developer
• Q.811
Question
How do you find duplicate records in a table?
Answer
To find duplicate records in a table, you can use the GROUP BY clause to group rows based on
the columns that might have duplicates, and the HAVING clause to filter groups that appear
more than once.
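A minimal sketch of the GROUP BY and HAVING approach, run through Python's built-in sqlite3 module. The contacts table and its email column are hypothetical; in practice you would group by whichever columns define a duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (contact_id INTEGER PRIMARY KEY, email TEXT)")
# Hypothetical rows: one address appears three times
conn.executemany(
    "INSERT INTO contacts (email) VALUES (?)",
    [
        ("a@example.com",),
        ("b@example.com",),
        ("a@example.com",),
        ("c@example.com",),
        ("a@example.com",),
    ],
)

# Group on the column that should be unique and keep groups with more than one row
duplicates = conn.execute("""
    SELECT email, COUNT(*) AS occurrences
    FROM contacts
    GROUP BY email
    HAVING COUNT(*) > 1
""").fetchall()
print(duplicates)
```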
Key Points:
• GROUP BY groups records based on the columns specified.
• HAVING COUNT(*) > 1 filters groups that have more than one occurrence, identifying
duplicates.
• Q.812
Question
What are indexes, and how do they improve query performance?
Answer
Indexes are database objects that improve the speed of data retrieval operations on a table at
the cost of additional storage space and slower write operations. They work similarly to an
index in a book, allowing the database to quickly locate rows based on a specified column or
set of columns.
Trade-offs:
• Increased Storage: Indexes take up additional disk space because they maintain a separate
data structure.
• Slower Write Operations: INSERT, UPDATE, and DELETE operations are slower because the
index must also be updated every time data is modified. This increases the overhead.
Key Points:
• Indexes are used to improve read performance (SELECT queries) by speeding up data
retrieval.
• They can slow down write operations (INSERT, UPDATE, DELETE) because the index needs
to be updated.
• It's important to balance indexing based on the specific workload and query patterns of the
application.
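To see an index change how a lookup is executed, the sketch below compares query plans before and after creating one, using Python's built-in sqlite3 module. EXPLAIN QUERY PLAN is SQLite-specific and its wording varies between versions, so the code only checks whether the index name appears; the table, the data, and the plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, employee_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees (employee_name) VALUES (?)",
    [(f"emp{i}",) for i in range(1000)],
)

query = "SELECT employee_id FROM employees WHERE employee_name = ?"


def plan(conn, sql, params):
    """Return SQLite's plan description (the last column of each plan row) as one string."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params))


plan_before = plan(conn, query, ("emp42",))  # no index yet: full table scan
conn.execute("CREATE INDEX idx_employee_name ON employees (employee_name)")
plan_after = plan(conn, query, ("emp42",))   # the planner can now search the index

print(plan_before)
print(plan_after)
```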
• Q.813
Question
What is a subquery, and how is it different from a join?
Answer
A subquery is a query nested inside another query. It can return a single value, a set of
values, or a table of results, and is often used in the SELECT, WHERE, or FROM clauses. A join,
on the other hand, is used to combine data from two or more tables based on a related
column.
Key Points:
• Subqueries are useful for filtering or computing values that will be used by the outer
query, especially when a direct relationship isn't needed.
• Joins are generally more efficient when you need to combine data from multiple tables and
work with large datasets.
• Subqueries can be more readable and useful for certain operations (like filtering based on
aggregated results), while joins are better suited for combining rows across multiple tables.
• Q.814
Question
What is a stored procedure, and how does it differ from a function?
Answer
A stored procedure is a precompiled collection of SQL statements stored in the database
that can be executed by the database engine. It can perform operations like querying,
updating, and deleting data, as well as executing complex business logic. A function is
similar but typically designed to return a value and is often used in expressions or queries.
Example: Function
A function to calculate the total salary of employees in a department:
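CREATE FUNCTION GetDepartmentSalary(p_department_id INT)
RETURNS DECIMAL
BEGIN
DECLARE total_salary DECIMAL;
-- The parameter is prefixed with p_ so it is not confused with the department_id column;
-- with identical names the condition would compare the parameter to itself
SELECT SUM(salary) INTO total_salary FROM employees WHERE department_id = p_department_id;
RETURN total_salary;
END;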
Key Points:
• Stored Procedures: Used for executing operations that may or may not return data; they
don’t return values directly in queries.
• Functions: Used to calculate and return a value that can be directly used in SQL queries or
expressions.
• Stored Procedures allow for more complex operations, including multiple SQL
statements, control flow, and error handling, while Functions are generally simpler and
designed to return a single value.
• Q.815
Question
What is the purpose of the GROUP BY clause? Provide an example.
Answer
The GROUP BY clause in SQL is used to group rows that have the same values in specified
columns into summary rows, typically for the purpose of aggregation. It allows you to
aggregate data based on one or more columns using aggregate functions like COUNT(), SUM(),
AVG(), MIN(), and MAX().
Example:
Find the total sales for each product category in a sales table.
SELECT category, SUM(sales_amount) AS total_sales
FROM sales
GROUP BY category;
In this example:
• category is the column we group by.
• SUM(sales_amount) calculates the total sales for each product category.
• The query returns one row for each unique product category, showing the total sales
amount for that category.
Key Points:
• GROUP BY groups rows based on specified columns.
• It is used to perform aggregate functions on groups of rows.
• It simplifies data analysis by summarizing data at different levels (e.g., by department,
product, or date).
• Q.816
Question
What are window functions in SQL? Provide examples.
Answer
Window functions in SQL are special types of functions that perform calculations across a
set of table rows related to the current row, but unlike aggregate functions, they do not group
the result set into a single output row. Window functions allow you to perform calculations
like ranking, running totals, and moving averages without collapsing the result set.
Examples:
2. RANK(): Rank employees based on their salary, with gaps for ties
SELECT employee_id, first_name, salary,
RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees;
• This ranks employees by salary in descending order. If two employees have the same
salary, they will share the same rank, and the next rank will be skipped (e.g., two employees
ranked 1, then the next employee will be ranked 3).
3. LEAD(): Get the salary of the next employee in the result set
SELECT employee_id, first_name, salary,
LEAD(salary) OVER (ORDER BY salary DESC) AS next_salary
FROM employees;
• This provides the salary of the next employee in the list, ordered by salary in descending
order. If there is no next row, it returns NULL.
4. LAG(): Get the salary of the previous employee in the result set
SELECT employee_id, first_name, salary,
LAG(salary) OVER (ORDER BY salary DESC) AS previous_salary
FROM employees;
• This provides the salary of the previous employee in the list, ordered by salary in
descending order. If there is no previous row, it returns NULL.
Key Points:
• Window functions allow you to perform complex calculations across rows while keeping
the individual row data intact.
• They are used with the OVER() clause, which defines how the window or partition is
created (e.g., by ordering or partitioning data).
• Window functions like ROW_NUMBER(), RANK(), LEAD(), and LAG() are especially useful
for tasks like ranking, calculating running totals, or comparing values across rows in a result
set.
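ROW_NUMBER(), RANK(), and a running total can be compared side by side in one query. The sketch below uses Python's built-in sqlite3 module and assumes SQLite 3.25 or later for window-function support; the employees rows are invented, with two tied salaries to show how the functions differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT, salary INTEGER)"
)
# Hypothetical rows; Ann and Li share the top salary
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ann", 90000), (2, "Raj", 70000), (3, "Li", 90000), (4, "Sam", 50000)],
)

rows = conn.execute("""
    SELECT first_name,
           salary,
           -- unique position per row; employee_id breaks ties deterministically
           ROW_NUMBER() OVER (ORDER BY salary DESC, employee_id) AS row_num,
           -- ties share a rank and the following rank is skipped
           RANK() OVER (ORDER BY salary DESC) AS salary_rank,
           -- running total over all rows up to and including the current one
           SUM(salary) OVER (
               ORDER BY salary DESC, employee_id
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM employees
    ORDER BY salary DESC, employee_id
""").fetchall()

for row in rows:
    print(row)
```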
• Q.817
Question:
What are normalization and denormalization in database design?
Answer:
• Normalization: The process of organizing a database to reduce redundancy and
dependency by dividing large tables into smaller, manageable ones and using relationships
between them. The goal is to minimize data duplication and ensure data integrity.
• Types: Includes several normal forms (1NF, 2NF, 3NF, BCNF, etc.), with each successive
form removing specific types of redundancy and dependency.
• Denormalization: The process of combining tables or introducing redundancy back into a
database to improve read performance. Denormalization is used when performance is
prioritized over data integrity, often in cases like data warehousing.
Key Points:
• Normalization improves data integrity and reduces redundancy.
• Denormalization improves read performance by reducing the complexity of joins but
increases redundancy and the potential for anomalies during updates.
• Q.818
Question:
What is the difference between CHAR and VARCHAR data types?
Answer:
• CHAR (Fixed-Length String): Stores a string of fixed length. If the string is shorter than
the defined length, the remaining space is padded with spaces.
• Example: CHAR(10) always stores 10 characters (10 bytes in a single-byte character set),
even if the string stored is only 5 characters long.
• VARCHAR (Variable-Length String): Stores a string with a length that can vary. It only
uses as much space as needed to store the string, plus a small amount for length information.
• Example: VARCHAR(10) will store a string of up to 10 characters, but it will only use as
many bytes as needed for the actual string length.
Key Points:
• CHAR is useful when you know the string length will always be fixed (e.g., country codes,
phone numbers).
• VARCHAR is more efficient when string length varies because it saves storage space.
• Q.819
What is the purpose of using foreign keys in database design?
Answer:
A foreign key is a column (or combination of columns) in a table that refers to the primary
key (or another unique key) of another table. It establishes a relationship between the two
tables, ensuring referential integrity.
Purpose:
• Referential Integrity: Ensures that a foreign key value must match a value in the
referenced table’s primary key or be NULL.
• Relationship Representation: Used to define relationships between tables (e.g., one-to-
many, many-to-many).
Example: In an orders table, the customer_id can be a foreign key that references the id
field in the customers table, ensuring that every order corresponds to an existing customer.
Key Points:
• Foreign keys prevent invalid data by ensuring that relationships between tables are valid.
• They help maintain consistent and accurate data across multiple tables.
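The orders/customers relationship described above could be declared as follows. This is a minimal sketch: only customer_id and id come from the example in the text, while the other columns and the sample rows are assumptions added for illustration.

```sql
CREATE TABLE customers (
    id   INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT,                 -- nullable, so an order may have no customer yet
    order_date  DATE,
    FOREIGN KEY (customer_id) REFERENCES customers (id)
);

INSERT INTO customers (id, name) VALUES (1, 'Alice');

-- Accepted: customer 1 exists
INSERT INTO orders (order_id, customer_id, order_date) VALUES (100, 1, '2025-01-15');

-- Accepted: a NULL foreign key is not checked against the parent table
INSERT INTO orders (order_id, customer_id, order_date) VALUES (101, NULL, '2025-01-16');

-- Rejected with a foreign key violation: there is no customer 99
INSERT INTO orders (order_id, customer_id, order_date) VALUES (102, 99, '2025-01-17');
```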
• Q.820
Question:
What is the difference between UNION and UNION ALL in SQL?
Answer:
• UNION: Combines the results of two or more SELECT queries and removes duplicate rows
from the final result set. The result will only include unique rows.
• Example: SELECT column1 FROM table1 UNION SELECT column1 FROM table2;
• UNION ALL: Combines the results of two or more SELECT queries but does not remove
duplicates. It returns all rows, including duplicates.
• Example: SELECT column1 FROM table1 UNION ALL SELECT column1 FROM table2;
Key Points:
• UNION removes duplicates and may be slower due to the extra overhead of eliminating
duplicates.
• UNION ALL is faster because it does not perform duplicate removal, but it may return
duplicate rows.
• DROP: Completely removes a table from the database, including its structure, data, and
any associated indexes or constraints. In MySQL and Oracle this cannot be rolled back; in
PostgreSQL and SQL Server a DROP issued inside a transaction can still be rolled back.
• Example: DROP TABLE employees;
Key Points:
• DELETE: Selective row deletion, can be rolled back.
• TRUNCATE: Removes all rows quickly; it cannot be rolled back in MySQL or Oracle, but it
can be inside a transaction in PostgreSQL and SQL Server.
• DROP: Removes the table entirely, including structure and data.
• Q.822
Question:
What is the difference between INNER JOIN and LEFT JOIN?
Answer:
• INNER JOIN: Returns only the rows where there is a match in both tables. If there is no
match, those rows are excluded.
• LEFT JOIN (or LEFT OUTER JOIN): Returns all rows from the left table, and the
matching rows from the right table. If no match exists, the result is NULL on the right side.
Example:
-- INNER JOIN
SELECT a.id, b.name
FROM users a
INNER JOIN orders b ON a.id = b.user_id;
-- LEFT JOIN
SELECT a.id, b.name
FROM users a
LEFT JOIN orders b ON a.id = b.user_id;
• Q.823
Question:
How do you optimize a slow SQL query?
Answer:
To optimize a slow SQL query:
• Indexing: Create indexes on columns used in WHERE, JOIN, and ORDER BY clauses.
• Query Refactoring: Rewrite complex subqueries as joins or vice versa.
• Limit Data: Use LIMIT or TOP to return only necessary rows.
• Analyze Execution Plan: Use EXPLAIN (PostgreSQL) or EXPLAIN PLAN (Oracle) to check
how SQL is executed and identify bottlenecks.
• Avoid SELECT *: Select only the columns you need.
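The steps above can be sketched as a short workflow: inspect the plan, then add an index that matches the filter and sort. The orders table, its columns, and the index name are hypothetical and only serve the illustration.

```sql
-- 1. Inspect the execution plan (EXPLAIN works in both PostgreSQL and MySQL).
EXPLAIN
SELECT order_id, order_date          -- only the columns needed, not SELECT *
FROM orders
WHERE customer_id = 42
ORDER BY order_date DESC
LIMIT 20;                            -- return only the rows the caller needs

-- 2. If the plan shows a full table scan, an index covering the filter column
--    followed by the sort column may let the database avoid it.
CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date);

-- 3. Run the EXPLAIN again and compare the plans before keeping the index.
```

Whether the index is actually used depends on table size and data distribution, so the second EXPLAIN is the real check.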
• Q.824
Question:
What is a JOIN in SQL, and can you name different types of joins?
Answer:
A JOIN is used to combine rows from two or more tables based on a related column between
them.
Types of Joins:
• INNER JOIN: Returns rows when there is a match in both tables.
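• LEFT JOIN (LEFT OUTER JOIN): Returns all rows from the left table and the matched rows
from the right table; right-side columns are NULL where there is no match.
• RIGHT JOIN (RIGHT OUTER JOIN): Returns all rows from the right table and the matched
rows from the left table; left-side columns are NULL where there is no match.
• FULL JOIN (FULL OUTER JOIN): Returns all rows from both tables, with NULLs filling the
columns of whichever side has no match.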
• CROSS JOIN: Returns the Cartesian product of two tables (all combinations of rows).
• Q.825
Question:
What is the purpose of the GROUP BY clause in SQL?
Answer:
The GROUP BY clause is used to group rows that have the same values in specified columns
into summary rows, typically for aggregation. It is often used with aggregate functions like
COUNT(), SUM(), AVG(), MIN(), and MAX() to perform calculations on each group.
Example:
SELECT department, COUNT(*) AS total_employees
FROM employees
GROUP BY department;
• Q.826
Question:
What is the difference between HAVING and WHERE clauses in SQL?
Answer:
• WHERE: Filters rows before any grouping is done. It is used for filtering individual rows
based on conditions.
• HAVING: Filters groups after the GROUP BY operation. It is used to filter the results of
aggregated data.
Example:
-- WHERE (before grouping)
SELECT employee_id, salary
FROM employees
WHERE salary > 50000;
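A HAVING counterpart to the WHERE example might look like the sketch below. It assumes the employees table also has a department_id column, and the thresholds are arbitrary.

```sql
-- HAVING (after grouping)
SELECT department_id,
       COUNT(*)    AS total_employees,
       AVG(salary) AS avg_salary
FROM employees
WHERE salary > 0                 -- row-level filter, applied before grouping
GROUP BY department_id
HAVING AVG(salary) > 50000;      -- group-level filter, applied to the aggregated result
```

An aggregate such as AVG(salary) cannot appear in WHERE, which is why the group-level condition has to live in HAVING.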
• A JOIN combines data from two or more tables based on a related column, returning a
new result set with columns from both tables.
Difference:
• Subqueries are useful for filtering or performing calculations, while joins are used to
combine data from multiple tables into a single result set.
Example:
-- Subquery Example
SELECT employee_id, first_name
FROM employees
WHERE department_id = (SELECT department_id FROM departments WHERE name = 'Sales');
-- JOIN Example
SELECT e.employee_id, e.first_name, d.name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE d.name = 'Sales';
• Q.828
What is the purpose of INDEX in SQL?
Answer:
An index is used to improve the speed of data retrieval operations on a table. It provides a
quick way to look up rows based on the values of one or more columns. However, indexes
can slow down write operations like INSERT, UPDATE, and DELETE since the index must also
be updated.
Key Points:
• Indexes are used to speed up queries that use WHERE, ORDER BY, or JOIN operations.
• They use additional disk space and affect performance for INSERT, UPDATE, and DELETE.
• Q.829
Question:
What are transactions in SQL, and why are they important?
Answer:
A transaction is a sequence of one or more SQL operations that are executed as a single unit.
It ensures that the database remains in a consistent state, even in the case of errors.
ACID Properties of Transactions:
• Atomicity: Ensures that all operations within the transaction are completed; if not, the
transaction is rolled back.
• Consistency: Ensures the database transitions from one consistent state to another.
• Isolation: Ensures that transactions are isolated from each other.
• Durability: Ensures that once a transaction is committed, it is permanently stored.
Example:
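BEGIN TRANSACTION;
-- One or more statements that must succeed or fail together go here
COMMIT; -- or ROLLBACK; to discard the changes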
• Q.830
Question:
What are the differences between TRUNCATE, DELETE, and DROP?
Answer:
• DELETE: Removes specific rows from a table based on a condition. It can be rolled back
and does not remove the table structure.
• Example: DELETE FROM employees WHERE id = 5;
• TRUNCATE: Removes all rows from a table. It is faster than DELETE as it does not log
individual row deletions. It cannot be rolled back in MySQL or Oracle, but it can be inside a
transaction in PostgreSQL and SQL Server.
• Example: TRUNCATE TABLE employees;
• DROP: Completely removes a table, including its data, structure, and any associated
constraints and indexes. In MySQL and Oracle this cannot be rolled back; in PostgreSQL and
SQL Server a DROP issued inside a transaction can still be rolled back.
• Example: DROP TABLE employees;
• Q.831
You are given a table sales with information about sales transactions. Write an SQL query
to identify the top 3 products sold by revenue (i.e., quantity * price) for each region. Display
the region, product name, total revenue, and rank of the product.
Explanation
To solve this, first calculate the total revenue for each product by multiplying quantity and
price. Then, use window functions like RANK() or ROW_NUMBER() to rank the products
within each region based on the total revenue. Finally, filter the results to show only the top 3
products per region.
Learnings
• Window functions: Using RANK() or ROW_NUMBER() allows ranking of items within
groups (here, regions).
• Aggregation: SUM(quantity * price) is used to calculate total revenue for each product.
• Partitioning: The PARTITION BY clause in window functions helps in calculating ranks
separately for each region.
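• Filtering: Filtering on the rank (<= 3) in an outer query keeps only the top 3 products for
each region. The filter cannot sit in the same SELECT that computes RANK(), because
WHERE is evaluated before window functions.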
Solutions
PostgreSQL solution:
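WITH product_sales AS (
SELECT region,
product_name,
SUM(quantity * price) AS total_revenue
FROM sales
GROUP BY region, product_name
),
ranked_sales AS (
-- Rank in a separate CTE so the outer query can filter on the result
SELECT region,
product_name,
total_revenue,
RANK() OVER (PARTITION BY region ORDER BY total_revenue DESC) AS revenue_rank
FROM product_sales
)
SELECT region,
product_name,
total_revenue,
revenue_rank
FROM ranked_sales
WHERE revenue_rank <= 3
ORDER BY region, revenue_rank;
MySQL solution:
WITH product_sales AS (
SELECT region,
product_name,
SUM(quantity * price) AS total_revenue
FROM sales
GROUP BY region, product_name
),
ranked_sales AS (
-- Rank in a separate CTE so the outer query can filter on the result
SELECT region,
product_name,
total_revenue,
RANK() OVER (PARTITION BY region ORDER BY total_revenue DESC) AS revenue_rank
FROM product_sales
)
SELECT region,
product_name,
total_revenue,
revenue_rank
FROM ranked_sales
WHERE revenue_rank <= 3
ORDER BY region, revenue_rank;
Both PostgreSQL and MySQL solutions are identical in this case, as they both support the
RANK() window function and CTEs (Common Table Expressions); MySQL needs version 8.0
or later. The rank is computed in a second CTE because a window-function alias cannot be
referenced in the WHERE clause of the same SELECT, and the column is named revenue_rank
because RANK is a reserved word in MySQL 8.0.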
• Q.832
You are given a sales table with product sales data. Write an SQL query to calculate the 7-
day moving average of sales (quantity * price) for each product.
Explanation
To solve this, calculate the 7-day moving average of sales for each product. The moving
average should be based on the total revenue (quantity * price) over the last 7 days, including
the current day. Use window functions such as AVG() with the ROWS BETWEEN clause to
define the moving window.
Learnings
• Window functions: Using AVG() with ROWS BETWEEN enables calculating moving
averages within a defined window.
• ROWS BETWEEN: Defines the moving window for each row, where 6 PRECEDING AND
CURRENT ROW covers the current row and the 6 previous rows, totaling 7 rows. This equals
7 days only when each product has exactly one row per day.
• PARTITION BY: Ensures that the calculation is done separately for each product.
• Date ordering: The window is ordered by sale_date to ensure the moving average
calculation respects the time series.
Solutions
PostgreSQL solution:
SELECT product_id,
sale_date,
AVG(quantity * price) OVER (
PARTITION BY product_id
ORDER BY sale_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_sales
FROM sales
ORDER BY product_id, sale_date;
MySQL solution:
SELECT product_id,
sale_date,
AVG(quantity * price) OVER (
PARTITION BY product_id
ORDER BY sale_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_sales
FROM sales
ORDER BY product_id, sale_date;
Both PostgreSQL and MySQL solutions are identical in this case, as they both support
window functions like AVG() with ROWS BETWEEN.
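A ROWS frame counts rows rather than calendar days, so it matches a 7-day window only when every product has exactly one row per day. The sketch below is one way to make the window date-based in PostgreSQL 11 or later. It assumes sale_date is a DATE column; daily_sales and daily_revenue are names introduced here for illustration.

```sql
WITH daily_sales AS (
    -- Collapse multiple sales on the same day into one row per product and day
    SELECT product_id,
           sale_date,
           SUM(quantity * price) AS daily_revenue
    FROM sales
    GROUP BY product_id, sale_date
)
SELECT product_id,
       sale_date,
       AVG(daily_revenue) OVER (
           PARTITION BY product_id
           ORDER BY sale_date
           -- RANGE measures the frame in days, not in rows
           RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW
       ) AS moving_avg_sales
FROM daily_sales
ORDER BY product_id, sale_date;
```

Days with no sales have no row, so the average is taken over the days that do have sales; treating missing days as zero would require joining against a calendar table first.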
• Q.833
You are given a table order_details with predicted and actual delivery times for orders.
Write a query to identify the delivery partners who have the most delayed orders (orders
where actual delivery time is later than the predicted delivery time).
Explanation
To solve this, first identify delayed orders by comparing actual_time with predicted_time
(i.e., actual_time > predicted_time). Then, group the results by del_partner and count
the number of delayed orders for each delivery partner. Finally, order the results to show the
delivery partners with the most delayed orders.
Learnings
• Simple comparisons: Using actual_time > predicted_time allows filtering delayed
orders directly.
• Aggregation: The use of COUNT(*) helps to count the number of delayed orders per
delivery partner.
• GROUP BY: Ensures results are grouped by delivery partner, so the delayed order count is
calculated for each.
• Ordering: ORDER BY delayed_orders DESC sorts the delivery partners based on the
number of delayed orders, from most to least.
Solutions
PostgreSQL solution:
SELECT del_partner,
COUNT(*) AS delayed_orders
FROM order_details
WHERE actual_time > predicted_time
GROUP BY del_partner
ORDER BY delayed_orders DESC;
MySQL solution:
SELECT del_partner,
COUNT(*) AS delayed_orders
FROM order_details
WHERE actual_time > predicted_time
GROUP BY del_partner
ORDER BY delayed_orders DESC;
Both PostgreSQL and MySQL solutions are identical, as they both support the basic SQL
syntax for aggregation and comparison.
• Q.834
Write a query to calculate the median order value for each customer. If there is an even
number of orders, return the average of the two middle values.
Explanation
To calculate the median for each customer, you need to:
• Sort the orders by order_value for each customer.
• Assign row numbers using the ROW_NUMBER() window function to position each order.
• Count the total number of orders per customer using COUNT(*) OVER (PARTITION BY
customer_id).
• For an odd number of orders, select the middle value.
• For an even number of orders, calculate the average of the two middle values.
Learnings
• Window functions: ROW_NUMBER() is used to assign a unique number to each row, and
COUNT() calculates the total number of orders for each customer.
• Partitioning: The query partitions the data by customer_id so that each customer’s orders
are handled independently.
• Median calculation: The median is computed by identifying the middle value(s), with
special handling for even numbers of rows using AVG().
• Order handling: The query sorts the orders for each customer by order_value to
calculate the median in a properly ordered sequence.
Solutions
PostgreSQL solution:
WITH ordered_orders AS (
SELECT customer_id, order_value,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_value) AS row_num,
COUNT(*) OVER (PARTITION BY customer_id) AS total_orders
FROM orders
)
SELECT customer_id,
AVG(order_value) AS median_order_value
FROM ordered_orders
-- Integer division: both expressions name the middle row for odd counts
-- and the two middle rows for even counts
WHERE row_num IN ((total_orders + 1) / 2, (total_orders + 2) / 2)
GROUP BY customer_id;
MySQL solution:
WITH ordered_orders AS (
SELECT customer_id, order_value,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_value) AS row_num,
COUNT(*) OVER (PARTITION BY customer_id) AS total_orders
FROM orders
)
SELECT customer_id,
AVG(order_value) AS median_order_value
FROM ordered_orders
-- DIV is integer division in MySQL; the / operator would return a decimal
WHERE row_num IN ((total_orders + 1) DIV 2, (total_orders + 2) DIV 2)
GROUP BY customer_id;
The two solutions differ only in how integer division is written: in PostgreSQL, / truncates
when both operands are integers, while MySQL needs DIV. Both rely on window functions
like ROW_NUMBER() and COUNT(), which MySQL supports from version 8.0.
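PostgreSQL also offers an ordered-set aggregate that computes the median directly, which can serve as a cross-check for the row-number approach. The sketch below assumes the same orders table; it does not run on MySQL, which has no PERCENTILE_CONT aggregate.

```sql
SELECT customer_id,
       -- 0.5 is the 50th percentile; for an even number of rows the result is
       -- interpolated between the two middle values, i.e. their average
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY order_value) AS median_order_value
FROM orders
GROUP BY customer_id;
```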
• Q.835
You are given a purchase_history table with customer purchases. Write a query to identify
customers who have made purchases in the last 30 days but not in the last 7 days.
Explanation
To solve this, you need to:
• Filter customers who have made purchases in the last 30 days but not in the last 7 days.
• You can use a combination of NOT EXISTS or LEFT JOIN to check if there are any
purchases in the last 7 days for each customer.
• Ensure that customers who made purchases within the 30-day window but not in the last 7
days are selected.
Learnings
• Date manipulation: CURRENT_DATE - INTERVAL is used to calculate the last 30 days and
7 days from the current date.
• Filtering with NOT EXISTS: Used to exclude customers who made purchases in the last 7
days.
• Subqueries: The NOT EXISTS clause ensures that there are no records for the same
customer in the last 7 days.
• Grouping and aggregation: GROUP BY ensures that we check for unique customers, and
HAVING filters out those who made purchases in the last 7 days.
Solutions
PostgreSQL solution:
SELECT p.customer_id
FROM purchase_history p
WHERE p.purchase_date >= CURRENT_DATE - INTERVAL '30 days'
AND p.purchase_date < CURRENT_DATE - INTERVAL '7 days'
GROUP BY p.customer_id
HAVING NOT EXISTS (
SELECT 1
FROM purchase_history recent
-- Distinct aliases are required so the subquery correlates with the outer customer
WHERE recent.customer_id = p.customer_id
AND recent.purchase_date >= CURRENT_DATE - INTERVAL '7 days'
);
MySQL solution:
SELECT p.customer_id
FROM purchase_history p
WHERE p.purchase_date >= CURDATE() - INTERVAL 30 DAY
AND p.purchase_date < CURDATE() - INTERVAL 7 DAY
GROUP BY p.customer_id
HAVING NOT EXISTS (
SELECT 1
FROM purchase_history recent
-- Distinct aliases are required so the subquery correlates with the outer customer
WHERE recent.customer_id = p.customer_id
AND recent.purchase_date >= CURDATE() - INTERVAL 7 DAY
);
The syntax for date manipulation differs slightly between PostgreSQL (CURRENT_DATE -
INTERVAL '30 days') and MySQL (CURDATE() - INTERVAL 30 DAY), but the logic
remains the same.
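The same requirement can also be phrased in terms of each customer's most recent purchase, which avoids the subquery entirely. This is an alternative sketch in PostgreSQL syntax, not the book's method; in MySQL the date expressions would become CURDATE() - INTERVAL 30 DAY and CURDATE() - INTERVAL 7 DAY.

```sql
SELECT customer_id
FROM purchase_history
GROUP BY customer_id
-- The latest purchase falls inside the last 30 days but before the last 7 days
HAVING MAX(purchase_date) >= CURRENT_DATE - INTERVAL '30 days'
   AND MAX(purchase_date) <  CURRENT_DATE - INTERVAL '7 days';
```

If the most recent purchase is older than 7 days, no purchase can exist in the last 7 days, so the two formulations select the same customers.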
• Q.836
Given the sales table, calculate the percentage change in sales for each product between Q1
(January to March) and Q2 (April to June) of a given year.
Explanation
To solve this, you need to:
• Calculate sales for Q1 and Q2: Use conditional aggregation to sum the sales for Q1
(January to March) and Q2 (April to June).
• Formula for percentage change: Percentage Change = ((Q2 Sales - Q1 Sales) / Q1 Sales) * 100
Learnings
• Conditional Aggregation: Using CASE WHEN within SUM() allows you to conditionally
aggregate values for different periods (Q1 and Q2).
• Date functions: The EXTRACT(MONTH FROM sale_date) function is used to filter data
based on the month part of sale_date.
• Percentage Change Calculation: Proper handling of the formula (Q2_sales -
Q1_sales) / Q1_sales * 100 is essential to get the correct result.
• GROUP BY: Grouping by product_id ensures that the calculation is done for each
individual product.
Solutions
PostgreSQL solution:
SELECT product_id,
SUM(CASE WHEN EXTRACT(MONTH FROM sale_date) BETWEEN 1 AND 3 THEN quantity * price ELSE 0 END) AS Q1_sales,
SUM(CASE WHEN EXTRACT(MONTH FROM sale_date) BETWEEN 4 AND 6 THEN quantity * price ELSE 0 END) AS Q2_sales,
100.0 * (SUM(CASE WHEN EXTRACT(MONTH FROM sale_date) BETWEEN 4 AND 6 THEN quantity * price ELSE 0 END) -
SUM(CASE WHEN EXTRACT(MONTH FROM sale_date) BETWEEN 1 AND 3 THEN quantity * price ELSE 0 END)) /
NULLIF(SUM(CASE WHEN EXTRACT(MONTH FROM sale_date) BETWEEN 1 AND 3 THEN quantity * price ELSE 0 END), 0) AS percentage_change
FROM sales
WHERE EXTRACT(YEAR FROM sale_date) = 2024 -- example year; substitute the year being analyzed
GROUP BY product_id;
MySQL solution:
SELECT product_id,
SUM(CASE WHEN MONTH(sale_date) BETWEEN 1 AND 3 THEN quantity * price ELSE 0 END) AS Q1_sales,
SUM(CASE WHEN MONTH(sale_date) BETWEEN 4 AND 6 THEN quantity * price ELSE 0 END) AS Q2_sales,
100.0 * (SUM(CASE WHEN MONTH(sale_date) BETWEEN 4 AND 6 THEN quantity * price ELSE 0 END) -
SUM(CASE WHEN MONTH(sale_date) BETWEEN 1 AND 3 THEN quantity * price ELSE 0 END)) /
NULLIF(SUM(CASE WHEN MONTH(sale_date) BETWEEN 1 AND 3 THEN quantity * price ELSE 0 END), 0) AS percentage_change
FROM sales
WHERE YEAR(sale_date) = 2024 -- example year; substitute the year being analyzed
GROUP BY product_id;
• NULLIF: Returns NULL when Q1 sales is zero, so the division yields NULL instead of raising
a division-by-zero error.
• Date function differences: PostgreSQL uses EXTRACT(MONTH FROM sale_date) while
MySQL uses MONTH(sale_date) to extract the month.
Answer:
• Data Warehousing:
• Purpose: Primarily designed for structured data and optimized for fast querying and
analysis.
• Data Structure: Highly structured (i.e., relational data models), with data being cleansed,
transformed, and organized into schemas (e.g., star schema, snowflake schema).
• Use Case: Best for analytical applications where you need to perform complex queries and
reporting, often on historical data. Examples include business intelligence and dashboarding.
• Data Lakes:
• Purpose: Designed to store vast amounts of raw, unstructured, semi-structured, and
structured data.
• Data Structure: Can handle unstructured data (e.g., text files, images) and semi-structured
data (e.g., JSON, XML) alongside structured files (e.g., Parquet), and the schema is often
applied at the time of reading (schema-on-read).
• Use Case: Best for storing large volumes of diverse data types that may not yet be fully
understood, including big data, logs, or data from IoT devices. Commonly used in machine
learning, deep learning, and predictive analytics where data might need extensive
transformation and processing.
• Key Differences:
• Structure: Data warehouses use a predefined schema (schema-on-write) while data lakes
store raw data and apply schema later (schema-on-read).
• Flexibility: Data lakes are more flexible in terms of the types of data they store, while data
warehouses are rigid but optimized for fast querying.
• Speed: Data warehouses are faster for querying structured data, while data lakes can
handle much larger data sets but may require more time for transformation and analysis.
• Q.838
What Are the Advantages and Challenges of Using Partitioning in SQL Databases?
Answer:
Advantages of Partitioning:
• Improved Query Performance: Partitioning can significantly improve query performance
by limiting the amount of data scanned during query execution, especially when queries filter
on partitioned columns (e.g., date).
• Example: A large table partitioned by order_date can make querying recent data faster
because only the relevant partitions are scanned.
• Manageability: Partitioning allows for easier management of large tables. You can drop or
archive older partitions without affecting the rest of the table, helping with data retention
policies.
• Parallel Processing: Partitioning allows parallel query execution by processing different
partitions simultaneously, which speeds up query processing.
• Faster Backup and Restore: Since data is stored in separate partitions, you can back up or
restore individual partitions instead of the entire table, which can be more efficient for very
large datasets.
Challenges of Partitioning:
• Overhead of Partition Management: Deciding how to partition the data, such as by range
or list, and managing partitioned tables can require additional administrative overhead.
• Example: Deciding between RANGE (e.g., by date) or LIST (e.g., by region) partitioning can
have significant impacts on performance and maintenance.
• Suboptimal Querying: If queries do not filter on partitioned columns, partitioning can
result in poor performance because the database cannot prune partitions and must scan
every one of them.
• Increased Complexity: As partitions grow, the logic required for loading, querying, and
maintaining them becomes more complex, especially when dealing with partition boundaries
and managing stale partitions.
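To make the trade-offs above concrete, here is a minimal sketch of RANGE partitioning by order_date using PostgreSQL's declarative partitioning (version 10 or later). The column list and partition names are assumptions chosen for illustration; MySQL uses a different PARTITION BY syntax.

```sql
CREATE TABLE orders (
    order_id    BIGINT,
    customer_id INT,
    order_date  DATE NOT NULL,
    amount      NUMERIC(10, 2)
) PARTITION BY RANGE (order_date);

-- One partition per year; the upper bound is exclusive
CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE orders_2025 PARTITION OF orders
    FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

-- Filters on the partition key let the planner skip orders_2024 entirely
SELECT COUNT(*)
FROM orders
WHERE order_date >= '2025-03-01' AND order_date < '2025-04-01';

-- Retiring old data is a metadata operation rather than a large DELETE
DROP TABLE orders_2024;
```

A query that filters only on customer_id would still have to visit every partition, which is the suboptimal case described in the challenges.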
The GROUP BY clause is used to group rows that have the same values into summary rows. It
is often used in conjunction with aggregate functions like COUNT(), SUM(), AVG(), and MAX()
to compute aggregate statistics for each group.
For machine learning, GROUP BY can be used to summarize data, such as calculating the
average value of a feature for each class label.
Example:
SELECT label, AVG(feature1), AVG(feature2)
FROM data_points
GROUP BY label;
• Q.843
Question:
How would you handle missing values in a dataset using SQL?
Answer:
There are several approaches to handle missing values in a dataset using SQL:
• Remove rows with missing values using WHERE clause.
• Replace missing values with a default value (e.g., mean, median, or mode) using
COALESCE().
Examples:
• Remove rows with missing values:
SELECT * FROM data_points WHERE feature1_value IS NOT NULL;
• Replace missing values with the mean:
SELECT
COALESCE(feature1_value, (SELECT AVG(feature1_value) FROM data_points)) AS feature1_value
FROM data_points;
• Q.844
Question:
How would you perform a join between multiple tables in SQL to combine training features
and labels?
Answer:
To combine multiple tables (such as features and labels) for training, you would typically use
an INNER JOIN or LEFT JOIN based on a common key (e.g., dataset_id).
Example:
SELECT f.feature1_value, f.feature2_value, l.label_value
FROM features_data f
JOIN labels l ON f.dataset_id = l.dataset_id
WHERE f.dataset_id = 1;
• Q.845
Question:
How would you calculate the correlation coefficient between two features in a dataset using
SQL?
Answer:
You can calculate the Pearson correlation coefficient using SQL by computing the
covariance of two features divided by the product of their standard deviations.
Example:
SELECT
(SUM(feature1_value * feature2_value) - SUM(feature1_value) * SUM(feature2_value) / COUNT(*)) /
(SQRT((SUM(POWER(feature1_value, 2)) - POWER(SUM(feature1_value), 2) / COUNT(*)) *
(SUM(POWER(feature2_value, 2)) - POWER(SUM(feature2_value), 2) / COUNT(*)))) AS correlation
FROM data_points;
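In PostgreSQL the same Pearson coefficient is available as a built-in aggregate, which is handy for checking the manual formula. The sketch below assumes the same data_points table; MySQL has no CORR() function, so the manual formula is still needed there.

```sql
-- CORR(Y, X) returns the Pearson correlation coefficient of the two columns,
-- ignoring rows where either value is NULL
SELECT CORR(feature1_value, feature2_value) AS correlation
FROM data_points;
```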
• Q.846
Question:
How would you implement an INSERT operation in SQL to add new training data?
Answer:
You would use the INSERT INTO statement to add new data to a table.
Example:
INSERT INTO data_points (dataset_id, feature1_value, feature2_value, label_value)
VALUES (1, 3.4, 5.1, 0);
• Q.847
Question:
How would you efficiently handle large datasets in SQL for machine learning purposes?
Answer:
To handle large datasets efficiently:
• Indexing: Create indexes on frequently queried columns, such as feature_id, label_id,
or any foreign key.
• Partitioning: Split large tables into smaller, manageable partitions based on certain
column values (e.g., DATE).
• Batch Processing: Query data in smaller batches instead of loading the entire dataset at
once.
• Use Aggregate Functions: Instead of returning all rows, aggregate data using functions
like AVG(), SUM(), etc., to reduce the dataset size.
• Q.848
Question:
How would you calculate the performance metrics (e.g., accuracy, precision, recall) using
SQL on a model's predictions?
Answer:
You can calculate performance metrics by comparing the predicted values against the actual
values from the test set.
Example (Accuracy):
SELECT
-- Multiply by 1.0 so the division is not truncated to an integer (e.g., in PostgreSQL)
COUNT(CASE WHEN predicted_label = actual_label THEN 1 END) * 1.0 / COUNT(*) AS accuracy
FROM predictions;
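Precision and recall can be computed from the same predictions table with conditional aggregation. This sketch assumes binary labels where 1 marks the positive class; precision_score and recall_score are names chosen here for illustration.

```sql
SELECT
    -- Precision: true positives / all predicted positives
    SUM(CASE WHEN predicted_label = 1 AND actual_label = 1 THEN 1 ELSE 0 END) * 1.0
        / NULLIF(SUM(CASE WHEN predicted_label = 1 THEN 1 ELSE 0 END), 0) AS precision_score,
    -- Recall: true positives / all actual positives
    SUM(CASE WHEN predicted_label = 1 AND actual_label = 1 THEN 1 ELSE 0 END) * 1.0
        / NULLIF(SUM(CASE WHEN actual_label = 1 THEN 1 ELSE 0 END), 0) AS recall_score
FROM predictions;
```

NULLIF turns a zero denominator into NULL, so a model that never predicts the positive class yields a NULL precision instead of an error.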
• Q.849
Question:
How do you create a schema for storing machine learning model metadata, such as training
parameters and performance metrics?
Answer:
To store model metadata, you would design a schema with tables for:
• models: Stores information about the model (e.g., model_id, name, type).
• training_data: Stores training details like dataset used, hyperparameters, and timestamp.
• metrics: Stores performance metrics for each model (e.g., accuracy, precision, recall).
Example:
CREATE TABLE models (
model_id INT PRIMARY KEY,
name VARCHAR(100),
model_type VARCHAR(50),
created_at TIMESTAMP
);
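The other two tables named in the answer could be sketched along the same lines. Every column below is an assumption based on the descriptions above (dataset used, hyperparameters, timestamp, and per-model metrics), not a definition taken from the original schema.

```sql
CREATE TABLE training_data (
    training_id     INT PRIMARY KEY,
    model_id        INT NOT NULL,
    dataset_name    VARCHAR(200),
    hyperparameters TEXT,              -- e.g., a serialized JSON document
    trained_at      TIMESTAMP,
    FOREIGN KEY (model_id) REFERENCES models (model_id)
);

CREATE TABLE metrics (
    metric_id    INT PRIMARY KEY,
    model_id     INT NOT NULL,
    metric_name  VARCHAR(50),          -- 'accuracy', 'precision', 'recall', ...
    metric_value DECIMAL(10, 6),
    recorded_at  TIMESTAMP,
    FOREIGN KEY (model_id) REFERENCES models (model_id)
);
```

Storing one metric per row keeps the schema stable when new metrics are introduced, at the cost of a pivot when several metrics are reported side by side.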
Backend Developer
• Q.851
What is the difference between JOIN and UNION in SQL?
Answer:
• JOIN: Combines rows from two or more tables based on a related column (e.g., primary
key/foreign key). It can be an INNER JOIN, LEFT JOIN, RIGHT JOIN, or FULL
OUTER JOIN.
• UNION: Combines the result sets of two or more SELECT queries. It removes duplicate
rows by default, whereas UNION ALL includes all rows.
Example:
-- JOIN Example
SELECT a.name, b.department
FROM employees a
INNER JOIN departments b ON a.department_id = b.department_id;
-- UNION Example
SELECT name FROM employees
UNION
SELECT name FROM contractors;
• Q.852
How do you handle database migrations in a production environment?
Answer:
• Use migration tools like Flyway, Liquibase, or built-in database migration frameworks in
backend frameworks (e.g., Django's migrations, Rails ActiveRecord migrations).
• Version control: Ensure each migration is versioned and can be applied in sequence.
• Backups: Always take a backup of the database before running migrations.
• Testing: Test migrations on a staging environment before applying them in production.
• Rollback: Ensure that migrations are reversible, or create a manual rollback plan if needed.
• Q.853
What are transactions in SQL, and why are they important for backend development?
Answer:
A transaction is a sequence of SQL operations executed as a single unit, ensuring database
consistency. Transactions are crucial for maintaining data integrity in the event of errors or
system crashes.
ACID Properties:
• Atomicity: Ensures all operations in a transaction are completed, or none are.
• Consistency: The database starts and ends in a consistent state.
• Isolation: Ensures transactions do not interfere with each other.
• Durability: Once committed, the changes are permanent.
Example:
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
COMMIT;
• Q.854
How would you create an index on a table, and when should you use it?
Answer:
An index is created to speed up queries by reducing the time taken to search for specific
rows.
Example:
CREATE INDEX idx_employee_name ON employees (name);
When to use:
• For columns frequently used in WHERE, JOIN, ORDER BY, or GROUP BY clauses.
• On primary/foreign keys for faster lookup.
When NOT to use:
• Avoid indexing columns that are updated frequently as it can slow down INSERT, UPDATE,
and DELETE operations.
• Q.855
What is Normalization in SQL, and why is it important?
Answer:
Normalization is the process of organizing a database to reduce redundancy and dependency
by dividing large tables into smaller tables and establishing relationships. The goal is to avoid
data anomalies.
Normal Forms:
• 1NF: Eliminate repeating groups of columns, and ensure that each field contains only atomic values.
• 2NF: Eliminate partial dependency (each non-key attribute must depend on the entire
primary key).
• 3NF: Eliminate transitive dependency (non-key attributes should depend only on the
primary key).
Why Important:
• Reduces data redundancy.
• Ensures data integrity.
• Simplifies updates, deletions, and insertions.
• Q.856
Question:
How would you prevent SQL Injection in a backend system?
Answer:
• Use Prepared Statements: Ensure that user input is treated as data, not executable code.
• Input Validation: Validate and sanitize all user inputs.
• Stored Procedures: Use parameterized queries and stored procedures that don’t
concatenate strings directly.
• Least Privilege: Ensure that the database user account has the least privileges necessary
for the task.
Example (Prepared Statement):
SELECT * FROM users WHERE username = ? AND password = ?;
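Most applications prepare statements through their database driver, but the idea can also be shown in plain SQL. The sketch below uses PostgreSQL's PREPARE and EXECUTE; the statement name find_user and the user_id column are assumptions for illustration.

```sql
-- The query text is parsed once, with $1 as a typed placeholder
PREPARE find_user (text) AS
    SELECT user_id, username
    FROM users
    WHERE username = $1;

-- The argument is bound as data and is never spliced into the SQL text
EXECUTE find_user('alice');

-- A classic injection payload is simply compared as a literal string
EXECUTE find_user('x'' OR ''1''=''1');

DEALLOCATE find_user;
```

The second EXECUTE looks for a user whose name is literally x' OR '1'='1, so the OR condition is never evaluated as SQL.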
• Q.857
Question:
What are foreign keys, and how do they ensure referential integrity?
Answer:
A foreign key is a column (or set of columns) in one table that refers to the primary key in
another table. Foreign keys ensure that relationships between tables are valid by enforcing
referential integrity.
Example:
Question:
How would you implement pagination in SQL for large datasets?
Answer:
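Pagination is implemented using LIMIT and OFFSET in SQL to retrieve a subset of rows.
Pair them with ORDER BY on a unique column; without it the row order, and therefore the
content of each page, is not guaranteed.
Example (for MySQL/PostgreSQL):
-- Third page of 10 rows, ordered by a unique column such as employee_id
SELECT * FROM employees ORDER BY employee_id LIMIT 10 OFFSET 20;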
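With OFFSET the database still reads and discards all skipped rows, so deep pages get slower as the offset grows. For large datasets a common alternative is keyset (seek) pagination, sketched below under the assumption that employees has a unique, indexed employee_id column and an employee_name column.

```sql
-- First page
SELECT employee_id, employee_name
FROM employees
ORDER BY employee_id
LIMIT 10;

-- Next page: pass in the last employee_id returned by the previous page (here 10)
SELECT employee_id, employee_name
FROM employees
WHERE employee_id > 10
ORDER BY employee_id
LIMIT 10;
```

The trade-off is that keyset pagination only moves page by page; jumping straight to an arbitrary page number still requires OFFSET.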
• Q.859
What is the difference between a clustered and non-clustered index?
Answer:
• Clustered Index: The data rows are stored in the table in the order of the index. A table
can have only one clustered index.
• Non-clustered Index: The index is stored separately from the table, and it contains a
pointer to the actual data. A table can have multiple non-clustered indexes.
• Q.860
Question:
How do you implement data integrity in SQL?
Answer:
Data integrity is ensured by using:
• Primary Keys: Ensures uniqueness for each record.
• Foreign Keys: Ensures valid relationships between tables.
• Check Constraints: Ensures that values in a column meet certain conditions.
• Not Null Constraints: Ensures that a column cannot have NULL values.
• Unique Constraints: Ensures all values in a column are unique.
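The five constraint types listed above can all appear in a single table definition. The tables and columns in this sketch are hypothetical and exist only to show the syntax.

```sql
CREATE TABLE departments (
    department_id INT PRIMARY KEY,                       -- primary key
    name          VARCHAR(100) NOT NULL UNIQUE           -- not null + unique
);

CREATE TABLE employees (
    employee_id   INT PRIMARY KEY,                       -- primary key
    email         VARCHAR(255) NOT NULL UNIQUE,          -- not null + unique
    salary        DECIMAL(10, 2) CHECK (salary >= 0),    -- check constraint
    department_id INT,
    FOREIGN KEY (department_id)
        REFERENCES departments (department_id)           -- foreign key
);
```

MySQL parses CHECK constraints in older versions but only enforces them from 8.0.16 onward.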
• Q.861
Question:
Write a query to find the 3rd highest salary in the employees table.
Explanation:
You need to write a query that identifies the 3rd highest salary from the employees table.
Use a subquery along with the DISTINCT keyword to eliminate duplicate salaries, and use
LIMIT to fetch the 3rd highest value.
Learnings:
• Use of DISTINCT to eliminate duplicates.
• Usage of subqueries to rank or filter records.
• Limiting results with LIMIT in SQL.
Solutions
• - PostgreSQL solution
SELECT MIN(salary) AS third_highest_salary
FROM (
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 3
) AS subquery;
• - MySQL solution
SELECT MIN(salary) AS third_highest_salary
FROM (
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 3
) AS subquery;
• Q.862
Question:
Identify employees earning above the average salary within their department.
Explanation:
Write a query to find employees whose salaries are greater than the average salary in their
respective department. A correlated subquery can be used to compare each employee's salary
against the average salary of their department.
VALUES
(1, 'Alice', 1, 60000),
(2, 'Bob', 1, 55000),
(3, 'Charlie', 2, 70000),
(4, 'David', 2, 65000),
(5, 'Eve', 1, 75000),
(6, 'Frank', 2, 60000);
Learnings:
• Use of correlated subqueries to compare row values with aggregated results.
• Use of AVG() to calculate the average salary within a department.
• Filtering records based on the result of the subquery.
Solutions
• - PostgreSQL solution
SELECT employee_name, salary, department_id
FROM employees e1
WHERE salary > (
SELECT AVG(salary)
FROM employees e2
WHERE e1.department_id = e2.department_id
);
• - MySQL solution
SELECT employee_name, salary, department_id
FROM employees e1
WHERE salary > (
SELECT AVG(salary)
FROM employees e2
WHERE e1.department_id = e2.department_id
);
• Q.863
Write a query to find employees who have worked in more than 2 departments.
Explanation
To solve this, you'll need to count the number of distinct departments for each employee and
keep only those who worked in more than 2.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE EmployeeDepartment (
employee_id INT,
department_id INT
);
• - Datasets
INSERT INTO EmployeeDepartment (employee_id, department_id)
VALUES
(1, 101),
(1, 102),
(1, 103),
(2, 101),
(2, 104),
(3, 102);
Learnings
• Using GROUP BY and HAVING clauses for counting occurrences
• Aggregation in SQL
Solutions
• - PostgreSQL solution
SELECT employee_id
FROM EmployeeDepartment
GROUP BY employee_id
HAVING COUNT(DISTINCT department_id) > 2;
• - MySQL solution
SELECT employee_id
FROM EmployeeDepartment
GROUP BY employee_id
HAVING COUNT(DISTINCT department_id) > 2;
• Q.864
Write a query to get the total sales per product for the last 30 days.
Explanation
You will need to filter sales by date and aggregate the sales for each product.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Sales (
sale_id INT,
product_id INT,
sale_date DATE,
sale_amount DECIMAL
);
• - Datasets
INSERT INTO Sales (sale_id, product_id, sale_date, sale_amount)
VALUES
(1, 101, '2025-01-01', 250),
(2, 102, '2025-01-05', 300),
(3, 101, '2025-01-10', 150),
(4, 103, '2025-01-15', 450),
(5, 101, '2025-01-20', 200);
Learnings
• Using WHERE to filter by date
• Aggregation and grouping data in SQL
Solutions
• - PostgreSQL solution
SELECT product_id, SUM(sale_amount) AS total_sales
FROM Sales
WHERE sale_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY product_id;
• - MySQL solution
SELECT product_id, SUM(sale_amount) AS total_sales
FROM Sales
WHERE sale_date >= CURDATE() - INTERVAL 30 DAY
GROUP BY product_id;
• Q.865
Find the employees who do not have any sales in the Sales table.
Explanation
Use a LEFT JOIN to get all employees and then filter for those with no corresponding sales.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Employee (
employee_id INT,
name VARCHAR(100)
);
• - Datasets
INSERT INTO Employee (employee_id, name)
VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie');
• - Sales dataset (some employees may not have sales)
CREATE TABLE Sales (
sale_id INT,
employee_id INT,
sale_amount DECIMAL
);
INSERT INTO Sales (sale_id, employee_id, sale_amount)
VALUES
(1, 1, 500),
(2, 2, 700);
Learnings
• Using LEFT JOIN to identify missing data
• Filtering NULL values
Solutions
• - PostgreSQL solution
SELECT e.name
FROM Employee e
LEFT JOIN Sales s ON e.employee_id = s.employee_id
WHERE s.sale_id IS NULL;
• - MySQL solution
SELECT e.name
FROM Employee e
LEFT JOIN Sales s ON e.employee_id = s.employee_id
WHERE s.sale_id IS NULL;
• Q.866
Write a query to find the total revenue for each day.
Explanation
Group by the date and aggregate the sales amount to get the total revenue per day.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Sales (
sale_id INT,
sale_date DATE,
sale_amount DECIMAL
);
• - Datasets
INSERT INTO Sales (sale_id, sale_date, sale_amount)
VALUES
(1, '2024-01-01', 100),
(2, '2024-01-01', 150),
(3, '2024-01-02', 200);
Learnings
• Date-based aggregation
• Using GROUP BY for daily totals
Solutions
• - PostgreSQL solution
SELECT sale_date, SUM(sale_amount) AS total_revenue
FROM Sales
GROUP BY sale_date;
• - MySQL solution
SELECT sale_date, SUM(sale_amount) AS total_revenue
FROM Sales
GROUP BY sale_date;
• Q.867
Find all customers who have not placed any orders.
Explanation
Use a LEFT JOIN on the customers table and the orders table, filtering for customers with no
corresponding orders.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Customer (
customer_id INT,
customer_name VARCHAR(100)
);
• - Datasets
INSERT INTO Customer (customer_id, customer_name)
VALUES
(1, 'Alice'),
(2, 'Bob');
• - Orders dataset
CREATE TABLE Orders (
order_id INT,
customer_id INT,
order_amount DECIMAL
);
INSERT INTO Orders (order_id, customer_id, order_amount)
VALUES
(1, 1, 500);
Learnings
• Identifying missing relationships using LEFT JOIN
Solutions
• - PostgreSQL solution
SELECT c.customer_name
FROM Customer c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
• - MySQL solution
SELECT c.customer_name
FROM Customer c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
• Q.868
Find the employees who joined after a specific date.
Explanation
You need to filter employees based on their joining date.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Employee (
employee_id INT,
name VARCHAR(100),
join_date DATE
);
• - Datasets
INSERT INTO Employee (employee_id, name, join_date)
VALUES
(1, 'Alice', '2023-01-10'),
(2, 'Bob', '2024-01-05');
Learnings
• Filtering by date condition
Solutions
• - PostgreSQL solution
SELECT name
FROM Employee
WHERE join_date > '2023-01-01';
• - MySQL solution
SELECT name
FROM Employee
WHERE join_date > '2023-01-01';
• Q.869
Find the most recent order placed by each customer.
Explanation
You need to find the most recent order for each customer by grouping the orders and using
MAX() on the order date.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Orders (
order_id INT,
customer_id INT,
order_date DATE,
order_amount DECIMAL
);
• - Datasets
INSERT INTO Orders (order_id, customer_id, order_date, order_amount)
VALUES
(1, 1, '2024-01-01', 300),
(2, 1, '2024-02-10', 500),
(3, 2, '2024-03-05', 700),
(4, 2, '2024-02-15', 200),
(5, 3, '2024-04-01', 400);
Solutions
• - PostgreSQL solution
SELECT customer_id, MAX(order_date) AS most_recent_order_date
FROM Orders
GROUP BY customer_id;
• - MySQL solution
SELECT customer_id, MAX(order_date) AS most_recent_order_date
FROM Orders
GROUP BY customer_id;
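The MAX() query returns only the date of the most recent order. If the whole order row is needed, one option is to rank each customer's orders by date, sketched here with ROW_NUMBER() (available in PostgreSQL and MySQL 8.0+). The alias names ranked and rn are assumptions for illustration.

```sql
-- Rank each customer's orders from newest to oldest, then keep rank 1.
SELECT order_id, customer_id, order_date, order_amount
FROM (
    SELECT o.*,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
    FROM Orders o
) AS ranked
WHERE rn = 1;
-- On the sample data this keeps orders 2, 3, and 5.
```

If a customer has two orders on the same latest date, ROW_NUMBER() keeps one of them arbitrarily; add a tie-breaker such as order_id DESC to the ORDER BY to make the choice deterministic.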
• Q.870
Find the employees who have the same salary.
Explanation
You need to identify employees who share the same salary. This involves grouping by the
salary column and filtering with HAVING to only include salaries that appear more than once.
Learnings
• Using GROUP BY to group rows based on salary
• Filtering with HAVING to get only those salaries that have more than one employee
• Using aggregation functions like COUNT() to identify duplicate values
Solutions
• - PostgreSQL solution
SELECT salary, ARRAY_AGG(name) AS employees
FROM Employee
GROUP BY salary
HAVING COUNT(salary) > 1;
• - MySQL solution
SELECT salary, GROUP_CONCAT(name) AS employees
FROM Employee
GROUP BY salary
HAVING COUNT(salary) > 1;
• Q.871
Find the Nth Highest Salary
Explanation
The task is to find the nth highest salary in the Employees table. There are multiple ways to
approach this problem, such as using OFFSET and LIMIT for simpler queries, or using window
functions like DENSE_RANK() to handle ties in salary rankings. Below are a few approaches to
solve this.
Learnings
• Using ORDER BY to sort the records by salary.
• Using OFFSET and LIMIT to directly access the nth row.
• Using DENSE_RANK() for more flexible ranking in the case of salary ties.
• The importance of understanding window functions for advanced SQL queries.
Solutions
Explanation:
• ORDER BY salary DESC sorts the employees by salary in descending order.
• OFFSET 4-1 LIMIT 1 skips the first 3 rows and retrieves the 4th highest salary (change 4
to any other value for different nth salaries).
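The OFFSET/LIMIT approach described above can be sketched as follows. It assumes the Employees table with a salary column used in this chapter. The literal 3 is n - 1 for n = 4, written out because MySQL does not accept an expression such as 4-1 in OFFSET.

```sql
-- 4th row when salaries are sorted from highest to lowest.
-- Change OFFSET 3 to OFFSET (n - 1) for a different n.
SELECT salary
FROM Employees
ORDER BY salary DESC
LIMIT 1 OFFSET 3;
```

Because this counts rows rather than distinct values, two employees sharing a salary occupy two positions. Use SELECT DISTINCT salary if the nth distinct salary is wanted.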
Explanation:
• DENSE_RANK() assigns a rank to each employee based on their salary in descending order,
without gaps in ranks (if multiple employees have the same salary).
• We then filter the result to get the 4th highest salary by checking where the rank equals 4.
You can change 4 to any desired value for the nth highest salary.
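A sketch of the DENSE_RANK() approach described above, again assuming the Employees table from this chapter. The alias salary_rank is chosen to avoid the reserved word RANK in MySQL 8.0.

```sql
-- Rank salaries from highest to lowest without gaps, then keep rank 4.
SELECT DISTINCT salary
FROM (
    SELECT salary,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
    FROM Employees
) AS ranked
WHERE salary_rank = 4;  -- change 4 to n for the nth highest salary
```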
If you want a strictly unique ranking for each salary (even if some salaries are the same), use
ROW_NUMBER(). This approach ensures that no two employees get the same rank, even if their
salaries are identical.
Solution:
SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM Employees
) AS ranked_employees
WHERE row_num = 4; -- For 4th highest salary, change 4 to n
Explanation:
• ROW_NUMBER() generates a unique sequential number for each row, starting at 1, without
any gaps.
• Similar to the DENSE_RANK() method, we filter the result to get the 4th row based on the
generated row_num.
Comparison of Approaches
• OFFSET/LIMIT: Simple, fast, works well for specific rows. Doesn't handle ties properly (if there are multiple employees with the same salary).
1. Query for Median when the Number of Employees is Odd (Current Data)
WITH SortedSalaries AS (
SELECT salary,
ROW_NUMBER() OVER (ORDER BY salary) AS row_num,
COUNT(*) OVER () AS total_count
FROM Employees
)
SELECT AVG(salary) AS median_salary
FROM SortedSalaries
WHERE row_num IN ((total_count + 1) / 2, total_count / 2 + 1);
Explanation:
• ROW_NUMBER() assigns a unique row number to each salary in ascending order.
• COUNT(*) OVER () calculates the total number of records.
• The AVG(salary) computes the median:
• If the total count is odd, the middle row will be returned.
• If the total count is even, the two middle rows are averaged.
2. Query for Median when the Number of Employees is Even (After Adding
New Record)
Let's add one more record with a salary of 91,000:
INSERT INTO Employees (id, name, salary, department_id, manager_id, hire_date)
VALUES
(10, 'John', 91000, 1, 2, '2022-07-01');
Now the total number of employees is 10, an even number. The median will be the average of
the 5th and 6th highest salaries.
The same query from above will now return the correct median for the updated dataset:
WITH SortedSalaries AS (
SELECT salary,
ROW_NUMBER() OVER (ORDER BY salary) AS row_num,
COUNT(*) OVER () AS total_count
FROM Employees
)
SELECT AVG(salary) AS median_salary
FROM SortedSalaries
WHERE row_num IN ((total_count + 1) / 2, total_count / 2 + 1);
Explanation of Changes:
• After adding the 10th employee, the number of records becomes even.
• The query will return the average of the 5th and 6th highest salaries as the median.
Explanation:
• PERCENTILE_CONT(0.5) computes the 50th percentile (the median).
• WITHIN GROUP (ORDER BY salary) orders the salaries before calculating the percentile.
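The PERCENTILE_CONT approach described above can be sketched as a single statement. This uses PostgreSQL's ordered-set aggregate syntax and assumes the same Employees table. MySQL does not provide PERCENTILE_CONT, so the ROW_NUMBER() query is the portable option there.

```sql
-- PostgreSQL: continuous 50th percentile of salary, i.e. the median.
-- For an even number of rows it interpolates between the two middle values.
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary
FROM Employees;
```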
• Q.873
Find Employee Details Who Have Salary Greater Than Their Manager's Salary
Explanation
This query aims to retrieve employees whose salary is greater than their manager's salary. To
achieve this, you need to perform a self-join on the Employees table, where the employee's
manager_id is matched to the id of the manager. Then, filter the results where the
employee's salary is greater than the manager's salary.
id INT,
name VARCHAR(100) NOT NULL,
salary NUMERIC(10, 2),
department_id INT,
manager_id INT,
hire_date DATE NOT NULL
);
• - Insert data into Employees table
INSERT INTO Employees (id, name, salary, department_id, manager_id, hire_date)
VALUES
(1, 'Alice', 90000, 1, NULL, '2022-01-15'),
(2, 'Bob', 80000, 2, 1, '2022-02-20'),
(3, 'Charlie', 75000, 2, 1, '2022-03-12'),
(4, 'David', 85000, 2, 1, '2022-03-25'),
(5, 'Eve', 95000, 2, 2, '2022-04-01'),
(6, 'Frank', 78000, 2, 2, '2022-04-20'),
(7, 'Grace', 60000, 3, 3, '2022-05-12'),
(8, 'Heidi', 88000, 3, 1, '2022-06-15'),
(9, 'Sam', 89000, 3, 2, '2022-05-01');
Learnings
• Self-join: Join a table with itself to compare an employee’s data with their manager’s data.
• Filtering: Use a condition to check which employees have a higher salary than their
managers.
• NULL handling: Handle employees who do not have managers (i.e., manager_id IS
NULL).
Solutions
1. Approach 1: Self-Join
The simplest way to solve this is to use a self-join. The Employees table is joined with itself
by matching employee.manager_id to manager.id, and we filter employees whose salary is
greater than their manager's salary.
SELECT e.id, e.name, e.salary, m.name AS manager_name, m.salary AS manager_salary
FROM Employees e
JOIN Employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;
Explanation:
• e and m are aliases for the employee and manager respectively.
• We join the Employees table on e.manager_id = m.id to link each employee with their
respective manager.
• The WHERE e.salary > m.salary condition filters the employees whose salary is greater
than their manager’s.
SELECT e.id, e.name, e.salary
FROM Employees e
WHERE e.salary > (SELECT m.salary
FROM Employees m
WHERE m.id = e.manager_id
);
Explanation:
• The subquery (SELECT salary FROM Employees m WHERE m.id = e.manager_id)
finds the manager's salary.
• The outer query compares the employee's salary (e.salary) with the manager's salary
retrieved by the subquery.
Explanation:
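• LAG(salary) OVER (PARTITION BY manager_id ORDER BY id) retrieves the salary of
the previous row (by id) among employees who report to the same manager. This is a
co-worker's salary rather than the manager's own salary, so the result can differ from the
self-join.
• The WHERE salary > manager_salary condition then filters on that lagged value; use the
self-join or the subquery when the comparison must be against the actual manager.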
Explanation:
• LEFT JOIN ensures that employees without managers (i.e., manager_id is NULL) are
included in the result.
• COALESCE(m.salary, 0) handles the case where the employee doesn’t have a manager by
replacing the NULL manager salary with 0. The comparison then becomes e.salary > 0, so
top-level managers (without managers) appear in the result whenever their salary is positive;
use the inner join from Approach 1 if they should be excluded.
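A sketch of the LEFT JOIN with COALESCE described above, assuming the Employees table and sample rows from this question:

```sql
-- Keep every employee, including those without a manager,
-- and treat a missing manager salary as 0.
SELECT e.id, e.name, e.salary,
       m.name AS manager_name,
       m.salary AS manager_salary
FROM Employees e
LEFT JOIN Employees m ON e.manager_id = m.id
WHERE e.salary > COALESCE(m.salary, 0);
-- On the sample data this returns Eve and Sam (both above Bob's 80,000)
-- and also Alice, who has no manager and is therefore compared against 0.
```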
• Bob, Charlie, David, and Heidi will not appear because they earn less than their manager
Alice (90,000). Frank (78,000) and Grace (60,000) also earn less than their managers, Bob
(80,000) and Charlie (75,000).
• Eve (95,000) and Sam (89,000) will appear because both earn more than their manager Bob
(80,000).
Key Takeaways
• Self-join is a straightforward approach to compare employees with their managers.
• Subqueries are helpful for breaking down the logic into smaller parts.
• Window functions like LAG() are useful for comparisons across rows.
• LEFT JOIN allows handling cases where employees don’t have managers.
• Q.874
Find Employee's Hierarchy Level
Explanation
The task is to determine the level of each employee in the company hierarchy. The level
should start at 1 for employees without managers, and increment by 1 for each level below
the top-level manager. This can be done using a recursive Common Table Expression (CTE)
that will traverse the hierarchy.
Solution
To find the hierarchy level, we can use a recursive CTE. This approach works by starting
with the employees who don't have managers (level 1), and then recursively finding each
subsequent level.
UNION ALL
Explanation:
• Base Case: The query first selects employees with manager_id IS NULL (the top-level
employees, in this case, Alice), and assigns them a level of 1.
• Recursive Case: It then recursively joins the Employees table with the
EmployeeHierarchy CTE on manager_id = id, incrementing the level for each subsequent
level in the hierarchy.
• The UNION ALL combines the base case and recursive case, continuing the recursion until
all employees are processed.
• Finally, the query selects the id, name, and level for each employee, ordered by level
and id.
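The steps above can be sketched as the following recursive CTE. It assumes an Employees table with id, name, and manager_id columns, and uses the CTE name EmployeeHierarchy from the explanation. PostgreSQL and MySQL 8.0+ require the RECURSIVE keyword.

```sql
WITH RECURSIVE EmployeeHierarchy AS (
    -- Base case: employees without a manager start at level 1.
    SELECT id, name, manager_id, 1 AS level
    FROM Employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive case: each direct report is one level below its manager.
    SELECT e.id, e.name, e.manager_id, eh.level + 1
    FROM Employees e
    JOIN EmployeeHierarchy eh ON e.manager_id = eh.id
)
SELECT id, name, level
FROM EmployeeHierarchy
ORDER BY level, id;
```

The recursion stops on its own once a pass finds no further reports, provided the manager_id references contain no cycles.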
Expected Output
The query will return the following result:
id | name | level
----------------------
1 | Alice | 1
2 | Bob | 2
3 | Charlie | 3
4 | David | 3
5 | Eve | 4
6 | Frank | 4
7 | Grace | 4
Key Takeaways
• Recursive CTEs are useful for hierarchical data and allow for efficient querying of parent-
child relationships.
• The base case defines the starting point (employees with no manager), and the recursive
case calculates the level for employees reporting to those in the previous levels.
• The UNION ALL combines the base and recursive results to generate the hierarchy.
• Q.875
Find Each Customer's Latest and Second Latest Order Amount
Explanation
This query is designed to retrieve the latest and the second latest order amount for each
customer. To achieve this:
• We first find the latest order for each customer using a subquery to get the most recent
order date.
• Then, for each latest order, we find the second latest order by comparing the order_date
of previous orders for the same customer.
order_date DATE,
order_amount DECIMAL(10, 2)
);
• - Insert data into orders table
INSERT INTO orders (order_id, customer_id, order_date, order_amount) VALUES
(1, 101, '2024-01-10', 150.00),
(2, 101, '2024-02-15', 200.00),
(3, 101, '2024-03-20', 180.00),
(4, 102, '2024-01-12', 200.00),
(5, 102, '2024-02-25', 250.00),
(6, 102, '2024-03-10', 320.00),
(7, 103, '2024-01-25', 400.00),
(8, 103, '2024-02-15', 420.00);
Solution
The query can be optimized using a self-join or subqueries to find the latest and second
latest order amounts for each customer. Here’s the optimized query:
Explanation:
• Subquery for Latest Order: The outer query first selects the most recent order for each
customer by using a subquery to find the maximum order_date for each customer_id.
• Subquery for Second Latest Order: Inside the SELECT clause, a subquery is used to fetch
the second latest order. This subquery looks for orders with an order_date earlier than the
latest order (o3.order_date < o1.order_date) for the same customer_id. The
MAX(o3.order_amount) returns the largest amount among those earlier orders, which matches
the second latest order in this dataset but is not guaranteed to in general.
• The query ensures that for each customer, it returns their latest and second latest order
amounts.
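As an alternative sketch, the orders can be ranked by date per customer so that the second latest amount is picked by position rather than by size. This assumes the orders table and sample rows shown above. The aliases ranked and rn are illustrative.

```sql
-- Rank each customer's orders from newest to oldest,
-- then pivot ranks 1 and 2 into two columns.
SELECT customer_id,
       MAX(CASE WHEN rn = 1 THEN order_amount END) AS latest_order_amount,
       MAX(CASE WHEN rn = 2 THEN order_amount END) AS second_latest_order_amount
FROM (
    SELECT customer_id, order_amount,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
    FROM orders
) AS ranked
WHERE rn <= 2
GROUP BY customer_id
ORDER BY customer_id;
-- Sample data: 101 -> 180.00 / 200.00, 102 -> 320.00 / 250.00, 103 -> 420.00 / 400.00
```

A customer with a single order gets NULL in second_latest_order_amount.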
• Q.876
Find Average Processing Time by Each Machine
Explanation
To calculate the average processing time for each machine, we need to:
• Pair the "start" and "end" activities for each process (same process_id for each machine).
• Calculate the difference between the "end" timestamp and the "start" timestamp to get the
processing time for each process.
• Group the results by machine_id and compute the average processing time for each
machine.
Solution
To calculate the average processing time, we'll use the following approach:
• Join the start and end activities for each process_id and machine_id. We can use a
self-join on the Activity table by matching process_id and machine_id.
• Compute the processing time for each process by subtracting the start timestamp from
the end timestamp.
• Group by machine_id to calculate the average processing time per machine.
Explanation:
• Self-Join: We join the Activity table with itself by matching the machine_id and
process_id. We also ensure that the first part of the process (start activity) is joined with
the second part (end activity).
• Processing Time Calculation: The processing time for each process is calculated as
a2.timestamp - a1.timestamp where a1 represents the "start" activity and a2 represents
the "end" activity.
• AVG Function: We use AVG() to calculate the average processing time for each machine.
• Group By: The results are grouped by machine_id to compute the average processing
time for each machine.
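The self-join described above can be sketched as follows. The table definition did not survive in this section, so the column name activity_type, the column types, and the sample rows are assumptions made for illustration.

```sql
-- Assumed schema: one 'start' row and one 'end' row per machine and process.
CREATE TABLE Activity (
    machine_id INT,
    process_id INT,
    activity_type VARCHAR(5),   -- 'start' or 'end' (assumed column name)
    timestamp NUMERIC(8, 3)
);

INSERT INTO Activity (machine_id, process_id, activity_type, timestamp)
VALUES
    (0, 0, 'start', 0.5), (0, 0, 'end', 1.5),
    (0, 1, 'start', 2.0), (0, 1, 'end', 3.0),
    (1, 0, 'start', 1.0), (1, 0, 'end', 3.0);

-- Pair each start row (a1) with the matching end row (a2),
-- then average the elapsed time per machine.
SELECT a1.machine_id,
       AVG(a2.timestamp - a1.timestamp) AS avg_processing_time
FROM Activity a1
JOIN Activity a2
  ON a1.machine_id = a2.machine_id
 AND a1.process_id = a2.process_id
WHERE a1.activity_type = 'start'
  AND a2.activity_type = 'end'
GROUP BY a1.machine_id;
-- With these rows, machine 0 averages 1.0 and machine 1 averages 2.0.
```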
• Q.877
Retrieve All Ids of a Person Whose Rating is Greater Than Their Friend's Rating
Explanation
The task is to retrieve the IDs of people based on the following conditions:
• If the person has friends, check whether their rating is greater than their friend's rating.
• If the person doesn't have any friends (i.e., friend_id is NULL), retrieve the ID only if
their rating is greater than 85.
The solution involves:
• Using a LEFT JOIN to link each person with their rating and their friend's rating.
• Applying a WHERE clause to check the conditions for both cases: when the person has
friends and when they don't.
• DISTINCT is used to avoid duplicate entries in case a person has multiple friends who
meet the condition.
Solution
To achieve the required result, we:
• Perform a LEFT JOIN between the Friends table and the Ratings table for both the
person (f.id) and their friend (f.friend_id).
• In the WHERE clause, apply the following conditions:
• If the person has a friend (f.friend_id IS NOT NULL), check if their rating is greater
than the friend's rating.
• If the person doesn't have a friend (f.friend_id IS NULL), check if their rating is greater
than 85.
• DISTINCT ensures that the result does not contain duplicate IDs.
Query:
SELECT DISTINCT(f.id)
FROM Friends as f
LEFT JOIN Ratings as r ON r.id = f.id
LEFT JOIN Ratings as r2 ON f.friend_id = r2.id
WHERE
-- Person's rating is greater than friend's rating
(f.friend_id IS NOT NULL AND r.rating > r2.rating)
OR
-- Person doesn't have a friend, but rating is greater than 85
(f.friend_id IS NULL AND r.rating > 85);
• Q.878
Find the ID Where the Seat is Empty and Both the Seats Before and After It are Also
Empty
Explanation
To find the rows where the seat is empty (represented by 1), and both the previous seat
(prev_seat_id) and the next seat (next_seat_id) are also empty:
• We will use window functions LAG() and LEAD() to get the values of the previous and
next seats based on the id order.
• After retrieving the previous and next seats, we filter the result to include only the rows
where the current seat is empty (seat_id = 1), and both the previous and next seats are also
empty (prev_seat_id = 1 and next_seat_id = 1).
Solution
To achieve the desired result, we use:
• LAG(seat_id): Retrieves the seat status of the row before the current one (previous seat).
• LEAD(seat_id): Retrieves the seat status of the row after the current one (next seat).
• WHERE clause to check that the current seat is empty (seat_id = 1), and both the previous
and next seats are also empty (prev_seat_id = 1 and next_seat_id = 1).
Query:
SELECT
id,
seat_id
FROM
(
SELECT
*,
LAG(seat_id) OVER(ORDER BY id) AS prev_seat_id,
LEAD(seat_id) OVER(ORDER BY id) AS next_seat_id
FROM cinemas
) AS t1
WHERE
seat_id = 1
AND prev_seat_id = 1
AND next_seat_id = 1;
Explanation:
• LAG() and LEAD(): These window functions are used to retrieve the values of the
previous and next row's seat_id. They are ordered by the id field to ensure we look at
consecutive rows.
• WHERE clause: We filter the result to only include rows where:
• The current seat (seat_id) is empty (1).
• The previous (prev_seat_id) and next (next_seat_id) seats are also empty (1).
Explanation:
• Seat ID 4 is empty (seat_id = 1), and both the previous seat (id = 3) and the next seat
(id = 5) are also empty (seat_id = 1).
Key Takeaways
• Window Functions: LAG() and LEAD() are powerful for comparing a row with its
neighbors (previous or next row).
• Filtering with Window Functions: You can filter based on the results of window
functions in a subquery to get complex conditions involving neighboring rows.
• Order of Rows: In this case, we use ORDER BY id to ensure we are looking at consecutive
rows based on their id values.
• Q.879
Find Users Who Have Logged in on at Least 3 Consecutive Days
Explanation
The task is to find users who have logged in on at least 3 consecutive days. This can be
achieved by:
• Using Window Functions: We can use LAG() to compare the current login date with the
previous one to identify consecutive login streaks.
• Cumulative Streak Counting: We calculate consecutive logins using a SUM() function
over the computed streaks.
• Filter Consecutive Streaks: Finally, we filter users with streaks of at least 2 consecutive
days, which indicates at least 3 consecutive login days (since SUM(steaks) starts at 0).
Solution
We use two Common Table Expressions (CTEs) to calculate the consecutive login days:
• First CTE (steak_table): For each login, we use the LAG() function to compare the
current login date with the previous one. If the dates are consecutive (i.e., the difference is
exactly 1 day), we mark the streak with a 1; otherwise, we mark it with 0.
• Second CTE (steak2): We calculate the cumulative sum of the steaks for each user to
identify the total consecutive login days.
• Final Query: We filter the results to only include users whose streak is 2 or greater, which
means they logged in for at least 3 consecutive days.
Query:
WITH steak_table AS
(
SELECT
user_id,
login_date,
CASE
WHEN login_date = LAG(login_date) OVER(PARTITION BY user_id ORDER BY login_date)
+ INTERVAL '1 day' THEN 1
ELSE 0
END as steaks
FROM user_activity
),
steak2 AS
(
SELECT
user_id,
login_date,
SUM(steaks) OVER(PARTITION BY user_id ORDER BY login_date) as consecutive_login
FROM steak_table
)
SELECT
DISTINCT user_id
FROM steak2
WHERE consecutive_login >= 2;
Explanation:
• LAG(login_date): This function looks at the previous row's login_date for each user to
check if the current login_date is exactly 1 day after the previous one.
• CASE WHEN: If the login date is consecutive (i.e., the difference between the current and
previous login date is 1 day), we assign a 1 to indicate the streak continues.
• SUM(steaks): This cumulative sum counts the consecutive-day flags for each user. A sum
>= 2 indicates at least 3 consecutive login days only when the flagged days belong to one
unbroken run, because the running total does not reset after a gap.
• DISTINCT user_id: We use DISTINCT to ensure each user appears only once in the result
set.
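Because the running total above never resets, two separate two-day runs also reach 2. One way to count each run on its own is the gaps-and-islands technique, sketched here for PostgreSQL. The user_activity definition and the sample rows are assumptions, since the original dataset is not shown in this section.

```sql
-- Assumed schema and sample rows for illustration.
CREATE TABLE user_activity (
    user_id INT,
    login_date DATE
);

INSERT INTO user_activity (user_id, login_date)
VALUES
    (1, '2024-01-01'), (1, '2024-01-02'), (1, '2024-01-03'),   -- one 3-day run
    (2, '2024-01-01'), (2, '2024-01-02'),                      -- 2-day run
    (2, '2024-01-05'), (2, '2024-01-06');                      -- another 2-day run

WITH distinct_logins AS (
    -- Collapse multiple logins on the same day.
    SELECT DISTINCT user_id, login_date
    FROM user_activity
),
grouped AS (
    -- Consecutive dates minus their row number give the same anchor date,
    -- so every unbroken run shares one streak_group value.
    SELECT user_id,
           login_date,
           login_date - CAST(ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date) AS INT) AS streak_group
    FROM distinct_logins
)
SELECT DISTINCT user_id
FROM grouped
GROUP BY user_id, streak_group
HAVING COUNT(*) >= 3;
-- With these rows only user_id 1 is returned.
```

In MySQL the date arithmetic would need DATE_SUB(login_date, INTERVAL ... DAY) instead of subtracting an integer from a date.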
• Q.880
SQL Query to Find 3 Consecutive Available Seats in a Cinema Hall
Explanation
We need to find 3 consecutive available seats for three friends to sit together in a cinema hall.
We assume that:
• status = 0 indicates the seat is available.
• status = 1 indicates the seat is booked.
• We need to check for a sequence of three consecutive seats that are all available (i.e.,
status = 0 for three consecutive rows).
Solution Steps:
• Identify Consecutive Seats: We will use window functions like LEAD() or LAG() to
compare each seat with its next and previous seat.
• Filter Available Seats: We only want rows where the current seat and its two adjacent
seats are all available.
• Return Seat Numbers: We'll return the first and last seat numbers from the sequence of 3
available seats.
Datasets
-- Step 1: Create the cinema hall table
CREATE TABLE cinema_hall (
id INT PRIMARY KEY,
status INT CHECK (status IN (0, 1)) -- 0: Available, 1: Booked
);
SQL Query
WITH ConsecutiveSeats AS (
SELECT
id AS seat_id,
status,
LEAD(status, 1) OVER(ORDER BY id) AS next_seat_status,
LEAD(status, 2) OVER(ORDER BY id) AS next_to_next_seat_status
FROM cinema_hall
)
SELECT
seat_id AS first_seat,
(seat_id + 2) AS last_seat
FROM ConsecutiveSeats
WHERE status = 0
AND next_seat_status = 0
AND next_to_next_seat_status = 0;
Explanation:
• Window Functions (LEAD()):
• LEAD(status, 1) gives the status of the next seat.
• LEAD(status, 2) gives the status of the seat after the next one.
• Filtering the Seats:
• We check if the current seat and the next two seats have status = 0 (available).
• Output:
• If the condition is met, the first seat and the last seat from the sequence of 3 consecutive
available seats are returned.
id status
1 0
2 1
3 1
4 0
5 0
6 0
7 1
Query Result:
first_seat last_seat
4 6
Datasets
-- orders & customers
-- customer table
CREATE TABLE customers (
customer_id INT PRIMARY KEY, -- Customer ID
customer_name VARCHAR(100), -- Customer Name
email VARCHAR(100), -- Email Address
phone VARCHAR(15) -- Phone Number
);
SQL Query:
SELECT
o.order_id,
o.customer_id,
c.customer_name,
o.order_date,
o.total_amount,
o.states,
o.category
FROM orders o
JOIN customers c
ON o.customer_id = c.customer_id
ORDER BY o.states, o.order_date;
Explanation:
• JOIN: We are joining the orders table with the customers table on the customer_id to
retrieve the customer details along with their orders.
• ORDER BY: We are ordering the result set first by the states column and then by the
order_date within each state, ensuring that orders are grouped and ordered by both state and
date.
• Selected Columns: The query selects the order ID, customer ID, customer name, order
date, total amount, state, and category for each order.
Output Example:
Assuming the sample data for this question, the query will produce results like this:
order_id | customer_id | customer_name | order_date | total_amount | states | category
------------------------------------------------------------------------------------------------
5 | 1 | Alice Johnson | 2023-12-12 | 700.00 | Kerala | Electronics
6 | 2 | Bob Smith | 2024-11-20 | 800.00 | West Bengal | Home Appliances
2 | 2 | Bob Smith | 2024-10-10 | 1200.00 | Karnataka | Furniture
7 | 3 | Charlie Davis | 2023-05-10 | 600.00 | Rajasthan | Furniture
3 | 3 | Charlie Davis | 2024-09-25 | 300.00 | Tamil Nadu | Books
8 | 5 | Eve Adams | 2024-07-15 | 450.00 | Gujarat | Books
10 | 7 | Grace Lee | 2024-03-10 | 550.00 | Uttar Pradesh | Clothing
4 | 4 | Diana Prince | 2024-06-05 | 1500.00 | Delhi | Clothing
9 | 6 | Frank Taylor | 2024-01-25 | 1000.00 | Punjab | Electronics
Key Takeaways:
• Ordering by Multiple Columns: The ORDER BY clause can be used to order by multiple
columns. In this case, first by states, then by order_date.
• Join Operations: The JOIN between orders and customers allows us to display
customer-specific details for each order.
• Q.882
Question
Write an SQL query to find the highest salary from the employees table where the salary
value occurs only once.
Explanation
To find the highest salary that appears only once in the table, we need to:
• Count the occurrences of each salary using GROUP BY.
• Filter out salaries that appear more than once using the HAVING clause.
• Select the maximum salary from the filtered results.
Table Creation
CREATE TABLE employees (
id INT,
first_name VARCHAR(50),
last_name VARCHAR(50),
age INT,
sex VARCHAR(1),
employee_title VARCHAR(50),
department VARCHAR(50),
salary INT,
target INT,
bonus INT,
email VARCHAR(100),
city VARCHAR(50),
address VARCHAR(100),
manager_id INT
);
(10, 'Jennifer', 'Dion', 34, 'F', 'Sales', 'Sales', 1000, 200, 150, '[email protected]m', 'Alabama', NULL, 13),
(19, 'George', 'Joe', 50, 'M', 'Manager', 'Management', 250000, 0, 300, 'George@company.com', 'Florida', '1003 Wyatt Street', 1),
(18, 'Laila', 'Mark', 26, 'F', 'Sales', 'Sales', 1000, 200, 150, '[email protected]', 'Florida', '3655 Spirit Drive', 11),
(20, 'Sarrah', 'Bicky', 31, 'F', 'Senior Sales', 'Sales', 2000, 200, 150, 'Sarrah@company.com', 'Florida', '1176 Tyler Avenue', 19);
Learnings
• GROUP BY and HAVING: Grouping data by salary and filtering out salaries that appear
more than once using HAVING COUNT(salary) = 1.
• MAX(): Using the MAX() function to find the highest salary from the filtered results.
Solutions
PostgreSQL Solution
SELECT MAX(salary) AS highest_unique_salary
FROM (
    SELECT salary
    FROM employees
    GROUP BY salary
    HAVING COUNT(salary) = 1
) AS unique_salaries;
MySQL Solution
SELECT MAX(salary) AS highest_unique_salary
FROM (
    SELECT salary
    FROM employees
    GROUP BY salary
    HAVING COUNT(salary) = 1
) AS unique_salaries;
• Q.883
Question
Write an SQL query to find employees who have a salary higher than their managers.
Explanation
To find employees with a salary greater than their managers, we need to:
• Join the employees table with itself (self-join) where one instance represents the employee
and the other represents the manager.
• Compare the salary of the employee with that of their manager.
• Select the employees whose salary is greater than the manager’s salary.
Table Creation
CREATE TABLE employees (
id INT,
first_name VARCHAR(50),
last_name VARCHAR(50),
age INT,
sex VARCHAR(1),
employee_title VARCHAR(50),
department VARCHAR(50),
salary INT,
target INT,
bonus INT,
email VARCHAR(100),
city VARCHAR(50),
address VARCHAR(100),
manager_id INT
);
Learnings
• Self-Join: Joining the employees table with itself to compare employees and their
managers.
• Condition: Comparing each employee's salary with the salary of the manager matched
through manager_id to find the employees who earn more than their managers.
Solutions
PostgreSQL Solution
SELECT e.first_name AS employee_first_name, e.last_name AS employee_last_name, e.salary
AS employee_salary,
m.first_name AS manager_first_name, m.last_name AS manager_last_name, m.salary AS
manager_salary
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;
MySQL Solution
SELECT e.first_name AS employee_first_name, e.last_name AS employee_last_name, e.salary
AS employee_salary,
m.first_name AS manager_first_name, m.last_name AS manager_last_name, m.salary AS
manager_salary
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary;
• Q.884
Question
Write an SQL query to categorize customers into new or returning based on the number of
returns they have done. If the number of returns is greater than or equal to 1, they are
categorized as new; otherwise, they are categorized as returning. Use the sales and
returns tables.
Explanation
We need to count the number of returns each customer has made. If a customer has made one
or more returns, they are categorized as new; otherwise, they are returning. This can be
achieved by joining the sales and returns tables, counting the number of returns for each
customer, and then using a CASE statement to categorize them accordingly.
Table Creation
CREATE TABLE sales (
sale_id INT PRIMARY KEY,
customer_id INT,
sale_date DATE,
sale_amount DECIMAL(10, 2)
);
Learnings
• COUNT() with GROUP BY: Counting the number of returns for each customer.
• JOINs: Joining sales and returns tables using customer_id.
• CASE: Using a CASE statement to categorize customers based on their return count.
• Aggregating: Aggregating returns data to calculate the number of returns per customer.
Solutions
PostgreSQL Solution
SELECT s.customer_id,
CASE
WHEN COUNT(r.return_id) >= 1 THEN 'New'
ELSE 'Returning'
END AS customer_category
FROM sales s
LEFT JOIN returns r ON s.customer_id = r.customer_id
GROUP BY s.customer_id;
MySQL Solution
SELECT s.customer_id,
CASE
WHEN COUNT(r.return_id) >= 1 THEN 'New'
ELSE 'Returning'
END AS customer_category
FROM sales s
LEFT JOIN returns r ON s.customer_id = r.customer_id
GROUP BY s.customer_id;
• Q.885
Question
Write an SQL query to find the driver who cancelled the highest number of rides.
Explanation
We need to count the number of cancelled rides for each driver, and then identify the driver
with the highest count. This can be done by grouping the rides by driver and filtering for
cancelled rides, then sorting by the number of cancellations in descending order and limiting
the result to one.
Table Creation
CREATE TABLE rides (
ride_id INT PRIMARY KEY,
driver_id INT,
ride_status VARCHAR(20) -- 'completed', 'cancelled', etc.
);
Learnings
• Counting with GROUP BY: Using COUNT() to calculate the number of cancelled rides
per driver.
• Filtering: Filtering only the cancelled rides using WHERE ride_status = 'cancelled'.
• Sorting and Limiting: Sorting by cancellation count in descending order and using LIMIT
1 to get the driver with the highest number of cancellations.
Solutions
PostgreSQL Solution
SELECT d.driver_name, COUNT(r.ride_id) AS cancelled_rides
FROM rides r
JOIN drivers d ON r.driver_id = d.driver_id
WHERE r.ride_status = 'cancelled'
GROUP BY d.driver_id, d.driver_name
ORDER BY cancelled_rides DESC
LIMIT 1;
MySQL Solution
SELECT d.driver_name, COUNT(r.ride_id) AS cancelled_rides
FROM rides r
JOIN drivers d ON r.driver_id = d.driver_id
WHERE r.ride_status = 'cancelled'
GROUP BY d.driver_id, d.driver_name
ORDER BY cancelled_rides DESC
LIMIT 1;
• Q.886
Question
Write an SQL query to find out which restaurant had the highest number of orders in each
quarter.
Explanation
We need to group the orders by the quarter of the year, then count the number of orders for
each restaurant in each quarter. For each quarter, we will find the restaurant with the highest
number of orders.
Table Creation
CREATE TABLE orders (
order_id INT PRIMARY KEY,
restaurant_id INT,
order_date DATE
);
Learnings
• Date Extraction: Using functions like EXTRACT(QUARTER FROM date) (PostgreSQL) or
QUARTER(date) (MySQL) to extract the quarter from the order date.
• Aggregation: Using COUNT() to calculate the number of orders per restaurant.
• Subquery or Window Functions: Using a subquery or window function to identify the
restaurant with the highest number of orders for each quarter.
• GROUP BY and ORDER BY: Grouping by quarter and restaurant, then ordering to
identify the restaurant with the most orders.
Solutions
PostgreSQL Solution
WITH quarter_orders AS (
SELECT
EXTRACT(QUARTER FROM order_date) AS quarter,
restaurant_id,
COUNT(order_id) AS order_count
FROM orders
GROUP BY quarter, restaurant_id
)
SELECT
q.quarter,
r.restaurant_name,
q.order_count
FROM quarter_orders q
JOIN restaurants r ON q.restaurant_id = r.restaurant_id
WHERE (q.quarter, q.order_count) IN (
SELECT quarter, MAX(order_count)
FROM quarter_orders
GROUP BY quarter
)
ORDER BY q.quarter;
MySQL Solution
WITH quarter_orders AS (
SELECT
QUARTER(order_date) AS quarter,
restaurant_id,
COUNT(order_id) AS order_count
FROM orders
GROUP BY quarter, restaurant_id
)
SELECT
q.quarter,
r.restaurant_name,
q.order_count
FROM quarter_orders q
JOIN restaurants r ON q.restaurant_id = r.restaurant_id
WHERE (q.quarter, q.order_count) IN (
SELECT quarter, MAX(order_count)
FROM quarter_orders
GROUP BY quarter
)
ORDER BY q.quarter;
• Q.887
Question
Write an SQL query to find the best day of the week based on the highest total sales.
Explanation
To solve this, we need to group the sales by day of the week and calculate the total sales for
each day. Afterward, we can use the ORDER BY clause to sort the days by total sales in
descending order, and use LIMIT 1 to retrieve the best day.
Table Creation
CREATE TABLE sales (
sale_id INT PRIMARY KEY,
sale_date DATE,
sale_amount DECIMAL(10, 2)
);
Learnings
• Grouping by Date: Using the DAYOFWEEK() (MySQL) or EXTRACT(DOW FROM date)
(PostgreSQL) to extract the day of the week from the sale_date.
• Aggregation: Using SUM() to calculate the total sales for each day.
• Sorting: Sorting the results to identify the day with the highest total sales.
• Limiting Results: Using LIMIT 1 to return only the day with the highest sales.
Solutions
PostgreSQL Solution
SELECT TO_CHAR(sale_date, 'FMDay') AS day_of_week, SUM(sale_amount) AS total_sales
FROM sales
GROUP BY day_of_week
ORDER BY total_sales DESC
LIMIT 1;
MySQL Solution
SELECT DAYNAME(sale_date) AS day_of_week, SUM(sale_amount) AS total_sales
FROM sales
GROUP BY day_of_week
ORDER BY total_sales DESC
LIMIT 1;
• Q.888
Question
Write an SQL query to find the department with the highest average salary in the company.
Explanation
To find the department with the highest average salary, we need to first group employees by
their department and calculate the average salary for each department using the AVG function.
Then, we can use the ORDER BY clause to sort the results by average salary in descending
order and limit the result to the top department.
Table Creation
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
employee_name VARCHAR(50),
department_id INT,
salary INT
);
Learnings
• Grouping and Aggregation: Using GROUP BY with AVG to calculate the average salary per
department.
• Sorting: Sorting the results by calculated average salary to identify the department with the
highest value.
• Limiting Results: Using LIMIT to get only the department with the highest average salary.
Solutions
PostgreSQL Solution
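SELECT d.department_name, AVG(e.salary) AS average_salary
FROM employees e
JOIN departments d ON e.department_id = d.department_id
GROUP BY d.department_name
ORDER BY average_salary DESC
LIMIT 1;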
MySQL Solution
SELECT d.department_name, AVG(e.salary) AS average_salary
FROM employees e
JOIN departments d ON e.department_id = d.department_id
GROUP BY d.department_name
ORDER BY average_salary DESC
LIMIT 1;
• Q.889
Question
Write an SQL query to find all employees who earn more than the average salary.
Explanation
To solve this, we need to calculate the average salary using the AVG function and then filter
out the employees who earn more than this average. This can be done using a subquery to
calculate the average salary and then a WHERE clause to compare employee salaries to that
average.
Table Creation
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
employee_name VARCHAR(50),
salary INT
);
Learnings
• Aggregation: Using AVG() to calculate the average salary.
• Subqueries: Using a subquery to filter based on a calculated value (average salary).
• Comparison: Using the WHERE clause to compare individual values against a calculated
metric.
Solutions
PostgreSQL Solution
SELECT employee_id, employee_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
MySQL Solution
SELECT employee_id, employee_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
• Q.890
Question
Write an SQL query to find the name of the product with the highest price in each country.
Explanation
We need to identify the product with the highest price for each country. To do this, we can
use a combination of JOIN between the suppliers and products tables, and the GROUP BY
clause to group by country. For each group, we will use the MAX function to get the product
with the highest price, and then match the product name to that highest price.
Table Creation
CREATE TABLE suppliers (
supplier_id INT PRIMARY KEY,
supplier_name VARCHAR(25),
country VARCHAR(25)
);
Learnings
• JOIN operation: Combining data from two tables using JOIN on supplier_id.
• GROUP BY: Grouping the results by country.
• MAX function: Finding the maximum value of a column (price).
• Subquery or Window Functions: Using a subquery or window function to find the
highest-priced product for each country.
Solutions
PostgreSQL Solution
SELECT s.country, p.product_name
FROM suppliers s
JOIN products p ON s.supplier_id = p.supplier_id
WHERE p.price = (
SELECT MAX(price)
FROM products
JOIN suppliers ON suppliers.supplier_id = products.supplier_id
WHERE suppliers.country = s.country
)
ORDER BY s.country;
MySQL Solution
SELECT s.country, p.product_name
FROM suppliers s
JOIN products p ON s.supplier_id = p.supplier_id
WHERE p.price = (
SELECT MAX(price)
FROM products
JOIN suppliers ON suppliers.supplier_id = products.supplier_id
WHERE suppliers.country = s.country
)
ORDER BY s.country;
• Q.891
Question
Write an SQL query to get the details of the employee with the second-highest salary from
each department.
Explanation
To find the employee with the second-highest salary in each department, we can:
• Use a subquery to rank employees within each department by salary in descending order.
• Use the ROW_NUMBER() window function to assign a rank to each employee within their
department.
• Filter out the employees with the first-highest salary and get the employee with the second-
highest salary by selecting where the rank is 2.
Table Creation
CREATE TABLE employees (
id INT,
first_name VARCHAR(50),
last_name VARCHAR(50),
age INT,
sex VARCHAR(1),
employee_title VARCHAR(50),
department VARCHAR(50),
salary INT,
target INT,
bonus INT,
email VARCHAR(100),
city VARCHAR(50),
address VARCHAR(100),
manager_id INT
);
Learnings
• Window Functions: Using ROW_NUMBER() to rank employees by salary within each
department.
• Subquery/CTE: Using a subquery or CTE (Common Table Expression) to get the second-
highest salary by filtering the row with rank 2.
Solutions
PostgreSQL Solution
WITH RankedEmployees AS (
SELECT id, first_name, last_name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees
)
SELECT id, first_name, last_name, department, salary
FROM RankedEmployees
WHERE rank = 2;
MySQL Solution
WITH RankedEmployees AS (
SELECT id, first_name, last_name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees
)
SELECT id, first_name, last_name, department, salary
FROM RankedEmployees
WHERE salary_rank = 2;
Key Points
• ROW_NUMBER(): This function is used to assign a unique rank to each employee within their
department based on salary.
• PARTITION BY: Divides the data into partitions (in this case, by department).
• Filtering for Rank 2: We filter the results to get only the employee with the second-
highest salary in each department.
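ROW_NUMBER() gives distinct numbers even to equal salaries, so when two employees tie for the top salary in a department, row number 2 holds the highest salary rather than the second-highest. A minimal sketch of a tie-aware variant, assuming the employees table defined above; the CTE name ranked_employees and the alias salary_rank are illustrative (salary_rank also avoids RANK, which is a reserved word in MySQL 8.0):

```sql
-- Sketch: second-highest salary per department, tolerant of ties.
-- Assumes the employees table from the Table Creation section above.
WITH ranked_employees AS (
    SELECT id, first_name, last_name, department, salary,
           -- DENSE_RANK gives equal salaries the same rank and leaves no gaps
           DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
    FROM employees
)
SELECT id, first_name, last_name, department, salary
FROM ranked_employees
WHERE salary_rank = 2;
```

Unlike the ROW_NUMBER() version, this can return more than one employee per department when several share the second-highest salary.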
• Q.892
Question
Write an SQL query to calculate the total number of returned orders for each month, based on
the Orders and Returns tables.
Explanation
To calculate the total number of returned orders for each month, we need to:
• Join the Returns table with the Orders table based on the OrderID.
• Extract the month and year from the OrderDate in the Orders table.
• Count the number of returns for each month.
• Group the results by year and month to get the total returns for each month.
Table Creation
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
OrderDate DATE,
TotalAmount DECIMAL(10, 2)
);
Learnings
• JOINs: Using an inner join to combine Orders and Returns based on the OrderID.
• DATE Functions: Extracting the year and month from the OrderDate using functions like
YEAR() and MONTH().
• GROUP BY and COUNT(): Grouping by the year and month of the OrderDate and
counting the number of returned orders.
Solutions
PostgreSQL Solution
SELECT
TO_CHAR(o.OrderDate, 'YYYY-MM') AS Month,
COUNT(r.ReturnID) AS TotalReturns
FROM Returns r
JOIN Orders o ON r.OrderID = o.OrderID
GROUP BY TO_CHAR(o.OrderDate, 'YYYY-MM')
ORDER BY Month;
MySQL Solution
SELECT
DATE_FORMAT(o.OrderDate, '%Y-%m') AS Month,
COUNT(r.ReturnID) AS TotalReturns
FROM Returns r
JOIN Orders o ON r.OrderID = o.OrderID
GROUP BY DATE_FORMAT(o.OrderDate, '%Y-%m')
ORDER BY Month;
Key Points
• TO_CHAR() / DATE_FORMAT(): Functions used to extract the year and month from the
OrderDate column.
• COUNT(): Used to count the number of returns for each group.
• GROUP BY: Groups the results by month and year to get the total returned orders per month.
• Q.893
Question
Write an SQL query to find the top 2 products in the top 2 categories based on the total spend
amount.
Explanation
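To find the top 2 products in the top 2 categories, calculate the total spend for each product
within its category and for each category overall, rank both with RANK(), and keep the
products ranked 1 or 2 within the categories ranked 1 or 2.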
Table Creation
CREATE TABLE orders (
category VARCHAR(20),
product VARCHAR(20),
user_id INT,
spend DECIMAL(10, 2),
transaction_date DATE
);
Learnings
• SUM(): Used to calculate the total spend for each product within each category.
• RANK() or ROW_NUMBER(): Used to rank products within each category based on the total
spend.
• ORDER BY: Sorting categories and products based on spend to pick the top 2 categories and
top 2 products within each category.
Solutions
PostgreSQL Solution
WITH RankedProducts AS (
SELECT category, product, SUM(spend) AS total_spend,
RANK() OVER (PARTITION BY category ORDER BY SUM(spend) DESC) AS product_rank
FROM orders
GROUP BY category, product
),
RankedCategories AS (
SELECT category, SUM(spend) AS category_spend,
RANK() OVER (ORDER BY SUM(spend) DESC) AS category_rank
FROM orders
GROUP BY category
)
SELECT rp.category, rp.product, rp.total_spend
FROM RankedProducts rp
JOIN RankedCategories rc ON rp.category = rc.category
WHERE rp.product_rank <= 2 AND rc.category_rank <= 2
ORDER BY rc.category_rank, rp.product_rank;
MySQL Solution
WITH RankedProducts AS (
SELECT category, product, SUM(spend) AS total_spend,
RANK() OVER (PARTITION BY category ORDER BY SUM(spend) DESC) AS product_rank
FROM orders
GROUP BY category, product
),
RankedCategories AS (
SELECT category, SUM(spend) AS category_spend,
RANK() OVER (ORDER BY SUM(spend) DESC) AS category_rank
FROM orders
GROUP BY category
)
SELECT rp.category, rp.product, rp.total_spend
FROM RankedProducts rp
JOIN RankedCategories rc ON rp.category = rc.category
WHERE rp.product_rank <= 2 AND rc.category_rank <= 2
ORDER BY rc.category_rank, rp.product_rank;
Key Points
• SUM(spend): Used to calculate the total spend for each product and each category.
• RANK(): This window function ranks the products within each category by the total spend
and ranks the categories by their total spend.
• CTEs (Common Table Expressions): Used to first rank the categories and products and
then filter the top 2 from each.
• Q.894
Question
Write an SQL query to retrieve the third-highest salary from the Employee table.
Explanation
To retrieve the third-highest salary:
• Use a window function like DENSE_RANK() to assign ranks to the salaries in descending
order.
• Filter the result to get the row with rank 3, which corresponds to the third-highest salary.
Table Creation
DROP TABLE IF EXISTS Employees;
Learnings
• DENSE_RANK(): A window function that assigns a rank to each row within the partition of a
result set, with no gaps in ranking when there are ties.
• Filtering by Rank: We isolate the third-highest salary by selecting the rows with rank = 3.
Solutions
Key Points
• DENSE_RANK(): Assigns ranks to rows in the result set, allowing us to handle ties correctly.
• Filtering with WHERE: We use the rank value to filter and return the third-highest salary.
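The approach described above can be sketched as follows. The original table definition and data are not shown, so the columns (id, name, salary), the sample rows, and the names ranked_salaries and salary_rank are assumptions made for illustration only:

```sql
-- Hypothetical schema and rows; the original Employees definition is not shown.
DROP TABLE IF EXISTS Employees;
CREATE TABLE Employees (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    salary INT
);

INSERT INTO Employees (id, name, salary) VALUES
(1, 'A', 90000),
(2, 'B', 90000),
(3, 'C', 80000),
(4, 'D', 70000),
(5, 'E', 60000);

WITH ranked_salaries AS (
    SELECT salary,
           -- Ties share a rank, and the next distinct salary gets the next rank
           DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
    FROM Employees
)
SELECT DISTINCT salary AS third_highest_salary
FROM ranked_salaries
WHERE salary_rank = 3;
-- With the rows above: 70000 (90000 is rank 1 twice, 80000 is rank 2)
```

DISTINCT keeps the result to a single row even when several employees share the third-highest salary.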
• Q.895
Question
Write an SQL query to find all products that haven't been sold in the last six months. Return
the product_id, product_name, category, and price of these products.
Explanation
To solve this problem:
• Identify the current date and calculate the date six months ago.
• Join the Products table with the Sales table on product_id.
• Check for products that have no sales records in the last six months using a LEFT
JOIN and a filter based on the sale date.
• Exclude products that have sales in the last six months by checking for NULL in the
sales date.
Table Creation
DROP TABLE IF EXISTS Products;
CREATE TABLE Products (
product_id SERIAL PRIMARY KEY,
product_name VARCHAR(100),
category VARCHAR(50),
price DECIMAL(10, 2)
);
Learnings
• LEFT JOIN: Used to include all products, even those without sales records in the last six
months.
• Date functions: Used to calculate the date six months ago and compare with sales data.
• IS NULL: Used to filter out products that have sales records in the last six months.
Solutions
Key Points
• LEFT JOIN: Ensures that even products with no sales in the last six months are included.
• CURRENT_DATE - INTERVAL '6 months': Computes the date six months ago from the
current date.
• WHERE s.sale_date IS NULL: Keeps only the products that have no sales in the last six months.
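A minimal PostgreSQL sketch of the steps above, assuming the Products table defined earlier. The Sales table is not shown in the text, so its columns (sale_id, product_id, sale_date) are assumptions:

```sql
-- Hypothetical Sales table; only Products is defined in the text above.
DROP TABLE IF EXISTS Sales;
CREATE TABLE Sales (
    sale_id SERIAL PRIMARY KEY,
    product_id INT,
    sale_date DATE
);

SELECT p.product_id, p.product_name, p.category, p.price
FROM Products p
LEFT JOIN Sales s
       ON s.product_id = p.product_id
      -- The date filter belongs in ON: only recent sales count as matches
      AND s.sale_date >= CURRENT_DATE - INTERVAL '6 months'
WHERE s.sale_date IS NULL;  -- no recent sale matched this product
```

Moving the date condition into the WHERE clause would discard the unmatched rows and turn the LEFT JOIN into an inner join. In MySQL the cutoff can be written as CURDATE() - INTERVAL 6 MONTH.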
• Q.896
Question
Write an SQL query to find customers who bought Airpods after purchasing an iPhone.
Explanation
To solve this:
• We need to identify customers who bought an iPhone and then bought Airpods later.
• This requires comparing the purchase dates of the two products for the same customer.
• We can achieve this by joining the Customers and Purchases tables, filtering the
purchases where the product is iPhone and Airpods, and ensuring that the Airpods purchase
happens after the iPhone purchase.
Table Creation
DROP TABLE IF EXISTS customers;
CREATE TABLE Customers (
CustomerID INT,
CustomerName VARCHAR(50)
);
Learnings
• JOIN: Used to link customers to their purchase history.
• Date Comparison: We compare the purchase dates to check if the purchase of Airpods
was after the iPhone.
• Subquery/Correlated Subquery: A correlated subquery can help filter for customers who
bought Airpods after buying iPhones.
Solutions
Key Points
• JOIN: We join the Purchases table twice, once for iPhone purchases and once for Airpods
purchases.
• WHERE clause: Ensures that only customers who bought Airpods after iPhones are selected.
• DISTINCT: Ensures that each customer is listed only once, even if they made multiple
qualifying purchases.
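The double join described in the key points can be sketched as below, assuming the Customers table defined above. The Purchases table is not shown, so its columns (PurchaseID, CustomerID, ProductName, PurchaseDate) are assumptions:

```sql
-- Hypothetical Purchases table; column names are assumptions.
DROP TABLE IF EXISTS Purchases;
CREATE TABLE Purchases (
    PurchaseID INT,
    CustomerID INT,
    ProductName VARCHAR(50),
    PurchaseDate DATE
);

SELECT DISTINCT c.CustomerID, c.CustomerName
FROM Customers c
JOIN Purchases iphone            -- one alias per product of interest
  ON iphone.CustomerID = c.CustomerID
 AND iphone.ProductName = 'iPhone'
JOIN Purchases airpods
  ON airpods.CustomerID = c.CustomerID
 AND airpods.ProductName = 'Airpods'
WHERE airpods.PurchaseDate > iphone.PurchaseDate;  -- Airpods strictly after iPhone
```

A customer qualifies as soon as any Airpods purchase falls after any of their iPhone purchases, and DISTINCT collapses multiple qualifying pairs into one row.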
• Q.897
Question
Write a SQL query to classify employees into three categories based on their salary:
• "High" - Salary greater than $70,000
• "Medium" - Salary between $50,000 and $70,000 (inclusive)
• "Low" - Salary less than $50,000
Your query should return the EmployeeID, FirstName, LastName, Department, Salary, and
a new column SalaryCategory indicating the category to which each employee belongs.
Explanation
To solve this:
• Use CASE WHEN logic to create a new column SalaryCategory that classifies
employees based on their salary.
• Define the conditions for "High", "Medium", and "Low" salary categories.
• Select all required columns (EmployeeID, FirstName, LastName, Department, Salary,
and the new SalaryCategory).
Table Creation
DROP TABLE IF EXISTS employees;
Learnings
• CASE WHEN: A conditional expression used to create categories based on salary ranges.
• NUMERIC type: Used to store precise decimal values for salary amounts.
Solutions
Key Points
• CASE WHEN: The CASE statement is used to assign a value based on conditional checks.
• Salary Ranges: Using BETWEEN for the "Medium" category ensures we handle both the
lower and upper bounds inclusively.
• Classifying Data: The query returns a new column (SalaryCategory) that classifies
employees into salary ranges based on their salary value.
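A minimal sketch of the classification. The column names come from the question, while the column types and the three sample rows are assumptions for illustration:

```sql
-- Hypothetical definition and rows; the original CREATE TABLE is not shown.
DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    Department VARCHAR(50),
    Salary NUMERIC(10, 2)
);

INSERT INTO employees (EmployeeID, FirstName, LastName, Department, Salary) VALUES
(1, 'Ann', 'Lee', 'IT', 85000.00),
(2, 'Raj', 'Patel', 'HR', 70000.00),
(3, 'Mia', 'Chen', 'Sales', 42000.00);

SELECT EmployeeID, FirstName, LastName, Department, Salary,
       CASE
           WHEN Salary > 70000 THEN 'High'
           WHEN Salary BETWEEN 50000 AND 70000 THEN 'Medium'  -- both bounds inclusive
           WHEN Salary < 50000 THEN 'Low'
       END AS SalaryCategory
FROM employees;
-- With the rows above: Ann is 'High', Raj is 'Medium', Mia is 'Low'
```

Writing the 'Low' branch as an explicit condition rather than ELSE leaves SalaryCategory as NULL for a NULL salary instead of labelling it 'Low'.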
• Q.898
Question
Write a SQL query to show the unique_id of each employee from the Employees table. If an
employee does not have a corresponding unique ID in the EmployeeUNI table, return NULL for
that employee.
Explanation
To solve this:
• We need to join the Employees table with the EmployeeUNI table based on the employee
id.
• We will use a LEFT JOIN to include all employees from the Employees table, and show
the unique_id from the EmployeeUNI table where it exists. If a unique_id does not exist for
an employee, the result should return NULL for that employee's unique_id.
• Return the employee id, name, and unique_id (or NULL if not available).
Table Creation
DROP TABLE IF EXISTS Employees;
CREATE TABLE Employees (
id INT PRIMARY KEY,
name VARCHAR(255)
);
Learnings
• LEFT JOIN: Used to ensure all records from the Employees table are included, and only
matching records from the EmployeeUNI table.
• NULL Handling: The LEFT JOIN ensures that if there is no match in the EmployeeUNI
table, NULL is returned for the unique_id.
Solutions
Key Points
• LEFT JOIN: Ensures all employees are shown, even if there is no corresponding
unique_id in the EmployeeUNI table.
• NULL for Missing Data: If an employee does not have a unique ID, the query returns
NULL in the unique_id column for that employee.
• Readable Output: The output will include the employee's id, name, and either the
unique_id (if available) or NULL.
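The LEFT JOIN described above can be sketched as follows, assuming the Employees table defined earlier. The EmployeeUNI definition is not shown, so its columns (id, unique_id) are assumptions:

```sql
-- Hypothetical EmployeeUNI table; column names are assumptions.
DROP TABLE IF EXISTS EmployeeUNI;
CREATE TABLE EmployeeUNI (
    id INT,
    unique_id INT
);

SELECT e.id, e.name, u.unique_id
FROM Employees e
LEFT JOIN EmployeeUNI u
       ON u.id = e.id;  -- unmatched employees keep their row, with unique_id NULL
```

No CASE or COALESCE is needed here: the LEFT JOIN itself supplies NULL for employees without a match.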
• Q.899
Question
Write a SQL query to find all the employees who do not manage anyone. Return their
emp_id, name, and manager_id.
Explanation
To solve this:
• We need to identify employees who are not managers.
• An employee is a manager if their emp_id appears in the manager_id column for other
employees.
• Employees who do not manage anyone will not have their emp_id listed as a manager_id
in any other record.
• We can achieve this by using a LEFT JOIN to check if their emp_id appears in the
manager_id column of other employees. If it doesn't, that means they don't manage anyone.
Table Creation
DROP TABLE IF EXISTS employees;
Learnings
• LEFT JOIN: Used to keep all employees, including those whose emp_id never appears as a manager_id.
• NOT EXISTS or NOT IN: These can be used to filter employees who don't have their
emp_id as a manager_id in any record.
Solutions
Key Points
• LEFT JOIN: Ensures all employees are considered, even those whose emp_id matches no
manager_id in the table.
• m.manager_id IS NULL: Keeps only those employees who do not manage anyone, i.e.,
those who are not referenced as a manager by any other employee.
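A minimal sketch of the self-join described above. The column names come from the question, while the column types and sample rows are assumptions for illustration:

```sql
-- Hypothetical definition and rows; the original CREATE TABLE is not shown.
DROP TABLE IF EXISTS employees;
CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    name VARCHAR(50),
    manager_id INT
);

INSERT INTO employees (emp_id, name, manager_id) VALUES
(1, 'Ava', NULL),
(2, 'Ben', 1),
(3, 'Cara', 1),
(4, 'Dev', 2);

SELECT e.emp_id, e.name, e.manager_id
FROM employees e
LEFT JOIN employees m
       ON m.manager_id = e.emp_id  -- m is any employee who reports to e
WHERE m.manager_id IS NULL;        -- nobody reports to e
-- With the rows above: Cara (3) and Dev (4)
```

A NOT EXISTS subquery expresses the same anti-join. NOT IN needs extra care because a NULL manager_id in the subquery makes it return no rows.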
• Q.900
Question
Find the top 2 customers who have spent the most money across all their orders. Return their
names, emails, and total amounts spent.
Explanation
To solve this problem:
• We need to calculate the total amount spent by each customer across all their orders.
• We'll use the SUM() function to calculate the total order amount for each customer.
• Then, we'll order the results by the total amount spent in descending order and limit the
result to the top 2 customers.
Table Creation
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
customer_email VARCHAR(100)
);
Learnings
• SUM(): This is an aggregation function used to calculate the total amount spent by each
customer.
• JOIN: Used to combine data from the customers and orders tables based on the
customer_id.
• ORDER BY: This is used to sort the result in descending order of total amounts spent.
• LIMIT: Restricts the result to only the top 2 customers.
Solutions
Key Takeaways
• JOIN: Used to combine related data from different tables (customers and orders).
• SUM(): Aggregates the total amount spent by each customer.
• LIMIT: Restricts the result to the top N records, in this case, the top 2 customers based on
spending.
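The steps above can be sketched as below, assuming the customers table defined earlier. The orders table is not shown, so its columns (order_id, customer_id, order_amount) are assumptions:

```sql
-- Hypothetical orders table; column names are assumptions.
DROP TABLE IF EXISTS orders;
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_amount DECIMAL(10, 2)
);

SELECT c.customer_name,
       c.customer_email,
       SUM(o.order_amount) AS total_spent
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
-- Group by the key as well, so two customers sharing a name stay separate
GROUP BY c.customer_id, c.customer_name, c.customer_email
ORDER BY total_spent DESC
LIMIT 2;
```

LIMIT 2 cuts ties arbitrarily. If tied customers should all appear, a DENSE_RANK() over total_spent with a filter on rank <= 2 is one alternative.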
• Q.901
Question
Write an SQL query to find customers who have made purchases in all product categories.
Explanation
To solve this problem, we need to identify customers who have purchased at least one item
from each distinct product category. We can do this by using a GROUP BY clause to count the
distinct categories per customer and comparing it to the total number of unique categories
available in the Purchases table.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE Customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(50)
);
• - Table creation
CREATE TABLE Purchases (
purchase_id INT PRIMARY KEY,
customer_id INT,
product_category VARCHAR(50),
FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);
• - Datasets
INSERT INTO Customers (customer_id, customer_name) VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie'),
(4, 'David'),
(5, 'Emma');
INSERT INTO Purchases (purchase_id, customer_id, product_category) VALUES
(101, 1, 'Electronics'),
(102, 1, 'Books'),
(103, 1, 'Clothing'),
(104, 1, 'Electronics'),
(105, 2, 'Clothing'),
(106, 1, 'Beauty'),
(107, 3, 'Electronics'),
(108, 3, 'Books'),
(109, 4, 'Books'),
(110, 4, 'Clothing'),
(111, 4, 'Beauty'),
(112, 5, 'Electronics'),
(113, 5, 'Books');
Learnings
• Use of JOIN to connect two tables.
• Use of COUNT(DISTINCT) to count unique categories per customer.
• Comparison between the total number of unique categories and the number of categories
per customer.
Solutions
• - PostgreSQL solution
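One way to express the comparison described above, as a sketch against the Customers and Purchases tables and sample data defined earlier (the same statement should also run on MySQL):

```sql
-- Sketch: customers whose distinct category count equals the overall count.
SELECT c.customer_id, c.customer_name
FROM Customers c
JOIN Purchases p ON p.customer_id = c.customer_id
GROUP BY c.customer_id, c.customer_name
HAVING COUNT(DISTINCT p.product_category) =
       (SELECT COUNT(DISTINCT product_category) FROM Purchases);
-- With the sample data above: customer 1 (Alice), the only one with all four categories
```

COUNT(DISTINCT ...) matters here because Alice bought Electronics twice, and a plain COUNT would overstate her category coverage.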
• Q.902
Learnings
• Use of window functions like RANK() to assign rankings based on aggregated data.
• Grouping by multiple columns (hotel_name, year, and month) for aggregating revenue.
• Use of DATE_TRUNC() (or equivalent) to extract the month and year from a DATE column.
• Filtering top-ranked rows using RANK() to get the best performing month.
Solutions
• - PostgreSQL solution
WITH monthly_revenue AS (
SELECT
hotel_name,
EXTRACT(YEAR FROM booking_date) AS year,
EXTRACT(MONTH FROM booking_date) AS month,
SUM(total_price) AS revenue
FROM hotel_bookings
GROUP BY hotel_name, EXTRACT(YEAR FROM booking_date), EXTRACT(MONTH FROM booking_date)
), ranked_months AS (
SELECT
hotel_name,
year,
month,
revenue,
RANK() OVER (PARTITION BY hotel_name ORDER BY revenue DESC) AS rank
FROM monthly_revenue
)
SELECT hotel_name, year, month, revenue
FROM ranked_months
WHERE rank = 1;
• - MySQL solution
WITH monthly_revenue AS (
SELECT
hotel_name,
YEAR(booking_date) AS year,
MONTH(booking_date) AS month,
SUM(total_price) AS revenue
FROM hotel_bookings
GROUP BY hotel_name, YEAR(booking_date), MONTH(booking_date)
), ranked_months AS (
SELECT
hotel_name,
year,
month,
revenue,
RANK() OVER (PARTITION BY hotel_name ORDER BY revenue DESC) AS revenue_rank
FROM monthly_revenue
)
SELECT hotel_name, year, month, revenue
FROM ranked_months
WHERE revenue_rank = 1;
• Q.903
Question
Find the details of employees whose salary is greater than the average salary across the entire
company.
Explanation
To solve this problem, we need to calculate the average salary for all employees in the
company and then filter the employees whose salary is above this average. We can use a
subquery to calculate the average salary and then compare each employee's salary to that
value.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE employees (
Learnings
• Use of subqueries to calculate aggregate values like average salary.
• Filtering results based on comparison with aggregate values.
• Basic understanding of how to use WHERE clauses for comparisons.
Solutions
• - PostgreSQL solution
SELECT employee_id, employee_name, department, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
• - MySQL solution
SELECT employee_id, employee_name, department, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
• Q.904
Question
You have a table called products with below columns: product_id, product_name, price,
quantity_sold. Calculate the percentage contribution of each product to total revenue.
Round the result to 2 decimal places.
Explanation
To solve this, first, calculate the total revenue by multiplying price and quantity_sold for
each product. Then, calculate the percentage contribution of each product by dividing its
revenue by the total revenue of all products and multiplying by 100. The result should be
rounded to 2 decimal places.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100),
price DECIMAL(10, 2),
quantity_sold INT
);
• - Datasets
INSERT INTO products (product_id, product_name, price, quantity_sold) VALUES
(1, 'iPhone', 899.00, 600),
(2, 'iMac', 1299.00, 150),
(3, 'MacBook Pro', 1499.00, 500),
(4, 'AirPods', 499.00, 800),
(5, 'Accessories', 199.00, 300);
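A minimal sketch of one way to compute the percentage contribution for the products table above. It assumes a database with window-function support (PostgreSQL, or MySQL 8.0 and later); the aliases revenue and pct_of_total_revenue are illustrative names, not part of the original question.

```sql
SELECT
    product_id,
    product_name,
    price * quantity_sold AS revenue,
    -- SUM(...) OVER () computes total revenue across all rows without collapsing them
    ROUND(100.0 * price * quantity_sold / SUM(price * quantity_sold) OVER (), 2) AS pct_of_total_revenue
FROM products
ORDER BY pct_of_total_revenue DESC;
```

The empty OVER () makes the total available on every row, so no separate subquery or self-join is needed.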
Learnings
Learnings
• Use of ROW_NUMBER() or RANK() to identify the first order for each customer.
revenue NUMERIC
);
• - Datasets
INSERT INTO amazon_transactions (user_id, item, purchase_date, revenue) VALUES
(109, 'milk', '2020-03-03', 123),
(139, 'biscuit', '2020-03-18', 421),
(120, 'milk', '2020-03-18', 176),
(108, 'banana', '2020-03-18', 862),
(130, 'milk', '2020-03-28', 333),
(103, 'bread', '2020-03-29', 862),
(122, 'banana', '2020-03-07', 952),
(125, 'bread', '2020-03-13', 317),
(139, 'bread', '2020-03-30', 929),
(141, 'banana', '2020-03-17', 812),
(116, 'bread', '2020-03-31', 226),
(128, 'bread', '2020-03-04', 112),
(146, 'biscuit', '2020-03-04', 362),
(119, 'banana', '2020-03-28', 127),
(142, 'bread', '2020-03-09', 503),
(122, 'bread', '2020-03-06', 593),
(128, 'biscuit', '2020-03-24', 160),
(112, 'banana', '2020-03-24', 262),
(149, 'banana', '2020-03-29', 382),
(100, 'banana', '2020-03-18', 599),
(130, 'milk', '2020-03-16', 604),
(103, 'milk', '2020-03-31', 290),
(112, 'banana', '2020-03-23', 523),
(102, 'bread', '2020-03-25', 325),
(120, 'biscuit', '2020-03-21', 858),
(109, 'bread', '2020-03-22', 432),
(101, 'milk', '2020-03-01', 449),
(138, 'milk', '2020-03-19', 961),
(100, 'milk', '2020-03-29', 410),
(129, 'milk', '2020-03-02', 771),
(123, 'milk', '2020-03-31', 434),
(104, 'biscuit', '2020-03-31', 957),
(110, 'bread', '2020-03-13', 210),
(143, 'bread', '2020-03-27', 870),
(130, 'milk', '2020-03-12', 176),
(128, 'milk', '2020-03-28', 498),
(133, 'banana', '2020-03-21', 837),
(150, 'banana', '2020-03-20', 927),
(120, 'milk', '2020-03-27', 793),
(109, 'bread', '2020-03-02', 362),
(110, 'bread', '2020-03-13', 262),
(140, 'milk', '2020-03-09', 468),
(112, 'banana', '2020-03-04', 381),
(117, 'biscuit', '2020-03-19', 831),
(137, 'banana', '2020-03-23', 490),
(130, 'bread', '2020-03-09', 149),
(133, 'bread', '2020-03-08', 658),
(143, 'milk', '2020-03-11', 317),
(111, 'biscuit', '2020-03-23', 204),
(150, 'banana', '2020-03-04', 299),
(131, 'bread', '2020-03-10', 155),
(140, 'biscuit', '2020-03-17', 810),
(147, 'banana', '2020-03-22', 702),
(119, 'biscuit', '2020-03-15', 355),
(116, 'milk', '2020-03-12', 468),
(141, 'milk', '2020-03-14', 254),
(143, 'bread', '2020-03-16', 647),
(105, 'bread', '2020-03-21', 562),
(149, 'biscuit', '2020-03-11', 827),
(117, 'banana', '2020-03-22', 249),
(150, 'banana', '2020-03-21', 450),
(134, 'bread', '2020-03-08', 981),
(133, 'banana', '2020-03-26', 353),
(127, 'milk', '2020-03-27', 300),
(101, 'milk', '2020-03-26', 740),
(137, 'biscuit', '2020-03-12', 473),
(113, 'biscuit', '2020-03-21', 278),
Learnings
• Using self-joins to compare data from the same table.
• Identifying first and second purchases with date filters.
• Using DISTINCT to return unique user_ids.
• Working with date arithmetic (e.g., purchase_date - first_purchase_date <= 7).
Solutions
• - PostgreSQL solution
SELECT DISTINCT a1.user_id AS active_users
FROM amazon_transactions a1 -- First purchase table
JOIN amazon_transactions a2 -- Second purchase table
ON a1.user_id = a2.user_id
AND a1.purchase_date < a2.purchase_date -- First purchase is before second
AND a2.purchase_date - a1.purchase_date <= 7 -- Second purchase is within 7 days
ORDER BY a1.user_id;
• - MySQL solution
SELECT DISTINCT a1.user_id AS active_users
FROM amazon_transactions a1 -- First purchase table
JOIN amazon_transactions a2 -- Second purchase table
ON a1.user_id = a2.user_id
AND a1.purchase_date < a2.purchase_date -- First purchase is before second
AND DATEDIFF(a2.purchase_date, a1.purchase_date) <= 7 -- Second purchase is within 7 days
ORDER BY a1.user_id;
• Q.907
Question
For each week, find the total number of orders. Include only the orders that are from the first
quarter of 2023.
Explanation
To solve this:
• Filter the orders table to include only the data from the first quarter of 2023 (January to
March).
• Group the results by week. The week can be derived with DATE_TRUNC('week', ...) in
PostgreSQL or YEARWEEK() in MySQL.
• Calculate the total for each week by summing the quantity for that week (use COUNT(*)
instead if every row should count as exactly one order).
• Return the week and the total number of orders.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE orders (
order_id INT PRIMARY KEY,
order_date DATE,
quantity INT
);
• - Datasets
INSERT INTO orders (order_id, order_date, quantity) VALUES
(1, '2023-01-02', 5),
(2, '2023-02-05', 3),
(3, '2023-02-07', 2),
(4, '2023-03-10', 6),
(5, '2023-02-15', 4),
(6, '2023-04-21', 8),
(7, '2023-05-28', 7),
(8, '2023-05-05', 3),
(9, '2023-08-10', 5),
(10, '2023-05-02', 6),
(11, '2023-02-07', 4),
(12, '2023-04-15', 9),
(13, '2023-03-22', 7),
(14, '2023-04-30', 8),
(15, '2023-04-05', 6),
(16, '2023-02-02', 6),
(17, '2023-01-07', 4),
(18, '2023-05-15', 9),
(19, '2023-05-22', 7),
(20, '2023-06-30', 8),
(21, '2023-07-05', 6);
Learnings
• Filtering data by a specific date range (first quarter of 2023).
• Grouping results by week using date functions.
• Aggregating data with SUM() for total orders per week.
Solutions
• - PostgreSQL solution
SELECT
DATE_TRUNC('week', order_date) AS week_start_date,
SUM(quantity) AS total_orders
FROM orders
WHERE order_date >= '2023-01-01' AND order_date <= '2023-03-31' -- Filter for the first quarter
GROUP BY week_start_date
ORDER BY week_start_date;
• - MySQL solution
SELECT
-- YEARWEEK() with mode 1 treats Monday as the start of the week and returns a
-- year-week number (for example 202306), not a date
YEARWEEK(order_date, 1) AS year_week,
SUM(quantity) AS total_orders
FROM orders
WHERE order_date >= '2023-01-01' AND order_date <= '2023-03-31' -- Filter for the first quarter
GROUP BY year_week
ORDER BY year_week;
• Q.908
Question
Write a query to find the starting and ending transaction amounts for each customer. Return
customer_id, their first transaction amount, last transaction amount, and the respective
transaction dates.
Explanation
To solve this:
• For each customer, identify the first and last transaction.
• To get the first transaction, order the records by transaction_date in ascending order.
• To get the last transaction, order the records by transaction_date in descending order.
• Use ROW_NUMBER() (or RANK()) window function to rank the transactions for each
customer.
• Combine the first and last rows per customer_id, either with conditional aggregation or by
joining the two results.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE bank_transactions (
transaction_id SERIAL PRIMARY KEY,
bank_id INT,
customer_id INT,
transaction_amount DECIMAL(10, 2),
transaction_type VARCHAR(10),
transaction_date DATE
);
• - Datasets
INSERT INTO bank_transactions (bank_id, customer_id, transaction_amount, transaction_type, transaction_date) VALUES
(1, 101, 500.00, 'credit', '2024-01-01'),
(1, 101, 200.00, 'debit', '2024-01-02'),
(1, 101, 300.00, 'credit', '2024-01-05'),
(1, 101, 150.00, 'debit', '2024-01-08'),
(1, 102, 1000.00, 'credit', '2024-01-01'),
(1, 102, 400.00, 'debit', '2024-01-03'),
(1, 102, 600.00, 'credit', '2024-01-05'),
(1, 102, 200.00, 'debit', '2024-01-09');
Learnings
• Using window functions (ROW_NUMBER(), RANK()) to get first and last transactions.
• Filtering the dataset by first and last rows for each customer.
• Sorting data by transaction date to identify the first and last transactions.
Solutions
• - PostgreSQL solution
WITH ranked_transactions AS (
SELECT
customer_id,
transaction_amount,
transaction_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY transaction_date ASC) AS first_rank,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY transaction_date DESC) AS last_rank
FROM bank_transactions
)
SELECT
customer_id,
-- Conditional aggregation picks the amount from the row ranked first / last per customer.
-- (A window function over ungrouped columns cannot be combined with GROUP BY here.)
MAX(CASE WHEN first_rank = 1 THEN transaction_amount END) AS first_transaction_amt,
MAX(CASE WHEN last_rank = 1 THEN transaction_amount END) AS last_transaction_amt,
MIN(transaction_date) AS first_transaction_date,
MAX(transaction_date) AS last_transaction_date
FROM ranked_transactions
WHERE first_rank = 1 OR last_rank = 1
GROUP BY customer_id;
• - MySQL solution
WITH ranked_transactions AS (
SELECT
customer_id,
transaction_amount,
transaction_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY transaction_date ASC) AS first_rank,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY transaction_date DESC) AS last_rank
FROM bank_transactions
)
SELECT
customer_id,
MAX(CASE WHEN first_rank = 1 THEN transaction_amount END) AS first_transaction_amt,
MAX(CASE WHEN last_rank = 1 THEN transaction_amount END) AS last_transaction_amt,
MIN(CASE WHEN first_rank = 1 THEN transaction_date END) AS first_transaction_date,
MAX(CASE WHEN last_rank = 1 THEN transaction_date END) AS last_transaction_date
FROM ranked_transactions
GROUP BY customer_id;
• Q.909
Question
Write a query to fetch students with the minimum and maximum marks from the "Students"
table.
Explanation
You need to identify the students who have the lowest and highest marks. This can be done
by using aggregate functions such as MIN() and MAX(), and then joining these results with the
original table to retrieve the corresponding student details.
Learnings
• Use of MIN() and MAX() aggregate functions to find the minimum and maximum values.
• Applying JOIN to retrieve full student details based on aggregate results.
• Filtering rows using WHERE to match the minimum and maximum marks.
Solutions
• - PostgreSQL solution
SELECT student_id, student_name, marks, class
FROM Students
WHERE marks = (SELECT MIN(marks) FROM Students)
OR marks = (SELECT MAX(marks) FROM Students);
• - MySQL solution
SELECT student_id, student_name, marks, class
FROM Students
WHERE marks = (SELECT MIN(marks) FROM Students)
OR marks = (SELECT MAX(marks) FROM Students);
• Q.910
Question
Write a query to find products that are sold by both Supplier A and Supplier B, excluding
products sold by only one supplier.
Explanation
To solve this problem, we need to identify products that are sold by both Supplier A and
Supplier B. We can achieve this by grouping the data by product_id and filtering the groups
to include only those where both suppliers are present. We will use the HAVING clause to
ensure that the product has records for both suppliers.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE products (
product_id INT,
product_name VARCHAR(100),
supplier_name VARCHAR(50)
);
• - Datasets
INSERT INTO products (product_id, product_name, supplier_name) VALUES
(1, 'Product 1', 'Supplier A'),
(1, 'Product 1', 'Supplier B'),
(3, 'Product 3', 'Supplier A'),
(3, 'Product 3', 'Supplier A'),
(5, 'Product 5', 'Supplier A'),
(5, 'Product 5', 'Supplier B'),
(7, 'Product 7', 'Supplier C'),
(8, 'Product 8', 'Supplier A'),
(7, 'Product 7', 'Supplier B'),
Learnings
• Using GROUP BY to aggregate data based on common fields.
• Filtering groups with HAVING based on conditions applied to aggregated results.
• Understanding how to check for multiple conditions across rows in the same group (i.e.,
different suppliers for the same product).
Solutions
• - PostgreSQL solution
SELECT product_id, product_name
FROM products
WHERE supplier_name IN ('Supplier A', 'Supplier B') -- Consider only the two suppliers of interest
GROUP BY product_id, product_name
HAVING COUNT(DISTINCT supplier_name) = 2; -- Both suppliers must appear for the product
• - MySQL solution
SELECT product_id, product_name
FROM products
WHERE supplier_name IN ('Supplier A', 'Supplier B') -- Consider only the two suppliers of interest
GROUP BY product_id, product_name
HAVING COUNT(DISTINCT supplier_name) = 2; -- Both suppliers must appear for the product
Answer
The WHERE clause is used to filter records in SQL queries. It specifies the conditions that
must be met for a row to be included in the result set. It can be used with comparison
operators (like =, >, <), logical operators (like AND, OR), pattern matching (LIKE), and to
handle NULL values (IS NULL).
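The queries below sketch each kind of condition mentioned above. They assume an Employees table with Salary, Department, and LastName columns; the threshold and pattern are arbitrary illustrations.

```sql
-- Comparison and logical operators
SELECT * FROM Employees
WHERE Salary > 50000 AND Department = 'Sales';

-- Pattern matching: last names that start with 'S'
SELECT * FROM Employees
WHERE LastName LIKE 'S%';

-- NULL handling: "= NULL" never matches, so use IS NULL
SELECT * FROM Employees
WHERE Department IS NULL;
```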
• Q.912
Question
What is the purpose of the GROUP BY clause in SQL? Provide an example.
Answer
The GROUP BY clause in SQL is used to group rows that have the same values in specified
columns into summary rows. It is often used with aggregate functions like COUNT(), SUM(),
AVG(), MAX(), and MIN() to perform calculations on each group of rows.
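As a minimal sketch, assuming an Employees table with Department and Salary columns, the query below returns one summary row per department:

```sql
SELECT
    Department,
    COUNT(*)    AS employee_count,  -- rows in each group
    AVG(Salary) AS avg_salary       -- average computed per group
FROM Employees
GROUP BY Department;
```

Every column in the SELECT list is either listed in GROUP BY or wrapped in an aggregate function.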
• Q.913
Question
Explain the ORDER OF EXECUTION in SQL.
Answer
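The order of execution in SQL is the logical sequence in which the clauses of a query are
evaluated, which differs from the order in which they are written. The optimizer may
physically reorder the work, but the result must match this logical order. Here's the typical order: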
• FROM – Retrieves the data from the tables.
• JOIN – Combines rows from different tables based on a related column.
• WHERE – Filters rows based on the given condition.
• GROUP BY – Groups rows based on column values.
• HAVING – Filters groups after the GROUP BY operation.
• SELECT – Chooses the columns to display.
• DISTINCT – Removes duplicate rows (if any).
• ORDER BY – Sorts the result based on specified columns.
• LIMIT – Limits the number of rows returned.
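The list above can be mapped onto a single query. This is an illustrative sketch that assumes Employees and Departments tables joined on DepartmentID; the numbers in the comments refer to the logical steps, and step 7 (DISTINCT) is not used here.

```sql
SELECT d.DepartmentName, COUNT(*) AS headcount   -- 6. SELECT (aliases are defined here)
FROM Employees e                                 -- 1. FROM
JOIN Departments d                               -- 2. JOIN
  ON e.DepartmentID = d.DepartmentID
WHERE e.Salary > 30000                           -- 3. WHERE (filters rows before grouping)
GROUP BY d.DepartmentName                        -- 4. GROUP BY
HAVING COUNT(*) >= 5                             -- 5. HAVING (filters groups)
ORDER BY headcount DESC                          -- 8. ORDER BY (can use the SELECT alias)
LIMIT 10;                                        -- 9. LIMIT
```

Because WHERE is evaluated before SELECT, the alias headcount cannot be referenced in the WHERE clause, while ORDER BY can use it.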
Answer
• PRIMARY KEY constraint ensures that the column (or set of columns) has unique values and
cannot contain NULL. It is used to identify each record uniquely in a table.
• UNIQUE constraint also ensures that the column has unique values, but it allows NULL
values. How many NULLs are allowed depends on the database system: PostgreSQL and MySQL
allow multiple NULLs in a UNIQUE column by default, while SQL Server allows only one.
Key differences:
• A table can have only one PRIMARY KEY, but it can have multiple UNIQUE
constraints.
• PRIMARY KEY columns cannot contain NULL values, but UNIQUE columns can.
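A small sketch of both constraints on one table. The users table and its columns are hypothetical names chosen for illustration.

```sql
CREATE TABLE users (
    user_id  INT PRIMARY KEY,        -- unique and NOT NULL; only one primary key per table
    email    VARCHAR(255) UNIQUE,    -- unique, but NULL is allowed
    username VARCHAR(50) UNIQUE      -- a table may have several UNIQUE constraints
);

INSERT INTO users (user_id, email, username) VALUES (1, 'a@example.com', 'alice');
INSERT INTO users (user_id, email, username) VALUES (2, NULL, 'bob');   -- OK: NULL email

-- Each of these would be rejected:
-- INSERT INTO users (user_id, email, username) VALUES (1, 'c@example.com', 'carol');    -- duplicate primary key
-- INSERT INTO users (user_id, email, username) VALUES (NULL, 'd@example.com', 'dave');  -- NULL primary key
-- INSERT INTO users (user_id, email, username) VALUES (3, 'a@example.com', 'erin');     -- duplicate email
```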
Answer
• NULL represents the absence of a value or unknown data in a database. It is a special
marker used to indicate that no data exists for that field.
• "null" (with quotes) is just a string, a value like any other text. It is not the same as NULL
and represents the literal word "null". In SQL, a string literal is normally written with
single quotes, as 'null'.
Key difference:
• NULL is used to indicate missing or undefined data, while "null" is just a string
containing the characters "n", "u", "l", and "l".
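The difference shows up as soon as the two are compared. The following sketch runs without any table; the column aliases are illustrative, and MySQL displays true and false as 1 and 0.

```sql
SELECT
    'null' = 'null'  AS string_equals_string,  -- true: two identical strings
    NULL = NULL      AS null_equals_null,      -- NULL: comparing unknowns yields unknown
    NULL IS NULL     AS null_is_null,          -- true
    'null' IS NULL   AS string_is_null;        -- false: the string is a real value
```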
Answer
• CHAR (Character) is a fixed-length data type. It stores data with a predefined length,
padding with spaces if the value is shorter than the specified length.
• VARCHAR (Variable Character) is a variable-length data type. It only uses the amount of
space required to store the actual value, without padding.
Key differences:
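• CHAR can be marginally faster for truly fixed-length data in some database engines, but it
wastes storage when values are shorter than the defined length. In PostgreSQL it offers no
performance advantage over VARCHAR.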
• VARCHAR is more efficient for variable-length data, as it only uses as much space as
needed for the value.
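A typical pairing of the two types, sketched with a hypothetical countries table: a code that always has the same length fits CHAR, while a name of varying length fits VARCHAR.

```sql
CREATE TABLE countries (
    country_code CHAR(2),        -- always two characters, e.g. 'US'
    country_name VARCHAR(100)    -- length varies from row to row, up to 100 characters
);

INSERT INTO countries (country_code, country_name) VALUES
('US', 'United States'),
('IN', 'India');
```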
Answer
• VARIABLE in a stored procedure is used to store temporary data or intermediate results.
It can only be accessed and modified within the scope of the stored procedure.
• PARAMETER is a value passed into a stored procedure when it is called. It can be used
to accept input from the caller or return a value to the caller.
Key differences:
• VARIABLE is local to the stored procedure and is used for internal calculations or data
manipulation.
• PARAMETER is used to pass data into or out of the stored procedure.
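The sketch below shows both in one MySQL stored procedure. The procedure name, the p_ and v_ prefixes, and the Employees table with a Department column are assumptions made for illustration; DELIMITER is a command of the mysql command-line client, not part of SQL itself.

```sql
DELIMITER //

CREATE PROCEDURE get_department_headcount(
    IN  p_department VARCHAR(50),   -- parameter: input supplied by the caller
    OUT p_headcount  INT            -- parameter: output returned to the caller
)
BEGIN
    DECLARE v_count INT DEFAULT 0;  -- variable: exists only inside this procedure

    SELECT COUNT(*) INTO v_count
    FROM Employees
    WHERE Department = p_department;

    SET p_headcount = v_count;
END //

DELIMITER ;

-- The caller passes a value in and reads the result back through the OUT parameter
CALL get_department_headcount('Sales', @headcount);
SELECT @headcount;
```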
Answer
The ORDER BY clause in SQL is used to sort the result set of a query in ascending (ASC) or
descending (DESC) order based on one or more columns.
• ASC (ascending) is the default and sorts data from lowest to highest (e.g., A to Z, 1 to 10).
• DESC (descending) sorts data from highest to lowest (e.g., Z to A, 10 to 1).
Example 1: Sorting by one column (ascending by default)
SELECT * FROM Employees
ORDER BY LastName;
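Sorting by several columns with mixed directions might look like the following sketch, which assumes Employees also has Department and Salary columns:

```sql
-- Sort departments A to Z, then the highest salary first within each department
SELECT * FROM Employees
ORDER BY Department ASC, Salary DESC;
```

Later columns in the ORDER BY list only break ties left by the earlier ones.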
Answer
In SQL, NULL represents missing or undefined data. To handle NULL values, you can use the
following methods:
• IS NULL / IS NOT NULL
Check if a column contains NULL or not.
SELECT * FROM Employees WHERE Department IS NULL;
• COALESCE()
Replace NULL with a default value.
SELECT COALESCE(Salary, 0) FROM Employees;
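Two further NULL behaviors are worth a quick sketch. The HoursWorked column is hypothetical and used only for illustration.

```sql
-- Aggregates skip NULLs: COUNT(*) counts rows, COUNT(Salary) counts non-NULL salaries
SELECT
    COUNT(*)      AS all_rows,
    COUNT(Salary) AS rows_with_salary,
    AVG(Salary)   AS avg_known_salary
FROM Employees;

-- NULLIF turns a chosen value into NULL, here to avoid a division-by-zero error
SELECT Salary / NULLIF(HoursWorked, 0) AS pay_per_hour
FROM Employees;
```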
Answer
Normalization is the process of organizing data in a database to reduce redundancy and
improve data integrity. There are several "normal forms" (1NF, 2NF, and 3NF) that guide the
normalization process.
• First Normal Form (1NF):
• A table is in 1NF if all its columns contain atomic values (i.e., each column contains only
one value per row).
• No repeating groups or arrays.
• Example: Instead of storing multiple phone numbers in one column, each phone number
should be stored in a separate row.
-- Example of 1NF
| CustomerID | Name | Phone |
|------------|----------|------------|
| 1 | John | 1234567890 |
| 1 | John | 9876543210 |
• Second Normal Form (2NF):
• A table is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent
on the primary key (i.e., no partial dependency).
• This removes partial dependencies where non-key columns depend only on part of a
composite key.
• Example: Split a table with a composite key into multiple tables.
-- Example of 2NF (before and after)
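-- Before 2NF (composite key: (OrderID, ProductID))
-- CustomerID depends only on OrderID; ProductName depends only on ProductID
| OrderID | ProductID | CustomerID | ProductName | Quantity |
|---------|-----------|------------|-------------|----------|
| 1 | 101 | 1001 | Apple | 10 |
| 1 | 102 | 1001 | Banana | 5 |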
-- After 2NF
-- Orders table
| OrderID | CustomerID |
|---------|------------|
| 1 | 1001 |
-- OrderDetails table
| OrderID | ProductID | Quantity |
|---------|-----------|----------|
| 1 | 101 | 10 |
| 1 | 102 | 5 |
-- Products table
| ProductID | ProductName |
|-----------|-------------|
| 101 | Apple |
| 102 | Banana |
• Third Normal Form (3NF):
• A table is in 3NF if it is in 2NF and there are no transitive dependencies (i.e., non-key
attributes should not depend on other non-key attributes).
• Every non-key attribute should depend only on the primary key.
• Example: Separate a column that depends on another non-key column into a new table.
-- Example of 3NF (before and after)
-- Before 3NF
| StudentID | StudentName | Department | DepartmentHead |
|-----------|-------------|------------|----------------|
| 1 | Alice | Physics | Dr. Smith |
-- After 3NF
-- Students table
| StudentID | StudentName | DepartmentID |
|-----------|-------------|--------------|
| 1 | Alice | 101 |
-- Departments table
| DepartmentID | Department | DepartmentHead |
|--------------|-------------|----------------|
| 101 | Physics | Dr. Smith |
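The 3NF design above could be declared as follows. This is a sketch: the column types are assumptions, and the final query shows that the original wide row can still be rebuilt with a join.

```sql
CREATE TABLE Departments (
    DepartmentID   INT PRIMARY KEY,
    Department     VARCHAR(100),
    DepartmentHead VARCHAR(100)     -- stored once per department, not once per student
);

CREATE TABLE Students (
    StudentID    INT PRIMARY KEY,
    StudentName  VARCHAR(100),
    DepartmentID INT,
    FOREIGN KEY (DepartmentID) REFERENCES Departments (DepartmentID)
);

-- Rebuild the pre-3NF view of the data
SELECT s.StudentID, s.StudentName, d.Department, d.DepartmentHead
FROM Students s
JOIN Departments d ON s.DepartmentID = d.DepartmentID;
```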
Answer
A database trigger is a set of SQL statements that are automatically executed (or
"triggered") when certain events occur on a specific table or view. Triggers can be used to
enforce business rules, automate tasks, and maintain data integrity.
Types of Triggers:
• BEFORE Trigger: Executes before the operation (INSERT, UPDATE, DELETE) takes
place.
• AFTER Trigger: Executes after the operation (INSERT, UPDATE, DELETE) has
occurred.
• INSTEAD OF Trigger: Executes instead of the operation, used mainly for views.
This trigger logs every deletion from the Employees table into an Employee_Audit table
with a timestamp.
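A deletion-audit trigger of this kind might look like the following sketch in MySQL syntax. The trigger name and the Employee_Audit columns (EmployeeID, DeletedAt) are assumptions; DELIMITER is a command of the mysql command-line client.

```sql
DELIMITER //

CREATE TRIGGER log_employee_delete
AFTER DELETE ON Employees
FOR EACH ROW
BEGIN
    -- OLD refers to the row that was just deleted
    INSERT INTO Employee_Audit (EmployeeID, DeletedAt)
    VALUES (OLD.EmployeeID, NOW());
END //

DELIMITER ;
```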
This trigger prevents insertion of records into the Employees table if the Salary is negative.
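One way to write such a validation trigger, sketched in MySQL syntax with a hypothetical trigger name and error message:

```sql
DELIMITER //

CREATE TRIGGER prevent_negative_salary
BEFORE INSERT ON Employees
FOR EACH ROW
BEGIN
    -- NEW refers to the row about to be inserted
    IF NEW.Salary < 0 THEN
        -- Raising an error aborts the INSERT
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'Salary cannot be negative';
    END IF;
END //

DELIMITER ;
```

A BEFORE trigger is used so that the invalid row is rejected before it reaches the table.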
When a new order is placed, a trigger can automatically update the Products table to reduce
stock.
CREATE TRIGGER update_inventory
AFTER INSERT ON Orders
FOR EACH ROW
BEGIN
UPDATE Products
SET Stock = Stock - NEW.Quantity
WHERE ProductID = NEW.ProductID;
END;
This trigger ensures that stock levels in the Products table are updated when a new order is
placed in the Orders table.
Answer
• VIEW:
• A VIEW is a virtual table that represents the result of a query. It does not store data
physically; instead, it generates the data dynamically when queried.
• Every time you access a VIEW, the underlying query is executed to fetch the latest data.
• It is useful for simplifying complex queries, hiding sensitive data, or presenting data in a
particular format.
• MATERIALIZED VIEW:
• A MATERIALIZED VIEW is similar to a VIEW, but it stores the result of the query physically,
like a snapshot of the data at the time the view was created or last refreshed.
• It improves performance by allowing the data to be precomputed and stored, but it can
become outdated if the underlying data changes.
• You can refresh a MATERIALIZED VIEW manually or at set intervals to ensure it reflects the
most current data.
Key Differences:
• Storage:
• VIEW: Does not store data; it's virtual.
• MATERIALIZED VIEW: Stores the query result physically.
• Performance:
• VIEW: Query performance can be slower because data is fetched dynamically each time.
• MATERIALIZED VIEW: Generally faster for large datasets, as data is precomputed and
stored.
• Data Freshness:
• VIEW: Always shows the most up-to-date data.
• MATERIALIZED VIEW: May show stale data until it is refreshed.
• Usage:
• VIEW: Useful for complex queries that need to be executed repeatedly or for hiding
complexity.
• MATERIALIZED VIEW: Useful for improving performance where the query involves large or
complex aggregations and doesn't require real-time data.
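The contrast can be sketched in PostgreSQL syntax (MySQL has no native materialized views). The view names and the Employees columns used here are illustrative assumptions.

```sql
-- Regular view: the underlying query runs every time the view is read
CREATE VIEW department_payroll AS
SELECT Department, SUM(Salary) AS total_salary
FROM Employees
GROUP BY Department;

-- Materialized view: the result is computed once and stored
CREATE MATERIALIZED VIEW department_payroll_mv AS
SELECT Department, SUM(Salary) AS total_salary
FROM Employees
GROUP BY Department;

-- The stored result stays as it is until it is refreshed explicitly
REFRESH MATERIALIZED VIEW department_payroll_mv;
```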
Answer
• AND:
• The AND operator is used to combine multiple conditions, and all conditions must be true
for the overall expression to be true.
• If one condition is false, the entire expression evaluates to false.
Example:
SELECT * FROM Employees
WHERE Age > 30 AND Department = 'Sales';
This returns employees who are both older than 30 and belong to the 'Sales' department.
• OR:
• The OR operator is used to combine multiple conditions, and if any of the conditions is
true, the overall expression will be true.
• If one condition is true, the entire expression evaluates to true, even if the other condition
is false.
Example:
SELECT * FROM Employees
WHERE Age > 30 OR Department = 'Sales';
This returns employees who are either older than 30 or belong to the 'Sales' department.
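When AND and OR appear in the same condition, AND is evaluated first. The sketch below assumes Employees also has a Salary column and shows how parentheses change the meaning.

```sql
-- AND binds tighter than OR, so this is read as:
-- Age > 30 OR (Department = 'Sales' AND Salary > 50000)
SELECT * FROM Employees
WHERE Age > 30 OR Department = 'Sales' AND Salary > 50000;

-- Parentheses make a different grouping explicit
SELECT * FROM Employees
WHERE (Age > 30 OR Department = 'Sales') AND Salary > 50000;
```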
Key Differences:
• Condition Requirement:
Answer
• UNION:
• The UNION operator combines the results of two or more queries and removes duplicate
rows from the final result.
• It ensures that each row in the result set is unique.
• Performance: Slightly slower because it needs to check and remove duplicates.
Example:
SELECT Name FROM Employees WHERE Department = 'HR'
UNION
SELECT Name FROM Employees WHERE Department = 'Finance';
This query returns a list of unique employee names from both the HR and Finance
departments, removing any duplicates.
• UNION ALL:
• The UNION ALL operator combines the results of two or more queries and includes all
rows, even duplicates.
• It does not perform any duplicate removal, so it's faster than UNION because it simply
concatenates the result sets.
Example:
SELECT Name FROM Employees WHERE Department = 'HR'
UNION ALL
SELECT Name FROM Employees WHERE Department = 'Finance';
This query returns a list of employee names from both HR and Finance departments,
including duplicates.
Key Differences:
• Duplicate Rows:
• UNION: Removes duplicate rows.
• UNION ALL: Includes all rows, even duplicates.
• Performance:
• UNION: Slightly slower due to the process of eliminating duplicates.
• UNION ALL: Faster because it does not remove duplicates.
• Use Cases:
• UNION: Use when you want a distinct list of results.
• UNION ALL: Use when duplicates are acceptable or when performance is a priority.
Answer
• LEFT JOIN (or LEFT OUTER JOIN):
• A LEFT JOIN returns all rows from the left table and the matching rows from the right
table.
• If there is no match in the right table, the result will contain NULL values for columns from
the right table.
Example:
SELECT Employees.Name, Departments.DepartmentName
FROM Employees
LEFT JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
This query returns all employees, even those who do not belong to any department (with
NULL for the department name in such cases).
This query returns all employees and all departments, with NULL values for either employee
or department where there is no match.
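A FULL JOIN over the same two tables might look like the sketch below. PostgreSQL supports FULL JOIN directly; MySQL does not, so the second query shows a common emulation that combines a LEFT JOIN with the unmatched rows of a RIGHT JOIN.

```sql
-- PostgreSQL
SELECT Employees.Name, Departments.DepartmentName
FROM Employees
FULL JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;

-- MySQL emulation
SELECT e.Name, d.DepartmentName
FROM Employees e
LEFT JOIN Departments d ON e.DepartmentID = d.DepartmentID
UNION ALL
SELECT e.Name, d.DepartmentName
FROM Employees e
RIGHT JOIN Departments d ON e.DepartmentID = d.DepartmentID
WHERE e.DepartmentID IS NULL;  -- keep only departments with no matching employee
```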
Key Differences:
• Returned Rows:
• LEFT JOIN: Returns all rows from the left table, and matched rows from the right table. If
no match, returns NULL from the right table.
• FULL JOIN: Returns all rows from both the left and right tables. If no match, returns NULL
for missing data from the respective table.
• Use Cases:
• LEFT JOIN: Use when you want all records from the left table and any matching records
from the right table.
• FULL JOIN: Use when you want all records from both tables, regardless of whether they
have matching rows.
• NULL Values:
• LEFT JOIN: Can result in NULL values only for columns of the right table when there's no
match.
• FULL JOIN: Can result in NULL values for columns from either the left or the right table
when no match exists.
Answer
• SELF JOIN:
• A SELF JOIN is a join where a table is joined with itself. It is used when you need to
compare rows within the same table.
• Typically, a table is aliased to differentiate between the instances of the same table in the
query.
Example:
SELECT A.EmployeeName, B.EmployeeName AS ManagerName
FROM Employees A
LEFT JOIN Employees B ON A.ManagerID = B.EmployeeID;
This query joins the Employees table with itself to get each employee's manager's name.
A and B are aliases for two instances of the same Employees table, and the LEFT JOIN keeps
employees who have no manager.
• CROSS JOIN:
• A CROSS JOIN produces the Cartesian product of two tables, meaning it returns every
combination of rows from both tables.
• There is no condition or relationship between the tables. If Table1 has m rows and Table2
has n rows, the result will contain m * n rows.
Example:
SELECT Products.ProductName, Colors.Color
FROM Products
CROSS JOIN Colors;
This query returns all combinations of products and colors, creating a product-color
combination for every row in both tables.
Key Differences:
• Purpose:
• SELF JOIN: Used to join a table with itself, typically to compare or relate rows within the
same table.
• CROSS JOIN: Used to produce all possible combinations of rows between two tables.
• Result:
• SELF JOIN: Produces a result based on logical relationships between rows in the same
table (e.g., employees and managers).
• CROSS JOIN: Produces every possible pair of rows from both tables, resulting in a large
result set.
• Conditions:
• SELF JOIN: Involves a condition to specify how the rows are related (usually via
primary/foreign key relationships).
• CROSS JOIN: No condition; it simply combines each row from the first table with every
row from the second table.
Answer
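It depends on the database system. MySQL and SQL Server support ALTER VIEW ... AS SELECT,
which replaces the definition of an existing view. In PostgreSQL, ALTER VIEW changes only
auxiliary properties (such as the view's name, owner, or column defaults), not the defining
query.
Where ALTER VIEW cannot change the query, you can drop the view and recreate it, or use
CREATE OR REPLACE VIEW (available in MySQL and PostgreSQL) to replace the definition without
dropping it explicitly. In PostgreSQL, the new query must return the existing columns with
the same names and types in the same order, although new columns may be added at the end.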
Example:
CREATE OR REPLACE VIEW view_name AS
SELECT column1, column2
FROM table_name
WHERE condition;
This approach allows you to modify the view definition without manually dropping and
creating it.
Answer
• CTE (Common Table Expression):
• A CTE is a temporary result set defined within the execution scope of a SELECT, INSERT,
UPDATE, or DELETE statement. It simplifies complex queries by providing a way to break them
into modular subqueries.
• A CTE is defined using the WITH clause and can be referenced multiple times within the
main query.
Example:
WITH EmployeeCTE AS (
SELECT EmployeeID, EmployeeName, Department
FROM Employees
)
SELECT * FROM EmployeeCTE;
In this example, the EmployeeCTE is a simple result set that is defined and used in the main
query.
• Recursive CTE:
• A recursive CTE is a type of CTE that refers to itself within its definition. It is used to
handle hierarchical or recursive data, such as organizational charts, folder structures, or
family trees.
• A recursive CTE typically has two parts:
• Anchor member: The base query that provides the starting point of the recursion.
• Recursive member: The query that references the CTE itself and iteratively fetches
related data.
Example (e.g., hierarchical employee structure):
WITH RECURSIVE EmployeeHierarchy AS (
-- Anchor member (base case)
SELECT EmployeeID, EmployeeName, ManagerID
FROM Employees
WHERE ManagerID IS NULL
UNION ALL
-- Recursive member
SELECT e.EmployeeID, e.EmployeeName, e.ManagerID
FROM Employees e
JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT * FROM EmployeeHierarchy;
This recursive CTE retrieves all employees and their managers in a hierarchical structure.
Key Differences:
• Purpose:
• CTE: Used for simplifying complex queries by breaking them into smaller, reusable
subqueries.
• Recursive CTE: Specifically used to handle hierarchical or recursive relationships in data
(e.g., finding a manager’s hierarchy).
• Structure:
• CTE: A non-recursive result set that can be referenced within a single query.
• Recursive CTE: Includes two parts (anchor and recursive) and references itself to handle
recursion.
• Use Case:
• CTE: Used for modularizing queries, performing joins, and aggregating data.
• Recursive CTE: Used to process hierarchical or recursive data, like calculating a path,
parent-child relationships, etc.
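A recursive CTE can also carry a value from one level to the next. The sketch below extends the hierarchy example with a depth counter; the HierarchyLevel column is an illustrative addition, and the syntax assumes PostgreSQL or MySQL 8.0 and later.

```sql
WITH RECURSIVE EmployeeHierarchy AS (
    -- Anchor member: top-level employees start at level 1
    SELECT EmployeeID, EmployeeName, ManagerID, 1 AS HierarchyLevel
    FROM Employees
    WHERE ManagerID IS NULL
    UNION ALL
    -- Recursive member: each report sits one level below its manager
    SELECT e.EmployeeID, e.EmployeeName, e.ManagerID, eh.HierarchyLevel + 1
    FROM Employees e
    JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT EmployeeID, EmployeeName, HierarchyLevel
FROM EmployeeHierarchy
ORDER BY HierarchyLevel, EmployeeID;
```

The recursion stops when the recursive member returns no new rows, that is, when no employee reports to anyone found in the previous step.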
Answer
A subquery is a query nested inside another query (usually within the SELECT, INSERT,
UPDATE, or DELETE statement). Subqueries can be classified as correlated or non-correlated
based on how they relate to the outer query.
Correlated Subquery:
• A correlated subquery is a subquery that depends on the outer query for its values. For
each row processed by the outer query, the subquery is executed, and the values from the
outer query are passed into the subquery.
• The subquery references columns from the outer query (usually via correlation), and its
result depends on those values.
Example:
SELECT EmployeeID, Name
FROM Employees E
WHERE Salary > (
SELECT AVG(Salary)
FROM Employees
WHERE DepartmentID = E.DepartmentID
);
In this example:
• The subquery uses E.DepartmentID from the outer query, making it correlated. For each
employee in the outer query, the subquery recalculates the average salary for that particular
department.
Non-Correlated Subquery:
• A non-correlated subquery is a subquery that does not depend on the outer query. It is
executed once, and its result is used by the outer query. The subquery runs independently of
the outer query.
• The subquery does not reference any columns from the outer query.
Example:
SELECT EmployeeID, Name
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);
In this example:
• The subquery calculates the average salary of all employees once and is independent of the
outer query. It does not refer to any column from the outer Employees table.
Key Differences:
• Dependence on Outer Query:
• Correlated Subquery: Depends on the outer query for its values. It is evaluated for each
row in the outer query.
• Non-Correlated Subquery: Does not depend on the outer query. It is executed once and
its result is used by the outer query.
• Execution:
• Correlated Subquery: Evaluated once for each row of the outer query.
• Non-Correlated Subquery: Evaluated only once, independent of the outer query.
• Performance:
• Correlated Subquery: Can be slower, since the subquery is logically evaluated once per
outer row, although many optimizers rewrite it as a join or reuse repeated results.
• Non-Correlated Subquery: Usually more efficient since the subquery is executed only
once.
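When a correlated subquery turns out to be slow, one common rewrite is to compute the per-group value once in a derived table and join to it. The sketch below rewrites the department-average example from above. It assumes the same Employees columns (EmployeeID, Name, Salary, DepartmentID), and the aliases d and AvgSalary are introduced here for illustration.

```sql
-- Compute each department's average salary once, then join to it
SELECT e.EmployeeID, e.Name
FROM Employees e
JOIN (
    SELECT DepartmentID, AVG(Salary) AS AvgSalary
    FROM Employees
    GROUP BY DepartmentID
) d ON d.DepartmentID = e.DepartmentID
WHERE e.Salary > d.AvgSalary;
```

Whether this is actually faster depends on the optimizer and the data, so it is worth comparing the execution plans of both forms.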
Answer
When troubleshooting a slow-running SQL query, the goal is to identify the bottlenecks and
optimize performance. Here are the steps you would typically follow:
3. Optimize Joins
• Why: Inefficient joins can cause performance problems, especially on large tables or when
a missing or incorrect join condition produces an accidental Cartesian product (an
unintended cross join).
• How:
• Review join conditions and use the join type the query actually needs (e.g., INNER JOIN
rather than LEFT JOIN when unmatched rows are not required).
• Avoid joining unnecessary tables, and avoid using DISTINCT to hide duplicates created by
a faulty join.
• Action: Rewrite complex queries, use appropriate join conditions, or limit the dataset with
WHERE before joining.
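The join advice above can be sketched as follows. The Orders and Customers tables, their columns, and the date cutoff are hypothetical and exist only to illustrate the pattern.

```sql
-- Hypothetical tables for illustration
CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    Name       VARCHAR(100)
);
CREATE TABLE Orders (
    OrderID    INT PRIMARY KEY,
    CustomerID INT,
    OrderDate  DATE
);

-- Problem: no join condition, so every order is paired with every customer
SELECT o.OrderID, c.Name
FROM Orders o
CROSS JOIN Customers c;

-- Fix: join on the key and filter the rows that are actually needed
SELECT o.OrderID, c.Name
FROM Orders o
INNER JOIN Customers c ON c.CustomerID = o.CustomerID
WHERE o.OrderDate >= '2024-01-01';
```

Most optimizers apply the WHERE filter before the join on their own, so an explicit pre-filtering subquery is usually only needed when the execution plan shows that this is not happening.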
Answer
Both ROW_NUMBER() and DENSE_RANK() are window functions in SQL that assign a number to
each row within a result set (or partition), but they behave differently when handling ties
(duplicate values in the ORDER BY columns).
ROW_NUMBER():
• The ROW_NUMBER() function assigns a unique sequential integer to each row, starting from
1 for the first row. When there are duplicate values, ROW_NUMBER() still assigns a unique
number to each row; the order among tied rows is arbitrary unless the ORDER BY includes a
tie-breaker column.
• There is no gap between the assigned numbers, even if there are duplicate values in the
result set.
Example:
SELECT EmployeeID, Salary, ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum
FROM Employees;
EmployeeID Salary RowNum
1 50000 1
2 50000 2
3 45000 3
4 40000 4
In this case, even though EmployeeID 1 and EmployeeID 2 have the same salary, they still
get unique RowNum values (1 and 2).
DENSE_RANK():
• The DENSE_RANK() function also assigns a rank to each row, but rows with the same value
receive the same rank, and the next distinct value receives the next consecutive rank.
• This means that there are no gaps in the ranking, unlike RANK(), which skips ranks after
ties (e.g., 1, 1, 3).
Example:
-- The alias is SalaryRank because RANK is a reserved word in some databases (e.g., MySQL 8.0)
SELECT EmployeeID, Salary, DENSE_RANK() OVER (ORDER BY Salary DESC) AS SalaryRank
FROM Employees;
EmployeeID Salary SalaryRank
1 50000 1
2 50000 1
3 45000 2
4 40000 3
Here, EmployeeID 1 and EmployeeID 2 have the same salary and receive the same rank (1),
but EmployeeID 3 gets rank 2, and EmployeeID 4 gets rank 3. Notice that there are no gaps
in the ranking.
Key Differences:
• Handling Ties (Duplicate Values):
• ROW_NUMBER(): Assigns a unique number to every row, even if the rows have the
same value.
• DENSE_RANK(): Assigns the same rank to rows with the same value and does not skip
ranks.
• Gap in Values:
• ROW_NUMBER(): Will never have gaps in the numbering, even when there are ties.
• DENSE_RANK(): Will not leave gaps between ranks if there are ties (e.g., rank 1, rank 1,
rank 2, rank 3).
• Use Cases:
• ROW_NUMBER(): Useful when you need to generate a unique, sequential identifier for
each row, regardless of ties.
• DENSE_RANK(): Useful when tied rows should share a rank and no ranks should be skipped,
such as finding the Nth highest distinct salary.
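To see the differences side by side, the sketch below runs ROW_NUMBER(), RANK(), and DENSE_RANK() over the same four sample rows used above. RANK() is included only for contrast, and the column aliases and the EmployeeID tie-breaker are choices made for this illustration.

```sql
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    Salary     INT
);
INSERT INTO Employees (EmployeeID, Salary)
VALUES (1, 50000), (2, 50000), (3, 45000), (4, 40000);

SELECT EmployeeID,
       Salary,
       -- EmployeeID is added as a tie-breaker so the numbering is deterministic
       ROW_NUMBER() OVER (ORDER BY Salary DESC, EmployeeID) AS RowNum,
       RANK()       OVER (ORDER BY Salary DESC)             AS SalaryRank,
       DENSE_RANK() OVER (ORDER BY Salary DESC)             AS SalaryDenseRank
FROM Employees
ORDER BY Salary DESC, EmployeeID;
-- EmployeeID  Salary  RowNum  SalaryRank  SalaryDenseRank
-- 1           50000   1       1           1
-- 2           50000   2       1           1
-- 3           45000   3       3           2
-- 4           40000   4       4           3
```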
Answer
SQL commands are categorized into different types based on their functionality. The main
types are:
• Q.933
Question
Difference between OLAP (Online Analytical Processing) and OLTP (Online Transaction
Processing).
Answer
OLAP and OLTP are two types of data processing systems designed for different purposes.
Here's a breakdown of the key differences between them:
Key Differences:
• Data Size:
• OLAP: Large volume of historical data.
• OLTP: Small, current transaction records.
• Data Structure:
• OLAP: Multidimensional (cubes) or denormalized.
• OLTP: Relational (normalized).
Summary:
• OLAP is designed for data analysis, where the data is read-heavy, often aggregated, and
used for decision-making. It's used in scenarios like reporting, forecasting, and trend analysis.
• OLTP is designed for real-time transactions, focusing on speed and efficiency for day-
to-day operations like order processing, inventory management, and customer transactions.
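The contrast in the summary can be illustrated with two queries against the same data. The Orders table, its columns, and the literal values below are hypothetical; in practice the analytical query would typically run against a separate warehouse copy of the data.

```sql
-- Hypothetical table for illustration
CREATE TABLE Orders (
    OrderID    INT PRIMARY KEY,
    CustomerID INT NOT NULL,
    OrderDate  DATE NOT NULL,
    Amount     DECIMAL(10, 2) NOT NULL,
    Status     VARCHAR(20) NOT NULL
);

-- OLTP-style statement: touches a single row identified by its key
UPDATE Orders
SET Status = 'SHIPPED'
WHERE OrderID = 1001;

-- OLAP-style query: scans a year of history and aggregates it
SELECT CustomerID,
       COUNT(*)    AS OrderCount,
       SUM(Amount) AS TotalAmount
FROM Orders
WHERE OrderDate >= '2024-01-01' AND OrderDate < '2025-01-01'
GROUP BY CustomerID
ORDER BY TotalAmount DESC;
```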
Answer
SQL (Structured Query Language) and NoSQL (Not Only SQL) represent two different types
of database management systems, each with its own structure, use cases, and advantages.
Here's a breakdown of the key differences between them:
Key Differences:
• Use Case:
• SQL: Suitable for structured data and complex queries (e.g., banking, ERP).
• NoSQL: Suitable for unstructured data, high-speed reads/writes, and big data (e.g.,
social media, IoT).
Summary:
• SQL databases are ideal for structured, transaction-heavy applications where data
consistency and relationships between data entities are crucial (e.g., banking systems,
inventory management).
• NoSQL databases are better suited for handling large volumes of unstructured or semi-
structured data, flexible schema requirements, and horizontal scaling (e.g., social
networks, real-time analytics, content management).
Answer
Both TEXT and VARCHAR are used to store variable-length strings in SQL, but there are
key differences in terms of storage, performance, and use cases.
VARCHAR:
• Purpose: VARCHAR is used to store variable-length character strings, meaning it only takes
up as much space as the string requires, plus some overhead for length storage.
• Characteristics:
• Storage: Uses only the amount of space required to store the string plus a small overhead
(usually 1 or 2 bytes for length information).
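• Length: You normally specify a maximum length for the column when defining it (e.g.,
VARCHAR(255)); some databases, such as PostgreSQL, also accept VARCHAR without a length.
• Performance: Often faster for short strings because values are typically stored inline with
the row; the actual difference from TEXT depends on the database (in PostgreSQL, for
example, the two perform the same).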
• Use Case: Ideal for fields where you know the maximum length of the string, like names,
email addresses, or titles.
Example:
CREATE TABLE Employees (
EmployeeID INT,
Name VARCHAR(100)
);
TEXT:
• Purpose: TEXT is used to store large amounts of text (such as long paragraphs or articles)
and can store data of variable length without a predefined size limit.
• Characteristics:
• Storage: Internally, TEXT may use a different storage mechanism (for example, keeping
large values outside the main table row), and it often has more overhead compared to VARCHAR.
• Length: The maximum length of TEXT is database-specific and usually much larger than a
typical VARCHAR column (e.g., about 64 KB for TEXT in MySQL, and up to 1 GB in
PostgreSQL).
• Performance: May not be as performant as VARCHAR for smaller strings in some databases,
because out-of-row storage can introduce additional overhead in certain operations.
• Use Case: Suitable for storing large blocks of text, such as descriptions, articles, or long-
form content.
Example:
CREATE TABLE Articles (
ArticleID INT,
Content TEXT
);
Key Differences:
• Performance:
• VARCHAR: Generally faster for short strings, especially when a length is specified.
• TEXT: May have some overhead for larger data, but optimized for large text storage.
Summary:
• VARCHAR is best used when you know the maximum size of the text and want to
optimize storage and performance.
• TEXT is used for larger blocks of text where the size can vary greatly, and there is no
predefined length constraint.
Answer
Database Transactions:
A database transaction is a sequence of one or more operations (such as insert, update,
delete, or select) that are executed as a single unit of work. These operations are executed in
a way that they either all succeed or all fail. If any operation within a transaction fails, the
entire transaction is rolled back to ensure the database remains in a consistent state.
Key Aspects of Transactions:
• Atomicity: A transaction is an atomic unit of work. This means that the entire transaction
is treated as a single operation, and either all changes are committed, or none are. If one part
of the transaction fails, the entire transaction is rolled back.
• Consistency: A transaction must transition the database from one valid state to another.
The database must remain in a valid state before and after the transaction.
• Isolation: Transactions are isolated from one another. The intermediate steps of a
transaction are not visible to other transactions. The final result is only visible once the
transaction is completed.
• Durability: Once a transaction is committed, its changes are permanent, even if there is a
system crash.
ACID Properties:
The ACID properties define the essential characteristics that ensure reliable processing of
database transactions. They are:
1. Atomicity:
• Definition: Atomicity ensures that a transaction is all or nothing. If any part of the
transaction fails, the entire transaction is aborted, and the database remains unchanged.
• Example: If a bank transfer involves withdrawing money from one account and depositing
it into another, both actions must succeed. If either the withdrawal or the deposit fails, both
operations are rolled back.
Example:
BEGIN TRANSACTION;
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
COMMIT;
2. Consistency:
• Definition: The consistency property ensures that a transaction brings the database from
one valid state to another. The database must satisfy all integrity constraints (like foreign
keys, unique constraints, etc.) before and after the transaction.
• Example: A database enforces that an account balance cannot be negative. If a transaction
tries to withdraw more than the available balance, the transaction will fail, keeping the
database in a consistent state.
3. Isolation:
• Definition: Isolation ensures that transactions are executed independently of each other.
Even though multiple transactions might be happening concurrently, each transaction will
execute as if it is the only one.
• Levels of Isolation: SQL databases provide different levels of isolation to manage
concurrency:
• Read Uncommitted: Transactions can read data that is not yet committed.
• Read Committed: A transaction can only read committed data.
• Repeatable Read: Ensures that rows a transaction has already read return the same values
if it reads them again before it completes.
• Serializable: Ensures that the outcome is the same as if the transactions had been executed
one after another, even though they may be running concurrently.
Example:
Two transactions trying to update the same record concurrently could lead to inconsistent
results, but isolation prevents this from happening by ensuring that one transaction's changes
are invisible to others until the transaction is complete.
4. Durability:
• Definition: Durability guarantees that once a transaction is committed, its effects are
permanent, even in the event of a system failure (like a power loss or crash).
• Example: Once a bank transfer transaction is committed, even if the system crashes
immediately after, the money will still have been transferred successfully when the system
recovers.
Example:
COMMIT; -- After this, the changes are permanent even if the system crashes.
Summary:
A database transaction is a sequence of operations that are executed as a single unit of
work. The ACID properties (Atomicity, Consistency, Isolation, Durability) ensure that
transactions are processed reliably, maintaining data integrity and consistency even in cases
of errors, crashes, or concurrent access.
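The consistency example above (an account balance that cannot be negative) can be sketched with a CHECK constraint. The constraint name and the sample balances are assumptions for illustration; BEGIN TRANSACTION is accepted by SQL Server, PostgreSQL, and SQLite, while MySQL uses START TRANSACTION.

```sql
CREATE TABLE Accounts (
    AccountID INT PRIMARY KEY,
    Balance   DECIMAL(12, 2) NOT NULL,
    CONSTRAINT chk_balance_non_negative CHECK (Balance >= 0)
);
INSERT INTO Accounts (AccountID, Balance) VALUES (1, 50.00), (2, 0.00);

BEGIN TRANSACTION;
-- Rejected by the database: 50.00 - 100 would leave a negative balance
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
-- The application sees the error and abandons the transfer
ROLLBACK;

-- Both balances are unchanged: 50.00 for account 1 and 0.00 for account 2
SELECT AccountID, Balance FROM Accounts ORDER BY AccountID;
```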
Answer
Data integrity constraints are essential in maintaining the accuracy, consistency, and
reliability of data in SQL databases. They ensure that only valid data is stored in the
database, enforcing business rules and preventing data anomalies. Let's look at the NOT
NULL, UNIQUE, and CHECK constraints and their importance:
1. NOT NULL Constraint:
• Purpose: The NOT NULL constraint ensures that a column cannot have a NULL value,
meaning that a value must be provided when inserting or updating records in the table.
• Importance:
• Data Completeness: Ensures that critical fields (e.g., user ID, order date) are always filled
with meaningful data.
• Prevents Missing Data: Guarantees that no rows are left with incomplete data, which
could lead to errors or inconsistencies in business logic.
• Improves Data Quality: Ensures that the application or system always has the necessary
information to operate properly.
Example:
CREATE TABLE Employees (
EmployeeID INT NOT NULL,
Name VARCHAR(100) NOT NULL,
DateOfBirth DATE NOT NULL
);
In this example, the NOT NULL constraint ensures that EmployeeID, Name, and DateOfBirth
are always provided for each employee record.
2. UNIQUE Constraint:
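• Purpose: The UNIQUE constraint ensures that all values in a column or a set of columns are
distinct. No two rows can have the same non-NULL value in the specified column(s); how
multiple NULLs are treated varies by database (most allow several, SQL Server allows only one).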
• Importance:
• Prevents Duplicate Data: Helps in preventing redundancy by ensuring that there are no
duplicate values in important fields like email addresses, usernames, or customer IDs.
• Improves Data Quality: Ensures uniqueness, which is critical for identifying records
uniquely (e.g., an email address should be unique for each user).
• Key for Referential Integrity: In standard SQL, a foreign key must reference a primary key
or a column (or set of columns) with a UNIQUE constraint, so UNIQUE columns can serve as the
target of relationships between tables.
Example:
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
Email VARCHAR(255) UNIQUE
);
Here, the UNIQUE constraint on Email ensures that no two customers can have the same email
address.
3. CHECK Constraint:
• Purpose: The CHECK constraint is used to enforce a condition on the values in a column. It
ensures that only values satisfying a specified condition are allowed in a column.
• Importance:
• Enforces Business Rules: The CHECK constraint allows you to enforce rules directly at the
database level. For example, ensuring that an employee's salary is always greater than 0 or
that a product's price is within a valid range.
• Data Validation: It provides a way to validate data before it’s inserted into the database,
preventing invalid or out-of-bound values.
• Improves Data Consistency: By enforcing rules on data, it ensures that the data complies
with predefined conditions and constraints, improving data integrity.
Example:
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
Price DECIMAL(10, 2),
CHECK (Price > 0)
);
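In this example, the CHECK constraint ensures that no product can have a price less than or
equal to 0. Because Price is nullable here, a NULL price still passes the check; add NOT NULL
if a price is mandatory.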
Summary:
• NOT NULL ensures that fields always have a value, preventing incomplete data.
• UNIQUE ensures that important fields (like email or user IDs) contain distinct values,
preventing duplicates.
• CHECK enforces specific conditions or business rules on data, ensuring that only valid
data is entered into the database.
Together, these constraints help maintain high-quality, reliable, and consistent data, which is
essential for the smooth functioning of applications, reporting, and business processes.
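A short sketch of how the three constraints behave together on insert. The Users table and its rows are hypothetical; the exact error messages differ by database, and MySQL enforces CHECK constraints only from version 8.0.16.

```sql
-- Hypothetical table combining all three constraints
CREATE TABLE Users (
    UserID INT PRIMARY KEY,
    Email  VARCHAR(255) NOT NULL UNIQUE,
    Age    INT CHECK (Age >= 18)
);

-- Succeeds: all constraints are satisfied
INSERT INTO Users (UserID, Email, Age) VALUES (1, 'a@example.com', 30);

-- Rejected: Email is NOT NULL
INSERT INTO Users (UserID, Email, Age) VALUES (2, NULL, 25);

-- Rejected: Email must be UNIQUE and this address already exists
INSERT INTO Users (UserID, Email, Age) VALUES (3, 'a@example.com', 40);

-- Rejected: the CHECK condition Age >= 18 is false
INSERT INTO Users (UserID, Email, Age) VALUES (4, 'b@example.com', 15);

-- Succeeds: for a NULL Age the CHECK condition is unknown, which is not a violation
INSERT INTO Users (UserID, Email, Age) VALUES (5, 'c@example.com', NULL);
```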
Answer
A stored procedure is a precompiled collection of SQL statements and logic stored in the
database, which can be executed with a single call. They are commonly used to encapsulate
business logic, manage repetitive tasks, and improve performance. Like any tool, stored
procedures come with both advantages and disadvantages.
make them DBMS-dependent. This can cause issues if you need to switch databases or use
multiple database systems.
• Portability Issues: Code written for one database system (e.g., SQL Server) may not be
compatible with another system (e.g., MySQL or PostgreSQL), making it harder to migrate to
a different platform.
• Performance Overhead:
• Complexity in Execution: While stored procedures are precompiled, if they contain
complex logic or many operations, they can still introduce performance overhead. Misusing
them (e.g., for tasks better handled by application code) can lead to slow performance.
• Resource Consumption: Long-running or resource-heavy stored procedures can place
strain on the database server, affecting the performance of other queries and operations.
• Debugging and Testing Challenges:
• Difficult to Debug: Debugging stored procedures can be more difficult compared to
regular application code, as many database systems do not offer robust debugging tools or a
simple way to step through stored procedure execution.
• Limited Testing: Stored procedures are often tested in isolation, but in real-world
applications, they interact with other components, which can make testing and integration
more challenging.
• Version Control and Change Management:
• Harder to Integrate into CI/CD Pipelines: Managing stored procedure versions and
integrating them into modern CI/CD (Continuous Integration/Continuous Deployment)
pipelines can be more complex compared to application code. Manual intervention is often
required to deploy changes to stored procedures.
• Lack of Good Version Control: Unlike regular application code, versioning and
maintaining changes to stored procedures can be harder without proper tools or processes,
especially in large systems.
• Vendor Lock-In:
• Proprietary SQL Extensions: Each DBMS has its own proprietary extensions for stored
procedures, making it harder to port your code to another database. For example, a stored
procedure written in T-SQL for SQL Server might not work in MySQL or PostgreSQL due to
differences in the procedural languages used by each database.
• Dependency on Database Vendors: Since stored procedures are highly dependent on the
underlying database system, this could lead to lock-in with a particular vendor (e.g., Oracle,
Microsoft SQL Server), which might limit flexibility in choosing or switching databases.
• Difficulty in Scaling:
• Centralized Logic: Since stored procedures execute on the database server, they can
become a bottleneck if not properly optimized, especially in high-traffic applications. This
can hinder scalability, particularly in distributed systems where logic might need to be
decentralized.
• Limited Parallelism: Complex stored procedures might not take full advantage of
database optimizations for parallelism, limiting performance when handling very large
volumes of data.
Summary:
Advantages:
• Improved Performance: Precompiled, reducing query compilation time.
• Code Reusability: Centralized business logic.
Answer
In SQL, COMMIT and ROLLBACK are crucial statements that control the end of a
transaction and determine whether the changes made during the transaction should be saved
or discarded. These statements ensure that a series of operations can be treated as a single
unit of work, helping maintain data integrity and consistency. Here’s a deeper dive into their
roles:
COMMIT Statement:
• Purpose: The COMMIT statement is used to permanently save all the changes made during a
transaction to the database. Once a transaction is committed, the changes become visible to
other transactions, and they are permanent, even in the case of a system crash.
• When is it used?:
• You use COMMIT after performing a set of operations (like INSERT, UPDATE, or DELETE) that
you want to make permanent.
• Typically, COMMIT is used at the end of a transaction when you are sure the transaction has
been completed successfully.
• Importance:
• Data Persistence: Ensures that all changes to the database are saved and made permanent.
• Concurrency: Makes changes visible to other users or transactions. Once committed, other
transactions can read or modify the data.
For example, after transferring money from one account to another, the COMMIT statement
ensures that both the withdrawal and the deposit are permanently saved.
ROLLBACK Statement:
• Purpose: The ROLLBACK statement is used to undo all changes made during a transaction.
If an error occurs or you decide that you do not want to save the changes, you can use
ROLLBACK to revert the database to its state before the transaction began.
• When is it used?:
• You use ROLLBACK when you want to discard the changes made during a transaction. This
is typically done if an error occurs, or if the business logic or validation fails during the
transaction.
• You can also use ROLLBACK if you determine that a transaction was unnecessary or
incorrect.
• Importance:
• Error Handling: Allows you to cancel a transaction and maintain the consistency and
integrity of the database in the case of an error or failure.
• Atomicity: Guarantees that if part of a transaction fails, all changes made during the
transaction are rolled back, and the database remains in a valid state.
• Data Integrity: Ensures that partial changes to data are not left in an inconsistent or
corrupt state.
• Example:
BEGIN TRANSACTION;
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
-- If an error occurs, roll back the transaction
ROLLBACK; -- Undo all changes
In this example, if there’s an issue (e.g., insufficient funds, database error), ROLLBACK ensures
that neither account is affected, and the database returns to its state before the transaction
started.
Real-Life Scenario:
Imagine you're transferring money between two bank accounts:
• Step 1: You withdraw $100 from Account A.
• Step 2: You deposit $100 into Account B.
If the deposit to Account B fails for any reason (e.g., database crash or validation issue), you
don’t want Account A to have been debited. So, you ROLLBACK the entire transaction,
ensuring that neither account is modified. However, if both operations succeed, you would
issue a COMMIT, making the transaction permanent.
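BEGIN TRANSACTION;
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
-- If both updates succeed, make the transfer permanent
COMMIT;
-- If either update fails, issue ROLLBACK instead of COMMIT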
Summary:
• COMMIT makes all changes within a transaction permanent and visible to others.
• ROLLBACK undoes all changes made during a transaction, ensuring that no partial or
inconsistent data is saved.
These two statements are essential for maintaining data integrity, consistency, and ensuring
that transactions are either fully completed or fully discarded, in line with the ACID
properties of database transactions.
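In practice, the choice between COMMIT and ROLLBACK is usually made by error-handling code rather than by hand. The following is a minimal sketch in SQL Server's T-SQL, which is an assumption about the target database; other systems express the same idea with their own procedural syntax or in application code. It reuses the Accounts table from the examples above.

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
    UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;

    -- Reached only if both updates succeeded
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo any partial work if a transaction is still open
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    -- Re-raise the original error so the caller knows the transfer failed
    THROW;
END CATCH;
```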
• Amazon
• Microsoft
• Oracle
• Accenture
• IBM
• Q.940
Question
Explain the purpose of the LIKE operator in SQL and provide examples of its usage.
Answer
The LIKE operator in SQL is used to search for a specified pattern within a column's
values. It is primarily used in the WHERE clause to filter results based on pattern matching,
especially when you need to search for values that match a particular sequence of characters.
4. Combining % and _:
• To find names where the first character is "J", the second character is any single
character, the third character is "n", and any number of characters may follow:
SELECT * FROM Employees
WHERE Name LIKE 'J_n%';
• Explanation: The J_n% pattern matches names that start with "J", have any single character
in the second position (represented by _), have "n" in the third position, and can be followed
by any characters. It would match "Jan", "Jonny", "Jenny", etc., but not "Jean", whose third
character is "a".
5. Case Sensitivity:
• Whether LIKE is case-sensitive depends on the database and the column's collation: it is
case-insensitive with the default collations in MySQL and SQL Server, but case-sensitive in
PostgreSQL.
• To perform a case-insensitive search, some systems offer operators like ILIKE (in
PostgreSQL), or you can normalize case with the LOWER()/UPPER() functions:
SELECT * FROM Employees
WHERE LOWER(Name) LIKE 'john%';
Advantages:
• Provides flexible and powerful pattern matching in SQL queries.
• Helps in filtering data based on incomplete or approximate text.
Disadvantages:
• Can be slower on large datasets because it requires pattern matching.
• Wildcard searches starting with % (e.g., %John) may not use indexes effectively, leading to
slower performance.
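The wildcard behavior described above can be checked against a handful of rows. The sample names below are assumptions chosen so that each pattern gives the same result whether or not the database treats LIKE as case-sensitive.

```sql
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    Name       VARCHAR(100)
);
INSERT INTO Employees (EmployeeID, Name)
VALUES (1, 'Jan'), (2, 'Jonny'), (3, 'Jean'), (4, 'John'), (5, 'Anna');

-- Starts with J: Jan, Jonny, Jean, John
SELECT Name FROM Employees WHERE Name LIKE 'J%';

-- J, any single character, then n: Jan, Jonny
SELECT Name FROM Employees WHERE Name LIKE 'J_n%';

-- Ends with n: Jan, Jean, John
SELECT Name FROM Employees WHERE Name LIKE '%n';

-- Second character is o: Jonny, John
SELECT Name FROM Employees WHERE Name LIKE '_o%';

-- Case-insensitive prefix match in any database: Jonny, John
SELECT Name FROM Employees WHERE LOWER(Name) LIKE 'jo%';
```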
• Q.941
Question
Explain the concept of data warehousing and how it differs from traditional relational
databases.
Answer
Data Warehousing:
A data warehouse is a specialized type of database designed to support business
intelligence (BI) activities, such as reporting, data analysis, and decision-making. It is an
integrated, subject-oriented, time-variant, and non-volatile collection of data that helps
organizations make strategic business decisions.
Feature | Data Warehousing | Traditional Relational Databases (OLTP)
• Performance:
• Data Warehouses: Optimized for read-heavy workloads and complex queries, with larger
data volumes. Queries are often aggregations or multi-table joins that analyze trends or past
performance.
• Relational Databases: Optimized for write-heavy workloads and transactional integrity.
The focus is on speed and accuracy for daily business operations.
• Data Refresh:
• Data Warehouses: Data is updated at regular intervals (e.g., daily, weekly) via batch
processing. It’s not real-time data, but a historical snapshot that enables trend analysis.
• Relational Databases: Data is updated in real-time with every transaction (e.g., when a
user places an order or updates their profile).
Summary:
• Data warehousing is designed for the storage and analysis of large volumes of
historical data, often from multiple sources, and is optimized for read-heavy, complex
queries and business intelligence tasks.
• Traditional relational databases are used for real-time transactional operations,
focusing on consistency, speed, and integrity of daily business processes.
Data warehousing helps organizations extract meaningful insights from vast amounts of
data, while traditional relational databases support the core transactional activities that keep
a business running.
Answer
Summary:
Database Triggers offer powerful ways to automate tasks like enforcing business rules,
logging changes, maintaining data integrity, and performing calculations. They are
particularly useful for ensuring data consistency and automating repetitive tasks directly in
the database layer. However, they can also add complexity and performance overhead, so
they should be used judiciously.
Discuss the concept of database concurrency control and how it is achieved in SQL
databases.
Answer
1. Locking Mechanisms:
Locks are used to prevent other transactions from accessing data in conflicting ways while
one transaction is performing operations.
• Shared Locks (Read Locks): A transaction can acquire a shared lock on a piece of data if
it intends to read the data. Multiple transactions can acquire shared locks on the same data
simultaneously.
Example: If Transaction 1 reads a record, other transactions can also read the same record,
but none can modify it until the lock is released.
• Exclusive Locks (Write Locks): A transaction acquiring an exclusive lock on data ensures
that no other transaction can modify that data (or, in purely lock-based systems, read it)
until the lock is released. This is used for operations that modify data.
Example: If Transaction 1 is updating a record, no other transaction can update that record
until Transaction 1 is finished; in databases that use multiversion concurrency control
(MVCC), readers see the last committed version instead of waiting.
• Deadlock: A situation where two or more transactions are waiting for each other to release
locks, causing a cycle. Most databases detect deadlocks and automatically terminate one of
the transactions to break the deadlock.
2. Isolation Levels:
The isolation level of a transaction determines how much it is exposed to changes made by
other concurrent transactions. SQL databases support several isolation levels, which control
the degree of locking and the kinds of concurrency phenomena (dirty reads, non-repeatable
reads, phantom reads) allowed.
Common isolation levels include:
• Read Uncommitted:
• Allows: Dirty reads, non-repeatable reads, and phantom reads.
• Explanation: Transactions can read data that is not committed yet (i.e., changes that might
later be rolled back). This level provides the least protection and the highest concurrency but
can lead to inconsistent results.
• Read Committed:
• Allows: Non-repeatable reads and phantom reads, but not dirty reads.
• Explanation: Transactions can only read committed data. However, if a transaction reads a
row and another transaction then updates it and commits, the first transaction sees a
different value when it reads the same row again.
• Repeatable Read:
• Allows: Phantom reads, but prevents dirty reads and non-repeatable reads.
• Explanation: Once a transaction reads a row, it will see the same data if it reads that row
again, even if another transaction tries to modify it in the meantime. However, phantom
reads may still occur if other transactions insert or delete rows that match the query's
condition.
• Serializable:
• Prevents: Dirty reads, non-repeatable reads, and phantom reads.
• Explanation: This is the highest level of isolation. Transactions may run concurrently, but
the result is guaranteed to be the same as if they had executed serially (one at a time). This
level can significantly impact performance due to reduced concurrency.
Each database engine may have slight variations in how it implements isolation levels, but
these four are the most common.
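As a minimal sketch of how an isolation level is selected, the statements below use SQL Server syntax, where the setting applies to the current session. This is an assumption about the target database: PostgreSQL expects SET TRANSACTION inside the transaction, and MySQL applies it to the next transaction and uses START TRANSACTION. The Accounts table is the one used in the earlier examples.

```sql
-- Choose the isolation level before starting the transaction (SQL Server)
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRANSACTION;

-- First read of the row
SELECT Balance FROM Accounts WHERE AccountID = 1;

-- ... other statements in the same transaction ...

-- Second read: under REPEATABLE READ this returns the same value as the first read,
-- even if another session tried to change the row in between
SELECT Balance FROM Accounts WHERE AccountID = 1;

COMMIT;
```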
• Common Use: This is useful when reading is far more common than writing, and conflicts
are expected to be infrequent.
Example: A banking system where multiple users might view account balances
simultaneously, but updates (such as deposits) are relatively infrequent.
Example Scenario:
Imagine two users, User A and User B, trying to transfer money from the same bank
account.
• Without Concurrency Control: If both users attempt to withdraw money from the
account simultaneously, they might both see the same balance and each withdraw more than
what is available. This leads to overdrafts.
• With Concurrency Control: Using locks or serializable isolation, one of the users would
be forced to wait until the other finishes its transaction. This ensures that the balance is
consistent and prevents issues like overdrafts or inconsistent results.
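The overdraft scenario can be sketched with Python's built-in sqlite3 module. This is only one possible approach, not the only form of concurrency control: the balance check and the update are combined into a single guarded UPDATE, so a second withdrawal cannot act on a stale balance. The accounts table, its columns, and the withdraw helper are assumptions made up for this illustration.

```python
import sqlite3

def withdraw(conn, account_id, amount):
    """Withdraw only if the balance covers the amount; return True on success."""
    # The check (balance >= amount) and the update run as one statement,
    # so two withdrawals cannot both pass the check on the same old balance.
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE account_id = ? AND balance >= ?",
            (amount, account_id, amount),
        )
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()

first = withdraw(conn, 1, 80)   # User A withdraws 80 from a balance of 100
second = withdraw(conn, 1, 80)  # User B tries the same amount afterwards
balance = conn.execute(
    "SELECT balance FROM accounts WHERE account_id = 1"
).fetchone()[0]
```

Here the first withdrawal succeeds and the second is refused, leaving a balance of 20 instead of an overdraft.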
Conclusion:
Database concurrency control is essential for ensuring that multiple transactions can occur
simultaneously without compromising the consistency and integrity of the data. It is achieved
through:
• Locking mechanisms (shared and exclusive locks),
• Isolation levels (Read Uncommitted, Read Committed, Repeatable Read, Serializable),
• Optimistic and pessimistic concurrency control, each providing different trade-offs
between data consistency and system performance.
By properly managing concurrency, databases can ensure that they handle multi-user
environments efficiently while maintaining data correctness.
• Accenture
• IBM
• Q.944
Question
Explain the role of the SELECT INTO statement in SQL and provide examples of its usage.
Answer
Syntax:
SELECT column1, column2, ...
INTO new_table
FROM existing_table
WHERE condition;
• column1, column2, ...: The columns you want to select from the existing table.
• new_table: The name of the new table that will be created.
• existing_table: The name of the table from which data is selected.
• condition: An optional condition to filter the rows you want to insert into the new table.
Key Points:
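• Creates a new table: The SELECT INTO statement creates a new table with the columns
and data types inferred from the source table(s). This table-creating form is SQL Server
syntax (PostgreSQL also accepts it); MySQL and Oracle use CREATE TABLE ... AS SELECT
instead.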
• Data insertion: The data retrieved by the SELECT query is inserted into the newly created
table.
• No pre-existing table: The target table (new_table) is created automatically if it does not
already exist. If the table exists, you cannot use SELECT INTO; instead, you would need to use
INSERT INTO.
• Result: A new table EmployeeCopy will be created with the same structure (columns and
data types) as the Employees table, and all the rows from Employees will be copied into
EmployeeCopy.
Important Considerations:
• Table Structure:
• The structure (columns and data types) of the new table is automatically derived from the
result of the SELECT statement.
• You cannot specify column data types in the SELECT INTO statement; they are inferred
from the source columns.
• No Constraints or Indexes:
• The new table created by SELECT INTO will not have any indexes, primary keys, or foreign
key constraints from the original table. You would need to manually add these after the table
is created if necessary.
• Cannot Use SELECT INTO with an Existing Table:
• If the target table (new_table) already exists, you cannot use SELECT INTO. Instead, you
would use INSERT INTO.
• Performance Considerations:
• SELECT INTO can be resource-intensive, especially when copying large datasets, because it
involves both creating a new table and inserting data into it.
Summary:
The SELECT INTO statement in SQL is a powerful tool to create a new table and insert data
into it based on the results of a query. It is often used for creating temporary tables, backups,
or summary reports. However, it creates tables without any constraints or indexes, so they
must be added separately if needed.
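The point about missing constraints and indexes can be checked with a small sketch. SQLite has no SELECT INTO, so this example swaps in the equivalent CREATE TABLE ... AS SELECT; the Employees columns and sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Email      TEXT UNIQUE
    );
    INSERT INTO Employees VALUES (1, 'Asha', 'asha@example.com'),
                                 (2, 'Ben',  'ben@example.com');
    -- SQLite counterpart of: SELECT * INTO EmployeeCopy FROM Employees;
    CREATE TABLE EmployeeCopy AS SELECT * FROM Employees;
""")

copied_rows = conn.execute("SELECT COUNT(*) FROM EmployeeCopy").fetchone()[0]

# The UNIQUE constraint on Email gives the source table an index; the copy has none.
source_indexes = conn.execute("PRAGMA index_list('Employees')").fetchall()
copy_indexes = conn.execute("PRAGMA index_list('EmployeeCopy')").fetchall()

# Column 5 of table_info is the primary-key flag (0 means "not part of the key").
copy_pk_flags = [row[5] for row in conn.execute("PRAGMA table_info('EmployeeCopy')")]
```

The rows are copied, but the copy carries neither the primary key nor the unique index, which matches the behaviour described above.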
Answer
SQL (Structured Query Language) is the standard language used to manage
and manipulate relational databases. It allows you to query, insert, update, and delete data
stored in relational database management systems (RDBMS).
Answer
• SQL (Structured Query Language):
• Definition: A standardized language used to query, manipulate, and manage data in
relational databases.
• Purpose: Used for writing queries to interact with databases (e.g., SELECT, INSERT,
UPDATE).
• Independence: SQL is not a database system; it is a language used across different
databases.
• MySQL:
• Definition: A relational database management system (RDBMS) that uses SQL as its
query language.
• Purpose: MySQL stores, manages, and organizes data in tables. It implements SQL to
perform operations like data retrieval, manipulation, etc.
• Independence: MySQL is a software/database system, not a language.
Answer
The primary components of a SQL query are:
• SELECT: Specifies the columns to retrieve.
• FROM: Identifies the table from which to fetch data.
• WHERE: Filters rows by a condition.
• ORDER BY: Sorts the result set.
Example:
SELECT Name, Age
FROM Employees
WHERE Age > 30
ORDER BY Age DESC;
Answer
• Table: A collection of related data organized into rows and columns. It is the basic unit of
data storage in a database.
• Example: Employees table stores employee-related data.
• Row: A single, horizontal record in a table, representing one data entry.
• Example: A row in the Employees table represents one employee.
• Column: A vertical set of values in a table, representing a specific attribute of data.
• Example: Name, Age, and Salary are columns in the Employees table.
Answer
In SQL, you can comment out lines using two types of comment syntax:
• Single-line comment: Use -- to comment a single line.
-- This is a single-line comment
SELECT * FROM Employees;
• Multi-line comment: Use /* to begin and */ to end a comment block.
/* This is a
multi-line comment */
SELECT * FROM Employees;
Answer
The SELECT statement is used to retrieve data from one or more tables in a database. It
allows you to specify which columns to return, and you can filter, sort, and group the data as
needed.
Basic Syntax:
SELECT column1, column2, ...
FROM table_name
WHERE condition;
• Purpose: Extracts data for analysis, reporting, or further processing.
Answer
To retrieve all columns from a table, use the * wildcard with the SELECT statement:
SELECT *
FROM table_name;
• Purpose: Fetches all columns and their data from the specified table.
Answer
To eliminate duplicate records, use the DISTINCT keyword in your SELECT query:
SELECT DISTINCT column1, column2, ...
FROM table_name;
• Purpose: Returns only unique rows for the specified columns.
Answer
• GROUP BY:
• Purpose: Groups rows that have the same values in specified columns, often used with
aggregate functions (e.g., COUNT(), SUM()).
• Example:
SELECT department, COUNT(*)
FROM employees
GROUP BY department;
• ORDER BY:
• Purpose: Sorts the result set based on one or more columns, in ascending (ASC) or
descending (DESC) order.
• Example:
SELECT * FROM employees
ORDER BY salary DESC;
• Netflix
• Tesla
• Amazon
• Facebook
• Salesforce
• Q.954
Question
How do you limit the number of records returned by a SQL query?
Answer
To limit the number of records, use the LIMIT clause (in MySQL, PostgreSQL, SQLite) or
TOP keyword (in SQL Server):
• MySQL/PostgreSQL/SQLite:
SELECT *
FROM table_name
LIMIT 10;
• SQL Server:
SELECT TOP 10 *
FROM table_name;
• Purpose: Restricts the result set to a specified number of rows.
Answer
In SQL, you can perform arithmetic operations using standard operators:
• Addition (+): Adds two values.
SELECT price + tax AS total_price FROM products;
• Subtraction (-): Subtracts one value from another.
SELECT salary - deductions AS net_salary FROM employees;
• Multiplication (*): Multiplies two values.
SELECT quantity * unit_price AS total_cost FROM orders;
• Division (/): Divides one value by another.
SELECT total_amount / quantity AS unit_price FROM sales;
• Modulus (%): Returns the remainder of a division.
SELECT salary % 2 AS remainder FROM employees;
• Nike
• Apple
• Facebook
• Google
• Amazon
• Q.956
Question
Explain the purpose of the IN operator in SQL.
Answer
The IN operator is used to check if a value matches any value in a list or subquery. It
simplifies multiple OR conditions.
Syntax:
SELECT *
FROM table_name
WHERE column_name IN (value1, value2, value3, ...);
Example:
SELECT *
FROM employees
WHERE department IN ('Sales', 'Marketing', 'HR');
• Purpose: Matches a column’s value against a set of possible values or a subquery.
Answer
• EXISTS:
• Purpose: Checks if a subquery returns any rows. It returns TRUE if the subquery has at least
one row, otherwise FALSE.
• Usage: Typically used with correlated subqueries.
• Example:
SELECT *
FROM employees e
WHERE EXISTS (
SELECT 1 FROM departments d WHERE d.manager_id = e.employee_id
);
• IN:
• Purpose: Checks if a value matches any value in a list or subquery.
Key Differences:
• EXISTS checks for the presence of rows, while IN checks for a match to a list of values.
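• EXISTS can stop as soon as the subquery finds one matching row, which suits correlated
subqueries; IN is convenient for comparing against fixed sets of values. Many optimizers
generate the same plan for both, so measure before assuming one is faster.
• With NOT, the two differ when the subquery returns a NULL: NOT IN then matches no rows,
while NOT EXISTS still returns the rows without a match.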
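A small sqlite3 sketch can make the comparison concrete. The table contents are assumptions invented for this example; the interesting case is a department whose manager_id is NULL, which changes how NOT IN behaves compared with NOT EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, manager_id INTEGER);
    INSERT INTO employees VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    -- Department 20 has no manager yet, so its manager_id is NULL.
    INSERT INTO departments VALUES (10, 1), (20, NULL);
""")

def names(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

managers_in = names("""
    SELECT name FROM employees
    WHERE employee_id IN (SELECT manager_id FROM departments)
    ORDER BY name""")

managers_exists = names("""
    SELECT name FROM employees e
    WHERE EXISTS (SELECT 1 FROM departments d WHERE d.manager_id = e.employee_id)
    ORDER BY name""")

# A comparison with NULL is never TRUE, so NOT IN filters out every row here.
non_managers_not_in = names("""
    SELECT name FROM employees
    WHERE employee_id NOT IN (SELECT manager_id FROM departments)
    ORDER BY name""")

non_managers_not_exists = names("""
    SELECT name FROM employees e
    WHERE NOT EXISTS (SELECT 1 FROM departments d WHERE d.manager_id = e.employee_id)
    ORDER BY name""")
```

IN and EXISTS agree on who is a manager, but only NOT EXISTS lists the non-managers once a NULL appears in the subquery result.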
Answer
Aggregate functions perform a calculation on a set of values and return a single result. They
are commonly used with GROUP BY to summarize data.
Answer
A self-join is a join where a table is joined with itself. It’s useful for comparing rows within
the same table.
Syntax:
SELECT A.column1, B.column2
FROM table_name A, table_name B
WHERE A.common_column = B.common_column;
Example:
Suppose you have an Employees table with EmployeeID, ManagerID, and Name columns, and
you want to find each employee's manager.
SELECT E.Name AS Employee, M.Name AS Manager
FROM Employees E
LEFT JOIN Employees M ON E.ManagerID = M.EmployeeID;
• Purpose: This query joins the Employees table with itself, comparing each employee's
ManagerID with the EmployeeID of other employees.
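The self-join above can be run end to end with sqlite3. The four sample employees are assumptions for illustration; the query is the same LEFT JOIN shown in the example, with an ORDER BY added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        Name       TEXT,
        ManagerID  INTEGER
    );
    -- Dana is at the top of the hierarchy, so her ManagerID is NULL.
    INSERT INTO Employees VALUES
        (1, 'Dana', NULL), (2, 'Eli', 1), (3, 'Fay', 1), (4, 'Gus', 2);
""")

rows = conn.execute("""
    SELECT E.Name AS Employee, M.Name AS Manager
    FROM Employees E
    LEFT JOIN Employees M ON E.ManagerID = M.EmployeeID
    ORDER BY E.EmployeeID
""").fetchall()
```

Because it is a LEFT JOIN, Dana still appears in the result, with no manager, rather than being dropped.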
Answer
• Table:
• Definition: A table is a physical storage structure that holds data in rows and columns.
• Purpose: Stores actual data in a database.
• Example: employees, orders.
• View:
• Definition: A view is a virtual table created by a query that retrieves data from one or more
tables.
• Purpose: Provides a simplified or customized way to view data without storing it
physically.
• Example: CREATE VIEW EmployeeDetails AS SELECT Name, Age FROM employees
WHERE Age > 30;
Key Differences:
• Storage: Tables store data; views do not store data, only the query definition.
• Usage: Tables are used for permanent data storage, while views are used for convenient
data access.
Answer
To perform a case-insensitive search, you can use:
• UPPER() or LOWER() functions:
• Convert both the column and search term to the same case.
SELECT *
FROM employees
WHERE UPPER(name) = UPPER('john');
• COLLATE clause (in some databases like MySQL and SQL Server; collation names are
engine-specific, and the one below is a SQL Server collation):
• Use a case-insensitive collation.
SELECT *
FROM employees
WHERE name COLLATE Latin1_General_CI_AS = 'john';
Key Points:
• UPPER() / LOWER(): Changes both the column and search term to the same case for
comparison.
• COLLATE: Specifies case-insensitive collation for the query.
• Q.962
Question
Explain the purpose of the EXPLAIN statement in SQL.
Answer
The EXPLAIN statement is used to display the execution plan of a query. It provides insights
into how the SQL query will be executed by the database engine, showing details like:
• Which indexes are used
• The order in which tables are accessed
• The type of join used
• Estimated row counts and cost
Example:
EXPLAIN SELECT * FROM employees WHERE department = 'Sales';
• Purpose: Helps optimize queries by understanding how the database processes them,
allowing for better performance tuning.
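As a rough illustration, SQLite exposes its plan through EXPLAIN QUERY PLAN, and the effect of adding an index can be observed directly. The index name and the plan helper are assumptions for this sketch, and the exact wording of the plan text varies between SQLite versions and between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
query = "SELECT * FROM employees WHERE department = 'Sales'"

def plan(connection, sql):
    """Return the human-readable detail text of each step in the query plan."""
    # Each plan row has four columns; the last one holds the description.
    return [row[3] for row in connection.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(conn, query)  # no usable index yet: a full table scan
conn.execute("CREATE INDEX idx_employees_department ON employees (department)")
plan_after = plan(conn, query)   # the planner can now search via the index
```

Comparing plan_before and plan_after shows the plan switching from a scan of the table to a search that names the new index.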
• Q.963
Question
How do you perform data migration in SQL?
Answer
Data migration in SQL involves transferring data from one database or table to another.
Common methods include:
• Using INSERT INTO with SELECT:
• Move data from one table to another within the same database or across databases.
INSERT INTO new_table (column1, column2)
SELECT column1, column2
FROM old_table;
• Export/Import:
• Export data from the source database to a file (CSV, SQL dump) and import it into the
destination database.
• Tools like mysqldump (MySQL) or pg_dump (PostgreSQL) are used for this purpose.
• ETL (Extract, Transform, Load):
• Use an ETL tool or SQL scripts to extract data from the source, transform it as needed, and
load it into the target system.
• Purpose: Helps in transferring, updating, or restructuring data when moving between
different environments or systems.
• Q.964
Question
What is a Database?
Answer
A database is a structured collection of data stored and managed electronically. It allows for
efficient storage, retrieval, modification, and management of data. Databases are used to store
data in a way that is easily accessible, searchable, and manageable.
Key Characteristics:
• Tables: Organized into rows (records) and columns (attributes).
• Queries: Data is accessed and manipulated using query languages like SQL.
• Management: Managed by a DBMS (Database Management System) like MySQL,
PostgreSQL, or Oracle.
• Purpose: Used to store and manage large volumes of data, ensuring consistency, security,
and easy access.
• Q.965
Question
What is DBMS?
Answer
A DBMS (Database Management System) is software that enables users to create, manage,
and interact with databases. It provides an interface between users and the database, handling
data storage, retrieval, and manipulation while ensuring data integrity, security, and
concurrency control.
Examples of DBMS:
• Relational DBMS (RDBMS): MySQL, PostgreSQL, Oracle
• Non-relational DBMS: MongoDB, Cassandra
• Purpose: Helps organize and manage data efficiently, enabling seamless data access and
manipulation.
• Q.966
Question
What is RDBMS? How is it different from DBMS?
Answer
• RDBMS (Relational Database Management System):
• An RDBMS is a type of DBMS that stores data in tables (relations) and uses SQL for data
manipulation. It supports relationships between tables and ensures data integrity through
keys (primary, foreign).
• Examples: MySQL, PostgreSQL, Oracle, SQL Server.
• DBMS (Database Management System):
• A DBMS is a broader system that manages databases in general, but it doesn't necessarily
enforce relational data structures or relationships. It could store data in various formats
(hierarchical, network, object-based, etc.).
• Examples: File systems, Hierarchical DBMS like IBM's IMS.
Key Differences:
• Data Structure:
• RDBMS: Stores data in tables with rows and columns.
• DBMS: Can store data in various formats (e.g., hierarchical, network).
• Relationships:
• RDBMS: Supports relationships between tables using keys.
• DBMS: May not support relational data models.
• ACID Properties:
• RDBMS: Supports full ACID compliance (Atomicity, Consistency, Isolation, Durability).
• DBMS: May not fully support ACID properties.
• Q.967
Question
What is a Foreign Key?
Answer
A Foreign Key is a column (or a set of columns) in a table that establishes a link between the
data in two tables. It refers to the primary key (or another unique key) of the referenced
table, enforcing referential integrity between the two tables.
Key Points:
• Purpose: Ensures that values in the foreign key column(s) correspond to valid entries in
the referenced table.
• Integrity: Prevents orphaned records by enforcing valid relationships.
Example:
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
);
• Purpose: The CustomerID in the Orders table is a foreign key that links to the
CustomerID in the Customers table.
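The following sketch shows referential integrity being enforced, using sqlite3. Two things here are assumptions of the example rather than part of the text above: the try_insert_order helper, and the PRAGMA line, which is needed because SQLite only enforces foreign keys when that setting is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY);
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER,
        FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
    );
    INSERT INTO Customers VALUES (1);
    INSERT INTO Orders VALUES (100, 1);
""")

def try_insert_order(order_id, customer_id):
    """Insert an order and report whether the database accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO Orders VALUES (?, ?)", (order_id, customer_id))
        return "inserted"
    except sqlite3.IntegrityError:
        return "rejected"

orphan_result = try_insert_order(101, 999)  # customer 999 does not exist
valid_result = try_insert_order(102, 1)     # customer 1 exists
order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
```

The order pointing at a non-existent customer is rejected, so no orphaned record is created.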
• Q.968
Question
Difference between Unique Key and Candidate Key.
Answer
• Unique Key:
• Purpose: Ensures that all values in a column (or a set of columns) are distinct across rows.
• Characteristics: A table can have multiple unique keys. It allows NULL values; SQL Server
permits only one NULL in a uniquely constrained column, while MySQL, PostgreSQL, Oracle,
and SQLite permit several.
• Example:
CREATE TABLE employees (
EmployeeID INT UNIQUE,
Email VARCHAR(255) UNIQUE
);
• Candidate Key:
• Purpose: A set of one or more columns that can uniquely identify a row in a table. It is a
potential candidate for being chosen as the Primary Key.
• Characteristics: A table can have multiple candidate keys, but only one is selected as the
Primary Key. Candidate keys cannot have NULL values.
• Example:
CREATE TABLE students (
StudentID INT PRIMARY KEY,
RollNo INT NOT NULL UNIQUE,
Email VARCHAR(255) NOT NULL UNIQUE
);
• Here, StudentID, RollNo, and Email are candidate keys, but StudentID is chosen as the
primary key.
Key Differences:
• Uniqueness: Both enforce uniqueness, but candidate keys are potential primary keys,
while unique keys are constraints to ensure no duplicate values.
• NULL values: Unique keys can allow NULLs (one or several, depending on the engine), while candidate keys cannot.
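How a unique key treats NULL depends on the engine, so it is worth testing on the system you use. The sketch below checks SQLite only, which accepts more than one NULL in a UNIQUE column while still rejecting duplicate non-NULL values; the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (EmployeeID INTEGER PRIMARY KEY, Email TEXT UNIQUE)"
)
conn.execute("INSERT INTO employees VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO employees VALUES (2, NULL)")
conn.execute("INSERT INTO employees VALUES (3, NULL)")  # a second NULL is accepted

try:
    # A repeated non-NULL value violates the UNIQUE constraint.
    conn.execute("INSERT INTO employees VALUES (4, 'a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_count = conn.execute(
    "SELECT COUNT(*) FROM employees WHERE Email IS NULL"
).fetchone()[0]
```

Running the same inserts on SQL Server would fail at the second NULL, which is why interview answers should name the engine they refer to.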
• Q.969
Question
Difference between Composite Key and Super Key.
Answer
• Composite Key:
• Definition: A composite key is a primary key that consists of two or more columns
combined to uniquely identify a record in a table.
• Purpose: Used when a single column is not sufficient to uniquely identify a row.
• Example:
Here, (student_id, course_id) together form a composite key.
CREATE TABLE enrollment (
student_id INT,
course_id INT,
PRIMARY KEY (student_id, course_id)
);
• Super Key:
• Definition: A super key is any combination of columns that uniquely identifies a row in a
table. It can include additional columns that are not necessary for uniqueness.
• Purpose: A super key can contain extra attributes, making it a broader concept than a
candidate key.
• Example: In the enrollment table, (student_id, course_id) is a super key. If the table also
had a name column, (student_id, course_id, name) would be a super key too, but not a minimal one.
Key Differences:
• Composition: A composite key is a specific type of key formed by multiple columns,
whereas a super key can be any combination of columns (including extra, unnecessary ones).
• Uniqueness: All super keys are unique, but they can be non-minimal, meaning they may
contain extra attributes. A composite key chosen as a primary key is normally minimal:
removing any of its columns would break uniqueness.
• Q.970
Question
What is the difference between NULL and NOT NULL?
Answer
• NULL:
Key Differences:
• NULL allows for missing or undefined values, while NOT NULL enforces that a column
always has a value.
• NULL is used to represent unknown or undefined data, whereas NOT NULL ensures data
integrity by requiring values in the column.
• Q.971
Question
Difference between Default Constraint and Check Constraint.
Answer
• Default Constraint:
• Definition: A Default Constraint automatically assigns a default value to a column when
no value is provided during an insert operation.
• Purpose: Ensures that a column has a predefined value when no explicit value is supplied.
• Example:
Here, if status is not provided during insertion, it will automatically default to 'Active'.
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
name VARCHAR(100),
status VARCHAR(10) DEFAULT 'Active'
);
• Check Constraint:
• Definition: A Check Constraint ensures that the values entered into a column satisfy a
specified condition or rule.
• Purpose: Restricts the data that can be entered into a column by enforcing a condition.
• Example:
Here, the age column only allows values greater than or equal to 18.
CREATE TABLE employees (
age INT CHECK (age >= 18)
);
Key Differences:
• Default Constraint provides a default value when no value is supplied, while Check
Constraint ensures that the values meet specific criteria or conditions.
• Default Constraint applies to missing values, whereas Check Constraint applies to all
values entered.
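Both constraints can be seen working together in a short sqlite3 sketch. The combined employees table and its sample rows are assumptions for illustration: one insert omits status so the default applies, and another violates the age rule so the check rejects it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        age         INTEGER CHECK (age >= 18),
        status      TEXT DEFAULT 'Active'
    )
""")

# status is omitted, so the DEFAULT constraint supplies 'Active'.
conn.execute("INSERT INTO employees (employee_id, name, age) VALUES (1, 'Asha', 30)")
default_status = conn.execute(
    "SELECT status FROM employees WHERE employee_id = 1"
).fetchone()[0]

try:
    # age 16 fails the CHECK constraint, so the row is not stored.
    conn.execute("INSERT INTO employees (employee_id, name, age) VALUES (2, 'Ben', 16)")
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True

# An explicit status overrides the default; the CHECK on age still applies.
conn.execute("INSERT INTO employees VALUES (3, 'Chen', 45, 'On Leave')")
statuses = [
    row[0] for row in conn.execute("SELECT status FROM employees ORDER BY employee_id")
]
```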
• Q.972
Question
What is the difference between Natural Join and Cross Join?
Answer
• Natural Join:
• Definition: A Natural Join automatically joins tables based on all columns with the same
name and compatible data types in both tables.
• Purpose: Simplifies the join by matching columns with the same name, eliminating
duplicates in the result set.
• Example:
In this example, employees and departments are joined based on all columns with the same
name (e.g., department_id).
SELECT *
FROM employees
NATURAL JOIN departments;
• Cross Join:
• Definition: A Cross Join returns the Cartesian product of two tables, meaning each row
from the first table is paired with every row from the second table.
• Purpose: Generates all possible combinations of rows from both tables.
• Example:
SELECT *
FROM employees
CROSS JOIN departments;
Key Differences:
• Natural Join automatically joins tables based on matching column names and eliminates
duplicates, whereas Cross Join returns every combination of rows from both tables
(Cartesian product).
• Natural Join is used for related data, while Cross Join is used for generating
combinations or testing purposes.
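To compare the two joins side by side, here is a minimal sqlite3 sketch. The sample rows are assumptions for illustration; the only column name the two tables share is department_id, so that is what the natural join matches on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, name TEXT, department_id INTEGER);
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    INSERT INTO employees VALUES (1, 'Asha', 10), (2, 'Ben', 20), (3, 'Chen', 10);
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'HR');
""")

# NATURAL JOIN matches rows on the shared column name, department_id.
natural_rows = conn.execute("""
    SELECT name, department_name
    FROM employees NATURAL JOIN departments
    ORDER BY employee_id
""").fetchall()

# CROSS JOIN pairs every employee with every department: 3 x 2 rows.
cross_rows = conn.execute("""
    SELECT name, department_name
    FROM employees CROSS JOIN departments
""").fetchall()

# With SELECT *, the shared column appears only once in a natural join.
natural_columns = [
    col[0]
    for col in conn.execute("SELECT * FROM employees NATURAL JOIN departments").description
]
```

The natural join returns one row per employee, while the cross join returns all six combinations regardless of whether the department matches.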
• Q.973
Question
What is the difference between INT and BIGINT?
Answer
• INT:
• Definition: The INT data type is used to store integer values within a specified range.
• Range: Typically from -2,147,483,648 to 2,147,483,647 (for signed integers).
• Storage: Requires 4 bytes of storage.
• Example:
CREATE TABLE employees (
employee_id INT
);
• BIGINT:
• Definition: The BIGINT data type is used to store larger integer values than INT.
• Range: Typically from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (for
signed integers).
• Storage: Requires 8 bytes of storage.
• Example:
CREATE TABLE transactions (
transaction_id BIGINT
);
Key Differences:
• Size: BIGINT can store much larger numbers compared to INT.
• Storage: BIGINT requires more storage (8 bytes) than INT (4 bytes).
• Use Case: Use INT for smaller numbers and BIGINT when handling large values, such as
for IDs or counters that might exceed the range of INT.
• Q.974
Question
What is the difference between DATE and DATETIME?
Answer
• DATE:
• Definition: The DATE data type is used to store only the date (year, month, and day)
without time information.
• Format: YYYY-MM-DD
• Range: In MySQL, from '1000-01-01' to '9999-12-31'; other engines differ.
• Example:
CREATE TABLE events (
event_date DATE
);
• DATETIME:
• Definition: The DATETIME data type is used to store both the date and time (hours,
minutes, seconds) information.
• Format: YYYY-MM-DD HH:MM:SS
• Range: In MySQL, from '1000-01-01 00:00:00' to '9999-12-31 23:59:59'; other engines differ.
• Example:
CREATE TABLE events (
event_datetime DATETIME
);
Key Differences:
• Time Information: DATE stores only the date, while DATETIME stores both date and time.
• Precision: DATETIME provides more detailed information with time, making it suitable for
timestamps.
• Use Case: Use DATE when you only need the date (e.g., birthdates), and DATETIME when
both date and time are required (e.g., event timestamps).
• Q.975
Question
What is the difference between FLOAT and DECIMAL?
Answer
• FLOAT:
• Definition: FLOAT is a data type used to store approximate numeric values with floating
decimal points. It is used for storing values that require a wide range but do not require
precise accuracy.
• Precision: Stores numbers in an approximate way, which may lead to rounding errors in
some cases.
• Usage: Suitable for scientific calculations or when the precision of the fractional part is not
crucial.
• Example:
CREATE TABLE products (
price FLOAT
);
• DECIMAL (or NUMERIC):
• Definition: DECIMAL stores exact numeric values with fixed precision and scale. It is used
when precise calculations are necessary, especially for financial data.
• Precision: The DECIMAL(p, s) type allows you to define the total number of digits (p) and
the number of digits after the decimal point (s), ensuring exact values.
• Usage: Ideal for storing monetary values or any data where exact precision is critical.
• Example:
CREATE TABLE transactions (
amount DECIMAL(10, 2)
);
Key Differences:
• Precision: DECIMAL stores exact values with a defined precision, while FLOAT is
approximate and may lead to rounding errors.
• Use Case: Use DECIMAL for financial data or when precision is critical, and FLOAT for
scientific calculations where some approximation is acceptable.
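The rounding issue is easy to reproduce outside a database. The sketch below uses Python's float (binary floating point, the kind of representation FLOAT columns typically use) and the decimal module as a stand-in for DECIMAL(10, 2); it illustrates the arithmetic only and does not model any particular engine's storage.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so their sum is only approximately 0.3.
float_total = 0.1 + 0.2
float_is_exact = (float_total == 0.3)

# Decimal keeps exact base-10 digits, as a DECIMAL column would.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = (decimal_total == Decimal("0.30"))
```

The float comparison fails while the decimal comparison succeeds, which is why monetary amounts are normally stored as DECIMAL.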
• Q.976
Question
Answer
• ENUM:
• Definition: ENUM is a MySQL data type used to store a single value from a predefined list of
values (PostgreSQL offers a similar feature through CREATE TYPE ... AS ENUM).
• Purpose: Limits the column to one value from a list of possible options.
• Example:
Here, status can only be one of the three values: 'Active', 'Inactive', or 'On Leave'.
CREATE TABLE employees (
status ENUM('Active', 'Inactive', 'On Leave')
);
• SET:
• Definition: SET is a MySQL-specific data type used to store multiple values from a
predefined list. It allows the selection of zero or more values from the list.
• Purpose: Useful when you want to store multiple values in a single column.
• Example:
Here, skills can store one or more values like 'Java', 'SQL', or both.
CREATE TABLE employees (
skills SET('Java', 'SQL', 'Python', 'JavaScript')
);
Key Differences:
• Value Storage: ENUM stores only one value from the list, whereas SET can store one or
more values from the list.
• Use Case: Use ENUM for single-choice fields and SET when a column needs to store
multiple choices.
• Q.977
Question
What is the difference between DELETE and TRUNCATE?
Answer
• DELETE:
• Definition: DELETE is a DML (Data Manipulation Language) command used to remove
rows from a table based on a condition. It can delete specific rows or all rows.
• Behavior:
• Can delete specific rows based on a WHERE clause.
• Each deleted row is logged individually in the transaction log, which makes it slower for large datasets.
• Triggers associated with the table are fired when DELETE is used.
• Example:
DELETE FROM employees WHERE status = 'Inactive';
• TRUNCATE:
• Definition: TRUNCATE is a DDL (Data Definition Language) command used to remove all
rows from a table. It resets the table to its empty state.
• Behavior:
• Removes all rows from the table (cannot be used with a WHERE clause).
• It is faster than DELETE because it does not log individual row deletions.
• Does not fire triggers.
• Does not reset the table’s structure (like constraints or indexes), but often resets identity
columns.
• Example:
TRUNCATE TABLE employees;
Key Differences:
• Scope: DELETE can remove specific rows, whereas TRUNCATE removes all rows.
• Performance: TRUNCATE is faster because it is minimally logged.
• Transaction Log: DELETE is fully logged, while TRUNCATE has minimal logging.
• Triggers: DELETE activates triggers, but TRUNCATE does not.
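• Rollback: DELETE can always be rolled back within a transaction. TRUNCATE can be rolled
back in SQL Server and PostgreSQL, but in MySQL and Oracle it commits implicitly and cannot
be undone. SQL Server also rejects TRUNCATE on a table referenced by a foreign key constraint.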
• Q.978
Question
What is a scalar function?
Answer
A scalar function is a function in SQL that takes one or more input values and returns a
single value for each row it is applied to. These functions perform operations like
mathematical calculations, string manipulations, or data type conversions.
Key Characteristics:
• Input: Takes one or more input values (arguments).
• Output: Returns a single value (scalar result).
• Example: Mathematical, string, or date functions.
Example:
SELECT UPPER('hello') AS UppercaseString;
• This uses the UPPER() scalar function to convert the string 'hello' to 'HELLO'.
Purpose: Scalar functions are used to process or transform individual values within SQL
queries, returning a single result for each input.
• Q.979
Question
What is the difference between COALESCE and IFNULL?
Answer
• COALESCE:
• Definition: COALESCE returns the first non-NULL value from a list of arguments.
• Syntax: COALESCE(value1, value2, ..., valueN)
• Behavior: It can take multiple arguments and returns the first non-NULL value. If all
values are NULL, it returns NULL.
• Example:
SELECT COALESCE(NULL, NULL, 'Hello', 'World');
Result: 'Hello' (returns the first non-NULL value).
• IFNULL:
• Definition: IFNULL checks if the first argument is NULL; if it is, it returns the second
argument; otherwise, it returns the first argument.
• Syntax: IFNULL(expression, replacement)
• Behavior: It only takes two arguments, returning the second argument if the first is NULL.
• Example:
SELECT IFNULL(NULL, 'Default');
Result: 'Default' (since the first argument is NULL).
Key Differences:
• Number of Arguments: COALESCE can take multiple arguments, whereas IFNULL only
takes two arguments.
• Flexibility: COALESCE is more flexible and can handle multiple possible fallback values,
while IFNULL is more limited to one fallback value.
• Database Compatibility: COALESCE is standard SQL, while IFNULL is typically used in
MySQL and SQLite.
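Since SQLite supports both functions, the two examples above can be run side by side. This is a minimal sketch using Python's sqlite3 module; the variable names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a single-value query and return that value."""
    return conn.execute(sql).fetchone()[0]

coalesce_result = scalar("SELECT COALESCE(NULL, NULL, 'Hello', 'World')")
ifnull_result = scalar("SELECT IFNULL(NULL, 'Default')")

# When the first argument is not NULL, IFNULL returns it unchanged.
ifnull_passthrough = scalar("SELECT IFNULL('Value', 'Default')")

# With two arguments, COALESCE behaves the same way as IFNULL.
two_arg_coalesce = scalar("SELECT COALESCE(NULL, 'Default')")

# If every argument is NULL, COALESCE returns NULL (None in Python).
all_null = scalar("SELECT COALESCE(NULL, NULL)")
```

Because COALESCE covers the two-argument case as well, it is the more portable choice when a query may need to run on several engines.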
• Q.980
Question
What is the difference between CASE and IF?
Answer
• CASE:
• Definition: CASE is an expression used to perform conditional logic inside SQL queries,
similar to an IF-THEN-ELSE statement.
• Usage: Can be used in both SELECT statements and other SQL clauses (e.g., WHERE, ORDER
BY).
• Syntax:
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
ELSE result3
END
• Example:
SELECT employee_id,
CASE
WHEN salary > 50000 THEN 'High'
ELSE 'Low'
END AS salary_category
FROM employees;
• Behavior: Returns a value based on the condition(s) specified.
• IF:
• Definition: IF is a statement for conditional logic in procedural SQL, used for flow
control in stored procedures or functions rather than inside regular queries. (MySQL also
provides a separate IF() function that can be used in queries.)
• Usage: Primarily used in stored procedures, triggers, or functions in SQL.
• Syntax:
IF condition THEN
-- Do something
ELSE
-- Do something else
END IF;
• Example:
DELIMITER //
CREATE PROCEDURE check_salary(IN salary INT)
BEGIN
IF salary > 50000 THEN
SELECT 'High Salary';
ELSE
SELECT 'Low Salary';
END IF;
END //
DELIMITER ;
Key Differences:
• Usage Context:
• CASE is used in SQL queries for conditional column values or expressions.
• IF is used in SQL procedures, functions, or flow control statements (not in regular SELECT
queries).
• Flexibility:
• CASE is more versatile for use in SELECT, WHERE, ORDER BY, etc.
• IF is used for more complex, procedural logic and flow control.
• Return Type:
• CASE returns a value based on conditions, while IF is more about executing specific
statements based on a condition.
• Q.981
Question
What is the difference between CAST and CONVERT?
Answer
• CAST:
• Definition: CAST is a standard SQL function used to convert one data type to another.
• Usage: Works in most SQL databases and is part of the SQL standard.
• Syntax:
CAST(expression AS target_data_type)
• Example:
SELECT CAST('123' AS INT);
• Behavior: Simple and universal for type conversion across different databases.
• CONVERT:
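• Definition: CONVERT, in the form shown here, is specific to SQL Server and is used to
convert data from one type to another, with optional formatting for date/time conversions.
(MySQL has a CONVERT function too, but with the arguments in the opposite order and no
style parameter.)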
• Usage: Primarily used in SQL Server, but not part of the SQL standard. Allows additional
style formatting for date and time conversions.
• Syntax:
CONVERT(target_data_type, expression, [style])
• Example:
SELECT CONVERT(INT, '123');
SELECT CONVERT(VARCHAR, GETDATE(), 1); -- Date format style
• Behavior: Offers more flexibility, especially for date and time formatting, but is not as
portable as CAST.
Key Differences:
• Portability: CAST is standard SQL, whereas CONVERT is specific to SQL Server.
• Flexibility: CONVERT supports additional options, such as specifying styles for date/time
formatting, while CAST is simpler and more straightforward.
• Use Case: Use CAST for general data type conversion across most SQL databases, and
CONVERT in SQL Server when additional date formatting is required.
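The portable CAST form can be tried in almost any engine. Below is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module (an assumption made for illustration; SQLite has no CONVERT function, so only CAST is shown).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CAST turns the text '123' into an integer, so arithmetic behaves numerically.
as_int = conn.execute("SELECT CAST('123' AS INTEGER) + 1").fetchone()[0]

# CAST in the other direction: a number rendered as text.
as_text = conn.execute("SELECT CAST(45.5 AS TEXT)").fetchone()[0]

print(as_int, repr(as_text))
```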
• Q.982
Question
What is the difference between a multi-column subquery and a nested subquery?
Answer
• Multi-Column Subquery:
• Definition: A multi-column subquery returns more than one column and is used when several
columns must be compared together as a single unit.
• Usage: It is typically used with the IN, ANY, or ALL operators in the WHERE clause.
• Example:
SELECT employee_id, department_id
FROM employees
WHERE (employee_id, department_id) IN (
SELECT employee_id, department_id
FROM employees
WHERE salary > 50000
);
• Behavior: The subquery returns more than one column (employee_id and
department_id), and the outer query compares these values against the corresponding
columns in the main query.
• Nested Subquery:
• Definition: A nested subquery is a subquery placed inside another subquery or SQL
query, generally used to return a single value or set of values for comparison.
• Usage: It can be used in SELECT, FROM, WHERE, or other SQL clauses, often nested within
another query.
• Example:
SELECT employee_id, salary
FROM employees
WHERE salary > (
SELECT AVG(salary)
FROM employees
);
• Behavior: The inner subquery computes an average salary, and the outer query compares
each employee's salary against this average.
Key Differences:
• Columns Returned:
• A multi-column subquery returns multiple columns, often used for comparisons
involving multiple fields.
• A nested subquery typically returns a single value or a set of values but is placed inside
another query.
• Usage:
• A multi-column subquery is commonly used with the IN or ANY operators, comparing
multiple columns.
• A nested subquery is more flexible, used in different parts of a query like WHERE, SELECT,
etc.
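The two forms can be compared side by side. This is a minimal sketch, assuming an in-memory SQLite database (which accepts row-value comparisons such as (a, b) IN (...)) and a small hypothetical employees table; the data is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER, salary INTEGER);
    INSERT INTO employees VALUES
        (1, 10, 40000), (2, 10, 65000), (3, 20, 55000), (4, 20, 54000);
""")

# Multi-column subquery: (department_id, salary) pairs are matched against
# each department's highest salary.
top_paid = conn.execute("""
    SELECT employee_id
    FROM employees
    WHERE (department_id, salary) IN (
        SELECT department_id, MAX(salary)
        FROM employees
        GROUP BY department_id
    )
    ORDER BY employee_id
""").fetchall()

# Single-value nested subquery: each salary is compared with the overall average.
above_avg = conn.execute("""
    SELECT employee_id
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY employee_id
""").fetchall()

print(top_paid, above_avg)
```

The first query needs both columns to match in the same subquery row; the second only needs one scalar value from the inner query.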
• Q.983
Question
What is a bitmap index?
Answer
A bitmap index is a type of database index (offered by, for example, Oracle) that uses bitmap vectors
(bit arrays) to represent the presence or absence of values in a column. It is especially effective for
columns with low cardinality, meaning columns that have a limited number of distinct values.
Key Features:
• Bitmap Representation: Each distinct value in the column is represented by a bitmap (a
sequence of 0s and 1s). Each bit corresponds to a row in the table, where 1 means the value is
present in that row, and 0 means it is absent.
• Efficient for Low Cardinality: Bitmap indexes are most useful when the column has a
small number of unique values (e.g., Gender, Status, or Yes/No columns).
• Storage: The index is very compact and efficient for columns with a small set of distinct
values, but can be inefficient with high-cardinality columns (e.g., Name or Email).
• Bitwise Operations: Bitmap indexes allow for fast bitwise operations (AND, OR, NOT),
which can speed up complex queries, particularly when multiple conditions are involved.
Example:
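For a Gender column with values like 'Male', 'Female', 'Other', a bitmap index would create one
bitmap for each distinct gender, each holding one bit per table row. For example, in a four-row table
whose Gender values are Male, Female, Female, Male, the 'Male' bitmap is 1001, the 'Female' bitmap
is 0110, and the 'Other' bitmap is 0000.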
Key Benefits:
• Faster Query Performance: Particularly for analytical queries involving multiple
conditions (e.g., WHERE Gender = 'Male' AND Status = 'Active').
• Efficient Storage: Especially for columns with a limited number of distinct values.
Limitations:
• Inefficient for High Cardinality Columns: Bitmap indexes are not suitable for columns
with many unique values as they consume more space and can be slower.
• Update Overhead: If the column being indexed is frequently updated, the bitmap index
may require more resources to maintain.
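The idea behind bitmap indexes can be sketched in a few lines of plain Python. This is a simplified model under stated assumptions (hypothetical column values, one Python integer per bitmap); real database engines add compression and locking that are not shown here.

```python
# Hypothetical column values, one entry per table row (row 0 to row 5).
gender = ["Male", "Female", "Male", "Other", "Female", "Male"]
status = ["Active", "Active", "Inactive", "Active", "Inactive", "Active"]

def build_bitmaps(column):
    """Return {value: int}, where bit i of the int is 1 if row i holds that value."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps

gender_idx = build_bitmaps(gender)
status_idx = build_bitmaps(status)

# WHERE gender = 'Male' AND status = 'Active' becomes a single bitwise AND.
match_bits = gender_idx["Male"] & status_idx["Active"]
matching_rows = [row for row in range(len(gender)) if (match_bits >> row) & 1]

print(matching_rows)
```

Each additional condition costs one more bitwise operation, which is why multi-condition filters on low-cardinality columns suit this structure.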
• Q.984
Question
What is the purpose of the COALESCE function in SQL?
Answer
The COALESCE function in SQL is used to return the first non-NULL value from a list of
arguments. It is useful for handling NULL values in queries and providing default values when
NULL is encountered.
Key Points:
• Purpose: To handle NULL values by replacing them with a specified default value or the
first non-NULL value in a list.
• Syntax:
COALESCE(expression1, expression2, ..., expressionN)
Example:
SELECT COALESCE(NULL, NULL, 'Hello', 'World');
Result: 'Hello'
Explanation: The first two values are NULL, but 'Hello' is the first non-NULL value, so it is
returned.
Use Cases:
• Default Value: Replacing NULL with a default value.
SELECT COALESCE(salary, 0) FROM employees;
Benefits:
• Simplifies Logic: Avoids complex CASE statements for NULL handling.
• Improves Readability: Makes SQL queries more readable and concise when dealing with
NULL values.
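A common place where COALESCE matters is arithmetic on nullable columns. The sketch below assumes an in-memory SQLite database and a hypothetical employees table with a bonus column (not in the original example) to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary INTEGER, bonus INTEGER);
    INSERT INTO employees VALUES (1, 50000, 5000), (2, NULL, 2000), (3, 60000, NULL);
""")

# Without COALESCE, salary + bonus is NULL whenever either operand is NULL.
rows = conn.execute("""
    SELECT employee_id,
           salary + bonus AS raw_total,
           COALESCE(salary, 0) + COALESCE(bonus, 0) AS safe_total
    FROM employees
    ORDER BY employee_id
""").fetchall()

for row in rows:
    print(row)
```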
• Q.985
Question
What is the NTILE function in SQL?
Answer
The NTILE function in SQL is a window function that distributes rows of a result set into a
specified number of buckets or groups, based on the order defined by the ORDER BY clause.
Each bucket gets an approximately equal number of rows. It is commonly used for data
analysis to categorize data into quantiles (e.g., quartiles, deciles).
Syntax:
NTILE(number_of_buckets) OVER (ORDER BY column_name)
• number_of_buckets: The number of groups (or buckets) you want to divide the rows into.
• ORDER BY: Specifies the column(s) that define the order of rows before dividing them into
buckets.
Example:
Suppose we have the following sales table:
+-------------+---------+
| employee_id | sales |
+-------------+---------+
| 1 | 5000 |
| 2 | 3000 |
| 3 | 7000 |
| 4 | 4000 |
| 5 | 6000 |
| 6 | 2000 |
+-------------+---------+
To divide the employees into 3 groups (buckets) based on their sales, we use the NTILE
function:
SELECT employee_id, sales, NTILE(3) OVER (ORDER BY sales DESC) AS sales_bucket
FROM sales;
Result:
+-------------+---------+-------------+
| employee_id | sales | sales_bucket|
+-------------+---------+-------------+
| 3 | 7000 | 1 |
| 5 | 6000 | 1 |
| 1 | 5000 | 2 |
| 4 | 4000 | 2 |
| 2 | 3000 | 3 |
| 6 | 2000 | 3 |
+-------------+---------+-------------+
Key Points:
• The NTILE function assigns each row to a bucket numbered from 1 to n (where n is the
number of buckets).
• The result is ordered by the specified column(s) in the ORDER BY clause before the rows are
distributed into buckets.
• Rows are distributed as evenly as possible. If the number of rows is not divisible by the
number of buckets, the lower-numbered buckets each receive one extra row.
Use Cases:
• Dividing data into quartiles, deciles, or any other type of quantile.
• Ranking or segmenting data for comparative analysis.
Summary: The NTILE function is useful for distributing rows into a specific number of
groups based on their order, which is ideal for analysis tasks that require dividing data into
segments (like percentiles or rankings).
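The bucket assignment can be reproduced with the sales table from the example. This sketch assumes an in-memory SQLite database (version 3.25 or later, which added window functions) and adds a hypothetical NTILE(4) column to show what happens when six rows do not divide evenly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (employee_id INTEGER PRIMARY KEY, sales INTEGER);
    INSERT INTO sales VALUES (1, 5000), (2, 3000), (3, 7000), (4, 4000), (5, 6000), (6, 2000);
""")

rows = conn.execute("""
    SELECT employee_id, sales,
           NTILE(3) OVER (ORDER BY sales DESC) AS three_buckets,
           NTILE(4) OVER (ORDER BY sales DESC) AS four_buckets
    FROM sales
    ORDER BY sales DESC
""").fetchall()

for row in rows:
    print(row)
```

With four buckets, the first two buckets hold two rows each and the last two hold one row each.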
• Q.986
Question
How do you update existing records in a table?
Answer
To update existing records in a table, you use the UPDATE statement in SQL. This statement
modifies the values of one or more columns in existing rows of a table.
Syntax:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
• table_name: The name of the table where the data will be updated.
• SET: Specifies the columns to be updated and their new values.
• WHERE: Defines the condition to specify which rows should be updated (if omitted, all rows
are updated).
Example:
UPDATE employees
SET salary = 60000, department = 'HR'
WHERE employee_id = 101;
This updates the salary and department for the employee with employee_id = 101.
Key Points:
• WHERE Clause: Always use the WHERE clause to avoid updating all rows in the table.
• Multiple Columns: You can update multiple columns in a single UPDATE statement.
• Rollback: If you update the wrong records inside a transaction that has not yet been committed,
you can undo the change with ROLLBACK.
Summary: The UPDATE statement is used to modify existing records in a table by specifying
new values for one or more columns based on a condition provided by the WHERE clause.
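The effect of the WHERE clause is easy to observe through the affected-row count. The following is a minimal sketch, assuming an in-memory SQLite database and a hypothetical two-row employees table; the row count attribute (rowcount) belongs to Python's sqlite3 cursor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, salary INTEGER, department TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [(101, 50000, "IT"), (102, 45000, "IT")])
conn.commit()

# Targeted update: the WHERE clause limits the change to one row.
cur = conn.execute(
    "UPDATE employees SET salary = ?, department = ? WHERE employee_id = ?",
    (60000, "HR", 101))
targeted_rows = cur.rowcount
conn.commit()

# An update without WHERE touches every row ...
cur = conn.execute("UPDATE employees SET salary = 0")
accidental_rows = cur.rowcount
# ... but it can still be undone because it has not been committed.
conn.rollback()

salaries = conn.execute(
    "SELECT employee_id, salary FROM employees ORDER BY employee_id").fetchall()
print(targeted_rows, accidental_rows, salaries)
```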
• Q.987
Question
What is the MERGE statement in SQL?
Answer
The MERGE statement in SQL is used to perform insert, update, or delete operations on a
target table based on matching conditions with a source table. It allows for conditional
updates or inserts in a single statement, making it especially useful for handling "upsert"
operations (i.e., updating existing rows or inserting new ones based on certain conditions).
Syntax:
MERGE INTO target_table AS target
USING source_table AS source
ON target.column = source.column
WHEN MATCHED THEN
UPDATE SET target.column = source.column
WHEN NOT MATCHED THEN
INSERT (column1, column2, ...) VALUES (value1, value2, ...)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
Explanation:
• target_table: The table that will be updated, inserted into, or deleted from.
• source_table: The table that provides the data for comparison.
• ON: Specifies the condition for matching rows between the target and source tables.
• WHEN MATCHED: Defines the action to take when a match is found (e.g., update).
• WHEN NOT MATCHED: Defines the action to take when no match is found (e.g., insert).
• WHEN NOT MATCHED BY SOURCE: Defines the action when a row exists in the target table
but has no corresponding row in the source table (e.g., delete). This clause originated in SQL Server
and is not available in every database that supports MERGE.
Example:
Assume we have a products table and a new_products table. We want to update the price
of existing products and insert new products if they don't already exist.
MERGE INTO products AS p
USING new_products AS np
ON p.product_id = np.product_id
WHEN MATCHED THEN
UPDATE SET p.price = np.price
WHEN NOT MATCHED THEN
INSERT (product_id, product_name, price)
VALUES (np.product_id, np.product_name, np.price);
Key Points:
• Single Operation: MERGE allows performing multiple operations (insert, update, delete) in
one query.
• Efficient for Upserts: Commonly used for upsert operations (inserting or updating
records based on whether a match exists).
• Deletion Option: You can also delete records that no longer have a corresponding match
in the source table.
Summary: The MERGE statement in SQL combines INSERT, UPDATE, and DELETE operations
into one powerful statement, making it ideal for synchronizing data between two tables. It
compares rows from a source table with rows in a target table and performs actions based on
whether a match is found.
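Not every engine implements MERGE. As an illustration of the same upsert outcome, the sketch below swaps in a different technique, SQLite's INSERT ... ON CONFLICT DO UPDATE, run through Python's sqlite3 module. The product rows are hypothetical, and this is not the MERGE statement itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT, price REAL);
    CREATE TABLE new_products (product_id INTEGER PRIMARY KEY, product_name TEXT, price REAL);
    INSERT INTO products VALUES (1, 'Laptop', 1000.0), (2, 'Mouse', 20.0);
    INSERT INTO new_products VALUES (2, 'Mouse', 25.0), (3, 'Keyboard', 45.0);
""")

# Insert every source row; when product_id already exists, update the price instead.
# "WHERE true" keeps SQLite from reading ON CONFLICT as part of a join clause.
conn.execute("""
    INSERT INTO products (product_id, product_name, price)
    SELECT product_id, product_name, price FROM new_products WHERE true
    ON CONFLICT(product_id) DO UPDATE SET price = excluded.price
""")
conn.commit()

products = conn.execute("SELECT * FROM products ORDER BY product_id").fetchall()
print(products)
```

Product 2 is updated, product 3 is inserted, and product 1 is left untouched, which matches the WHEN MATCHED / WHEN NOT MATCHED branches of the example above.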
• Q.988
Question
What is PIVOT in SQL, and how is it different from GROUP BY?
Answer
PIVOT in SQL:
The PIVOT operation in SQL is used to rotate data, converting unique values from one
column into multiple columns in the result set. It is often used for summarizing or
transforming row data into columnar data, making it easier to analyze in a tabular format.
• Purpose: To aggregate data and display it in a cross-tabular format, where rows become
columns.
• Syntax (in SQL Server):
SELECT <non-aggregated_column>, [value1], [value2], [valueN]
FROM
(SELECT <columns> FROM <table>) AS source
PIVOT
(AGGREGATE_FUNCTION(<value_column>) FOR <pivot_column> IN ([value1], [value2], [valueN])) AS pvt;
Example:
Suppose you have a sales table:
+-------------+--------+---------+------------+
| product_id | month | sales | region |
+-------------+--------+---------+------------+
| 1 | Jan | 100 | East |
| 1 | Feb | 120 | East |
| 2 | Jan | 150 | West |
| 2 | Feb | 180 | West |
+-------------+--------+---------+------------+
Pivoting SUM(sales) by month, with product_id as the row key, gives this result:
+-------------+-----+-----+
| product_id | Jan | Feb |
+-------------+-----+-----+
| 1 | 100 | 120 |
| 2 | 150 | 180 |
+-------------+-----+-----+
GROUP BY in SQL:
The GROUP BY clause is used to group rows that have the same values into summary rows,
like calculating aggregates (e.g., COUNT(), SUM(), AVG(), etc.) for each group. It is commonly
used when you want to perform aggregation on data.
• Purpose: To aggregate data based on common values and provide summarized results for
each group.
• Syntax:
SELECT column, AGGREGATE_FUNCTION(column)
FROM table
GROUP BY column;
Example:
To get total sales per product and month:
SELECT product_id, month, SUM(sales) AS total_sales
FROM sales
GROUP BY product_id, month;
Result:
+-------------+--------+------------+
| product_id | month | total_sales|
+-------------+--------+------------+
| 1 | Jan | 100 |
| 1 | Feb | 120 |
| 2 | Jan | 150 |
| 2 | Feb | 180 |
+-------------+--------+------------+
Key Differences:
• Purpose:
• PIVOT: Rotates rows into columns to create a summary table with different columns for
each distinct value.
• GROUP BY: Groups rows based on a specific column or set of columns and applies
aggregate functions to each group.
• Output:
• PIVOT: Transforms data into a new format where distinct values from a column become
column headers.
• GROUP BY: Summarizes data in a grouped format with one row per group, showing
aggregated results.
• Complexity:
• PIVOT: Typically requires more complex syntax and is often used when you need to
create a cross-tab report.
• GROUP BY: Easier to use for simple aggregation but doesn’t transform the data structure
into a wide format like PIVOT.
• Flexibility:
• PIVOT: Useful for cross-tab reports (e.g., months as columns for sales). The output columns must
be listed explicitly in the IN clause, so column headers that are not known in advance require
dynamic SQL.
• GROUP BY: More flexible for general aggregation tasks and works in a wider variety of
scenarios.
Summary: The PIVOT operator in SQL is used to convert unique values into columns for
creating a cross-tab view, whereas GROUP BY is used to aggregate data by grouping rows with
common values and applying functions to summarize them.
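In engines without a PIVOT operator, the same cross-tab is usually produced with GROUP BY plus conditional aggregation. The sketch below uses that substitute technique on the sales table from the example, assuming an in-memory SQLite database; it shows the long (GROUP BY) and wide (pivoted) shapes next to each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (product_id INTEGER, month TEXT, sales INTEGER, region TEXT);
    INSERT INTO sales VALUES
        (1, 'Jan', 100, 'East'), (1, 'Feb', 120, 'East'),
        (2, 'Jan', 150, 'West'), (2, 'Feb', 180, 'West');
""")

# Long format: one row per (product_id, month) group.
grouped = conn.execute("""
    SELECT product_id, month, SUM(sales) AS total_sales
    FROM sales
    GROUP BY product_id, month
""").fetchall()

# Wide format: one row per product, one column per month.
pivoted = conn.execute("""
    SELECT product_id,
           SUM(CASE WHEN month = 'Jan' THEN sales ELSE 0 END) AS Jan,
           SUM(CASE WHEN month = 'Feb' THEN sales ELSE 0 END) AS Feb
    FROM sales
    GROUP BY product_id
    ORDER BY product_id
""").fetchall()

print(len(grouped), pivoted)
```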
• Q.989
Question
How do you perform a conditional update in SQL?
Answer
A conditional update in SQL is performed using the UPDATE statement in combination with
a WHERE clause that specifies the conditions under which the records should be updated. You
can also use a conditional expression such as CASE (or the IF() function in MySQL) to set values based on specific criteria.
Basic Syntax:
UPDATE table_name
SET column_name = new_value
WHERE condition;
• column_name: The column to update.
• new_value: The new value you want to set for the column.
• condition: Specifies which rows should be updated.
Example:
UPDATE employees
SET salary = 70000
WHERE employee_id = 101;
This updates the salary of the employee with employee_id = 101 to 70000.
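Example with Multiple Conditions:
UPDATE products
SET price = price * 1.10
WHERE category = 'Electronics' AND stock > 100;
This increases the price by 10% for products in the 'Electronics' category where the stock is
greater than 100.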
Key Points:
• The WHERE clause is crucial to limit which rows will be updated; without it, all rows will
be updated.
• CASE can be used for conditional logic in a single update statement to set different values
based on conditions.
• The AND, OR, and other logical operators can be used within the WHERE clause for more
complex conditions.
Summary: A conditional update in SQL is done using the UPDATE statement with a WHERE
clause. You can also use CASE to set different values based on conditions within the same
query.
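A CASE expression in the SET clause lets one statement apply different rules to different rows. This is a minimal sketch, assuming an in-memory SQLite database and a hypothetical products table; prices are stored as whole numbers and the 5% rule is invented so the arithmetic stays exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT, stock INTEGER, price INTEGER);
    INSERT INTO products VALUES
        (1, 'Electronics', 150, 200),
        (2, 'Electronics', 50, 300),
        (3, 'Books', 500, 10);
""")

# One UPDATE, three outcomes: +10%, +5%, or unchanged, decided row by row.
conn.execute("""
    UPDATE products
    SET price = CASE
                    WHEN category = 'Electronics' AND stock > 100 THEN price * 110 / 100
                    WHEN category = 'Electronics' THEN price * 105 / 100
                    ELSE price
                END
""")
conn.commit()

prices = conn.execute("SELECT product_id, price FROM products ORDER BY product_id").fetchall()
print(prices)
```

The ELSE price branch matters: without it, rows that match no WHEN condition would be set to NULL.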
• Q.990
Question
What is query optimization in SQL?
Answer
Query optimization is the process of improving the performance of an SQL query by
reducing its execution time and resource consumption (like CPU, memory, and disk I/O). It
involves analyzing and modifying queries to ensure they execute in the most efficient way
possible, based on the database structure, indexes, and execution plans.
Example:
For a query that performs a full table scan:
SELECT * FROM employees WHERE salary > 50000;
After an index is created on the salary column (for example, CREATE INDEX idx_salary ON
employees (salary);), the query optimizer may use the index to quickly find rows with a
salary > 50000, avoiding a full table scan.
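The change in access path can be observed by asking the engine for its plan before and after the index exists. This sketch assumes an in-memory SQLite database and uses SQLite's EXPLAIN QUERY PLAN; the table contents and the index name idx_employees_salary are hypothetical, and other databases word their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees (name, salary) VALUES (?, ?)",
                 [(f"emp{i}", 30000 + i * 100) for i in range(500)])

query = "SELECT * FROM employees WHERE salary > 50000"

def plan(sql):
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(query)   # expected to describe a scan of the whole table
conn.execute("CREATE INDEX idx_employees_salary ON employees (salary)")
plan_after = plan(query)    # expected to describe a search using the new index

print(plan_before)
print(plan_after)
```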
• Q.991
Question
How do you analyze query performance in SQL?
Answer
To analyze query performance in SQL, you can use the following methods:
• EXPLAIN: Shows the execution plan of a query, helping identify bottlenecks (like full
table scans, missing indexes, etc.).
EXPLAIN SELECT * FROM employees WHERE salary > 50000;
• EXPLAIN ANALYZE: Executes the query and provides both the execution plan and actual runtime
statistics (available in PostgreSQL and in MySQL 8.0.18 and later).
EXPLAIN ANALYZE SELECT * FROM employees WHERE salary > 50000;
• Query Profiling: Tools like SHOW PROFILE (in MySQL, now deprecated in favor of the
Performance Schema) provide detailed time breakdowns of query execution.
• Index Analysis: Check if indexes are being used effectively with the query, especially for
columns in WHERE, JOIN, or ORDER BY.
• Database Logs: Review slow query logs to identify queries that take longer to execute.
• Database Performance Tools: Use tools like SQL Server Profiler, MySQL Workbench,
or Oracle SQL Developer for detailed performance metrics.
• Q.992
Question
What is an execution plan in SQL?
Answer
An execution plan in SQL is a detailed roadmap that the database engine follows to execute
a query. It shows how the database will retrieve data, including the use of indexes, join
methods, and access paths.
Key Elements:
• Scan: Full table scan or index scan.
• Join Method: How tables are joined (e.g., nested loop join, hash join, merge join).
• Sort: How rows are sorted (e.g., ORDER BY).
• Filters: Conditions applied during data retrieval.
How to View:
• MySQL: EXPLAIN SELECT * FROM table;
• SQL Server: SET SHOWPLAN_ALL ON; or use Execution Plan in SSMS.
• PostgreSQL: EXPLAIN SELECT * FROM table; (add ANALYZE to execute the query and show actual run times)
• Q.993
Question
What is the difference between Sequence Scan and Bitmap Scan in SQL?
Answer
• Sequential Scan: A sequential scan (shown as Seq Scan in PostgreSQL plans) reads the entire table
row by row. It is used when no suitable index is available or when scanning all rows is cheaper than
using an index (e.g., for small tables or queries that return most of the rows).
• Bitmap Scan: A bitmap scan reads one or more indexes to build an in-memory bitmap of matching
row locations, then fetches those rows from the table. In PostgreSQL it works with ordinary indexes
(no bitmap index is required), and bitmaps from several indexes can be combined with AND or OR,
which suits queries with multiple conditions on indexed columns.
Key Differences:
• Sequential Scan: Reads all rows and uses no index.
• Bitmap Scan: Uses indexes to locate matching rows before reading the table. More efficient
when the conditions select a moderate fraction of a large table.
• Q.994
Question
How do you manage user permissions in SQL?
Answer
User permissions in SQL are managed mainly with the GRANT and REVOKE commands; SQL Server also provides DENY.
• GRANT: Assigns specific privileges to users or roles on database objects (tables, views,
etc.).
GRANT SELECT, INSERT ON employees TO user_name;
• REVOKE: Removes previously granted permissions from a user or role.
REVOKE SELECT, INSERT ON employees FROM user_name;
• DENY (SQL Server): Explicitly denies specific permissions to a user, even if the user inherits
them from a role.
DENY DELETE ON employees TO user_name;
• SHOW GRANTS (MySQL): Displays the current privileges for a user.
SHOW GRANTS FOR user_name;
Permissions can be granted at various levels (e.g., database, schema, table) and can be based
on roles to simplify management.
• Q.995
Question
What are the types of data integrity in SQL?
Answer
The main types of data integrity are:
• Entity Integrity: Ensures that each row in a table has a unique identifier (primary key) and
that no part of the primary key can be NULL.
• Referential Integrity: Ensures that foreign keys correctly reference valid rows in other
tables. It maintains relationships between tables.
• Domain Integrity: Ensures that all column values are within a defined domain (i.e., valid
data types, ranges, or specific values). This is enforced by constraints like CHECK, NOT NULL,
and DEFAULT.
• User-Defined Integrity: Custom rules defined by users that do not fall under other types,
ensuring the data follows business rules or logic.
• Q.996
Question
What is role-based access control (RBAC)?
Answer
Role-based access control (RBAC) is a security model where access to resources is granted
based on a user's role within an organization. Users are assigned roles, and each role has
specific permissions to perform actions on the database.
Key Elements:
• Roles: A set of permissions assigned to a group (e.g., Admin, User, Manager).
• Users: Individuals assigned to one or more roles.
• Permissions: Rights granted to roles, such as SELECT, INSERT, UPDATE, DELETE.
Example:
CREATE ROLE manager;
GRANT SELECT, UPDATE ON employees TO manager;
• Q.997
Question
What is the difference between COMMIT and ROLLBACK?
Answer
• COMMIT: Finalizes the changes made in a transaction and makes them permanent in the
database. Once committed, changes can no longer be undone with ROLLBACK.
COMMIT;
• ROLLBACK: Undoes all changes made during the current transaction, reverting the
database to its state before the transaction began.
ROLLBACK;
Key Difference:
• COMMIT makes changes permanent.
• ROLLBACK cancels changes made in the transaction, restoring the previous state.
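The difference shows up clearly when the same update is rolled back once and committed once. This is a minimal sketch, assuming an in-memory SQLite database and a hypothetical accounts table; commit() and rollback() are the Python sqlite3 connection methods that issue COMMIT and ROLLBACK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()      # COMMIT: the inserted row is now permanent

conn.execute("UPDATE accounts SET balance = balance - 30 WHERE account_id = 1")
conn.rollback()    # ROLLBACK: the uncommitted update is discarded
after_rollback = conn.execute("SELECT balance FROM accounts").fetchone()[0]

conn.execute("UPDATE accounts SET balance = balance - 30 WHERE account_id = 1")
conn.commit()      # COMMIT: this time the update is kept
conn.rollback()    # nothing is left to undo once the transaction is committed
after_commit = conn.execute("SELECT balance FROM accounts").fetchone()[0]

print(after_rollback, after_commit)
```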
• Q.998
Question
What is a covering index in SQL?
Answer
A covering index is an index that includes all the columns needed for a query, allowing the
database to satisfy the query entirely using the index, without needing to access the actual
table data.
Example:
If a query selects columns A, B, and C:
SELECT A, B, C FROM table WHERE A = 'value';
With an index that contains all three columns (for example, CREATE INDEX idx_abc ON table (A, B,
C);), the database can return the query result directly from the index, avoiding the need to access
the table.
Benefits:
• Improves performance by reducing I/O operations (no need to fetch data from the table).
• Speeds up queries that read only the columns stored in the index.
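Whether an index covers a query can be checked in the plan. The sketch below assumes an in-memory SQLite database, a hypothetical table t with an extra column D, and a hypothetical index name idx_t_abc; SQLite labels the covered case as COVERING INDEX, while other engines use different wording (for example, an index-only scan).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A TEXT, B INTEGER, C INTEGER, D TEXT)")
conn.execute("CREATE INDEX idx_t_abc ON t (A, B, C)")

def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN as one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# All selected columns live in the index, so the table itself is never read.
covered_plan = plan("SELECT A, B, C FROM t WHERE A = 'value'")

# Column D is not in the index, so each match needs an extra table lookup.
not_covered_plan = plan("SELECT A, B, C, D FROM t WHERE A = 'value'")

print(covered_plan)
print(not_covered_plan)
```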
• Q.999
Question
How do you avoid full table scans in SQL?
Answer
To avoid full table scans:
• Use Indexes: Ensure columns used in WHERE, JOIN, or ORDER BY have appropriate indexes.
• Example: Index on employee_id for fast lookups.
• Optimize Queries: Write selective queries with WHERE clauses to limit the number of rows
scanned.
• Analyze Execution Plans: Use EXPLAIN to check if indexes are being used or if full table
scans are occurring.
• Use Partitioning: Partition large tables to split data into smaller, manageable chunks,
reducing the number of rows scanned.
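• Limit Results: Use LIMIT or TOP to restrict the number of rows returned when only a
subset of data is needed; with a matching index, the database can stop reading early.
• Avoid SELECT *: Specify only the necessary columns to reduce unnecessary data retrieval and
make it more likely that a covering index can satisfy the query.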
• Q.1000
Question
What is a filtered index in SQL?
Answer
A filtered index (called a partial index in PostgreSQL and SQLite) is an index that is created with a
filter condition, indexing only the subset of rows that meet the specified criteria. This reduces the
index size and improves query performance when queries frequently reference that subset of rows.
Syntax:
CREATE INDEX index_name
ON table_name (column_name)
WHERE condition;
Example:
CREATE INDEX idx_active_employees
ON employees (status)
WHERE status = 'Active';
This index will only include rows where status = 'Active', improving performance for
queries filtering on active employees.
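A filtered index is only considered when the query's WHERE clause implies the index filter. The sketch below assumes an in-memory SQLite database, where the feature is called a partial index, and a hypothetical variant of the example that indexes a department column for active employees only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department TEXT, status TEXT)")
conn.execute("""
    CREATE INDEX idx_active_employees
    ON employees (department)
    WHERE status = 'Active'
""")

def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN as one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# The query repeats the index filter, so the partial index is usable.
active_plan = plan(
    "SELECT employee_id FROM employees WHERE status = 'Active' AND department = 'HR'")

# Inactive rows are not in the index, so it cannot help here.
inactive_plan = plan(
    "SELECT employee_id FROM employees WHERE status = 'Inactive' AND department = 'HR'")

print(active_plan)
print(inactive_plan)
```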
• Q.1001
Question
What is data encryption in SQL?
Answer
Data encryption in SQL is the process of converting sensitive data into an unreadable format
to prevent unauthorized access. It ensures that data stored in the database is secure, even if
someone gains access to the database files.
• Transparent Data Encryption (TDE): Encrypts entire databases, making data on disk
unreadable without the decryption key.
• Example: Used in SQL Server and Oracle.
• Column-level Encryption: Encrypts specific columns (e.g., personal information) while
leaving other data unencrypted.
• Example: AES_ENCRYPT() and AES_DECRYPT() in MySQL.
• Backup Encryption: Encrypts database backups to protect sensitive data during storage or
transfer.
• SSL/TLS Encryption: Secures the connection between client applications and the
database server, ensuring data is encrypted during transmission.
Example (MySQL):
CREATE TABLE users (
id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100),
password VARBINARY(255)
);
• Q.1002
Question
What is data masking in SQL?
Answer
Data masking in SQL is the process of hiding sensitive data by replacing it with fictional or
obfuscated values, while retaining its original format. It allows developers, testers, or other
users to work with realistic data without exposing confidential information.
For example, once a masking rule is applied to a salary column, queries from users without
permission to see the real data return masked values (e.g., xxx) instead of the actual salaries.
Benefits:
• Protects sensitive data in non-production environments.
• Reduces exposure of confidential information while maintaining data usability.
• Q.1003
Question
What is data modeling?
Answer
Data modeling is the process of designing the structure of a database, including its tables,
columns, relationships, and constraints. It helps define how data is stored, accessed, and
manipulated, ensuring the database supports the organization's needs effectively.
Key Components:
• Entities: Objects or concepts that store data (e.g., Customer, Order).
• Attributes: Data fields within entities (e.g., Customer Name, Order Date).
• Relationships: Associations between entities (e.g., Customer places Order).
Data modeling ensures efficient data storage, retrieval, and consistency.
• Q.1004
Question
What is an Entity-Relationship Diagram (ERD)?
Answer
An Entity-Relationship Diagram (ERD) is a visual representation of the entities in a
database and their relationships. It helps in designing and understanding the structure of a
database.
Key Components:
• Entities: Represented by rectangles; they are objects or concepts (e.g., Customer,
Product).
• Attributes: Represented by ovals; they are properties or details of entities (e.g., Customer
Name, Product Price).
• Relationships: Represented by diamonds; they show how entities are related (e.g.,
Customer places Order).
• Primary Key: Underlined attribute that uniquely identifies each entity.
Example:
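An ERD for a Customer placing an Order might show a Customer entity and an Order entity,
each with its attributes, connected by a "places" relationship.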
• Q.1005
Question
What is normalization?
Answer
Normalization is the process of organizing a database to reduce data redundancy and
improve data integrity by dividing large tables into smaller ones and defining relationships
between them.
Benefits:
• Reduces data duplication.
• Ensures consistency and integrity.
• Makes inserts, updates, and deletes less error-prone because each fact is stored in one place
(read queries may need additional joins).
• Q.1006
Question
What is the difference between Star Schema and Snowflake Schema in data warehousing?
Answer
• Star Schema:
In a Star Schema, the fact table is at the center and is directly connected to dimension tables.
The dimension tables are denormalized, which keeps the structure simple. It is easy to query and
typically performs faster for read operations, but may have data redundancy.
• Snowflake Schema:
In a Snowflake Schema, the dimension tables are normalized, meaning they are broken into
multiple related tables. This reduces redundancy but increases the complexity of queries and
joins. It is space-efficient but can be slower for querying due to multiple joins.
Key Difference:
• Star Schema has a denormalized structure, while Snowflake Schema is normalized.
• Star Schema is simpler and faster to query, whereas Snowflake Schema is more complex and
space-efficient.
• Q.1007
Question
What is the difference between a Fact Table and a Dimension Table?
Answer
• Fact Table:
A Fact Table contains the quantitative data (facts) used for analysis, such as sales revenue,
quantity sold, or profit. It is the central table in a schema and typically includes foreign keys
that link it to dimension tables.
• Dimension Table:
A Dimension Table contains descriptive attributes (dimensions) related to the facts, such as
product details, customer information, or time periods. It provides context to the data in the
fact table and is used for filtering, grouping, or categorizing the facts.
Example:
In a sales schema:
• Fact Table: Sales with columns like Sale_ID, Product_ID, Date_ID, Amount.
• Dimension Table: Product with columns like Product_ID, Product_Name, Category.
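How the two table types work together can be seen in a typical reporting query: measures come from the fact table and grouping labels from the dimension table. This is a minimal sketch, assuming an in-memory SQLite database; the Sales and Product columns follow the example above, while the rows themselves are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Product_ID INTEGER PRIMARY KEY, Product_Name TEXT, Category TEXT);
    CREATE TABLE Sales (
        Sale_ID INTEGER PRIMARY KEY,
        Product_ID INTEGER REFERENCES Product (Product_ID),
        Date_ID INTEGER,
        Amount INTEGER
    );
    INSERT INTO Product VALUES
        (1, 'Laptop', 'Electronics'), (2, 'Phone', 'Electronics'), (3, 'Desk', 'Furniture');
    INSERT INTO Sales VALUES
        (1, 1, 20250101, 1000), (2, 2, 20250101, 600),
        (3, 3, 20250102, 250), (4, 1, 20250102, 1000);
""")

# The fact table supplies the numbers; the dimension table supplies the context.
by_category = conn.execute("""
    SELECT p.Category, SUM(s.Amount) AS total_amount
    FROM Sales s
    JOIN Product p ON p.Product_ID = s.Product_ID
    GROUP BY p.Category
    ORDER BY p.Category
""").fetchall()

print(by_category)
```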
• Q.1008
Question
What is ELT?
Answer
ELT (Extract, Load, Transform) is a data integration process where:
• Extract: Data is pulled from multiple sources.
• Load: The raw data is loaded directly into a target system, such as a data warehouse or
data lake.
• Transform: The data is then transformed within the target system using its computational
power.
Key Features:
• Suitable for modern cloud-based data warehouses like Snowflake, BigQuery, or Redshift.
• Allows for faster data loading since transformations happen after loading.
• Enables handling large-scale data processing.
Example:
• Extract data from APIs and databases.
• Load it into a data lake (e.g., Amazon S3).
• Transform it using SQL or tools like dbt in a cloud data warehouse.
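The three steps can be sketched at small scale. The example below is an illustration under stated assumptions: an in-memory SQLite database stands in for the warehouse, and the raw_orders records and table names are hypothetical. The point is the ordering, with raw data loaded first and cleaned afterwards inside the target system.

```python
import sqlite3

# Extract: hypothetical raw records, as they might arrive from an API.
raw_orders = [
    {"order_id": 1, "amount": "19.50", "country": "us"},
    {"order_id": 2, "amount": "5.00", "country": "DE"},
    {"order_id": 3, "amount": "7.25", "country": "us"},
]

conn = sqlite3.connect(":memory:")  # stands in for the data warehouse

# Load: store the data unchanged (amount is still text, country has mixed case).
conn.execute("CREATE TABLE raw_orders (order_id INTEGER, amount TEXT, country TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (:order_id, :amount, :country)", raw_orders)

# Transform: clean and aggregate inside the target system using SQL.
conn.execute("""
    CREATE TABLE revenue_by_country AS
    SELECT UPPER(country) AS country, SUM(CAST(amount AS REAL)) AS revenue
    FROM raw_orders
    GROUP BY UPPER(country)
""")

revenue = conn.execute(
    "SELECT country, revenue FROM revenue_by_country ORDER BY country").fetchall()
print(revenue)
```

Keeping raw_orders untouched means the transformation can be rewritten and rerun later without extracting the data again.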
• Q.1009
Question
Why is SQL often preferred over other database approaches?
Answer
SQL (Structured Query Language), together with the relational databases built on it, is often
preferred over other data management approaches for the following reasons:
• Standardized Language:
SQL is a globally recognized and standardized language for managing relational databases.
• Ease of Use:
It has a simple, declarative syntax, making it easy to learn and use for querying, updating, and
managing data.
• Versatility:
SQL is supported by almost all relational database management systems (e.g., MySQL,
PostgreSQL, Oracle, MS SQL Server), ensuring compatibility and portability.
• Powerful Querying:
SQL allows complex queries using joins, aggregations, and subqueries to retrieve meaningful
insights from data.
• Scalability:
It handles large-scale data efficiently with indexing and partitioning, making it suitable for
enterprise applications.
• Integration:
SQL integrates seamlessly with analytics, reporting tools, and programming languages like
Python and R.
Why SQL Over Others:
While some NoSQL databases like MongoDB or Cassandra are better for unstructured data
and scalability, SQL excels in structured data management, ensuring data integrity,
consistency, and powerful analytics.
• Q.1010
Question
What are the different types of RDBMS?
Answer
There are several types of RDBMS based on features, architecture, and usage. Here are the
main categories:
• Open Source RDBMS:
• Free to use and often customizable.
• Examples:
• MySQL: Popular for web applications and small-medium businesses.
• PostgreSQL: Known for advanced features like JSON support and extensibility.
• MariaDB: A fork of MySQL with additional features.
• Commercial RDBMS:
• Proprietary software with enterprise-grade support and advanced features.
• Examples:
• Oracle Database: Known for its robustness and high scalability.
• Microsoft SQL Server: Commonly used in Windows-based environments.
• IBM Db2: Used in large-scale enterprise applications.
• Cloud-Based RDBMS:
• Managed services on the cloud, reducing operational overhead.
• Examples:
• Amazon RDS (Relational Database Service): Supports MySQL, PostgreSQL, SQL
Server, and more.
• Google Cloud SQL: Managed MySQL, PostgreSQL, and SQL Server.
• Azure SQL Database: A cloud-based version of Microsoft SQL Server.
• Embedded RDBMS:
• Used for applications that require a lightweight, embedded database.
• Examples:
• SQLite: Lightweight, serverless, and widely used in mobile apps.
• H2: A fast, in-memory RDBMS often used in Java applications.
• Distributed RDBMS:
• Designed to run on multiple servers to handle large-scale distributed data.
• Examples:
• CockroachDB: Horizontally scalable and fault-tolerant.
• Google Spanner: A globally distributed RDBMS.
• In-Memory RDBMS:
• Stores data in memory for faster processing.
• Examples:
• SAP HANA: Known for real-time analytics.
• MemSQL (now SingleStore): Optimized for high-speed data handling.
• Object-Relational RDBMS (ORDBMS):
Question Explanation:
• Data Structure:
• The Customers table holds information about each customer, including their
customer_id, customer_name, signup_date, and region.
• The Orders table holds information about each order placed by a customer, including the
order_id, customer_id, order_date, and amount spent.
• Logic:
• For each customer, we need to calculate the total spending over the last two quarters.
• Ensure the customer has placed at least 5 orders in each of the last two quarters.
• The result should include the total spending over the last two quarters, the average order
amount, and the customer name.
• Finally, we rank the customers by total spend and return the top 3.
• Challenges:
• Calculating the total spend for the last two quarters and determining which months fall
within those quarters.
• Filtering customers who made at least 5 orders per quarter.
• Ranking customers by their total spending in the last two quarters.
SQL Solution:
WITH quarterly_orders AS (
SELECT
o.customer_id,
EXTRACT(YEAR FROM o.order_date) AS order_year,
CASE
WHEN EXTRACT(MONTH FROM o.order_date) BETWEEN 1 AND 3 THEN 'Q1'
WHEN EXTRACT(MONTH FROM o.order_date) BETWEEN 4 AND 6 THEN 'Q2'
WHEN EXTRACT(MONTH FROM o.order_date) BETWEEN 7 AND 9 THEN 'Q3'
WHEN EXTRACT(MONTH FROM o.order_date) BETWEEN 10 AND 12 THEN 'Q4'
END AS quarter,
COUNT(o.order_id) AS num_orders,
SUM(o.amount) AS total_spend,
AVG(o.amount) AS avg_order_amount
FROM Orders o
WHERE o.order_date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY o.customer_id, EXTRACT(YEAR FROM o.order_date), quarter
),
filtered_customers AS (
SELECT
q.customer_id,
q.order_year,
q.quarter,
q.num_orders,
q.total_spend,
q.avg_order_amount
FROM quarterly_orders q
WHERE q.num_orders >= 5 -- Filter customers with at least 5 orders in a quarter
),
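customer_spending AS (
SELECT
fc.customer_id,
SUM(fc.total_spend) AS total_spend_last_two_quarters,
-- Weighted average: total spend divided by total orders, not an average of averages
SUM(fc.total_spend) / SUM(fc.num_orders) AS avg_order_amount_last_two_quarters
FROM filtered_customers fc
WHERE fc.quarter IN ('Q3', 'Q4') -- Last two quarters (hard-coded; adjust to the run date)
GROUP BY fc.customer_id
HAVING COUNT(*) = 2 -- Require at least 5 orders in BOTH quarters, not just one
),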
ranked_customers AS (
SELECT
cs.customer_id,
cs.total_spend_last_two_quarters,
cs.avg_order_amount_last_two_quarters,
ROW_NUMBER() OVER (ORDER BY cs.total_spend_last_two_quarters DESC) AS rank
FROM customer_spending cs
)
SELECT
c.customer_name,
rc.total_spend_last_two_quarters,
rc.avg_order_amount_last_two_quarters
FROM ranked_customers rc
JOIN Customers c ON c.customer_id = rc.customer_id
WHERE rc.rank <= 3; -- Top 3 customers by total spending
Expected Output:
• Step 4 (Ranking): The ROW_NUMBER() function is used to rank customers by total spend in
descending order.
• Final Output: We return the top 3 customers based on their total spending over the last
two quarters, along with their average order amount.
This problem combines filtering, aggregation, date manipulation, and ranking. It exercises
SQL features such as EXTRACT(), CASE, GROUP BY, ROW_NUMBER(), and chained CTEs to
express real-world business logic.
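One way to avoid hard-coding quarter labels is to derive the reporting window from the current date. The following is a minimal PostgreSQL sketch, not the book's solution. It assumes the Orders table described above and that "last two quarters" means the two most recently completed calendar quarters; the names bounds, window_start, window_end, and quarter_start are illustrative.

```sql
-- Sketch (PostgreSQL): per-customer, per-quarter totals for the two most
-- recently completed calendar quarters, derived from CURRENT_DATE.
WITH bounds AS (
    SELECT
        DATE_TRUNC('quarter', CURRENT_DATE) - INTERVAL '6 months' AS window_start,
        DATE_TRUNC('quarter', CURRENT_DATE)                       AS window_end
)
SELECT
    o.customer_id,
    DATE_TRUNC('quarter', o.order_date) AS quarter_start,  -- first day of the order's quarter
    COUNT(o.order_id)                   AS num_orders,
    SUM(o.amount)                       AS total_spend
FROM Orders o
CROSS JOIN bounds b
WHERE o.order_date >= b.window_start
  AND o.order_date <  b.window_end      -- excludes the current, unfinished quarter
GROUP BY o.customer_id, DATE_TRUNC('quarter', o.order_date);
```

Grouping by the truncated date keeps, for example, Q4 of one year distinct from Q1 of the next, so a later step can require exactly two quarterly rows per customer, each with num_orders >= 5.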
• Q.1012
Find the customer who made the highest number of unique purchases from the same
product category within the last year.
Learnings:
• Time Period Filtering:
• You will learn how to filter data for a specific time range using date functions (e.g.,
CURDATE() - INTERVAL 1 YEAR in MySQL and CURRENT_DATE - INTERVAL '1 year' in PostgreSQL).
• Distinct Count:
• You will need to count the distinct products purchased by each customer from the same
category. This will require understanding how to use COUNT(DISTINCT column_name).
• Joins and Grouping:
• You'll gain experience using INNER JOINs to combine customer, orders, and product
tables.
• Using GROUP BY in SQL to group data by both customer and product category.
• Subqueries:
• To narrow down the most frequent purchaser, you may need to use a subquery to filter the
top customer per category.
Solutions
MySQL Solution
-- Columns are qualified with table aliases: customer_id and product_id exist in
-- more than one of the joined tables and would otherwise be ambiguous.
SELECT c.customer_id, c.customer_name, COUNT(DISTINCT o.product_id) AS unique_purchases
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN products p ON o.product_id = p.product_id
WHERE o.order_date >= CURDATE() - INTERVAL 1 YEAR
GROUP BY c.customer_id, c.customer_name, p.category_id
ORDER BY unique_purchases DESC
LIMIT 1;
Explanation:
• The query joins the customers, orders, and products tables to get all necessary
information.
• Filters data for the last year using WHERE order_date >= CURDATE() - INTERVAL 1
YEAR.
• Groups the data by customer_id and category_id to calculate distinct purchases per
customer in the same category.
• Orders the result by the count of distinct products (COUNT(DISTINCT product_id)) and
limits the result to the top 1 customer.
PostgreSQL Solution
-- Columns are qualified with table aliases to avoid ambiguous references;
-- customer_name is added to GROUP BY so the query does not depend on a declared primary key.
SELECT c.customer_id, c.customer_name, COUNT(DISTINCT o.product_id) AS unique_purchases
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN products p ON o.product_id = p.product_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '1 YEAR'
GROUP BY c.customer_id, c.customer_name, p.category_id
ORDER BY unique_purchases DESC
LIMIT 1;
Explanation:
• In PostgreSQL, CURRENT_DATE - INTERVAL '1 YEAR' is used to filter the data from the
last year.
• The rest of the query structure is identical to MySQL.
Key Takeaways:
• Handling Time Intervals:
• You will understand how to filter data based on dynamic time intervals using INTERVAL
in both MySQL and PostgreSQL.
• Counting Distinct Values:
• The query demonstrates how to count distinct values in a specific field (i.e., distinct
products purchased by a customer).
• Joins and Aggregation:
• A deep dive into using multiple joins and grouping data to calculate aggregated metrics.
• Subquery for Top Results:
• The query can be modified to use subqueries if necessary to extract the "top" results,
teaching the importance of subqueries in data ranking.
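The "top customer per category" variant mentioned above can be sketched with a CTE and RANK(). This is an illustrative rewrite, not the book's solution; it reuses the table and column names from the solutions above, and the names per_category, ranked, and category_rank are made up for the example.

```sql
-- Sketch (PostgreSQL syntax; on MySQL 8+ replace the date expression with
-- CURDATE() - INTERVAL 1 YEAR).
WITH per_category AS (
    SELECT
        c.customer_id,
        c.customer_name,
        p.category_id,
        COUNT(DISTINCT o.product_id) AS unique_purchases
    FROM customers c
    JOIN orders o   ON o.customer_id = c.customer_id
    JOIN products p ON p.product_id = o.product_id
    WHERE o.order_date >= CURRENT_DATE - INTERVAL '1 year'
    GROUP BY c.customer_id, c.customer_name, p.category_id
),
ranked AS (
    SELECT
        pc.*,
        RANK() OVER (PARTITION BY pc.category_id
                     ORDER BY pc.unique_purchases DESC) AS category_rank
    FROM per_category pc
)
SELECT customer_id, customer_name, category_id, unique_purchases
FROM ranked
WHERE category_rank = 1;   -- ties share rank 1, so every tied customer is returned
```

Unlike LIMIT 1, this returns one leader per category and does not silently drop customers who tie for first place.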
• Q.1013
• Q.1014
Find the top 3 industries with the highest total earnings for employees who are under 30
years old. Include the industry name, total earnings, and the average age of employees
in that industry.
• Aggregate Functions: You will be required to use aggregate functions such as SUM() and
AVG() to calculate total earnings and the average age.
• Ranking: The result should be ordered by total earnings to identify the top 3 industries.
The dataset represents employee information and their respective earnings in different
industries. By applying proper filtering and aggregation, you'll find which industries are the
highest earners for employees below 30 years old.
Learnings:
• Filtering Data by Age:
• You will learn how to filter employees based on their age, using comparison operators
like < to get employees under a certain age.
• Aggregate Functions:
• The query will help you practice using SUM() to calculate total earnings and AVG() to get
the average age of employees within each industry.
• GROUP BY and Sorting:
• You will be grouping data by industry and sorting by the total earnings to rank the
industries.
• Limiting Results:
• You will use LIMIT to restrict the output to the top 3 industries by total earnings.
Solutions
MySQL Solution
SELECT
i.industry_name,
SUM(e.earnings) AS total_earnings,
AVG(e.age) AS average_age
FROM
employees e
JOIN
industries i ON e.industry_id = i.industry_id
WHERE
e.age < 30
GROUP BY
i.industry_name
ORDER BY
total_earnings DESC
LIMIT 3;
Explanation:
• Filtering by Age: The WHERE e.age < 30 condition ensures that only employees under
the age of 30 are considered.
• Joining Tables: We use JOIN to combine the employees table and the industries table
on industry_id.
• Aggregation: SUM(e.earnings) computes the total earnings for each industry, while
AVG(e.age) calculates the average age. Because the WHERE filter runs before aggregation,
both figures cover only the under-30 employees in that industry.
• Grouping and Sorting: The GROUP BY i.industry_name groups the results by industry,
and the results are ordered by total_earnings DESC to get the industries with the highest
earnings.
• Limiting Results: LIMIT 3 restricts the result to the top 3 industries.
PostgreSQL Solution
SELECT
i.industry_name,
SUM(e.earnings) AS total_earnings,
AVG(e.age) AS average_age
FROM
employees e
JOIN
industries i ON e.industry_id = i.industry_id
WHERE
e.age < 30
GROUP BY
i.industry_name
ORDER BY
total_earnings DESC
LIMIT 3;
Explanation:
The query for PostgreSQL is the same as for MySQL. The logic and syntax are identical. The
key operations are filtering by age, joining the tables, grouping by industry, and ordering the
results to get the top 3 industries with the highest earnings.
Key Takeaways:
• Handling Age Filtering:
• Learn how to filter employees based on age using comparison operators like < 30 to get
those under 30 years old.
• Aggregation:
• Practice using aggregate functions like SUM() and AVG() to calculate total earnings and
average age, respectively, across grouped data.
• Joins and Grouping:
• Understand how to join tables based on a common column (industry_id) and how to
group data using GROUP BY to get aggregated results per group (industry in this case).
• Ranking with Limits:
• Learn how to limit the number of rows returned using LIMIT and sort the result based
on aggregated values to get the top results (top 3 industries).
• Real-Life Data Use:
• This type of query could be used in real-world scenarios such as identifying top-earning
industries for young professionals, salary analysis, or workforce distribution within a specific
age group.
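The question can also be read as asking for the average age of all employees in each industry, with only the earnings restricted to under-30 staff. Under that reading (an assumption, not something the text confirms), conditional aggregation keeps both figures in one pass. The column aliases below are illustrative.

```sql
-- Sketch: total earnings restricted to under-30 employees, average age taken
-- over ALL employees in the industry. Runs on both MySQL and PostgreSQL.
SELECT
    i.industry_name,
    SUM(CASE WHEN e.age < 30 THEN e.earnings ELSE 0 END) AS total_earnings_under_30,
    AVG(e.age)                                           AS average_age_all_employees
FROM employees e
JOIN industries i ON e.industry_id = i.industry_id
GROUP BY i.industry_name
HAVING SUM(CASE WHEN e.age < 30 THEN 1 ELSE 0 END) > 0   -- skip industries with no under-30 staff
ORDER BY total_earnings_under_30 DESC
LIMIT 3;
```

Moving the age condition from WHERE into the CASE expression is what lets the two aggregates cover different sets of rows.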
• Q.1015
Find the top 3 Indian states with the highest population density in 2023. Output the
state name, population in 2023, area of the state, and population density (calculated as
population/area).
Learnings:
• Calculated Columns:
• You will learn how to calculate derived metrics such as population density using
arithmetic operations in SQL.
• Sorting and Ranking:
• By using ORDER BY, you'll learn to sort results in descending order to get the top states
based on a metric.
• Data Aggregation and Filtering:
• You'll practice aggregating data by computing population density at the state level, and
filtering results to retrieve the top 3 states.
• Handling Large Datasets:
• In real-world applications, this type of query can help in processing large datasets and
calculating statistics like population density for data analysis, policy-making, or urban
planning.
Solutions
MySQL Solution
SELECT
state_name,
population,
area,
(population / area) AS population_density
FROM
states
ORDER BY
population_density DESC
LIMIT 3;
Explanation:
• Derived Field (Population Density): We calculate population density by dividing the
population by the area (population / area).
• Sorting by Population Density: The ORDER BY population_density DESC sorts the
states based on the calculated population density in descending order.
• Limiting Results: The LIMIT 3 ensures that only the top 3 states with the highest
population density are returned.
PostgreSQL Solution
SELECT
state_name,
population,
area,
(population / area) AS population_density
FROM
states
ORDER BY
population_density DESC
LIMIT 3;
Explanation:
• The query for PostgreSQL is identical to MySQL in terms of logic. We calculate the
population density and sort by it in descending order to get the top 3 states.
Key Takeaways:
• Calculated Metrics:
• Learn how to calculate population density by dividing one column by another
(population / area).
• Sorting and Ranking:
• Understand how to use ORDER BY to rank data based on calculated metrics and apply
LIMIT to retrieve the top 3 results.
• Handling Numerical Data:
• Practice working with large numbers (population and area) and keep your calculations
accurate by choosing appropriate data types: if both columns are integers, PostgreSQL
performs integer division, so store or cast at least one operand as DECIMAL.
• Real-World Use Cases:
• This query can be used in real-world applications for urban planning, resource
allocation, and government policies, where calculating population density is crucial for
determining state priorities.
• Data Quality Considerations:
• The question demonstrates the importance of working with accurate and consistent
datasets to ensure that metrics like population density are meaningful.
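The data-type point above can be made concrete. The sketch below assumes the states table used in the solutions; the column types are not shown in the text, so the query is written to behave the same whether population and area are stored as integers or decimals.

```sql
-- Sketch: force decimal division and keep zero or missing areas out of the result.
SELECT
    state_name,
    population,
    area,
    population * 1.0 / area AS population_density   -- * 1.0 avoids integer division in PostgreSQL
FROM states
WHERE area > 0                                      -- also excludes NULL areas
ORDER BY population_density DESC
LIMIT 3;
```

In MySQL the / operator already returns a decimal result, so the multiplication by 1.0 is harmless there and only matters for PostgreSQL.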
• Sorting by total revenue: You are asked to identify the top 3 companies based on their
revenue in 2023.
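• Percentage calculation: The percentage change is calculated as:
$$\text{Percentage Change} = \frac{\text{Revenue}_{2023} - \text{Revenue}_{2022}}{\text{Revenue}_{2022}} \times 100$$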
The question evaluates your ability to handle multi-year data, perform basic arithmetic
operations, and filter out the top companies based on revenue.
Learnings:
• Percentage Calculations:
• You'll learn how to calculate the percentage change in data over multiple years, which is
common in financial and business data analysis.
• Sorting and Ranking:
• The use of ORDER BY allows you to rank companies based on their 2023 revenue. With
LIMIT 3, you can efficiently get the top companies.
• Handling Multiple Years of Data:
• Working with multiple years of data will help you practice combining and comparing
datasets from different time periods, which is useful for financial forecasting or historical
trend analysis.
• Data Aggregation:
• The query will involve GROUP BY to aggregate data by company and compute totals,
which is a fundamental concept in SQL.
Create and Insert Statements (based on Top USA Companies and Revenues)
-- Create the companies table
CREATE TABLE companies (
company_id INT PRIMARY KEY,
company_name VARCHAR(255)
);
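-- Insert the companies referenced by the revenue rows below
-- (reconstructed from the row comments; the original statement is missing here)
INSERT INTO companies (company_id, company_name) VALUES
(1, 'Apple'),
(2, 'Microsoft'),
(3, 'Amazon'),
(4, 'Google'),
(5, 'Tesla'),
(6, 'Meta'),
(7, 'Nvidia'),
(8, 'Intel');
-- Create the company_revenue table
-- (column names taken from the INSERT below; column types are assumed)
CREATE TABLE company_revenue (
company_id INT,
year INT,
revenue DECIMAL(15, 2),
PRIMARY KEY (company_id, year)
);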
-- Insert sample revenue data for two years (2022 and 2023)
INSERT INTO company_revenue (company_id, year, revenue) VALUES
(1, 2022, 365817.00), -- Apple 2022 revenue in millions
(1, 2023, 389000.00), -- Apple 2023 revenue in millions
(2, 2022, 168000.00), -- Microsoft 2022 revenue in millions
(2, 2023, 183000.00), -- Microsoft 2023 revenue in millions
(3, 2022, 469800.00), -- Amazon 2022 revenue in millions
(3, 2023, 510000.00), -- Amazon 2023 revenue in millions
(4, 2022, 257600.00), -- Google 2022 revenue in millions
(4, 2023, 274000.00), -- Google 2023 revenue in millions
(5, 2022, 53500.00), -- Tesla 2022 revenue in millions
(5, 2023, 64000.00), -- Tesla 2023 revenue in millions
(6, 2022, 117930.00), -- Meta 2022 revenue in millions
(6, 2023, 119500.00), -- Meta 2023 revenue in millions
(7, 2022, 26100.00), -- Nvidia 2022 revenue in millions
(7, 2023, 38000.00), -- Nvidia 2023 revenue in millions
(8, 2022, 73000.00), -- Intel 2022 revenue in millions
(8, 2023, 75000.00); -- Intel 2023 revenue in millions
Solutions
MySQL Solution
SELECT
c.company_name,
r2023.revenue AS revenue_2023,
((r2023.revenue - r2022.revenue) / r2022.revenue) * 100 AS percentage_change
FROM
companies c
JOIN
company_revenue r2023 ON c.company_id = r2023.company_id AND r2023.year = 2023
JOIN
company_revenue r2022 ON c.company_id = r2022.company_id AND r2022.year = 2022
ORDER BY
r2023.revenue DESC
LIMIT 3;
Explanation:
• Join on Multiple Tables: The query performs a self-join on the company_revenue table to
get both 2022 and 2023 revenue for the same company.
• Calculate Percentage Change: The percentage change is calculated using the formula
((2023 revenue - 2022 revenue) / 2022 revenue) * 100.
• Sorting and Ranking: The companies are ordered by 2023 revenue in descending order,
and the LIMIT 3 clause ensures only the top 3 companies are returned.
PostgreSQL Solution
SELECT
c.company_name,
r2023.revenue AS revenue_2023,
((r2023.revenue - r2022.revenue) / r2022.revenue) * 100 AS percentage_change
FROM
companies c
JOIN
company_revenue r2023 ON c.company_id = r2023.company_id AND r2023.year = 2023
JOIN
company_revenue r2022 ON c.company_id = r2022.company_id AND r2022.year = 2022
ORDER BY
r2023.revenue DESC
LIMIT 3;
Explanation:
• The PostgreSQL query is identical to the MySQL query, as the syntax for JOINs and
calculations is very similar between MySQL and PostgreSQL.
Key Takeaways:
• Percentage Change Calculation:
• Learn how to calculate the percentage change between two years using basic arithmetic
operations in SQL.
• Multi-Year Data Aggregation:
• This query requires you to work with multiple years of data and JOIN the same table on
itself to compare values for two different years.
• Sorting and Ranking:
• You'll learn how to sort data by a chosen column, in this case revenue in 2023,
and limit the output to the top 3 companies.
• Real-World Business Application:
• This query simulates a real-world use case for financial analysis, where businesses need to
assess their growth or decline by comparing revenues year-over-year.
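The same year-over-year comparison can be written with the LAG() window function instead of a self-join. This is an alternative sketch, not the book's solution; it assumes the companies and company_revenue tables from this question, and the names yearly and prev_revenue are illustrative.

```sql
-- Sketch using LAG() (PostgreSQL, MySQL 8+).
WITH yearly AS (
    SELECT
        c.company_name,
        r.year,
        r.revenue,
        LAG(r.revenue) OVER (PARTITION BY r.company_id ORDER BY r.year) AS prev_revenue
    FROM company_revenue r
    JOIN companies c ON c.company_id = r.company_id
    WHERE r.year IN (2022, 2023)
)
SELECT
    company_name,
    revenue AS revenue_2023,
    (revenue - prev_revenue) / NULLIF(prev_revenue, 0) * 100 AS percentage_change
FROM yearly
WHERE year = 2023
  AND prev_revenue IS NOT NULL     -- companies without a 2022 row are skipped
ORDER BY revenue DESC
LIMIT 3;
```

LAG() scales more naturally when more years are added, because each row picks up its predecessor without another join per year.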
• Sorting States by Growth Rate: You need to find the states with the highest growth in
GDP from 2022 to 2023 and output the top 5 states.
• Ranking: You will use SQL's ORDER BY and LIMIT clauses to get the top 5 states
based on the highest growth rates.
This question will test your ability to work with time-series data (year-over-year
comparison), calculate growth rates, and filter results.
Learnings:
• Percentage Change Calculations:
• You will learn how to compute the percentage change between two years (or two data
points) in SQL, a common requirement in economic and financial analysis.
• SQL Sorting and Ranking:
• This question requires you to sort the results based on calculated values (GDP growth)
and limit the results to top values, an essential skill in data analysis and reporting.
• Data Comparison Across Time:
• Handling time-series data (2022 vs. 2023) to compare GDP will improve your ability to
work with historical data and derive insights over time.
• Advanced SQL Joins:
• You will practice joining two years of data from the same table and performing operations
on those values, a critical concept for business and financial analysis.
Create and Insert Statements (based on Indian Economy and GDP Growth)
-- Create the states table
CREATE TABLE states (
state_id INT PRIMARY KEY,
state_name VARCHAR(255)
);
Solutions
MySQL Solution
SELECT
s.state_name,
g2023.gdp AS gdp_2023,
g2022.gdp AS gdp_2022,
((g2023.gdp - g2022.gdp) / g2022.gdp) * 100 AS gdp_growth_rate
FROM
states s
JOIN
state_gdp g2023 ON s.state_id = g2023.state_id AND g2023.year = 2023
JOIN
state_gdp g2022 ON s.state_id = g2022.state_id AND g2022.year = 2022
ORDER BY
gdp_growth_rate DESC
LIMIT 5;
Explanation:
• Join: The state_gdp table is joined twice on itself: once for 2023 and once for 2022 data.
• GDP Growth Rate Calculation: The GDP growth rate is calculated using the formula:
$$\text{GDP Growth Rate} = \frac{GDP_{2023} - GDP_{2022}}{GDP_{2022}} \times 100$$
• Sorting: The results are sorted in descending order based on the GDP growth rate.
• Top 5 States: The LIMIT 5 clause returns the top 5 states with the highest growth rates.
PostgreSQL Solution
SELECT
s.state_name,
g2023.gdp AS gdp_2023,
g2022.gdp AS gdp_2022,
((g2023.gdp - g2022.gdp) / g2022.gdp) * 100 AS gdp_growth_rate
FROM
states s
JOIN
state_gdp g2023 ON s.state_id = g2023.state_id AND g2023.year = 2023
JOIN
state_gdp g2022 ON s.state_id = g2022.state_id AND g2022.year = 2022
ORDER BY
gdp_growth_rate DESC
LIMIT 5;
Explanation:
• The PostgreSQL solution is the same as the MySQL one. Both SQL engines support
similar syntax for joins, sorting, and calculations.
Key Takeaways:
• GDP Growth Rate Calculation:
• You'll learn how to calculate growth rates based on previous and current values (2022 and
2023 GDP), a common economic analysis task.
• Joins Across Different Time Periods:
• The ability to join data from the same table for two different years (2022 and 2023) is
crucial for time-series data analysis.
• SQL Ranking and Sorting:
• Sorting the data by growth rate and filtering the top 5 states shows the power of using
ORDER BY and LIMIT together to rank results.
• Economic Analysis Using SQL:
• This exercise is similar to real-world economic analysis where you need to evaluate the
performance of different states (or regions) over time.
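An inner self-join silently drops any state that lacks a 2022 row, and a zero 2022 GDP would raise a division error. The sketch below shows one hedged way to handle both cases. It assumes states(state_id, state_name) and state_gdp(state_id, year, gdp), as implied by the solutions above.

```sql
-- Sketch: keep every state with 2023 data and guard the division.
SELECT
    s.state_name,
    g2023.gdp AS gdp_2023,
    g2022.gdp AS gdp_2022,
    (g2023.gdp - g2022.gdp) * 100.0 / NULLIF(g2022.gdp, 0) AS gdp_growth_rate
FROM states s
JOIN state_gdp g2023
    ON s.state_id = g2023.state_id AND g2023.year = 2023
LEFT JOIN state_gdp g2022
    ON s.state_id = g2022.state_id AND g2022.year = 2022
ORDER BY
    CASE WHEN g2022.gdp IS NULL OR g2022.gdp = 0 THEN 1 ELSE 0 END,  -- undefined growth sorts last
    gdp_growth_rate DESC
LIMIT 5;
```

The CASE expression in ORDER BY is used because NULL ordering differs between MySQL and PostgreSQL; sorting on an explicit flag behaves the same in both.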
• Sorting by Growth Rate: After calculating the growth rate, you need to sort the results by
the growth rate in descending order, to identify the top-performing companies.
• Limit the Results: The question asks for the top 5 companies with the highest growth
rate. This is achieved using LIMIT 5 in SQL.
Learnings:
• Understanding Financial Data:
• This question is great for learning how to handle financial data, particularly how to analyze
market capitalization growth—a common task in stock market analysis.
Create and Insert Statements (based on Nifty 50 Companies and Market Cap
Growth)
-- Create the companies table
CREATE TABLE companies (
company_id INT PRIMARY KEY,
company_name VARCHAR(255)
);
Solutions
MySQL Solution
SELECT
c.company_name,
m2023.market_cap AS market_cap_2023,
m2022.market_cap AS market_cap_2022,
((m2023.market_cap - m2022.market_cap) / m2022.market_cap) * 100 AS market_cap_growth
FROM
companies c
JOIN
market_cap m2023 ON c.company_id = m2023.company_id AND m2023.year = 2023
JOIN
market_cap m2022 ON c.company_id = m2022.company_id AND m2022.year = 2022
ORDER BY
market_cap_growth DESC
LIMIT 5;
Explanation:
• Joins: The query uses two joins on the market_cap table to retrieve both 2023 and 2022
data for the companies.
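• Market Cap Growth Calculation: The percentage growth of market capitalization is
calculated using the formula:
$$\text{Market Cap Growth} = \frac{\text{Market Cap}_{2023} - \text{Market Cap}_{2022}}{\text{Market Cap}_{2022}} \times 100$$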
• Sorting: The results are sorted in descending order by the market cap growth.
• Top 5 Companies: The LIMIT 5 clause ensures only the top 5 companies with the highest
growth are returned.
PostgreSQL Solution
SELECT
c.company_name,
m2023.market_cap AS market_cap_2023,
m2022.market_cap AS market_cap_2022,
((m2023.market_cap - m2022.market_cap) / m2022.market_cap) * 100 AS market_cap_growth
FROM
companies c
JOIN
market_cap m2023 ON c.company_id = m2023.company_id AND m2023.year = 2023
JOIN
market_cap m2022 ON c.company_id = m2022.company_id AND m2022.year = 2022
ORDER BY
market_cap_growth DESC
LIMIT 5;
Explanation:
• The PostgreSQL query is the same as the MySQL one. The SQL syntax for JOIN, ORDER
BY, and LIMIT works the same in both databases.
Key Takeaways:
• Financial Analysis Using SQL:
• This problem teaches how to calculate the year-on-year growth in market capitalization—a
key financial metric for evaluating company performance.
• Using SQL Joins for Time Series Data:
• The question involves joining data from the same table for two different years (2022 and
2023), which is a common task in financial and business analysis.
• Sorting and Ranking Companies:
• You'll learn how to sort and filter the top companies based on specific criteria like market
cap growth using ORDER BY and LIMIT.
• Practical Financial SQL Skills:
• This exercise simulates real-world scenarios, where investors, analysts, and financial
institutions need to evaluate and rank companies based on financial performance over time.
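As an alternative to joining market_cap twice, both years can be pivoted into columns with conditional aggregation. This is a sketch under the same assumed schema as the solutions above, companies(company_id, company_name) and market_cap(company_id, year, market_cap); the CTE name pivoted is illustrative.

```sql
-- Sketch: pivot 2022 and 2023 into one row per company, then compute growth.
WITH pivoted AS (
    SELECT
        m.company_id,
        MAX(CASE WHEN m.year = 2023 THEN m.market_cap END) AS market_cap_2023,
        MAX(CASE WHEN m.year = 2022 THEN m.market_cap END) AS market_cap_2022
    FROM market_cap m
    WHERE m.year IN (2022, 2023)
    GROUP BY m.company_id
)
SELECT
    c.company_name,
    p.market_cap_2023,
    p.market_cap_2022,
    (p.market_cap_2023 - p.market_cap_2022) * 100.0 / p.market_cap_2022 AS market_cap_growth
FROM pivoted p
JOIN companies c ON c.company_id = p.company_id
WHERE p.market_cap_2023 IS NOT NULL
  AND p.market_cap_2022 > 0          -- growth is undefined without a positive base year
ORDER BY market_cap_growth DESC
LIMIT 5;
```

The table is scanned once, which can be preferable when the yearly table is large, though the self-join is often easier to read for a two-year comparison.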
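• Percentage Increase: After calculating the increase, you also need to calculate the
percentage increase in population. This can be calculated as:
$$\text{Percentage Increase} = \frac{\text{Population}_{2023} - \text{Population}_{2022}}{\text{Population}_{2022}} \times 100$$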
• Sorting by Population Increase: You need to sort the countries by their population
increase in descending order to get the top 5.
• Limiting the Results: The question asks for the top 5 countries with the highest
population increase, which requires you to use LIMIT 5.
Learnings:
• Understanding Population Growth:
• This question teaches how to analyze population growth, a key demographic metric used
by governments, economists, and policy makers for planning resources, infrastructure, and
governance.
• Performing Calculations in SQL:
• You'll learn how to calculate the difference between two years and the percentage change
within SQL—common operations in many fields including economics and business analysis.
• Ranking and Sorting Data:
• This exercise helps you understand how to rank data based on calculated fields (such as
population increase), and how to filter out the top results using LIMIT.
• Practical Use of Joins:
• Working with data from two different years and calculating the differences is a common
scenario in SQL queries, especially when analyzing trends or financial growth.
Solutions
MySQL Solution
SELECT
c.country_name,
p2023.population AS population_2023,
p2022.population AS population_2022,
(p2023.population - p2022.population) AS population_increase,
((p2023.population - p2022.population) / p2022.population) * 100 AS percentage_increase
FROM
countries c
JOIN
population p2023 ON c.country_id = p2023.country_id AND p2023.year = 2023
JOIN
population p2022 ON c.country_id = p2022.country_id AND p2022.year = 2022
ORDER BY
population_increase DESC
LIMIT 5;
Explanation:
• Joins: The query uses two joins on the population table to retrieve both the 2022 and
2023 population data for each country.
• Population Increase Calculation: The increase in population is calculated by subtracting
the population in 2022 from the population in 2023.
• Percentage Increase: The percentage increase is then calculated by dividing the
population increase by the 2022 population, and multiplying by 100.
• Sorting: The query sorts the countries by the population increase in descending order,
ensuring that the top countries with the highest growth appear first.
• Limit: The LIMIT 5 ensures that only the top 5 countries with the highest population
increase are returned.
PostgreSQL Solution
SELECT
c.country_name,
p2023.population AS population_2023,
p2022.population AS population_2022,
(p2023.population - p2022.population) AS population_increase,
-- Multiply by 100.0 first so integer population columns do not truncate to 0
(p2023.population - p2022.population) * 100.0 / p2022.population AS percentage_increase
FROM
countries c
JOIN
population p2023 ON c.country_id = p2023.country_id AND p2023.year = 2023
JOIN
population p2022 ON c.country_id = p2022.country_id AND p2022.year = 2022
ORDER BY
population_increase DESC
LIMIT 5;
Explanation:
• The PostgreSQL solution mirrors the MySQL solution; joins, ordering, and limiting work
the same way in both databases. The one difference is that the percentage multiplies by
100.0 before dividing, because PostgreSQL truncates the result when both operands of /
are integers.
Key Takeaways:
• Understanding Population Growth:
• The question teaches how to measure the increase in population between two years and
how to calculate and interpret percentage growth, which is a key skill in demographics,
economics, and social sciences.
• Handling Time-Series Data:
• You'll learn how to work with time-series data, specifically comparing population data
across two different years using SQL joins.
• SQL Calculations:
• This problem demonstrates how to perform arithmetic operations directly in SQL queries,
which is a common task in data analysis.
• Practical Use Cases:
• This question is applicable in various real-world situations, including governmental policy
planning, economic forecasting, and international comparisons on demographic growth.
This problem is designed to test a mix of arithmetic skills, data transformation, and
proficiency in SQL joins and sorting, providing you with a well-rounded challenge in data
analysis.
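The two-year comparison generalizes by joining each year to the one before it instead of naming both years in the join conditions. This is a sketch assuming countries(country_id, country_name) and population(country_id, year, population) as in the solutions above; the aliases cur, prev, and comparison_year are illustrative.

```sql
-- Sketch: compare any year with its predecessor.
SELECT
    c.country_name,
    cur.year                         AS comparison_year,
    cur.population - prev.population AS population_increase,
    (cur.population - prev.population) * 100.0
        / NULLIF(prev.population, 0) AS percentage_increase
FROM countries c
JOIN population cur  ON cur.country_id = c.country_id
JOIN population prev ON prev.country_id = cur.country_id
                    AND prev.year = cur.year - 1
WHERE cur.year = 2023        -- change or remove this filter to analyse other years
ORDER BY population_increase DESC
LIMIT 5;
```

Only the WHERE clause mentions a specific year, so the query can be reused for other periods by changing a single literal.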
• Q.1020
Identify the top 5 airlines with the highest number of accidents over the past 10 years.
Output the airline name, number of accidents, and the percentage of total accidents.
• Calculate Percentage of Total Accidents: Calculate each airline's share of the total
accidents as a percentage.
• Sorting: Rank the airlines by the number of accidents, and return only the top 5.
Learnings:
• SQL Grouping and Aggregation:
• You'll learn how to group data using GROUP BY and use aggregation functions like COUNT()
to calculate the number of accidents per airline.
• Percentage Calculations in SQL:
• This problem helps you practice how to calculate percentages in SQL by dividing the
airline's accident count by the total number of accidents.
• Filtering and Ranking:
• This question tests your ability to filter and rank data, specifically sorting by the accident
count and limiting the results to the top 5 using LIMIT.
• Real-World Use Case:
• This question mimics a real-world scenario that could be useful for aviation safety
analysts, government agencies, or insurance companies that deal with aviation safety and
accident analysis.
Solutions
MySQL Solution
SELECT
a.airline_name,
COUNT(f.accident_id) AS number_of_accidents,
ROUND((COUNT(f.accident_id) / total.total_accidents) * 100, 2) AS percentage_of_total_accidents
FROM
airlines a
JOIN
flight_accidents f ON a.airline_id = f.airline_id
CROSS JOIN
(SELECT COUNT(accident_id) AS total_accidents FROM flight_accidents WHERE accident_date BETWEEN '2013-01-01' AND '2023-01-01') AS total
WHERE
f.accident_date BETWEEN '2013-01-01' AND '2023-01-01'
GROUP BY
a.airline_name, total.total_accidents -- selected outside an aggregate, so it must be grouped
ORDER BY
number_of_accidents DESC
LIMIT 5;
Explanation:
• Joins: The query joins the airlines and flight_accidents tables using the
airline_id.
• Date Filter: The query filters accidents that occurred between 2013-01-01 and 2023-01-01,
a hard-coded 10-year window. Replace the literals with a date expression if the window
should follow the current date.
• Subquery: The subquery calculates the total number of accidents in the last 10 years. This
total is then used in the percentage calculation.
• Percentage Calculation: The percentage of total accidents for each airline is calculated by
dividing the number of accidents by the total accidents, and then multiplying by 100.
• Sorting and Limiting: The query orders the results by the number of accidents in
descending order, and limits the output to the top 5 airlines.
PostgreSQL Solution
SELECT
a.airline_name,
COUNT(f.accident_id) AS number_of_accidents,
ROUND((COUNT(f.accident_id) * 100.0 / total.total_accidents), 2) AS percentage_of_total_accidents
FROM
airlines a
JOIN
flight_accidents f ON a.airline_id = f.airline_id
CROSS JOIN
(SELECT COUNT(accident_id) AS total_accidents FROM flight_accidents WHERE accident_date BETWEEN '2013-01-01' AND '2023-01-01') AS total
WHERE
f.accident_date BETWEEN '2013-01-01' AND '2023-01-01'
GROUP BY
a.airline_name, total.total_accidents -- PostgreSQL rejects ungrouped, non-aggregated columns
ORDER BY
number_of_accidents DESC
LIMIT 5;
Explanation:
• The PostgreSQL solution is almost identical to the MySQL solution, with a small
difference in how division is handled for the percentage calculation. Dividing one integer
count by another truncates in PostgreSQL, so the count is multiplied by 100.0 first to
make the division numeric.
Key Takeaways:
• Using Joins Across Multiple Tables:
• The problem helps you practice joining multiple tables (airlines and flight accidents) and
using aggregations (like COUNT()) to summarize the data.
• Percentage Calculations:
• The task of calculating percentages based on grouped data is very common in data analysis
and business intelligence.
• Data Filtering by Date:
• You'll learn how to filter data by a time range, which is especially useful in time-series
analysis and reporting.
• Handling Large Datasets:
• Flight accident data is usually quite large, so this exercise helps you understand how to
handle large datasets by aggregating and limiting results efficiently.
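The share-of-total calculation can also be written with a window function over the grouped rows, which removes the separate subquery. This is an alternative sketch, not the book's solution, and it assumes the airlines and flight_accidents tables used above. Note one difference in meaning: the total here counts only accidents that match an airline row, whereas the subquery version counts every row in flight_accidents within the date range.

```sql
-- Sketch (PostgreSQL syntax; on MySQL 8+ use CURDATE() - INTERVAL 10 YEAR).
-- The 10-year window is derived from the current date rather than hard-coded.
SELECT
    a.airline_name,
    COUNT(f.accident_id) AS number_of_accidents,
    ROUND(COUNT(f.accident_id) * 100.0 / SUM(COUNT(f.accident_id)) OVER (), 2)
        AS percentage_of_total_accidents
FROM airlines a
JOIN flight_accidents f ON a.airline_id = f.airline_id
WHERE f.accident_date >= CURRENT_DATE - INTERVAL '10 years'
GROUP BY a.airline_name
ORDER BY number_of_accidents DESC
LIMIT 5;
```

Window functions are evaluated before LIMIT, so the denominator still covers all airlines even though only five rows are returned.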
Retrieve all employees with more than 10 absences in 2024 and department equal to
"Engineering" from the EmployeeAttendance table.
Explanation
You need to select all records where the Absences are greater than 10, the Year is 2024, and
the Department is 'Engineering'. Use the WHERE clause to filter based on these conditions.
Datasets and SQL Schemas
-- Table creation
CREATE TABLE EmployeeAttendance (
EmployeeID INT,
EmployeeName VARCHAR(50),
Department VARCHAR(50),
Absences INT,
Year INT
);
-- Datasets
INSERT INTO EmployeeAttendance (EmployeeID, EmployeeName, Department, Absences, Year)
VALUES
(1, 'John', 'Engineering', 12, 2024),
(2, 'Emma', 'Marketing', 8, 2024),
(3, 'Liam', 'Engineering', 15, 2024),
(4, 'Sophia', 'HR', 9, 2023),
(5, 'Noah', 'Engineering', 7, 2024);
Learnings
• Using the WHERE clause to filter based on multiple conditions (numeric values and text).
• Combining conditions with AND to retrieve data based on several criteria.
Solutions
• - PostgreSQL solution
SELECT EmployeeID, EmployeeName, Department, Absences, Year
FROM EmployeeAttendance
WHERE Absences > 10 AND Year = 2024 AND Department = 'Engineering';
• - MySQL solution
SELECT EmployeeID, EmployeeName, Department, Absences, Year
FROM EmployeeAttendance
WHERE Absences > 10 AND Year = 2024 AND Department = 'Engineering';
• Q.1026
• Q.1027
Question
Find the top 3 products with the highest cancellation rates for the month of January 2023.
Include only products that had at least 30 orders in that month, and where the cancellation
rate (number of cancellations / total orders) is greater than 20%.
Explanation
For each product, calculate the cancellation rate as the percentage of cancellations relative to
total orders. Keep only products with at least 30 orders in January 2023 and a cancellation
rate greater than 20%. Return the top 3 products with the highest cancellation rates.
Datasets and SQL Schemas
• - Table creation
CREATE TABLE ProductOrders (
OrderID INT,
ProductID INT,
ProductName VARCHAR(100),
Learnings
• Using COUNT() to count both orders and cancellations
• Calculating cancellation rates
• Filtering with HAVING based on conditions involving aggregated data
• Sorting results by aggregated values
Solutions
• - PostgreSQL solution
WITH CancellationRates AS (
    SELECT po.ProductID, po.ProductName,
           COUNT(co.CancellationID) AS TotalCancellations,
           COUNT(po.OrderID) AS TotalOrders,
           (COUNT(co.CancellationID) * 1.0 / COUNT(po.OrderID)) * 100 AS CancellationRate
    FROM ProductOrders po
    LEFT JOIN OrderCancellations co ON po.OrderID = co.OrderID
    WHERE po.OrderDate BETWEEN '2023-01-01' AND '2023-01-31'
    GROUP BY po.ProductID, po.ProductName
    HAVING COUNT(po.OrderID) >= 30
       AND (COUNT(co.CancellationID) * 1.0 / COUNT(po.OrderID)) * 100 > 20
)
SELECT ProductID, ProductName, CancellationRate
FROM CancellationRates
ORDER BY CancellationRate DESC
LIMIT 3;
MySQL Solution
WITH CancellationRates AS (
SELECT po.ProductID, po.ProductName,
COUNT(co.CancellationID) AS TotalCancellations,
COUNT(po.OrderID) AS TotalOrders,
(COUNT(co.CancellationID) * 1.0 / COUNT(po.OrderID)) * 100 AS CancellationRate
FROM ProductOrders po
LEFT JOIN OrderCancellations co ON po.OrderID = co.OrderID
WHERE po.OrderDate BETWEEN '2023-01-01' AND '2023-01-31'
GROUP BY po.ProductID, po.ProductName
Question
Write a SQL query to find all active customers who watched more than 10 episodes of a
show called "Stranger Things" in the last 30 days.
Explanation
The task is to identify active users who have watched more than 10 distinct episodes of the
show "Stranger Things" within the last 30 days. The query should:
• Join the users, viewing_history, and shows tables.
• Filter for active users (u.active = TRUE).
• Filter for the show "Stranger Things" (s.show_name = 'Stranger Things').
• Ensure the viewing history is from the last 30 days.
• Count the distinct episodes watched and only include users who watched more than 10
episodes.
-- Shows
INSERT INTO shows (show_id, show_name)
VALUES
(2001, 'Stranger Things'),
(2002, 'Money Heist');
-- Viewing History
Learnings
• Filtering data using date ranges (NOW() - INTERVAL '30 days').
• Using JOIN to combine data from multiple tables.
• Using COUNT(DISTINCT ...) to count unique episodes watched by each user.
• Filtering based on aggregated counts using HAVING.
• Sorting results based on user activity and ensuring distinct episodes are counted.
Solutions
• - PostgreSQL solution
SELECT v.user_id -- GROUP BY already returns one row per user, so DISTINCT is not needed
FROM users u
JOIN viewing_history v ON v.user_id = u.user_id
JOIN shows s ON s.show_id = v.show_id
WHERE u.active = TRUE
AND s.show_name = 'Stranger Things'
AND v.watch_date >= NOW() - INTERVAL '30 days'
GROUP BY v.user_id
HAVING COUNT(DISTINCT v.episode_id) > 10;
• - MySQL solution
SELECT v.user_id -- GROUP BY already returns one row per user, so DISTINCT is not needed
FROM users u
JOIN viewing_history v ON v.user_id = u.user_id
JOIN shows s ON s.show_id = v.show_id
WHERE u.active = TRUE
AND s.show_name = 'Stranger Things'
AND v.watch_date >= CURDATE() - INTERVAL 30 DAY
GROUP BY v.user_id
HAVING COUNT(DISTINCT v.episode_id) > 10;
• Q.1029
Question:
Uber wants to analyze driver performance by giving a special Diwali bonus!
Write an SQL query to find the top drivers based on the highest average rating in each city,
ensuring they have completed at least 5 rides in the last 3 months.
Ignore incomplete rides (where end_time is missing).
Return city_name, driver_name, total_completed_rides, and avg_rating.
Explanation:
• Join the Drivers and Rides tables based on driver_id.
• Filter rides that have completed (end_time IS NOT NULL) and are within the last 3
months (start_time >= CURRENT_DATE - INTERVAL '3 month').
• Group the results by city and driver_id to calculate the total completed rides and
average rating for each driver.
• Use HAVING to ensure each driver has completed at least 5 rides in the last 3 months.
• Use the RANK() window function to rank the drivers by their average rating in each city.
• Filter to return only the top-ranked driver (rank = 1) in each city.
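The steps above can be tried end to end with Python's built-in sqlite3 module. This is a minimal sketch rather than the book's solution: the three drivers, their rides, and the as_of reference date are made-up illustration data, SQLite's date() modifier stands in for the PostgreSQL/MySQL interval syntax, and the aggregation is split into a CTE so the ranking step reads separately. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

TOP_DRIVER_SQL = """
    WITH driver_stats AS (               -- one row per qualifying driver
        SELECT d.city, d.driver_name,
               COUNT(r.ride_id) AS total_completed_rides,
               AVG(r.rating)    AS avg_rating
        FROM rides r
        JOIN drivers d ON d.driver_id = r.driver_id
        WHERE r.end_time IS NOT NULL                      -- completed rides only
          AND r.start_time >= date(:as_of, '-3 months')   -- last 3 months
        GROUP BY d.city, d.driver_id, d.driver_name
        HAVING COUNT(r.ride_id) >= 5
    ),
    ranked AS (
        SELECT *, RANK() OVER (PARTITION BY city
                               ORDER BY avg_rating DESC) AS driver_rank
        FROM driver_stats
    )
    SELECT city, driver_name, total_completed_rides, avg_rating
    FROM ranked
    WHERE driver_rank = 1
    ORDER BY city
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE drivers (driver_id INT, driver_name TEXT, city TEXT);
    CREATE TABLE rides (ride_id INT, driver_id INT, start_time TEXT,
                        end_time TEXT, rating REAL);
""")
# Hypothetical sample data, not taken from the book.
conn.executemany("INSERT INTO drivers VALUES (?, ?, ?)",
                 [(1, "Asha", "Delhi"), (2, "Ravi", "Delhi"), (3, "Meera", "Mumbai")])

rides, ride_id = [], 0
# (driver_id, ratings of completed rides, number of incomplete rides)
for driver_id, ratings, incomplete in [(1, [4, 4, 4, 4, 4], 0),
                                       (2, [5, 5, 5, 5, 4], 0),
                                       (3, [5, 5, 5, 5], 1)]:
    for rating in ratings:
        ride_id += 1
        rides.append((ride_id, driver_id, "2024-09-01 10:00:00",
                      "2024-09-01 10:30:00", rating))
    for _ in range(incomplete):
        ride_id += 1
        rides.append((ride_id, driver_id, "2024-09-02 10:00:00", None, None))
conn.executemany("INSERT INTO rides VALUES (?, ?, ?, ?, ?)", rides)

result = conn.execute(TOP_DRIVER_SQL, {"as_of": "2024-10-31"}).fetchall()
print(result)
```

In this sample, Meera has only four completed rides, so the HAVING filter removes her before ranking and Mumbai returns no row at all. That is worth mentioning in an interview, because "top driver per city" quietly becomes "top eligible driver per city".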
INSERT INTO Rides (ride_id, driver_id, customer_id, start_time, end_time, distance, price, rating)
VALUES
(6, 104, 301, '2024-08-10 13:00:00', '2024-08-10 13:40:00', 8.2, 220.00, 4.5),
(7, 104, 302, '2024-09-12 14:20:00', '2024-09-12 14:55:00', 5.0, 150.25, 4.3),
(8, 104, 303, '2024-10-02 09:15:00', '2024-10-02 09:45:00', 6.5, 175.50, 4.8),
(9, 105, 304, '2024-08-15 16:30:00', '2024-08-15 17:00:00', 7.1, 187.50, 4.7),
(10, 105, 305, '2024-09-10 08:10:00', '2024-09-10 08:45:00', 9.2, 245.00, 4.6),
(11, 105, 306, '2024-10-20 19:05:00', '2024-10-20 19:35:00', 5.9, 160.00, 5.0),
(12, 106, 307, '2024-07-22 18:20:00', null, null, null, null),
(13, 106, 308, '2024-08-08 11:30:00', '2024-08-08 12:00:00', 3.6, 100.00, 4.4),
(14, 106, 309, '2024-09-15 09:00:00', '2024-09-15 09:35:00', 5.0, 132.50, 4.8),
(15, 107, 310, '2024-08-25 08:00:00', '2024-08-25 08:30:00', 6.2, 157.50, 4.2),
(16, 107, 311, '2024-09-22 13:20:00', '2024-09-22 13:50:00', 5.3, 140.00, 4.3),
(17, 107, 312, '2024-10-05 10:05:00', '2024-10-05 10:30:00', 4.8, 125.00, 4.5),
(18, 108, 313, '2024-08-02 15:30:00', '2024-08-02 16:00:00', 7.0, 190.00, 4.6),
(19, 108, 314, '2024-09-17 14:10:00', '2024-09-17 14:40:00', 8.2, 210.00, 4.7),
(20, 108, 315, '2024-10-12 17:30:00', '2024-10-12 17:55:00', 6.3, 165.00, 4.8),
(21, 109, 316, '2024-08-18 09:30:00', '2024-08-18 10:00:00', 6.0, 180.00, 4.2),
(22, 109, 317, '2024-09-20 11:45:00', '2024-09-20 12:15:00', 5.9, 175.00, 4.1),
(23, 109, 318, '2024-10-15 13:00:00', '2024-10-15 13:30:00', 4.7, 130.00, 4.5),
(27, 104, 322, '2024-10-15 11:10:00', '2024-10-15 11:40:00', 4.5, 120.00, 4.3),
(31, 105, 326, '2024-10-12 10:10:00', null, null, null, null),
(32, 105, 327, '2024-10-14 12:45:00', '2024-10-14 13:15:00', 5.8, 155.00, 4.5),
(35, 106, 330, '2024-10-11 08:30:00', '2024-10-11 09:00:00', 5.5, 140.00, 4.7),
(36, 106, 331, '2024-10-13 13:50:00', '2024-10-13 14:20:00', 6.4, 165.50, 4.5),
(39, 107, 334, '2024-10-14 09:00:00', '2024-10-14 09:30:00', 5.3, 150.00, 4.3),
(40, 107, 335, '2024-10-15 19:00:00', '2024-10-15 19:30:00', 6.1, 160.00, 4.4),
Learnings:
• Joins: Using JOIN to combine tables based on driver_id.
• Filtering: Using conditions like end_time IS NOT NULL and date filtering.
• Aggregation: Using COUNT() to count completed rides and AVG() to calculate average
ratings.
• Window Functions: Using RANK() to rank drivers within each city based on their average
rating.
• Grouping: Using GROUP BY to aggregate data at a city and driver level.
• Having Clause: Ensuring drivers have completed a minimum of 5 rides.
Solutions
• - PostgreSQL solution
SELECT city, driver_name, total_completed_rides, avg_rating
FROM
(SELECT
d.city,
d.driver_id,
d.driver_name,
COUNT(r.ride_id) as total_completed_rides,
AVG(r.rating) as avg_rating,
RANK() OVER(PARTITION BY d.city ORDER BY AVG(r.rating) DESC) as rank
FROM rides as r
JOIN drivers as d ON d.driver_id = r.driver_id
WHERE
r.end_time IS NOT NULL
AND r.start_time >= CURRENT_DATE - INTERVAL '3 month'
GROUP BY d.city, d.driver_id, d.driver_name
HAVING COUNT(r.ride_id) >= 5) as subquery
WHERE rank = 1;
• - MySQL solution
SELECT city, driver_name, total_completed_rides, avg_rating
FROM
(SELECT
d.city,
d.driver_id,
d.driver_name,
COUNT(r.ride_id) AS total_completed_rides,
AVG(r.rating) AS avg_rating,
-- RANK is a reserved word in MySQL 8.0, so the alias cannot be a bare "rank"
RANK() OVER(PARTITION BY d.city ORDER BY AVG(r.rating) DESC) AS driver_rank
FROM rides AS r
JOIN drivers AS d ON d.driver_id = r.driver_id
WHERE
r.end_time IS NOT NULL
AND r.start_time >= CURDATE() - INTERVAL 3 MONTH
GROUP BY d.city, d.driver_id, d.driver_name
HAVING COUNT(r.ride_id) >= 5) AS subquery
WHERE driver_rank = 1;
• Q.1030
Question
Tracking Refunds and Chargebacks
Given a table of payments (payment_id, user_id, payment_method, amount, payment_date,
transaction_type) and a table of refunds (refund_id, payment_id, refund_amount,
refund_date), write a query to calculate the net payment amount (payment amount minus
refund) for each user in the last 30 days. Include only users who have a net payment amount
greater than $0.
Explanation
• The goal is to track the net payment amount for each user by subtracting any refunds from
the original payments.
• For each payment, we will calculate the total refund (if any), and subtract it from the
payment amount.
• We will filter users whose net payment amount is greater than $0 in the last 30 days.
• This involves joining the payments table with the refunds table and calculating the net
amount.
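One caveat when joining payments to refunds is that a payment with several refunds appears once per refund row, so summing the payment amount after the join counts it more than once. The sketch below, run through Python's sqlite3 module, contrasts a direct join with a version that totals refunds per payment first. The reduced table layouts, the sample rows, and the as_of reference date are assumptions for illustration only.

```python
import sqlite3

# Direct join: a payment with two refunds contributes its amount twice.
NAIVE_SQL = """
    SELECT p.user_id,
           SUM(p.amount) - COALESCE(SUM(r.refund_amount), 0) AS net_payment
    FROM payments p
    LEFT JOIN refunds r ON r.payment_id = p.payment_id
    WHERE p.payment_date > date(:as_of, '-30 days')
    GROUP BY p.user_id
    HAVING SUM(p.amount) - COALESCE(SUM(r.refund_amount), 0) > 0
    ORDER BY p.user_id
"""

# Pre-aggregated refunds: at most one refund row per payment, so no fan-out.
SAFE_SQL = """
    WITH refund_totals AS (
        SELECT payment_id, SUM(refund_amount) AS refunded
        FROM refunds
        GROUP BY payment_id
    )
    SELECT p.user_id,
           SUM(p.amount - COALESCE(rt.refunded, 0)) AS net_payment
    FROM payments p
    LEFT JOIN refund_totals rt ON rt.payment_id = p.payment_id
    WHERE p.payment_date > date(:as_of, '-30 days')
    GROUP BY p.user_id
    HAVING SUM(p.amount - COALESCE(rt.refunded, 0)) > 0
    ORDER BY p.user_id
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (payment_id INT, user_id INT, amount REAL, payment_date TEXT);
    CREATE TABLE refunds (refund_id INT, payment_id INT, refund_amount REAL);
    INSERT INTO payments VALUES
        (1, 10, 100.0, '2025-01-20'),   -- refunded in two parts
        (2, 10,  50.0, '2025-01-25'),   -- never refunded
        (3, 20,  80.0, '2025-01-22'),   -- fully refunded
        (4, 30,  40.0, '2024-11-01');   -- outside the 30-day window
    INSERT INTO refunds VALUES
        (1, 1, 30.0), (2, 1, 20.0), (3, 3, 80.0);
""")

params = {"as_of": "2025-01-31"}
naive = conn.execute(NAIVE_SQL, params).fetchall()
net_payments = conn.execute(SAFE_SQL, params).fetchall()
print("direct join:   ", naive)
print("pre-aggregated:", net_payments)
```

User 10 paid 150 and got 50 back, so the net should be 100; the direct join reports 200 because payment 1 is counted once for each of its two refunds. If every payment can have at most one refund, both queries agree.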
Learnings
• Using JOIN to combine data from payments and refunds tables.
• Using COALESCE() or IFNULL() to handle null values when no refund exists for a payment.
• Filtering data for the last 30 days using CURRENT_DATE - INTERVAL '30 days'.
• Calculating the net payment by subtracting the refund amount from the payment amount.
Solutions
• - PostgreSQL solution
SELECT
p.user_id,
SUM(p.amount) - COALESCE(SUM(r.refund_amount), 0) AS net_payment
FROM
payments p
LEFT JOIN
refunds r ON p.payment_id = r.payment_id
WHERE
p.payment_date > CURRENT_DATE - INTERVAL '30 days'
GROUP BY
p.user_id
HAVING
SUM(p.amount) - COALESCE(SUM(r.refund_amount), 0) > 0;
• - MySQL solution
SELECT
p.user_id,
SUM(p.amount) - IFNULL(SUM(r.refund_amount), 0) AS net_payment
FROM
payments p
LEFT JOIN
refunds r ON p.payment_id = r.payment_id
WHERE
p.payment_date > CURDATE() - INTERVAL 30 DAY
GROUP BY
p.user_id
HAVING
SUM(p.amount) - IFNULL(SUM(r.refund_amount), 0) > 0;
• Q.1031
Question
Identify employees who were absent for two consecutive days.
Explanation
You need to identify employees who have two consecutive absent records in the attendance
table. You will likely use a self-join or window functions to compare attendance records for
consecutive days.
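As a companion to the self-join approach, here is a hedged sketch of the window-function alternative mentioned above, using LAG() to look at each employee's previous attendance row. It runs on SQLite (3.25 or newer) through Python's sqlite3 module, with the sample rows from this question; SQLite's date() modifier replaces the PostgreSQL/MySQL interval arithmetic, so treat the date expression as an adaptation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INT, employee_name TEXT);
    CREATE TABLE attendance (employee_id INT, attendance_date TEXT, status TEXT);
    INSERT INTO employees VALUES
        (1, 'John Doe'), (2, 'Jane Smith'), (3, 'Alice Brown');
    INSERT INTO attendance VALUES
        (1, '2025-01-01', 'Absent'),  (1, '2025-01-02', 'Absent'),
        (1, '2025-01-03', 'Present'), (2, '2025-01-01', 'Present'),
        (2, '2025-01-02', 'Absent'),  (2, '2025-01-03', 'Absent'),
        (3, '2025-01-01', 'Present');
""")

CONSECUTIVE_ABSENCE_SQL = """
    WITH with_prev AS (
        SELECT employee_id, attendance_date, status,
               LAG(attendance_date) OVER (PARTITION BY employee_id
                                          ORDER BY attendance_date) AS prev_date,
               LAG(status) OVER (PARTITION BY employee_id
                                 ORDER BY attendance_date) AS prev_status
        FROM attendance
    )
    SELECT DISTINCT e.employee_name
    FROM with_prev a
    JOIN employees e ON e.employee_id = a.employee_id
    WHERE a.status = 'Absent'
      AND a.prev_status = 'Absent'
      -- the previous row must be exactly one calendar day earlier
      AND a.prev_date = date(a.attendance_date, '-1 day')
    ORDER BY e.employee_name
"""

names = [row[0] for row in conn.execute(CONSECUTIVE_ABSENCE_SQL)]
print(names)
```

The explicit check on prev_date matters: without it, two absences separated by a missing day (for example a weekend with no attendance row) would be reported as consecutive.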
CREATE TABLE employees (
employee_id INT,
employee_name VARCHAR(100)
);
-- datasets
INSERT INTO employees (employee_id, employee_name)
VALUES
(1, 'John Doe'),
(2, 'Jane Smith'),
(3, 'Alice Brown');
Table: attendance
CREATE TABLE attendance (
employee_id INT,
attendance_date DATE,
status VARCHAR(20)
);
-- datasets
INSERT INTO attendance (employee_id, attendance_date, status)
VALUES
(1, '2025-01-01', 'Absent'),
(1, '2025-01-02', 'Absent'),
(1, '2025-01-03', 'Present'),
(2, '2025-01-01', 'Present'),
(2, '2025-01-02', 'Absent'),
(2, '2025-01-03', 'Absent'),
(3, '2025-01-01', 'Present');
Learnings
• Self-joins or window functions for comparing consecutive rows
• Date handling for consecutive days
• Filtering based on conditions (e.g., 'Absent' status)
Solutions
PostgreSQL Solution
SELECT e.employee_name
FROM attendance a1
JOIN attendance a2 ON a1.employee_id = a2.employee_id
AND a1.attendance_date = a2.attendance_date - INTERVAL '1 day'
-- the employees table must be joined for the alias e to exist
JOIN employees e ON e.employee_id = a1.employee_id
WHERE a1.status = 'Absent' AND a2.status = 'Absent'
GROUP BY e.employee_name;
MySQL Solution
SELECT e.employee_name
FROM attendance a1
JOIN attendance a2 ON a1.employee_id = a2.employee_id
AND a1.attendance_date = DATE_SUB(a2.attendance_date, INTERVAL 1 DAY)
-- the employees table must be joined for the alias e to exist
JOIN employees e ON e.employee_id = a1.employee_id
WHERE a1.status = 'Absent' AND a2.status = 'Absent'
GROUP BY e.employee_name;
• Q.1032
Question
Find employees whose total overtime hours across all present days exceed 5 hours. Return
their employee_id and total overtime hours.
Explanation
You need to calculate the total overtime hours for each employee across days when they were
present. If the total overtime exceeds 5 hours, return their employee_id and the total
overtime hours. You will use aggregation and filtering to achieve this.
-- datasets
INSERT INTO employees (employee_id, employee_name)
VALUES
(1, 'John Doe'),
(2, 'Jane Smith'),
(3, 'Alice Brown');
Table: attendance
CREATE TABLE attendance (
employee_id INT,
attendance_date DATE,
status VARCHAR(20),
overtime_hours DECIMAL(5,2) -- number of overtime hours
);
-- datasets
INSERT INTO attendance (employee_id, attendance_date, status, overtime_hours)
VALUES
(1, '2025-01-01', 'Present', 2.5),
(1, '2025-01-02', 'Present', 3.0),
(1, '2025-01-03', 'Absent', 0.0),
(2, '2025-01-01', 'Present', 1.5),
(2, '2025-01-02', 'Present', 2.5),
(2, '2025-01-03', 'Present', 2.0),
(3, '2025-01-01', 'Present', 6.0),
(3, '2025-01-02', 'Present', 0.5);
Learnings
• Aggregating values (total overtime hours).
• Filtering based on conditions (e.g., presence on specific days).
• Summing up overtime across multiple records.
Solutions
PostgreSQL Solution
SELECT employee_id, SUM(overtime_hours) AS total_overtime
FROM attendance
WHERE status = 'Present'
GROUP BY employee_id
HAVING SUM(overtime_hours) > 5;
MySQL Solution
SELECT employee_id, SUM(overtime_hours) AS total_overtime
FROM attendance
WHERE status = 'Present'
GROUP BY employee_id
HAVING SUM(overtime_hours) > 5;
• Q.1033
Question
For employees who were late, calculate their average overtime hours on those days. Exclude
employees who were never late.
Explanation
You need to calculate the average overtime hours for each employee on the days they were
late. Exclude employees who never had a "Late" status. This involves filtering for "Late"
days and then calculating the average overtime hours for those specific days.
-- datasets
INSERT INTO employees (employee_id, employee_name)
VALUES
(1, 'John Doe'),
(2, 'Jane Smith'),
(3, 'Alice Brown');
Table: attendance
CREATE TABLE attendance (
employee_id INT,
attendance_date DATE,
status VARCHAR(20),
overtime_hours DECIMAL(5,2) -- number of overtime hours
);
-- datasets
INSERT INTO attendance (employee_id, attendance_date, status, overtime_hours)
VALUES
(1, '2025-01-01', 'Late', 2.5),
(1, '2025-01-02', 'Present', 0.0),
(1, '2025-01-03', 'Late', 1.5),
(2, '2025-01-01', 'Present', 0.0),
(2, '2025-01-02', 'Late', 3.0),
(2, '2025-01-03', 'Present', 0.0),
(3, '2025-01-01', 'Present', 0.0);
Learnings
• Filtering for specific statuses (e.g., "Late").
• Calculating averages with AVG() function.
• Excluding records based on conditions (e.g., employees who were never late).
Solutions
PostgreSQL Solution
SELECT employee_id, AVG(overtime_hours) AS avg_overtime
FROM attendance
WHERE status = 'Late'
GROUP BY employee_id
HAVING COUNT(*) > 0;
MySQL Solution
SELECT employee_id, AVG(overtime_hours) AS avg_overtime
FROM attendance
WHERE status = 'Late'
GROUP BY employee_id
Question
Rank employees by their overall attendance consistency, defined as the total number of
Present days divided by the total number of attendance records. Return their employee_id,
consistency percentage, and rank.
Explanation
You need to calculate the attendance consistency for each employee. The consistency is
defined as the ratio of Present days to the total number of attendance records (including both
Present and Absent days). You will then rank employees based on their consistency. This
requires the use of aggregation, division, and ranking functions.
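To see how ties behave, the sketch below runs the ratio-and-rank idea on this question's sample rows using Python's sqlite3 module (SQLite 3.25 or newer). It is an illustration rather than the book's solution: the column aliases rank_with_gaps and rank_without_gaps are names chosen here, and DENSE_RANK() is added only for comparison with RANK().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendance (employee_id INT, attendance_date TEXT, status TEXT);
    INSERT INTO attendance VALUES
        (1, '2025-01-01', 'Present'), (1, '2025-01-02', 'Absent'),
        (1, '2025-01-03', 'Present'), (2, '2025-01-01', 'Present'),
        (2, '2025-01-02', 'Present'), (2, '2025-01-03', 'Absent'),
        (3, '2025-01-01', 'Absent'),  (3, '2025-01-02', 'Absent'),
        (3, '2025-01-03', 'Present');
""")

CONSISTENCY_SQL = """
    WITH attendance_summary AS (
        SELECT employee_id,
               COUNT(*) AS total_records,
               -- CASE yields NULL for non-Present rows, and COUNT skips NULLs
               COUNT(CASE WHEN status = 'Present' THEN 1 END) AS present_days
        FROM attendance
        GROUP BY employee_id
    )
    SELECT employee_id,
           ROUND(present_days * 100.0 / total_records, 2) AS consistency_percentage,
           RANK()       OVER (ORDER BY present_days * 100.0 / total_records DESC)
               AS rank_with_gaps,
           DENSE_RANK() OVER (ORDER BY present_days * 100.0 / total_records DESC)
               AS rank_without_gaps
    FROM attendance_summary
    ORDER BY rank_with_gaps, employee_id
"""

rows = conn.execute(CONSISTENCY_SQL).fetchall()
for row in rows:
    print(row)
```

Employees 1 and 2 are both present on two of three days, so they share rank 1. RANK() then skips to 3 for employee 3, while DENSE_RANK() continues with 2; which one an interviewer expects is worth asking.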
-- datasets
INSERT INTO employees (employee_id, employee_name)
VALUES
(1, 'John Doe'),
(2, 'Jane Smith'),
(3, 'Alice Brown');
Table: attendance
CREATE TABLE attendance (
employee_id INT,
attendance_date DATE,
status VARCHAR(20)
);
-- datasets
INSERT INTO attendance (employee_id, attendance_date, status)
VALUES
(1, '2025-01-01', 'Present'),
(1, '2025-01-02', 'Absent'),
(1, '2025-01-03', 'Present'),
(2, '2025-01-01', 'Present'),
(2, '2025-01-02', 'Present'),
(2, '2025-01-03', 'Absent'),
(3, '2025-01-01', 'Absent'),
(3, '2025-01-02', 'Absent'),
(3, '2025-01-03', 'Present');
Learnings
• Calculating ratios (Present days / Total days).
• Ranking with window functions (RANK(), DENSE_RANK()).
• Using aggregation and conditional counting.
Solutions
PostgreSQL Solution
WITH attendance_summary AS (
SELECT employee_id,
COUNT(*) AS total_records,
COUNT(CASE WHEN status = 'Present' THEN 1 END) AS present_days
FROM attendance
GROUP BY employee_id
)
SELECT employee_id,
(present_days * 100.0 / total_records) AS consistency_percentage,
RANK() OVER (ORDER BY (present_days * 100.0 / total_records) DESC) AS rank
FROM attendance_summary;
MySQL Solution
WITH attendance_summary AS (
SELECT employee_id,
COUNT(*) AS total_records,
COUNT(CASE WHEN status = 'Present' THEN 1 END) AS present_days
FROM attendance
GROUP BY employee_id
)
SELECT employee_id,
(present_days * 100.0 / total_records) AS consistency_percentage,
-- RANK is a reserved word in MySQL 8.0, so the alias is quoted with backticks
RANK() OVER (ORDER BY (present_days * 100.0 / total_records) DESC) AS `rank`
FROM attendance_summary;
• Q.1035
Question
For each student, calculate their total grade across all subjects and rank them in descending
order of total grades. Return student_id, total_grade, and rank.
Explanation
You need to calculate the total grade for each student by summing their grades across all
subjects. Then, rank the students based on their total grade in descending order. This involves
using aggregation and ranking functions.
-- datasets
INSERT INTO students (student_id, student_name)
VALUES
(1, 'John Doe'),
(2, 'Jane Smith'),
(3, 'Alice Brown');
Table: grades
CREATE TABLE grades (
student_id INT,
subject VARCHAR(100),
grade DECIMAL(5,2)
);
-- datasets
INSERT INTO grades (student_id, subject, grade)
VALUES
(1, 'Math', 85.5),
(1, 'Science', 90.0),
(1, 'History', 78.0),
(2, 'Math', 92.0),
Learnings
• Using SUM() to calculate total grades.
• Ranking results using RANK() or DENSE_RANK().
• Aggregation with grouping and ordering by total grades.
Solutions
PostgreSQL Solution
WITH total_grades AS (
SELECT student_id, SUM(grade) AS total_grade
FROM grades
GROUP BY student_id
)
SELECT student_id, total_grade,
RANK() OVER (ORDER BY total_grade DESC) AS rank
FROM total_grades;
MySQL Solution
WITH total_grades AS (
SELECT student_id, SUM(grade) AS total_grade
FROM grades
GROUP BY student_id
)
SELECT student_id, total_grade,
-- RANK is a reserved word in MySQL 8.0, so the alias is quoted with backticks
RANK() OVER (ORDER BY total_grade DESC) AS `rank`
FROM total_grades;
• Q.1036
Question
Identify the top scorer in each subject. If multiple students have the same top score in a
subject, return all of them. Return subject, student_id, and grade.
Explanation
You need to find the highest score for each subject, and if multiple students share the same
highest score, return all of them. This involves using aggregation to find the maximum grade
per subject, and then filtering to get students who match the maximum grade.
-- datasets
INSERT INTO students (student_id, student_name)
VALUES
(1, 'John Doe'),
(2, 'Jane Smith'),
(3, 'Alice Brown');
Table: grades
-- datasets
INSERT INTO grades (student_id, subject, grade)
VALUES
(1, 'Math', 85.5),
(1, 'Science', 90.0),
(1, 'History', 78.0),
(2, 'Math', 92.0),
(2, 'Science', 85.5),
(2, 'History', 88.0),
(3, 'Math', 92.0),
(3, 'Science', 80.0),
(3, 'History', 70.0);
Learnings
• Using MAX() to find the highest grade in a subject.
• Filtering to return all students who match the highest grade.
• Grouping by subject to calculate the top scorer.
Solutions
PostgreSQL Solution
WITH max_grades AS (
SELECT subject, MAX(grade) AS max_grade
FROM grades
GROUP BY subject
)
SELECT g.subject, g.student_id, g.grade
FROM grades g
JOIN max_grades m ON g.subject = m.subject AND g.grade = m.max_grade;
MySQL Solution
WITH max_grades AS (
SELECT subject, MAX(grade) AS max_grade
FROM grades
GROUP BY subject
)
SELECT g.subject, g.student_id, g.grade
FROM grades g
JOIN max_grades m ON g.subject = m.subject AND g.grade = m.max_grade;
• Q.1037
Question
Identify products whose sales performance improved each day over the 4-day period. This
means the units sold on a given day must be greater than the previous day. Return
product_id and the total_improvement_days (number of consecutive days with increasing
sales).
Explanation
You need to identify products where sales increased every day over a 4-day period. For each
product, you will compare sales for each day with the previous day. If the sales were greater
on a given day, the streak of improvement continues. At the end, count how many days in the
4-day period saw this improvement for each product.
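The day-over-day comparison described above can be checked outside the database as well. The following Python sketch applies the same logic to this question's sample rows: sort each product's sales by date, pair every day with the previous one, and count the pairs where units increased. The function name improvement_days is an illustrative choice, not something defined in the book.

```python
from collections import defaultdict

# Sample rows from the question: (product_id, sale_date, units_sold)
sales = [
    (1, "2025-01-01", 10), (1, "2025-01-02", 15), (1, "2025-01-03", 20), (1, "2025-01-04", 25),
    (2, "2025-01-01", 5),  (2, "2025-01-02", 7),  (2, "2025-01-03", 8),  (2, "2025-01-04", 7),
    (3, "2025-01-01", 3),  (3, "2025-01-02", 4),  (3, "2025-01-03", 5),  (3, "2025-01-04", 6),
]

def improvement_days(rows):
    """Count, per product, the days whose units_sold beat the previous day."""
    units_by_product = defaultdict(list)
    # ISO date strings sort chronologically, so a plain sort is enough here.
    for product_id, _sale_date, units in sorted(rows, key=lambda r: (r[0], r[1])):
        units_by_product[product_id].append(units)
    return {
        product_id: sum(1 for prev, cur in zip(units, units[1:]) if cur > prev)
        for product_id, units in units_by_product.items()
    }

counts = improvement_days(sales)
# Four days give three day-over-day comparisons, so "improved every day" means 3.
always_improving = sorted(pid for pid, n in counts.items() if n == 3)
print(counts)
print(always_improving)
```

Pairing each value with its predecessor via zip(units, units[1:]) plays the role that LAG() plays in SQL; the first day has no predecessor and therefore never counts as an improvement.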
-- datasets
INSERT INTO products (product_id, product_name)
VALUES
(1, 'Product A'),
(2, 'Product B'),
(3, 'Product C');
Table: sales
CREATE TABLE sales (
product_id INT,
sale_date DATE,
units_sold INT
);
-- datasets
INSERT INTO sales (product_id, sale_date, units_sold)
VALUES
(1, '2025-01-01', 10),
(1, '2025-01-02', 15),
(1, '2025-01-03', 20),
(1, '2025-01-04', 25),
(2, '2025-01-01', 5),
(2, '2025-01-02', 7),
(2, '2025-01-03', 8),
(2, '2025-01-04', 7),
(3, '2025-01-01', 3),
(3, '2025-01-02', 4),
(3, '2025-01-03', 5),
(3, '2025-01-04', 6);
Learnings
• Using LAG() or self-joins to compare consecutive rows.
• Filtering based on a condition where sales increase over multiple consecutive days.
• Aggregating data to count the number of improvement days for each product.
Solutions
PostgreSQL Solution
WITH sales_comparison AS (
SELECT product_id, sale_date, units_sold,
LAG(units_sold) OVER (PARTITION BY product_id ORDER BY sale_date) AS prev_units_sold
FROM sales
)
SELECT product_id,
COUNT(*) AS total_improvement_days
FROM sales_comparison
WHERE units_sold > prev_units_sold
GROUP BY product_id
HAVING COUNT(*) = 3; -- 3 consecutive improvement days out of 4
MySQL Solution
WITH sales_comparison AS (
SELECT product_id, sale_date, units_sold,
LAG(units_sold) OVER (PARTITION BY product_id ORDER BY sale_date) AS prev_units_sold
FROM sales
)
SELECT product_id,
COUNT(*) AS total_improvement_days
FROM sales_comparison
WHERE units_sold > prev_units_sold
GROUP BY product_id
HAVING COUNT(*) = 3; -- 3 consecutive improvement days out of 4
• Q.1038
Question
For each product, calculate the maximum sales in a single day and return the product_id and
the max_sales. Only include products where the maximum sales were greater than 45 units.
Explanation
You need to calculate the maximum sales for each product on any single day. Then, filter the
results to only include products where their maximum sales exceeded 45 units. This involves
using the MAX() function to calculate the highest sales and applying a condition to filter based
on the value.
-- datasets
INSERT INTO products (product_id, product_name)
VALUES
(1, 'Product A'),
(2, 'Product B'),
(3, 'Product C');
Table: sales
CREATE TABLE sales (
product_id INT,
sale_date DATE,
units_sold INT
);
-- datasets
INSERT INTO sales (product_id, sale_date, units_sold)
VALUES
(1, '2025-01-01', 10),
(1, '2025-01-02', 55),
(1, '2025-01-03', 20),
(1, '2025-01-04', 35),
(2, '2025-01-01', 20),
(2, '2025-01-02', 45),
(2, '2025-01-03', 30),
(2, '2025-01-04', 50),
(3, '2025-01-01', 5),
(3, '2025-01-02', 10),
(3, '2025-01-03', 15),
(3, '2025-01-04', 30);
Learnings
• Using the MAX() function to calculate the highest sales.
• Filtering results with HAVING to include only products with sales greater than a threshold.
Solutions
PostgreSQL Solution
SELECT product_id, MAX(units_sold) AS max_sales
FROM sales
GROUP BY product_id
HAVING MAX(units_sold) > 45;
MySQL Solution
SELECT product_id, MAX(units_sold) AS max_sales
FROM sales
GROUP BY product_id
HAVING MAX(units_sold) > 45;
• Q.1039
Question
Find all the available listings in the location 'Miami' or 'New York' with an availability date
within the next 7 days.
Explanation
You need to filter the listings based on two conditions:
• The location must be either 'Miami' or 'New York'.
• The availability date should be within the next 7 days from the current date.
You will use WHERE clauses to filter based on the location and availability date. Additionally,
you can use date functions to ensure the availability date is within the next 7 days.
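Because the result depends on today's date, a query like this is easier to test with the reference date passed in as a parameter. The sketch below does that with Python's sqlite3 module and this question's sample rows. The as_of parameter and the chosen value '2025-01-10' are assumptions for the demonstration, and SQLite's date() modifier stands in for CURRENT_DATE or CURDATE() plus an interval.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE listings (listing_id INT, location TEXT,
                           available_from TEXT, available_to TEXT);
    INSERT INTO listings VALUES
        (1, 'Miami',       '2025-01-15', '2025-01-20'),
        (2, 'New York',    '2025-01-18', '2025-01-22'),
        (3, 'Los Angeles', '2025-01-10', '2025-01-15'),
        (4, 'Miami',       '2025-01-17', '2025-01-21'),
        (5, 'New York',    '2025-01-12', '2025-01-16');
""")

UPCOMING_SQL = """
    SELECT listing_id
    FROM listings
    WHERE location IN ('Miami', 'New York')
      -- BETWEEN is inclusive on both ends: as_of through as_of + 7 days
      AND available_from BETWEEN date(:as_of) AND date(:as_of, '+7 days')
    ORDER BY listing_id
"""

ids = [row[0] for row in conn.execute(UPCOMING_SQL, {"as_of": "2025-01-10"})]
print(ids)
```

With the reference date 2025-01-10 the window runs through 2025-01-17, so listing 4 (available from the 17th) is included and listing 2 (from the 18th) is not. Listing 3 falls inside the window but is filtered out by location.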
-- datasets
INSERT INTO listings (listing_id, location, available_from, available_to)
VALUES
(1, 'Miami', '2025-01-15', '2025-01-20'),
(2, 'New York', '2025-01-18', '2025-01-22'),
(3, 'Los Angeles', '2025-01-10', '2025-01-15'),
(4, 'Miami', '2025-01-17', '2025-01-21'),
(5, 'New York', '2025-01-12', '2025-01-16');
Learnings
• Using the WHERE clause to filter by location.
• Using CURRENT_DATE or NOW() to filter dates relative to the current day.
• Date comparison to filter availability within the next 7 days.
Solutions
PostgreSQL Solution
MySQL Solution
SELECT listing_id, location, available_from, available_to
FROM listings
WHERE location IN ('Miami', 'New York')
AND available_from BETWEEN CURDATE() AND CURDATE() + INTERVAL 7 DAY;
• Q.1040
Question
Calculate the total potential revenue for listings available between '2024-12-20' and
'2024-12-25'.
(Revenue is defined as the price_per_night multiplied by the number of days the property
is available within this period.)
Explanation
You need to calculate the potential revenue for each listing based on the number of nights the
listing is available within the specified date range (2024-12-20 to 2024-12-25).
• The revenue for each listing is calculated by multiplying the price_per_night by the
number of nights the listing is available within the given date range.
• If the listing's availability period overlaps with this date range, you need to calculate how
many days fall within that range.
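The overlap arithmetic is easy to verify by hand in a few lines of Python. This sketch uses this question's sample listings and mirrors the usual GREATEST/LEAST pattern: the overlap starts at the later of the two start dates, ends at the earlier of the two end dates, and is clamped at zero when the ranges do not intersect. Counting days as a plain date difference (end minus start, without adding one) is an assumption about how "nights" should be counted.

```python
from datetime import date

PERIOD_START = date(2024, 12, 20)
PERIOD_END = date(2024, 12, 25)

# Sample rows from the question:
# (listing_id, available_from, available_to, price_per_night)
listings = [
    (1, date(2024, 12, 15), date(2024, 12, 22), 150.00),
    (2, date(2024, 12, 20), date(2024, 12, 25), 250.00),
    (3, date(2024, 12, 18), date(2024, 12, 23), 200.00),
    (4, date(2024, 12, 10), date(2024, 12, 18), 180.00),
    (5, date(2024, 12, 21), date(2024, 12, 25), 300.00),
]

def overlap_days(start, end):
    """Days shared by [start, end] and the reporting period, never negative."""
    overlap_start = max(start, PERIOD_START)   # GREATEST(available_from, period start)
    overlap_end = min(end, PERIOD_END)         # LEAST(available_to, period end)
    return max((overlap_end - overlap_start).days, 0)

nights = {lid: overlap_days(start, end) for lid, start, end, _price in listings}
revenue = {lid: nights[lid] * price for lid, _start, _end, price in listings}
total = sum(revenue.values())

print(nights)
print(revenue)
print(total)
```

Listing 4 ends on 2024-12-18, before the period begins, so its raw difference is negative and the clamp turns it into zero nights and zero revenue.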
-- datasets
INSERT INTO listings (listing_id, location, available_from, available_to, price_per_night)
VALUES
(1, 'Miami', '2024-12-15', '2024-12-22', 150.00),
(2, 'New York', '2024-12-20', '2024-12-25', 250.00),
(3, 'Los Angeles', '2024-12-18', '2024-12-23', 200.00),
(4, 'Miami', '2024-12-10', '2024-12-18', 180.00),
(5, 'New York', '2024-12-21', '2024-12-25', 300.00);
Learnings
• Calculating the overlap between two date ranges.
• Using GREATEST() and LEAST() functions to determine the overlapping dates.
• Multiplying price_per_night by the number of overlapping days to calculate revenue.
Solutions
PostgreSQL Solution
SELECT listing_id,
location,
price_per_night,
-- Calculate the number of overlapping days
GREATEST(LEAST(available_to, '2024-12-25'::DATE) - GREATEST(available_from, '2024-12-20'::DATE), 0) AS available_days,
price_per_night * GREATEST(LEAST(available_to, '2024-12-25'::DATE) - GREATEST(available_from, '2024-12-20'::DATE), 0) AS potential_revenue
FROM listings
WHERE available_to >= '2024-12-20' AND available_from <= '2024-12-25';
MySQL Solution
SELECT listing_id,
location,
price_per_night,
-- Calculate the number of overlapping days
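-- MySQL does not return a day count when dates are subtracted with "-", so use DATEDIFF()
GREATEST(DATEDIFF(LEAST(available_to, DATE '2024-12-25'), GREATEST(available_from, DATE '2024-12-20')), 0) AS available_days,
price_per_night * GREATEST(DATEDIFF(LEAST(available_to, DATE '2024-12-25'), GREATEST(available_from, DATE '2024-12-20')), 0) AS potential_revenue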
FROM listings
WHERE available_to >= '2024-12-20' AND available_from <= '2024-12-25';
Question
Write a query to find the most expensive listing (price_per_night) in each city.
Explanation
You need to find the highest price per night (price_per_night) for listings in each city. This
can be achieved by grouping the data by city and using the MAX() function to determine the
highest price. You will then return the listing that has the maximum price for each city.
-- datasets
INSERT INTO listings (listing_id, location, available_from, available_to, price_per_night)
VALUES
(1, 'Miami', '2024-12-15', '2024-12-22', 150.00),
(2, 'New York', '2024-12-20', '2024-12-25', 250.00),
(3, 'Los Angeles', '2024-12-18', '2024-12-23', 200.00),
(4, 'Miami', '2024-12-10', '2024-12-18', 180.00),
(5, 'New York', '2024-12-21', '2024-12-25', 300.00),
Learnings
• Using the MAX() function to find the highest value in a group.
• Grouping data by a specific column (location) to find maximum values for each city.
• Using JOIN (if needed) to retrieve the full record for the highest-priced listings.
Solutions
PostgreSQL Solution
WITH max_prices AS (
SELECT location, MAX(price_per_night) AS max_price
FROM listings
GROUP BY location
)
SELECT l.listing_id, l.location, l.price_per_night
FROM listings l
JOIN max_prices m ON l.location = m.location
AND l.price_per_night = m.max_price;
MySQL Solution
WITH max_prices AS (
SELECT location, MAX(price_per_night) AS max_price
FROM listings
GROUP BY location
)
SELECT l.listing_id, l.location, l.price_per_night
FROM listings l
JOIN max_prices m ON l.location = m.location
AND l.price_per_night = m.max_price;
Question
Write a query to count how many listings are available and unavailable, grouped by
location.
Explanation
You need to count the number of listings in each location that are either available or
unavailable. To determine availability, you can use the available_from and available_to
dates. If the current date is between available_from and available_to, the listing is
considered available. Otherwise, it is unavailable.
CREATE TABLE listings (
listing_id INT,
location VARCHAR(100),
available_from DATE,
available_to DATE,
price_per_night DECIMAL(10, 2) -- Price per night for the listing
);
-- datasets
INSERT INTO listings (listing_id, location, available_from, available_to, price_per_night)
VALUES
(1, 'Miami', '2024-12-15', '2024-12-22', 150.00),
(2, 'New York', '2024-12-20', '2024-12-25', 250.00),
(3, 'Los Angeles', '2024-12-18', '2024-12-23', 200.00),
(4, 'Miami', '2024-12-10', '2024-12-18', 180.00),
(5, 'New York', '2024-12-21', '2024-12-25', 300.00),
(6, 'Los Angeles', '2024-12-22', '2024-12-30', 220.00);
Learnings
• Using CASE statements to categorize records as available or unavailable based on date
comparison.
• Grouping data by location and counting the records for each category.
• Using CURRENT_DATE or NOW() to determine the current date.
Solutions
PostgreSQL Solution
SELECT location,
COUNT(CASE WHEN available_from <= CURRENT_DATE AND available_to >= CURRENT_DATE THEN 1 END) AS available_count,
COUNT(CASE WHEN available_from > CURRENT_DATE OR available_to < CURRENT_DATE THEN 1 END) AS unavailable_count
FROM listings
GROUP BY location;
MySQL Solution
SELECT location,
COUNT(CASE WHEN available_from <= CURDATE() AND available_to >= CURDATE() THEN 1 END) AS available_count,
COUNT(CASE WHEN available_from > CURDATE() OR available_to < CURDATE() THEN 1 END) AS unavailable_count
FROM listings
GROUP BY location;
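The conditional counts can be checked without a database server. The following is a minimal sketch in Python using the standard-library sqlite3 module, not one of the book's solutions. It loads the sample listings shown above and, as an assumption made only to keep the result reproducible, pins "today" to 2024-12-21 instead of calling CURRENT_DATE or CURDATE(). Dates are stored as ISO strings, which SQLite compares correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listings (listing_id INTEGER, location TEXT, "
    "available_from TEXT, available_to TEXT, price_per_night REAL)"
)
conn.executemany(
    "INSERT INTO listings VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Miami", "2024-12-15", "2024-12-22", 150.00),
        (2, "New York", "2024-12-20", "2024-12-25", 250.00),
        (3, "Los Angeles", "2024-12-18", "2024-12-23", 200.00),
        (4, "Miami", "2024-12-10", "2024-12-18", 180.00),
        (5, "New York", "2024-12-21", "2024-12-25", 300.00),
        (6, "Los Angeles", "2024-12-22", "2024-12-30", 220.00),
    ],
)

# Assumption: a fixed reference date stands in for CURRENT_DATE / CURDATE().
today = "2024-12-21"
availability = conn.execute(
    """
    SELECT location,
           COUNT(CASE WHEN available_from <= :today AND available_to >= :today THEN 1 END),
           COUNT(CASE WHEN available_from > :today OR available_to < :today THEN 1 END)
    FROM listings
    GROUP BY location
    ORDER BY location
    """,
    {"today": today},
).fetchall()
print(availability)
```

COUNT ignores NULL, so a CASE with no ELSE branch counts only the rows that satisfy its condition.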
Question
Write an SQL query to find for each month and country, the number of transactions and their
total amount, the number of approved transactions and their total amount.
month country trans_count approved_count trans_total_amount approved_total_amount
2018-12 US 2 1 3000 1000
2019-01 US 1 1 2000 2000
2019-01 DE 1 1 2000 2000
Explanation
In this query, you need to:
• Group the transactions by month and country.
• Count the total number of transactions (trans_count).
• Count the number of approved transactions (approved_count).
• Calculate the total amount of all transactions (trans_total_amount).
• Calculate the total amount of approved transactions (approved_total_amount).
• Use CASE statements to conditionally sum values based on whether the transaction is
approved or not.
You can extract the month from the trans_date column using the appropriate date function
(TO_CHAR in PostgreSQL, DATE_FORMAT in MySQL) and group by both month and country.
Solution
PostgreSQL Solution
SELECT
TO_CHAR(trans_date, 'YYYY-MM') AS month,
country,
COUNT(*) AS trans_count,
SUM(CASE WHEN state = 'approved' THEN 1 ELSE 0 END) AS approved_count,
SUM(amount) AS trans_total_amount,
SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) AS approved_total_amount
FROM Transactions
GROUP BY TO_CHAR(trans_date, 'YYYY-MM'), country
ORDER BY month, country;
MySQL Solution
SELECT
DATE_FORMAT(trans_date, '%Y-%m') AS month,
country,
COUNT(*) AS trans_count,
SUM(CASE WHEN state = 'approved' THEN 1 ELSE 0 END) AS approved_count,
SUM(amount) AS trans_total_amount,
SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) AS approved_total_amount
FROM Transactions
GROUP BY DATE_FORMAT(trans_date, '%Y-%m'), country
ORDER BY month, country;
• COUNT(*):
This counts the total number of transactions for each group (i.e., month and country).
• SUM(CASE WHEN state = 'approved' THEN 1 ELSE 0 END):
This counts how many transactions were approved. It adds 1 for approved transactions, and
0 for others, effectively counting only the approved ones.
• SUM(amount):
This calculates the total amount for all transactions in that group.
• SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END):
This sums the amounts only for approved transactions, calculating the total amount of
approved transactions.
• GROUP BY:
The data is grouped by month (which is derived from trans_date) and country.
• ORDER BY:
The result is ordered first by month and then by country to match the desired output format.
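As a quick check of the conditional aggregation, here is a hedged sketch in Python with the standard-library sqlite3 module. The four transaction rows are assumed sample data (the original input table is not shown here), and strftime('%Y-%m', ...) is SQLite's counterpart of TO_CHAR and DATE_FORMAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Transactions (id INTEGER, country TEXT, state TEXT, "
    "amount INTEGER, trans_date TEXT)"
)
# Assumed sample rows, not taken from the book.
conn.executemany(
    "INSERT INTO Transactions VALUES (?, ?, ?, ?, ?)",
    [
        (121, "US", "approved", 1000, "2018-12-18"),
        (122, "US", "declined", 2000, "2018-12-19"),
        (123, "US", "approved", 2000, "2019-01-01"),
        (124, "DE", "approved", 2000, "2019-01-07"),
    ],
)
monthly = conn.execute(
    """
    SELECT strftime('%Y-%m', trans_date) AS month,
           country,
           COUNT(*) AS trans_count,
           SUM(CASE WHEN state = 'approved' THEN 1 ELSE 0 END) AS approved_count,
           SUM(amount) AS trans_total_amount,
           SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) AS approved_total_amount
    FROM Transactions
    GROUP BY strftime('%Y-%m', trans_date), country
    ORDER BY month, country
    """
).fetchall()
for row in monthly:
    print(row)
```

Because of ORDER BY month, country, the DE row for 2019-01 sorts ahead of the US row for the same month.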
Expected Output:
month country trans_count approved_count trans_total_amount approved_total_amount
2018-12 US 2 1 3000 1000
2019-01 DE 1 1 2000 2000
2019-01 US 1 1 2000 2000
• Q.1044
Question
Given the reviews table, write a query to retrieve the average star rating for each product,
grouped by month.
The output should display the month as a numerical value, product ID, and average star rating
rounded to two decimal places.
Sort the output first by month and then by product ID.
Explanation
To solve this, you need to:
• Extract the month from the submit_date field.
• Group the data by product_id and month.
• Calculate the average star rating for each group.
• Round the average star rating to two decimal places.
• Sort the results by month and then by product_id.
Learnings
• Extracting the month and year from a timestamp or date field.
• Using GROUP BY to aggregate data.
• Calculating average values with AVG() and rounding the result using ROUND().
• Sorting results with ORDER BY.
Solutions
PostgreSQL Solution
SELECT
TO_CHAR(submit_date, 'YYYY-MM') AS month,
product_id,
ROUND(AVG(stars)::numeric, 2) AS average_star_rating
FROM reviews
GROUP BY TO_CHAR(submit_date, 'YYYY-MM'), product_id
ORDER BY month, product_id;
MySQL Solution
SELECT
DATE_FORMAT(submit_date, '%Y-%m') AS month,
product_id,
ROUND(AVG(stars), 2) AS average_star_rating
FROM reviews
GROUP BY DATE_FORMAT(submit_date, '%Y-%m'), product_id
ORDER BY month, product_id;
Explanation:
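• TO_CHAR(submit_date, 'YYYY-MM') (PostgreSQL) or
DATE_FORMAT(submit_date, '%Y-%m') (MySQL): Extracts the year and month from
the submit_date as a string in the format YYYY-MM. If the month must be a plain number, as
the question asks, use EXTRACT(MONTH FROM submit_date) (PostgreSQL) or MONTH(submit_date)
(MySQL) instead.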
• AVG(stars): Calculates the average of the stars column for each group.
• ROUND(..., 2): Rounds the average star rating to 2 decimal places.
• GROUP BY: Groups the data by month and product_id to calculate the average per
product per month.
• ORDER BY: Orders the results first by month and then by product ID.
Expected Output:
Summary:
This query calculates the average star ratings for each product in each month, rounds the
average to two decimal places, and sorts the results by month and product ID.
• Q.1045
Question
Identify users who have made purchases totaling more than $10,000 in the last month from
the purchases table.
The table contains information about purchases, including the user ID, date of purchase,
product ID, and amount spent.
Explanation
To solve this:
• Filter the records to include only those within the last month.
• This can be done by comparing the date_of_purchase with the current date
(CURRENT_DATE) and checking if it falls within the last month.
• Group the records by user_id to aggregate the total amount spent by each user.
• Use the SUM() function to calculate the total amount spent by each user.
• Filter the users whose total spending is greater than $10,000.
Learnings
• Date comparison to filter records from the last month.
• SUM() aggregation to calculate the total purchase amount.
• GROUP BY to aggregate data by user_id.
• HAVING clause to filter groups based on aggregated results.
Solutions
PostgreSQL Solution
SELECT user_id, SUM(amount_spent) AS total_spent
FROM purchases
WHERE date_of_purchase >= CURRENT_DATE - INTERVAL '1 month'
GROUP BY user_id
HAVING SUM(amount_spent) > 10000;
MySQL Solution
SELECT user_id, SUM(amount_spent) AS total_spent
FROM purchases
WHERE date_of_purchase >= CURDATE() - INTERVAL 1 MONTH
GROUP BY user_id
HAVING SUM(amount_spent) > 10000;
Expected Output:
user_id total_spent
145 27000.00
578 26000.00
Summary:
This query identifies users who made purchases totaling more than $10,000 in the last month.
The results display the user_id and their total spending during this period.
• Q.1046
Question
Identify the top 3 posts with the highest engagement (likes + comments) for each user on a
Facebook page. Display the user ID, post ID, engagement count, and rank for each post.
Explanation
You need to calculate the total engagement for each post (sum of likes and comments). Then,
rank the posts for each user based on this engagement in descending order. Finally, filter the
results to only show the top 3 posts for each user.
Learnings
• Use of ROW_NUMBER() for ranking.
• Window functions to partition data by user.
• Summing values for calculating engagement.
• Filtering based on rank.
Solutions
• PostgreSQL Solution
WITH RankedPosts AS (
SELECT user_id, post_id, (likes + comments) AS engagement,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY (likes + comments) DESC) AS post_rank
FROM fb_posts
)
-- Keep only the three highest-ranked posts for each user.
SELECT user_id, post_id, engagement, post_rank
FROM RankedPosts
WHERE post_rank <= 3;
Explanation
You need to group the job listings by company, title, and description, and then count how
many times each combination appears. If a combination appears more than once, it indicates
a duplicate listing. Finally, count how many companies have at least one duplicate job listing.
Learnings
• Use GROUP BY to group rows based on shared attributes (company, title, description).
• Use HAVING to filter groups with more than one occurrence.
• Use COUNT for aggregating and filtering duplicate entries.
Solutions
• PostgreSQL Solution
SELECT COUNT(DISTINCT company_id) AS cnt_company
FROM (
SELECT company_id, title, description, COUNT(*) AS total_job
FROM job_listings
GROUP BY company_id, title, description
HAVING COUNT(*) > 1
) x1;
• MySQL Solution
SELECT COUNT(DISTINCT company_id) AS cnt_company
FROM (
SELECT company_id, title, description, COUNT(*) AS total_job
FROM job_listings
GROUP BY company_id, title, description
HAVING COUNT(*) > 1
) x1;
• Q.1048
Question
Identify the region with the lowest sales amount for the previous month. Return the region
name and total sales amount.
Explanation
The task is to identify the region with the lowest sales for the previous month. The solution
involves filtering the sales data for the previous month, grouping it by region, calculating the
total sales for each region, and then selecting the region with the lowest total sales.
-- datasets
INSERT INTO Sales (Region, Amount, SaleDate) VALUES
('North', 5000.00, '2024-02-01'),
('South', 6000.00, '2024-02-02'),
('East', 4500.00, '2024-02-03'),
('West', 7000.00, '2024-02-04'),
('North', 5500.00, '2024-02-05'),
('South', 6500.00, '2024-02-06'),
('East', 4800.00, '2024-02-07'),
('West', 7200.00, '2024-02-08'),
('North', 5200.00, '2024-02-09'),
('South', 6200.00, '2024-02-10'),
('East', 4700.00, '2024-02-11'),
('West', 7100.00, '2024-02-12'),
('North', 5300.00, '2024-02-13'),
Learnings
• Date Functions: Using EXTRACT() to filter records for the previous month.
• Aggregation: Summing values using SUM() and grouping by region with GROUP BY.
• Sorting: Sorting by total sales in ascending order to get the region with the lowest sales.
• Limiting: Using LIMIT 1 to get only the region with the lowest sales.
Solutions
• PostgreSQL solution
SELECT
region,
SUM(amount) as total_sales
FROM sales
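-- Take both month and year from "one month ago" so that January maps to December of the prior year.
WHERE EXTRACT(MONTH FROM saledate) = EXTRACT(MONTH FROM CURRENT_DATE - INTERVAL '1 month')
AND EXTRACT(YEAR FROM saledate) = EXTRACT(YEAR FROM CURRENT_DATE - INTERVAL '1 month')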
GROUP BY region
ORDER BY total_sales ASC
LIMIT 1;
• MySQL solution
SELECT
region,
SUM(amount) AS total_sales
FROM sales
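-- Take both month and year from "one month ago" so that January maps to December of the prior year.
WHERE MONTH(saledate) = MONTH(CURRENT_DATE - INTERVAL 1 MONTH)
AND YEAR(saledate) = YEAR(CURRENT_DATE - INTERVAL 1 MONTH)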
GROUP BY region
ORDER BY total_sales ASC
LIMIT 1;
• Q.1049
Question
Find the median within a series of numbers in SQL.
Explanation
To calculate the median, you need to:
• Order the data in ascending or descending order.
• Use ROW_NUMBER() to assign a rank to each record.
• For an odd number of records, the median is the middle value. For an even number, the
median is the average of the two middle values.
• Filter the rows to select the middle value(s) based on the rank difference.
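The ranking steps above can be sketched in Python with the standard-library sqlite3 module (window functions need SQLite 3.25 or later). This is an illustration under stated assumptions rather than the book's solution: it runs over all eight sample view counts, and it adds SQLite's implicit rowid as a tie-breaker, which is an addition made here so that the duplicate value 700 is numbered consistently in both directions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tiktok (views INTEGER)")
conn.executemany(
    "INSERT INTO tiktok (views) VALUES (?)",
    [(100,), (800,), (350,), (150,), (600,), (700,), (700,), (950,)],
)
median = conn.execute(
    """
    WITH ranked AS (
        SELECT views,
               -- rowid breaks ties the same way in both orderings, so
               -- rn_asc + rn_desc equals (row count + 1) for every row.
               ROW_NUMBER() OVER (ORDER BY views ASC, rowid ASC) AS rn_asc,
               ROW_NUMBER() OVER (ORDER BY views DESC, rowid DESC) AS rn_desc
        FROM tiktok
    )
    SELECT AVG(views) FROM ranked WHERE ABS(rn_asc - rn_desc) <= 1
    """
).fetchone()[0]
print(median)
```

With eight values the two middle rows are 600 and 700, so the query averages them. Without a tie-breaker, the two numberings may order equal values differently and the difference test can miss a middle row.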
-- datasets
INSERT INTO tiktok (views)
VALUES
(100), (800), (350),
(150), (600),
(700), (700), (950);
Learnings
• Window Functions: Using ROW_NUMBER() to assign row numbers in both ascending and
descending order.
• Median Calculation: Calculating the median using ranking and selecting values based on
the rank difference.
• CTE: Using Common Table Expressions (CTEs) to simplify complex queries and
intermediate results.
• Handling Odd/Even Data: Identifying whether the data set size is odd or even and
calculating the appropriate median.
Solutions
• PostgreSQL solution
WITH CTE AS (
SELECT
views,
ROW_NUMBER() OVER(ORDER BY views ASC) AS rn_asc,
ROW_NUMBER() OVER(ORDER BY views DESC) AS rn_desc
FROM tiktok
-- If views can repeat, add a unique tie-breaker column to both ORDER BY clauses.
)
SELECT
AVG(views) AS median
FROM CTE
WHERE ABS(rn_asc - rn_desc) <= 1; -- 0 or 1
• MySQL solution
WITH CTE AS (
SELECT
views,
ROW_NUMBER() OVER(ORDER BY views ASC) AS rn_asc,
ROW_NUMBER() OVER(ORDER BY views DESC) AS rn_desc
FROM tiktok
-- If views can repeat, add a unique tie-breaker column to both ORDER BY clauses.
)
SELECT
AVG(views) AS median
FROM CTE
WHERE ABS(rn_asc - rn_desc) <= 1; -- 0 or 1
• Q.1050
Question
Which metro city had the highest number of restaurant orders in September 2021?
Write the SQL query to retrieve the city name and the total count of orders, ordered by the
total count of orders in descending order.
Explanation
To solve this:
• Filter the records for September 2021.
• Restrict the cities to 'Delhi', 'Mumbai', 'Bangalore', and 'Hyderabad'.
• Group the data by city and count the number of orders per city.
• Order the results by the count of orders in descending order to find the city with the highest
number of orders.
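The four steps can be tried out with a short Python sketch using the standard-library sqlite3 module. It loads only a subset of the sample orders (chosen so that no two cities tie), and the half-open date range is one possible way to express "September 2021", not necessarily the form the book's solution uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE restaurant_orders "
    "(city TEXT, restaurant_id INTEGER, order_id INTEGER, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO restaurant_orders VALUES (?, ?, ?, ?)",
    [
        ("Delhi", 101, 1, "2021-09-05"),
        ("Bangalore", 102, 12, "2021-09-08"),
        ("Bangalore", 102, 13, "2021-09-08"),
        ("Bangalore", 102, 14, "2021-09-08"),
        ("Mumbai", 103, 3, "2021-09-10"),
        ("Mumbai", 103, 30, "2021-09-10"),
        ("Chennai", 104, 4, "2021-09-15"),      # not in the metro list
        ("Delhi", 109, 9, "2021-10-05"),        # outside September 2021
        ("Hyderabad", 114, 14, "2021-10-25"),   # outside September 2021
    ],
)
metro_orders = conn.execute(
    """
    SELECT city, COUNT(order_id) AS total_orders
    FROM restaurant_orders
    WHERE city IN ('Delhi', 'Mumbai', 'Bangalore', 'Hyderabad')
      AND order_date >= '2021-09-01' AND order_date < '2021-10-01'
    GROUP BY city
    ORDER BY total_orders DESC
    """
).fetchall()
print(metro_orders)
```

The first row of the result is the city with the most orders. Hyderabad does not appear because it has no September orders in this subset.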
-- datasets
INSERT INTO restaurant_orders (city, restaurant_id, order_id, order_date)
VALUES
('Delhi', 101, 1, '2021-09-05'),
('Bangalore', 102, 12, '2021-09-08'),
('Bangalore', 102, 13, '2021-09-08'),
('Bangalore', 102, 14, '2021-09-08'),
('Mumbai', 103, 3, '2021-09-10'),
('Mumbai', 103, 30, '2021-09-10'),
('Chennai', 104, 4, '2021-09-15'),
('Delhi', 105, 5, '2021-09-20'),
('Bangalore', 106, 6, '2021-09-25'),
('Mumbai', 107, 7, '2021-09-28'),
('Chennai', 108, 8, '2021-09-30'),
('Delhi', 109, 9, '2021-10-05'),
('Bangalore', 110, 10, '2021-10-08'),
('Mumbai', 111, 11, '2021-10-10'),
('Chennai', 112, 12, '2021-10-15'),
('Kolkata', 113, 13, '2021-10-20'),
('Hyderabad', 114, 14, '2021-10-25'),
('Pune', 115, 15, '2021-10-28'),
('Jaipur', 116, 16, '2021-10-30');
Learnings
• Filtering by Date: Using the WHERE clause to filter data for a specific month and year.
• City Restriction: Filtering cities by a predefined list (Metro cities).
• Aggregation: Using COUNT() to count the number of orders per city.
• Sorting: Ordering the results by the count of orders in descending order to get the city with
the highest orders.
Solutions
• PostgreSQL solution
SELECT
city,
COUNT(order_id) AS total_orders
FROM restaurant_orders
WHERE city IN ('Delhi', 'Mumbai', 'Bangalore', 'Hyderabad')
AND order_date BETWEEN '2021-09-01' AND '2021-09-30'
GROUP BY city
ORDER BY total_orders DESC;
Explanation
To calculate the running total:
• Use the SUM() function with OVER() to compute the cumulative sum of revenue.
• The PARTITION BY clause is used to compute the running total separately for each
product_id.
• The ORDER BY clause ensures that the running total is calculated in the correct sequence
(by product_id and date).
• Ensure the result is ordered by product_id and date as per the requirement.
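A small runnable illustration of the windowed SUM(), written in Python with the standard-library sqlite3 module (window functions need SQLite 3.25 or later). It uses five of the sample order rows for two products; treat it as a sketch of the technique rather than the full exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (date TEXT, product_id INTEGER, product_name TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("2024-01-01", 101, "iPhone 13 Pro", 1000.00),
        ("2024-01-01", 102, "iPhone 13 Pro Max", 1200.00),
        ("2024-01-02", 101, "iPhone 13 Pro", 950.00),
        ("2024-01-03", 102, "iPhone 13 Pro Max", 1250.00),
        ("2024-01-04", 101, "iPhone 13 Pro", 800.00),
    ],
)
running = conn.execute(
    """
    SELECT product_id, date,
           -- The total restarts for each product_id and accumulates in date order.
           SUM(revenue) OVER (PARTITION BY product_id ORDER BY date) AS running_total
    FROM orders
    ORDER BY product_id, date
    """
).fetchall()
for row in running:
    print(row)
```

Product 101 accumulates 1000, 1950, 2750, while product 102 starts again from 1200.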
-- datasets
INSERT INTO orders (date, product_id, product_name, revenue) VALUES
('2024-01-01', 101, 'iPhone 13 Pro', 1000.00),
('2024-01-01', 102, 'iPhone 13 Pro Max', 1200.00),
('2024-01-02', 101, 'iPhone 13 Pro', 950.00),
('2024-01-02', 103, 'iPhone 12 Pro', 1100.00),
('2024-01-03', 102, 'iPhone 13 Pro Max', 1250.00),
('2024-01-03', 104, 'iPhone 11', 1400.00),
('2024-01-04', 101, 'iPhone 13 Pro', 800.00),
('2024-01-04', 102, 'iPhone 13 Pro Max', 1350.00),
('2024-01-05', 103, 'iPhone 12 Pro', 1000.00),
('2024-01-05', 104, 'iPhone 11', 700.00),
('2024-01-06', 101, 'iPhone 13 Pro', 600.00),
('2024-01-06', 102, 'iPhone 13 Pro Max', 550.00),
('2024-01-07', 101, 'iPhone 13 Pro', 400.00),
('2024-01-07', 103, 'iPhone 12 Pro', 250.00),
('2024-01-08', 102, 'iPhone 13 Pro Max', 200.00),
Learnings
• Window Functions: Using SUM() with OVER() to compute the running total.
• Partitioning: Using PARTITION BY to calculate running totals separately for each
product_id.
• Ordering: Ensuring the correct sequence of calculations using ORDER BY within the
OVER() clause.
• Date Handling: Managing the date ordering for running totals.
Solutions
• PostgreSQL solution
SELECT
date,
product_id,
product_name,
revenue,
SUM(revenue) OVER (PARTITION BY product_id ORDER BY date) AS running_total
FROM orders
ORDER BY product_id, date;
• MySQL solution
SELECT
date,
product_id,
product_name,
revenue,
SUM(revenue) OVER (PARTITION BY product_id ORDER BY date) AS running_total
FROM orders
ORDER BY product_id, date;
• Q.1052
Question
Suppose you are given two tables - Orders and Returns.
The Orders table contains information about orders placed by customers, and the Returns
table contains information about returned items.
Design a SQL query to find the top 5 customers with the highest percentage of returned items
out of their total orders.
Return the customer ID and the percentage of returned items rounded to two decimal places.
Explanation
To solve this:
• Join the orders and returns tables on order_id.
• Calculate the total items ordered and the total returned items for each customer.
Learnings
• Joining Tables: Using JOIN to combine data from two tables based on common order_id.
• Aggregating Data: Using SUM() to calculate the total items ordered and returned for each
customer.
Solutions
• PostgreSQL solution
SELECT
o.customer_id,
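-- Multiplying by 100.0 first avoids integer division, which would truncate the ratio to 0.
ROUND((COALESCE(SUM(r.returned_items), 0) * 100.0 / SUM(o.total_items_ordered))::numeric, 2) AS return_percentage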
FROM orders o
LEFT JOIN returns r ON o.order_id = r.order_id
GROUP BY o.customer_id
ORDER BY return_percentage DESC
LIMIT 5;
• MySQL solution
SELECT
o.customer_id,
ROUND((IFNULL(SUM(r.returned_items), 0) / SUM(o.total_items_ordered)) * 100, 2) AS return_percentage
FROM orders o
LEFT JOIN returns r ON o.order_id = r.order_id
GROUP BY o.customer_id
ORDER BY return_percentage DESC
LIMIT 5;
• Q.1053
Question
Write a SQL query to report the sum of all total investment values in 2016 (tiv_2016) for all
policyholders who:
• Have the same tiv_2015 value as one or more other policyholders.
• Are not located in the same city as any other policyholder (i.e., the (lat, lon) attribute
pairs must be unique).
The result should be rounded to two decimal places.
Explanation
• Step 1: Identify policyholders with the same tiv_2015 value as other policyholders. This
can be done by using a GROUP BY on tiv_2015 and filtering with HAVING COUNT(*) > 1 to
only include tiv_2015 values that appear more than once.
• Step 2: Ensure that the policyholders are not located in the same city. This means that the
(lat, lon) pair must be unique. We can use GROUP BY on (lat, lon) and filter with
HAVING COUNT(*) = 1.
• Step 3: Join these two criteria and calculate the sum of tiv_2016 over every pid that
satisfies both conditions.
• Step 4: Use ROUND() to round the resulting sum to two decimal places.
tiv_2016 FLOAT,
lat FLOAT,
lon FLOAT
);
-- Sample data
INSERT INTO Insurance (pid, tiv_2015, tiv_2016, lat, lon) VALUES
(1, 10, 5, 10, 10),
(2, 20, 20, 20, 20),
(3, 10, 30, 20, 20),
(4, 10, 40, 40, 40);
Learnings
• Using GROUP BY and HAVING: We use GROUP BY to group by columns like tiv_2015 and
(lat, lon), and HAVING to apply conditions like COUNT(*) > 1 for duplicates.
• Joins: In this case, we need to check both the same tiv_2015 value and the unique (lat,
lon) pair, which may require self-joins or filtering after grouping.
• Rounding: We use the ROUND() function to round the tiv_2016 to two decimal places.
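To see the two filters working together on the sample data, here is a minimal sketch in Python with the standard-library sqlite3 module. The column types for pid and tiv_2015 are assumptions, since only part of the table definition is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column types; only tiv_2016, lat and lon are declared in the visible schema.
conn.execute(
    "CREATE TABLE Insurance (pid INTEGER, tiv_2015 REAL, tiv_2016 REAL, lat REAL, lon REAL)"
)
conn.executemany(
    "INSERT INTO Insurance VALUES (?, ?, ?, ?, ?)",
    [(1, 10, 5, 10, 10), (2, 20, 20, 20, 20), (3, 10, 30, 20, 20), (4, 10, 40, 40, 40)],
)
total = conn.execute(
    """
    SELECT ROUND(SUM(i.tiv_2016), 2)
    FROM Insurance i
    JOIN (SELECT tiv_2015 FROM Insurance GROUP BY tiv_2015 HAVING COUNT(*) > 1) d
      ON i.tiv_2015 = d.tiv_2015                  -- tiv_2015 shared with someone else
    JOIN (SELECT lat, lon FROM Insurance GROUP BY lat, lon HAVING COUNT(*) = 1) u
      ON i.lat = u.lat AND i.lon = u.lon          -- location used by nobody else
    """
).fetchone()[0]
print(total)
```

Policyholders 1 and 4 pass both filters (5 + 40). Policyholder 2 has a unique tiv_2015, and policyholder 3 shares its location with policyholder 2.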
Solutions
• PostgreSQL solution
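-- tiv_2016 is FLOAT; PostgreSQL's two-argument ROUND() needs a numeric value, hence the cast.
SELECT ROUND(SUM(i.tiv_2016)::numeric, 2) AS total_tiv_2016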
FROM Insurance i
JOIN (
SELECT tiv_2015
FROM Insurance
GROUP BY tiv_2015
HAVING COUNT(*) > 1
) dup_tiv ON i.tiv_2015 = dup_tiv.tiv_2015
JOIN (
SELECT lat, lon
FROM Insurance
GROUP BY lat, lon
HAVING COUNT(*) = 1
) unique_loc ON i.lat = unique_loc.lat AND i.lon = unique_loc.lon;
• MySQL solution
SELECT ROUND(SUM(i.tiv_2016), 2) AS total_tiv_2016
FROM Insurance i
JOIN (
SELECT tiv_2015
FROM Insurance
GROUP BY tiv_2015
HAVING COUNT(*) > 1
) dup_tiv ON i.tiv_2015 = dup_tiv.tiv_2015
JOIN (
SELECT lat, lon
FROM Insurance
GROUP BY lat, lon
HAVING COUNT(*) = 1
) unique_loc ON i.lat = unique_loc.lat AND i.lon = unique_loc.lon;
• Q.1054
Question
Write a SQL query to fix the names in the Users table so that only the first character is
uppercase and the rest are lowercase.
Return the result table ordered by user_id.
Explanation
-- Sample data
INSERT INTO Users (user_id, name) VALUES
(1, 'aLice'),
(2, 'bOB');
Learnings
• String Manipulation: Using UPPER() and LOWER() to manipulate string data in SQL.
• Concatenation: Using the CONCAT() function (or a similar method) to merge strings, in
this case, for fixing the name format.
• Ordering Results: Using ORDER BY to sort the result by user_id.
Solutions
• PostgreSQL solution
SELECT
user_id,
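-- INITCAP(name) would also capitalize every later word, so build the result explicitly.
CONCAT(UPPER(LEFT(name, 1)), LOWER(SUBSTRING(name FROM 2))) AS name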
FROM Users
ORDER BY user_id;
• MySQL solution
SELECT
user_id,
CONCAT(UPPER(SUBSTRING(name, 1, 1)), LOWER(SUBSTRING(name, 2))) AS name
FROM Users
ORDER BY user_id;
Explanation
• Condition Matching: We need to filter the rows where the conditions column contains a
code starting with the prefix DIAB1.
• String Matching: The LIKE operator can be used to check if the conditions column
contains DIAB1 at the beginning of any condition code.
• Handling Multiple Conditions: Since the conditions column can contain multiple codes
separated by spaces, we need to ensure that we correctly identify whether DIAB1 appears as
part of any condition.
-- Sample data
INSERT INTO Patients (patient_id, patient_name, conditions) VALUES
(1, 'Daniel', 'YFEV COUGH'),
(2, 'Alice', ''),
(3, 'Bob', 'DIAB100 MYOP'),
(4, 'George', 'ACNE DIAB100'),
(5, 'Alain', 'DIAB201');
Learnings
• String Matching: Using LIKE to search for specific patterns in text data.
• Handling Multiple Words in a String: Since conditions are space-separated, LIKE can
match a prefix within the string.
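Whether a pattern matches a code prefix or any substring is easy to test. The sketch below uses Python's standard-library sqlite3 module with the sample patients plus one hypothetical row (patient 6, an invented code in which DIAB1 appears in the middle). The helper ids() is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Patients (patient_id INTEGER, patient_name TEXT, conditions TEXT)")
conn.executemany(
    "INSERT INTO Patients VALUES (?, ?, ?)",
    [
        (1, "Daniel", "YFEV COUGH"),
        (2, "Alice", ""),
        (3, "Bob", "DIAB100 MYOP"),
        (4, "George", "ACNE DIAB100"),
        (5, "Alain", "DIAB201"),
        (6, "Eve", "SADIAB100"),  # hypothetical row: DIAB1 is not at the start of the code
    ],
)

def ids(where):
    """Return the patient_ids matched by a WHERE condition (illustrative helper)."""
    sql = f"SELECT patient_id FROM Patients WHERE {where} ORDER BY patient_id"
    return [row[0] for row in conn.execute(sql)]

anywhere = ids("conditions LIKE '%DIAB1%'")
prefix_only = ids("conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%'")
print(anywhere, prefix_only)
```

The substring pattern also returns patient 6, while the two prefix patterns (start of string, or right after a space) return only patients 3 and 4.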
Solutions
• PostgreSQL solution
SELECT
patient_id,
patient_name,
conditions
FROM Patients
WHERE conditions LIKE 'DIAB1%'
OR conditions LIKE '% DIAB1%';
• MySQL solution
SELECT
patient_id,
patient_name,
conditions
FROM Patients
WHERE conditions LIKE 'DIAB1%'
OR conditions LIKE '% DIAB1%';
Explanation:
• LIKE 'DIAB1%': Matches rows where the first code in conditions starts with DIAB1.
• LIKE '% DIAB1%': Matches rows where a later code starts with DIAB1. The leading space
ensures that DIAB1 begins a code; a bare '%DIAB1%' would also match a code that merely
contains DIAB1 somewhere in the middle.
Both solutions are identical since the query uses only standard LIKE patterns. The result
will contain all the patients with at least one condition code starting with DIAB1.
• Q.1056
Question
Write a SQL query to delete all duplicate emails from the Person table, keeping only the one
with the smallest id.
Explanation
• Identifying Duplicates: Duplicate emails can be identified by grouping the email column
and selecting the minimum id for each email.
• Deleting Duplicates: After identifying the smallest id for each email, we need to delete
rows where the id is greater than the minimum id for that email.
To do this, we can use a DELETE statement with a subquery. The subquery will select the
id of the rows that are not the smallest id for each email, and those rows will be deleted.
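The DELETE-with-subquery idea can be exercised end to end with Python's standard-library sqlite3 module. The email addresses below are hypothetical placeholders, and the statement is run on SQLite here, so this is a sketch of the logic rather than a test of PostgreSQL or MySQL behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, email TEXT)")
# Hypothetical addresses: ids 1 and 3 share the same email.
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")],
)
# Keep the smallest id per email and delete every other row.
conn.execute(
    "DELETE FROM Person WHERE id NOT IN (SELECT MIN(id) FROM Person GROUP BY email)"
)
remaining = conn.execute("SELECT id, email FROM Person ORDER BY id").fetchall()
print(remaining)
```

Row 3 is removed because it repeats the email of row 1, which has the smaller id.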
-- Sample data
INSERT INTO Person (id, email) VALUES
(1, '[email protected]'),
(2, '[email protected]'),
(3, '[email protected]');
Learnings
• DELETE with a Subquery: A common approach to deleting rows based on conditions
related to other rows in the same table.
• GROUP BY and MIN(): To find the smallest id for each duplicate email, GROUP BY is
used with the MIN() function.
• Subquery: Used to find all id values that should be retained, and delete all others.
Solutions
• PostgreSQL/MySQL solution
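-- The extra derived table lets MySQL run this; MySQL rejects a subquery that reads
-- directly from the table being deleted from (error 1093). PostgreSQL accepts either form.
DELETE FROM Person
WHERE id NOT IN (
SELECT min_id
FROM (
SELECT MIN(id) AS min_id
FROM Person
GROUP BY email
) AS keep_ids
);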
Explanation:
• Subquery:
• SELECT MIN(id) FROM Person GROUP BY email finds the smallest id for each unique
email.
• Main Query:
• DELETE FROM Person WHERE id NOT IN (...) deletes rows whose id is not the
smallest for their email.
This query will keep only the row with the smallest id for each email and delete all other
rows with the same email.
• Q.1057
Question
Write a SQL query to convert the first letter of each word in content_text to uppercase,
while keeping the rest of the letters lowercase. Return the original text and the modified text
in two columns.
Explanation
• Splitting the Text into Words: We need to split the content_text into individual words,
process each word to capitalize the first letter, and then reassemble the sentence.
• Uppercase and Lowercase: The function UPPER(LEFT(word, 1)) is used to capitalize
the first letter of each word, while LOWER(RIGHT(word, LENGTH(word)-1)) ensures that the
rest of the word is in lowercase.
• Reassemble the Sentence: After modifying each word, the words should be concatenated
back together with a space separator.
• Final Output: The result will have two columns—one with the original content and the
other with the modified content.
-- Sample data
INSERT INTO user_content
(
content_id,
customer_id,
content_type,
content_text
)
VALUES
(1, 2, 'comment', 'hello world! this is a TEST.'),
(2, 8, 'comment', 'what a great day'),
(3, 4, 'comment', 'WELCOME to the event.'),
(4, 2, 'comment', 'e-commerce is booming.'),
(5, 6, 'comment', 'Python is fun!!'),
(6, 6, 'review', '123 numbers in text.'),
(7, 10, 'review', 'special chars: @#$$%^&*()'),
(8, 4, 'comment', 'multiple CAPITALS here.'),
(9, 6, 'review', 'sentence. and ANOTHER sentence!'),
(10, 2, 'post', 'goodBYE!');
Learnings
• String Manipulation: Using UPPER() and LOWER() functions to adjust case.
• Text Processing: Handling text split and reassembly using functions like
STRING_TO_ARRAY(), UNNEST(), and STRING_AGG().
• Aggregating Data: Using STRING_AGG() to concatenate processed words back together.
Solutions
• PostgreSQL Solution
WITH t1 AS (
SELECT
content_id,
content_text AS original_content,
w.word,
w.pos
FROM user_content
-- WITH ORDINALITY records each word's position so the sentence can be rebuilt in order.
CROSS JOIN LATERAL UNNEST(STRING_TO_ARRAY(content_text, ' ')) WITH ORDINALITY AS w(word, pos)
),
t2 AS (
SELECT
content_id,
original_content,
STRING_AGG(
CONCAT(UPPER(LEFT(word, 1)), LOWER(RIGHT(word, LENGTH(word)-1))),
' ' ORDER BY pos
) AS modified_content
FROM t1
GROUP BY content_id, original_content
)
SELECT
original_content,
modified_content
FROM t2
ORDER BY content_id;
• MySQL Solution
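-- MySQL has no STRING_TO_ARRAY or UNNEST, so split words with SUBSTRING_INDEX over a
-- generated sequence (recursive CTE, MySQL 8+; assumes at most 100 words per row).
WITH RECURSIVE seq AS (
SELECT 1 AS n UNION ALL SELECT n + 1 FROM seq WHERE n < 100
),
words AS (
SELECT c.content_id, c.content_text, s.n,
SUBSTRING_INDEX(SUBSTRING_INDEX(c.content_text, ' ', s.n), ' ', -1) AS word
FROM user_content c
JOIN seq s
ON s.n <= 1 + CHAR_LENGTH(c.content_text) - CHAR_LENGTH(REPLACE(c.content_text, ' ', ''))
)
SELECT
content_text AS original_content,
GROUP_CONCAT(
CONCAT(UPPER(LEFT(word, 1)), LOWER(SUBSTRING(word, 2)))
ORDER BY n SEPARATOR ' '
) AS modified_content
FROM words
GROUP BY content_id, content_text
ORDER BY content_id;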
• Q.1058
Question
You are given two tables:
• Transactions – Stores transaction details (ID, customer ID, date, and amount).
• Customers – Stores customer details (ID and name).
Find the average transaction amount for each customer who made more than 5 transactions in
September 2023.
Explanation
To solve this, we need to:
• Filter transactions that occurred in September 2023.
• Count the number of transactions per customer during that period.
• Find customers with more than 5 transactions in September 2023.
• Calculate the average transaction amount for those customers.
Learnings
• Use COUNT() with GROUP BY to count the number of transactions per customer.
• Filter transactions within the month of September 2023 using WHERE and BETWEEN.
• Use HAVING to filter customers with more than 5 transactions.
• Join the Transactions and Customers tables to get the customer's name along with their
average transaction amount.
Solutions
PostgreSQL Solution:
SELECT c.customer_name, AVG(t.amount) AS average_amount
FROM Transactions t
JOIN Customers c ON t.customer_id = c.customer_id
WHERE t.transaction_date BETWEEN '2023-09-01' AND '2023-09-30'
GROUP BY c.customer_id, c.customer_name -- customer_id keeps customers who share a name apart
HAVING COUNT(t.transaction_id) > 5;
MySQL Solution:
SELECT c.customer_name, AVG(t.amount) AS average_amount
FROM Transactions t
JOIN Customers c ON t.customer_id = c.customer_id
WHERE t.transaction_date BETWEEN '2023-09-01' AND '2023-09-30'
GROUP BY c.customer_id, c.customer_name -- customer_id keeps customers who share a name apart
HAVING COUNT(t.transaction_id) > 5;
• Q.1059
Question
Write an SQL query to find all dates' ID with a higher temperature compared to its previous
date (yesterday).
Explanation
To solve this:
• Use the LAG() window function to get the temperature from the previous day.
• Compare the current day's temperature with the previous day's temperature.
• Filter the results where the current day's temperature is greater than the previous day's
temperature.
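One case worth checking is a gap in the dates: LAG() returns the previous row, which is not necessarily yesterday. The sketch below, in Python with the standard-library sqlite3 module (SQLite 3.25 or later), uses hypothetical readings with a missing 2015-01-03 and adds a date check as an assumption about what "yesterday" should mean. SQLite's date(x, '+1 day') stands in for PostgreSQL or MySQL interval arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (id INTEGER, recorddate TEXT, temperature INTEGER)")
# Hypothetical readings; there is no row for 2015-01-03.
conn.executemany(
    "INSERT INTO weather VALUES (?, ?, ?)",
    [(1, "2015-01-01", 10), (2, "2015-01-02", 25), (3, "2015-01-04", 30), (4, "2015-01-05", 20)],
)
warmer_ids = [
    row[0]
    for row in conn.execute(
        """
        WITH w AS (
            SELECT id, recorddate, temperature,
                   LAG(temperature) OVER (ORDER BY recorddate) AS prev_temp,
                   LAG(recorddate) OVER (ORDER BY recorddate) AS prev_date
            FROM weather
        )
        SELECT id FROM w
        WHERE temperature > prev_temp
          AND recorddate = date(prev_date, '+1 day')  -- previous row must be yesterday
        ORDER BY id
        """
    )
]
print(warmer_ids)
```

Row 3 is warmer than the row before it, but that row is two days earlier, so only row 2 qualifies.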
Learnings
• LAG() window function allows you to access the previous row's value.
• WHERE clause is used to filter for cases where today's temperature is greater than
yesterday's.
• ORDER BY within OVER() ensures the data is ordered by date to properly compare
consecutive rows.
Solutions
PostgreSQL Solution:
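WITH weather_data AS (
SELECT *,
LAG(temperature, 1) OVER(ORDER BY recorddate) AS prev_day_temp,
LAG(recorddate, 1) OVER(ORDER BY recorddate) AS prev_date
FROM weather
)
SELECT id
FROM weather_data
WHERE temperature > prev_day_temp
AND recorddate = prev_date + INTERVAL '1 day'; -- the previous row must really be yesterday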
MySQL Solution:
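WITH weather_data AS (
SELECT *,
LAG(temperature, 1) OVER(ORDER BY recorddate) AS prev_day_temp,
LAG(recorddate, 1) OVER(ORDER BY recorddate) AS prev_date
FROM weather
)
SELECT id
FROM weather_data
WHERE temperature > prev_day_temp
AND recorddate = prev_date + INTERVAL 1 DAY; -- the previous row must really be yesterday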
• Q.1060
Question
Write an SQL query that reports for every date within at most 90 days from today, the
number of users that logged in for the first time on that date. Assume today is 2019-06-30.
Note that we only care about dates with non-zero user count.
Explanation
We need to identify users who logged in for the first time on a given date within the last 90
days from today (2019-06-30). This can be achieved by checking the earliest "login" date for
each user and counting users who logged in on each distinct date.
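A runnable sketch of this approach, in Python with the standard-library sqlite3 module. The activity rows are hypothetical, and SQLite's date('2019-06-30', '-90 days') is used here to compute the start of the 90-day window (2019-04-01).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE traffic (user_id INTEGER, activity TEXT, activity_date TEXT)")
# Hypothetical activity log.
conn.executemany(
    "INSERT INTO traffic VALUES (?, ?, ?)",
    [
        (1, "login", "2019-05-01"),
        (1, "login", "2019-06-21"),
        (2, "login", "2019-06-21"),
        (3, "login", "2019-01-01"),   # first login is more than 90 days back
        (3, "login", "2019-06-21"),
        (4, "login", "2019-06-21"),
        (5, "logout", "2019-03-01"),  # not a login, so it is ignored
        (5, "login", "2019-06-21"),
    ],
)
first_login_counts = conn.execute(
    """
    WITH first_logins AS (
        SELECT user_id, MIN(activity_date) AS first_login_date
        FROM traffic
        WHERE activity = 'login'
        GROUP BY user_id
    )
    SELECT first_login_date, COUNT(*) AS user_count
    FROM first_logins
    WHERE first_login_date BETWEEN date('2019-06-30', '-90 days') AND '2019-06-30'
    GROUP BY first_login_date
    ORDER BY first_login_date
    """
).fetchall()
print(first_login_counts)
```

The window filter is applied after MIN(), so user 3, whose first login predates the window, is not counted on 2019-06-21. Dates with no first-time logins never form a group, so they do not appear in the output.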
Learnings
• Identifying first-time logins requires filtering out older activities and only considering the
earliest "login" event.
• Use GROUP BY and MIN() to extract the first login date for each user.
• Use COUNT() and HAVING to count non-zero results.
Solutions
PostgreSQL Solution:
WITH first_logins AS (
SELECT user_id, MIN(activity_date) AS first_login_date
FROM traffic
WHERE activity = 'login'
GROUP BY user_id
)
SELECT first_login_date, COUNT(user_id) AS user_count
FROM first_logins
WHERE first_login_date BETWEEN DATE '2019-06-30' - INTERVAL '90 days' AND DATE '2019-06-30'
GROUP BY first_login_date
HAVING COUNT(user_id) > 0
ORDER BY first_login_date;
MySQL Solution:
WITH first_logins AS (
SELECT user_id, MIN(activity_date) AS first_login_date
FROM traffic
WHERE activity = 'login'
GROUP BY user_id
)
SELECT first_login_date, COUNT(user_id) AS user_count
FROM first_logins
WHERE first_login_date BETWEEN DATE_SUB('2019-06-30', INTERVAL 90 DAY) AND '2019-06-30'
GROUP BY first_login_date
HAVING COUNT(user_id) > 0
ORDER BY first_login_date;
Stay Connected:
• LinkedIn: N. H.
• GitHub: 1000 SQL Questions & Answers
• Instagram: @Zero_Analyst
• YouTube: Zero Analyst
Acknowledgments:
A heartfelt thank you to every reader, student, and follower who has been part of my journey.
Your dedication to learning and growing fuels my passion for creating resources like this
book.