LEETCODE SQL INTERVIEW QUESTIONS & SOLUTIONS
Table: Products
product_id low_fats recyclable
0 Y N
1 Y Y
2 N Y
3 Y Y
4 N N
product_id is the primary key (column with unique values) for this table.
low_fats is an ENUM (category) of type ('Y', 'N') where 'Y' means this product is low fat and 'N' means it is not.
recyclable is an ENUM (category) of type ('Y', 'N') where 'Y' means this product is recyclable and 'N' means it is not.
Write a solution to find the ids of products that are both low fat and recyclable.
Output:
+------------+
| product_id |
+------------+
| 1          |
| 3          |
+------------+
Solution
SELECT product_id FROM Products WHERE low_fats = 'Y' AND recyclable = 'Y';
1 Will null
2 Jane null
3 Alex 2
4 Bill null
5 Zack 1
6 Mark 2
Each row of this table indicates the id of a customer, their name, and the id of the customer who referred them.
Find the names of the customers who were not referred by the customer with id = 2.
Output:
+------+
| name |
+------+
| Will |
| Jane |
| Bill |
| Zack |
+------+
Solution
SELECT name FROM Customer WHERE referee_id IS NULL OR referee_id != 2;
Each row of this table gives information about the name of a country, the continent to which it belongs, its area, the
population, and its GDP value.
Write a solution to find the name, population, and area of the big countries.
Output:
Solution
SELECT name, population, area FROM World WHERE area >= 3000000 OR population >= 25000000;
1 3 5 2019-08-01
1 3 6 2019-08-02
2 7 7 2019-08-01
2 7 6 2019-08-02
4 7 1 2019-07-22
3 4 4 2019-07-21
3 4 4 2019-07-21
There is no primary key (column with unique values) for this table; it may have duplicate rows.
Each row of this table indicates that some viewer viewed an article (written by some author) on some date.
Note that equal author_id and viewer_id indicate the same person.
Write a solution to find all the authors that viewed at least one of their own articles.
Output:
+------+
| id   |
+------+
| 4    |
| 7    |
+------+
Solution
SELECT DISTINCT author_id AS id FROM Views WHERE author_id = viewer_id ORDER BY id;
tweet_id is the primary key (column with unique values) for this table.
Write a solution to find the IDs of the invalid tweets. The tweet is invalid if the number of characters used
in the content of the tweet is strictly greater than 15.
Output:
+----------+
| tweet_id |
+----------+
| 2        |
+----------+
Solution
SELECT tweet_id FROM Tweets WHERE CHAR_LENGTH(content) > 15;
id is the primary key (column with unique values) for this table.
Each row of this table contains the id and the name of an employee in a company.
EmployeeUNI table:
+----+-----------+
| id | unique_id |
+----+-----------+
| 3  | 1         |
| 11 | 2         |
| 90 | 3         |
+----+-----------+
(id, unique_id) is the primary key (combination of columns with unique values) for this table.
Each row of this table contains the id and the corresponding unique id of an employee in the company.
Write a solution to show the unique ID of each user. If a user does not have a unique ID, show null
instead.
Output:
+-----------+----------+
| unique_id | name     |
+-----------+----------+
| null      | Alice    |
| null      | Bob      |
| 2         | Meir     |
| 3         | Winston  |
| 1         | Jonathan |
+-----------+----------+
Solution
SELECT unique_id, name FROM Employees LEFT JOIN EmployeeUNI USING(id);
Explanation:
Use LEFT JOIN to ensure that all employees from the Employees table are included in the result, even if they
don't have a corresponding unique_id in the EmployeeUNI table.
If an employee doesn’t have a matching id in EmployeeUNI, unique_id will be NULL.
7) Find the product sales analysis
Sales table:
(sale_id, year) is the primary key (combination of columns with unique values) of this table.
Each row of this table shows a sale on the product product_id in a certain year.
Product table:
product_id product_name
100 Nokia
200 Apple
300 Samsung
product_id is the primary key (column with unique values) of this table.
Each row of this table indicates the product name of each product.
Write a solution to report the product_name, year, and price for each sale_id in the Sales table.
Output:
Solution
SELECT p.product_name, s.year, s.price FROM Sales s JOIN Product p ON s.product_id = p.product_id;
Table: Visits
visit_id customer_id
1 23
2 9
4 30
5 54
6 96
7 54
8 54
This table contains information about the customers who visited the mall.
Table: Transactions
2 5 310
3 5 300
9 5 200
12 1 910
13 2 970
This table contains information about the transactions made during the visit_id.
Write a solution to find the IDs of the users who visited without making any transactions and the number of
times they made these types of visits.
Output:
customer_id count_no_trans
54 2
30 1
96 1
Solution
SELECT v.customer_id,
COUNT(v.visit_id) AS count_no_trans
FROM Visits v
LEFT JOIN Transactions t USING(visit_id)
WHERE t.transaction_id IS NULL
GROUP BY v.customer_id;
Explanation:
Customer with id = 23 visited the mall once and made one transaction during the visit with id = 12.
Customer with id = 9 visited the mall once and made one transaction during the visit with id = 13.
Customer with id = 30 visited the mall once and did not make any transactions.
Customer with id = 54 visited the mall three times. During 2 visits they did not make any transactions, and during
one visit they made 3 transactions.
Customer with id = 96 visited the mall once and did not make any transactions.
As we can see, customers with IDs 30 and 96 each visited the mall once without making any transactions. Customer 54
made two visits without any transactions.
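The LEFT JOIN ... IS NULL pattern above (an anti-join) can be checked against the sample rows. The following is a minimal sketch using Python's built-in sqlite3 module rather than MySQL; the amount column name for the third Transactions column is an assumption, since the table header is not shown.

```python
import sqlite3

# In-memory SQLite database; the guide's queries target MySQL,
# so this only checks the join logic, not MySQL-specific behavior.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Visits (visit_id INTEGER PRIMARY KEY, customer_id INTEGER);
    -- 'amount' is an assumed column name for the third column.
    CREATE TABLE Transactions (transaction_id INTEGER PRIMARY KEY,
                               visit_id INTEGER, amount INTEGER);
    INSERT INTO Visits VALUES (1, 23), (2, 9), (4, 30), (5, 54),
                              (6, 96), (7, 54), (8, 54);
    INSERT INTO Transactions VALUES (2, 5, 310), (3, 5, 300), (9, 5, 200),
                                    (12, 1, 910), (13, 2, 970);
""")

# Visits with no matching transaction keep NULLs on the right side of the
# LEFT JOIN; filtering on those NULLs leaves only the transaction-free visits.
query = """
    SELECT v.customer_id, COUNT(v.visit_id) AS count_no_trans
    FROM Visits v
    LEFT JOIN Transactions t ON v.visit_id = t.visit_id
    WHERE t.transaction_id IS NULL
    GROUP BY v.customer_id
"""
no_trans = dict(conn.execute(query).fetchall())
print(no_trans)
```

Customers 23 and 9 drop out because each of their visits matches at least one transaction row.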
id recordDate temperature
1 2015-01-01 10
2 2015-01-02 25
3 2015-01-03 20
4 2015-01-04 30
Write a solution to find the id of every date whose temperature is higher than that of the previous date (yesterday).
Output:
+----+
| id |
+----+
| 2  |
| 4  |
+----+
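Solution
SELECT w1.id FROM Weather w1 JOIN Weather w2 ON DATEDIFF(w1.recordDate, w2.recordDate) = 1 WHERE w1.temperature > w2.temperature;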
Explanation:
On 2015-01-02, the temperature was higher than the previous day (10 -> 25).
On 2015-01-04, the temperature was higher than the previous day (20 -> 30).
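One way to compare each day with the day before is a self-join on the date. The sketch below is an illustration in Python's sqlite3, not the MySQL solution: the table name Weather is an assumption, and SQLite's date(x, '-1 day') stands in for MySQL date arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table name 'Weather' is assumed; the columns come from the sample rows.
    CREATE TABLE Weather (id INTEGER PRIMARY KEY, recordDate TEXT, temperature INTEGER);
    INSERT INTO Weather VALUES (1, '2015-01-01', 10), (2, '2015-01-02', 25),
                               (3, '2015-01-03', 20), (4, '2015-01-04', 30);
""")

# Pair every row with the row dated exactly one day earlier,
# then keep the pairs where the later day is warmer.
query = """
    SELECT today.id
    FROM Weather today
    JOIN Weather yesterday
      ON yesterday.recordDate = date(today.recordDate, '-1 day')
    WHERE today.temperature > yesterday.temperature
    ORDER BY today.id
"""
warmer_ids = [row[0] for row in conn.execute(query)]
print(warmer_ids)
```

Joining on the date rather than on id - 1 matters because ids are not guaranteed to follow calendar order or to be gap-free.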
Table: Activity
machine_id process_id activity_type timestamp
0 0 start 0.712
0 0 end 1.520
0 1 start 3.140
0 1 end 4.120
1 0 start 0.550
1 0 end 1.550
1 1 start 0.430
1 1 end 1.420
2 0 start 4.100
2 0 end 4.512
2 1 start 2.500
2 1 end 5.000
(machine_id, process_id, activity_type) is the primary key (combination of columns with unique values) of this table.
'start' means the machine starts the process at the given timestamp and 'end' means the machine ends the process
at the given timestamp.
The 'start' timestamp will always be before the 'end' timestamp for every (machine_id, process_id) pair.
It is guaranteed that each (machine_id, process_id) pair has a 'start' and 'end' timestamp.
There is a factory website that has several machines each running the same number of processes.
Write a solution to find the average time each machine takes to complete a process. The time to complete a
process is the 'end' timestamp minus the 'start' timestamp. The average time is calculated by the total time to
complete every process on the machine divided by the number of processes that were run. The resulting table should
have the machine_id along with the average time as processing_time, which should be rounded to 3 decimal
places.
Output:
+------------+-----------------+
| machine_id | processing_time |
+------------+-----------------+
| 0          | 0.894           |
| 1          | 0.995           |
| 2          | 1.456           |
+------------+-----------------+
Solution
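WITH ProcessTime AS
  (SELECT machine_id,
          process_id,
          MAX(CASE WHEN activity_type = 'end' THEN timestamp END) -
          MAX(CASE WHEN activity_type = 'start' THEN timestamp END) AS process_time
   FROM Activity
   GROUP BY machine_id, process_id)
SELECT machine_id,
       ROUND(AVG(process_time), 3) AS processing_time
FROM ProcessTime
GROUP BY machine_id;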
Explanation:
We extract the start and end timestamps for each (machine_id, process_id) pair.
Using MAX(CASE WHEN activity_type = 'end' THEN timestamp END), we get the end timestamp.
Using MAX(CASE WHEN activity_type = 'start' THEN timestamp END), we get the start timestamp.
The time taken for each process is calculated as end timestamp - start timestamp.
We use AVG(process_time) grouped by machine_id to compute the average processing time per machine.
The result is rounded to 3 decimal places using ROUND().
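The steps above can be sketched end to end on the sample rows. This is a check in Python's sqlite3 under the assumption that the conditional-aggregation CTE described in the explanation is the intended approach; SQLite is used only for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Activity (machine_id INTEGER, process_id INTEGER,
                           activity_type TEXT, timestamp REAL);
    INSERT INTO Activity VALUES
        (0, 0, 'start', 0.712), (0, 0, 'end', 1.520),
        (0, 1, 'start', 3.140), (0, 1, 'end', 4.120),
        (1, 0, 'start', 0.550), (1, 0, 'end', 1.550),
        (1, 1, 'start', 0.430), (1, 1, 'end', 1.420),
        (2, 0, 'start', 4.100), (2, 0, 'end', 4.512),
        (2, 1, 'start', 2.500), (2, 1, 'end', 5.000);
""")

# Step 1 (CTE): one row per (machine_id, process_id) with end - start.
# Step 2: average those durations per machine and round to 3 places.
query = """
    WITH ProcessTime AS (
        SELECT machine_id, process_id,
               MAX(CASE WHEN activity_type = 'end' THEN timestamp END)
             - MAX(CASE WHEN activity_type = 'start' THEN timestamp END) AS process_time
        FROM Activity
        GROUP BY machine_id, process_id
    )
    SELECT machine_id, ROUND(AVG(process_time), 3) AS processing_time
    FROM ProcessTime
    GROUP BY machine_id
    ORDER BY machine_id
"""
processing_times = conn.execute(query).fetchall()
for machine_id, avg_time in processing_times:
    print(machine_id, avg_time)
```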
1 John 3 1000
2 Dan 3 2000
4 Thomas 3 4000
Each row of this table indicates the name and the ID of an employee in addition to their salary and the id of their
manager.
Table: Bonus
empId bonus
2 500
4 2000
empId is a foreign key (reference column) to empId from the Employee table.
Each row of this table contains the id of an employee and their respective bonus.
Write a solution to report the name and bonus amount of each employee with a bonus less than 1000.
Output:
+------+-------+
| name | bonus |
+------+-------+
| Brad | null |
| John | null |
| Dan | 500 |
+------+-------+
Solution
SELECT e.name, b.bonus FROM Employee e
LEFT JOIN Bonus b ON e.empId = b.empId WHERE b.bonus < 1000 OR b.bonus IS NULL;
Explanation:
LEFT JOIN: We join the Employee table with the Bonus table on empId. This ensures that all
employees are included, even if they don’t have a bonus.
Table: Students
student_id student_name
1 Alice
2 Bob
13 John
6 Alex
student_id is the primary key (column with unique values) for this table.
Each row of this table contains the ID and the name of one student in the school.
Table: Subjects
subject_name
Math
Physics
Programming
subject_name is the primary key (column with unique values) for this table.
Each row of this table contains the name of one subject in the school.
Table: Examinations
+------------+--------------+
| student_id | subject_name |
+------------+--------------+
| 1          | Math         |
| 1          | Physics      |
| 1          | Programming  |
| 2          | Programming  |
| 1          | Physics      |
| 1          | Math         |
| 13         | Math         |
| 13         | Programming  |
| 13         | Physics      |
| 2          | Math         |
| 1          | Math         |
+------------+--------------+
There is no primary key (column with unique values) for this table. It may contain duplicates.
Each student from the Students table takes every course from the Subjects table.
Each row of this table indicates that a student with ID student_id attended the exam of subject_name.
Write a solution to find the number of times each student attended each exam.
Output:
student_id student_name subject_name attended_exams
1 Alice Math 3
1 Alice Physics 2
1 Alice Programming 1
2 Bob Math 1
2 Bob Physics 0
2 Bob Programming 1
6 Alex Math 0
6 Alex Physics 0
6 Alex Programming 0
13 John Math 1
13 John Physics 1
13 John Programming 1
Solution
SELECT s.student_id,
       s.student_name,
       sub.subject_name,
       COUNT(e.subject_name) AS attended_exams
FROM Students s
CROSS JOIN Subjects sub
LEFT JOIN Examinations e ON s.student_id = e.student_id
AND sub.subject_name = e.subject_name
GROUP BY s.student_id, s.student_name, sub.subject_name
ORDER BY s.student_id, sub.subject_name;
Explanation:
CROSS JOIN:
We first create every combination of students and subjects using CROSS JOIN (no common column is
needed for a CROSS JOIN).
This ensures that even students who haven’t attended any exams still appear in the result.
LEFT JOIN:
We join the generated student-subject combinations with the Examinations table to count the number of times a
student attended an exam for a given subject.
COUNT(e.subject_name):
COUNT ignores NULLs, so a student-subject pair with no matching exam row gets a count of 0.
GROUP BY and ORDER BY:
We group by student_id, student_name, and subject_name to get the correct counts per student and subject.
The final output is sorted by student_id and subject_name to match the required format.
The result table should contain all students and all subjects.
Alice attended the Math exam 3 times, the Physics exam 2 times, and the Programming exam 1 time.
Bob attended the Math exam 1 time, the Programming exam 1 time, and did not attend the Physics exam.
John attended the Math exam 1 time, the Physics exam 1 time, and the Programming exam 1 time.
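To see why CROSS JOIN followed by LEFT JOIN produces the zero rows, the query can be run on the sample data. This is a small sketch in Python's sqlite3; the SQL is the same shape as the solution, and only the harness around it is added here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (student_id INTEGER PRIMARY KEY, student_name TEXT);
    CREATE TABLE Subjects (subject_name TEXT PRIMARY KEY);
    CREATE TABLE Examinations (student_id INTEGER, subject_name TEXT);
    INSERT INTO Students VALUES (1, 'Alice'), (2, 'Bob'), (13, 'John'), (6, 'Alex');
    INSERT INTO Subjects VALUES ('Math'), ('Physics'), ('Programming');
    INSERT INTO Examinations VALUES
        (1, 'Math'), (1, 'Physics'), (1, 'Programming'), (2, 'Programming'),
        (1, 'Physics'), (1, 'Math'), (13, 'Math'), (13, 'Programming'),
        (13, 'Physics'), (2, 'Math'), (1, 'Math');
""")

# CROSS JOIN builds all 4 x 3 student-subject pairs; the LEFT JOIN attaches
# exam rows where they exist, and COUNT skips the NULLs for pairs without any.
query = """
    SELECT s.student_id, s.student_name, sub.subject_name,
           COUNT(e.subject_name) AS attended_exams
    FROM Students s
    CROSS JOIN Subjects sub
    LEFT JOIN Examinations e
           ON s.student_id = e.student_id
          AND sub.subject_name = e.subject_name
    GROUP BY s.student_id, s.student_name, sub.subject_name
    ORDER BY s.student_id, sub.subject_name
"""
attendance = conn.execute(query).fetchall()
for row in attendance:
    print(row)
```

Using COUNT(*) instead of COUNT(e.subject_name) would report 1 for every unmatched pair, because the LEFT JOIN still emits one row for it.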
Each row of this table indicates the name of an employee, their department, and the id of their manager.
Output:
+------+
| name |
+------+
| John |
+------+
Solution
SELECT e.name FROM Employee e JOIN Employee m ON e.id = m.managerId
user_id time_stamp
3 2020-03-21 10:16:13
7 2020-01-04 13:57:59
2 2020-07-29 23:09:44
6 2020-12-09 10:39:37
Each row contains information about the signup time for the user with ID user_id.
Table: Confirmations
user_id time_stamp action
(user_id, time_stamp) is the primary key (combination of columns with unique values) for this table.
Each row of this table indicates that the user with ID user_id requested a confirmation message at time_stamp and
that confirmation message was either confirmed ('confirmed') or expired without confirming ('timeout').
The confirmation rate of a user is the number of 'confirmed' messages divided by the total number of requested
confirmation messages. The confirmation rate of a user that did not request any confirmation messages is 0. Round
the confirmation rate to two decimal places.
Output:
+---------+-------------------+
| user_id | confirmation_rate |
+---------+-------------------+
| 6       | 0.00              |
| 3       | 0.00              |
| 7       | 1.00              |
| 2       | 0.50              |
+---------+-------------------+
Solution
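SELECT user_id,
       ROUND(COALESCE(SUM(IF(c.action = 'confirmed', 1, 0)) / COUNT(c.action), 0), 2) AS confirmation_rate
FROM Signups s LEFT JOIN Confirmations c USING(user_id)
GROUP BY user_id;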
Explanation:
SUM(IF(c.action = 'confirmed', 1, 0)) / COUNT(c.action): The IF expression returns 1 for each 'confirmed' action
and 0 otherwise, so the SUM counts the 'confirmed' actions for each user. Dividing by COUNT(c.action) gives the
confirmation rate.
COALESCE(..., 0): This is crucial. COALESCE handles the case where a user has no entries in the Confirmations
table. In that scenario, COUNT(c.action) is 0 and the division yields NULL in MySQL; COALESCE replaces that
NULL with 0.
ROUND(..., 2): Rounds the rate to two decimal places.
User 6 did not request any confirmation messages. The confirmation rate is 0.
User 3 made 2 requests and both timed out. The confirmation rate is 0.
User 7 made 3 requests and all were confirmed. The confirmation rate is 1.
User 2 made 2 requests where one was confirmed and the other timed out. The confirmation rate is 1 / 2 = 0.5.
id is the primary key (column with unique values) for this table.
Each row contains information about the name of a movie, its genre, and its rating.
Write a solution to report the movies with an odd-numbered ID and a description that is not "boring".
Output:
+----+------------+-------------+--------+
| id | movie      | description | rating |
+----+------------+-------------+--------+
| 5  | House card | Interesting | 9.1    |
| 1  | War        | great 3D    | 8.9    |
+----+------------+-------------+--------+
Solution
SELECT id, movie, description, rating FROM Cinema
WHERE id % 2 = 1 AND description != 'boring' ORDER BY rating DESC;
Table: Prices
product_id start_date end_date price
1 2019-02-17 2019-02-28 5
1 2019-03-01 2019-03-22 20
2 2019-02-01 2019-02-20 15
2 2019-02-21 2019-03-31 30
(product_id, start_date, end_date) is the primary key (combination of columns with unique values) for this table.
Each row of this table indicates the price of the product_id in the period from start_date to end_date.
For each product_id there will be no two overlapping periods. That means there will be no two intersecting periods
for the same product_id.
Table: UnitsSold
product_id purchase_date units
1 2019-02-25 100
1 2019-03-01 15
2 2019-02-10 200
2 2019-03-22 30
Write a solution to find the average selling price for each product. average_price should be rounded to 2
decimal places. If a product does not have any sold units, its average selling price is assumed to be 0.
Output:
+------------+---------------+
| product_id | average_price |
+------------+---------------+
| 1          | 6.96          |
| 2          | 16.96         |
+------------+---------------+
Solution
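SELECT p.product_id,
       ROUND(COALESCE(SUM(p.price * u.units) / SUM(u.units), 0), 2) AS average_price
FROM Prices p
LEFT JOIN UnitsSold u ON p.product_id = u.product_id
AND u.purchase_date BETWEEN p.start_date AND p.end_date
GROUP BY p.product_id;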
Explanation:
Average selling price for product 1 = ((100 * 5) + (15 * 20)) / 115 = 6.96
Average selling price for product 2 = ((200 * 15) + (30 * 30)) / 230 = 16.96
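The arithmetic above is a units-weighted average, which can be reproduced with a range join. The following is a sketch in Python's sqlite3; the column names price, purchase_date and units are assumptions about the missing table headers, and the 1.0 factor is needed only because SQLite divides integers as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names 'price', 'purchase_date' and 'units' are assumed.
    CREATE TABLE Prices (product_id INTEGER, start_date TEXT, end_date TEXT, price INTEGER);
    CREATE TABLE UnitsSold (product_id INTEGER, purchase_date TEXT, units INTEGER);
    INSERT INTO Prices VALUES
        (1, '2019-02-17', '2019-02-28', 5), (1, '2019-03-01', '2019-03-22', 20),
        (2, '2019-02-01', '2019-02-20', 15), (2, '2019-02-21', '2019-03-31', 30);
    INSERT INTO UnitsSold VALUES
        (1, '2019-02-25', 100), (1, '2019-03-01', 15),
        (2, '2019-02-10', 200), (2, '2019-03-22', 30);
""")

# Each sale is matched to the price period that contains its purchase date;
# total revenue divided by total units gives the average selling price.
# COALESCE turns the NULL of a product with no sales into 0.
query = """
    SELECT p.product_id,
           ROUND(COALESCE(SUM(p.price * u.units) * 1.0 / SUM(u.units), 0), 2) AS average_price
    FROM Prices p
    LEFT JOIN UnitsSold u
           ON p.product_id = u.product_id
          AND u.purchase_date BETWEEN p.start_date AND p.end_date
    GROUP BY p.product_id
    ORDER BY p.product_id
"""
average_prices = conn.execute(query).fetchall()
print(average_prices)
```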
Table: Project
project_id employee_id
1 1
1 2
1 3
2 1
2 4
Each row of this table indicates that the employee with employee_id is working on the project with project_id.
Table: Employee
employee_id name experience_years
1 Khaled 3
2 Ali 2
3 John 1
4 Doe 2
employee_id is the primary key of this table. It's guaranteed that experience_years is not NULL.
Write an SQL query that reports the average experience years of all the employees for each project,
rounded to 2 digits.
Output:
+------------+---------------+
| project_id | average_years |
+------------+---------------+
| 1          | 2.00          |
| 2          | 2.50          |
+------------+---------------+
Solution
SELECT p.project_id, ROUND(AVG(e.experience_years), 2) AS average_years FROM Project p
JOIN Employee e ON p.employee_id = e.employee_id GROUP BY p.project_id;
Explanation:
The average experience years for the first project is (3 + 2 + 1) / 3 = 2.00 and for the second project is
(3 + 2) / 2 = 2.50
Table: Users
user_id user_name
6 Alice
2 Bob
7 Alex
user_id is the primary key (column with unique values) for this table.
Each row of this table contains the name and the id of a user.
Table: Register
contest_id user_id
215 6
209 2
208 2
210 6
208 6
209 7
209 6
215 7
208 7
210 2
207 2
210 7
(contest_id, user_id) is the primary key (combination of columns with unique values) for this table.
Each row of this table contains the id of a user and the contest they registered into.
Write a solution to find the percentage of the users registered in each contest rounded to two decimals.
Return the result table ordered by percentage in descending order. In case of a tie, order it by contest_id in
ascending order.
Output:
contest_id percentage
208 100.0
209 100.0
210 100.0
215 66.67
207 33.33
Solution
SELECT contest_id,
       ROUND(COUNT(user_id) * 100 /
             (SELECT COUNT(*) FROM Users), 2) AS percentage
FROM Register
GROUP BY contest_id
ORDER BY percentage DESC, contest_id ASC;
Explanation:
All the users registered in contests 208, 209, and 210. The percentage is 100% and we sort them in the
answer table by contest_id in ascending order.
Alice and Alex registered in contest 215 and the percentage is ((2/3) * 100) = 66.67%
Bob registered in contest 207 and the percentage is ((1/3) * 100) = 33.33%
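The percentages and the tie-break ordering described above can be verified on the sample rows. This is a minimal sketch in Python's sqlite3, assuming the denominator is the total number of rows in Users; 100.0 is used so that SQLite performs floating-point division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE Register (contest_id INTEGER, user_id INTEGER,
                           PRIMARY KEY (contest_id, user_id));
    INSERT INTO Users VALUES (6, 'Alice'), (2, 'Bob'), (7, 'Alex');
    INSERT INTO Register VALUES
        (215, 6), (209, 2), (208, 2), (210, 6), (208, 6), (209, 7),
        (209, 6), (215, 7), (208, 7), (210, 2), (207, 2), (210, 7);
""")

# Registrations per contest divided by the total number of users,
# sorted by percentage (descending) and then contest_id (ascending).
query = """
    SELECT contest_id,
           ROUND(COUNT(user_id) * 100.0 / (SELECT COUNT(*) FROM Users), 2) AS percentage
    FROM Register
    GROUP BY contest_id
    ORDER BY percentage DESC, contest_id ASC
"""
percentages = conn.execute(query).fetchall()
for contest_id, pct in percentages:
    print(contest_id, pct)
```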
Cat Shirazi 5 2
Cat Siamese 3 3
Cat Sphynx 7 4
The rating column has a value from 1 to 5. Query with rating less than 3 is a poor query.
The average of the ratio between query rating and its position.
Output:
Solution
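SELECT query_name,
       ROUND(AVG(rating / position), 2) AS quality,
       ROUND(SUM(IF(rating < 3, 1, 0)) * 100 / COUNT(*), 2) AS poor_query_percentage
FROM Queries
GROUP BY query_name;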
Explanation:
Write an SQL query to find for each month and country, the number of transactions and their total amount,
the number of approved transactions and their total amount.
Output:
month country trans_count approved_count trans_total_amount approved_total_amount
Solution
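SELECT DATE_FORMAT(trans_date, '%Y-%m') AS month,
       country,
       COUNT(trans_date) AS trans_count,
       SUM(IF(state = 'approved', 1, 0)) AS approved_count,
       SUM(amount) AS trans_total_amount,
       SUM(IF(state = 'approved', amount, 0)) AS approved_total_amount
FROM Transactions
GROUP BY month, country;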
Table: Delivery
delivery_id customer_id order_date customer_pref_delivery_date
1 1 2019-08-01 2019-08-02
2 2 2019-08-02 2019-08-02
3 1 2019-08-11 2019-08-12
4 3 2019-08-24 2019-08-24
5 3 2019-08-21 2019-08-22
6 2 2019-08-11 2019-08-13
7 4 2019-08-09 2019-08-09
delivery_id is the column of unique values of this table.
The table holds information about food delivery to customers that make orders at some date and specify a preferred
delivery date (on the same order date or after it).
If the customer's preferred delivery date is the same as the order date, then the order is called immediate;
otherwise, it is called scheduled.
The first order of a customer is the order with the earliest order date that the customer made. It is guaranteed that
a customer has precisely one first order.
Write a solution to find the percentage of immediate orders in the first orders of all customers, rounded to
2 decimal places.
Output:
+----------------------+
| immediate_percentage |
+----------------------+
| 50.00 |
+----------------------+
Solution
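WITH FirstOrders AS
  (SELECT *,
          ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date ASC) AS min_order_date
   FROM Delivery)
SELECT ROUND(COUNT(*) * 100 /
             (SELECT COUNT(DISTINCT customer_id) FROM Delivery), 2) AS immediate_percentage
FROM FirstOrders
WHERE min_order_date = 1 AND order_date = customer_pref_delivery_date;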
Explanation:
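ORDER BY order_date ASC → Assigns a row number to each customer's orders in ascending order of order_date.
min_order_date = 1 → Ensures only the first order of each customer is considered.
order_date = customer_pref_delivery_date → Selects only immediate orders (orders delivered on the same
day).
Calculation: The number of immediate first orders is multiplied by 100, divided by the number of customers (each
customer has exactly one first order), and rounded to 2 decimal places.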
1 2 2016-03-01 5
1 2 2016-03-02 6
2 3 2017-06-25 1
3 1 2016-03-02 0
3 4 2018-07-03 5
(player_id, event_date) is the primary key (combination of columns with unique values) of this table.
Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on
some day using some device.
Write a solution to report the fraction of players that logged in again on the day after the day they first
logged in, rounded to 2 decimal places. In other words, you need to count the number of players that
logged in for at least two consecutive days starting from their first login date, then divide that number by
the total number of players.
Output:
+-----------+
| fraction |
+-----------+
| 0.33 |
+-----------+
Solution
SELECT ROUND(COUNT(DISTINCT player_id) /
(SELECT COUNT(DISTINCT player_id)
FROM Activity), 2) AS fraction
FROM Activity
WHERE (player_id, DATE_SUB(event_date, INTERVAL 1 DAY))
IN
(SELECT player_id, MIN(event_date) AS first_login FROM Activity GROUP BY player_id);
Explanation:
Only the player with id 1 logged back in on the day after their first login, so the answer is 1/3 = 0.33.
Subquery: SELECT player_id, MIN(event_date) AS first_login FROM Activity GROUP BY player_id
This finds the earliest login date (first login) for each player_id.
Main query: SELECT ... FROM Activity WHERE (player_id, DATE_SUB(event_date, INTERVAL 1 DAY)) IN (subquery)
The WHERE condition checks if (player_id, event_date - 1 day) exists in the first login subquery.
DATE_SUB(event_date, INTERVAL 1 DAY) shifts the event_date back by one day.
The IN clause ensures that this new date matches the first login date of that player.
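The tuple-IN check described above can be tried on the sample rows. The sketch below uses Python's sqlite3, with date(event_date, '-1 day') standing in for MySQL's DATE_SUB; the column names device_id and games_played are assumptions about the missing table header.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'device_id' and 'games_played' are assumed column names.
    CREATE TABLE Activity (player_id INTEGER, device_id INTEGER,
                           event_date TEXT, games_played INTEGER,
                           PRIMARY KEY (player_id, event_date));
    INSERT INTO Activity VALUES
        (1, 2, '2016-03-01', 5), (1, 2, '2016-03-02', 6),
        (2, 3, '2017-06-25', 1), (3, 1, '2016-03-02', 0),
        (3, 4, '2018-07-03', 5);
""")

# A login counts when shifting its date back one day lands exactly on that
# player's first login date. Row-value IN needs SQLite 3.15 or newer.
# The 1.0 factor avoids SQLite's integer division.
query = """
    SELECT ROUND(COUNT(DISTINCT player_id) * 1.0 /
                 (SELECT COUNT(DISTINCT player_id) FROM Activity), 2) AS fraction
    FROM Activity
    WHERE (player_id, date(event_date, '-1 day')) IN
          (SELECT player_id, MIN(event_date) FROM Activity GROUP BY player_id)
"""
fraction = conn.execute(query).fetchone()[0]
print(fraction)
```

Player 3 logs in twice but more than two years apart, so only player 1 contributes to the numerator.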
Table: Teacher
teacher_id subject_id dept_id
1 2 3
1 2 4
1 3 3
2 1 1
2 2 1
2 3 1
2 4 1
(subject_id, dept_id) is the primary key (combinations of columns with unique values) of this table.
Each row in this table indicates that the teacher with teacher_id teaches the subject subject_id in the department
dept_id.
Write a solution to calculate the number of unique subjects each teacher teaches in the university.
Output:
teacher_id cnt
1 2
2 4
Solution
SELECT teacher_id, COUNT(DISTINCT subject_id) AS cnt FROM Teacher GROUP BY teacher_id;
Explanation:
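Teacher 1: teaches subjects 2 and 3 (subject 2 in two departments), so cnt = 2.
Teacher 2: teaches subjects 1, 2, 3, and 4, so cnt = 4.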
1 1 2019-07-20 open_session
1 1 2019-07-20 scroll_down
1 1 2019-07-20 end_session
2 4 2019-07-20 open_session
2 4 2019-07-21 send_message
2 4 2019-07-21 end_session
3 2 2019-07-21 open_session
3 2 2019-07-21 send_message
3 2 2019-07-21 end_session
4 3 2019-06-25 open_session
4 3 2019-06-25 end_session
The activity_type column is an ENUM (category) of type ('open_session', 'end_session', 'scroll_down', 'send_message').
The table shows the user activities for a social media website.
Write a solution to find the daily active user count for a period of 30 days ending 2019-07-27 inclusively. A user was
active on some day if they made at least one activity on that day.
Output:
day active_users
2019-07-20 2
2019-07-21 2
Note that we do not care about days with zero active users.
Solution
SELECT activity_date AS DAY, COUNT(DISTINCT user_id) AS active_users FROM Activity
WHERE activity_date BETWEEN '2019-06-28' AND '2019-07-27'
GROUP BY activity_date;
Explanation:
25) Find the sales analysis for the first year of every product sold
Table: Sales
(sale_id, year) is the primary key (combination of columns with unique values) of this table.
Each row of this table shows a sale on the product product_id in a certain year.
Write a solution to select the product id, year, quantity, and price for the first year of every product sold.
Solution
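WITH FirstProductSale AS
  (SELECT product_id, year, quantity, price,
          RANK() OVER (PARTITION BY product_id ORDER BY year ASC) AS sale_rank
   FROM Sales)
SELECT product_id, year AS first_year, quantity, price FROM FirstProductSale WHERE sale_rank = 1;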
Explanation:
1. CTE FirstProductSale:
This CTE calculates the rank for each product's sales based on the year (ascending order),
partitioned by product_id, which helps identify the first sale for each product.
2. sale_rank:
It represents the ranking of the sales for each product by year. The first sale of each product
will have a sale_rank of 1.
3. Final SELECT Statement:
Filters out the first sale of each product (where sale_rank = 1) and returns the relevant columns
(product_id, first_year, quantity, and price).
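The ranking approach in the three steps above can be sketched as follows, using Python's sqlite3 (window functions need SQLite 3.25 or newer). The Sales rows here are illustrative values invented for this example, since the original sample data is not shown, and RANK() is assumed as the ranking function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (sale_id INTEGER, product_id INTEGER, year INTEGER,
                        quantity INTEGER, price INTEGER,
                        PRIMARY KEY (sale_id, year));
    -- Hypothetical sample rows, not taken from the original problem statement.
    INSERT INTO Sales VALUES
        (1, 100, 2008, 10, 5000),
        (2, 100, 2009, 12, 5000),
        (7, 200, 2011, 15, 9000);
""")

# Rank each product's sales by year; rank 1 marks the first year.
# RANK() gives ties the same rank, so several sales in the first year all survive.
query = """
    WITH FirstProductSale AS (
        SELECT product_id, year, quantity, price,
               RANK() OVER (PARTITION BY product_id ORDER BY year ASC) AS sale_rank
        FROM Sales
    )
    SELECT product_id, year AS first_year, quantity, price
    FROM FirstProductSale
    WHERE sale_rank = 1
    ORDER BY product_id
"""
first_year_sales = conn.execute(query).fetchall()
print(first_year_sales)
```

ROW_NUMBER() would keep only one row per product even when a product has several sales in its first year, which is why a rank-style function fits the problem better.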
https://www.linkedin.com/in/niranjana405/