40 SQL Interview Questions & Solutions for DBAs

Asfaw Gedamu

July 2, 2025

Your interviewer has 30 minutes. Your mission: write queries that show you can reason in data.
Not just recite syntax.

I pulled together 40 real-world SQL challenges that cover the exact patterns hiring managers
probe for:

Retention & Churn: month-over-month customer stickiness, six-month “silent” customers, moving averages, and more

Salary & Rank Logic: second-highest salary, department pay gaps, in-department ranking with RANK()

Revenue Analytics: Pareto 80 percent products, product-level revenue share, YoY growth

Window Magic: rolling 3-day sales, 90th-percentile spenders, consecutive-day purchases

Everyday Ops: duplicate detection, unsold products, employees hired on weekends

Now to the questions:

1. Calculate customer retention rate month-over-month


Input (Orders):

| customer_id | order_date |
|-------------|------------|
| 101 | 2023-01-15 |
| 101 | 2023-02-20 |
| 102 | 2023-01-10 |
| 103 | 2023-02-05 |
| 101 | 2023-03-01 |

Statement:

WITH monthly_customers AS (
    SELECT DISTINCT
        customer_id,
        -- First day of the order's month, kept as a date so DATEADD can shift it
        DATEFROMPARTS(YEAR(order_date), MONTH(order_date), 1) AS month_start
    FROM Orders
),
retention AS (
    SELECT m1.month_start,
        COUNT(DISTINCT m1.customer_id) AS current_customers,
        COUNT(DISTINCT m2.customer_id) AS retained_customers
    FROM monthly_customers m1
    LEFT JOIN monthly_customers m2
        ON m1.customer_id = m2.customer_id
        AND m2.month_start = DATEADD(MONTH, 1, m1.month_start)
    GROUP BY m1.month_start
)
SELECT FORMAT(month_start, 'yyyy-MM') AS month,
    current_customers,
    retained_customers,
    ROUND(100.0 * retained_customers / NULLIF(current_customers, 0), 2) AS retention_rate
FROM retention
ORDER BY month_start;

Output:

| month | current_customers | retained_customers | retention_rate |
|---------|-------------------|--------------------|----------------|
| 2023-01 | 2 | 1 | 50.00 |
| 2023-02 | 2 | 1 | 50.00 |
| 2023-03 | 1 | 0 | 0.00 |
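
The retention logic can be checked end to end without a SQL Server instance. The sketch below is an assumption-laden port to SQLite through Python's built-in sqlite3 module: `date(..., 'start of month')` and `date(..., '+1 month')` are SQLite functions standing in for the T-SQL date arithmetic, and the in-memory table simply reloads the sample rows from the input above.

```python
import sqlite3

# In-memory database loaded with the sample Orders rows from question 1
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (customer_id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [
        (101, "2023-01-15"),
        (101, "2023-02-20"),
        (102, "2023-01-10"),
        (103, "2023-02-05"),
        (101, "2023-03-01"),
    ],
)

# SQLite dialect: month buckets are kept as real dates (first of the month)
# so that "next month" is a date shift rather than string arithmetic.
RETENTION_SQL = """
WITH monthly_customers AS (
    SELECT DISTINCT customer_id,
           date(order_date, 'start of month') AS month_start
    FROM Orders
)
SELECT strftime('%Y-%m', m1.month_start) AS month,
       COUNT(DISTINCT m1.customer_id) AS current_customers,
       COUNT(DISTINCT m2.customer_id) AS retained_customers,
       ROUND(100.0 * COUNT(DISTINCT m2.customer_id)
             / COUNT(DISTINCT m1.customer_id), 2) AS retention_rate
FROM monthly_customers m1
LEFT JOIN monthly_customers m2
       ON m2.customer_id = m1.customer_id
      AND m2.month_start = date(m1.month_start, '+1 month')
GROUP BY m1.month_start
ORDER BY m1.month_start
"""

retention_rows = conn.execute(RETENTION_SQL).fetchall()
for row in retention_rows:
    print(row)
```

Keeping the month as a date until the final SELECT is the design choice that matters here: the join condition compares dates, and the `yyyy-MM` label is produced only for display.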

2. Retrieve the second highest salary from the Employee table

Input (Employee):

| emp_id | name | salary |
|--------|--------|--------|
| 1 | Alice | 90000 |
| 2 | Bob | 75000 |
| 3 | Carol | 90000 |
| 4 | Dave | 60000 |

Statement:

SELECT MAX(salary) AS SecondHighestSalary
FROM Employee
WHERE salary < (SELECT MAX(salary) FROM Employee);

Output:

| SecondHighestSalary |
|---------------------|
| 75000 |
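
A common follow-up is the Nth highest salary. One way to generalize, sketched here with Python's sqlite3 module and SQLite's DENSE_RANK (window functions need SQLite 3.25 or later), is to rank distinct salary levels and filter on the rank. The helper name `nth_highest_salary` is hypothetical, and the table reloads the sample rows above.

```python
import sqlite3

# In-memory copy of the sample Employee table from question 2
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (emp_id INTEGER, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Alice", 90000), (2, "Bob", 75000), (3, "Carol", 90000), (4, "Dave", 60000)],
)

# DENSE_RANK gives tied salaries the same rank without skipping the next rank,
# so rank 2 is the second distinct salary level even though two people earn 90000.
NTH_SALARY_SQL = """
SELECT DISTINCT salary
FROM (
    SELECT salary,
           DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
    FROM Employee
) AS ranked
WHERE salary_rank = ?
"""


def nth_highest_salary(n):
    """Return the nth distinct salary level, or None if there are fewer levels."""
    row = conn.execute(NTH_SALARY_SQL, (n,)).fetchone()
    return row[0] if row else None


print(nth_highest_salary(2))
```

With the sample data the second level is 75000, matching the MAX-based statement, while asking for a level that does not exist returns None instead of an error.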

3. Find employees without department (Left Join usage)


Input (Employee):

| emp_id | name | department_id |
|--------|--------|---------------|
| 1 | Alice | 101 |
| 2 | Bob | NULL |
| 3 | Carol | 102 |
| 4 | Dave | NULL |

Input (Department):

| department_id | dept_name |
|---------------|-----------|
| 101 | Sales |
| 102 | Marketing |

Statement:

SELECT e.*
FROM Employee e
LEFT JOIN Department d ON e.department_id = d.department_id
WHERE d.department_id IS NULL;

Output:

| emp_id | name | department_id |
|--------|------|---------------|
| 2 | Bob | NULL |
| 4 | Dave | NULL |

4. Calculate the total revenue per product

Input (Sales):

| product_id | quantity | price |
|------------|----------|-------|
| 101 | 2 | 50 |
| 102 | 1 | 100 |
| 101 | 3 | 50 |
| 103 | 5 | 20 |

Statement:

SELECT product_id, SUM(quantity * price) AS total_revenue
FROM Sales
GROUP BY product_id;

Output:

| product_id | total_revenue |
|------------|---------------|
| 101 | 250 |
| 102 | 100 |
| 103 | 100 |

5. Get the top 3 highest-paid employees


Input (Employee):

| emp_id | name | salary |
|--------|--------|--------|
| 1 | Alice | 90000 |
| 2 | Bob | 75000 |
| 3 | Carol | 85000 |
| 4 | Dave | 60000 |

Statement:

SELECT TOP 3 *
FROM Employee
ORDER BY salary DESC;

Output:

| emp_id | name | salary |
|--------|-------|--------|
| 1 | Alice | 90000 |
| 3 | Carol | 85000 |
| 2 | Bob | 75000 |

6. Customers who made purchases but never returned products

Input (Customers):

| customer_id | name |
|-------------|--------|
| 101 | Alice |
| 102 | Bob |
| 103 | Carol |

Input (Orders):

| order_id | customer_id |
|----------|-------------|
| 1 | 101 |
| 2 | 102 |
| 3 | 103 |

Input (Returns):

| return_id | customer_id |
|-----------|-------------|
| 1 | 101 |

Statement:

SELECT DISTINCT c.customer_id
FROM Customers c
JOIN Orders o ON c.customer_id = o.customer_id
WHERE c.customer_id NOT IN (SELECT customer_id FROM Returns);

Output:

| customer_id |
|-------------|
| 102 |
| 103 |
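
One caveat worth knowing for this pattern: `NOT IN` returns no rows at all if the subquery yields a NULL, whereas `NOT EXISTS` is unaffected. The sketch below demonstrates the difference in SQLite via Python's sqlite3 module. The extra Returns row with a NULL customer_id is a hypothetical addition that is not part of the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (customer_id INTEGER, name TEXT);
    CREATE TABLE Orders (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE Returns (return_id INTEGER, customer_id INTEGER);
    INSERT INTO Customers VALUES (101, 'Alice'), (102, 'Bob'), (103, 'Carol');
    INSERT INTO Orders VALUES (1, 101), (2, 102), (3, 103);
    INSERT INTO Returns VALUES (1, 101);
    -- Hypothetical row (not in the sample data): a return with no customer recorded
    INSERT INTO Returns VALUES (2, NULL);
    """
)

# NOT IN: comparing against a list containing NULL evaluates to unknown,
# so every non-matching customer is filtered out.
not_in_rows = conn.execute(
    """
    SELECT DISTINCT c.customer_id
    FROM Customers c
    JOIN Orders o ON c.customer_id = o.customer_id
    WHERE c.customer_id NOT IN (SELECT customer_id FROM Returns)
    ORDER BY c.customer_id
    """
).fetchall()

# NOT EXISTS: the correlated equality simply never matches the NULL row.
not_exists_rows = conn.execute(
    """
    SELECT DISTINCT c.customer_id
    FROM Customers c
    JOIN Orders o ON c.customer_id = o.customer_id
    WHERE NOT EXISTS (
        SELECT 1 FROM Returns r WHERE r.customer_id = c.customer_id
    )
    ORDER BY c.customer_id
    """
).fetchall()

print(not_in_rows, not_exists_rows)
```

If Returns.customer_id is declared NOT NULL the two forms agree; when that guarantee is missing, `NOT EXISTS` is the safer default.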

7. Show the count of orders per customer


Input (Orders):

| order_id | customer_id |
|----------|-------------|
| 1 | 101 |
| 2 | 101 |
| 3 | 102 |
| 4 | 101 |
| 5 | 103 |

Statement:

SELECT customer_id, COUNT(*) AS order_count
FROM Orders
GROUP BY customer_id;

Output:

| customer_id | order_count |
|-------------|-------------|
| 101 | 3 |
| 102 | 1 |
| 103 | 1 |

8. Retrieve all employees who joined in 2023


Input (Employee):

| emp_id | name | hire_date |
|--------|--------|------------|
| 1 | Alice | 2023-01-15 |
| 2 | Bob | 2022-11-20 |
| 3 | Carol | 2023-03-10 |
| 4 | Dave | 2021-05-05 |

Statement:

SELECT *
FROM Employee
WHERE YEAR(hire_date) = 2023;

Output:

| emp_id | name | hire_date |
|--------|-------|------------|
| 1 | Alice | 2023-01-15 |
| 3 | Carol | 2023-03-10 |

9. Calculate the average order value per customer


Input (Orders):

| order_id | customer_id | total_amount |
|----------|-------------|--------------|
| 1 | 101 | 100 |
| 2 | 101 | 150 |
| 3 | 102 | 75 |
| 4 | 101 | 200 |

Statement:

SELECT customer_id, AVG(total_amount) AS avg_order_value
FROM Orders
GROUP BY customer_id;

Output:

| customer_id | avg_order_value |
|-------------|-----------------|
| 101 | 150 |
| 102 | 75 |

10. Get the latest order placed by each customer


Input (Orders):

| order_id | customer_id | order_date |
|----------|-------------|------------|
| 1 | 101 | 2023-01-15 |
| 2 | 101 | 2023-02-20 |
| 3 | 102 | 2023-01-10 |
| 4 | 101 | 2023-03-05 |

Statement:

SELECT customer_id, MAX(order_date) AS latest_order_date
FROM Orders
GROUP BY customer_id;

Output:

| customer_id | latest_order_date |
|-------------|-------------------|
| 101 | 2023-03-05 |
| 102 | 2023-01-10 |

11. Find products that were never sold


Input (Products):

| product_id | product_name |
|------------|--------------|
| 101 | Laptop |
| 102 | Phone |
| 103 | Tablet |
| 104 | Monitor |

Input (Sales):

| sale_id | product_id | quantity |
|---------|------------|----------|
| 1 | 101 | 2 |
| 2 | 102 | 1 |
| 3 | 101 | 3 |

Statement:

SELECT p.product_id
FROM Products p
LEFT JOIN Sales s ON p.product_id = s.product_id
WHERE s.product_id IS NULL;

Output:

| product_id |
|------------|
| 103 |
| 104 |

12. Identify the best-selling product


Input (Sales):

| sale_id | product_id | quantity |
|---------|------------|----------|
| 1 | 101 | 2 |
| 2 | 102 | 1 |
| 3 | 101 | 3 |
| 4 | 103 | 5 |
| 5 | 102 | 2 |

Statement:

SELECT TOP 1 WITH TIES product_id, SUM(quantity) AS total_qty
FROM Sales
GROUP BY product_id
ORDER BY total_qty DESC;

Output:

| product_id | total_qty |
|------------|-----------|
| 101 | 5 |
| 103 | 5 |

13. Get the total revenue and the number of orders per region

Input (Orders):

| order_id | region | total_amount |
|----------|--------|--------------|
| 1 | East | 100 |
| 2 | West | 150 |
| 3 | East | 200 |
| 4 | North | 75 |
| 5 | East | 125 |

Statement:

SELECT region,
SUM(total_amount) AS total_revenue,
COUNT(*) AS order_count
FROM Orders
GROUP BY region;

Output:

| region | total_revenue | order_count |
|--------|---------------|-------------|
| East | 425 | 3 |
| West | 150 | 1 |
| North | 75 | 1 |

14. Count how many customers placed more than 5 orders


Input (Orders):

| order_id | customer_id |
|----------|-------------|
| 1 | 101 |
| 2 | 101 |
| 3 | 101 |
| 4 | 101 |
| 5 | 101 |
| 6 | 101 |
| 7 | 102 |
| 8 | 102 |
| 9 | 103 |

Statement:

SELECT COUNT(*) AS customer_count
FROM (
    SELECT customer_id
    FROM Orders
    GROUP BY customer_id
    HAVING COUNT(*) > 5
) AS subquery;

Output:

| customer_count |
|----------------|
| 1 |

15. Retrieve customers with orders above the average order value

Input (Orders):

| order_id | customer_id | total_amount |
|----------|-------------|--------------|
| 1 | 101 | 100 |
| 2 | 102 | 200 |
| 3 | 103 | 150 |
| 4 | 104 | 50 |
| 5 | 105 | 250 |

Statement:

SELECT *
FROM Orders
WHERE total_amount > (SELECT AVG(total_amount) FROM Orders);

Output:

| order_id | customer_id | total_amount |
|----------|-------------|--------------|
| 2 | 102 | 200 |
| 5 | 105 | 250 |

16. Find all employees hired on weekends


Input (Employee):

| emp_id | name | hire_date |
|--------|--------|------------|
| 1 | Alice | 2023-01-14 | -- Saturday
| 2 | Bob | 2023-01-16 | -- Monday
| 3 | Carol | 2023-01-15 | -- Sunday
| 4 | Dave | 2023-01-17 | -- Tuesday

Statement:

SELECT *
FROM Employee
WHERE DATENAME(WEEKDAY, hire_date) IN ('Saturday', 'Sunday');

Output:

| emp_id | name | hire_date |
|--------|-------|------------|
| 1 | Alice | 2023-01-14 |
| 3 | Carol | 2023-01-15 |

17. Find employees with salaries between $50,000 and $100,000

Input (Employee):

| emp_id | name | salary |
|--------|--------|--------|
| 1 | Alice | 90000 |
| 2 | Bob | 45000 |
| 3 | Carol | 75000 |
| 4 | Dave | 110000 |

Statement:

SELECT *
FROM Employee
WHERE salary BETWEEN 50000 AND 100000;

Output:

| emp_id | name | salary |
|--------|-------|--------|
| 1 | Alice | 90000 |
| 3 | Carol | 75000 |

18. Get monthly sales revenue and order count

Input (Orders):

| order_id | date | amount |
|----------|------------|--------|
| 1 | 2023-01-15 | 100 |
| 2 | 2023-01-20 | 150 |
| 3 | 2023-02-05 | 200 |
| 4 | 2023-02-10 | 75 |
| 5 | 2023-03-01 | 125 |

Statement:

SELECT FORMAT(date, 'yyyy-MM') AS month,
    SUM(amount) AS total_revenue,
    COUNT(order_id) AS order_count
FROM Orders
GROUP BY FORMAT(date, 'yyyy-MM');

Output:

| month | total_revenue | order_count |
|---------|---------------|-------------|
| 2023-01 | 250 | 2 |
| 2023-02 | 275 | 2 |
| 2023-03 | 125 | 1 |

19. Rank employees by salary within each department


Input (Employee):

| employee_id | department_id | salary |
|-------------|---------------|--------|
| 1 | 101 | 90000 |
| 2 | 101 | 85000 |
| 3 | 102 | 95000 |
| 4 | 101 | 90000 |
| 5 | 102 | 80000 |

Statement:

SELECT employee_id, department_id, salary,
    RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rk
FROM Employee;

Output:

| employee_id | department_id | salary | salary_rk |
|-------------|---------------|--------|-----------|
| 1 | 101 | 90000 | 1 |
| 4 | 101 | 90000 | 1 |
| 2 | 101 | 85000 | 3 |
| 3 | 102 | 95000 | 1 |
| 5 | 102 | 80000 | 2 |

20. Find customers who placed orders every month in 2023


Input (Orders):

| order_id | customer_id | order_date |
|----------|-------------|------------|
| 1 | 101 | 2023-01-15 |
| 2 | 101 | 2023-02-20 |
| 3 | 101 | 2023-03-05 |
| ... | ... | ... |
| 13 | 101 | 2023-12-10 |
| 14 | 102 | 2023-01-10 |
| 15 | 102 | 2023-02-15 |

Statement:

SELECT customer_id
FROM Orders
WHERE YEAR(order_date) = 2023
GROUP BY customer_id
HAVING COUNT(DISTINCT FORMAT(order_date,'yyyy-MM')) = 12;

Output:

| customer_id |
|-------------|
| 101 |

21. Find moving average of sales over the last 3 days

Input (Orders):

| order_id | order_date | total_amount |
|----------|------------|--------------|
| 1 | 2023-01-01 | 100 |
| 2 | 2023-01-02 | 150 |
| 3 | 2023-01-03 | 200 |
| 4 | 2023-01-04 | 175 |
| 5 | 2023-01-05 | 125 |

Statement:

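SELECT order_date,
    -- Multiplying by 1.0 avoids integer averaging when total_amount is an INT
    CAST(AVG(total_amount * 1.0) OVER (
        ORDER BY order_date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS DECIMAL(10, 2)) AS moving_avg
FROM Orders;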

Output:

| order_date | moving_avg |
|------------|------------|
| 2023-01-01 | 100.00 |
| 2023-01-02 | 125.00 |
| 2023-01-03 | 150.00 |
| 2023-01-04 | 175.00 |
| 2023-01-05 | 166.67 |

22. Identify the first and last order date for each customer

Input (Orders):

| order_id | customer_id | order_date |
|----------|-------------|------------|
| 1 | 101 | 2023-01-15 |
| 2 | 101 | 2023-03-20 |
| 3 | 102 | 2023-02-10 |
| 4 | 101 | 2023-05-05 |
| 5 | 103 | 2023-04-01 |

Statement:

SELECT customer_id,
MIN(order_date) AS first_order,
MAX(order_date) AS last_order
FROM Orders
GROUP BY customer_id;

Output:

| customer_id | first_order | last_order |
|-------------|-------------|------------|
| 101 | 2023-01-15 | 2023-05-05 |
| 102 | 2023-02-10 | 2023-02-10 |
| 103 | 2023-04-01 | 2023-04-01 |

23. Show product sales distribution (percent of total revenue)

Input (Sales):

| product_id | quantity | price |
|------------|----------|-------|
| 101 | 2 | 50 |
| 102 | 1 | 100 |
| 101 | 3 | 50 |
| 103 | 5 | 20 |

Statement:

WITH TotalRevenue AS (
    SELECT SUM(quantity * price) AS total
    FROM Sales
)
SELECT s.product_id,
    SUM(s.quantity * s.price) AS revenue,
    -- 100.0 forces decimal division; integer columns would otherwise truncate
    CAST(SUM(s.quantity * s.price) * 100.0 / t.total AS DECIMAL(5, 2)) AS revenue_pct
FROM Sales s
CROSS JOIN TotalRevenue t
GROUP BY s.product_id, t.total;

Output:

| product_id | revenue | revenue_pct |
|------------|---------|-------------|
| 101 | 250 | 55.56 |
| 102 | 100 | 22.22 |
| 103 | 100 | 22.22 |

24. Retrieve customers who made consecutive purchases (2 Days)

Input (Orders):

| id | order_date |
|----|------------|
| 101| 2023-01-01 |
| 101| 2023-01-02 |
| 101| 2023-01-04 |
| 102| 2023-01-10 |
| 102| 2023-01-11 |
| 103| 2023-01-15 |

Statement:

WITH cte AS (
SELECT id, order_date,
LAG(order_date) OVER (PARTITION BY id ORDER BY order_date)
AS prev_order_date
FROM Orders
)
SELECT id, order_date, prev_order_date
FROM cte
WHERE DATEDIFF(DAY, prev_order_date, order_date) = 1;

Output:

| id | order_date | prev_order_date |
|-----|------------|-----------------|
| 101 | 2023-01-02 | 2023-01-01 |
| 102 | 2023-01-11 | 2023-01-10 |
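
Interviewers often extend this question to "N consecutive days". A LAG comparison only looks one row back, so a common alternative is the gaps-and-islands technique: subtract a per-customer row number from the date, and consecutive days collapse onto the same value. The sketch below runs that idea in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). `julianday` is SQLite-specific, and the function name `streaks` is hypothetical.

```python
import sqlite3

# In-memory copy of the sample Orders table from question 24
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (id INTEGER, order_date TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [
        (101, "2023-01-01"),
        (101, "2023-01-02"),
        (101, "2023-01-04"),
        (102, "2023-01-10"),
        (102, "2023-01-11"),
        (103, "2023-01-15"),
    ],
)

# Day number minus row number is constant within a run of consecutive days,
# so grouping by that difference ("island") yields one row per streak.
# DISTINCT guards against several orders on the same day.
STREAK_SQL = """
WITH order_days AS (
    SELECT DISTINCT id, order_date FROM Orders
),
numbered AS (
    SELECT id, order_date,
           julianday(order_date)
             - ROW_NUMBER() OVER (PARTITION BY id ORDER BY order_date) AS island
    FROM order_days
)
SELECT id,
       MIN(order_date) AS streak_start,
       MAX(order_date) AS streak_end,
       COUNT(*) AS streak_days
FROM numbered
GROUP BY id, island
HAVING COUNT(*) >= ?
ORDER BY id, MIN(order_date)
"""


def streaks(min_days):
    """Return (id, start, end, length) for every streak of at least min_days days."""
    return conn.execute(STREAK_SQL, (min_days,)).fetchall()


print(streaks(2))
```

Calling `streaks(2)` finds the same two customers as the LAG statement, and raising the argument to 3 returns nothing for this sample because no one ordered three days in a row.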

25. Find churned customers (no orders in the last 6 months)


Input (Orders) - Assuming current date is 2023-07-01:

| customer_id | order_date |
|-------------|------------|
| 101 | 2022-12-15 |
| 101 | 2023-01-20 |
| 102 | 2023-06-15 |
| 103 | 2022-10-10 |

Statement:

SELECT customer_id
FROM Orders
GROUP BY customer_id
HAVING MAX(order_date) < DATEADD(MONTH, -6, GETDATE());

Output:

| customer_id |
|-------------|
| 103 |

26. Calculate cumulative revenue by day


Input (Orders):

| order_date | total_amount |
|------------|--------------|
| 2023-01-01 | 100 |
| 2023-01-02 | 150 |
| 2023-01-03 | 200 |
| 2023-01-05 | 175 |

Statement:

SELECT order_date,
    SUM(total_amount) OVER (ORDER BY order_date) AS cumulative_revenue
FROM Orders;

Output:

| order_date | cumulative_revenue |
|------------|--------------------|
| 2023-01-01 | 100 |
| 2023-01-02 | 250 |
| 2023-01-03 | 450 |
| 2023-01-05 | 625 |

27. Identify top-performing departments by average salary


Input (Employee):

| emp_id | department_id | salary |
|--------|---------------|--------|
| 1 | 101 | 90000 |
| 2 | 101 | 85000 |
| 3 | 102 | 75000 |
| 4 | 102 | 80000 |
| 5 | 103 | 95000 |

Statement:

SELECT department_id,
AVG(salary) AS avg_salary
FROM Employee
GROUP BY department_id
ORDER BY avg_salary DESC;

Output:

| department_id | avg_salary |
|---------------|------------|
| 103 | 95000 |
| 101 | 87500 |
| 102 | 77500 |

28. Find customers who ordered more than the average number of orders per customer

Input (Orders):

| order_id | customer_id |
|----------|-------------|
| 1 | 101 |
| 2 | 101 |
| 3 | 101 |
| 4 | 102 |
| 5 | 102 |
| 6 | 103 |

Statement:

WITH customer_orders AS (
SELECT customer_id, COUNT(*) AS order_count
FROM Orders
GROUP BY customer_id
)
SELECT * FROM customer_orders
WHERE order_count > (SELECT AVG(order_count) FROM customer_orders);

Output:

| customer_id | order_count |
|-------------|-------------|
| 101 | 3 |

29. Calculate revenue generated from new customers (first-time orders)

Input (Orders):

| customer_id | order_date | total_amount |
|-------------|------------|--------------|
| 101 | 2023-01-15 | 100 |
| 101 | 2023-02-20 | 150 |
| 102 | 2023-01-10 | 200 |
| 103 | 2023-03-01 | 175 |

Statement:

WITH first_orders AS (
SELECT customer_id, MIN(order_date) AS first_order_date
FROM Orders
GROUP BY customer_id
)
SELECT SUM(o.total_amount) AS new_revenue
FROM Orders o
JOIN first_orders f ON o.customer_id = f.customer_id
WHERE o.order_date = f.first_order_date;

Output:

| new_revenue |
|-------------|
| 475 |

30. Find the percentage of employees in each department


Input (Employee):

| emp_id | department_id |
|--------|---------------|
| 1 | 101 |
| 2 | 101 |
| 3 | 102 |
| 4 | 103 |
| 5 | 103 |
| 6 | 103 |

Statement:

SELECT department_id,
COUNT(*) AS emp_count,
COUNT(*) * 100.0 / (SELECT COUNT(*) FROM Employee) AS pct
FROM Employee
GROUP BY department_id;

Output:

| department_id | emp_count | pct |
|---------------|-----------|-------|
| 101 | 2 | 33.33 |
| 102 | 1 | 16.67 |
| 103 | 3 | 50.00 |

31. Retrieve the maximum salary difference within each department

Input (Employee):

| emp_id | department_id | salary |
|--------|---------------|--------|
| 1 | 101 | 90000 |
| 2 | 101 | 75000 |
| 3 | 102 | 85000 |
| 4 | 102 | 82000 |
| 5 | 103 | 95000 |

Statement:

SELECT department_id,
MAX(salary) - MIN(salary) AS salary_diff
FROM Employee
GROUP BY department_id;

Output:

| department_id | salary_diff |
|---------------|-------------|
| 101 | 15000 |
| 102 | 3000 |
| 103 | 0 |

32. Find products that contribute to 80% of revenue (Pareto Principle)

Input (Sales):

| product_id | quantity | price |
|------------|----------|-------|
| 101 | 10 | 100 |
| 102 | 5 | 200 |
| 103 | 20 | 50 |
| 104 | 8 | 75 |

Statement:

WITH sales_cte AS (
    SELECT product_id, SUM(quantity * price) AS revenue
    FROM Sales
    GROUP BY product_id
),
running AS (
    -- Window functions are not allowed in WHERE, so compute them in a CTE first.
    -- product_id breaks ties between equal revenues so the result is deterministic.
    SELECT product_id, revenue,
        SUM(revenue) OVER (ORDER BY revenue DESC, product_id
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
        SUM(revenue) OVER () AS total
    FROM sales_cte
)
SELECT product_id, revenue, running_total
FROM running
WHERE running_total <= total * 0.8;

Output:

| product_id | revenue | running_total |
|------------|---------|---------------|
| 101 | 1000 | 1000 |
| 102 | 1000 | 2000 |
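
Two details decide the result here: a running total cannot be filtered in the same query level that computes it, and products 101, 102 and 103 all earn 1000, so the cut-off depends on how ties are ordered. The sketch below is one way to make both explicit, run in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). Breaking ties by product_id is an assumption chosen for determinism, not a business rule from the question.

```python
import sqlite3

# In-memory copy of the sample Sales table from question 32
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (product_id INTEGER, quantity INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(101, 10, 100), (102, 5, 200), (103, 20, 50), (104, 8, 75)],
)

# Step 1: revenue per product. Step 2: running total and grand total as window
# functions inside a CTE. Step 3: filter on the already-computed columns.
PARETO_SQL = """
WITH product_revenue AS (
    SELECT product_id, SUM(quantity * price) AS revenue
    FROM Sales
    GROUP BY product_id
),
running AS (
    SELECT product_id, revenue,
           SUM(revenue) OVER (
               ORDER BY revenue DESC, product_id
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total,
           SUM(revenue) OVER () AS total
    FROM product_revenue
)
SELECT product_id, revenue, running_total
FROM running
WHERE running_total <= total * 0.8
ORDER BY running_total
"""

pareto_rows = conn.execute(PARETO_SQL).fetchall()
for row in pareto_rows:
    print(row)
```

Total revenue in the sample is 3600, so the threshold is 2880. Product 103 would push the running total to 3000 and is therefore left out. A variant that also keeps the product that crosses the threshold would compare `running_total - revenue` against the threshold instead.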

33. Calculate average time between purchases for each customer

Input (Orders):

| customer_id | order_date |
|-------------|------------|
| 101 | 2023-01-01 |
| 101 | 2023-01-10 |
| 101 | 2023-01-25 |
| 102 | 2023-02-05 |
| 102 | 2023-02-20 |

Statement:

WITH cte AS (
SELECT customer_id, order_date,
LAG(order_date) OVER (PARTITION BY customer_id ORDER BY
order_date) AS prev_date
FROM Orders
)
SELECT customer_id,
    AVG(1.0 * DATEDIFF(DAY, prev_date, order_date)) AS avg_gap_days
FROM cte
WHERE prev_date IS NOT NULL
GROUP BY customer_id;

Output:

| customer_id | avg_gap_days |
|-------------|--------------|
| 101 | 12.0 |
| 102 | 15.0 |

34. Show last purchase for each customer with order amount

Input (Orders):

| customer_id | order_id | total_amount | order_date |
|-------------|----------|--------------|------------|
| 101 | 1001 | 150 | 2023-01-15 |
| 101 | 1002 | 200 | 2023-02-20 |
| 102 | 1003 | 175 | 2023-01-10 |
| 101 | 1004 | 125 | 2023-03-05 |

Statement:

WITH ranked_orders AS (
SELECT customer_id, order_id, total_amount,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY
order_date DESC) AS rn
FROM Orders
)
SELECT customer_id, order_id, total_amount
FROM ranked_orders
WHERE rn = 1;

Output:

| customer_id | order_id | total_amount |
|-------------|----------|--------------|
| 101 | 1004 | 125 |
| 102 | 1003 | 175 |

35. Calculate year-over-year growth in revenue


Input (Orders):

| order_date | total_amount |
|------------|--------------|
| 2021-01-15 | 1000 |
| 2021-02-20 | 1500 |
| 2022-01-10 | 2000 |
| 2022-03-05 | 2500 |
| 2023-02-01 | 3000 |

Statement:

SELECT FORMAT(order_date, 'yyyy') AS year,
    SUM(total_amount) AS revenue,
    SUM(total_amount)
        - LAG(SUM(total_amount)) OVER (ORDER BY FORMAT(order_date, 'yyyy')) AS yoy_growth
FROM Orders
GROUP BY FORMAT(order_date, 'yyyy');

Output:

| year | revenue | yoy_growth |
|------|---------|------------|
| 2021 | 2500 | NULL |
| 2022 | 4500 | 2000 |
| 2023 | 3000 | -1500 |

36. Detect customers with purchases above their 90th percentile

Input (Orders):

| customer_id | order_id | total_amount |
|-------------|----------|--------------|
| 101 | 1001 | 100 |
| 101 | 1002 | 200 |
| 101 | 1003 | 150 |
| 101 | 1004 | 500 |
| 102 | 1005 | 300 |

Statement:

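WITH order_pct AS (
    SELECT customer_id, order_id, total_amount,
        -- NTILE(10) never reaches bucket 10 for a customer with fewer than
        -- 10 orders, so compare against the interpolated 90th percentile.
        PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY total_amount)
            OVER (PARTITION BY customer_id) AS p90
    FROM Orders
)
SELECT customer_id, order_id, total_amount
FROM order_pct
-- >= keeps a customer's only order, which equals its own percentile
WHERE total_amount >= p90;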

Output:

| customer_id | order_id | total_amount |
|-------------|----------|--------------|
| 101 | 1004 | 500 |
| 102 | 1005 | 300 |
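
Percentile filters are easy to get subtly wrong on small partitions, so it helps to inspect the intermediate values. The sketch below uses Python's sqlite3 module to show which buckets NTILE(10) actually assigns to the sample rows (SQLite 3.25 or later), and a small helper, `percentile_cont`, that reproduces the linear-interpolation definition of a continuous percentile. The helper is hypothetical illustration code written in Python because SQLite has no built-in PERCENTILE_CONT.

```python
import sqlite3

# In-memory copy of the sample Orders table from question 36
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (customer_id INTEGER, order_id INTEGER, total_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        (101, 1001, 100),
        (101, 1002, 200),
        (101, 1003, 150),
        (101, 1004, 500),
        (102, 1005, 300),
    ],
)

# NTILE(10) hands out bucket numbers 1..k where k is at most the row count of
# the partition, so a customer with four orders only ever sees buckets 1 to 4.
bucket_rows = conn.execute(
    """
    SELECT customer_id, order_id,
           NTILE(10) OVER (PARTITION BY customer_id ORDER BY total_amount) AS decile
    FROM Orders
    ORDER BY customer_id, decile
    """
).fetchall()
max_decile = max(decile for _, _, decile in bucket_rows)


def percentile_cont(values, fraction):
    """Continuous percentile by linear interpolation between neighbouring values."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


p90_customer_101 = percentile_cont([100, 200, 150, 500], 0.9)
p90_customer_102 = percentile_cont([300], 0.9)
print(bucket_rows, max_decile, p90_customer_101, p90_customer_102)
```

For customer 101 the interpolated 90th percentile lands between 200 and 500, at about 410, so only the 500 order clears it. For customer 102 the single order is its own percentile, which is why the choice between `>` and `>=` decides whether that customer appears.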

37. Retrieve longest gap between orders for each customer


Input (Orders):

| customer_id | order_date |
|-------------|------------|
| 101 | 2023-01-01 |
| 101 | 2023-01-10 |
| 101 | 2023-02-15 |
| 102 | 2023-01-05 |
| 102 | 2023-03-01 |

Statement:

WITH cte AS (
SELECT customer_id, order_date,
LAG(order_date) OVER (PARTITION BY customer_id ORDER BY
order_date) AS prev_order_date
FROM Orders
)
SELECT customer_id, MAX(DATEDIFF(DAY, prev_order_date, order_date)) AS
max_gap
FROM cte
WHERE prev_order_date IS NOT NULL
GROUP BY customer_id;

Output:

| customer_id | max_gap |
|-------------|---------|
| 101 | 36 |
| 102 | 55 |

38. Identify customers with revenue below 10th percentile


Input (Orders):

| customer_id | total_amount |
|-------------|--------------|
| 101 | 100 |
| 101 | 200 |
| 102 | 50 |
| 103 | 300 |
| 104 | 75 |

Statement:

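WITH cte AS (
    SELECT customer_id, SUM(total_amount) AS total_revenue
    FROM Orders
    GROUP BY customer_id
),
with_p10 AS (
    -- In SQL Server, PERCENTILE_CONT is an analytic function and needs OVER
    SELECT customer_id, total_revenue,
        PERCENTILE_CONT(0.1) WITHIN GROUP (ORDER BY total_revenue) OVER () AS p10
    FROM cte
)
SELECT customer_id, total_revenue
FROM with_p10
WHERE total_revenue < p10;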

Output:

| customer_id | total_revenue |
|-------------|---------------|
| 102 | 50 |

39. Find employees with salary above department average


Input (Employee):

| employee_id | department_id | salary |
|-------------|---------------|--------|
| 1 | 101 | 90000 |
| 2 | 101 | 85000 |
| 3 | 102 | 95000 |
| 4 | 101 | 80000 |
| 5 | 102 | 90000 |

Statement:

WITH dept_avg AS (
SELECT department_id, AVG(salary) AS avg_salary
FROM Employee
GROUP BY department_id
)
SELECT e.employee_id, e.department_id, e.salary, d.avg_salary
FROM Employee e
JOIN dept_avg d ON e.department_id = d.department_id
WHERE e.salary > d.avg_salary;

Output:

| employee_id | department_id | salary | avg_salary |
|-------------|---------------|--------|------------|
| 1 | 101 | 90000 | 85000 |
| 3 | 102 | 95000 | 92500 |

40. Find duplicate records in a table


Input (your_table):

| column1 | column2 |
|---------|---------|
| John | Doe |
| Jane | Smith |
| John | Doe |
| Mike | Johnson |
| Jane | Smith |

Statement:

SELECT column1, column2, COUNT(*)
FROM your_table
GROUP BY column1, column2
HAVING COUNT(*) > 1;

Output:

| column1 | column2 | COUNT(*) |
|---------|---------|----------|
| John | Doe | 2 |
| Jane | Smith | 2 |

Conclusion
These questions cover advanced analytical scenarios including statistical calculations
(percentiles), time-based analysis (retention rates), and complex business metrics (Pareto
principle). Each solution demonstrates practical applications of window functions, CTEs, and
advanced joins that are essential for data analysis roles.
