SQL 30 Days

The document contains a series of SQL queries addressing various data retrieval tasks, including finding products with the highest prices by country, calculating click-through rates, and identifying delayed orders. Each question is accompanied by a SQL query that demonstrates how to extract the required information from the specified database tables. The queries utilize various SQL functions and techniques such as joins, window functions, and aggregations.

Uploaded by

ajeshaju269

Question 1

/* 1. You have two tables: Product and Supplier.

- Product table columns: Product_id, Product_Name, Supplier_id, Price
- Supplier table columns: Supplier_id, Supplier_Name, Country
*/
-- Write an SQL query to find the name of the product with the highest
-- price in each country.
-- A window function inside a subquery does the job.

SELECT
    product_name,
    price,
    country
FROM
    (SELECT
        s.country,
        p.product_name,
        p.price,
        ROW_NUMBER() OVER(PARTITION BY s.country ORDER BY p.price DESC) AS rn
    FROM product AS p
    JOIN supplier AS s
        ON s.supplier_id = p.supplier_id) x1 -- the derived table needs an alias
WHERE rn = 1

Question 2

Write a SQL query to retrieve the IDs of the Facebook pages that have zero likes.
The output should be sorted in ascending order based on the page IDs.

-- Question 2 link :: https://datalemur.com/questions/sql-page-with-no-likes

-- My Solution

SELECT p.page_id
FROM pages p
LEFT JOIN page_likes pl ON p.page_id = pl.page_id
GROUP BY p.page_id
HAVING COUNT(pl.page_id) = 0
ORDER BY p.page_id ASC

Write a query to calculate the click-through rate (CTR) for the app in 2022 and round the results
to 2 decimal places.
Definition and note:
Percentage of click-through rate (CTR) = 100.0 * Number of clicks / Number of impressions
To avoid integer division, multiply by 100.0, not 100.
Expected Output Columns: app_id, ctr

-- Question 2 link :: https://datalemur.com/questions/click-through-rate

-- SQL query to calculate the click-through rate (CTR)
-- The denominator counts impressions only, as the definition requires

SELECT
    app_id,
    ROUND(100.0 * SUM(CASE WHEN event_type = 'click' THEN 1 ELSE 0 END)
          / SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END), 2) AS ctr
FROM
    events
WHERE
    timestamp >= '2022-01-01' AND timestamp < '2023-01-01'
GROUP BY
    app_id;
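
The CTR definition above divides clicks by impressions, not by all events. A minimal sketch of that calculation, run against an in-memory SQLite database from Python, is shown below. The sample rows, the function name ctr_by_app, and the column types are assumptions made for illustration; they are not part of the original question.

```python
import sqlite3


def ctr_by_app(rows):
    """Return {app_id: ctr} for 2022, with ctr = 100.0 * clicks / impressions."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: only the three columns the query needs.
    conn.execute(
        "CREATE TABLE events (app_id INTEGER, event_type TEXT, timestamp TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    query = """
        SELECT
            app_id,
            ROUND(100.0 * SUM(CASE WHEN event_type = 'click' THEN 1 ELSE 0 END)
                  / SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END), 2) AS ctr
        FROM events
        WHERE timestamp >= '2022-01-01' AND timestamp < '2023-01-01'
        GROUP BY app_id
        ORDER BY app_id
    """
    result = dict(conn.execute(query).fetchall())
    conn.close()
    return result


# Made-up sample data: app 123 has 2 impressions and 1 click in 2022.
sample_events = [
    (123, "impression", "2022-07-18 11:36:12"),
    (123, "impression", "2022-07-18 11:37:12"),
    (123, "click", "2022-07-18 11:37:42"),
    (234, "impression", "2022-07-18 14:15:12"),
    (234, "click", "2022-07-18 14:16:12"),
    (123, "click", "2021-12-31 23:59:59"),  # outside 2022, filtered out
]

print(ctr_by_app(sample_events))
```

Dividing by COUNT(*) instead would count clicks in the denominator as well and understate the rate.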

Question 3

Write an SQL query to calculate the difference between the highest salaries
in the marketing and engineering departments.
Output the absolute difference in salaries.

SELECT
ABS(MAX(CASE WHEN department = 'Marketing' THEN salary END) -
MAX(CASE WHEN department = 'Engineering' THEN salary END)) as salary_diff
FROM Salaries;

Question 4

Customer Segmentation Problem:

You have two tables: customers and orders.

The customers table has columns:
customer_id, customer_name, age, gender.
The orders table has columns:
order_id, customer_id, order_date, total_amount.

Write an SQL query to find the average order amount
for male and female customers separately.
Return the results rounded to 2 decimal places.

SELECT
c.gender as gender_name,
ROUND(avg(o.total_amount), 2) as avg_spent
FROM customers as c
JOIN orders as o
ON c.customer_id = o.customer_id
GROUP BY gender_name

Question 5

Write a SQL query to obtain the third transaction of every user.


Output the user id, spend, and transaction date.

SELECT
user_id,
spend,
transaction_date
FROM
(
SELECT
user_id,
spend,
transaction_date,
ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY transaction_date) as rn
FROM transactions) x1
WHERE rn = 3

Question 6

Write a query that calculates the total viewership for laptops and mobile devices,
where mobile is defined as the sum of tablet and phone viewership. Output the total
viewership for laptops as laptop_views and the total viewership for mobile devices as
mobile_views.

SELECT
    SUM(CASE WHEN device_type = 'laptop' THEN viewership_count ELSE 0 END) AS laptop_views,
    SUM(CASE WHEN device_type IN ('tablet', 'phone') THEN viewership_count ELSE 0 END) AS mobile_views
FROM viewership

Question 7

Write a query to identify the top two highest-grossing products
within each category in the year 2022. Output should include the category,
product, and total spend.

SELECT
    category,
    product,
    total_spend
FROM (
    SELECT
        category,
        product,
        SUM(spend) AS total_spend,
        RANK() OVER(PARTITION BY category ORDER BY SUM(spend) DESC) AS rn
    FROM product_spend
    WHERE EXTRACT(YEAR FROM transaction_date) = 2022 -- compare with a number, not a string
    GROUP BY 1, 2 ) x1
WHERE rn <= 2

Question 8

Write a query to obtain a histogram of tweets posted per user in 2022.
Output the tweet count per user as the bucket and the number of Twitter users who fall into that
bucket.

SELECT
    num_post,
    COUNT(user_id) AS num_user
FROM
(
    SELECT
        user_id,
        COUNT(tweet_id) AS num_post
    FROM tweets
    WHERE EXTRACT(YEAR FROM tweet_date) = 2022 -- compare with a number, not a string
    GROUP BY user_id
) x1
GROUP BY num_post

Question 9

Find the top three salaries in each department.

SELECT
    department_name,
    emp_name,
    salary
FROM (
    SELECT
        d.name AS department_name,
        e.name AS emp_name,
        e.salary AS salary,
        DENSE_RANK() OVER(PARTITION BY d.name ORDER BY e.salary DESC) AS drn
    FROM employee AS e
    JOIN
        department AS d
        ON e.departmentid = d.id) x1
WHERE drn <= 3
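
The queries in this document use ROW_NUMBER, RANK, and DENSE_RANK, which differ only in how they treat ties. The sketch below compares the three on a small made-up salary list using SQLite from Python. The table layout, employee names, and the function name rank_salaries are illustrative assumptions.

```python
import sqlite3


def rank_salaries(rows):
    """Return (name, salary, row_number, rank, dense_rank) ordered by salary DESC."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employee VALUES (?, ?)", rows)
    query = """
        SELECT
            name,
            salary,
            ROW_NUMBER() OVER (ORDER BY salary DESC, name) AS rn,
            RANK()       OVER (ORDER BY salary DESC)       AS rnk,
            DENSE_RANK() OVER (ORDER BY salary DESC)       AS drn
        FROM employee
        ORDER BY salary DESC, name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Made-up data with a tie at the top salary.
sample_salaries = [("Ann", 100), ("Bob", 100), ("Cy", 90), ("Dee", 80)]

for row in rank_salaries(sample_salaries):
    print(row)
```

With a tie at the top, RANK skips a position after the tie while DENSE_RANK does not, so a filter such as drn <= 3 keeps the three highest distinct salaries and may return more than three rows.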

Question 10

Write an SQL query to find, for each month and country,
the number of transactions and their total amount, and
the number of approved transactions and their total amount.

Return the result table in the order below.

SELECT
    TO_CHAR(trans_date, 'YYYY-MM') AS month,
    country,
    COUNT(1) AS trans_count,
    SUM(CASE WHEN state = 'approved' THEN 1 ELSE 0 END) AS approved_count,
    SUM(amount) AS trans_total_amount,
    SUM(CASE WHEN state = 'approved' THEN amount ELSE 0 END) AS approved_total_amount
FROM transactions
GROUP BY 1, 2

Question 11

Question: Given the reviews table, write a query to retrieve
the average star rating for each product, grouped by month.
The output should display the month as a numerical value,
product ID, and average star rating rounded to two decimal places.
Sort the output first by month and then by product ID.

SELECT * FROM reviews;


-- month by each product and their avg rating

SELECT
EXTRACT(MONTH FROM submit_date) as month,
product_id,
ROUND(AVG(stars), 2) as avg_rating
FROM reviews
GROUP BY month, product_id
ORDER BY month, product_id

Question 12

SQL Question 1: Identify IBM's High Capacity Users

CREATE TABLE purchases (


purchase_id INT PRIMARY KEY,
user_id INT,
date_of_purchase TIMESTAMP,
product_id INT,
amount_spent DECIMAL(10, 2)
);

SQL Question:
Identify users who have made purchases
totaling more than $10,000 in the last month
from the purchases table.
The table contains information about purchases,
including the user ID, date of purchase, product ID,
and amount spent.

SELECT user_id, SUM(amount_spent) AS total_spent
FROM purchases
WHERE date_of_purchase >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 month'
    AND date_of_purchase < DATE_TRUNC('month', CURRENT_DATE)
GROUP BY user_id
HAVING SUM(amount_spent) > 10000;

Average Duration of Employee's Service


Given the data on IBM employees, can you find the average duration
of service for employees across different departments?
The Duration of service is represented as end_date - start_date.
If end_date is NULL, consider it as the current date.

CREATE TABLE employee_service (
    employee_id INT PRIMARY KEY,
    name VARCHAR(50),
    start_date DATE,
    end_date DATE,
    department VARCHAR(50)
);

SELECT department,
    AVG(COALESCE(end_date, CURRENT_DATE) - start_date) AS avg_service_duration
FROM employee_service
GROUP BY department;

Question 13

Question: Identify the top 3 posts with the highest engagement
(likes + comments) for each user on a Facebook page. Display
the user ID, post ID, engagement count, and rank for each post.

CREATE TABLE fb_posts (
    post_id INT PRIMARY KEY,
    user_id INT,
    likes INT,
    comments INT,
    post_date DATE
);

WITH rank_posts
AS (
    SELECT
        user_id,
        post_id,
        SUM(likes + comments) AS engagement_count,
        ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY SUM(likes + comments) DESC) AS rn,
        DENSE_RANK() OVER(PARTITION BY user_id ORDER BY SUM(likes + comments) DESC) AS ranks
    FROM fb_posts
    GROUP BY user_id, post_id
)
SELECT
    user_id,
    post_id,
    engagement_count,
    ranks
FROM rank_posts
WHERE rn <= 3
ORDER BY user_id, engagement_count DESC -- sort the final result, not the CTE

-- Q.2

Determine the users who have posted more than 2 times
in the past week and calculate the total number of likes
they have received. Return user_id, number of posts, and number of likes.

CREATE TABLE posts (
    post_id INT PRIMARY KEY,
    user_id INT,
    likes INT,
    post_date DATE
);

SELECT
    user_id,
    SUM(likes) AS total_likes,
    COUNT(post_id) AS cnt_post
FROM posts
WHERE post_date >= CURRENT_DATE - 7
    AND post_date < CURRENT_DATE
GROUP BY user_id
HAVING COUNT(post_id) > 2

Question 14

-- Q.1 LinkedIn Data Analyst Interview question

Assume you're given a table containing job postings
from various companies on the LinkedIn platform.
Write a query to retrieve the count of companies
that have posted duplicate job listings.

Definition:

Duplicate job listings are defined as two job listings
within the same company that share identical titles and descriptions.

CREATE TABLE job_listings (
    job_id INTEGER PRIMARY KEY,
    company_id INTEGER,
    title TEXT,
    description TEXT
);

SELECT
    COUNT(DISTINCT company_id) AS cnt_company -- count each company once
FROM
    (SELECT
        company_id,
        title,
        description,
        COUNT(1) AS total_job
    FROM job_listings
    GROUP BY 1, 2, 3
    HAVING COUNT(1) > 1
    ) x1
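
The question asks for a count of companies, while the grouped subquery returns one row per duplicated (company, title, description) combination. A company with two separate sets of duplicates therefore appears twice. The sketch below, using SQLite from Python with made-up listings, shows the two counts side by side. The function name duplicate_counts and the sample rows are assumptions for illustration.

```python
import sqlite3


def duplicate_counts(listings):
    """Return (duplicated_groups, companies_with_duplicates)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job_listings ("
        "job_id INTEGER PRIMARY KEY, company_id INTEGER, title TEXT, description TEXT)"
    )
    conn.executemany("INSERT INTO job_listings VALUES (?, ?, ?, ?)", listings)
    # One row per (company, title, description) combination posted more than once.
    grouped = """
        SELECT company_id
        FROM job_listings
        GROUP BY company_id, title, description
        HAVING COUNT(*) > 1
    """
    groups = conn.execute(f"SELECT COUNT(*) FROM ({grouped}) AS x1").fetchone()[0]
    companies = conn.execute(
        f"SELECT COUNT(DISTINCT company_id) FROM ({grouped}) AS x1"
    ).fetchone()[0]
    conn.close()
    return groups, companies


# Made-up data: company 1 duplicates two different postings, company 3 one.
sample_listings = [
    (1, 1, "Analyst", "SQL role"),
    (2, 1, "Analyst", "SQL role"),
    (3, 1, "Engineer", "Backend role"),
    (4, 1, "Engineer", "Backend role"),
    (5, 2, "Analyst", "SQL role"),
    (6, 3, "PM", "Product role"),
    (7, 3, "PM", "Product role"),
]

print(duplicate_counts(sample_listings))
```

On this data there are three duplicated groups but only two companies, which is why counting distinct company_id values matches the wording of the question.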

Question 15

Identify the region with the lowest sales amount for the previous month.
Return the region name and total sales amount.

-- region and sum sale


-- filter last month
-- lowest sale region

CREATE TABLE Sales (
    SaleID SERIAL PRIMARY KEY,
    Region VARCHAR(50),
    Amount DECIMAL(10, 2),
    SaleDate DATE
);

-- A date range also works in January, when the previous month falls in the previous year
SELECT
    region,
    SUM(amount) AS total_sales
FROM sales
WHERE saledate >= DATE_TRUNC('month', CURRENT_DATE) - INTERVAL '1 month'
    AND saledate < DATE_TRUNC('month', CURRENT_DATE)
GROUP BY region
ORDER BY total_sales ASC
LIMIT 1
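
Filtering on the previous month by comparing month and year numbers separately can misfire in January, because the previous month then belongs to the previous year. One way around this is a half-open date range from the first day of the previous month to the first day of the current month. The sketch below computes that range in Python and passes it to SQLite. The helper names, the sample sales, and the fixed "today" value are assumptions for illustration.

```python
import sqlite3
from datetime import date


def previous_month_range(today):
    """Return (first day of previous month, first day of current month)."""
    first_of_this_month = today.replace(day=1)
    if first_of_this_month.month == 1:
        # January: the previous month is December of the previous year.
        first_of_prev = first_of_this_month.replace(
            year=first_of_this_month.year - 1, month=12
        )
    else:
        first_of_prev = first_of_this_month.replace(
            month=first_of_this_month.month - 1
        )
    return first_of_prev, first_of_this_month


def lowest_region_last_month(sales, today):
    """Return (region, total_sales) for the lowest-selling region last month."""
    start, end = previous_month_range(today)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL, saledate TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales)
    row = conn.execute(
        """
        SELECT region, SUM(amount) AS total_sales
        FROM sales
        WHERE saledate >= ? AND saledate < ?
        GROUP BY region
        ORDER BY total_sales ASC
        LIMIT 1
        """,
        (start.isoformat(), end.isoformat()),
    ).fetchone()
    conn.close()
    return row


# Made-up sales around a year boundary.
sample_sales = [
    ("North", 100.0, "2023-12-05"),
    ("North", 50.0, "2023-12-20"),
    ("South", 120.0, "2023-12-10"),
    ("South", 5.0, "2023-01-15"),  # same month number, wrong year
    ("East", 1.0, "2024-01-02"),   # current month, excluded
]

print(lowest_region_last_month(sample_sales, date(2024, 1, 15)))
```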

Question 16

Find the median within a series of numbers in SQL

WITH CTE
AS (
SELECT
views,
ROW_NUMBER() OVER( ORDER BY views ASC) rn_asc,
ROW_NUMBER() OVER( ORDER BY views DESC) rn_desc
FROM tiktok
WHERE views < 900
)
SELECT
AVG(views) as median
FROM CTE
WHERE ABS(rn_asc - rn_desc) <= 1
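
The median query numbers the rows from both ends. The middle row (odd count) or the two middle rows (even count) are the ones whose ascending and descending positions differ by at most 1. The sketch below runs the same idea on two small made-up lists using SQLite from Python. It leaves out the views < 900 filter, assumes the values are distinct, and uses a hypothetical function name median_views.

```python
import sqlite3

MEDIAN_QUERY = """
    WITH cte AS (
        SELECT
            views,
            ROW_NUMBER() OVER (ORDER BY views ASC)  AS rn_asc,
            ROW_NUMBER() OVER (ORDER BY views DESC) AS rn_desc
        FROM tiktok
    )
    SELECT AVG(views) AS median
    FROM cte
    WHERE ABS(rn_asc - rn_desc) <= 1
"""


def median_views(values):
    """Return the median of distinct integer values via the two-sided row numbering."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tiktok (views INTEGER)")
    conn.executemany("INSERT INTO tiktok VALUES (?)", [(v,) for v in values])
    median = conn.execute(MEDIAN_QUERY).fetchone()[0]
    conn.close()
    return median


print(median_views([300, 100, 200]))       # odd count: the single middle row
print(median_views([400, 100, 300, 200]))  # even count: average of two middle rows
```

With duplicate values the two ROW_NUMBER orderings may break ties differently, so a tie-breaking column in both ORDER BY clauses would be a safer choice on real data.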

Question 17

-- How many delayed orders does each delivery partner have,
-- considering the predicted delivery time and the actual delivery time?

CREATE TABLE order_details (
    order_id INT,
    del_partner VARCHAR(255),
    predicted_time TIMESTAMP,
    delivery_time TIMESTAMP
);

-- My solution

-- del_partner, delayed orders cnt
-- delayed order means delivery_time > predicted_time

SELECT
    del_partner,
    COUNT(order_id) AS cnt_delayed_orders
FROM order_details
WHERE
    predicted_time < delivery_time
GROUP BY del_partner

Question 18

Which metro city had the highest number of restaurant orders in September 2021?

Write the SQL query to retrieve the city name and the total count of orders,
ordered by the total count of orders in descending order.

-- Note: metro cities are 'Delhi', 'Mumbai', 'Bangalore', 'Hyderabad'

-- Create the table
CREATE TABLE restaurant_orders (
    city VARCHAR(50),
    restaurant_id INT,
    order_id INT,
    order_date DATE
);

-- city name
-- total orders
-- filter metro
-- group by city
-- limit 1

SELECT
city,
count(order_id) as total_orders
FROM restaurant_orders
WHERE city IN ('Delhi', 'Mumbai', 'Bangalore', 'Hyderabad')
AND order_date BETWEEN '2021-09-01' AND '2021-09-30'
GROUP BY city
ORDER BY total_orders DESC
LIMIT 1

Question 19

-- Get the count of distinct student names that are not unique

CREATE TABLE student_names(
    student_id INT,
    name VARCHAR(50)
);

-- Insert the records

INSERT INTO student_names (student_id, name) VALUES
(1, 'RAM'),
(2, 'ROBERT'),
(3, 'ROHIM'),
(4, 'RAM'),
(5, 'ROBERT');

SELECT
    COUNT(*) AS distinct_student_cnt
FROM
(
    SELECT name,
        COUNT(name)
    FROM student_names
    GROUP BY name
    HAVING COUNT(name) > 1 -- a name is not unique when it appears more than once
) AS subquery

Question 20

Find the city-wise count of customers who have placed
more than three orders in November 2023.

CREATE TABLE zomato_orders(


order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
price FLOAT,
city VARCHAR(25)
)

SELECT
city,
COUNT(1) total_customer_count
FROM (
SELECT
city,
customer_id as customer,
COUNT(1) as total_orders
FROM zomato_orders
WHERE order_date BETWEEN '2023-11-01' AND '2023-11-30'
GROUP BY city, customer_id
HAVING COUNT(1)> 3
) sub_query
GROUP BY city

Question 21

Find the top-performing two months
by revenue for each hotel for each year.
Return hotel_id, year, month, revenue.

CREATE TABLE hotel_revenue (


hotel_id INT,
month VARCHAR(10),
year INT,
revenue DECIMAL(10, 2)
)

-- hotel_id, year, month, revenue


-- ranking based on revenue
-- filter top 2 month for each hotel in each year

WITH CTE1
AS
(
SELECT
hotel_id,
year,
month,
revenue,
DENSE_RANK() OVER(PARTITION BY hotel_id, year ORDER BY revenue DESC) AS drn
FROM hotel_revenue
)

SELECT
hotel_id,
year,
month,
revenue
FROM CTE1
WHERE drn <= 2

Question 22

Write a SQL query to retrieve the emp_id, emp_name, and manager_name
from the given employees table.
Note that managers are also employees in the table.

The employees table has 3 columns:
-- emp_id, emp_name, manager_id

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    emp_name VARCHAR(255),
    manager_id INT,
    FOREIGN KEY (manager_id) REFERENCES employees(emp_id)
);

-- Inserting records into the employees table

INSERT INTO employees (emp_id, emp_name, manager_id) VALUES
(1, 'John Doe', NULL), -- John Doe is the manager
(2, 'Jane Smith', 1), -- Jane Smith reports to John Doe
(3, 'Alice Johnson', 1), -- Alice Johnson reports to John Doe
(4, 'Bob Williams', 2), -- Bob Williams reports to Jane Smith
(5, 'Charlie Brown', 2), -- Charlie Brown reports to Jane Smith
(6, 'David Lee', 3), -- David Lee reports to Alice Johnson
(7, 'Emily Davis', 3), -- Emily Davis reports to Alice Johnson
(8, 'Fiona Clark', 4), -- Fiona Clark reports to Bob Williams
(9, 'George Turner', 4), -- George Turner reports to Bob Williams
(10, 'Hannah Baker', 5), -- Hannah Baker reports to Charlie Brown
(11, 'Isaac White', 5), -- Isaac White reports to Charlie Brown
(12, 'Jessica Adams', 6), -- Jessica Adams reports to David Lee
(13, 'Kevin Harris', 6); -- Kevin Harris reports to David Lee

-- -----------------------
-- My Solution
-- -----------------------

-- emp_id,
-- emp_name,
-- manager_name based on manager id

-- approach 1

SELECT
    e1.emp_id,
    e1.emp_name,
    e1.manager_id,
    e2.emp_name AS manager_name
FROM employees AS e1
CROSS JOIN
    employees AS e2
WHERE e1.manager_id = e2.emp_id;

-- approach 2

SELECT
    e1.emp_id,
    e1.emp_name,
    e1.manager_id,
    e2.emp_name AS manager_name
FROM employees AS e1
LEFT JOIN
    employees AS e2
    ON e1.manager_id = e2.emp_id
WHERE e1.manager_id IS NOT NULL;

Question 23

Given the employee table with columns EMP_ID and SALARY,
write an SQL query to find all salaries greater than the average salary.
Return emp_id and salary.

CREATE TABLE employee (


EMP_ID INT PRIMARY KEY,
SALARY DECIMAL(10, 2)
)

SELECT
emp_id,
salary
FROM employee
WHERE salary > (SELECT AVG(salary) FROM employee)

Question 24

Consider a table named customers with the following columns:


customer_id, first_name, last_name, and email.
Write an SQL query to find all the duplicate email addresses
in the customers table.

SELECT
email
-- COUNT(email) as cnt_frequency
FROM customers
GROUP BY email
HAVING COUNT(email) > 1

Question 25

Question:
Write a SQL query to calculate the running
total revenue for each combination of date and product ID.

Expected Output Columns:


date, product_id, product_name, revenue, running_total
ORDER BY product_id, date ascending

CREATE TABLE orders (


date DATE,
product_id INT,
product_name VARCHAR(255),
revenue DECIMAL(10, 2)
)

SELECT
o1.date,
o1.product_id,
o1.product_name,
o1.revenue,
SUM(o2.revenue) as running_total
FROM orders as o1
JOIN
orders as o2
ON
o1.product_id = o2.product_id
AND
o1.date >= o2.date
GROUP BY
o1.date,
o1.product_id,
o1.product_name,
o1.revenue
ORDER BY
o1.product_id, o1.date
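
The self-join above builds the running total by pairing each row with every earlier row of the same product. A window function, SUM(...) OVER (PARTITION BY ... ORDER BY ...), is a common alternative. The sketch below runs both forms on made-up data in SQLite from Python and returns the two result sets for comparison. The sample rows and the function name running_totals are assumptions, and the sketch assumes one row per product per date, as the self-join does.

```python
import sqlite3


def running_totals(orders):
    """Return (window_rows, self_join_rows) for the running revenue per product."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "date TEXT, product_id INTEGER, product_name TEXT, revenue REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)
    window_query = """
        SELECT date, product_id, product_name, revenue,
               SUM(revenue) OVER (PARTITION BY product_id ORDER BY date) AS running_total
        FROM orders
        ORDER BY product_id, date
    """
    self_join_query = """
        SELECT o1.date, o1.product_id, o1.product_name, o1.revenue,
               SUM(o2.revenue) AS running_total
        FROM orders AS o1
        JOIN orders AS o2
            ON o1.product_id = o2.product_id
            AND o1.date >= o2.date
        GROUP BY o1.date, o1.product_id, o1.product_name, o1.revenue
        ORDER BY o1.product_id, o1.date
    """
    window_rows = conn.execute(window_query).fetchall()
    self_join_rows = conn.execute(self_join_query).fetchall()
    conn.close()
    return window_rows, self_join_rows


# Made-up data: two products, one row per product per date.
sample_orders = [
    ("2024-01-01", 1, "A", 10.0),
    ("2024-01-02", 1, "A", 20.0),
    ("2024-01-03", 1, "A", 5.0),
    ("2024-01-01", 2, "B", 7.0),
    ("2024-01-03", 2, "B", 3.0),
]

window_rows, self_join_rows = running_totals(sample_orders)
for row in window_rows:
    print(row)
```

The window form reads each row once, whereas the self-join grows with the square of the rows per product, so the window form is usually the better fit for large tables.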

Question 26

Suppose you are given two tables - Orders and Returns.
The Orders table contains information about orders placed by customers,
and the Returns table contains information about returned items.

Design a SQL query to find the top 5 customers with the highest percentage
of returned items out of their total orders.

Return the customer ID and the percentage of returned items
rounded to two decimal places.

-- customer_id,
-- total_items_ordered by each cx
-- total_items_returned by each cx
-- e.g. 2/4*100 = 50%: total_items_returned/total_items_ordered*100

CREATE TABLE orders (
    order_id INT,
    customer_id INT,
    order_date DATE,
    total_items_ordered INT
);

CREATE TABLE returns (
    return_id INT,
    order_id INT,
    return_date DATE,
    returned_items INT
);

WITH orders_cte
AS
(
SELECT
customer_id,
SUM(total_items_ordered) as total_items_ordered
FROM orders
GROUP BY customer_id
),
return_cte
AS
(
SELECT
o.customer_id,
SUM(r.returned_items) as total_items_returned
FROM returns as r
JOIN
orders as o
ON r.order_id = o.order_id
GROUP BY
o.customer_id
)

SELECT
oc.customer_id,
oc.total_items_ordered,
rc.total_items_returned,
ROUND(CASE
WHEN oc.total_items_ordered > 0 THEN
(rc.total_items_returned::float/oc.total_items_ordered::float)*100
ELSE 0 END::numeric ,2) as return_percentage

FROM orders_cte as oc
JOIN
return_cte rc
ON oc.customer_id = rc.customer_id
ORDER BY return_percentage DESC
LIMIT 5;

Question 27

-- Question: Write an SQL query to fetch the IDs of users who have bought both 'Burger' and
-- 'Cold Drink' items and nothing else.

-- Expected Output Columns: user_id

CREATE TABLE orders (
    user_id INT,
    item_ordered VARCHAR(512)
);

SELECT
    user_id
    -- COUNT(DISTINCT item_ordered)
FROM orders
GROUP BY user_id
HAVING COUNT(DISTINCT item_ordered) = 2
    AND
    -- no other item may appear; repeat purchases of the two items are allowed
    SUM(CASE WHEN item_ordered NOT IN ('Burger', 'Cold Drink')
            THEN 1
            ELSE 0
        END
    ) = 0
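
One way to express "bought both items and nothing else" is to require exactly two distinct items and zero rows outside the allowed pair. The sketch below checks that condition on made-up orders in SQLite from Python, including a user who bought a Burger twice. The sample data and the function name users_with_only_both are assumptions for illustration.

```python
import sqlite3


def users_with_only_both(orders):
    """Return user_ids that bought both 'Burger' and 'Cold Drink' and nothing else."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (user_id INTEGER, item_ordered TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    query = """
        SELECT user_id
        FROM orders
        GROUP BY user_id
        HAVING COUNT(DISTINCT item_ordered) = 2
           AND SUM(CASE WHEN item_ordered NOT IN ('Burger', 'Cold Drink')
                        THEN 1 ELSE 0 END) = 0
        ORDER BY user_id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


# Made-up orders covering the interesting cases.
sample_item_orders = [
    (1, "Burger"), (1, "Cold Drink"),                 # exactly the pair
    (2, "Burger"), (2, "Cold Drink"), (2, "Pizza"),   # extra item
    (3, "Burger"),                                    # only one of the pair
    (4, "Burger"), (4, "Burger"), (4, "Cold Drink"),  # repeat purchase
    (5, "Pizza"), (5, "Fries"),                       # two items, wrong ones
]

print(users_with_only_both(sample_item_orders))
```

Counting rows that match the pair and comparing the count with 2 would reject user 4, since the repeated Burger gives three matching rows.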

Question 28

-- Given two tables, orders and returns, containing sales and returns data for Amazon's

write a SQL query to find the top 3 sellers with the highest sales amount
but the lowest return qty.

CREATE TABLE orders (


order_id INT PRIMARY KEY,
seller_id INT,
sale_amount DECIMAL(10, 2)
);

WITH orders_cte
AS
(
SELECT
seller_id,
SUM(sale_amount) as total_sales
FROM orders
GROUP BY seller_id
),

returns_cte
AS
(
SELECT
seller_id,
SUM(return_quantity) as total_return_qty
FROM returns
GROUP BY seller_id
)

SELECT
orders_cte.seller_id as seller_id,
orders_cte.total_sales as total_sale_amt,
COALESCE(returns_cte.total_return_qty, 0) as total_return_qty
FROM orders_cte
LEFT JOIN
returns_cte
ON orders_cte.seller_id = returns_cte.seller_id
ORDER BY total_sale_amt DESC, total_return_qty ASC
LIMIT 3

Question 29

Write a solution to select the product id, year, quantity,
and price for the first year of every product sold.

CREATE TABLE Sales (
    sale_id INT,
    product_id INT,
    year INT,
    quantity INT,
    price INT
);

SELECT
product_id,
first_year,
quantity,
price

FROM (
SELECT
product_id,
year as first_year,
quantity,
price,
RANK() OVER(PARTITION BY product_id ORDER BY year) as rn
FROM sales
) subquery
WHERE rn = 1

Question 30

Write a SQL query to find the top 10 most popular songs by total number of listens.
You have two tables: Songs (containing song_id, song_name,
and artist_name) and Listens (containing listen_id, user_id, song_id, and listen_date).

CREATE TABLE Songs (


song_id INT PRIMARY KEY,
song_name VARCHAR(255),
artist_name VARCHAR(255)
);

CREATE TABLE Listens (


listen_id INT PRIMARY KEY,
user_id INT,
song_id INT,
listen_date DATE,
FOREIGN KEY (song_id) REFERENCES Songs(song_id)
);

SELECT
song_name,
times_of_listens,
DENSE_RANK() OVER (ORDER BY times_of_listens DESC) AS song_rank -- RANK is a keyword in some databases
FROM
(SELECT
s.song_name,
COUNT(l.listen_id) AS times_of_listens
FROM
Songs s
JOIN
Listens l ON s.song_id = l.song_id
GROUP BY
s.song_id, s.song_name) AS sub -- song_id keeps same-named songs apart
ORDER BY
times_of_listens DESC
LIMIT 10;
