Kewal Kumar Singh
Introduction to SQL
Problem Statement
Business Context
Owning a vehicle is a common aspiration around the world, because a car is seen
as a source of freedom and mobility. Many buyers now prefer pre-owned vehicles
because they are more affordable, but they also worry about whether the
after-sales service provided by resale vendors matches the care offered by the
original manufacturers.
New-Wheels, a vehicle resale company, has launched an app with an end-to-end
service from listing the vehicle on the platform to shipping it to the customer's
location. This app also captures the overall after-sales feedback given by the
customer.
Objective
New-Wheels' sales have dipped steadily over the past year, and critical customer
feedback and ratings online have led to a drop in new customers every quarter,
which concerns the business. The CEO of the company now wants a quarterly report
with all the key metrics so he can assess the health of the business and make
the necessary decisions.
As a data analyst, you see that there is an array of questions that are being
asked at the leadership level that need to be answered using data. Import the
dump file that contains various tables that are present in the database. Use the
data to answer the questions posed and create a quarterly business report for
the CEO.
Business Questions
Question 1: Find the total number of customers who have placed orders. What
is the distribution of the customers across states?
Solution Query:
Output:
Query 2: Distribution of Customers Across States
SELECT
c.state,
COUNT(DISTINCT o.customer_id) AS customer_count
FROM customer_t c
JOIN order_t o ON c.customer_id = o.customer_id
GROUP BY c.state
ORDER BY customer_count DESC;
Output:
Observations and Insights:
▪ Total Number of Customers Who Placed Orders is 994.
▪ Top 5: Texas and California have the most customers, followed by Florida,
New York and the District of Columbia.
▪ Population: Texas, California, Florida and New York are the four most
populous US states, which likely contributes to their higher customer counts.
▪ Urbanization: Large metropolitan areas in these states may depend on personal
vehicles wherever public transportation coverage is limited.
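The two counts in this question can be checked on a small sample. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the project database, and the rows are made-up toy data that keep only the customer_id and state columns used above. It shows why COUNT(DISTINCT ...) is needed: a customer who orders twice is still one customer.

```python
import sqlite3

# Toy data (hypothetical): three customers, one with two orders, one with none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_t (customer_id TEXT PRIMARY KEY, state TEXT);
    CREATE TABLE order_t (order_id TEXT PRIMARY KEY, customer_id TEXT);
    INSERT INTO customer_t VALUES ('C1', 'Texas'), ('C2', 'Texas'), ('C3', 'Florida');
    INSERT INTO order_t VALUES ('O1', 'C1'), ('O2', 'C1'), ('O3', 'C3');
""")

# Total customers with at least one order: repeat buyers are counted once.
total_customers = conn.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM order_t"
).fetchone()[0]

# Per-state distribution, using the same join and grouping as Query 2.
# The secondary sort on state only makes ties deterministic.
by_state = conn.execute("""
    SELECT c.state, COUNT(DISTINCT o.customer_id) AS customer_count
    FROM customer_t c
    JOIN order_t o ON c.customer_id = o.customer_id
    GROUP BY c.state
    ORDER BY customer_count DESC, c.state
""").fetchall()

print(total_customers)
print(by_state)
```

On this sample the per-state counts add up to the overall total, which is a quick sanity check worth repeating on the real tables.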
Question 2: Which are the top 5 vehicle makers preferred by the customers?
Solution Query:
SELECT
pt.vehicle_maker,
COUNT(ot.product_id) AS top_5
FROM product_t AS pt
INNER JOIN order_t AS ot
USING (product_id)
GROUP BY pt.vehicle_maker
ORDER BY top_5 DESC
LIMIT 5;
Output:
▪ Toyota secures the third position, Pontiac and Dodge follow closely behind in
fourth and fifth respectively.
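A top-N list only means something when the rows are sorted before LIMIT is applied. The following is a minimal sketch on made-up data, run through Python's sqlite3 module rather than the project database; the maker names and the order_count alias are placeholders, not values from the report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_t (product_id INTEGER PRIMARY KEY, vehicle_maker TEXT);
    CREATE TABLE order_t (order_id INTEGER PRIMARY KEY, product_id INTEGER);
    INSERT INTO product_t VALUES (1, 'Maker A'), (2, 'Maker B'), (3, 'Maker C'), (4, 'Maker A');
    INSERT INTO order_t (product_id) VALUES (2), (1), (4), (1), (3), (2), (4);
""")

# Count orders per maker, sort by that count, then keep the first two rows.
top_makers = conn.execute("""
    SELECT p.vehicle_maker, COUNT(o.product_id) AS order_count
    FROM product_t AS p
    JOIN order_t AS o USING (product_id)
    GROUP BY p.vehicle_maker
    ORDER BY order_count DESC
    LIMIT 2
""").fetchall()

print(top_makers)
```

Without the ORDER BY clause the database is free to return any two makers, so the ranking would not be reliable.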
Question 3: Which is the most preferred vehicle maker in each state?
Solution Query:
WITH final AS (
SELECT C.state,
P.vehicle_maker,
COUNT(C.customer_id) AS cnt_cust
FROM customer_t C
INNER JOIN order_t O
ON C.customer_id = O.customer_id
INNER JOIN product_t P
ON O.product_id = P.product_id
GROUP BY 1, 2),
final_rank AS (
SELECT *, DENSE_RANK() OVER (PARTITION BY state ORDER BY cnt_cust DESC) AS drnk
FROM final)
SELECT state,
vehicle_maker,
cnt_cust,
drnk
FROM final_rank
WHERE drnk = 1;
Output:
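One consequence of using DENSE_RANK with drnk = 1 is that a state with tied makers returns more than one row. The sketch below illustrates this on a made-up, pre-aggregated table (maker_counts is a hypothetical name standing in for the "final" CTE). It runs on Python's sqlite3 module, which supports window functions when the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE maker_counts (state TEXT, vehicle_maker TEXT, cnt_cust INTEGER);
    INSERT INTO maker_counts VALUES
        ('Texas', 'Maker A', 3), ('Texas', 'Maker B', 3), ('Texas', 'Maker C', 1),
        ('Ohio',  'Maker A', 2), ('Ohio',  'Maker C', 1);
""")

# Rank makers inside each state; tied counts share the same rank.
preferred = conn.execute("""
    WITH final_rank AS (
        SELECT state, vehicle_maker, cnt_cust,
               DENSE_RANK() OVER (PARTITION BY state ORDER BY cnt_cust DESC) AS drnk
        FROM maker_counts
    )
    SELECT state, vehicle_maker, cnt_cust
    FROM final_rank
    WHERE drnk = 1
    ORDER BY state, vehicle_maker
""").fetchall()

print(preferred)
```

Here Texas appears twice because two makers tie for first place. If exactly one row per state is required, ROW_NUMBER with an explicit tie-breaker would be one alternative.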
Consider the following mapping for ratings: “Very Bad”: 1, “Bad”: 2, “Okay”: 3,
“Good”: 4, “Very Good”: 5
Solution Query:
Query 1: Calculate the overall average rating given by the customers
WITH ratings_mapping AS (
SELECT
customer_feedback,
CASE
WHEN customer_feedback = 'Very Bad' THEN 1
WHEN customer_feedback = 'Bad' THEN 2
WHEN customer_feedback = 'Okay' THEN 3
WHEN customer_feedback = 'Good' THEN 4
WHEN customer_feedback = 'Very Good' THEN 5
END AS rating
FROM (SELECT DISTINCT customer_feedback FROM order_t) AS
unique_feedback
),
feedback_with_ratings AS (
SELECT
R.rating
FROM order_t O
JOIN ratings_mapping R
ON O.customer_feedback = R.customer_feedback
)
SELECT
ROUND(AVG(rating), 2) AS overall_avg_rating
FROM feedback_with_ratings;
Output:
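The rating mapping can also be applied directly inside AVG, without a separate mapping CTE. The sketch below is a simplified variant, not the query above: it uses Python's sqlite3 module and six made-up feedback rows, and it adds a per-quarter average purely as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_t (
        order_id INTEGER PRIMARY KEY,
        quarter_number INTEGER,
        customer_feedback TEXT
    );
    INSERT INTO order_t (quarter_number, customer_feedback) VALUES
        (1, 'Very Good'), (1, 'Good'), (1, 'Okay'),
        (2, 'Bad'), (2, 'Very Bad'), (2, 'Okay');
""")

# The mapping from the text: Very Bad = 1 ... Very Good = 5.
RATING_CASE = """
    CASE customer_feedback
        WHEN 'Very Bad' THEN 1 WHEN 'Bad' THEN 2 WHEN 'Okay' THEN 3
        WHEN 'Good' THEN 4 WHEN 'Very Good' THEN 5
    END
"""

overall_avg = conn.execute(
    f"SELECT ROUND(AVG({RATING_CASE}), 2) FROM order_t"
).fetchone()[0]

avg_by_quarter = conn.execute(f"""
    SELECT quarter_number, ROUND(AVG({RATING_CASE}), 2)
    FROM order_t
    GROUP BY quarter_number
    ORDER BY quarter_number
""").fetchall()

print(overall_avg)      # (5 + 4 + 3 + 2 + 1 + 3) / 6
print(avg_by_quarter)
```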
Output:
▪ Dissatisfaction: The recurring decline suggests that customers are unhappy
with the product or service provided.
Solution Query:
Query 1: Percentage distribution of feedback from the customers
SELECT
customer_feedback,
COUNT(*) AS feedback_count,
ROUND((COUNT(*) * 100.0) / (SELECT COUNT(*) FROM order_t), 2) AS
feedback_percentage
FROM order_t
GROUP BY customer_feedback
ORDER BY feedback_percentage DESC;
Output:
Query 2: Feedback Trends Over Time
WITH feedback_final AS (
SELECT quarter_number,
SUM(CASE WHEN customer_feedback = 'Very Good' THEN 1 ELSE 0 END) AS very_good,
SUM(CASE WHEN customer_feedback = 'Good' THEN 1 ELSE 0 END) AS good,
SUM(CASE WHEN customer_feedback = 'Okay' THEN 1 ELSE 0 END) AS okay,
SUM(CASE WHEN customer_feedback = 'Bad' THEN 1 ELSE 0 END) AS bad,
SUM(CASE WHEN customer_feedback = 'Very Bad' THEN 1 ELSE 0 END) AS very_bad,
COUNT(customer_feedback) AS total_feedback
FROM order_t
GROUP BY 1)
SELECT quarter_number,
100.0 * very_good / total_feedback AS per_very_good,
100.0 * good / total_feedback AS per_good,
100.0 * okay / total_feedback AS per_okay,
100.0 * bad / total_feedback AS per_bad,
100.0 * very_bad / total_feedback AS per_very_bad
FROM feedback_final
ORDER BY 1;
Output:
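Percentage calculations of this kind depend on how the database divides integers. In MySQL the / operator returns a decimal, but engines with integer division (SQLite and PostgreSQL, for example) truncate count / total to 0 before it is multiplied by 100. The sketch below shows the difference on four made-up rows using Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_t (quarter_number INTEGER, customer_feedback TEXT);
    INSERT INTO order_t VALUES (1, 'Good'), (1, 'Good'), (1, 'Bad'), (1, 'Okay');
""")

# Two of four rows are 'Good', so the expected share is 50%.
truncated, exact = conn.execute("""
    SELECT
        100 * (SUM(CASE WHEN customer_feedback = 'Good' THEN 1 ELSE 0 END) / COUNT(*)),
        100.0 * SUM(CASE WHEN customer_feedback = 'Good' THEN 1 ELSE 0 END) / COUNT(*)
    FROM order_t
""").fetchone()

print(truncated)  # integer division happens first: 2 / 4 -> 0
print(exact)      # multiplying by 100.0 first forces decimal arithmetic
```

Multiplying by 100.0 before dividing gives the same result on every engine, which makes the query safer to reuse elsewhere.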
Output:
Solution Query:
SELECT
ROUND(SUM(P.vehicle_price * O.quantity * (1 - O.discount / 100)), 2) AS
net_revenue
FROM order_t O
JOIN product_t P
ON O.product_id = P.product_id;
Output:
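The revenue expression can be checked by hand on a single order. The sketch below is plain Python with made-up numbers. It assumes, as the division by 100 in the queries implies, that discount is stored as a percentage; if the column actually holds a fraction such as 0.10, the division by 100 would need to be dropped. It also shows that the two forms of the formula used in this report are algebraically the same.

```python
def net_revenue(vehicle_price, quantity, discount_pct):
    """Revenue after discount, with the discount given as a percentage (0-100)."""
    return vehicle_price * quantity * (1 - discount_pct / 100)


def net_revenue_expanded(vehicle_price, quantity, discount_pct):
    """The same amount, written the way the quarterly queries expand it."""
    return quantity * (vehicle_price - (discount_pct / 100) * vehicle_price)


# Hypothetical order: two vehicles at 20,000 each with a 10% discount.
example = net_revenue(20000, 2, 10)
example_expanded = net_revenue_expanded(20000, 2, 10)
print(example, example_expanded)
```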
Query 2: Quarter-over-quarter % change in net revenue
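WITH quarter_rev AS (
SELECT quarter_number,
SUM(quantity * (vehicle_price - ((discount/100) * vehicle_price))) AS
total_revenue
FROM order_t
GROUP BY 1)
SELECT quarter_number,
total_revenue,
LAG(total_revenue) OVER (ORDER BY quarter_number) AS previous_revenue,
ROUND(100.0 * (total_revenue - LAG(total_revenue) OVER (ORDER BY quarter_number))
/ LAG(total_revenue) OVER (ORDER BY quarter_number), 2) AS qoq_perc_change
FROM quarter_rev
ORDER BY 1;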
Output:
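A quarter-over-quarter change compares each quarter's revenue with the previous quarter's, which is what the LAG window function provides. The sketch below is an illustration on made-up revenue figures, run through Python's sqlite3 module (window functions need SQLite 3.25 or newer); the qoq_pct_change alias is a placeholder name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE quarter_rev (quarter_number INTEGER, total_revenue REAL);
    INSERT INTO quarter_rev VALUES (1, 200.0), (2, 150.0), (3, 120.0), (4, 90.0);
""")

# LAG returns the previous row's revenue; the first quarter has no
# predecessor, so its change is NULL (None in Python).
qoq = conn.execute("""
    SELECT quarter_number,
           total_revenue,
           ROUND(100.0 * (total_revenue - LAG(total_revenue) OVER (ORDER BY quarter_number))
                 / LAG(total_revenue) OVER (ORDER BY quarter_number), 2) AS qoq_pct_change
    FROM quarter_rev
    ORDER BY quarter_number
""").fetchall()

for row in qoq:
    print(row)
```

For example, quarter 2 gives 100 * (150 - 200) / 200 = -25, a 25% drop from quarter 1.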
Solution Query:
SELECT quarter_number,
SUM(quantity *(vehicle_price - ((discount/100)*vehicle_price))) AS
total_revenue,
COUNT(order_id) AS total_orders
FROM order_t
GROUP BY 1
ORDER BY 1;
Output:
Solution Query:
SELECT
ct.credit_card_type,
AVG(ot.discount) AS avg_discount_per_credit_type
FROM customer_t AS ct
INNER JOIN order_t AS ot
ON ct.customer_id = ot.customer_id
GROUP BY ct.credit_card_type
ORDER BY 2 DESC;
Output:
Observations and Insights:
1. Laser
2. Mastercard
3. Maestro
4. Visa-Electron
5. China-Unionpay
Question 10: What is the average time taken to ship the placed orders for each
quarter?
Solution Query:
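SELECT
quarter_number,
AVG(DATEDIFF(ship_date, order_date)) AS avg_ship_days -- assumes order_t has order_date and ship_date columns
FROM order_t
GROUP BY quarter_number
ORDER BY quarter_number;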
Output:
Observations and Insights:
▪ Q1: Average shipping time was 57 days.
▪ Sharp Increase: Shipping times jumped to 71 days in Q2, a 24% increase.
▪ Overall Increase: Shipping times roughly tripled from Q1 to Q4 (174 days vs
57 days).
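The average shipping time per quarter is the mean number of days between the order date and the ship date. The sketch below illustrates the calculation on three made-up orders using Python's sqlite3 module. The order_date and ship_date column names are assumptions, and because SQLite has no DATEDIFF, the difference of two julianday() values stands in for it. The last lines recompute the Q1 to Q4 ratio from the figures reported above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_t (quarter_number INTEGER, order_date TEXT, ship_date TEXT);
    INSERT INTO order_t VALUES
        (1, '2018-01-01', '2018-01-11'),
        (1, '2018-02-01', '2018-02-21'),
        (2, '2018-04-01', '2018-05-01');
""")

# Days between order and shipment, averaged within each quarter.
avg_ship_days = conn.execute("""
    SELECT quarter_number,
           AVG(julianday(ship_date) - julianday(order_date)) AS avg_days
    FROM order_t
    GROUP BY quarter_number
    ORDER BY quarter_number
""").fetchall()
print(avg_ship_days)

# Ratio of the reported Q4 and Q1 averages (174 and 57 days).
q4_to_q1_ratio = round(174 / 57, 2)
print(q4_to_q1_ratio)
```

A ratio just above 3 is what supports describing the Q1 to Q4 change as roughly a tripling.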
Last Quarter Revenue | Last Quarter Orders | Average Days to Ship | % Good Feedback
Business Recommendations
To better understand the reasons behind the decline, gather additional data and
investigate the root causes: