Problem Statement
The objective of this exercise is to assess your familiarity with retail datasets
and KPIs, gauge your SQL skills, and evaluate your ability to interpret large
datasets in terms of business logic.
Problem Statement:
You have received a retail dataset covering Dec 2022 and Dec 2023 for a US-based toy
retailer. All the data required to answer the questions below is available in the
datasets provided. The overarching idea is to compare the two years (2022 vs 2023)
across KPIs and across multiple categories of customers.
Hygiene Check:
1. Only consider stores that served customers on an equal number of days
in both time periods (2022 and 2023).
2. Only consider customers for whom a profile is
available.
3. Only consider customers residing within 50 miles of the store they
shopped in.
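As an illustration only, the three hygiene checks could be sketched as a single SQL filter. The sketch below runs the SQL through Python's built-in sqlite3 module so that it is self-contained. Every table and column name (transactions, profiles, txn_date, distance_miles and so on) is a hypothetical stand-in, since the real names come from the data dictionary. It also assumes that "days served" means distinct dates with at least one transaction, counted before the customer filters are applied, and that "within 50 miles" includes exactly 50.

```python
import sqlite3

# Hypothetical schema and toy rows; the real dataset's names will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (
    txn_id         INTEGER PRIMARY KEY,
    customer_id    INTEGER,
    store_id       INTEGER,
    txn_date       TEXT,   -- ISO date, e.g. '2022-12-05'
    distance_miles REAL    -- assumed customer-to-store distance
);
CREATE TABLE profiles (customer_id INTEGER PRIMARY KEY, profile TEXT);

INSERT INTO profiles VALUES (1, 'A'), (2, 'B');
INSERT INTO transactions VALUES
    (1, 1, 10, '2022-12-05', 12.0),
    (2, 1, 10, '2023-12-06', 12.0),
    (3, 2, 10, '2023-12-07', 80.0),  -- fails the 50-mile check
    (4, 3, 10, '2022-12-08',  5.0),  -- customer 3 has no profile
    (5, 1, 20, '2022-12-01',  3.0),  -- store 20: two days in 2022 ...
    (6, 1, 20, '2022-12-02',  3.0),
    (7, 1, 20, '2023-12-01',  3.0);  -- ... but only one day in 2023
""")

hygiene_query = """
WITH store_days AS (
    -- Check 1: distinct trading days per store in each year
    SELECT store_id,
           COUNT(DISTINCT CASE WHEN substr(txn_date, 1, 4) = '2022'
                               THEN txn_date END) AS days_2022,
           COUNT(DISTINCT CASE WHEN substr(txn_date, 1, 4) = '2023'
                               THEN txn_date END) AS days_2023
    FROM transactions
    GROUP BY store_id
),
valid_stores AS (
    SELECT store_id
    FROM store_days
    WHERE days_2022 = days_2023 AND days_2022 > 0
)
SELECT t.txn_id
FROM transactions t
JOIN valid_stores s ON s.store_id = t.store_id        -- check 1
JOIN profiles p     ON p.customer_id = t.customer_id  -- check 2
WHERE t.distance_miles <= 50                          -- check 3
ORDER BY t.txn_id
"""

clean_txn_ids = [row[0] for row in conn.execute(hygiene_query)]
print(clean_txn_ids)
```

In this toy data, store 20 is dropped because its day counts differ between the two years, and the remaining rows are then filtered on profile and distance. Whether store days should be counted before or after the customer filters is a judgment call that belongs in the stated assumptions.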
Questions:
1. Give a KPI overview of 2022 vs 2023 performance for the customers. We want
data for overall customers as well as for new customers.
2. Identify the top 2 and bottom 2 stores by 2023 vs 2022 growth, and identify
the core KPI that saw the most growth.
3. Compare profile performance across the two timeframes and highlight the best and
worst performing profiles.
4. Compare KPIs for Online Only, Instore Only and Multichannel customers for
2023.
5. For all customers who made more than one purchase in 2023, what was
the Average Order Value of their second transaction?
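As one possible reading of question 5, the second transaction per customer can be picked out with a window function and then averaged. The sketch below is not a definitive solution: the orders table and its columns (order_ts, order_value) are hypothetical, the year filter on the timestamp prefix is an assumption about the date format, and ties on timestamp are broken by order_id purely for determinism. It uses Python's built-in sqlite3 module, which needs SQLite 3.25 or later for ROW_NUMBER().

```python
import sqlite3

# Hypothetical schema and toy rows; adapt names to the data dictionary.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_ts    TEXT,   -- ISO timestamp
    order_value REAL
);
INSERT INTO orders VALUES
    (1, 1, '2023-12-01 10:00:00', 20.0),
    (2, 1, '2023-12-03 09:30:00', 30.0),  -- customer 1, second order
    (3, 1, '2023-12-09 15:00:00', 99.0),
    (4, 2, '2023-12-02 12:00:00', 15.0),  -- customer 2, only one order
    (5, 3, '2023-12-04 08:00:00', 40.0),
    (6, 3, '2023-12-04 18:00:00', 50.0),  -- customer 3, second order
    (7, 3, '2022-12-10 11:00:00', 70.0);  -- 2022 order, excluded
""")

second_txn_query = """
WITH ranked AS (
    -- Rank each customer's 2023 orders chronologically
    SELECT customer_id,
           order_value,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id
               ORDER BY order_ts, order_id
           ) AS txn_rank
    FROM orders
    WHERE substr(order_ts, 1, 4) = '2023'
)
-- Only customers with more than one 2023 order have a rank-2 row
SELECT AVG(order_value)
FROM ranked
WHERE txn_rank = 2
"""

second_txn_aov = conn.execute(second_txn_query).fetchone()[0]
print(second_txn_aov)
```

Filtering on txn_rank = 2 implicitly restricts the result to customers with more than one purchase, so no separate HAVING COUNT(*) > 1 step is needed. If the real data stores several line items per order, the items would first have to be rolled up to one row per transaction.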
You may use any SQL dialect to solve the above problems. The datasets are provided
in CSV format. Submit your code, well formatted and commented. The numeric
output should be delivered in Excel and should be lucid, self-explanatory,
clean, and well formatted. Call out your insights and any data anomalies clearly.
The problem statement, data, and data dictionary should be simple and
self-explanatory, but in case of doubt, make logical assumptions
and state them clearly.
A PowerPoint presentation and Excel visualizations are not mandatory but will earn
bonus points.