BY MANISH KUMAR CHAUDHARY
Interview Question
SQL
Data Analyst Case Study by A
Major Travel Company
DIFFICULTY LEVEL: MEDIUM
Question From
Ankit Bansal
YouTube Channel
PROBLEM STATEMENT
Data Analyst Case Study by A Major Travel
Company
There are two tables: booking_table and
users_table.
We are asked to solve the following 4 questions.
1. Write a query to find the total number of users in
each segment and the number of users who
booked a flight in April 2022.
2. Write a query to identify users whose first booking
was a hotel booking.
3. Write a query to calculate the number of days between
the first and last booking of each user.
4. Write a query to count the number of flight and
hotel bookings in each user segment for the year
2022.
Sample booking_table Input Table
Booking_id  Booking_date  User_id  Line_of_business
b1          2022-03-23    u1       Flight
b2          2022-03-27    u2       Flight
b3          2022-03-28    u1       Hotel
b4          2022-03-31    u4       Flight
b5          2022-04-02    u1       Hotel
b6          2022-04-02    u2       Flight
b7          2022-04-06    u5       Flight
b8          2022-04-06    u6       Hotel
b9          2022-04-06    u2       Flight
b10         2022-04-10    u1       Flight
b11         2022-04-12    u4       Flight
b12         2022-04-16    u1       Flight
b13         2022-04-19    u2       Flight
users_table Input Table
User_id Segment
u1 s1
u2 s1
u3 s1
u4 s2
u5 s2
u6 s3
u7 s3
u8 s3
u9 s3
u10 s3
1. Write a query to find the total number of users in
each segment and the number of users who
booked a flight in April 2022.
QUERY
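The query itself is not reproduced in this text. The following is a minimal sketch consistent with the explanation, not necessarily the author's exact query. It assumes SQL Server (T-SQL), a DATE-typed booking_date column, and the table and column names shown in the sample tables.

```sql
-- Sketch only: assumes SQL Server (T-SQL) and a DATE-typed booking_date.
SELECT
    u.segment,
    -- Every user in the segment, with or without bookings.
    COUNT(DISTINCT u.user_id) AS total_users,
    -- Only users with at least one flight booking in April 2022.
    -- CASE returns NULL for non-matching rows, and COUNT ignores NULLs.
    COUNT(DISTINCT CASE
                       WHEN b.line_of_business = 'Flight'
                        AND FORMAT(b.booking_date, 'yyyy-MM') = '2022-04'
                       THEN b.user_id
                   END) AS total_flights_booked
FROM users_table AS u
LEFT JOIN booking_table AS b      -- keeps users that have no bookings
    ON b.user_id = u.user_id
GROUP BY u.segment
ORDER BY u.segment;
```

Counting DISTINCT user_id inside the CASE counts each user once, however many April flights that user booked.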
QUERY EXPLANATION
1. To count the users in each segment, we use COUNT
with DISTINCT on user_id.
2. To count the users who booked a flight in
April 2022, we use a CASE WHEN expression.
To restrict the date to April 2022, we use the
FORMAT function.
We use a LEFT JOIN from users_table to booking_table
so that every user in each segment is counted. With an
inner join, users who have no bookings (such as u3 in the
sample data) would be dropped and total_users would be
undercounted.
OUTPUT
segment  total_users  total_flights_booked
s1       3            2
s2       2            2
s3       5            1
2. Write a query to identify users whose first
booking was a hotel booking.
QUERY
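The query itself is not reproduced in this text. The following is a minimal sketch consistent with the explanation, not necessarily the author's exact query. It assumes SQL Server (T-SQL) and the column names from the sample booking_table. The CTE name rnk_cte and the column rnk come from the explanation.

```sql
-- Sketch only: rank each user's bookings by date, then keep first bookings that are hotels.
WITH rnk_cte AS (
    SELECT
        user_id,
        line_of_business,
        -- rnk = 1 marks the earliest booking date for each user.
        -- RANK() gives tied dates the same rank, so a user who booked a
        -- flight and a hotel on the same first day would still qualify.
        RANK() OVER (PARTITION BY user_id ORDER BY booking_date) AS rnk
    FROM booking_table
)
SELECT user_id
FROM rnk_cte
WHERE rnk = 1
  AND line_of_business = 'Hotel';
```

If ties on the first day should return at most one row per user, ROW_NUMBER() with an extra tie-breaker such as booking_id could replace RANK(). That is a design choice the source does not specify.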
QUERY EXPLANATION
1. We use rnk_cte to select the required columns
and rank each user's bookings by booking_date.
The row with rank 1 is that user's first booking.
2. We then query rnk_cte and filter with a WHERE
condition to keep only the rows where rnk = 1 and
line_of_business is 'Hotel'.
OUTPUT
user_id
u6
3. Write a query to calculate the number of days between
the first and last booking of each user.
QUERY
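The query itself is not reproduced in this text. The following is a minimal sketch consistent with the explanation, not necessarily the author's exact query. It assumes SQL Server (T-SQL), where DATEDIFF takes the date part as its first argument, and a DATE-typed booking_date column.

```sql
-- Sketch only: days between each user's earliest and latest booking.
SELECT
    user_id,
    DATEDIFF(day, MIN(booking_date), MAX(booking_date)) AS diff
FROM booking_table
GROUP BY user_id
ORDER BY user_id;
```

Users with no bookings do not appear in this result because the query reads only booking_table. A user with a single booking gets a diff of 0.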
QUERY EXPLANATION
1. We use DATEDIFF to calculate the difference in days
between the first and last booking dates of
each user_id.
Here, MIN() gives the first booking date
and MAX() gives the last booking date.
OUTPUT
user_id diff
u1 44
u2 32
u4 34
u5 14
u6 16
4. Write a query to count the number of flight and
hotel bookings in each user segment for the year
2022.
QUERY
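The query itself is not reproduced in this text. The following is a minimal sketch consistent with the explanation, not necessarily the author's exact query. It assumes SQL Server (T-SQL) and a DATE-typed booking_date column. The inner join is also an assumption, since the source does not state the join type.

```sql
-- Sketch only: conditional sums count flight and hotel bookings per segment.
SELECT
    u.segment,
    SUM(CASE WHEN b.line_of_business = 'Flight' THEN 1 ELSE 0 END) AS no_of_flight_bookings,
    SUM(CASE WHEN b.line_of_business = 'Hotel'  THEN 1 ELSE 0 END) AS no_of_hotel_bookings
FROM users_table AS u
INNER JOIN booking_table AS b     -- assumption: segments without 2022 bookings are omitted
    ON b.user_id = u.user_id
WHERE DATEPART(year, b.booking_date) = 2022
GROUP BY u.segment
ORDER BY u.segment;
```

Each CASE expression flags one line of business, so a single pass over the joined rows produces both counts.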
QUERY EXPLANATION
1. Using CASE WHEN expressions, we flag each booking with
1 when it is a 'Flight' booking and, in a separate expression,
when it is a 'Hotel' booking. We then use SUM to add up
these 1s, which gives the two counts.
To keep only bookings from 2022, we use the DATEPART
function.
OUTPUT
segment  no_of_flight_bookings  no_of_hotel_bookings
s1       8                      4
s2       3                      3
s3       1                      1
The questions were good and I was able to solve them. All the
concepts needed for these questions were already covered in Ankit
Bansal's interview series, so I was able to solve them in much
the same way Ankit Sir did.
THANK YOU
Don't judge each day by the harvest you
reap but by the seeds that you plant.
Robert Louis Stevenson