
How To Ace Any SQL Interview

The document outlines a fool-proof approach for succeeding in SQL interviews, using a specific example of calculating the percentage of viral posts on a social media platform. It emphasizes the importance of reframing questions, stating assumptions about table metadata, outlining a structured approach, writing the SQL query, reviewing it, and suggesting improvements. Key steps include clarifying the question, identifying primary and foreign keys, and considering edge cases and optimization strategies.

Uploaded by

satish g

How to approach (& ace) any SQL interview

Dawn Choo

Applying for a Data role?

Here is my ultimate fool-proof approach
for any SQL interview.
Let’s use this interview question as an example.

Interview Question
You work for a social media company. A
post is considered “viral” if it receives
more than 100 likes in the first hour
after posting. Calculate the percentage
of posts made in the last 7 days that
went viral.

You have the following tables:

posts
  post_id
  user_id
  timestamp

likes
  like_id
  post_id
  user_id
  timestamp

Step 1: Reframe the question and ask clarifying questions
Confirm your understanding of the question by
reframing it in your own words.

Original question
A post is considered “viral” if it receives more than 100 likes
in the first hour after posting. Calculate the percentage of
posts made in the last 7 days that went viral.

Reframed question
We need to analyze posts from the past week to
determine how many went viral. Our definition of
“viral” is any post that got over 100 likes in its first
hour. We'll need to calculate this as a percentage
of all posts made during that time period.

We have data on posts and likes in separate tables
that we'll need to join and analyze.

Step 2: State assumptions on the table metadata
Typically, a SQL question would come with one or
more tables. It is helpful to state your assumptions
on the metadata before starting.

Some assumptions you can state include:

Which column is the primary key?
Which columns are the foreign keys?
Can an event take place multiple times?
Are there any columns with unique values?

In this example question, we could state these
primary, foreign & unique keys:

posts
  post_id    ← primary key & unique
  user_id    ← foreign key
  timestamp

likes
  like_id    ← primary key & unique
  post_id    ← foreign key
  user_id    ← foreign key
  timestamp

Other assumptions we might call out:

1. A user can “like” a post at most once.
2. The timestamps are in a standardized timezone.
3. We do not need to query user_id in this question.
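
The stated assumptions can also be written down as table constraints. Below is a minimal sketch using Python's built-in sqlite3 module. The column types, the timestamp format, and the helper names `make_db` and `try_like` are assumptions for illustration and are not given in the question. The user_id columns are left without a declared reference because no users table is provided.

```python
import sqlite3

# Hypothetical DDL: column types and constraints are assumptions, not part of the question.
SCHEMA = """
CREATE TABLE posts (
    post_id   INTEGER PRIMARY KEY,
    user_id   INTEGER NOT NULL,     -- foreign key to a users table we were not given
    timestamp TEXT    NOT NULL      -- assumed UTC, 'YYYY-MM-DD HH:MM:SS'
);
CREATE TABLE likes (
    like_id   INTEGER PRIMARY KEY,
    post_id   INTEGER NOT NULL REFERENCES posts(post_id),
    user_id   INTEGER NOT NULL,
    timestamp TEXT    NOT NULL,
    UNIQUE (post_id, user_id)       -- assumption 1: a user likes a post at most once
);
"""

def make_db():
    """Create an in-memory database with the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
    conn.executescript(SCHEMA)
    return conn

def try_like(conn, like_id, post_id, user_id, ts):
    """Insert a like; return False if it violates one of the stated assumptions."""
    try:
        conn.execute(
            "INSERT INTO likes VALUES (?, ?, ?, ?)", (like_id, post_id, user_id, ts)
        )
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_db()
conn.execute("INSERT INTO posts VALUES (1, 10, '2024-01-10 12:00:00')")
print(try_like(conn, 1, 1, 20, "2024-01-10 12:05:00"))   # first like by user 20
print(try_like(conn, 2, 1, 20, "2024-01-10 12:06:00"))   # same user, same post
print(try_like(conn, 3, 99, 21, "2024-01-10 12:07:00"))  # post 99 does not exist
```

If the interviewer says a user can like a post more than once, the UNIQUE constraint no longer holds and the query would need to count distinct users instead.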
Step 3: Outline your approach
Before writing any code, start laying out your
approach for the question. Confirm your
approach with your interviewer
before writing the query.

In this example, you might say:

“We can approach this question in 2 steps:

Step 1: Gather data on recent posts and their like
counts within the first hour. We’ll do this by joining
the posts table to the likes table.

Step 2: We’ll calculate the percentage of posts
that are viral, meaning that they had more than
100 likes in the first hour.

How does this sound?”


Step 4: Write the SQL query
Now that we have aligned on the structure,
it’s time to write the SQL query!
The most important step, IMO.
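
The original query is not reproduced in this text, so here is one possible sketch of the two-step approach, run through Python's sqlite3 module so it can be executed. The date functions are SQLite-specific, and the `:now` parameter, the helper names `build_demo_db` and `viral_percentage`, and the sample rows are all assumptions for illustration. Passing the current time as a parameter keeps the example reproducible.

```python
import sqlite3

# Hypothetical query text; date arithmetic uses SQLite syntax, other engines differ.
VIRAL_PCT_SQL = """
WITH first_hour_likes AS (
    -- Step 1: recent posts with their like counts in the first hour
    SELECT p.post_id,
           COUNT(l.like_id) AS likes_in_first_hour
    FROM posts AS p
    LEFT JOIN likes AS l             -- LEFT JOIN keeps posts with zero likes
           ON l.post_id = p.post_id
          AND l.timestamp >= p.timestamp
          AND l.timestamp <  datetime(p.timestamp, '+1 hour')
    WHERE p.timestamp >= datetime(:now, '-7 days')
    GROUP BY p.post_id
)
-- Step 2: share of those posts with more than 100 first-hour likes
SELECT 100.0 * SUM(CASE WHEN likes_in_first_hour > 100 THEN 1 ELSE 0 END) / COUNT(*)
FROM first_hour_likes
"""

def build_demo_db():
    """Tiny made-up dataset: post 1 is viral, post 2 is not, post 3 is too old."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INTEGER, timestamp TEXT);
        CREATE TABLE likes (like_id INTEGER PRIMARY KEY, post_id INTEGER,
                            user_id INTEGER, timestamp TEXT);
    """)
    conn.executemany("INSERT INTO posts VALUES (?, 1, ?)", [
        (1, "2024-01-10 12:00:00"),
        (2, "2024-01-11 12:00:00"),
        (3, "2024-01-01 12:00:00"),
    ])
    likes = (
        [(1, u, "2024-01-10 12:30:00") for u in range(101)]    # 101 likes: viral
        + [(2, u, "2024-01-11 12:30:00") for u in range(100)]  # exactly 100: not viral
        + [(3, u, "2024-01-01 12:30:00") for u in range(200)]  # outside the 7-day window
    )
    conn.executemany(
        "INSERT INTO likes (post_id, user_id, timestamp) VALUES (?, ?, ?)", likes
    )
    return conn

def viral_percentage(conn, now):
    """Viral percentage for posts in the 7 days before `now` (None if there are no posts)."""
    return conn.execute(VIRAL_PCT_SQL, {"now": now}).fetchone()[0]

print(viral_percentage(build_demo_db(), "2024-01-14 12:00:00"))  # 50.0
```

The time filter on likes sits in the join condition rather than the WHERE clause, so that posts with no first-hour likes still appear in the denominator.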

Step 5: Review your query


Before “locking in” your answer, make sure
to do a thorough review of your query.

Please do not skip this step.


You want to catch any
errors before your
interviewer does!
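
One way to make the review concrete is to run the time-window logic against a few rows small enough to check by hand. The sketch below again uses Python's sqlite3 module. The helpers `first_hour_likes` and `is_viral` are hypothetical. Treating the first hour as a half-open interval (a like exactly 60 minutes after posting does not count) is an assumption worth confirming with the interviewer.

```python
import sqlite3

# Only the like-counting part of the query, so each boundary can be checked on its own.
FIRST_HOUR_SQL = """
SELECT COUNT(l.like_id)
FROM posts AS p
LEFT JOIN likes AS l
       ON l.post_id = p.post_id
      AND l.timestamp >= p.timestamp
      AND l.timestamp <  datetime(p.timestamp, '+1 hour')
WHERE p.post_id = 1
"""

def first_hour_likes(like_times, posted_at="2024-01-10 12:00:00"):
    """Count first-hour likes for one hypothetical post, given its like timestamps."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INTEGER, timestamp TEXT);
        CREATE TABLE likes (like_id INTEGER PRIMARY KEY, post_id INTEGER,
                            user_id INTEGER, timestamp TEXT);
    """)
    conn.execute("INSERT INTO posts VALUES (1, 1, ?)", (posted_at,))
    conn.executemany(
        "INSERT INTO likes (post_id, user_id, timestamp) VALUES (1, ?, ?)",
        list(enumerate(like_times)),  # (user_id, timestamp) pairs
    )
    return conn.execute(FIRST_HOUR_SQL).fetchone()[0]

def is_viral(like_count):
    """The question says "more than 100", so exactly 100 likes is not viral."""
    return like_count > 100

# Boundary checks a reviewer might walk through out loud.
assert first_hour_likes([]) == 0                        # a post with no likes
assert first_hour_likes(["2024-01-10 12:00:00"]) == 1   # like at the moment of posting
assert first_hour_likes(["2024-01-10 12:59:59"]) == 1   # last second of the first hour
assert first_hour_likes(["2024-01-10 13:00:00"]) == 0   # exactly one hour later: excluded
assert not is_viral(100) and is_viral(101)              # threshold is strict
```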

Step 6: Provide suggestions for improving the query
THIS is how you demonstrate that
you’re a thought leader, and
not a SQL monkey.

At the end of each query, answer at least
one of these questions:

What are edge cases that the query did not catch?
What are some ways to optimize your query?
What are some drawbacks to this approach?


In this example:

What are edge cases that the query did not catch?
Posts made within the last hour haven't had a full hour
to accumulate likes, which might skew our results.

What are some ways to optimize the query?
If dealing with large datasets, we could partition the
tables by date to improve query performance.

What are some drawbacks to this approach?
This definition of “viral” treats all posts equally.
We might want to consider factors like the user's
follower count or post impressions.
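
The edge case about posts that have not yet had a full hour can be handled with one extra filter. The sketch below is one possible way to do it, not the only one. It uses Python's sqlite3 module with SQLite date syntax, and the names `viral_pct_sql`, `demo`, and `require_full_hour`, as well as the sample rows, are assumptions for illustration.

```python
import sqlite3

def viral_pct_sql(require_full_hour):
    """Build the query; optionally skip posts that are less than one hour old."""
    maturity_filter = (
        "AND p.timestamp <= datetime(:now, '-1 hour')" if require_full_hour else ""
    )
    return f"""
    WITH first_hour_likes AS (
        SELECT p.post_id, COUNT(l.like_id) AS likes_in_first_hour
        FROM posts AS p
        LEFT JOIN likes AS l
               ON l.post_id = p.post_id
              AND l.timestamp >= p.timestamp
              AND l.timestamp <  datetime(p.timestamp, '+1 hour')
        WHERE p.timestamp >= datetime(:now, '-7 days')
          {maturity_filter}
        GROUP BY p.post_id
    )
    SELECT 100.0 * SUM(CASE WHEN likes_in_first_hour > 100 THEN 1 ELSE 0 END) / COUNT(*)
    FROM first_hour_likes
    """

def demo(require_full_hour, now="2024-01-14 12:00:00"):
    """Run the query on two made-up posts: one mature and viral, one only 10 minutes old."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (post_id INTEGER PRIMARY KEY, user_id INTEGER, timestamp TEXT);
        CREATE TABLE likes (like_id INTEGER PRIMARY KEY, post_id INTEGER,
                            user_id INTEGER, timestamp TEXT);
    """)
    conn.executemany("INSERT INTO posts VALUES (?, 1, ?)", [
        (1, "2024-01-12 09:00:00"),   # older post with 101 first-hour likes
        (2, "2024-01-14 11:50:00"),   # posted 10 minutes before `now`
    ])
    likes = [(1, u, "2024-01-12 09:15:00") for u in range(101)]
    likes += [(2, u, "2024-01-14 11:55:00") for u in range(5)]
    conn.executemany(
        "INSERT INTO likes (post_id, user_id, timestamp) VALUES (?, ?, ?)", likes
    )
    return conn.execute(viral_pct_sql(require_full_hour), {"now": now}).fetchone()[0]

print(demo(require_full_hour=False))  # 50.0: the 10-minute-old post drags the rate down
print(demo(require_full_hour=True))   # 100.0: only posts with a full hour are counted
```

Whether to exclude young posts or to report them separately is a judgment call to raise with the interviewer, since it changes the denominator.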
