Extra Credit Quiz Sample Questions - Week 5
2022.
Question 1 – SQL
CVS Health wants to gain a clearer understanding of its pharmacy sales and the
performance of various products.
Write a query to calculate the total drug sales for each manufacturer. Round the
answer to the nearest million and report the results in descending order of total
sales. If two manufacturers have the same total, sort them alphabetically by
manufacturer name.
pharmacy_sales_Table:
------------------------------
Column Name Type
------------------------------
product_id integer
units_sold integer
total_sales decimal
cogs decimal
manufacturer varchar
drug varchar
SOLUTION
SELECT manufacturer,
       '$' || ROUND(SUM(total_sales) / 1000000, 0) || ' million' AS total_sales_manu
FROM pharmacy_sales
GROUP BY manufacturer
ORDER BY SUM(total_sales) DESC, manufacturer;
Output:
-----------------------------------------
manufacturer total_sales_manu
-----------------------------------------
AbbVie $114 million
Eli Lilly $77 million
Biogen $70 million
Johnson & Johnson $43 million
Bayer $34 million
AstraZeneca $32 million
Pfizer $28 million
Novartis $26 million
Sanofi $25 million
Merck $25 million
Roche $16 million
GlaxoSmithKline $4 million
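The grouping and tie-breaking logic can also be sketched in Python. This is only an illustration: the rows below are hypothetical, not the real pharmacy_sales data, and the function name sales_by_manufacturer is an assumption. It sorts on the unrounded total and then on the name, which is consistent with Sanofi appearing before Merck in the output above even though both display as $25 million.

```python
from collections import defaultdict

# Hypothetical (manufacturer, total_sales) rows; not the real pharmacy_sales data.
rows = [
    ("AbbVie", 60_000_000),
    ("AbbVie", 54_100_000),
    ("Merck", 24_900_000),
    ("Sanofi", 25_300_000),
    ("Roche", 16_200_000),
]


def sales_by_manufacturer(rows):
    """Sum sales per manufacturer, then order like the SQL ORDER BY clause."""
    totals = defaultdict(int)
    for manufacturer, total_sales in rows:
        totals[manufacturer] += total_sales
    # Sort on the unrounded total (descending), then on the name (ascending).
    ordered = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    # Format each total in millions, rounded to a whole number.
    return [(name, f"${round(total / 1_000_000)} million") for name, total in ordered]


report = sales_by_manufacturer(rows)
for name, label in report:
    print(name, label)
```

With these made-up rows, Sanofi (25.3 million) is listed before Merck (24.9 million) although both are labeled $25 million.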
Question 2 – SQL
Assume you are given the table below on Uber transactions made by users. Write a
query to obtain the third transaction of every user. Output the user id, spend and
transaction date.
transactions_Table:
----------------------------------
Column Name Type
----------------------------------
user_id integer
spend decimal
transaction_date timestamp
SOLUTION
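WITH transaction_order_table AS
(
    SELECT user_id, spend, transaction_date,
           -- ROW_NUMBER() numbers rows 1, 2, 3, ... without gaps, so a user
           -- with three or more transactions always has exactly one row
           -- numbered 3, even when two transactions share a timestamp.
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY transaction_date) AS transaction_order
    FROM transactions
)
SELECT user_id, spend, transaction_date
FROM transaction_order_table
WHERE transaction_order = 3;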
Output:
----------------------------------------------
user_id spend transaction_date
----------------------------------------------
111 89.60 02/05/2022 12:00:00
121 67.90 04/03/2022 12:00:00
263 100.00 07/12/2022 12:00:00
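The "order each user's transactions, then keep the third" idea can be sketched in plain Python. The rows below are hypothetical except for the third transaction of user 111, which is copied from the output above; the function name third_transactions and the month/day/year reading of the dates are assumptions.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows (user_id, spend, transaction_date). Only the third row of
# user 111 comes from the output above; the others are made up.
rows = [
    (111, 100.50, "01/08/2022 12:00:00"),
    (111, 55.00, "01/10/2022 12:00:00"),
    (111, 89.60, "02/05/2022 12:00:00"),
    (111, 12.00, "03/01/2022 12:00:00"),
    (222, 40.00, "01/15/2022 12:00:00"),
]


def third_transactions(rows):
    """Return (user_id, spend, transaction_date) for each user's third transaction."""
    by_user = defaultdict(list)
    for user_id, spend, ts in rows:
        # Assumes month/day/year timestamps, as the sample output suggests.
        parsed = datetime.strptime(ts, "%m/%d/%Y %H:%M:%S")
        by_user[user_id].append((parsed, spend, ts))
    result = []
    for user_id in sorted(by_user):
        ordered = sorted(by_user[user_id])  # chronological order
        if len(ordered) >= 3:  # users with fewer than three rows are skipped
            _, spend, ts = ordered[2]
            result.append((user_id, spend, ts))
    return result


print(third_transactions(rows))
```

User 222 has a single transaction here and therefore does not appear in the result, mirroring the WHERE filter in the query.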
Question 3 – SQL
This is the same question as problem #10 in the SQL Chapter of Ace the Data Science
Interview!
Given a table of tweet data over a specified time period, calculate the 3-day
rolling average of tweets for each user. Output the user ID, tweet date, and
rolling averages rounded to 2 decimal places.
tweets_Table:
----------------------------------
Column Name Type
----------------------------------
user_id integer
tweet_date timestamp
tweet_count integer
tweets_Table DATA:
----------------------------------------------
user_id tweet_date tweet_count
----------------------------------------------
111 06/01/2022 00:00:00 2
111 06/02/2022 00:00:00 1
111 06/03/2022 00:00:00 3
111 06/04/2022 00:00:00 4
111 06/05/2022 00:00:00 5
111 06/06/2022 00:00:00 4
111 06/07/2022 00:00:00 6
199 06/01/2022 00:00:00 7
199 06/02/2022 00:00:00 5
199 06/03/2022 00:00:00 9
199 06/04/2022 00:00:00 1
199 06/05/2022 00:00:00 8
199 06/06/2022 00:00:00 2
199 06/07/2022 00:00:00 2
254 06/01/2022 00:00:00 1
254 06/02/2022 00:00:00 1
254 06/03/2022 00:00:00 2
254 06/04/2022 00:00:00 1
254 06/05/2022 00:00:00 3
254 06/06/2022 00:00:00 1
254 06/07/2022 00:00:00 3
SOLUTION
SELECT user_id, tweet_date,
       ROUND(AVG(tweet_count) OVER (
           PARTITION BY user_id
           ORDER BY tweet_date
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ), 2) AS rolling_avg_3d
FROM tweets;
Output:
---------------------------------------------------------
user_id tweet_date rolling_avg_3d
---------------------------------------------------------
111 06/01/2022 00:00:00 2.00
111 06/02/2022 00:00:00 1.50
111 06/03/2022 00:00:00 2.00
111 06/04/2022 00:00:00 2.67
111 06/05/2022 00:00:00 4.00
111 06/06/2022 00:00:00 4.33
111 06/07/2022 00:00:00 5.00
199 06/01/2022 00:00:00 7.00
199 06/02/2022 00:00:00 6.00
199 06/03/2022 00:00:00 7.00
199 06/04/2022 00:00:00 5.00
199 06/05/2022 00:00:00 6.00
199 06/06/2022 00:00:00 3.67
199 06/07/2022 00:00:00 4.00
254 06/01/2022 00:00:00 1.00
254 06/02/2022 00:00:00 1.00
254 06/03/2022 00:00:00 1.33
254 06/04/2022 00:00:00 1.33
254 06/05/2022 00:00:00 2.00
254 06/06/2022 00:00:00 1.67
254 06/07/2022 00:00:00 2.33
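The window frame "2 PRECEDING AND CURRENT ROW" averages the current row with up to two earlier rows, so the first two rows of each user average over fewer than three values. A minimal Python sketch of that arithmetic, using the tweet counts of user 111 from the data above (the function name rolling_average is an assumption):

```python
def rolling_average(counts, window=3):
    """Trailing average over at most `window` values, rounded to 2 places."""
    averages = []
    for i in range(len(counts)):
        # The frame starts two rows back, or at the first row if fewer exist.
        start = max(0, i - window + 1)
        chunk = counts[start:i + 1]
        averages.append(round(sum(chunk) / len(chunk), 2))
    return averages


# Tweet counts of user 111, in date order, taken from the table above.
user_111 = [2, 1, 3, 4, 5, 4, 6]
print(rolling_average(user_111))  # [2.0, 1.5, 2.0, 2.67, 4.0, 4.33, 5.0]
```

The printed values match the rolling_avg_3d column for user 111.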
Question 4 – Python
Given a list of salaries, we'll define a metric called inequity which is the
difference between max and min salary seen in the list:
inequity = max(input_list) - min(input_list).
Write a function called min_inequity which takes in a list of salaries, and a value
n, and returns the minimum inequity possible when taking n salaries from the full
salary list.
SOLUTION
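def min_inequity(salaries, n):
    # Sort a copy so that the n closest salaries sit next to each other.
    ordered = sorted(salaries)
    # Slide a window of n salaries over the sorted list and keep the
    # smallest gap between the last (max) and first (min) value.
    return min(ordered[i + n - 1] - ordered[i]
               for i in range(len(ordered) - n + 1))

salaries = [60000, 80000, 120000, 70000]
n = 4
print(min_inequity(salaries, n))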
Output
60000
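With n equal to the length of the list there is only one possible selection, so smaller values of n are more informative. As a cross-check, the sketch below simply tries every group of n salaries; it is an exhaustive search that is practical only for short lists, and the name min_inequity_bruteforce is an assumption.

```python
from itertools import combinations


def min_inequity_bruteforce(salaries, n):
    """Try every group of n salaries and return the smallest max - min gap."""
    return min(max(group) - min(group) for group in combinations(salaries, n))


salaries = [60000, 80000, 120000, 70000]
print(min_inequity_bruteforce(salaries, 2))  # 10000, e.g. 60000 and 70000
print(min_inequity_bruteforce(salaries, 3))  # 20000, from 60000, 70000, 80000
print(min_inequity_bruteforce(salaries, 4))  # 60000, the whole list
```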
Question 5 – Python
Given a list of integers, return the maximum product of any three numbers in the
array.
For example, for A = [1, 2, 3, 4, 5], you should return 60, since 3 * 4 * 5 = 60.
SOLUTION
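def max_three(nums):
    # Sort a copy so the caller's list is not modified.
    s = sorted(nums)
    # The best product is either the three largest values, or the two
    # smallest values (possibly both negative) times the largest value.
    return max(s[-1] * s[-2] * s[-3], s[0] * s[1] * s[-1])

a = [1, 2, 3, 4, 5]
print(max_three(a))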
Output
60
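Lists that contain negative numbers are worth testing, because two negatives multiply to a positive. The sketch below checks every triple exhaustively; it is meant as a reference check for small inputs rather than an efficient solution, the name max_three_bruteforce and the second test list are assumptions, and math.prod requires Python 3.8 or later.

```python
from itertools import combinations
from math import prod


def max_three_bruteforce(nums):
    """Return the largest product over every choice of three numbers."""
    return max(prod(triple) for triple in combinations(nums, 3))


print(max_three_bruteforce([1, 2, 3, 4, 5]))       # 60
print(max_three_bruteforce([-10, -10, 1, 3, 2]))   # 300, from -10 * -10 * 3
```

Taking only the three largest values of the second list would give 3 * 2 * 1 = 6, well below the true maximum of 300.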