Data Science Interview Questions
11 Jan 2016
At a recent Big Data panel organised by the Silicon Valley Bank in Boston,
almost every speaker agreed that the time to hire Data Scientists was
yesterday. In theory, Data Scientists should be able to look at a company's
data and figure out how to make the data profitable for the business. It is
not academics. Playing around with data and experimenting on it with
different algorithms is not going to help in the long run if the business
needs are not met. It is getting very difficult for companies to find
qualified data scientists who understand that their projects will ultimately
need to make money for the company. Most of the time, the data-based models
that data scientists work on cannot be turned into profitable, relevant
applications. This is the reason the interview process for Data Scientists at
any company is rigorous and complicated: companies are looking for Data
Scientists who not only have the necessary technical skill but also have
knowledge of the industry and the acumen to understand business needs.
Companies need to see value at the end of the day. Hiring a Data Scientist,
no matter how cool it may be in theory, is a loss if they do not bring value
to the company. To avoid such scenarios, it is imperative that a company
first understand what kind of data they have, how much data they have and
what kind of possible projects a Data Scientist can work on based on the
data. Below we have listed some of the questions asked in Data Science
interviews at companies that have figured out why they need Data Scientists
and what output they need from data science projects.
3) You are at a casino. You have two dice to play with. You win $10
every time you roll a 5. If you play till you win and then stop, what is the
expected pay-out?
4) How many Big Macs does McDonald's sell every year in the US?
5) You are about to get on a plane to Seattle and you want to know whether
you have to bring an umbrella or not. You call three of your random friends
and ask each one of them if it's raining. The probability that your friend is
telling the truth is 2/3 and the probability that they are playing a prank on
you by lying is 1/3. If all 3 of them tell you that it is raining, then what
is the probability that it is actually raining in Seattle?
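The question does not state how likely rain is before any calls are made, so an answer has to assume a prior. A minimal sketch of the Bayes' rule calculation, assuming the three friends answer independently; the function name `posterior_rain` and the 50% prior in the example are illustrative assumptions, not part of the original question:

```python
from fractions import Fraction

def posterior_rain(prior_rain, p_truth=Fraction(2, 3), n_yes=3):
    """P(rain | n_yes independent friends all say it is raining)."""
    prior_rain = Fraction(prior_rain)
    # If it is raining, every "yes" is a truthful answer;
    # if it is dry, every "yes" is a lie.
    like_rain = p_truth ** n_yes
    like_dry = (1 - p_truth) ** n_yes
    numerator = prior_rain * like_rain
    return numerator / (numerator + (1 - prior_rain) * like_dry)

# Assumed prior: rain and no rain are equally likely before calling anyone.
print(posterior_rain(Fraction(1, 2)))  # 8/9
```

A different prior changes the result, which is usually the point an interviewer wants to hear discussed.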
6) You can roll a die three times. You will be given $X where X is the
highest roll you get. You can choose to stop rolling at any time (for example,
if you roll a 6 on the first roll, you can stop). What is your expected pay-out?
10) You have 2 dice. What is the probability of getting at least one 4?
Also find the probability of getting at least one 4 if you have n dice.
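One way to approach "at least one" questions is through the complement: the probability that no die shows a 4 is (5/6)^n, so the answer is 1 - (5/6)^n. A short sketch, assuming fair and independent dice (the helper name `p_at_least_one` is made up for illustration):

```python
from fractions import Fraction

def p_at_least_one(face_prob, n):
    """Probability that an event of probability face_prob occurs
    at least once in n independent trials."""
    return 1 - (1 - Fraction(face_prob)) ** n

print(p_at_least_one(Fraction(1, 6), 2))  # 11/36
```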
11) You pick one of two coins, C1 and C2, with probabilities of heads
p(h1) = 0.7 and p(h2) = 0.6, and toss it 10 times. What is the probability
that the coin you picked is C1, given that you got 7 heads and 3 tails?
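This is a Bayes' rule question. The sketch below assumes each coin is equally likely to be picked (the question does not say so) and compares the binomial likelihood of the observed tosses under each coin; `posterior_c1` is an illustrative name:

```python
from math import comb

def posterior_c1(heads, tails, p1=0.7, p2=0.6, prior_c1=0.5):
    """P(coin is C1 | observed heads and tails)."""
    n = heads + tails
    # Binomial likelihood of the data under each coin.
    like1 = comb(n, heads) * p1 ** heads * (1 - p1) ** tails
    like2 = comb(n, heads) * p2 ** heads * (1 - p2) ** tails
    return prior_c1 * like1 / (prior_c1 * like1 + (1 - prior_c1) * like2)

print(round(posterior_c1(7, 3), 3))  # 0.554
```

The binomial coefficient cancels in the ratio; it is kept only so that each likelihood is a proper probability.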
15) How will you test whether a user is more likely to stay active after
6 months given that the user has more friends now?
16) You have two tables: the first table has data about the users and their
friends, the second table has data about the users and the pages they have
liked. Write an SQL query to make recommendations using pages that your
friends liked. The query result should not recommend the pages that have
already been liked by a user.
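One possible shape for such a query is sketched below, run against an in-memory SQLite database from Python. The table and column names (`friends(user_id, friend_id)`, `likes(user_id, page_id)`) and the sample rows are assumptions made for illustration; the question does not specify a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friends (user_id INTEGER, friend_id INTEGER);
    CREATE TABLE likes (user_id INTEGER, page_id INTEGER);
    INSERT INTO friends VALUES (1, 2), (1, 3), (2, 1);
    INSERT INTO likes VALUES (1, 10), (2, 10), (2, 20), (3, 20), (3, 30);
""")

RECOMMEND_SQL = """
    SELECT f.user_id, l.page_id, COUNT(*) AS friend_likes
    FROM friends AS f
    JOIN likes AS l ON l.user_id = f.friend_id
    LEFT JOIN likes AS own
           ON own.user_id = f.user_id AND own.page_id = l.page_id
    WHERE own.page_id IS NULL          -- drop pages the user already likes
    GROUP BY f.user_id, l.page_id
    ORDER BY f.user_id, friend_likes DESC, l.page_id
"""

rows = conn.execute(RECOMMEND_SQL).fetchall()
print(rows)  # [(1, 20, 2), (1, 30, 1)]
```

Ranking by the number of friends who liked a page is one design choice among several; the anti-join (`LEFT JOIN ... IS NULL`) is what enforces the "not already liked" requirement.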
18) Which technique will you use to compare the performance of two back-
end engines that generate automatic friend recommendations on Facebook?
21) You are given 50 cards in five different colors: 10 Green cards, 10 Red
cards, 10 Orange cards, 10 Blue cards, and 10 Yellow cards. The cards of
each color are numbered from one to ten. Two cards are picked at random.
Find the probability that the cards picked are not of the same number and
not of the same color.
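The wording is ambiguous; a common reading is that the two cards must differ in both color and number. Under that assumption, and assuming the cards are drawn without replacement, the answer can be checked by brute-force enumeration:

```python
from fractions import Fraction
from itertools import combinations, product

colors = ["Green", "Red", "Orange", "Blue", "Yellow"]
deck = list(product(colors, range(1, 11)))  # 50 (color, number) cards

pairs = list(combinations(deck, 2))          # all unordered two-card draws
differ_in_both = sum(
    1 for (c1, n1), (c2, n2) in pairs if c1 != c2 and n1 != n2
)
p_differ = Fraction(differ_in_both, len(pairs))
print(p_differ)  # 36/49
```

The same result follows by hand: after the first card, 36 of the remaining 49 cards have both a different color and a different number.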
22) What approach will you follow to develop the Love, Like, Sad feature on
Facebook?
2) What will you do if removing missing values from a dataset causes bias?
2) A disc is spinning on a spindle and you don't know which way the disc is
spinning. You are provided with a set of pins. How will you use the pins to
determine which way the disc is spinning?
8) What are the factors used to produce People You May Know data
product on LinkedIn?
9) How will you find the second largest element in a Binary Search Tree?
(Asked for a Data Scientist Intern job role)
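A possible approach: the largest key is the rightmost node, and the second largest is either the maximum of that node's left subtree or, if there is no left subtree, its parent. The sketch below assumes distinct keys and a simple hypothetical `Node` class:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def second_largest(root):
    """Return the second-largest key in a BST, or None if it has < 2 nodes."""
    if root is None or (root.left is None and root.right is None):
        return None
    parent, node = None, root
    while node.right is not None:      # walk down to the maximum
        parent, node = node, node.right
    if node.left is not None:          # answer is the max of the left subtree
        node = node.left
        while node.right is not None:
            node = node.right
        return node.key
    return parent.key                  # otherwise it is the parent of the max

tree = Node(5, Node(3, Node(2), Node(4)), Node(8, Node(7), None))
print(second_largest(tree))  # 7
```

This runs in time proportional to the height of the tree, without visiting every node.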
3) Case-study-based question: Cars are fitted with speed trackers so that
insurance companies can track how we drive. Based on this new scheme, what
kind of business questions can be answered?
5) What is a joke that people say about you and how would you rate the joke
on a scale of 1 to 10?
6) You own a clothing enterprise and want to improve your place in the
market. How will you do it from the ground level?
2) How will you inspect missing data and when is it important for your
analysis?
3) How will you decide whether a customer will buy a product today or not
given the income of the customer, location where the customer lives,
profession and gender? Define a machine learning algorithm for this.
4) Given a long sorted list and a short sorted list of 4 elements, which
algorithm will you use to search the long sorted list for the 4 elements?
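One reasonable answer is a binary search per element, narrowing the search window as you go because the short list is sorted too. A minimal sketch using the standard-library `bisect` module (the function name `find_all` is made up for illustration):

```python
from bisect import bisect_left

def find_all(long_sorted, targets):
    """Map each target to its index in long_sorted, or -1 if absent.
    Both inputs are assumed to be sorted in ascending order."""
    found = {}
    lo = 0
    for t in targets:
        i = bisect_left(long_sorted, t, lo)
        found[t] = i if i < len(long_sorted) and long_sorted[i] == t else -1
        lo = i  # later (larger) targets cannot lie before index i
    return found

print(find_all(list(range(0, 1000, 3)), [9, 10, 300, 999]))
# {9: 3, 10: -1, 300: 100, 999: 333}
```

Each lookup costs O(log n), which beats a linear merge-style scan when the long list is much longer than the short one.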
5) How can you compare a neural network that has one layer, one input and
output to a logistic regression model?
7) How will you deal with unbalanced data where the ratio of negative and
positive is huge?
8) What is the difference between -
2) What are the metrics you will use to track if Uber's paid advertising
strategies to acquire customers work? How will you figure out the acceptable
cost of customer acquisition?
5) Which machine learning algorithm will you use to solve the problem of an
Uber driver accepting a request?
6) How will you compare the results of various machine learning algorithms?
8) How will you design the heatmap for Uber drivers to provide
recommendations on where to wait for passengers? How would you
approach this?
1) How can you build and test a metric to compare ranked lists of TV shows or
movies for two Netflix users?
2) How can you decide if one algorithm is better than the other?
Microsoft Data Science Interview Questions
2) How can you compute an inverse matrix faster by playing with some
computation tricks?
3) You have a bag with 6 marbles. One marble is white. You reach into the bag
100 times. After taking out a marble, it is placed back in the bag. What is the
probability of drawing a white marble at least once?
1) Suppose that American Express has 1 million card members along with
their transaction details. They also have 10,000 restaurants and 1000 food
coupons. Suggest a method which can be used to pass the food coupons to
users given that some users have already received the food coupons so far.
2) You are given a training dataset of users that contain their demographic
details, the pages on Facebook they have liked so far and results of
psychology test based on their personality i.e. their openness to like FB
pages or not. How will you predict the age, gender and other demographics
of unseen data?
4) Develop an algorithm to sort two lists of sorted integers into a single list.
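The usual answer is the merge step of merge sort: walk both lists with two indices and always take the smaller head. A minimal sketch (`merge_sorted` is an illustrative name):

```python
def merge_sorted(a, b):
    """Merge two ascending lists into one ascending list in O(len(a) + len(b))."""
    merged, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])   # at most one of these two tails is non-empty
    merged.extend(b[j:])
    return merged

print(merge_sorted([1, 4, 9], [2, 3, 10, 11]))  # [1, 2, 3, 4, 9, 10, 11]
```

In everyday Python code, `heapq.merge` from the standard library does the same job lazily.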
3) A box has 12 red cards and 12 black cards. Another box has 24 red cards
and 24 black cards. You want to draw two cards at random from one of the
two boxes. Which box has a higher probability of giving two cards of the same
colour, and why?
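Assuming the two cards are drawn without replacement, the probability of a same-colour pair from a box with n cards of each colour is 2·C(n, 2) / C(2n, 2) = (n - 1) / (2n - 1), which grows towards 1/2 as n increases. A quick check (`p_same_colour` is an illustrative name):

```python
from fractions import Fraction
from math import comb

def p_same_colour(n_per_colour):
    """P(two cards drawn without replacement share a colour)
    for a box holding n_per_colour red and n_per_colour black cards."""
    total_pairs = comb(2 * n_per_colour, 2)
    same_colour_pairs = 2 * comb(n_per_colour, 2)
    return Fraction(same_colour_pairs, total_pairs)

print(p_same_colour(12), p_same_colour(24))  # 11/23 23/47
```

So the larger box is slightly more likely to give a matching pair, because removing the first card depletes its colour proportionally less.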
6) There are 8 identical balls and only one of the balls is slightly heavier
than the others. You are given a balance scale to find the heavier ball. What
is the least number of times you have to use the balance scale to find the
heavier ball?
6) State some use cases where Hadoop MapReduce works well and where it
does not.
10) Have you used sampling? What are the various types of sampling you have
worked with?
2. On rolling a die, if you get $1 per dot on the upturned face, what are your
expected earnings from rolling the die?
3. In continuation with question #2, if you have 2 chances to roll the die
and you are given the opportunity to decide when to stop rolling the die
(in the first roll or in the second roll), what will be your rolling strategy
to get maximum earnings?
4. What will be your expected earnings with the two-roll strategy?
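Questions 2 to 4 can be worked out by backward induction: with one roll left the expected value is 3.5, so on the first roll it pays to stop only on a 4, 5 or 6. A short sketch, assuming a fair six-sided die and that you are paid for the roll you stop on (`expected_earnings` is an illustrative name):

```python
from fractions import Fraction

FACES = range(1, 7)

def expected_earnings(rolls):
    """Expected payout with `rolls` chances, keeping a roll only when it
    beats the expected value of continuing."""
    value = Fraction(sum(FACES), 6)          # a single roll: 7/2
    for _ in range(rolls - 1):
        # Keep the face if it beats the value of rolling again.
        value = sum(max(Fraction(f), value) for f in FACES) / 6
    return value

print(expected_earnings(1), expected_earnings(2))  # 7/2 17/4
```

That is $3.50 for a single roll and $4.25 with the option of a second roll.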
5. You are creating a report for user content uploads every month and
observe a sudden increase in the number of uploads for the month of
November. The increase is particularly in image uploads. What do you think
could be the cause of this and how will you test this sudden spike?
1) A die is rolled twice. What is the probability that on the second roll it
will be a 6?
3) You have two ropes: one takes 60 minutes to burn and the other takes
30 minutes. How will you measure 45 minutes by burning them?
7) When required data is not available for analysis, how do you go about
collecting it? (Asked at Vodafone)
12) What do you understand by ROC curve and how is it used? (Asked at
MachinePulse)
13) How will you identify the top K queries from a file? (Asked at
BloomReach)
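For a file that fits in memory, a frequency count followed by a top-K selection is a reasonable starting point. The sketch below assumes one query per line; the function name `top_k_queries` and the sample data are made up for illustration:

```python
from collections import Counter

def top_k_queries(lines, k):
    """Return the k most frequent non-empty lines with their counts."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return counts.most_common(k)

sample = ["shoes\n", "hats\n", "shoes\n", "bags\n", "hats\n", "shoes\n"]
print(top_k_queries(sample, 2))  # [('shoes', 3), ('hats', 2)]
```

Because an open file object iterates line by line, the same function can be called as `top_k_queries(open_file, k)`. For files too large for one machine, the usual follow-up discussion is about partitioning the counts (for example with MapReduce) or using approximate sketches.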
14) Given a set of webpages and changes on the website, how will you test
the new website feature to determine if the change works positively? (Asked
at BloomReach)
15) There are N pieces of rope in a bucket. You put your hand into the
bucket and take one end of a rope. Again you put your hand into the
bucket and take another end of a rope. You tie the two ends
together. What is the expected value of the number of loops within the
bucket? (Asked at Natera)
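The usual reading of this puzzle is that the tying step is repeated until no free ends remain. Under that assumption, when k ropes are left there are 2k free ends, and the second end picked closes a loop with probability 1/(2k - 1), so the expected number of loops is the sum of 1/(2k - 1) for k = 1..N. A short sketch (`expected_loops` is an illustrative name):

```python
from fractions import Fraction

def expected_loops(n_ropes):
    """Expected number of loops after tying random free ends together
    until none remain."""
    # With k ropes left there are 2k ends; after picking one end, exactly 1 of
    # the remaining 2k - 1 ends belongs to the same rope and closes a loop.
    return sum(Fraction(1, 2 * k - 1) for k in range(1, n_ropes + 1))

print(expected_loops(1), expected_loops(2), expected_loops(3))  # 1 4/3 23/15
```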
16) How will you test if a chosen credit scoring model works or not? What
data will you look at? (Asked at Square)
17) There are 10 bottles, each containing coins of 1 gram each, except for
one bottle that contains coins of 1.1 grams each. How will you identify that
bottle after only one measurement? (Data Science Puzzle asked at Latent View
Analytics)
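The classic solution is to take 1 coin from bottle 1, 2 coins from bottle 2, and so on, then weigh them all together: the excess over the all-normal weight, in tenths of a gram, is the number of the heavy bottle. A small simulation of that idea, using integer tenths of a gram to avoid floating-point noise (the function name and input encoding are assumptions for illustration):

```python
def find_heavy_bottle(coin_weights_tenths):
    """coin_weights_tenths[i] is the per-coin weight of bottle i + 1 in tenths
    of a gram (10 for normal coins, 11 for heavy ones).
    Returns the number of the heavy bottle."""
    n = len(coin_weights_tenths)
    # One weighing: i coins from bottle i, all on the scale together.
    measured = sum(i * w for i, w in enumerate(coin_weights_tenths, start=1))
    expected = 10 * n * (n + 1) // 2    # weight if every coin weighed 1.0 g
    return measured - expected          # each heavy coin adds one tenth

bottles = [10] * 10
bottles[6] = 11                          # bottle 7 holds the heavy coins
print(find_heavy_bottle(bottles))  # 7
```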
18) How will you determine whether a cylindrical glass filled with water is
exactly half filled or not? You cannot measure the water, you cannot
measure the height of the glass, nor can you dip anything into the glass.
(Data Science Puzzle asked at Latent View Analytics)
19) What would you do if you were a traffic sign? (Data Science Interview
Question asked at Latent View Analytics)
20) If you could get a dataset on any topic of interest, irrespective of the
collection methods or resources, what would the dataset look like and
what would you do with it? (Data Scientist Interview Question asked at CKM
Advisors)
21) Given n samples from a uniform distribution [0,d], how will you estimate
the value of d? (Data Scientist Interview Question asked at Spotify)
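Three standard estimators are worth mentioning here: the maximum likelihood estimate (the sample maximum, which is biased low), its bias-corrected version (n + 1)/n times the maximum, and the method-of-moments estimate (twice the sample mean). A minimal sketch comparing them on simulated data; `estimate_d` and the true value of 10 are illustrative assumptions:

```python
import random

def estimate_d(samples):
    """Return three common estimates of d for samples from Uniform[0, d]."""
    n, largest = len(samples), max(samples)
    return {
        "mle": largest,                        # sample maximum, biased low
        "unbiased": (n + 1) * largest / n,     # bias-corrected maximum
        "moments": 2 * sum(samples) / n,       # E[X] = d / 2
    }

random.seed(0)
draws = [random.uniform(0, 10) for _ in range(1000)]  # true d = 10 (assumed)
print(estimate_d(draws))
```

The estimators based on the maximum typically have much lower variance than the method-of-moments estimate, which is often the follow-up question.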
22) How will you tune a Random Forest? (Data Science Interview Question
asked at Instacart).
23) Tell us about a project where you have extracted useful information from
a large dataset. Which machine learning algorithm did you use for this and
why? (Data Scientist Interview Question asked at Greenplum)
24) What is the difference between a Z test and a T test? (Data Scientist
Interview Question asked at Antuit)
25) What are the different models you have used for analysis and what were
your inferences? (Data Scientist Interview Question asked at Cognizant)
26) Given the title of a product, identify the category and sub-category of the
product. (Data Scientist interview question asked at Delhivery)
27) What is the difference between machine learning and deep learning?
(Data Scientist Interview Question asked at InfoObjects)
28) What are the different parameters in ARIMA models? (Data Science
Interview Question asked at Morgan Stanley)
29) What are the optimisations you would consider when computing the
similarity matrix for a large dataset? (Data Science Interview questions asked
at MakeMyTrip)
31) Why do you use Random Forest instead of a simple classifier for one of
the classification problems? (Data Science Interview Question asked at Audi)
32) What is an n-gram? (Data Science Interview Question asked at Yelp)
33) What are the problems related to overfitting and underfitting and how
will you deal with these? (Data Science Interview Question asked at Tiger
Analytics)
34) Given an M x N matrix with each cell containing a letter, find whether a
given string is contained in it or not. (Data Science Interview Question
asked at Tiger Analytics)
If you are asked questions like "What is your favourite leisure activity?" or
"What do you like to do for fun?", most people tend to answer that they like
to read programming books or write code, thinking that this is what they are
supposed to say in a technical interview. Is this something you really do
for fun? A key point to bear in mind is that the interviewer is also a
person, so interact with them naturally, as a person. This will help the
interviewer see you as an all-rounder who can visualize the company's whole
vision and not just view business problems from an academic viewpoint.
Copyright 2017 Iconiq Inc. All rights reserved. All trademarks are property of their respective owners.