The Ultimate Guide To AI and Machine Learning Job Interviews
Table of Contents
Introduction
In Their Own Words: What Hiring Managers Are Looking For
Susie Pan - Royal Bank of Canada, Product Lead
Integrate.ai - Rachel Jacobson, VP of People; Brennan Biddle, AI Recruiter
Takeaways
Conclusion
Introduction
The rapid success of Springboard's Machine Learning Engineering Career
Track has reinforced our belief that there isn't enough easily accessible
education about this exciting and fast-growing field. This tactical career
guide is one way we can help.
While working with machine learning experts to design the course and
talking to aspiring learners, we found that there was a mishmash of
resources that discuss ML job interviews, but no complete guides. There
were individual profiles and collections of interview questions, but no
comprehensive resource with solutions, no profiles that spoke to: how do I
actually get this job?
The goal of this guide is to help you navigate the entire process, from A to Z,
to find and secure an interview in a machine learning job, whether as an
engineer, analyst, product manager, data scientist, researcher, or whatever
role you determine is right for you.
To develop this ebook, we wanted to talk to both hiring managers and job
candidates, people from both sides of the table, to outline what this
experience looks like, and what the job market looks like, right now. We
wanted to talk to recruiters who source candidates, hiring managers who
conduct interviews and make offers, and successful candidates who have
made it through challenging machine learning interviews.
So, you're interested in AI and machine learning. But how do you actually
turn that interest into a career? Let's get into it.
What Are AI, Machine Learning, and Deep Learning?
If you're getting into the field, let's say as a programmer or an analyst, but
you want to brush up on your knowledge of key terms and definitions of AI,
machine learning, and deep learning, this section is for you.
Artificial Intelligence
Artificial intelligence has been around since at least the 1950s, but it’s only in
the past few years that it’s become ubiquitous. Companies we interact with
every day, such as Amazon, Facebook, and Google, have fully embraced AI. It
powers product recommendations, maps, and our social media feeds. But
it’s not only the tech giants that employ AI in their products. Now, startups,
banks, consulting companies, and even governments are integrating AI
solutions.
data. That’s what allows the AI to learn and adapt. It takes in reams of
information and data and processes it. If it encounters a problem, it learns
from the situation and recognizes a pattern.
There are many different terms being used, sometimes interchangeably and
sometimes incorrectly, to describe artificial intelligence. AI is an umbrella
term encompassing several different forms of learning. The main buckets
are machine learning, deep learning, and neural networks.
Machine Learning
Andrew Ng, one of modern AI’s pioneers, offers this helpful table on what
machine learning can do:
Deep learning is based on neural networks, which is the idea that machines
could mimic the human brain, with many layers of artificial neurons. Neural
networks are powerful when they are multi-layered, with more neurons and
interconnectivity. Neural networks have been explored for years, but only
recently has research been pushed to the next level and commercialized.
Conceptually, here is a comparison of a simple neural network to what a
multi-layered neural network in deep learning may look like:
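In code, the difference comes down to how many layers an input passes through before an output is produced. The sketch below is illustrative only: the layer sizes, the random weights, and the helper names (`dense`, `random_layer`, `forward`) are assumptions made for this example, and the networks are untrained.

```python
import math
import random


def dense(inputs, weights, biases):
    """Apply one fully connected layer followed by a sigmoid activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def random_layer(n_in, n_out, rng):
    """Create (weights, biases) for a layer with n_in inputs and n_out neurons."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def forward(inputs, layers):
    """Pass the inputs through every layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


rng = random.Random(0)
x = [0.5, -1.2, 3.0]

# A simple network: one small hidden layer, then the output neuron.
simple_net = [random_layer(3, 2, rng), random_layer(2, 1, rng)]

# A deeper network: several wider hidden layers stacked before the output.
deep_net = [
    random_layer(3, 8, rng),
    random_layer(8, 8, rng),
    random_layer(8, 8, rng),
    random_layer(8, 1, rng),
]

simple_out = forward(x, simple_net)
deep_out = forward(x, deep_net)
print(len(simple_net), "layers ->", simple_out)
print(len(deep_net), "layers ->", deep_out)
```

Both networks use the same `forward` loop; the deep one simply has more layers and more neurons per layer, which is the "multi-layered" idea described above.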
It’s important to keep in mind that these are general, simplified definitions.
Different organizations could have different definitions of each of these
terms. And they could also have varying ideas about the depth they are
looking for from an applicant. This can be difficult for candidates because
there may be dramatically different job and interview requirements.
Industries With AI and ML Careers
Before we talk more about roles, let's explore the industries that employ
people in AI and ML careers.
If you are looking to get into a certain industry, you’ll want to optimize your
resume and LinkedIn profile with keywords that are more common in that
industry. It's important for screening purposes.
Different industries also focus more on certain types of roles. For example,
software, medicine, and telecom companies are typically the largest
employers of data scientists. On the other hand, aerospace and information
technology companies hire more engineers. And analysts tend to be hired by
healthcare as well as consulting and banking companies.
It's important to be aware of the industry your potential employers will be in
so you can learn more about their needs and also how they express
themselves.
How Companies Think About AI and Machine Learning
Different companies can have very distinct interview processes.
The dream: a Silicon Valley paradise, working with a small team, raising
millions of dollars, and changing the world. Besides the glory, there are major
opportunities in working with an early-stage startup. For one, you may be
able to move faster there than anywhere else, and potentially see a large amount of
success in a short period of time.
One important thing to keep in mind if you join an early-stage company is
that your job description probably will not be static. You'll likely be required to
do a number of different things across a number of different areas. Also,
you'll probably be short on resources, so you'll need to be self-motivated,
resourceful, and flexible.
In a startup, you could have the opportunity of a lifetime, not only in terms of
learning but in the potential financial windfall. But there's also a high risk that
it could fail, as most new businesses do.
AI and machine learning are still relatively new on the hype cycle, and there
are a number of companies that have built sizeable data sets that can
incorporate ML into their business. That way, they can leverage those data
sets to improve their existing products and potentially to develop new
products.
Mid to large companies will be more rigid in their culture or in the systems
that they have set up, which can make it harder to innovate. But you'll have
data that you can use to build machine learning models based on millions of
data points. And you also know that you have a chance to immediately make
more of an impact at scale, given the number of users and customers that
you're likely to have.
While some of these companies may not be on the cutting edge of machine
learning or AI innovations, they still offer a fantastic opportunity to learn, and
you'll have solid compensation and benefits, as well as an overall solid
foundation to stand on.
and some of the most talented employees in the world. You'll often be
working on complicated problems that require very innovative thinking.
If you want a challenge and some of the best training in the world, this is the
target. You will have access to immense amounts of data. You likely won't
be able to move as fast as some of the early-stage startups, but you'll have a
good balance between those and a traditional company, with an impressive
compensation package, and often a very well recognized brand on your
resume when you decide to move on.
How to Look Into Companies
One of the most important parts of getting a good job in machine learning
and AI is first identifying the quality of the company that you want to join.
I've quoted these strong guiding questions verbatim from Karen Hao's recent
publication in the MIT Technology Review, because they are simply so
spot-on:
1. What is the problem it’s trying to solve? What does the company say
it’s trying to do, and is it worthy of machine learning? Perhaps we’re talking
to Affectiva, which is building emotion recognition technology to accurately
track and analyze people’s moods. Conceptually, this is a pattern recognition
problem and thus would be one that machine learning could tackle (see:
What is machine learning?). It would also be very challenging to approach
through another means because it is too complex to program into a set of
rules.
train an audio system to pattern match on people’s tone of voice. Here, we
want to figure out how the company has reframed its problem statement
into a machine-learning problem, and determine what data it would need to
input into its algorithms.
3. How does the company source its training data? Once we know the kind
of data the company needs, we want to know how the company goes about
acquiring it. Most AI applications use supervised machine learning, which
requires clean, high-quality labeled data. Who is labeling the data? And if the
labels are subjective like emotions, do they follow a scientific standard? In
Affectiva’s case you would learn that the company collects audio and video
data voluntarily from users, then employs trained specialists to label the data
in a rigorously consistent way. Knowing the details of this part of the pipeline
also helps you identify any potential sources of data collection or labeling
bias (See: This is how AI bias really happens).
4. Does the company have processes for auditing its products? Now we
should examine whether the company tests its products. How accurate are
its algorithms? Are they audited for bias? How often does it re-evaluate its
algorithms to make sure they’re still performing up to par? If the company
doesn’t yet have algorithms that reach its desired accuracy or fairness, what
plans does it have to make sure they will before deployment?
So, use this fantastic five-question framework from Karen Hao to assess
whether a company is really on the right track with their machine learning
product.
Whatever your most important criteria are, put those first, and don't settle.
People talk all the time about culture and how important fit is, both for a
candidate and for a company.
1. Check out Glassdoor. It’s probably the best online repository of
information about companies, including reviews from people who have
interviewed with the company as well as current and former
employees. If you see a lot of bad reviews, it’s pretty simple: you
should probably stay away. So, it's a good, quick reference check to
evaluate a company’s reputation. However, the Glassdoor approach
can be tricky, especially with a very small or very large company. With
the former, it's possible that there won’t be any reviews at all. On the
other hand, with a very large company, it's possible that none of the
reviews pertain to the machine learning part of the business.
2. Read up on recent company news. One of the easiest ways to do this
is to go to the company's website and check out their online press
room to see what they've been putting out recently. But that’s just the
company’s approved PR. You should also do at least some light
Google searching. Obviously, if they've had some bad press recently,
you might want to do additional due diligence about the business.
3. Look on LinkedIn. Check out people who currently work at the
company, both people you would directly work with and employees at
the director level and higher. You might want to check out their
backgrounds to see the schools they attended, activities they are
involved in, and of course their most recent work history and how they
describe their career. If you see some similarities and some things that
interest you, that could be a sign that the company is a good fit for
you. You should also keep in mind that some companies simply hire
differently. For example, some are looking for specialists versus
generalists. Some are looking for fully formed candidates while others
are looking for people that they can train. Looking at profiles of current
employees can help you get a sense of what that company typically
likes, and then you can consider whether that works for you.
Now that you've looked into companies a bit, let's look into getting an
interview.
8 Paths to Getting a Job Interview
Scoring an interview is sometimes the hardest (and most frustrating) part of
getting the job, even more than getting through the interview. So in this
section, we're going to help you figure out how to land an interview in the
first place.
Most companies post jobs on their career site. You can always target a
company and respond to a specific job posting or submit a general
application expressing broader interest in the company through that jobs
portal. You can also find machine learning job postings on websites like
Indeed or LinkedIn. These are old faithfuls. Definitely invest some time there.
There are specific job boards for the machine learning space, such as ML
Jobs List, as well.
I've also heard that aggregators from VCs such as First Round Capital,
Greylock, and Costanoa are quite good.
2. Recruiters
You’ll typically work with a recruiter during the interview process, but you
don’t just have to wait for them to contact you. Sometimes you may want to
reach out directly to recruiters, either at the company itself (the in-house
recruiter) or third-party recruiters who put companies in touch with great
candidates.
While the options above are pretty traditional, it's more and more common
for candidates to take a different tack when it comes to getting a job
interview. Often you'll need to hustle and demonstrate creativity and grit in
order to get a position. Startups are one of the main areas of new jobs in AI
and machine learning, and are known for pioneering a different type of
interview style.
This is often the best way to meet people interested in AI and machine
learning in your community, and you may also learn about job opportunities
from attendees. There are large conferences and smaller or more focused
community meetups that you can target, depending on what you’re looking
for.
Conferences
should consider this event if you're a machine learning engineer, but there's
likely something for everyone connected to machine learning at this one.
Another long-time conference (since 1987), this one is probably the most
focussed on theory and research into the latest developments in machine
learning. There is a significant focus on computational neuroscience, so it
would be perfect for machine learning researchers or individuals from
theoretical backgrounds to consider.
Meetups
One of the best stops for meetups is very simply Meetup.com. You’ll find all
sorts of them on the site. Some are quite large; for example, the NYC
Machine Learning Meetup has more than 13,000 members. But don’t let that
intimidate you; many people will register for a Meetup but not actually
attend. Typically, a smaller meetup has between 10 and 200 people.
Another alternative if you can't find a good meetup or one nearby: create one
yourself! I know many candidates who have gotten jobs because
they started the relevant community group. It makes you a connector or an
influencer in the community. And that's a great role to add to your CV.
There's no reason that you can't start doing machine learning work right
away. One of the easiest ways is to start freelancing. This will likely be easiest
for designers, engineers, and data scientists, though there certainly can be
opportunities for researchers and product managers as well. Sites like
Upwork or TopTal make it easy for a skilled professional to create a profile
and find work in short-term contracts and ongoing projects, or even
longer-term engagements.
A portfolio can help you build your brand and also be an online record of
experience for your work. It can also give you some early references and
testimonials that you can pass on to potential employers. And of course, it
could allow you to do some genuinely interesting work that could inspire
articles for blogs or other content that you can use to expand your profile.
Ultimately, freelancing may also simply be a good idea for you to validate the
different types of work and industries that you are interested in, to help you
narrow your job search and be more specific in the future.
Companies hiring engineers are particularly known for hiring based on
open-source contributions, and sometimes they will find you through what you
wrote. It's similar to the portfolio effect. People will often look you up online
and want to see what you worked on.
There are machine learning competitions like Kaggle and many hackathons
that allow you to work quickly on real business or social problems. It's a
great way to put your skills in machine learning and artificial intelligence to
use, and you will be able to meet people as well as showcase your ability to
make a difference.
8. Informational Interviews
The final path in, or step that you can take, is a classic, but it may be
irreplaceable. Ultimately, relationships are what can start you on the path to
a job, and help you close the deal. More than half of jobs aren't even posted
on job boards and sometimes the only way through a company’s seemingly
impenetrable shell is to start meeting people from that company and build
strong relationships with them.
One of the best ways to network is to request only a little bit of someone’s
time. A quick coffee date is ideal. Meet them on their schedule, at a place of
their choosing. Reach out via email or a LinkedIn message with a very short
note. All you need is one sentence about why you're unique as a candidate.
You could also use this great framework from Steve Blank.
Building Your Profile for Recruiters
Let's go deeper into networking. One of the most powerful sources of
information for companies is a strong referral, especially if that referral is
coming from someone who's already part of the company. If you have
someone looking out for you on the inside, they'll ensure that your
application gets looked at, and sometimes you can even jump ahead in the
interview process.
What's important to know is that these referrals don't need to be friends. It
could be as simple as having basic name recognition with a person and
making a good impression. They might put in a good word and say, "Yeah, I
met Rob once, he was great." That could help you make it past the initial
screen.
The important thing is to build relationships before you need them. You want
to take a long-term view on this and regularly be looking to build new
relationships and nurture them. That way, when an opportunity comes along
you’ll already have great relationships or places to turn to get the right
referrals. You won’t be as successful if you're only invested in these
relationships when you need help. You need to invest in them as well.
If you find yourself in a place where you need a referral right away, you could
use something that is known as the informational interview technique. This
technique is about reaching out to people in the field to get a sense of what
they are working on. If you approach in the right way, people can be very
generous with their time and offer to help you.
Hi [name],
I am very interested in the problems that Google is working on using machine learning. I’ve
been aspiring to break into the field, and being a passionate follower of the Think with
Google blog, I regularly read updates on how Google is advancing the frontiers of AI and
machine learning.
Based on my background in engineering and design, I might be able to help come up with
some creative ideas on how to help Google's latest projects.
I’d love to take you out to coffee and get a greater sense of what problems you are
working on in your role. And perhaps I can help! Would you have some time in the coming
weeks?
Cheers,
[your name]
[your LinkedIn or email]
Take a shot with something like that and see how far you get. One thing to
keep in mind: don’t get discouraged too easily. It may simply be a numbers
game, because people are quite busy. Don't feel personal disappointment if
someone doesn't respond to you.
One of the most helpful ways that you can use LinkedIn is to check out your
second-degree connections. You might be able to identify some mutual
connections that you can mention to the person you’re targeting.
Finally, you may also find from the interview that you are not interested in
working at that company or on that specific team, which will save you
valuable time!
CV vs. LinkedIn
Many people I know regularly receive job and speaking offers thanks to the
content and comments that they post on LinkedIn. One of my strategies is to
follow people who matter in my industry, like what they post, and comment on
it. Then, once you’ve established a presence through these engagements, send
them a message.
Those coming from academia tend to value publication. But when searching
for these kinds of jobs, it's all about being succinct and talking about the
impact and metrics that you've driven within your accomplishments.
Recruiters tend to run through these very quickly, so you're going to want to
be as concise as possible. Use keywords for the industry. Use meaningful
numbers. And don't sell yourself short.
The way you were probably taught to do this in school is totally different
from how you should actually do it. And it is very important, especially for
smaller companies and the hottest startups. They care about who you are as
a person and how you’ll fit within the culture.
Email the hiring manager and include a couple of key paragraphs or some
bullet points about your experiences and interests. Keep it short, sweet, and
personalized. Highlight why your background would be valuable to their
company and what you hope to get out of the experience. That really might
be all you need.
Here is a more in-depth guide with some great examples of how to make a
cover letter better.
Preparing for an Interview
Once you secure an interview—or several—you'll be invited to start the
process, typically on a screening call with a recruiter.
In this section, I will outline what a typical interview process looks like in AI
and machine learning careers.
What to Expect
For example, one of the companies that we highlight later on, Integrate.ai,
focuses heavily on the behavioral interview, and even if a candidate has the
best technical interview, they won't move forward if they aren't a cultural fit.
Phone Screen
Phone screens are about filtering out candidates who don't meet the base
parameters of the job. They are also about validating whether a candidate's
claimed experience is legitimate.
During the call, you'll answer questions, but you’ll also want to ask your own,
such as:
And consider other thoughtful questions that could help you to understand
their business and the space that they operate in, as well as the role itself.
Take-Home Assignment
After a phone interview, companies will often give you an assignment and a
deadline, usually a few days or a week. This is typically a second screening
stage that companies use to ensure that you have a minimum level of
technical skills and understanding for the role, as well as some reasoning
power and problem-solving ability. It also screens out candidates with
commitment issues.
A post on KDNuggets recommends that you:
For case studies, consider reading blogs of major companies like Google,
Facebook, Twitter, and others as you can often get a better sense of how
these companies tackle business problems with machine learning.
The likely focus of this interview will be your technical skills, and it will
probably be the last screen and phone call before you go on-site. Often this
is split into two or three different components, usually taking place over one
long call, but sometimes during three shorter phone calls of 30 minutes
each.
Coding
This part of the interview is the most common, especially for a machine
learning engineer. You'll likely be evaluated on your ability to solve a coding
challenge by presenting pseudocode, or in tougher interviews, compile-ready
code. If you're applying for a data position, it will probably be more about
asking how to query data with SQL. The questions you will be asked are
likely in the programming scripting languages that you said you're
Your interviewer may use some sort of online whiteboarding software to
evaluate you online, or ask you to share your screen. Alternatively, they may
ask you to directly collaborate with them on a text editor and have you type
in your solution. Be ready for these scenarios and train with tools like
HackerRank or Collabedit if you can.
For coding interviews, there are myriad online resources available, everything
from Cracking the Coding Interview to Interview Cake. Take advantage of
them.
You may have a call that screens for key mathematical and statistical
concepts. This is particularly relevant for those applying for data science
roles. Web companies will tend to focus on your knowledge of A/B split
testing, your understanding of how p-values are calculated, and what
statistical significance means. Energy companies may touch more heavily
on regression and linear algebra. The key in any of these interviews is to share
your entire thought process.
For example, if you're asked about A/B tests, describe the process in detail,
emphasize what to watch out for, and voice your experiences running
experiments. Treat these questions like mathematical proofs and showcase
your ability to statistically reason. And also, don't hesitate to tell a story
about why this matters and what insights you could share with a company
based on the result.
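To make the A/B testing discussion concrete, here is a minimal sketch of how a p-value could be computed for a two-variant conversion experiment. It uses a pooled two-proportion z-test, which is one common choice and an assumption here, not the only valid method. The function name `two_proportion_z_test` and the visitor and conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b are conversion counts; n_a / n_b are visitor counts.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under the null hypothesis both variants share one conversion rate,
    # so the standard error is built from the pooled rate.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 4.0% vs 5.0% conversion on 5,000 visitors each.
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

In an interview, it helps to narrate each step: state the null hypothesis (no difference between variants), explain why the pooled rate is used, and say what threshold you would compare the p-value against and why.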
Qualitative Discussion
The final aspect of the phone call with the hiring manager will focus on how
you communicate and whether you’ll mesh with the rest of the team. (This
could be a separate call from the technical phone screens.) The goal of this
call is for the hiring manager to get a feel for who you are: your character,
your motivations, your fit with the team, and also a general sense of your
intelligence. The goal is to showcase who you are and why you're the right
person for the job—not just for your skills, but your personality and traits.
The way to prepare for this part of the interview is to think about the
problems the hiring manager is facing and the kind of person they're looking
for. They may already have a mental model of what kind of qualities that
person has. Your goal here is not to be a chameleon, but you can do some
tailoring of your conversation and focusing in on specific traits of yours.
Another helpful element to think about is if you would pass "the airplane
test." If you were sitting next to each other for several hours on an airplane,
would you enjoy the experience?
If you've made it through the screens, it's now time to meet your hiring
manager in person. They'll be judging you from both a technical and
non-technical perspective. This will go deeper into evaluating who you are as
a candidate.
On-site is also where things get more intense. I've documented a full set of
questions below that you might engage with in this interview or in the next
several portions.
Technical Challenge
Executive Interview
If you make it through an interview or two with the hiring manager, you're
likely near the end of the process, and around that time you'll likely meet
executive team members. If it's a startup, this could even be a founder
and/or the CEO.
Good job if you've made it this far. Typically, it means you've already
"passed" the other portions, and now is the time for final decisions and
choosing between star candidates. The key now is to make it clear why you
are the best person for the role. Basically, keep doing what you've done, and
don't let nerves get to you.
The executive interview usually won't be technical. Odds are, it's about
solidifying their choice based on fit and how you get along with the strategic
vision of the company. It's also a good opportunity to discuss how you could
see yourself growing with the organization, not just how you would fit in for a
specific role.
Key Interview Questions, Prep, and Solutions
I have divided this broadly into two categories: behavioral and technical. I've
further divided the technical portion into several sub-categories (e.g.,
algorithms, probability, and more).
One general recommendation I've been given is to review the entirety of
chapter five of the MIT Press “Deep Learning” book, which focuses on
machine learning basics. You can access it for free here.
Below, I've done my best to pull together tactical examples, proofs, and
resources that you can directly apply toward verified questions in machine
learning interviews.
Behavioral
These questions are used to evaluate an applicant’s qualitative skills and fit,
including past work situations and scenarios, as well as teamwork skills.
Past work
Can you tell me about an AI / machine learning project that you have done in
the past?
■ Go into detail about your specific contribution and the outcome from a
business goal perspective. The interviewer wants to know what you
specifically did while trying to understand the overall goal of the project.
Intent: The intent of the question is to identify whether the role you’re
interviewing for is suitable for you, and to identify why you’re moving on from
a previous position.
■ Understand the role well. If possible, before the interview, use your HR
contact to get as much information as possible about the role and its
challenges. The HR person can be a treasure trove of information about the
role, team, history, and key immediate business goals.
■ Avoid talking about issues you had with specific people, and be
professional when talking about what you disliked. Introspect carefully and
talk to what makes you passionate. For example, discuss solving a machine
learning problem in an actionable way as something you enjoy. You could
also talk about learning new technologies that make machine learning
applicable across an organization. You might dislike how the organization is
not placing AI/ML at the center of its strategy or that the company has had
significant attrition at the management level and the direction of the team is
unclear. Keep it positive and away from personal situations.
● Bad: “I didn't like that management didn’t have a clue what the company
direction was!”
Situational
Tell me about a time when you had to convince others to take your position
on a specific matter. What was the outcome?
Intent: The intent is to find out how good you are at defending your position
and how well you can drive change within a team.
I often use a framework to describe situations and outcomes that has its
origins at prestigious business consulting firm McKinsey:
Bottom line is, you want to describe a situation that was happening as
normal, a trigger event or complication that messed things up, and then how
you pushed the team to come to resolution. This will show that you don't
simply react instinctually, but that you think about how to solve problems.
Note: this is a general framework that you can use for many situational
questions.
This is one of the most helpful frameworks that I've seen on how to
approach an ML project in general. Here it is:
logic. This baseline can also serve as a good benchmark for ML
algorithms.
● Review ML literature. To avoid reinventing the wheel and get inspired
on what techniques / algorithms are good at addressing the questions
using our data.
● Set up a single-number metric. What does it mean to be successful
(high accuracy, lower error, or bigger AUC?) and how do you measure
it? The metric has to align with high-level goals. Set up a single-number
against which all models are measured.
● Do exploratory data analysis. Play with the data to get a general idea of
data types, distribution, variable correlation, facets, etc. This step
would involve a lot of plotting.
● Partition data. The validation set should be large enough to detect
differences between the models you’re training; the test set should be
large enough to indicate the overall performance of the final model; for
the training set, needless to say, the larger the merrier.
● Preprocess. This would include data integration, cleaning,
transformation, reduction, discretization, and more.
● Engineer features. Coming up with features is difficult,
time-consuming, and requires expert knowledge. Applied machine
learning is basically feature engineering. This step usually involves
feature selection and creation, using domain knowledge. It can be
minimal for deep learning projects.
● Develop models. Choose which algorithm to use, what
hyperparameters to tune, which architecture to use, etc.
● Ensemble. Ensemble methods can usually boost performance,
depending on the correlations of the models/features. So it’s always a
good idea to try out. But be open-minded about making
tradeoffs—some ensemble methods are too complex/slow to put into
production.
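To make the "Partition data" step above a little more concrete, here is a minimal sketch in Python. The 70/15/15 ratios, the fixed seed, and the function name partition are my own assumptions for illustration; the framework itself doesn't prescribe them, and for time-ordered data you would not shuffle at all.

```python
import random

def partition(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle rows and split them into train, validation, and test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)      # fixed seed so the split is repeatable
    n_test = round(len(rows) * test_frac)
    n_val = round(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]          # everything left over is training data
    return train, val, test

train, val, test = partition(range(100))
print(len(train), len(val), len(test))
```

The point to bring up in an interview is that the test set is touched only once, at the end, while the validation set is what you use to compare models along the way.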
Note: I directly quote verbatim questions and solutions from many sources
and have done my best to give attribution and links everywhere that I can. I
exclude quotation marks to keep it streamlined. This is a compilation of the
excellent work of others, and I've gathered the best and most concise
solutions possible.
Cracking The Machine Learning Interview (this is the longest set of
questions that I've seen)
Mathematical Skills
Linear Algebra
The two primary mathematical entities that are of interest in linear algebra
are the vector and the matrix. They are examples of a more general entity
known as a tensor. Tensors possess an order (or rank), which determines
the number of dimensions in an array required to represent it. Scalars are
Solution from Machine Learning Mastery
Broadcasting is the name given to the method that NumPy uses to allow
array arithmetic between arrays with a different shape or size.
The term broadcasting describes how numpy treats arrays with different
shapes during arithmetic operations. Subject to certain constraints, the
smaller array is “broadcast” across the larger array so that they have
compatible shapes.
— Broadcasting, SciPy.org
NumPy does not actually duplicate the smaller array; instead, it makes
memory and computationally efficient use of existing structures in memory
that in effect achieve the same result.
The concept has also permeated linear algebra notation to simplify the
explanation of simple operations.
In the context of deep learning, we also use some less conventional notation.
We allow the addition of matrix and a vector, yielding another matrix: C = A +
b, where C_{i,j} = A_{i,j} + b_j. In other words, the vector b is added to each row of
the matrix. This shorthand eliminates the need to define a matrix with b
copied into each row before doing the addition. This implicit copying of b to
many locations is called broadcasting.
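Here is a small sketch of the C = A + b case described above, using NumPy. The array values are made up for illustration; np.tile is used only to show what the explicit copy of b into every row would look like.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])       # shape (2, 3)
b = np.array([10, 20, 30])      # shape (3,)

# b is broadcast across each row of A; no copy of b is materialized.
C = A + b

# The same result, written the long way by copying b into every row first.
C_explicit = A + np.tile(b, (A.shape[0], 1))

print(C)
```

Broadcasting only works when the trailing dimensions are compatible, so a b of length 2 would raise an error here rather than silently stretch.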
Linear and logistic regression are commonly used ML algorithms. You
can expect at least one of these questions in an interview.
Analyze a data set and give a model that can predict this response variable.
Source: Wikipedia
The cost of one pen is $b. The cost of ten pens is $10b. This is the most
classic layman’s form of linear regression. The simplest form of the
regression equation with one dependent and one independent variable is
defined by the formula y = c + b*x, where y = estimated dependent variable
score, c = constant, b = regression coefficient, and x = score on the
independent variable. In our pen example, c = 0, y is the total cost of the
pens, x is the number of pens, and b is the unit cost of one pen. If we know b,
we can calculate the cost of any number of pens. A more complex form of linear
regression is used in housing price predictions.
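As a rough sketch of how the line y = c + b*x can be recovered from data, here is ordinary least squares for a single variable in plain Python. The pen counts and the $2 unit price are hypothetical numbers I picked for illustration, and fit_line is just a helper name.

```python
def fit_line(xs, ys):
    """Fit y = c + b*x by ordinary least squares (one independent variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    c = mean_y - b * mean_x     # intercept
    return c, b

pens = [1, 2, 5, 10]            # x: number of pens
cost = [2.0, 4.0, 10.0, 20.0]   # y: total cost, hypothetical $2 per pen
c, b = fit_line(pens, cost)
print(c, b)
```

With noise-free data like this, the fit recovers the unit cost as the slope and a constant of zero. Real data would leave residuals, which is where the choice of loss function starts to matter.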
Source: Logistic regression
Linear regression is used for continuous targets, while logistic regression is
used for binary targets: the sigmoid curve in the logistic model squeezes the
output into a probability between 0 and 1, which is then thresholded to
predict class 0 or class 1.
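One way to picture the role of the sigmoid is the sketch below: the linear score c + b*x is mapped to a probability between 0 and 1, and a threshold turns that probability into a class. The coefficients and the 0.5 threshold are assumptions chosen for illustration, not fitted values.

```python
import math

def sigmoid(z):
    """Map any real-valued score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, c, b, threshold=0.5):
    """Return (probability of class 1, predicted class) for one input."""
    p = sigmoid(c + b * x)
    return p, int(p >= threshold)

# Hypothetical coefficients: c = -2, b = 1
print(predict(3, c=-2, b=1))   # score 1, probability above 0.5, class 1
print(predict(0, c=-2, b=1))   # score -2, probability below 0.5, class 0
```

Note that math.exp overflows for very large negative scores, so production code usually relies on a numerically stable library implementation.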
Solution from Towards Data Science
SVM tries to maximize the margin between the closest support vectors, while
LR maximizes the posterior class probability. Thus, SVM finds a solution which
is as fair as possible for the two categories, while LR does not have this property.
How can the SVM optimization function be derived from the logistic
regression optimization function?
This one is a bit too long to include in its entirety here, so check out this
resource, particularly slides 5 and 10.
More on Demystifying the math of support vector machines
Numerical Optimization
According to Jason Brownlee, at the core of machine learning models is an
optimization problem. Each is really a "search" for terms with unknown
values needed to fill an equation.
A solution from Zarantech describes both the ordinary least squares and
maximum likelihood methods of reaching these values.
Answer: OLS and maximum likelihood are the methods used by the
respective regression methods to approximate the unknown parameter
(coefficient) values. In simple words, ordinary least squares is a method used
in linear regression that chooses the parameters minimizing the sum of
squared differences between actual and predicted values. Maximum
likelihood chooses the parameter values under which the observed data
are most probable.
Maximum likelihood
More on optimization
Overflow is when the absolute value of the number is too high for the
computer to represent it. Underflow is when the absolute value of the
number is too close to zero for the computer to represent it.
You can get overflow with both integers and floating point numbers. You can
only get underflow with floating point numbers.
If the variable x is a signed byte, it can hold values in the range -128 to +127,
so the following overflows:
    x = 127
    x = x + 1
For floating point numbers, the range depends on their representation. If x is
a single precision (32-bit IEEE) number, then the following underflows:
    x = 1e-38
    x = x / 1000
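Here is a hedged sketch of both effects in Python. Two assumptions to flag: Python integers never overflow, so the signed byte is simulated with a helper (wrap_int8, my own name) that mimics typical two's-complement wrap-around; and Python floats are 64-bit doubles on standard builds, so the limits are those of double precision rather than the single precision used in the text above.

```python
import math

def wrap_int8(value):
    """Simulate storing an integer in a signed 8-bit byte (two's complement)."""
    return (value + 128) % 256 - 128

x = wrap_int8(127)
x = wrap_int8(x + 1)      # integer overflow: wraps around to -128

big = 1e308 * 10          # float overflow: too large for a double, becomes inf
tiny = 1e-320 / 1e10      # float underflow: too close to zero, becomes 0.0

print(x, big, tiny)
```

In ML code this usually shows up when multiplying many small probabilities together, which is why log-probabilities are summed instead.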
Bayes’ Theorem gives you the posterior probability of an event given what is
known as prior knowledge.
Bayes’ Theorem says no. It says that you have a
(0.6 × 0.05) / ((0.6 × 0.05) + (0.5 × 0.95)) = 0.0594, or 5.94%, chance of
getting the flu. Here 0.6 × 0.05 is the True Positive Rate of a Condition
Sample and 0.5 × 0.95 is the False Positive Rate of a Population.
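The arithmetic above can be checked with a few lines of Python. The parameter names are my own reading of the numbers (0.05 as the base rate in the population, 0.6 as the rate of a positive test among people with the flu, 0.5 as the rate of a positive test among people without it), so treat them as assumptions.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive test) via Bayes' Theorem."""
    numerator = true_positive_rate * prior
    # Total probability of a positive test, from both groups.
    evidence = numerator + false_positive_rate * (1 - prior)
    return numerator / evidence

p = posterior(prior=0.05, true_positive_rate=0.6, false_positive_rate=0.5)
print(round(p, 4))  # 0.0594
```

The takeaway for the interview is that a low base rate drags the posterior down no matter how convincing a single positive test looks.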
Bayes’ Theorem is the basis behind a branch of machine learning that most
notably includes the Naive Bayes classifier. That’s important to consider
when you’re faced with machine learning interview questions.
Type II error means that you claim nothing is happening when in fact
something is.
You may want to communicate your grasp of the concepts with an example
and how it might be relevant to the business at hand. Type I error, or a false
positive, would be telling a man he was pregnant, while Type II error would
be telling a pregnant woman she wasn’t.
If you were running a fraud detection business, you might have a very high
tolerance for false positives (a client will not fuss about an email on the
potential of fraud), but a false negative (not detecting fraud when it is
happening) could be disastrous.
Confidence Intervals
Solution from Key Differences
Sample mean:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Population mean:
$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$
Here ∑ means "add up".
The formula we use for standard deviation depends on whether the data is
being considered a population of its own, or the data is a sample
representing a larger population.
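Population standard deviation:
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$
Sample standard deviation:
$$s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$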
The steps in each formula are all the same except for one—we divide by one
less than the number of data points when dealing with sample data.
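As a quick check of the "divide by N" versus "divide by n - 1" distinction, Python's standard statistics module offers both versions. The data set below is made up for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5, squared deviations sum to 32

# Treat the data as the whole population: divide by N = 8.
population_sd = statistics.pstdev(data)

# Treat the data as a sample of a larger population: divide by n - 1 = 7.
sample_sd = statistics.stdev(data)

print(population_sd, sample_sd)
```

The sample version is always a little larger, and the gap shrinks as the number of data points grows.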
Answer from Towards Data Science
variables from experiments like dice rolls, choosing a number out of a hat, or
getting a high score on a test. The “discrete” part means that there’s a set
number of outcomes. For example, you can only roll a 1,2,3,4,5, or 6 on a die.
P(X = x)
That just means “the probability that X takes on some value x.”
It’s not a very useful equation on its own; what’s more useful is an equation
that tells you the probability of some individual event happening. For
example:
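How you come up with these equations depends mostly on what type of
event you have. For example, the binomial distribution PMF, for k successes in
n independent trials that each succeed with probability p, is:
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$
And the Poisson distribution PMF, for k events at an average rate λ, is:
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$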
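For reference, here is a minimal sketch of the two PMFs in Python, using their standard textbook forms. The function names and the example parameters are my own choices for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average rate is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Probability of exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))

# Probability of seeing no events when 2 are expected on average.
print(poisson_pmf(0, 2.0))
```

A useful sanity check for any PMF is that the probabilities over all possible outcomes add up to 1.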
To find the probability that X falls in an interval (a, b) you need to find P(a < X
< b).
(1) f(x) is positive everywhere in the support S, that is, f(x) > 0, for all x in S
(2) The area under the curve f(x) in the support S is 1, that is:
$$\int_S f(x)\,dx = 1$$
(3) If f(x) is the p.d.f. of x, then the probability that x belongs to A, where A is
some interval, is given by the integral of f(x) over that interval, that is:
$$P(x \in A) = \int_A f(x)\,dx$$
Autoencoders
Autoencoders (AE) are neural networks that aim to copy their inputs to their
outputs. They work by compressing the input into a latent-space
representation, and then reconstructing the output from this representation.
1. Encoder: This is the part of the network that compresses the input into
a latent-space representation. It can be represented by an encoding
function h=f(x).
2. Decoder: This part aims to reconstruct the input from the latent space
representation. It can be represented by a decoding function r=g(h).
Architecture of an autoencoder
Programming Skills
This kind of question demonstrates your ability to think in parallelism and
how you could handle concurrency in programming implementations dealing
with big data. Take a look at pseudo-code frameworks such as Peril-L and
visualization tools such as Web Sequence Diagrams to help you
demonstrate your ability to write code that reflects parallelism.
A hash table is a data structure that produces an associative array. A key is
mapped to certain values through the use of a hash function. They are often
used for tasks such as database indexing.
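If you are asked to sketch one on a whiteboard, something like the following would do. It is a minimal illustration that assumes separate chaining (each bucket is a list of key/value pairs) and a fixed number of buckets; real implementations, including Python's own dict, resize and handle collisions differently.

```python
class HashTable:
    """A tiny associative array using a hash function and chained buckets."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _index(self, key):
        # The hash function maps a key to one of the buckets.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))        # new key, or a collision in this bucket

    def get(self, key, default=None):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default

table = HashTable()
table.put("alice", 31)
table.put("bob", 27)
table.put("alice", 32)
print(table.get("alice"), table.get("carol"))
```

Lookups stay fast on average because only one bucket is scanned, which is the property that makes hash tables attractive for indexing.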
SQL
Although you don't have to be a SQL expert for most machine learning
positions (it is more common for a data scientist role), some SQL-related
questions could definitely come up.
● (INNER) JOIN: Return records that have matching values in both tables
● LEFT (OUTER) JOIN: Return all records from the left table, and the
matched records from the right table
● RIGHT (OUTER) JOIN: Return all records from the right table, and the
matched records from the left table
● FULL (OUTER) JOIN: Return all records when there is a match in either
left or right table
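To see the difference between the first two join types in practice, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The customers and orders tables and their rows are hypothetical. RIGHT and FULL joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Alan');
    INSERT INTO orders VALUES (10, 1, 'book'), (11, 1, 'pen'), (12, 2, 'lamp');
""")

# INNER JOIN: only customers that have at least one matching order.
inner = conn.execute("""
    SELECT c.name, o.item FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

# LEFT JOIN: every customer, with NULL (None) where no order matches.
left = conn.execute("""
    SELECT c.name, o.item FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

print(inner)
print(left)
conn.close()
```

Alan has no orders, so he disappears from the inner join but shows up in the left join with an empty item.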
1. W3schools SQL
2. SQLZOO
Variance is error due to too much complexity in the learning algorithm you’re
using. This leads to the algorithm being highly sensitive to high degrees of
variation in your training data, which can lead your model to overfit the data.
You’ll be carrying too much noise from your training data for your model to
be very useful for your test data.
More reading: What is the difference between supervised and unsupervised machine
learning? (Quora)
Supervised learning requires labeled training data. For example, in order to
do classification (a supervised learning task), you’ll need to first label the
data you’ll use to train the model to classify data into your labeled groups.
Unsupervised learning, in contrast, does not require labeling data explicitly.
More reading: How is the k-nearest neighbor algorithm different from k-means clustering?
(Quora)
The critical difference here is that KNN needs labeled points and is thus
supervised learning, while k-means doesn’t and is thus unsupervised
learning.
What’s your favorite algorithm, and can you explain it to me in less than a
minute?
What cross-validation technique would you use on a time series data set?
Instead of using standard k-folds cross-validation, you have to pay attention
to the fact that a time series is not randomly distributed data: it is inherently
ordered chronologically. If you shuffle it, your model can train on later
periods and be tested on earlier ones, so a pattern that only emerges later
leaks into predictions for years where it doesn’t hold!
You’ll want to do something like forward chaining, where you’ll be able to
model on past data then look at forward-facing data.
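A minimal sketch of forward chaining might look like this. The equal-sized folds and the function name are my own assumptions; libraries offer more configurable versions of the same idea.

```python
def forward_chaining_splits(n_samples, n_folds):
    """Yield (train_indices, test_indices) where training always precedes testing."""
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train = list(range(0, k * fold_size))                  # all data up to fold k
        test = list(range(k * fold_size, (k + 1) * fold_size)) # the next block in time
        yield train, test

splits = list(forward_chaining_splits(n_samples=12, n_folds=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Each fold trains on a longer history and is evaluated on the block that follows it, so the model is never scored on data older than what it learned from.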
Pruning is what happens in decision trees when branches that have weak
predictive power are removed in order to reduce the complexity of the model
and increase the predictive accuracy of a decision tree model. Pruning can
happen bottom-up and top-down, with approaches such as reduced error
pruning and cost complexity pruning.
Reduced error pruning is perhaps the simplest version: starting at the leaves,
replace each node with its most popular class. If that doesn’t decrease
predictive accuracy, keep it pruned. While simple, this heuristic actually
comes pretty close to an approach that would optimize for maximum accuracy.
You could find missing/corrupted data in a data set and either drop those
rows or columns, or decide to replace them with another value.
In Pandas, there are two very useful methods that will help you find columns
of data with missing or corrupted data and drop those values: isnull() and
dropna(). If you want to fill the invalid values with a placeholder value (for
example, 0), you could use the fillna() method.
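Here is a short sketch of those three Pandas methods on a made-up table. The column names and values are hypothetical, and filling with 0 is only one possible placeholder; a mean or median is often a better choice.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [34, None, 29],
    "income": [52000.0, 61000.0, None],
})

missing_per_column = df.isnull().sum()   # how many values are missing in each column
dropped = df.dropna()                    # keep only rows with no missing values
filled = df.fillna(0)                    # replace every missing value with 0

print(missing_per_column)
print(dropped)
print(filled)
```

Dropping rows is the simplest fix, but on a small data set it can throw away a large share of your training examples, so be ready to justify the choice.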
More reading: 8 Tactics to Combat Imbalanced Classes in Your Machine Learning data
set (Machine Learning Mastery)
An imbalanced data set is when you have, for example, a classification test
and 90% of the data is in one class. That leads to problems: an accuracy of
90% can be skewed if you have no predictive power on the other category of
data! Here are a few tactics to get over the hump:
1- Collect more data to even the imbalances in the data set.
What’s important here is that you have a keen sense for what damage an
unbalanced data set can cause, and how to balance that.
Do you have experience with Spark or big data tools for machine learning?
More reading: 50 Top Open Source Tools for Big Data (Datamation)
You’ll want to get familiar with the meaning of big data for different
companies and the different tools they’ll want. Spark is the big data tool
most in demand now, able to handle immense data sets with speed. Be
honest if you don’t have experience with the tools demanded, but also take a
look at job descriptions and see what tools pop up: you’ll want to invest in
familiarizing yourself with them.
More reading: What is the difference between a generative and discriminative algorithm?
(Stack Overflow)
A generative model will learn categories of data while a discriminative model
will simply learn the distinction between different categories of data.
Discriminative models will generally outperform generative models on
classification tasks.
This question tests your grasp of the nuances of machine learning model
performance! Machine learning interview questions often look toward the
details. There are models with higher accuracy that can perform worse in
predictive power. How does that make sense?
Well, it has everything to do with how model accuracy is only a subset of
model performance, and at that, a sometimes misleading one. For example,
if you wanted to detect fraud in a massive data set with a sample of millions,
a more accurate model would most likely predict no fraud at all if only a tiny
minority of cases were fraud. However, this would be useless for a predictive
model: a model designed to find fraud that asserted there was no fraud at
all! Questions like this help you demonstrate that you understand model
accuracy isn’t the be-all and end-all of model performance.
The F1 score is a measure of a model’s performance. It is the harmonic
mean of the precision and recall of a model, with results tending to 1 being
the best, and those tending to 0 being the worst. You would use it in
classification tests where true negatives don’t matter much.
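To tie the fraud example and the F1 score together, here is a sketch with made-up numbers: 1,000 transactions of which 10 are fraud, a "model" that always predicts no fraud, and a hypothetical detector that catches 8 of the 10 at the cost of 12 false alarms.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

y_true = [1] * 10 + [0] * 990                        # 1% of cases are fraud
always_no = [0] * 1000                               # never flags anything
detector = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978  # 8 hits, 2 misses, 12 false alarms

acc_no = accuracy(y_true, always_no)
f1_no = precision_recall_f1(y_true, always_no)[2]
acc_det = accuracy(y_true, detector)
f1_det = precision_recall_f1(y_true, detector)[2]
print(acc_no, f1_no)
print(acc_det, f1_det)
```

The do-nothing model wins on accuracy but scores an F1 of zero, while the detector gives up a little accuracy and is the only one of the two that actually finds fraud.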
Ensemble techniques combine several learning algorithms to achieve
better predictive performance. They typically reduce overfitting in models
and make the model more robust (unlikely to be influenced by small changes
in the training data).
You could list some examples of ensemble methods, from bagging to
boosting to a “bucket of models” method and demonstrate how they could
increase predictive power.
This is a simple restatement of a fundamental problem in machine learning:
the possibility of overfitting the training data and carrying the noise of that
data through to the test set, thereby providing inaccurate generalizations.
1. Keep the model simple: reduce variance by taking into account fewer
variables and parameters, thereby removing some of the noise in the
training data.
2. Use cross-validation techniques such as k-folds cross-validation.
3. Use regularization techniques such as LASSO that penalize certain
model parameters if they’re likely to cause overfitting.
More reading: How to Evaluate Machine Learning Algorithms (Machine Learning Mastery)
You would first split the data set into training and test sets, or perhaps use
cross-validation techniques to further segment the data set into composite
sets of training and test sets within the data. You should then implement a
choice selection of performance metrics—here is a fairly comprehensive list.
You could use measures such as the F1 score, accuracy, and the confusion
matrix. What’s important here is to demonstrate that you understand the
nuances of how a model is measured and how to choose the right
performance measures for the right situations.
Deep Learning
What is deep learning, and how does it contrast with other machine learning
algorithms?
You can imagine that the ability to learn from unlabeled or unstructured data
is an enormous benefit for applications in the real world. It allows you to
create systems that can learn from a chaotic and spontaneous world, with
many different inputs.
Solution from Deep Learning
Most deep learning methods use neural network architectures, which is why
deep learning models are often referred to as deep neural networks.
One of the most popular types of deep neural networks is known as
convolutional neural networks (CNN or ConvNet). A CNN convolves learned
features with input data, and uses 2D convolutional layers, making this
architecture well suited to processing 2D data, such as images.
CNNs eliminate the need for manual feature extraction, so you do not need
to identify features used to classify images. The CNN works by extracting
features directly from images. The relevant features are not pretrained; they
are learned while the network trains on a collection of images. This
automated feature extraction makes deep learning models highly accurate
for computer vision tasks such as object classification.
I found another long, incredibly comprehensive solution here for this on
Machine Learning Mastery.
It goes through:
The same author also wrote a great guide on building a time series model
more generally.
Regularization
L2 regularization tends to spread error among all the terms, while L1 is more
binary/sparse, with many variables either being assigned a 1 or a 0 in
weighting. L1 corresponds to setting a Laplacian prior on the terms, while
L2 corresponds to a Gaussian prior.
What is dropout?
From Machine Learning Mastery
Dropout is a regularization technique patented by Google for reducing
overfitting in neural networks by preventing complex co-adaptations on
training data.
Clustering
The k-means clustering distortion function measures how well the data fits a
cluster and is computed as a simple sum of squared distances:
$$D_j = \sum_{x \in \text{cluster } j} \lVert x - w \rVert^2$$
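For a given cluster j, we add the squared distances from all the cluster points
x to the cluster center w. The total distortion is just the sum of all distortions
for a given value of K:
$$D = \sum_{j=1}^{K} \sum_{x \in \text{cluster } j} \lVert x - w_j \rVert^2$$
where w_j is the center of cluster j.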
The scree plot shows how the total distortion changes as we increase K. At
the limit, when K equals the number of samples in the data set, when every
point in the data set corresponds to its own cluster, the total distortion is
zero.
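The sketch below computes the total distortion for a toy data set and confirms the limiting case just described. The points, centers, and assignments are made up for illustration, and the cluster assignment is passed in directly rather than learned by running k-means.

```python
def total_distortion(points, centers, assignment):
    """Sum of squared distances from each point to its assigned cluster center."""
    total = 0.0
    for point, j in zip(points, assignment):
        center = centers[j]
        total += sum((p - c) ** 2 for p, c in zip(point, center))
    return total

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# K = 2: two centers, each sitting halfway between a pair of points.
d_k2 = total_distortion(points, [(0.0, 1.0), (10.0, 1.0)], [0, 0, 1, 1])

# K = 4: every point is its own cluster center.
d_k4 = total_distortion(points, points, [0, 1, 2, 3])

print(d_k2, d_k4)
```

Because distortion can only go down as K grows, the scree plot is read by looking for the "elbow" where adding clusters stops helping much, not for the minimum.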
More reading here; more on convex vs not-convex and proof here.
More reading here.
What is WORD2VEC?
Solution from Skymind.ai
WORD2VEC is a two-layer neural net that processes text. Its input is a text
corpus and its output is a set of vectors: feature vectors for words in that
corpus. While Word2vec is not a deep neural network, it turns text into a
numerical form that deep nets can understand. Deeplearning4j implements
a distributed form of Word2vec for Java and Scala, which works on Spark
with GPUs.
Solution from DataCamp
How it works:
More reading here.
Loss Optimization
Name some typical loss functions used for regression. Compare and
contrast.
Mean square error (MSE) is the most commonly used regression loss
function. MSE is the mean of squared differences between our target variable
and predicted values.
Mean absolute error (MAE) is another loss function used for regression
models. MAE is the mean of absolute differences between our target and
predicted variables. It measures the average magnitude of errors in a set of
predictions, without considering their directions.
Huber loss is less sensitive to outliers in data than the squared error loss. It’s
also differentiable at 0. It’s basically absolute error, which becomes
quadratic when error is small. How small that error has to be to make it
quadratic depends on a hyperparameter, 𝛿 (delta), which can be tuned.
Huber loss approaches MAE when 𝛿 ~ 0 and MSE when 𝛿 ~ ∞ (large
numbers).
One big problem with using MAE for training of neural nets is its constantly
large gradient, which can lead to missing minima at the end of training using
gradient descent. For MSE, gradient decreases as the loss gets close to its
minima, making it more precise.
Huber loss can be really helpful in such cases, as it curves around the
minima which decreases the gradient. And it’s more robust to outliers than
MSE. Therefore, it combines good properties from both MSE and MAE.
However, the problem with Huber loss is that we might need to tune the
hyperparameter delta, which is an iterative process.
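To compare the three losses side by side, here is a minimal sketch in plain Python. The targets and predictions are made up, with one deliberate outlier, and the Huber loss uses the common form that is quadratic up to delta and linear beyond it (delta = 1.0 is an arbitrary choice).

```python
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def huber(y_true, y_pred, delta=1.0):
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2               # quadratic for small errors
        else:
            total += delta * (err - 0.5 * delta)  # linear for large errors
    return total / len(y_true)

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 14.0]   # the last prediction is an outlier (error of 10)

print(mse(y_true, y_pred), mae(y_true, y_pred), huber(y_true, y_pred))
```

The single outlier dominates the MSE, while MAE and Huber stay on the scale of the typical error, which is the robustness property discussed above.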
What is the 0–1 loss function? Why can’t the 0–1 loss function or
classification error be used as a loss function for optimizing a deep neural
network?
Zero-one loss is a common loss function used with classification learning. It
assigns 0 to loss for a correct classification and 1 for an incorrect
classification.
The reason it isn't a good fit as a loss function for optimization has to do
with convexity and gradients. It is non-convex and non-differentiable at the
decision boundary, and it is flat everywhere else, so its gradient gives
gradient descent no signal. Even if you smooth it to make it differentiable, the
function remains non-convex and difficult to optimize. Convex functions are
a better choice, such as the hinge loss in conjunction with the SVM model.
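A small sketch makes the contrast visible. It assumes labels of +1 and -1 and a raw model score, which is the usual convention for the hinge loss; the scores themselves are made up.

```python
def zero_one_loss(y, score):
    """0 for a correct classification, 1 for an incorrect one (y is +1 or -1)."""
    return 0 if y * score > 0 else 1

def hinge_loss(y, score):
    """Convex surrogate: penalizes wrong or low-margin predictions linearly."""
    return max(0.0, 1.0 - y * score)

for score in (-0.5, 0.1, 0.9, 5.0):
    print(score, zero_one_loss(1, score), hinge_loss(1, score))
```

For a positive example, the zero-one loss is the same at a score of 0.1 and at 5.0, so it cannot tell the optimizer which direction improves confidence. The hinge loss keeps shrinking until the margin reaches 1.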
More on the Monte Carlo simulation here; how to estimate Pi using the
Monte Carlo method here.
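In case the links are not at hand, the Pi estimate is short enough to sketch here. The idea is to throw random points into the unit square and count how many land inside the quarter circle of radius 1; the sample size and seed below are arbitrary choices of mine.

```python
import random

def estimate_pi(n_samples, seed=0):
    """Estimate Pi from the share of random points that fall inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers Pi/4 of the unit square.
    return 4.0 * inside / n_samples

print(estimate_pi(100_000))
```

The error shrinks roughly with the square root of the number of samples, so each extra digit of precision costs about a hundred times more points.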
Representation
Most representation learning problems face a trade-off between preserving
as much information about the input as possible and attaining nice
properties (such as independence).
Dimensionality Reduction
So you can see that if there's a way to reduce dimensionality, you can
economize. The technique of dimensionality reduction can help to compress
data without losing too much signal.
More reading: What are some of the best research papers/books for machine learning?
Related to the last point, most organizations hiring for machine learning
positions will look for your formal experience in the field. Research papers,
co-authored or supervised by leaders in the field, can make the difference
between you being hired and not. Make sure you have a summary of your
research experience and papers ready—and an explanation for your
background and lack of formal research experience if you don’t.
More reading: What are the typical use cases for different machine learning algorithms?
(Quora)
The Quora thread above contains some examples, such as decision trees
that categorize people into different tiers of intelligence based on IQ scores.
Make sure that you have a few examples in mind and describe what
resonated with you. It’s important that you demonstrate an interest in how
machine learning is implemented.
Machine learning interview questions like this one really test your knowledge
How would you simulate the approach AlphaGo took to beat Lee Sedol at Go?
More reading: Mastering the game of Go with deep neural networks and tree search
(Nature)
AlphaGo beating Lee Sedol, one of the best human players at Go, in a best-of-five
series was a seminal event in the history of machine learning and deep
learning. The paper above describes how this was accomplished with
“Monte-Carlo tree search with deep neural networks that have been trained
by supervised learning, from human expert games, and by reinforcement
learning from games of self-play.”
Case Studies/Scenarios
These are more specific scenarios that a company could give you that apply
directly to their business.
This kind of question requires you to listen carefully and impart feedback in a
manner that is constructive and insightful. Your interviewer is trying to gauge
if you’d be a valuable member of their team and whether you grasp the
nuances of why certain things are set the way they are in the company’s
This is a tricky question. The ideal answer would demonstrate knowledge of
what drives the business and how your skills could relate. For example, if you
were interviewing for music-streaming company Spotify, you could remark
that your skills at developing a better recommendation model would
increase user retention, which would then increase revenue in the long run.
The startup metrics linked above will help you understand exactly what
performance indicators are important for startups and tech companies as
they think about revenue and growth.
Here are key questions and prompts from a great prep document from Rafi
Lurie. I haven't gone into specific solutions here as they are much more
open-ended, but check the guide for additional breakdowns on how to think
about these problems.
Product vision: What product do you feel has a lot of potential but hasn't
achieved it yet?
● Why?
Conceptualizing, designing, and building a new feature is only half the battle.
How do you launch a new feature?
In Their Own Words: What Hiring Managers Are Looking
For
We interviewed hiring managers from startups, software companies, and big
corporations in order to give you a broad understanding of the different
types of interviews.
Further, these managers each have different roles and hire for different types
of candidates. They should give you a decent sense of what each of them
are looking for and what you can do to prepare and stand out.
● Resume screen
● Phone screen
● Technical interview (mix of take-home challenge and on-site coding)
● Deep, technical on-site interview (with our engineering team) with a
product on-site interview (with our product team)
What's the best advice that you can give job seekers?
My best advice is to actively work on projects and have them available on
your GitHub profile.
Also, work on real problems with real data because in the real world, most
problems don't have perfect data. The majority of your time is going to be
spent cleaning up data.
You need to be able to show that you can go through the entire workflow
from data engineering to modeling to productionizing your code.
What are the key questions that you ask? Anything particularly unique?
● When have you worked with a data set and what did you do with it?
● We will ask, in different scenarios, why do you select one model versus
another one? What are the pros and cons?
● Can you explain how you got your results? How do you go through
model validation?
● What is the implication of what you found?
Ultimately, we are testing for machine learning literacy, that you have the
ability to use your knowledge in practical situations to see the whole process
through, and that you have a genuine outside-of-work interest in ML and AI.
We aren't looking for a cookie-cutter profile. Hiring for these roles used to
mean "find someone in San Francisco that has a Ph.D." But it's a bit different
now. These days, everyone and their neighbor calls themselves a machine
learning scientist or a data scientist. So we have to go deeper. We look for
diversity across a number of different angles. Yes, our researchers still tend
to have Ph.D.s, but they don't have to be in data or stats: one of our
engineers has a Ph.D. in astrophysics and did research on black holes.
We are also open to hiring from around the world and bringing people to
Canada.
● Discuss projects you have worked on and the impact you have had on
those projects.
● What did you actually do to make it happen?
Step 3: 3-4 hour in-person interview
Note that:
The best advice we can give is to be vulnerable and show your humanity in
our interview process. We care about that just as much as your machine
learning capabilities.
1. The first step is usually a cold analytical exercise, where we give you a
data set and a few questions for you to answer.
2. Then we have a phone/Zoom screen, testing for analytical skills, as well as
to get to know you and the work that you have done. We'll ask you about
one or two projects that you have done in your portfolio. And then the
majority of the interview is really focused on an open-ended exercise, which
is trying to get at your data modeling and data analytical skills.
Besides your technical skills, we also index on your ability to take something
that is ambiguously framed and then to dive into a business or product
context. We are interested to understand how much information you can
give when thinking about the user.
3. Once you've finished the screen, you come on-site, usually speaking to
four or five different groups.
One is focussed on your product thinking skill set. We would present 1-2
interesting open-ended product ideas that we would ask you to talk through
and discuss how you would prioritize different decisions and trade-offs.
typically focuses on behavioral questions to see how you can talk across
your space to someone from a different background.
Get your basics right. A lot of people are so focused on the buzzwords in
machine learning and deep learning. Depending on where you're going, at
least with us, we don't build large-scale systems or focus on the problems
that Facebook or Google does. But for us, it's about understanding and
framing an open-ended business problem into an analyst's domain. Take a
business problem and understand how to use data to answer the question.
What are the key questions that you ask? Anything particularly unique?
We care about whether you are able to turn questions back to the customer and
keep the end user in mind. We aren't obsessed with technical nuances. What
does an MVP look like? What would it look like if you doubled the resources?
How do you measure your success? And do you take feedback well?
To be clear, they also did ask a solid number of product questions. But even
though I was interviewing for a machine learning role, there weren't a lot of
questions that were specific to machine learning. Much of the interview was
framed with "Tell me about a time that you…" and you would adjust your
answer accordingly.
For example: "Tell me about a time you disagreed with an engineer." In my
case, an example of a time that I disagreed with an engineer is when a
machine learning algorithm we were working on together provided bad
suggestions. Initially, we overreacted and realized that we did not create it in
the way it's typically done. So we backtracked, made amends, and had an
improved working relationship going forward.
For the interview it's important to recognize what applications and problems
are good candidates for using machine learning, and which are not. There
are many scenarios where you don't need to use machine learning. Further,
there are times where employing machine learning could actually be
counterproductive to the problem that you're trying to solve.
3) Some user behaviour is better off being deterministic. For example, user
does X, so Y happens. It's more predictable.
You are soon changing jobs to work at Kite. Why did you decide to work
there?
The second reason is that the team is very strong at machine learning. You
want to learn the best practices and learn from the best.
My advice is to be well prepared for the coding part of your interview. Your
methodology matters significantly and most of the point of the interview is
to show your work.
Also remember to be polite and kind to everyone, from the receptionist to the
last person you talk to.
● Phone screen: this was for company fit and asking about my recent
experience.
● Then we had a take-home challenge, which was a data set.
● Technical interview with a senior data scientist.
○ This is when we went through machine learning questions. It
was where the tough questions became clearer.
○ The important part was relating my answer to my previous
experience.
1. In-person is a lot more personal. It's easier for you to gauge what the
other person is like. And ultimately, despite the technical questions,
they want to get to know you as a person. In my opinion, you can
evaluate the company better in person. But if you are remote: make
sure to ask about the projects that the company has recently done.
2. Ask questions like these:
a. What would I work on if I were to come aboard?
b. How are ML projects managed? Are they planned out?
c. Are there agile practices in place?
d. What hours do people typically work?
e. How does the team keep up with current developments in the
field? For example, some companies I have worked at have had
a reading circle every Friday.
f. When working on a project as a data scientist, whom would I
report to?
I wanted to move! The company was based in Ireland and I wanted to be
there. Also, the company was a startup and I really wanted to work in a small
startup, building something from scratch. Further, the team looked really
great, and I had a chance to talk to everyone during a session where we all
got to meet each other.
The technical interview was very challenging, but the interviewer was very
pleasant, even in situations when we disagreed on something. So I knew that
I would be learning from someone who was incredibly knowledgeable.
Something that surprised me, in a good way, was that there were a lot of
values pieces at Integrate. Not a lot of companies do that. Making sure
there was no conflict between my values and the company's was really
important.
If you can somehow get more specific prep from the company, without
getting the actual questions, that can help you prepare.
In my case, if I were to do it again, I would use a script like, "Hey, I'm
typically a generalist so it's very helpful for me to know what specific
problems and ML areas I can focus on for the interview." I would also ask
about the kinds of numbers and problems they are looking at as a business.
Getting some insight there will help you understand how to best prepare.
One of the pitfalls in an interview like this is looking too narrowly at
something—missing out on the big picture, the main values, the main
elements that person is looking for. In my case, I was coming from more of a
theoretical background. I faced a bias going into machine learning: people
thought I was an academic, not as much of a coder. That's what you
might face if you have a strong theoretical or stats background.
Also, have evidence that you know how to code, taking a mathematical
problem and turning that into production-ready code. The more examples
you have of products and projects that you've worked on, the better chance
you will have.
The best advice I have for you on the behavioral side is simply to be yourself.
Because if you have multiple behavioral interviews, you can't fake it multiple
I chose this job because of the interview process. The people I got to meet
during it were amazing. They made me feel like a human throughout.
I also wanted to learn how machine learning could be used across a wide
array of businesses. At Integrate, we work across a wide variety of
industries. So… what better place to do that?
What surprised you and what did you find difficult about the interview?
Something else that surprised me throughout the interview was that they
were more impressed by the way I ran a pre-analysis after cleaning the data
than by my simply applying the machine learning algorithm. They cared
whether I could actually find the right inputs beforehand.
Get information and derive insights before jumping into the algorithm. After
data cleaning, you can play with the data to see what insights you can get
(e.g., a correlation chart). It's a simple analysis that you can do without any
kind of algorithm. You can also share that with customers to get them
excited fast.
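As a rough illustration of the kind of pre-analysis described above, the sketch below computes a small correlation table in plain Python before any model is involved. The column names and numbers are made up for the example, and `pearson` and `correlation_table` are hypothetical helper names; in practice you would more likely reach for a data library and a plotted chart.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_table(columns):
    """Return {(a, b): r} for every ordered pair of column names."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Hypothetical cleaned data set (invented values, for illustration only).
data = {
    "visits": [1, 2, 3, 4, 5],
    "spend": [2, 4, 6, 8, 10],
    "tickets": [5, 4, 3, 2, 1],
}

table = correlation_table(data)
for a in data:
    # One row per column, showing its correlation with every column.
    print(a.ljust(8), "  ".join(f"{table[(a, b)]:+.2f}" for b in data))
```

Even a table this small shows which inputs move together and which move in opposite directions, which is the sort of quick insight you can share before choosing an algorithm.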
It seemed really exciting! I was going to be the first data scientist, the one
who brings data-driven solutions to the company. It was a very nimble and
flexible organization with many opportunities to learn.
Also, the job seemed like a good one. It would get me ahead in the field.
Takeaways
The key takeaways for a machine learning interview are:
1. Don’t think questions about basic material won’t be covered. Read up on
technical fundamentals before you go through the interview process.
uniquely valuable for the company at hand. Having relevant projects and
being very clear about what you contributed to those works will mark you as
a candidate worthy of passing to the next round.
4. Be patient. An interview process can take a long time. You’ll want to be
prepared to wait.
Now that we’ve gone through the actual machine learning interview process,
let’s look at what happens after you’ve finished interviewing.
7 Things to Do After the Interview
After you’ve finished your machine learning interview, you might think your
work is finished. That’s not necessarily the case.
Here is a list of things you can do after the interview to ensure, as best as
possible, that you maximize your chances of making the best impression on
your potential employers.
How you follow up on an interview can make the difference between internal
advocates fighting to get you in and apathy. It is now customary to send a
thank-you note. With each office worker receiving an average of 121 emails a
day, however, you won’t want to just stick with a boilerplate “thank you for
the opportunity” email. Make sure you’re remembered. A nice email is the
bare minimum. Candidates who take the extra step of sending handwritten
notes or a list of thoughts after the interview will stand out from the rest of
the 120 emails.
After all, an interview isn’t just a test; it’s a discussion. If you listen carefully to
the questions presented and ask the right questions yourself, you will know
exactly what problems the company is facing. Why not send thoughts on the
solutions you’d pursue?
It can be difficult envisioning how your skills could apply to the office,
especially for somebody who has just met you. The sharpest hiring
organizations will often give you a sample problem to solve that is sourced
from real issues they are facing. This gives you the chance to demonstrate
how your efforts could impact the business in a positive manner.
Organizations that don’t do that will hesitate to hire the right candidate
because the candidate hasn’t sufficiently demonstrated how they’d drive impact for
the company in question. However, you can be proactive and use what you
learned in the interview to follow up. You don’t have to stop at sending them
thoughts that show you listened carefully; you can give them actual, tangible
solutions.
The author of this post on Forbes was told that she didn’t have enough of a
portfolio to get a job as a freelance copywriter. Having listened carefully
throughout the interview, the candidate knew that a major project (the
redesign of a website) was just over the horizon. Instead of accepting defeat,
she sent 10 proposed headlines for the website banner, free of charge. This
burst of initiative got her a job doing the rest of the writing for the website.
You need to have a portfolio that shows the impact you can make, but
sometimes that isn’t enough. If you’re astute and you ask the right questions,
you can find a business problem or ML opportunity for the company. There’s
always something—that’s why they’re hiring in the first place! There’s a
project out there that everybody would love to see done or a thorny problem
that no one can figure out. Send them a plan for what you’d do or play with
some of the data they’ve divulged, and then give some solid insights into
how you work. Initiative will go a long way to getting you an offer.
One of the most awkward parts of the post-interview process is waiting for a
response. You don’t want to come off as desperate by following up too
many times, but companies take their time if you don’t engage with them
proactively. It is possible to affect the post-interview decision from outside of
the company, but you should keep in mind the appropriate channel to reach
somebody. Make sure to ask before the interview ends how best to reach
your interviewer. Everybody has a preferred mode of communication; if they
specify short emails or a call, follow that rule and dispel some of the
post-interview awkwardness.
As a rule of thumb, don't check in more than once a week, or better yet,
once every 10 days. Sometimes companies just take their time.
5. Leverage connections
Ideally, you’ll have come in with strong references from both external and
internal sources. If you have been building your network and providing value
to them, you should have strong advocates who can support your
candidacy. Check in with people who referred you internally every once in
a while, and if needed, get them to mention how excited you would be to
work at the company and how lucky the company would be to hire you.
Hiring is often network-driven, and the best signal that you can send to a
potential employer is a vibrant network of people who are willing to go to bat
for you.
Look, there's a good chance you will get rejected. Sometimes you’re just not
right for the role, or they might have found somebody who is a slightly better
fit. It’s important at this point to maintain your composure, thank the
employer for their time, and move on.
People in the industry talk amongst themselves, and being unprofessional at
this point will only be bad karma and might get you ignored at other
companies. Being professional ensures the health of your network. More
importantly, a no isn’t always a no. Sometimes companies really do keep
your profile on file and will reach out in the future.
A quote often attributed to Winston Churchill puts it best: “Success is the
ability to go from one failure to another with no loss of enthusiasm.” J.K.
Rowling has shared her rejection letters from publishers. Brian Chesky, a
co-founder of Airbnb, published seven rejection letters from potential investors. In order to
achieve greatness, you will have to endure rejection. Everybody successful
already has.
7. Keep up hope
The interview process can be filled with great anxiety. Your future can be
mapped out by deciding what company you work for. An interview can mean
the beginning of a career change. It can mean moving cities. It is a period in
our lives where other people have disproportionate control over our
destinies. Nevertheless, as seen in the previous steps, you control a lot more
than you think. It’s important to keep your head up and do what you can.
The most important thing you can do during the interview process is keep up
hope. Interviews are lengthy. Companies take time to get back to you. There
are many internal checks and processes before a candidate is accepted. You
may go through multiple rounds of interviews with the same company and
not seem any closer to a final offer. You have to set expectations. You
should never be disheartened during your journey.
How to Handle Offers
Your goal is to get as many offers as possible that you can evaluate and
potentially negotiate. While the process itself is difficult, and may take longer
than you expect, once you start getting offers, you’ll have earned them. We
can’t emphasize enough how important it is to manage your expectations
and keep your hopes up.
For many candidates in the AI and machine learning space, it can take
months, even half a year, to find the right role, particularly if they are
coming from academia. Make sure you weigh what is presented to you and
choose the future you deserve once you’ve put in all the hard work earning it.
What to Assess
If you complete the interview process successfully, you may have multiple
offers. Congratulations! Accepting an offer is a commitment of a significant
amount of your time to the company in question. Always keep that in mind.
There are several factors you can use to ascertain whether or not an offer is
the right one for you.
Company Culture
This might be one of the most important factors in determining whether an
offer is one you should accept. Make sure you ask about the company
culture. Look for signs that the company employs individuals who genuinely
enjoy spending time with one another; run away from generic descriptors or
companies that struggle to define their culture or even wave the question
away.
Great companies invest tons of time and effort into making sure they hire
awesome people who love what they do. That’ll come through in your
questioning. You should also check external and objective sources, such as
company reviews on Glassdoor. Approach current employees as well as
former ones (whom you can find on LinkedIn) to get their side of the story.
You’ll often find candid tales that can give you a good preview of what
working at your new job would be like.
Team
Company culture is an extension of the team that inhabits it, but you should
be excited about coming to the office every day and working with everybody
else. Make sure that you’re working with a team that you can learn from. You
are the average of the five people you spend the most time with, and you’re
going to be spending a lot of time with your colleagues.
Location
Make sure you’re comfortable with the company’s location, especially if
you’re moving a significant distance to take the role. Moving is a difficult
process, so it’s important that you feel at ease with where you live. Factors
like the weather and the transit system matter to a certain degree, especially
if you’re planning to live under those conditions for several years.
Negotiating Your Salary
An astonishing 61% of people didn't negotiate their salary in their last job
offer, despite the fact that those who do typically see their salary raised by
13.3%, according to CNBC and Glassdoor.
When you first get your offer, you’re at a unique leverage point that you
might not see again for several years. This is the time to test what you’re
worth. Reach out with a counteroffer—a company won’t fire you or cancel a
contract offer because you were asserting your worth. Initial offers are sent
with a buffer for slight negotiation. Take advantage.
you being frank and positive at what is often the most difficult part of
the recruitment process for them. Before you accept the offer, make
sure you know how committed you are to the company, team, and
money.
5. Use your other offers as leverage. It's always best to go into a salary
negotiation with at least one other competitive offer. Even if you know
the job you want, it's important to be able to negotiate to get what you
want in terms of compensation.
Salary data often changes, but here are some facts and figures that can start
your research.
Check out Glassdoor for more valuable salary data. And this is a good
resource for broader data on major tech companies.
Be aware that good companies will work to make you as comfortable as
possible. You should reach out to future teammates and figure out what
they do and how you can help them solve their business problems. Take the
time to socialize and meet as many people as you can.
More importantly, if you have time between when you accepted the offer and
when you start, relax and enjoy! Make sure you catch up with as many
people as you can in your life, take the chance to rest, and be completely
refreshed for your first day at your new company.
Conclusion
The AI and machine learning interview process is one of the hardest
recruitment processes to crack, and it’s one of the most competitive. Your
fellow interviewees will likely be experienced engineers, Ph.D.s, researchers,
or product managers. And some of them will have extensive experience in
the field.
While the space is attracting many talented people, remember that it has a
slew of different related industries, teams, and roles. If you think outside of
the box and apply a few battle-tested tactics, you’ll be able to get an
interview and take it all the way to an offer you love.
Split the process into its composite steps, and remember what it takes to
succeed. Don’t hunt for jobs like everybody else by limiting your search to
the standard job boards and sending out typical cover letters. Reach out to
people within organizations you admire for informational interviews. Do
something different from the hundreds of other candidates. Stand out as a
great technical thinker and, above all else, a proficient communicator (not to
mention… a great person).
Use this and other guides to thoroughly understand the technical and
nontechnical parts of the interview. Once you’ve mastered the thinking
behind the questions and what hiring managers are looking for, you’ll have a
good sense of how to excel throughout the process.
Final Checklist
Here's a final cheat sheet for you as you prepare for your interview.
Remember the following:
8) Follow up respectfully
9) Negotiate and accept
Special Thanks
Thanks to the Springboard team and authors of “The Ultimate Guide to Data
Science Interviews,” which provided great structure and content for this
guide:
T.J. DeGroat
Roger Huang
Sri Kanajan
Special thanks to two of my best friends and machine learning geniuses for
reading this and providing edits, suggestions, and laughs:
Patrick Lung
Logan Graham
About the Author
Jaxson Khan is CEO at Khan & Associates, a global advisory firm that helps
innovative companies and organizations with strategy, communications, and
growth. He previously served as head of marketing at Nudge.ai, the
Canadian AI Company of the Year in 2018. Jaxson is a host of the Ask AI
podcast, a mentor with Techstars, an instructor at Product Faculty, an
advisor to Century Initiative, and a member of the World Economic Forum