Data Science & ML - A Complete Interview Guide - Dimensionless PDF
Introduction
The constant evolution of technology has meant data and information
is being generated at a rate unlike ever before, and it’s only on the
rise. Furthermore, the demand for people skilled in analyzing,
interpreting and using this data is already high and is set to grow
exponentially over the coming years. These new roles cover all aspects,
https://dimensionless.in/data-science-complete-interview-preparation-guide/ 1/18
5/18/2019 Data Science & ML : A Complete Interview Guide | Dimensionless
from strategy and operations to governance. Hence, the current and
future demand will require more data scientists, data engineers, data
strategists, and Chief Data Officers.
Statistics
1. Name and explain a few methods/techniques used in statistics for
analyzing data.
Answer:
Arithmetic Mean:
The arithmetic mean, also called the average, is an important technique
in statistics. It is the quantity obtained by summing two or more
numbers/variables and then dividing the sum by the number of
numbers/variables.
Median:
The median is another way of finding the average of a group of data
points. It is the middle number of a set of numbers. There are two
possibilities: the group can contain an odd number of data points or
an even number.
If the group is odd, arrange the numbers in the group from smallest to
largest; the median is the middle number. If the group is even, the
median is the mean of the two middle numbers.
Mode:
The mode is another type of average. It is the number that occurs most
frequently in a group of numbers. Some series have no mode; a series
with two modes is called bimodal.
The three most common ‘averages’ in statistics are the mean, the median
and the mode.
Standard Deviation (Sigma):
Standard deviation is a measure of how spread out your data is around
its mean.
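The measures above can be tried out with a short sketch. It uses Python's standard statistics module; the sample values are hypothetical and chosen only so the results are easy to check by hand.

```python
import statistics

# Hypothetical sample of seven values (illustrative only).
values = [4, 8, 6, 5, 3, 8, 8]

mean = statistics.mean(values)        # sum of the values divided by their count
median = statistics.median(values)    # middle value once the data is sorted
modes = statistics.multimode(values)  # every most-frequent value, as a list
spread = statistics.pstdev(values)    # population standard deviation

# An even-sized group has no single middle value, so the two middle
# values are averaged.
even_median = statistics.median([1, 2, 3, 4])

# A series with two equally frequent values is bimodal.
bimodal = statistics.multimode([1, 1, 2, 2, 3])

print(mean, median, modes, spread, even_median, bimodal)
```

Note that pstdev treats the data as the whole population; statistics.stdev would be the choice if the values were a sample drawn from a larger population.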
Regression:
Regression is a form of analysis in statistical modelling. It is a
statistical process for measuring the relationships among variables; it
determines the strength of the relationship between one dependent
variable and a series of other changing independent variables.
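As a minimal sketch of the idea, the simplest case is a straight line fitted by least squares with one independent variable. The function name fit_line and the data (hours studied against marks) are hypothetical; Pearson's r is returned as one common way to express the strength of a linear relationship.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x, plus Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # strength of the linear relationship
    return slope, intercept, r

# Hypothetical data: hours studied (independent) and marks (dependent).
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 68]

slope, intercept, r = fit_line(hours, marks)
print(slope, intercept, r)
```

A value of r close to 1 or -1 indicates a strong linear relationship; a value close to 0 indicates a weak one.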
3. List the other models that work with statistics to analyze data.
Answer:
Statistics, along with Data Analytics, analyzes the data and helps
similar between these two in mathematical terms, they are different
from each other.
Programming
R Interview Questions
1. Explain what R is.
R is data analysis software used by analysts, quants,
statisticians, data scientists, and others.
2. List some of the functions that R provides.
The functions that R provides include:
Median
Distribution
Covariance
Regression
Non-linear
Mixed Effects
GLM
GAM, etc.
Commander GUI.
4. How can you import data in R?
You can use R Commander to import data in R. There are three ways
to enter data into it:
You can enter data directly via Data > New Data Set
Import data from a plain text (ASCII) file or other files (SPSS,
Minitab, etc.)
Read a data set either by typing the name of the data set or
selecting the data set in the dialogue box
# subtraction
# division
# note order of operations exists
Go to Data > Active Data Set > Export Active Data Set and a dialogue
box will appear. When you click OK, the dialogue box lets you save your
data in the usual way.
11. What are the data structures in R that are used to perform
statistical analyses and create graphs?
R has data structures like
Vectors
Matrices
Arrays
Data frames
To transpose a matrix or a data frame, the t() function is used.
15. Explain how data is aggregated in R.
Data is aggregated in R by collapsing it using one or more BY
variables. When using the aggregate() function, the BY variables should
be supplied in a list.
Machine Learning
• Speech recognition
• Regression
• Predict time series
• Annotate strings
Answer:
This is a popular machine learning interview question. Overfitting
occurs when a statistical model describes random error or noise instead
of the underlying relationship, which tends to happen when a model is
excessively complex.
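The effect can be seen in a small, deterministic sketch. The data and both models are hypothetical: the training targets follow y = x with noise of +1 or -1 added, the test targets are noise-free, and the "memorizer" is a nearest-neighbour lookup standing in for an excessively complex model.

```python
def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical training data: y = x plus alternating noise of +1 / -1.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1, 0, 3, 2, 5, 4]
# Hypothetical test data: the noise-free relationship y = x.
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [0.4, 1.4, 2.4, 3.4, 4.4]

slope, intercept = fit_line(train_x, train_y)

def simple_model(x):
    # Two parameters only: it cannot reproduce the noise.
    return intercept + slope * x

def memorizer(x):
    # Returns the target of the closest training point, noise included.
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

print("train:", mse(simple_model, train_x, train_y), mse(memorizer, train_x, train_y))
print("test: ", mse(simple_model, test_x, test_y), mse(memorizer, test_x, test_y))
```

The memorizer reproduces the training set exactly, yet does worse than the straight line on the unseen points, because what it learned was the noise.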
13. What are the five popular algorithms for Machine Learning?
Answer:
Below is the list of five popular algorithms of Machine Learning:
• Decision Trees
• Probabilistic networks
• Nearest Neighbor
• Support vector machines
• Neural Networks
14. What are the different use cases where machine learning algorithms
can be used?
Answer:
The different use cases where machine learning algorithms can be
used are as follows:
• Fraud Detection
• Face detection
• Natural language processing
• Market Segmentation
• Text Categorization
• Bioinformatics
Answer:
Parametric models are those with a finite number of parameters. To
predict new data, you only need to know the parameters of the model.
Non-parametric models are those with an unbounded number of
parameters, which allows for more flexibility. To predict new data, you
need to know the parameters of the model and the state of the data
that has been observed.
16. What are the three stages to build the hypotheses or model in
machine learning?
Answer:
This is a frequently asked machine learning interview question. The
three stages to build the hypotheses or model in machine learning are:
1. Model building
2. Model testing
3. Applying the model
Answer:
The difference between inductive machine learning and deductive
machine learning is as follows:
Inductive machine learning is where the model learns by examples from
a set of observed instances to draw a generalized conclusion, whereas
in deductive learning the model starts from rules or premises that are
already established and applies them to derive the conclusion.
20. What are the advantages of decision trees?
Answer:
The advantages of decision trees are:
Deep Learning
2. Why are deep networks better than shallow ones?
Answer:
Studies indicate that both shallow and deep networks can fit any
function. However, because deep networks have several hidden layers,
often of different types, they are able to build or extract better
features than shallow models, using fewer parameters.
3. What is a cost function?
Answer:
A cost function is a measure of the accuracy of the neural network
with respect to a given training sample and the expected output. It is
a single value, not a vector, as it gives the performance of the neural
network as a whole. One example is the Mean Squared Error function:
\[ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{Y}_i - Y_i)^2 \]
where \hat{Y}_i is the predicted value and Y_i is the desired value;
the difference between them is what we want to minimize.
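A minimal sketch of the Mean Squared Error calculation follows. The function name and the sample values are hypothetical; the point is only that the result is a single number summarizing the whole set of predictions.

```python
def mean_squared_error(predicted, desired):
    """Average of the squared differences between predictions and targets."""
    if len(predicted) != len(desired):
        raise ValueError("predicted and desired must have the same length")
    n = len(predicted)
    return sum((p - d) ** 2 for p, d in zip(predicted, desired)) / n

# Hypothetical network outputs and the values we wanted.
y_hat = [2.5, 0.0, 2.0, 8.0]
y = [3.0, -0.5, 2.0, 7.0]

cost = mean_squared_error(y_hat, y)
print(cost)
```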
5. What is backpropagation?
Answer:
Backpropagation is a training algorithm used for multilayer neural
networks. In this method, we move the error from the output end of the
network to all the weights inside the network, thus allowing efficient
computation of the gradient. It consists of several steps, as follows:
3. Then we backpropagate to compute the derivative of the error with
respect to the output activation of the previous layer, and continue
this for all the hidden layers.
4. Using the previously calculated derivatives for the output and all
hidden layers, we calculate the error derivatives with respect to the
weights.
5. And then we update the weights.
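The steps can be sketched on the smallest possible network. Everything here is an assumption made for illustration: one input, one sigmoid hidden unit, one linear output, a squared-error loss, and arbitrary starting weights. The finite-difference check at the end is only there to confirm that the backpropagated derivatives are right.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Forward pass: hidden activation h, then the linear output y_hat."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y_hat = w2 * h + b2
    return h, y_hat

def loss(params, x, y):
    _, y_hat = forward(params, x)
    return (y_hat - y) ** 2

def backprop(params, x, y):
    """Derivatives of the loss with respect to w1, b1, w2, b2."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(params, x)
    d_y_hat = 2.0 * (y_hat - y)   # error derivative at the output
    d_w2 = d_y_hat * h            # output-layer weight
    d_b2 = d_y_hat                # output-layer bias
    d_h = d_y_hat * w2            # pushed back to the hidden activation
    d_z = d_h * h * (1.0 - h)     # through the sigmoid
    d_w1 = d_z * x                # hidden-layer weight
    d_b1 = d_z                    # hidden-layer bias
    return [d_w1, d_b1, d_w2, d_b2]

def numeric_grad(params, x, y, eps=1e-6):
    """Central finite differences, used only to check backprop."""
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, x, y) - loss(down, x, y)) / (2 * eps))
    return grads

params = [0.5, -0.3, 0.8, 0.1]  # hypothetical starting weights
x, y = 1.5, 1.0                 # one hypothetical training example

grads = backprop(params, x, y)
learning_rate = 0.1
updated = [p - learning_rate * g for p, g in zip(params, grads)]

print(loss(params, x, y), loss(updated, x, y))
```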
Answer:
Stochastic Gradient Descent: Here we use only a single training example
to calculate the gradient and update the parameters.
Batch Gradient Descent: Here we calculate the gradient for the whole
dataset and perform one update per iteration.
Mini-batch Gradient Descent: This is one of the most popular
optimization algorithms. It is a variant of Stochastic Gradient Descent
in which a mini-batch of samples is used instead of a single training
example.
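One way to see how the three variants relate is that they differ only in how many examples feed each update. The sketch below assumes a one-parameter model y = w * x with a mean squared error loss; the function name, data, learning rate and epoch count are all hypothetical.

```python
import random

def gradient_descent(xs, ys, batch_size, learning_rate=0.01, epochs=200, seed=0):
    """Fit y = w * x by minimising mean squared error.

    batch_size = 1        -> stochastic gradient descent
    batch_size = len(xs)  -> batch gradient descent
    anything in between   -> mini-batch gradient descent
    """
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the squared error, averaged over the batch.
            grad = sum(2 * xs[i] * (w * xs[i] - ys[i]) for i in batch) / len(batch)
            w -= learning_rate * grad
    return w

# Hypothetical noise-free data generated from y = 3 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w_sgd = gradient_descent(xs, ys, batch_size=1)
w_batch = gradient_descent(xs, ys, batch_size=len(xs))
w_mini = gradient_descent(xs, ys, batch_size=2)

print(w_sgd, w_batch, w_mini)
```

On this toy problem all three settle on the same weight. On real, noisy data the single-example updates are cheaper but noisier, which is the trade-off the mini-batch variant is meant to balance.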
Answer:
Yes, this can be done, considering that the layer 4 output is from the
previous time step, as in an RNN. Also, we need to assume that the
previous input batch is sometimes correlated with the current batch.
Furthermore, without it the neural network would only be able to
learn a linear function, that is, a linear combination of its input
data.
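The question this answer belongs to is missing from the text. Assuming it refers to the non-linear activation function, the claim can be checked with a small sketch: two layers with no activation between them collapse into a single linear layer, while inserting a non-linearity (ReLU here, chosen only as an example) breaks that equivalence. The weight matrices and the input are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def relu(vector):
    return [max(0.0, v) for v in vector]

# Hypothetical weights for two layers, and one input vector.
W1 = [[1.0, -2.0], [0.5, 1.0]]
W2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [1.0, 2.0]

# Two layers with no activation in between...
stacked_linear = matvec(W2, matvec(W1, x))
# ...give the same result as one layer whose weights are W2 * W1.
single_linear = matvec(matmul(W2, W1), x)
# With a non-linearity between the layers the result is different.
with_relu = matvec(W2, relu(matvec(W1, x)))

print(stacked_linear, single_linear, with_relu)
```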
Problem Solving
Conclusion
It is the perfect time to move ahead of the curve and position yourself
with the skills needed to fill these emerging gaps in data science and
analysis. Most importantly, this is not only for people who are at the
very beginning of their careers and are deciding what to study.
Professionals already in the workforce can also benefit from this
data science trend, perhaps even more than their fresh counterparts.