Top 40 Machine Learning Questions & Answers

Q1) A feature F1 can take a certain value: A, B, C, D, E, or F, which represents the grades of
students from a college.

Which of the following statements is true in this case?

A) Feature F1 is an example of a nominal variable.

B) Feature F1 is an example of an ordinal variable.

C) It doesn’t belong to any of the above categories.

D) Both of these

Solution: (B)

Ordinal variables are variables whose categories have a natural order. For example, grade A should be considered a higher grade than grade B.

Q2) Which of the following is an example of a deterministic algorithm?

A) PCA

B) K-Means clustering

C) KNN

D) None of the above

Solution: (A)

A deterministic algorithm is one whose output does not change across runs on the same input. PCA gives the same result every time it is run, whereas k-means clustering depends on its random centroid initialization and can give different results.

Q3) [True or False] A Pearson correlation between two variables is zero; still, their values can be
related to each other.
A) TRUE

B) FALSE

Solution: (A)

Consider Y = X^2, with X distributed symmetrically around zero. Note that the variables are not only associated, but one is a function of the other, and yet the Pearson correlation between them is 0.
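The claim can be checked numerically. The sketch below is only an illustration: the helper `pearson` and the sample values are assumptions chosen so that X is symmetric around zero, not data from the question.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data: X is symmetric around zero, Y is fully determined by X.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson(xs, ys)
print(r)  # the positive and negative halves of X cancel in the covariance
```

Pearson correlation only measures linear association, which is why a perfect quadratic relationship can still score zero.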

Q4) Which of the following statement(s) is / are true for Gradient Descent (GD) and Stochastic
Gradient Descent (SGD)?

1. In GD and SGD, you update a set of parameters in an iterative

manner to minimize the error function.

2. In SGD, you must run through all the samples in your training set

for a single parameter update in each iteration.

3. In GD, you either use the entire data points or a subset of training

data to update a parameter in each iteration.

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 2 and 3

F) 1, 2 and 3

Solution: (A)
In SGD, for each iteration, you choose the batch, which generally contains a random sample of

data. But in the case of GD, each iteration contains all of the training observations.
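The difference between the two update rules can be sketched as follows. This is a minimal illustration, assuming a one-parameter model y_hat = w * x with squared error; the function names, learning rate, and toy data are all hypothetical.

```python
import random

def gradient(w, batch):
    """Gradient of the mean squared error for the model y_hat = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def train(data, batch_size, steps=200, lr=0.01, seed=0):
    """Iteratively update w; the batch size decides between GD and SGD."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        if batch_size >= len(data):
            batch = data  # GD: every update uses all training observations
        else:
            batch = rng.sample(data, batch_size)  # SGD: a random subset
        w -= lr * gradient(w, batch)
    return w

# Toy data generated from y = 3x, so the ideal parameter is w = 3.
data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]

w_gd = train(data, batch_size=len(data))
w_sgd = train(data, batch_size=1)
print(w_gd, w_sgd)
```

Both variants move toward the same parameter; SGD simply takes cheaper, noisier steps.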

Q5) Which of the following hyperparameter(s), when increased, may cause the random forest to overfit the data?

1. Number of Trees

2. Depth of Tree

3. Learning Rate

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 2 and 3

F) 1, 2 and 3

Solution: (B)

Usually, increasing the depth of the trees will cause overfitting. The learning rate is not a hyperparameter of a random forest. Increasing the number of trees generally does not cause overfitting; averaging over more trees tends to reduce variance.

Q6) Suppose you want to develop a machine learning algorithm that predicts the number of
views on the articles in a blog.

Your data analysis is based on features like author name, number of articles written by the

same author, etc. Which of the following evaluation metrics would you choose in that case?

1. Mean Square Error

2. Accuracy

3. F1 Score

A) Only 1

B) Only 2

C) Only 3

D) 1 and 3

E) 2 and 3

F) 1 and 2

Solution: (A)

The number of views of an article is a continuous target variable, so this is a regression problem. Mean squared error is therefore an appropriate evaluation metric.

Q7) Given below are three images (1, 2, 3). Which of the following options is correct for these images?

A) 1 is tanh, 2 is ReLU, and 3 is SIGMOID activation functions.

B) 1 is SIGMOID, 2 is ReLU, and 3 is tanh activation functions.

C) 1 is ReLU, 2 is tanh, and 3 is SIGMOID activation functions.

D) 1 is tanh, 2 is SIGMOID, and 3 is ReLU activation functions.

Solution: (D)

The range of the sigmoid function is (0, 1).

The range of the tanh function is (-1, 1).

The range of the ReLU function is [0, infinity).

So option D is the right answer.

Q8) Below are the 8 actual target variable values in the training file.

[0,0,0,1,1,1,1,1]

What is the entropy of the target variable?

A) -(5/8 log(5/8) + 3/8 log(3/8))

B) 5/8 log(5/8) + 3/8 log(3/8)

C) 3/8 log(5/8) + 5/8 log(3/8)

D) 5/8 log(3/8) – 3/8 log(5/8)

Solution: (A)

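The formula for entropy is $H = -\sum_i p_i \log p_i$, where $p_i$ is the proportion of observations in class $i$. Here the proportions are 5/8 for class 1 and 3/8 for class 0, so the answer is A.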
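As a quick numerical check, the entropy of the target can be computed directly. The sketch below assumes base-2 logarithms (the question does not state a base), and the function name `entropy` is only illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

target = [0, 0, 0, 1, 1, 1, 1, 1]
h = entropy(target)

# Option A written out with the class proportions 5/8 and 3/8.
option_a = -(5 / 8 * math.log2(5 / 8) + 3 / 8 * math.log2(3 / 8))
print(h, option_a)
```

A different logarithm base would only rescale the value; it would not change which option matches.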

Q9) Let’s say you are working with categorical feature(s), and you have not looked at the
distribution of the categorical variable in the test data.

You want to apply one hot encoding (OHE) on the categorical feature(s). What challenges

may you face if you have applied OHE on a categorical variable of the training dataset?

A) All categories of the categorical variables are not present in the test dataset.

B) The frequency distribution of categories is different in the train compared to the test dataset.

C) Train and Test always have the same distribution.

D) Both A and B

E) None of these

Solution: (D)

Both are true. OHE will fail to encode categories that are present in the test set but not in the train set, so this could be one of the main challenges while applying OHE. The challenge given in option B is also true: you need to be more careful while applying OHE if the frequency distribution isn't the same in the train and the test sets.
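The unseen-category problem can be sketched in a few lines. This is a hand-rolled illustration, not any particular library's encoder; the helper names and the policy of mapping an unseen category to an all-zero row are assumptions (some encoders raise an error instead).

```python
def fit_categories(values):
    """Learn the category vocabulary from the training column only."""
    return sorted(set(values))

def one_hot(value, categories):
    """Encode one value; a category unseen in training becomes all zeros."""
    return [1 if value == c else 0 for c in categories]

# Hypothetical columns: "purple" never appears in the training data.
train = ["red", "green", "red", "blue"]
test = ["green", "purple"]

categories = fit_categories(train)
encoded_test = [one_hot(v, categories) for v in test]
unseen = sorted(set(test) - set(categories))

print(categories)    # vocabulary learned from train
print(encoded_test)  # the row for "purple" carries no information
print(unseen)
```

Checking for unseen categories before encoding, as the last line does, is one simple way to catch the issue early.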

Q10) Skip-gram model is one of the best models used in the Word2vec algorithm for embedding
words.

Which one of the following models depicts the skip-gram model?

A) A

B) B

C) Both A and B

D) None of these

Solution: (B)

Both models (model1 and model2) are used in the Word2vec algorithm. Model1 represents a

CBOW model, whereas Model2 represents the Skip-gram model.

Q11) Let’s say you are using activation function X in hidden layers of a neural network.
At a particular neuron for any given input, you get the output as “-0.0001”. Which of the

following activation function could X represent?

A) ReLU

B) tanh

C) SIGMOID

D) None of these

Solution: (B)

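The function must be tanh, because its output range is (-1, 1), so it can produce small negative values. ReLU never outputs a negative value, and the sigmoid output lies in (0, 1).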
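A short check of which activation can output a negative number is sketched below. The `relu` and `sigmoid` helpers are written out by hand for illustration, and the probe inputs are arbitrary.

```python
import math

def relu(z):
    """Rectified linear unit: zero for negative inputs."""
    return max(0.0, z)

def sigmoid(z):
    """Logistic sigmoid: always strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

activations = {"relu": relu, "tanh": math.tanh, "sigmoid": sigmoid}
inputs = [-5.0, -0.0001, 0.0, 0.0001, 5.0]

# For each activation, does any probe input give a negative output?
can_be_negative = {
    name: any(fn(z) < 0 for z in inputs) for name, fn in activations.items()
}
print(can_be_negative)
print(math.tanh(-0.0001))  # close to the -0.0001 seen in the question
```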

Q12) [True or False] LogLoss evaluation metric can have negative values.

A) TRUE

B) FALSE

Solution: (B)

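Log loss cannot have negative values. Each term is the negative logarithm of a probability between 0 and 1, which is always greater than or equal to 0.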
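A minimal sketch of binary log loss makes the sign argument concrete. The function below is a hand-written illustration with a hypothetical clipping constant `eps`, not a reference implementation, and the example predictions are made up.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

y_true = [1, 0, 1]
good = log_loss(y_true, [0.9, 0.1, 0.8])  # confident and correct
bad = log_loss(y_true, [0.1, 0.9, 0.2])   # confident and wrong
print(good, bad)
```

Because log(p) is never positive for p in (0, 1], every term is non-negative, and confident mistakes are penalized most heavily.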

Q13) Which of the following statements is/are true about “Type I” and “Type II” errors?

1. Type I error is known as a false positive, and Type II error is

known as a false negative.

2. Type I error is known as a false negative, and Type II error is

known as a false positive.

3. Type I error occurs when we reject a null hypothesis when it is

actually true.

A) Only 1

B) Only 2

C) Only 3

D) 1 and 2

E) 1 and 3

F) 2 and 3

Solution: (E)

In statistical hypothesis testing, a type I error is the incorrect rejection of a true null hypothesis (a

“false positive”), while a type II error is incorrectly retaining a false null hypothesis (a “false

negative”).

Q14) Which of the following is/are important step(s) for pre-processing the text in NLP-based projects?

1. Stemming

2. Stop word removal

3. Object Standardization

A) 1 and 2

B) 1 and 3

C) 2 and 3

D) 1, 2 and 3

Solution: (D)

Stemming is a rudimentary rule-based process of stripping the suffixes (“ing”, “ly”, “es”, “s”,

etc.) from a word.

Stop words are words that are not relevant to the context of the data, for example, is/am/are.

Object Standardization is also a good way to pre-process the text.

Q15) Suppose you want to project high-dimensional data into lower dimensions.

The two most famous dimensionality reduction algorithms used here are PCA and t-SNE.

Let’s say you have applied both algorithms respectively on data “X”, and you got the

datasets “X_projected_PCA” , “X_projected_tSNE”.

Which of the following statements is true for “X_projected_PCA” & “X_projected_tSNE”?

A) X_projected_PCA will have interpretation in the nearest neighbor space.

B) X_projected_tSNE will have interpretation in the nearest neighbor space.

C) Both will have interpretation in the nearest neighbor space.

D) None of them will have interpretation in the nearest neighbor space.

Solution: (B)

The t-SNE algorithm considers nearest neighbor points when reducing the dimensionality of the data. So, after using t-SNE, the reduced dimensions can also be interpreted in the nearest neighbor space. This is not the case for PCA.

Context for Questions 16-17

Given below are three scatter plots for two features (Images 1, 2 & 3, from left to right).

Q16) In the above images, which of the following is/are examples of multi-collinear features?

A) Features in Image 1

B) Features in Image 2

C) Features in Image 3

D) Features in Images 1 & 2

E) Features in Images 2 & 3

F) Features in Images 3 & 1

Solution: (D)

In Image 1, features have a high positive correlation, whereas, in Image 2, there is a high

negative correlation between the features. So in both images, the pair of features is an example of

multicollinear features.

Q17) In the previous question, suppose you have identified multi-collinear features. Which of the
following action(s) would you perform next?

1. Remove both collinear variables.

2. Instead of removing both variables, we can remove only one

variable.

3. Removing correlated variables might lead to a loss of information.

In order to retain those variables, we can use penalized regression

models like ridge or lasso regression.

A) Only 1

B) Only 2

C) Only 3

D) Either 1 or 3

E) Either 2 or 3

Solution: (E)

You cannot remove both features because after removing them, you will lose all of the information they carry. So you should either remove only one of the features or use a regularized model such as lasso (L1) or ridge (L2) regression.

Q18) Adding a non-important feature to a linear regression model may result in.

1. Increase in R-square

2. Decrease in R-square

A) Only 1 is correct

B) Only 2 is correct
C) Either 1 or 2

D) None of these

Solution: (A)

After adding a feature to the feature space, whether that feature is important or unimportant, the R-squared never decreases; in practice, it almost always increases, at least slightly.

Q19) Suppose you are given three variables X, Y, and Z. The Pearson correlation coefficients for
(X, Y), (Y, Z), and (X, Z) are C1, C2 & C3, respectively.

Now, you add 2 to all values of X (i.e., new values become X+2), subtract 2 from all values of Y (i.e., new values are Y-2), and leave Z unchanged. The new coefficients for (X,Y), (Y,Z), and (X,Z) are given by D1, D2 & D3, respectively. How do the values of D1, D2 & D3 relate to C1, C2 & C3?

A) D1= C1, D2 < C2, D3 > C3

B) D1 = C1, D2 > C2, D3 > C3

C) D1 = C1, D2 > C2, D3 < C3

D) D1 = C1, D2 < C2, D3 < C3

E) D1 = C1, D2 = C2, D3 = C3

F) Cannot be determined

Solution: (E)

The correlation between features does not change when you add a constant to, or subtract a constant from, all values of a feature.
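The shift invariance can be verified on a small example. The values of X, Y, and Z below are made up for illustration, and `pearson` is a hand-written helper rather than a library call.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample values for the three variables.
xs = [1, 4, 2, 8, 5]
ys = [2, 3, 7, 1, 9]
zs = [5, 1, 4, 2, 8]

c1, c2, c3 = pearson(xs, ys), pearson(ys, zs), pearson(xs, zs)

xs_shifted = [x + 2 for x in xs]  # X + 2
ys_shifted = [y - 2 for y in ys]  # Y - 2

d1 = pearson(xs_shifted, ys_shifted)
d2 = pearson(ys_shifted, zs)
d3 = pearson(xs_shifted, zs)
print((c1, c2, c3), (d1, d2, d3))
```

The shift cancels out because the formula subtracts each variable's mean before multiplying.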

Q20) Imagine you are solving a classification problem with a highly imbalanced class.
The majority class is observed 99% of the time in the training data. Your model has 99%

accuracy after taking the predictions on the test set. Which of the following is true in such a

case?

1. The accuracy metric is not a good idea for imbalanced class

problems.

2. The accuracy metric is a good idea for imbalanced class problems.

3. Precision and recall metrics are good for imbalanced class

problems.

4. Precision and recall metrics aren’t good for imbalanced class

problems.

A) 1 and 3

B) 1 and 4

C) 2 and 3

D) 2 and 4

Solution: (A)

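A model that always predicts the majority class would also reach 99% accuracy here without ever identifying the minority class, so accuracy alone is misleading. Precision and recall focus on how well the minority (positive) class is detected, which makes them better suited to imbalanced problems.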
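To see why accuracy can mislead on imbalanced data, consider a classifier that always predicts the majority class. The labels below are a made-up example with a 99:1 split, matching the proportions in the question.

```python
# Hypothetical test set: 1 positive (minority) and 99 negatives (majority).
y_true = [1] * 1 + [0] * 99
# A trivial "model" that always predicts the majority class.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
recall = true_pos / (true_pos + false_neg)

print(accuracy)  # high, even though the model learned nothing useful
print(recall)    # no minority-class example is ever found
```

Precision is left out of the sketch because it is undefined when the model predicts no positives at all.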

Q21) In ensemble learning (i.e., bagging or boosting), you aggregate the predictions for weak
learners so that an ensemble of these models will give a better prediction than the prediction of
individual machine learning models.

Which of the following statements is / are true for weak learners used in the ensemble

model?
1. They don’t usually overfit.

2. They have high bias, so they cannot solve complex learning

problems

3. They usually overfit.

A) 1 and 2

B) 1 and 3

C) 2 and 3

D) Only 1

E) Only 2

F) None of the above

Solution: (A)

Weak learners are sure about a particular part of a problem. So, they usually don’t overfit, which

means that weak learners have low variance and high bias.

Q22) Which of the following options is/are true for K-fold cross-validation?

1. An increase in K will result in a higher time required to cross-validate the result.

2. Higher values of K will result in higher confidence in the cross-validation result as compared to a lower value of K.

3. If K=N, then it is called leave-one-out cross-validation, where N is the number of observations.

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1,2 and 3

Solution: (D)

A larger k-value means less bias towards overestimating the truly expected error (as training

folds will be closer to the total dataset) and higher running time (as you are getting closer to the

limit case: Leave-One-Out CV). We also need to consider the variance between the k folds

accuracy while selecting the k.

Context for Questions 23-24

Cross-validation is an important step in machine learning for hyper parameter tuning. Let’s say

you are tuning a hyper-parameter “max_depth” for GBM by selecting it from 10 different depth

values (values are greater than 2) for a tree-based model using 5-fold cross-validation.

Time taken by an algorithm for training (on a model with max_depth 2) 4-fold is 10 seconds, and

for the prediction on remaining 1-fold is 2 seconds.

Note: Ignore hardware dependencies from the equation.

Q23) Which of the following option is true for overall execution time for 5-fold cross-validation
with 10 different values of “max_depth”?

A) Less than 100 seconds

B) 100 – 300 seconds

C) 300 – 600 seconds

D) More than or equal to 600 seconds

E) None of the above

F) Can’t estimate

Solution: (D)

Each iteration for depth “2” in 5-fold cross-validation takes 10 seconds for training and 2 seconds for testing, so 5 folds take 12*5 = 60 seconds. Since we are searching over 10 depth values, the algorithm would take 60*10 = 600 seconds. But training and testing a model with a depth greater than 2 takes more time than with depth “2”, so the overall time would be greater than 600 seconds.
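The arithmetic can be written as a small helper. The function name and parameter names are hypothetical; the result is a lower bound because the 10-second and 2-second timings were measured at depth 2, the cheapest case.

```python
def cv_time_lower_bound(n_settings, n_folds, train_seconds, predict_seconds):
    """Lower bound on total cross-validation time across all settings."""
    per_fold = train_seconds + predict_seconds
    return n_settings * n_folds * per_fold

# 10 depth values, 5 folds, 10 s training and 2 s prediction per fold.
lower_bound = cv_time_lower_bound(10, 5, 10, 2)
print(lower_bound)
```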

Q24) In the previous question, if you train the same algorithm for tuning 2 hyperparameters, say
“max_depth” and “learning_rate”.

You want to select the right value against “max_depth” (from the given 10 depth values)

and learning rate (from the given 5 different learning rates). In such cases, which of the

following will represent the overall time?

A) 1000-1500 seconds

B) 1500-3000 seconds

C) More than or equal to 3000 seconds

D) None of these

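Solution: (C)

The reasoning is the same as above. There are 10 * 5 = 50 hyperparameter combinations, and each takes at least 60 seconds for 5-fold cross-validation, so the overall time is at least 50 * 60 = 3000 seconds.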

Q25) Given below is a scenario for training error TE and Validation error VE for a machine
learning algorithm M1.

You want to choose a hyperparameter (H) based on TE and VE.

H    TE     VE
1    105    90
2    200    85
3    250    96
4    105    85
5    300    100

Which value of H will you choose based on the above table?

A) 1

B) 2

C) 3

D) 4

E) 5

Solution: (D)

Looking at the table, option D seems the best: H = 4 has both the lowest training error (105) and the lowest validation error (85).

Q26) What would you do in PCA to get the same projection as SVD?

A) Transform data to zero mean

B) Transform data to zero median

C) Not possible

D) None of these

Solution: (A)

When the data has a zero mean vector, PCA will have the same projections as SVD; otherwise,

you have to center the data first before taking SVD.

Context for Questions 27-28

Assume there is a black box algorithm that takes training data with multiple observations (t1, t2,

t3,…….. tn) and a new observation (q1).

The black box outputs the nearest neighbor of q1 (say ti) and its corresponding class label ci.

You can also think that this black box algorithm is the same as 1-NN (1-nearest neighbor).

Q27) Is it possible to construct a k-NN classification algorithm based on this black box alone?

Note: Where n (number of training observations) is very large compared to k.

A) TRUE

B) FALSE

Solution: (A)

In the first step, you pass an observation (q1) in the black box algorithm, so this algorithm

returns the nearest observation and its class.

In the second step, you remove that nearest observation from the training data and input the observation (q1) again. The black box algorithm will then return the next nearest observation and its class.

You need to repeat this procedure k times and then take a majority vote over the k returned class labels.
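The procedure can be sketched as below. The function `one_nn` is only a stand-in for the black box, and the one-dimensional training points, the query value, and the majority vote at the end are illustrative assumptions.

```python
from collections import Counter

def one_nn(train, query):
    """Stand-in for the black box: the (point, label) nearest to query."""
    return min(train, key=lambda item: abs(item[0] - query))

def knn_via_black_box(train, query, k):
    """Build k-NN by calling the 1-NN black box k times."""
    remaining = list(train)
    votes = []
    for _ in range(k):
        nearest = one_nn(remaining, query)
        votes.append(nearest[1])
        remaining.remove(nearest)  # drop it so the next call finds a new point
    return Counter(votes).most_common(1)[0][0]

# Hypothetical 1-D training data as (value, class label) pairs.
train = [(1.0, "a"), (2.0, "a"), (3.5, "b"), (6.0, "b"), (7.0, "b")]

label_1nn = one_nn(train, 2.9)[1]
label_3nn = knn_via_black_box(train, query=2.9, k=3)
print(label_1nn, label_3nn)
```

In this toy data the single nearest neighbor and the 3-NN vote disagree, which shows that the repeated calls really do produce a different classifier.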

Q28) Instead of using a 1-NN black box, we want to use the j-NN (j>1) algorithm as a black box. Which of the following options is correct for finding k-NN using j-NN?

1. J must be a proper factor of k

2. J > k

3. Not possible

A) 1

B) 2

C) 3

Solution: (A)

Explanation is the same as above.

Q29) Suppose you are given 7 Scatter plots 1-7 (left to right), and you want to compare Pearson
correlation coefficients between variables of each scatterplot.

Which of the following is in the right order?

1. 1<2<3<4

2. 1>2>3 > 4

3. 7<6<5<4

4. 7>6>5>4

Q30) Which of the following options is/are true for the interpretation of log loss as an evaluation metric?

1. If a classifier is confident about an incorrect classification, then log loss will penalize it heavily.

2. If, for a particular observation, the classifier assigns a very small probability to the correct class, then the corresponding contribution to the log loss will be very large.

3. The lower the log loss, the better the model.

A) 1 and 3

B) 2 and 3

C) 1 and 2

D) 1,2 and 3

Solution: (D)

Options are self-explanatory.

Context for Questions 31-32

Below are five samples given in the dataset.

Note: Visual distance between the points in the image represents the actual distance.

Q31) Which of the following is leave-one-out cross-validation accuracy for 3-NN (3-nearest
neighbor)?

A) 0

B) 0.4

C) 0.8

D) 1

Solution: (C)

In leave-one-out cross-validation, we select (n-1) observations for training and 1 observation for validation. Treat each point in turn as the validation point and find the 3 nearest points to it. If you repeat this procedure for all points, you will get the correct classification for all positive classes given in the above figure, but the negative classes will be misclassified. Hence you will get 80% accuracy.

Q32) Which of the following value of K will have the least leave-one-out cross-validation
accuracy?

A) 1NN

B) 3NN

C) 4NN

D) All have the same leave-one-out error

Solution: (A)

Each point will always be misclassified in 1-NN which means that you will get 0% accuracy.

Q33) Suppose you are given the below data, and you want to apply a logistic regression model
for classifying it into two given classes.

You are using logistic regression with L1 regularization.

Where C is the regularization parameter, and w1 & w2 are the coefficients of x1 and x2.

Which of the following option is correct when you increase the value of C from zero to a

very large value?

A) First, w2 becomes zero, and then w1 becomes zero

B) First, w1 becomes zero, and then w2 becomes zero

C) Both become zero at the same time

D) Both cannot be zero even after a very large value of C

Solution: (B)

By looking at the image, we see that even by just using x2, we can efficiently perform

classification. So at first, w1 will become 0. As the regularization parameter increases more, w2

will come closer and closer to 0.

Q34) Suppose we have a dataset that can be trained with 100% accuracy with the help of
a decision tree of depth 6.

Now consider the points below and choose the option based on these points.

Note: All other hyper parameters are the same, and other factors are not affected.

1. Depth 4 will have high bias and low variance

2. Depth 4 will have low bias and low variance

A) Only 1

B) Only 2

C) Both 1 and 2

D) None of the above

Solution: (A)

If you fit a decision tree of depth 4 to such data, it is more likely to underfit the data. In the case of underfitting, you will have high bias and low variance.

Q35) Which of the following options can be used to get global minima in k-Means Algorithm?

1. Try to run an algorithm for different centroid initialization

2. Adjust the number of iterations

3. Find out the optimal number of clusters

A) 2 and 3

B) 1 and 3

C) 1 and 2

D) All of above

Solution: (D)

All of the options can be tuned to find the global minima.

Q36) Imagine you are working on a project which is a binary classification problem.
You trained a model on the training dataset and got the below confusion matrix on the

validation dataset.

Based on the above confusion matrix, choose which option(s) below will give you the

correct predictions.

1. Accuracy is ~0.91

2. Misclassification rate is ~ 0.91

3. True Negative rate is ~0.95

4. True positive rate is ~0.95

A) 1 and 3

B) 2 and 4

C) 1 and 4

D) 2 and 3

Solution: (C)

The accuracy (correct classification rate) is (50+100)/165, which is nearly equal to 0.91.

The true positive rate is the fraction of actual positives that you predict correctly, so the true positive rate would be 100/105 = 0.95. It is also known as “sensitivity” or “recall”.
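The figures in the solution can be reproduced with a few lines. The true positive and true negative counts come from the solution text; the false negative and false positive counts are inferred from the totals it quotes (105 actual positives, 165 observations) and should be read as assumptions, since the matrix itself is not shown here.

```python
# Counts taken from the solution text.
true_pos = 100
true_neg = 50
# Inferred: 105 actual positives - 100 found = 5 missed.
false_neg = 5
# Inferred: 165 total - 100 - 50 - 5 = 10.
false_pos = 10

total = true_pos + true_neg + false_pos + false_neg

accuracy = (true_pos + true_neg) / total
misclassification_rate = (false_pos + false_neg) / total
true_positive_rate = true_pos / (true_pos + false_neg)
true_negative_rate = true_neg / (true_neg + false_pos)

print(round(accuracy, 2), round(misclassification_rate, 2))
print(round(true_positive_rate, 2), round(true_negative_rate, 2))
```

Under these counts, statements 1 and 4 hold while statements 2 and 3 do not, which agrees with option C.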

Q37) For which of the following hyperparameters is a higher value better for the decision tree algorithm?

1. Number of samples used for split

2. Depth of tree

3. Samples for leaf

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1, 2 and 3

E) Can’t say

Solution: (E)

For all three hyperparameters (1, 2, and 3), increasing the value does not necessarily improve performance. For example, if we have a very high value for the depth of the tree, the resulting tree may overfit the data and would not generalize well. On the other hand, if we have a very low value, the tree may underfit the data. So, we can’t say for sure that “higher is better.”

Context for Questions 38-39

Imagine you have a 28 * 28 image, and you run a 3 * 3 convolutional layer on it with an input depth of 3 and an output depth of 8.

Note: Stride is 1, and you are using "same" padding.

Q38) What is the dimension of the output feature map when you are using the given parameters?

A) 28 width, 28 height, and 8 depth

B) 13 width, 13 height, and 8 depth

C) 28 width, 13 height, and 8 depth

D) 13 width, 28 height, and 8 depth

Solution: (A)

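The formula for calculating the output size is (N - F + 2P)/S + 1, where N is the input size, F is the filter size, P is the padding, and S is the stride.

With "same" padding and a 3 * 3 filter, P = 1, so the output width and height are (28 - 3 + 2)/1 + 1 = 28. The output depth equals the number of filters, which is 8.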
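The output size calculation, including padding, can be sketched as a small function. The name `conv_output_size` is hypothetical, and the padding of (F - 1) / 2 for "same" padding assumes an odd filter size and a stride of 1.

```python
def conv_output_size(n, f, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (n - f + 2 * padding) // stride + 1

filter_size = 3
same_padding = (filter_size - 1) // 2  # 1 pixel on each side for a 3 * 3 filter

with_same_padding = conv_output_size(28, filter_size, stride=1, padding=same_padding)
without_padding = conv_output_size(28, filter_size, stride=1, padding=0)

print(with_same_padding)  # input size is preserved
print(without_padding)    # shrinks by filter_size - 1
```

The depth of the output is set by the number of filters (8 here), independently of this spatial calculation.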

Q39) What are the dimensions of the output feature map when you are using the following
parameters?

A) 28 width, 28 height, and 8 depth

B) 13 width, 13 height, and 8 depth

C) 28 width, 13 height, and 8 depth

D) 13 width, 28 height, and 8 depth

Solution: (B)

Explanation is the same as above.

Q40) Suppose we were plotting the visualization for different values of C (Penalty parameter) in
the SVM algorithm.

Due to some reason, we forgot to tag the C values with visualizations. In that case, which of

the following option best explains the C values for the images below (1,2,3 left to right, so C

values are C1 for image1, C2 for image2, and C3 for image3 ) in the case of an rbf kernel?

A) C1 = C2 = C3

B) C1 > C2 > C3

C) C1 < C2 < C3

D) None of these

Solution: (C)

C is the penalty parameter of the error term. It controls the trade-off between a smooth decision boundary and classifying the training points correctly. For large values of C, the optimization will choose a smaller-margin hyperplane.

Conclusion
I hope these questions and answers helped you test your knowledge and maybe learn a thing or two about Python, machine learning, and deep learning.

Working through machine learning exam questions and answers like these should give you a clearer understanding of the concepts and help you prepare for interviews.

Key Takeaways

• Concepts such as supervised and unsupervised learning and neural networks are very important to learn if you are aiming for a data scientist job.

• You must be good with data analysis skills, such as handling missing values and outliers.

• Keep yourself up to date by reading data science blogs.

Other Useful Resources on Machine Learning

Here are some more articles and tutorials if you wish to explore machine learning further.

Machine Learning basics for a newbie

Deep Learning vs. Machine Learning – the essential differences you need to know!

Applied Machine Learning Course

Introduction to Data Science Course

Ace Data Science Interviews Course

facebook_user_410 Jul, 2024

Responses From Readers


Amit Srivastava, 02 May, 2017

For question 25, wouldn't Occam's Razor suggest choosing option 2? It gives the same VE, but with a lower hyperparameter value. Considering that we should keep our hyperparameters, and hence our model, simpler, wouldn't option 2 be a choice? Option 4 may be overfitting the training data.

Ankit Gupta, 02 May, 2017

Hi Amit, what you are saying is true, but here the hyperparameter H doesn't have any interpretation. In such a case, you should choose the one that has the lower training and validation error and also the closest match between them. Best! Ankit Gupta

Quan, 02 May, 2017

Hi, why is the correct answer for question 28 "Not Possible"? For example, to construct a 6-NN
classifier f


Top 40 Machine Learning Questions & Answers

Which of the following statement is true in the following case?

A) Feature F1 is an example of a nominal variable.

B) Feature F1 is an example of an ordinal variable.

C) It doesn’t belong to any of the above categories.

For example, grade A should be considered a higher grade than grade B.

Q2) Which of the following is an example of a deterministic algorithm?

D) None of the above

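Consider Y = X^2. The two variables are not only associated, but one is a function of the other, and yet the Pearson correlation between them is 0 when X is distributed symmetrically around zero.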
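The zero-correlation claim can be checked numerically. The following is a minimal sketch, assuming Y = X^2 and a small set of X values chosen symmetrically around zero (the sample values and the helper name `pearson` are illustrative, not from the original):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Un-normalised covariance and variances (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Assumed sample: X symmetric around zero, Y fully determined by X.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson(xs, ys)
print(r)  # zero linear correlation despite the exact functional relationship
```

Pearson correlation measures only linear association, so the positive and negative halves of the parabola cancel out. If X were not symmetric about zero (for example, all positive), the correlation would generally be non-zero.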

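1. In GD and SGD, you update a set of parameters in an iterative manner to minimize the error function.

2. In SGD, you must run through all the samples in your training set for a single parameter update in each iteration.

3. In GD, you either use the entire data points or a subset of training data to update a parameter in each iteration.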

1. Mean Square Error

B) 1 is SIGMOID, 2 is ReLU, and 3 is tanh activation functions.

C) 1 is ReLU, 2 is tanh, and 3 is SIGMOID activation functions.

D) 1 is tanh, 2 is SIGMOID, and 3 is ReLU activation functions.

The range of the SIGMOID function is (0, 1).

The range of the tanh function is (-1, 1).

The range of the ReLU function is [0, infinity).

So Option D is the right answer.
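The output ranges of the three activation functions can be checked empirically. This is a minimal sketch that evaluates each function on an assumed grid of inputs from -10 to 10 (the grid and the names `sigmoid`, `relu`, and `ranges` are illustrative choices):

```python
import math

def sigmoid(x):
    """Logistic sigmoid: approaches 0 and 1 but never reaches them for finite x."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)

# Assumed input grid: -10.0, -9.9, ..., 10.0
xs = [i / 10 for i in range(-100, 101)]

# Smallest and largest observed output of each function on the grid.
ranges = {
    "sigmoid": (min(sigmoid(x) for x in xs), max(sigmoid(x) for x in xs)),
    "tanh": (min(math.tanh(x) for x in xs), max(math.tanh(x) for x in xs)),
    "relu": (min(relu(x) for x in xs), max(relu(x) for x in xs)),
}

for name, (lo, hi) in ranges.items():
    print(f"{name}: min={lo:.6f}, max={hi:.6f}")
```

On this grid the sigmoid and tanh outputs stay strictly inside their bounds, while ReLU attains 0 exactly and grows with the largest input, which is what distinguishes the three curves in the question.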

What is the entropy of the target variable?

B) 5/8 log(5/8) + 3/8 log(3/8)

C) 3/8 log(5/8) + 5/8 log(3/8)

D) 5/8 log(3/8) – 3/8 log(5/8)

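The formula for entropy is -Σ p_i log(p_i), summed over the classes i, where p_i is the proportion of observations in class i. So the answer is A.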
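As a worked check, the entropy for class proportions 5/8 and 3/8 (the values that appear in the options) can be computed directly. This sketch assumes a base-2 logarithm, which the options do not state; the function name `entropy` is illustrative:

```python
import math

def entropy(probs, base=2):
    """Shannon entropy: -sum(p * log(p)) over the non-zero class proportions."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Class proportions taken from the answer options: 5 of 8 and 3 of 8.
h = entropy([5 / 8, 3 / 8])
print(round(h, 4))  # about 0.9544 bits with the assumed base-2 logarithm
```

The leading minus sign matters: each log(p_i) is negative for 0 < p_i < 1, so the options without the negation would give a negative value, which entropy can never be.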

C) Train and Test always have the same distribution.

isn’t the same in the train and the test.

Which one of the following models depicts the skip-gram model?

CBOW model, whereas Model2 represents the Skip-gram model.

following activation function could X represent?

Log loss cannot have negative values.

1. Type I error is known as a false positive, and Type II error is known as a false negative.

2. Type I error is known as a false negative, and Type II error is known as a false positive.

2. Stop word removal

etc.) from a word.

Object Standardization is also a good way to pre-process the text.

datasets “X_projected_PCA” and “X_projected_tSNE”.

Which of the following statements is true for “X_projected_PCA” & “X_projected_tSNE”?

A) X_projected_PCA will have interpretation in the nearest neighbor space.

B) X_projected_tSNE will have interpretation in the nearest neighbor space.

C) Both will have interpretation in the nearest neighbor space.

D) None of them will have interpretation in the nearest neighbor space.

Context for Questions 16-17

D) Features in Images 1 & 2

E) Features in Images 2 & 3

F) Features in Images 3 & 1

1. Remove both collinear variables.

3. Removing correlated variables might lead to a loss of information.

In order to retain those variables, we can use penalized regression models like ridge or lasso regression.

one, the R-squared always increases.

D2 & D3 relate to C1, C2 & C3?

A) D1 = C1, D2 < C2, D3 > C3

B) D1 = C1, D2 > C2, D3 > C3

C) D1 = C1, D2 > C2, D3 < C3

D) D1 = C1, D2 < C2, D3 < C3

1. The accuracy metric is not a good idea for imbalanced class problems.

2. The accuracy metric is a good idea for imbalanced class problems.

3. Precision and recall metrics are good for imbalanced class problems.

4. Precision and recall metrics aren’t good for imbalanced class problems.
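The difference between accuracy and precision/recall on imbalanced data can be illustrated with a toy example. This is a sketch on made-up labels (5 positives out of 100) and a hypothetical classifier that always predicts the majority class; none of these numbers come from the original:

```python
# Assumed toy data: 5 positive and 95 negative examples.
y_true = [1] * 5 + [0] * 95
# Hypothetical degenerate classifier: always predicts the negative class.
y_pred = [0] * 100

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

accuracy = (tp + tn) / len(pairs)
# Guard against division by zero when no positives are predicted or present.
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}")
```

The classifier scores 95% accuracy while finding none of the positive cases, so recall is 0. That gap is why precision and recall are usually preferred over accuracy when classes are imbalanced.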

Refer to question 4 of this article.

2. They have high bias, so they cannot solve complex learning problems.

3. They usually overfit.

F) None of the above

1. An increase in K will result in a higher time required to cross-validate the result.

2. Higher values of K will result in higher confidence in the cross-validation result as compared to a lower value of K.

3. If K=N, then it is called Leave one out cross validation, where N is the number of observations.

accuracy while selecting the k.
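The statements about K can be made concrete with a small index-splitting sketch. It assumes contiguous, unshuffled folds and a made-up dataset of N = 10 observations; `k_fold_indices` is a hypothetical helper written for illustration, not a library function:

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over n items."""
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

n = 10  # assumed number of observations
for k in (2, 5, n):
    folds = list(k_fold_indices(n, k))
    # One model is trained per fold, so training cost grows with k.
    print(f"k={k}: {len(folds)} model fits, {len(folds[0][1])} test item(s) per fold")
```

With K = N every fold holds out exactly one observation, which is leave-one-out cross validation, and the number of model fits equals N. That is why a larger K costs more time.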

Context for Questions 23-24

for the prediction on remaining 1-fold is 2 seconds.