Advanced ML
Correct Answer
Partially Correct
Incorrect Answer
1 Logarithmic transformation helps to handle skewed data; after transformation, the distribution
becomes closer to normal.
Your Answer
True
Correct Answer
True
Explanation
None.
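As a rough illustration of the statement in question 1, the sketch below computes the sample skewness of a small right-skewed data set before and after a natural-log transform. The data values and the helper name `skewness` are made up for illustration, and the transform only applies to strictly positive values.

```python
import math

def skewness(values):
    """Population skewness: mean cubed deviation divided by std**3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical right-skewed sample; every value must be > 0 for log().
raw = [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]
logged = [math.log(v) for v in raw]

# The log-transformed sample should show a noticeably smaller skewness.
print(skewness(raw), skewness(logged))
```

A skewness closer to zero after the transform is what "more approximate to normal" means in practice, although a log transform does not guarantee normality.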
2 Drawback of hierarchical methods:
Your Answer
They cannot correct erroneous decisions.
Correct Answer
Both a & b
Explanation
No, that's not correct.
3 Which of the following distance metrics can be used in k-NN?
Your Answer
Manhattan
Minkowski
Correct Answer
Manhattan
Minkowski
Tanimoto
Jaccard
Explanation
Sorry, you have selected the wrong answer.
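The metrics listed in question 3 can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular k-NN library; the function names are hypothetical. On set or binary data the Tanimoto coefficient coincides with the Jaccard coefficient, so one function covers both here.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B| for two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1 - len(a & b) / len(union)

print(minkowski([0, 0], [3, 4], 1))  # Manhattan: 7.0
print(minkowski([0, 0], [3, 4], 2))  # Euclidean: 5.0
print(jaccard_distance({"a", "b", "c"}, {"b", "c", "d"}))  # 0.5
```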
4 In Lasso regression, as the hyperparameter lambda tends to zero, the
solution approaches _________________.
Your Answer
Linear Regression
Correct Answer
Linear Regression
Explanation
Excellent!
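The limit in question 4 can be checked on a toy problem. The sketch below assumes a single feature with no intercept and the objective (1/(2n)) * sum((y - b*x)^2) + lambda*|b|, for which the Lasso solution has a closed form via soft-thresholding. The data and function names are illustrative assumptions.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam, clipping at zero."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_1d(x, y, lam):
    """Closed-form Lasso coefficient for one feature and no intercept."""
    n = len(x)
    rho = sum(xi * yi for xi, yi in zip(x, y)) / n
    z = sum(xi * xi for xi in x) / n
    return soft_threshold(rho, lam) / z

# Hypothetical data lying close to y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

# lam = 0 reproduces ordinary least squares; a large lam zeroes the coefficient.
for lam in (0.0, 0.01, 1.0, 100.0):
    print(lam, lasso_1d(x, y, lam))
```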
5 In which approach are models developed using different machine learning algorithms to recommend
items to users?
Your Answer
All of the Above
Correct Answer
Model-based
Explanation
None.
6 Which of the following plots is/are used for univariate analysis?
Your Answer
All of the above
Correct Answer
All of the above
Explanation
None.
7 What is/are the metric(s) used for evaluating a recommender system (check all that apply)?
Your Answer
Top-N Recall
Top-N Precision
MAP@K
AUC-ROC
Correct Answer
Top-N Recall
Top-N Precision
MAP@K
Explanation
None.
8 Which of the following is a disadvantage of decision trees?
Your Answer
Decision trees are prone to overfitting
Correct Answer
Decision trees are prone to overfitting
Explanation
You are correct!
9 Which among the following prevents overfitting when we perform bagging?
Your Answer
The use of classification algorithms which are not prone to overfitting
Correct Answer
The use of weak classifiers
Explanation
The presence of over-training (which leads to overfitting) is not generally a problem with weak classifiers. For
example, in decision stumps, i.e., decision trees with only one node (the root node), there is no real scope
for overfitting. This helps the classifier which combines the outputs of weak classifiers in avoiding overfitting.
10 A sudden transaction of a huge amount from a credit card falls into which category of anomaly?
Your Answer
Point anomaly
Correct Answer
Point anomaly
Explanation
You are correct!
11 Additive model for time series Y = . . .
Your Answer
T+S+C+I
Correct Answer
T+S+C+I
Explanation
None.
12 For K-fold cross-validation, a smaller K implies less variance.
Your Answer
False
Correct Answer
True
Explanation
No, that's not correct.
13 Would reducing the dimensions by doing PCA affect the anomalies in a dataset? Would it lead to the
disappearance of the anomalies?
Your Answer
No
Correct Answer
Partially
Explanation
Sorry, you have selected the wrong answer.
14 Naive Bayes classifiers are a collection of _____ algorithms.
Your Answer
all
Correct Answer
classification
Explanation
No, that's not correct.
15 What’s the cost function of the logistic regression?
Your Answer
both (A) and (B)
Correct Answer
both (A) and (B)
Explanation
You are correct!
16 SVMs are more effective when
Your Answer
The data is linearly separable
The data is noisy and contains overlapping points
The data is clean and ready to use
Correct Answer
The data is linearly separable
The data is clean and ready to use
Explanation
Incorrect. The correct options are 1 and 3.
17 Which of the following are the pros of Decision Trees?
Your Answer
Possible scenarios can be added
Best, worst and expected values can be determined for different scenarios
Correct Answer
Possible scenarios can be added
Uses a white-box model, if a particular result is provided by a model
Best, worst and expected values can be determined for different scenarios
Explanation
Incorrect. The correct answers are 1,2 and 4.
18 What types of error does bias cause in a model?
Your Answer
Over-generalization
Overfitting
Underfitting
Under-generalization
Correct Answer
Over-generalization
Underfitting
Explanation
You selected the wrong option.
19 Two approaches to improving the quality of hierarchical clustering:
Your Answer
Both 1 & 2
Correct Answer
Both 1 & 2
Explanation
Yes, you are right!
20 Large values of the log-likelihood statistic indicate:
Your Answer
That as the predictor variable increases, the likelihood of the outcome occurring decreases.
Correct Answer
That the statistical model is a poor fit of the data.
Explanation
Sorry, you have selected the wrong answer. sd = square root of variance
Page 1 of 5
I'm done.
Software by
Version 11.2
21 Gradient of a continuous and differentiable function
Your Answer
is non-zero at a maximum
Correct Answer
is zero at a minimum
is zero at a saddle point
decreases as you get closer to the minimum
Explanation
No, that's not correct.
22 Which NLP method is suitable for COVID-19 news analysis from online newspapers?
Your Answer
Machine Translation
Correct Answer
Sentiment Analysis
Explanation
None.
23 What are the approaches to Explainability?
Your Answer
Globally
Manually
Logically
Correct Answer
Globally
Locally
Explanation
None.
24 In SVM, the dimension of the hyperplane depends upon which one?
Your Answer
All of the above
Correct Answer
The number of features
Explanation
Sorry, you have selected the wrong answer.
25 Which of the following are true about isolation forest?
Your Answer
Identifies anomalies as the observations with short average path lengths
Isolation forest is built based on ensembles of decision trees.
Isolation forest needs an anomaly score to have an idea of how anomalous a data point is
Splits the data points by randomly selecting a value between the maximum and the minimum of the
selected features.
Correct Answer
Identifies anomalies as the observations with short average path lengths
Isolation forest is built based on ensembles of decision trees.
Isolation forest needs an anomaly score to have an idea of how anomalous a data point is
Splits the data points by randomly selecting a value between the maximum and the minimum of the
selected features.
Explanation
You are correct!
26 Which of the following methods do we use to find the best fit line for data in Linear Regression?
Your Answer
Least Square Error
Maximum Likelihood
Logarithmic Loss
Correct Answer
Least Square Error
Explanation
No, that's not correct.
27 A _________ is a decision support tool that uses a tree-like graph or model of decisions and their
possible consequences, including chance event outcomes, resource costs, and utility.
Your Answer
Decision tree
Correct Answer
Decision tree
Explanation
You are correct!
28 Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge
regression with penalty x. Choose the option which best describes the bias.
Your Answer
In case of very large x; bias is high
Correct Answer
In case of very large x; bias is low
Explanation
Sorry, you have selected the wrong answer.
29 Which of the following are true? Check all that apply
Your Answer
If you do not have any labeled data (or if all your data has label y = 0), then it is still possible to learn
p(x), but it may be harder to evaluate the system or choose a good value of ϵ.
If you are developing an anomaly detection system, there is no way to make use of labeled data to
improve your system.
If you have a large labeled training set with many positive examples and many negative examples, the
anomaly detection algorithm will likely perform just as well as a supervised learning algorithm such as
an SVM.
Correct Answer
If you do not have any labeled data (or if all your data has label y = 0), then it is still possible to learn
p(x), but it may be harder to evaluate the system or choose a good value of ϵ.
When choosing features for an anomaly detection system, it is a good idea to look for features that take
on unusually large or small values for (mainly the) anomalous examples.
Explanation
Sorry, you have selected the wrong answer.
30 If searching among a large number of hyperparameters, you should try values in a grid rather than
random values, so that you can carry out the search more systematically and not rely on chance.
Your Answer
False
Correct Answer
False
Explanation
Correct answer!
31 Which of the following is not an assumption for binary logistic regression?
Your Answer
Independence of observations
Correct Answer
Normally distributed variables
Explanation
Sorry, you have selected the wrong answer.
32 Code to import PCA in Scikit-Learn.
Your Answer
from sklearn.decomposition import PCA
Correct Answer
from sklearn.decomposition import PCA
Explanation
You are correct!
33 Suppose you want to apply the AdaBoost algorithm on data D, which has T observations. You set half
the data for training and half for testing initially. Now you want to increase the number of data points
for training T1, T2 … Tn where T1 < T2 … Tn-1 < Tn. Which of the following is true about training and
testing error in such a case?
Your Answer
The difference between training error and test error decreases as number of observations increases
Correct Answer
The difference between training error and test error decreases as number of observations increases
Explanation
You are correct!
34 A single document invokes multiple topics.
Your Answer
True
Correct Answer
True
Explanation
None.
35 Unlike multiple regression, logistic regression:
Your Answer
Does not have b weights.
Correct Answer
Predicts a categorical outcome variable.
Explanation
Sorry, you have selected the wrong answer.
36 The difficulty in recommendation when we have a new user and cannot build a profile for them, or
when we have a new item that has not received any rating yet, is the meaning of the cold-start problem.
Your Answer
False
Correct Answer
True
Explanation
None.
37 The method / metric which is NOT useful to determine the optimal number of clusters in
unsupervised clustering algorithms is
Your Answer
Elbow method
Correct Answer
Scree plot
Explanation
Sorry, you have selected the wrong answer.
38 The process of learning, recognizing, and extracting these topics across a collection of documents is
called
Your Answer
Topic Modeling
Correct Answer
Topic Modeling
Explanation
None.
39 In a neural network, knowing the weight and bias of each neuron is the most important step. If you
can somehow get the correct value of weight and bias for each neuron, you can approximate any
function. What would be the best way to approach this?
Your Answer
Assign random values and pray to God they are correct
Correct Answer
Iteratively check, after assigning a value, how far you are from the best values, and slightly change
the assigned values to make them better
Explanation
Sorry, you have selected the wrong answer.
40 We use a validation and test set to avoid the bias and variance.
Your Answer
True
Correct Answer
True
Explanation
You're correct!
41 It is possible that the assignment of observations to clusters does not change between successive
iterations in K-Means
Your Answer
True
Correct Answer
True
Explanation
You are correct!
42 Which is not an example of bias-variance trade off?
Your Answer
Overfitting
Correct Answer
Good Balance
Explanation
No, that's not correct.
43 What does "Naive" in naive classifier refer to?
Your Answer
Strong independence assumption between the features/variables.
Correct Answer
Strong independence assumption between the features/variables.
Explanation
You are correct!
44 Finding good hyperparameters is a time-consuming process. So typically you should do it once at the
beginning of the project and try to find the best hyperparameters so that you don't have to revisit tuning
them again.
Your Answer
True
Correct Answer
False
Explanation
No, that's not correct.
45 Support vectors are the data points that lie farthest from the decision surface.
Your Answer
True
Correct Answer
False
Explanation
Actually, this is a false statement.
46 How do you choose the right node while constructing a decision tree?
Your Answer
An attribute having the highest information gain.
Correct Answer
An attribute having the highest information gain.
Explanation
You are correct!
47 What property in a model is bias-variance trade off?
Your Answer
The variance can be reduced by reducing the bias
Correct Answer
The variance can be reduced by increasing the bias
Explanation
No, that's not correct.
48 What is the naive assumption in a Naive Bayes Classifier?
Your Answer
The most probable feature for a class is the most important feature to be considered for classification
Correct Answer
All the features of a class are independent of each other
Explanation
You selected the wrong option.
49 Which of the following options is/are true for K-fold cross-validation?
1. Increase in K will result in higher time required to cross-validate the result.
2. Higher values of K will result in higher confidence in the cross-validation result as compared to a lower value of K.
3. If K=N, then it is called leave-one-out cross-validation, where N is the number of observations.
Your Answer
1,2,3
Correct Answer
1,2,3
Explanation
Correct answer!
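The three statements in question 49 can be seen in a small index splitter. This is a sketch with a hypothetical helper name, `k_fold_indices`, using contiguous folds without shuffling. Each of the K folds requires one model fit, so cost grows with K, and K equal to the number of observations yields leave-one-out.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over range(n)."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    base, extra = divmod(n, k)  # the first `extra` folds get one more item
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

# 3-fold split of 6 observations.
for train, test in k_fold_indices(6, 3):
    print(train, test)

# K = N: every test fold holds exactly one observation (leave-one-out).
print([test for _, test in k_fold_indices(4, 4)])
```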
50 What are the steps for using a gradient descent algorithm?
1. Calculate error between the actual value and the predicted value
2. Reiterate until you find the best weights of network
3. Pass an input through the network and get values from output layer
4. Initialize random weight and bias
5. Go to each neuron which contributes to the error and change its respective values to reduce the error
Your Answer
4, 3, 1, 5, 2
Correct Answer
4, 3, 1, 5, 2
Explanation
You are correct!
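The order 4, 3, 1, 5, 2 from question 50 can be traced in a minimal sketch. It assumes a single linear neuron trained with squared error and stochastic updates; the learning rate, epoch count, and data are illustrative choices, not values from the quiz.

```python
import random

def train_neuron(data, lr=0.05, epochs=2000, seed=0):
    """Fit y = w*x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    # Step 4: initialize random weight and bias.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    # Step 2: repeat the loop below (here for a fixed number of epochs).
    for _ in range(epochs):
        for x, target in data:
            pred = w * x + b       # Step 3: forward pass
            error = pred - target  # Step 1: error vs. the actual value
            w -= lr * error * x    # Step 5: adjust the parameters that
            b -= lr * error        #         contributed to the error
    return w, b

# Hypothetical noise-free samples from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_neuron(data)
print(w, b)  # should end up close to 2 and 1
```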
51 k-NN works well with a small number of input variables (p), but struggles when the number of inputs
is very large
Your Answer
True
Correct Answer
True
Explanation
You are correct!
52 In SVM, we are looking to maximize the margin between the data points and the hyperplane. The loss
function that helps maximize the margin is called hinge loss.
Your Answer
False
Correct Answer
True
Explanation
Actually, this is a true statement.
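For reference, the hinge loss named in question 52 is max(0, 1 - y * f(x)) for a label y in {-1, +1} and a raw classifier score f(x). The function below is a minimal sketch with a hypothetical name.

```python
def hinge_loss(score, label):
    """Hinge loss for one example; label must be -1 or +1."""
    return max(0.0, 1.0 - label * score)

print(hinge_loss(2.0, 1))   # correct and outside the margin: 0.0
print(hinge_loss(0.5, 1))   # correct but inside the margin: 0.5
print(hinge_loss(-1.0, 1))  # misclassified: 2.0
```

The loss is zero only when an example is on the correct side with a margin of at least 1, which is why minimizing it pushes points away from the hyperplane.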
53 PCA is a technique for _______.
Your Answer
Dimensionality Reduction
Data Augmentation
Feature Extraction
Variance Normalisation
Correct Answer
Dimensionality Reduction
Feature Extraction
Explanation
Incorrect. The correct answers are 1 and 3.
54 In K-NN, the algorithm used to compute the nearest neighbors:
Your Answer
Decision tree
Ball Tree
kd tree
brute
Correct Answer
Ball Tree
kd tree
brute
Explanation
Sorry, you have selected the wrong answer.
55 Which of the following is required by K-means clustering?
Your Answer
all of the mentioned
Correct Answer
all of the mentioned
Explanation
You are correct!
56 The component of a time series which is attached to short term variation is:
Your Answer
All of the above
Correct Answer
Seasonal variation
Explanation
None.
57 What would be the consequences for the OLS estimator if heteroscedasticity is present in a
regression model but ignored?
Your Answer
All
Correct Answer
It will be inefficient
Explanation
Incorrect!
58 In a simple linear regression model (one independent variable), if we change the input variable by 1
unit, how much will the output variable change?
Your Answer
by 1
Correct Answer
by its slope
Explanation
You're incorrect.
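The answer to question 58 can be checked with a small least-squares fit. The helper `fit_line` and the data below are illustrative assumptions; the fit uses the standard closed-form slope and intercept for one predictor.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data on the line y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])

def predict(x):
    return slope * x + intercept

# Increasing the input by one unit changes the prediction by the slope.
print(predict(4) - predict(3), slope)
```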
59 Which of the following are data types of anomaly detection?
Your Answer
Outliers
Interquartile range
Level shift
Correct Answer
Outliers
Interquartile range
Spike
Level shift
Explanation
Sorry, you have selected the wrong answer.
60 In the Latent Dirichlet Allocation model for text classification purposes, what do the alpha and beta
hyperparameters represent?
Your Answer
Alpha: density of topics generated within documents, beta: density of terms generated within topics
Correct Answer
Alpha: density of topics generated within documents, beta: density of terms generated within topics
Explanation
None.
61 Seasonal variation means the variation occurring within:
Your Answer
A number of years
Correct Answer
Parts of a year
Explanation
None.
62 Which regularization is used to reduce the overfitting problem?
Your Answer
Both
Correct Answer
Both
Explanation
That's correct!
63 The important aspects of model explainability are:
Your Answer
All of the Above
Correct Answer
All of the Above
Explanation
None.
64 Discriminative models :
Your Answer
Estimate parameters directly from training data.
Correct Answer
Estimate parameters directly from training data.
Explanation
Correct.
65 Topic modelling refers to the task of identifying documents that best describe a set of topics.
Your Answer
True
Correct Answer
False
Explanation
None.
66 What is the advantage of hierarchical clustering over K-means clustering?
Your Answer
You don't have to assign the number of clusters from the beginning in the case of hierarchical clustering
Correct Answer
You don't have to assign the number of clusters from the beginning in the case of hierarchical clustering
Explanation
You are correct!
67 Statement 1: The cost function is altered by adding a penalty equivalent to the square of the
magnitude of the coefficients
Statement 2: Ridge and Lasso regression are some of the simple techniques to reduce model
complexity and prevent overfitting which may result from simple linear regression.
Your Answer
Both Statements are true
Correct Answer
Both Statements are true
Explanation
Correct.
68 Which of the following is true about Manhattan distance?
Your Answer
It can be used for continuous variables
Correct Answer
It can be used for continuous variables
Explanation
You are correct!
69 Bagging is suitable for high-variance, low-bias models
Your Answer
False
Correct Answer
True
Explanation
Sorry, you have selected the wrong answer.
70 Decision Trees can be used for both Regression and Classification Tasks.
Your Answer
True
Correct Answer
True
Explanation
Yes, you are right!
71 Model-based CF algorithm is/are___________
Your Answer
All of The Above
Correct Answer
All of The Above
Explanation
None.
72 LIME is an example of
Your Answer
All of the Above
Correct Answer
Model-Agnostic Approach
Explanation
None.
73 Irregular variations in a time series are caused by:
Your Answer
All of the above
Correct Answer
All of the above
Explanation
None.
74 XGBoost as well as CatBoost don't have an inbuilt method for categorical features. Encoding (one-hot,
target encoding, etc.) should be performed by the user.
Your Answer
True
Correct Answer
False
Explanation
CatBoost can handle categorical features whereas XGBoost can't.
75 What is the right order of components for a text classification model?
1. Text cleaning
2. Text annotation
3. Gradient descent
4. Model tuning
5. Text to predictors
Your Answer
12534
Correct Answer
12534
Explanation
None.
76 Spam email detection comes under which domain?
Your Answer
Text Classification
Correct Answer
Text Classification
Explanation
None.
77 The curse of dimensionality refers to all the problems that arise working with data in the higher
dimensions.
Your Answer
True
Correct Answer
True
Explanation
Yes, you are right!
78 In Normalization____
Your Answer
(X - min(X)) / (max(X) - min(X))
Correct Answer
(X - min(X)) / (max(X) - min(X))
Explanation
None.
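The min-max formula from question 78 maps values into the range [0, 1]. A minimal sketch follows; the function name is hypothetical, and the guard against a constant input is an added assumption, since the formula is undefined when max(X) equals min(X).

```python
def min_max_normalize(values):
    """Rescale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant sequence")
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_normalize([10, 20, 30, 50]))  # [0.0, 0.25, 0.5, 1.0]
```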
79 PCA is mostly used for ______________.
Your Answer
Both Supervised Learning and Unsupervised Learning
Correct Answer
Unsupervised Learning
Explanation
Sorry, you have selected the wrong answer.
80 Which of the following is finally produced by Hierarchical Clustering?
Your Answer
final estimate of cluster centroids
Correct Answer
tree showing how close things are to each other
Explanation
Sorry, you have selected the wrong answer.
81 In Standardization _____
Your Answer
(X - mean(X)) / std(X)
Correct Answer
(X - mean(X)) / std(X)
Explanation
None.
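The standardization formula from question 81 can be sketched with the standard library. The quiz does not say whether the standard deviation is the population or the sample version; the population version (`statistics.pstdev`) is assumed here, and the function name is hypothetical.

```python
import statistics

def standardize(values):
    """Return z-scores (x - mean) / std using the population std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant sequence")
    return [(v - mean) / std for v in values]

# Hypothetical sample with mean 5 and population std 2.
z = standardize([2, 4, 4, 4, 5, 5, 7, 9])
print(z)
```

After standardization the values have mean 0 and standard deviation 1, unlike min-max normalization, which fixes the range instead.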
82 k-NN makes no assumptions about the functional form of the problem being solved.
Your Answer
True
Correct Answer
True
Explanation
You are correct!
83 Which of the following statements are TRUE about k-means?
Your Answer
The default number of clusters to form as well as the number of centroids to generate are 5
The k-means problem is solved using either Lloyd’s or Elkan’s algorithm.
The average complexity is given by O(k n T)
In practice, the k-means algorithm is very fast
Correct Answer
The k-means problem is solved using either Lloyd’s or Elkan’s algorithm.
The average complexity is given by O(k n T)
In practice, the k-means algorithm is very fast
Explanation
Sorry, you have selected the wrong answer.
84 Which of the following is not an assumption of Linear Regression?
Your Answer
Linear Assumption between Dependent and Independent Variable
Correct Answer
Multicollinearity
Explanation
Sorry, you have selected the wrong answer.
85 Explainability and interpretability aren't used interchangeably.
Your Answer
False
Correct Answer
False
Explanation
None.
86 The amount of output of one unit received by another unit depends on what?
Your Answer
weight
Correct Answer
weight
Explanation
You are correct!
87 What will happen when eigenvalues are roughly equal?
Your Answer
PCA will perform outstandingly
Correct Answer
PCA will perform badly
Explanation
Sorry, you have selected the wrong answer.
88 The dirichlet distribution is a probability distribution as well - but it is not sampling from the space of
real numbers. Instead it is sampling over a probability simplex.
Your Answer
False
Correct Answer
True
Explanation
None.
89 Does pattern classification belong to the category of non-supervised learning?
Your Answer
True
Correct Answer
False
Explanation
Sorry, you have selected the wrong answer.
90 Select the correct option about regression with L2 regularization. A. Ridge regression technique
prevents coefficients from rising too high. B. As λ→∞, the impact of the penalty grows, and the ridge
regression coefficient estimates will approach infinity.
Your Answer
Both statements are true
Correct Answer
Statement A is true, Statement B is false
Explanation
Incorrect!
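Statement B in question 90 is false because the ridge penalty shrinks coefficients toward zero, not toward infinity. The sketch below assumes a single feature with no intercept and the objective sum((y - b*x)^2) + lambda * b^2, whose minimizer is sum(x*y) / (sum(x^2) + lambda). The data and function name are illustrative.

```python
def ridge_1d(x, y, lam):
    """Closed-form ridge coefficient for one feature and no intercept."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

# Hypothetical data lying close to y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

# The coefficient shrinks steadily toward zero as lambda grows.
for lam in (0.0, 1.0, 100.0, 1e6):
    print(lam, ridge_1d(x, y, lam))
```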
91 The output at each node is called_____.
Your Answer
node value
Correct Answer
node value
Explanation
You are correct!
92 Which function is used for k-means clustering?
Your Answer
k-means
Correct Answer
k-means
Explanation
You are correct!
93 What is Decision Tree?
Your Answer
Flow-Chart
Correct Answer
Flow-Chart & Structure in which internal node represents test on an attribute, each branch represents
outcome of test and each leaf node represents class label
Explanation
Sorry, you have selected the wrong answer.
94 Support vector machine is used for
Your Answer
Classification
Regression
Correct Answer
Classification
Regression
Outlier Detection Purposes
Explanation
Incorrect. The correct options are 1, 2 and 3.
95 Select the appropriate option which describes the Single Linkage method.
Your Answer
In single linkage hierarchical clustering, the distance between two clusters is defined as the longest
distance between two points in each cluster.
Correct Answer
In single linkage hierarchical clustering, the distance between two clusters is defined as the shortest
distance between two points in each cluster.
Explanation
Sorry, you have selected the wrong answer.
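The difference between the two definitions in question 95 is a single `min` versus `max`. The sketch below uses Euclidean distance via `math.dist` (Python 3.8+); the cluster coordinates and function names are made up for illustration.

```python
import math

def single_linkage(cluster_a, cluster_b):
    """Shortest distance between any point in A and any point in B."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)

def complete_linkage(cluster_a, cluster_b):
    """Longest distance between any point in A and any point in B."""
    return max(math.dist(p, q) for p in cluster_a for q in cluster_b)

a = [(0, 0), (1, 0)]
b = [(4, 0), (6, 0)]
print(single_linkage(a, b))    # 3.0, from (1, 0) to (4, 0)
print(complete_linkage(a, b))  # 6.0, from (0, 0) to (6, 0)
```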
96 In a naive Bayes algorithm, when an attribute value in the testing record has no example in the
training set, then the entire posterior probability will be zero.
Your Answer
False
Correct Answer
True
Explanation
Sorry! This needs work.
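The zero-probability effect in question 96 follows from the posterior being a product of per-feature likelihoods: one zero factor zeroes the whole product. The sketch below shows this and, as a common remedy that the quiz itself does not mention, add-alpha (Laplace) smoothing. All names, counts, and the prior are hypothetical.

```python
from collections import Counter

def likelihood(value, observed, alpha=0.0, n_values=None):
    """Estimate P(value | class) from observed values, with add-alpha smoothing."""
    counts = Counter(observed)
    k = n_values if n_values is not None else len(counts)
    return (counts[value] + alpha) / (len(observed) + alpha * k)

# Hypothetical training values of one attribute for a single class.
outlook_given_yes = ["sunny", "rain", "rain", "overcast"]
prior_yes = 0.6           # assumed class prior
other_feature_lik = 0.5   # assumed likelihood of a second attribute

# "foggy" never appears in training, so the unsmoothed likelihood is 0.
unsmoothed = likelihood("foggy", outlook_given_yes)
print(prior_yes * other_feature_lik * unsmoothed)  # 0.0

# With alpha = 1 over 4 possible values, the estimate is (0 + 1) / (4 + 4).
smoothed = likelihood("foggy", outlook_given_yes, alpha=1.0, n_values=4)
print(prior_yes * other_feature_lik * smoothed)
```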
97 How is the conditional probability P(B | A) = P(A, B) / P(A) rewritten in a language model?
Your Answer
P(A, B) = P(A) P(B | A)
Correct Answer
P(A, B) = P(A) P(B | A)
Explanation
None.
98 What’s the hypothesis of logistic regression?
Your Answer
to limit the cost function between 0 and 1
Correct Answer
to limit the cost function between 0 and 1
Explanation
Yes, you are right!
99 Which of the following is correct use of cross validation?
Your Answer
All of the mentioned
Correct Answer
All of the mentioned
Explanation
Correct answer!
100 What happens when model complexity increases?
Your Answer
Model bias increases
Correct Answer
Variance of the model increases
Explanation
Oh, incorrect!