Sample Questions
1. What would you do if you wanted to train logistic regression on the same data in less
time while achieving comparable (though not necessarily identical) accuracy?
a. Decrease the learning rate and decrease the number of iterations
b. Decrease the learning rate and increase the number of iterations
c. Increase the learning rate and increase the number of iterations
d. Increase the learning rate and decrease the number of iterations
2. How will the bias in estimation in a logistic regression change when very high
regularization is used in the model?
a. Bias will be high
b. Bias will be low
c. Bias may be high in some cases and low in others
d. None of the above options are correct
4. The logit function is the natural log of the odds. What could be the range of the logit
function on the domain x = [0, 1]?
a. (– ∞ , ∞)
b. (0, 1)
c. (0, ∞)
d. (– ∞, 0)
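A quick numeric check of the logit's behavior may help here (an illustrative stdlib-Python sketch, not part of the question set):

```python
import math

def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1 - p))

# At p = 0.5 the odds are 1, so the logit is exactly 0.
print(logit(0.5))        # 0.0
# As p -> 0 the logit diverges to -infinity; as p -> 1, to +infinity.
print(logit(1e-9))       # large negative
print(logit(1 - 1e-9))   # large positive
```

Since the function is unbounded in both directions as p approaches the ends of [0, 1], its range spans all real numbers.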
8. How does the bias-variance decomposition of a ridge regression estimator compare with
that of ordinary least squares regression?
a. Ridge has larger bias, larger variance
b. Ridge has smaller bias, larger variance
c. Ridge has smaller bias, and smaller variance
d. Ridge has larger bias, smaller variance
9. Both PCA and Lasso can be used for feature selection. Which of the following statements are
TRUE? (Multiple options are correct)
a. Lasso selects a subset (not necessarily a strict subset) of the original features
b. PCA and Lasso both allow you to specify how many features are chosen
c. PCA produces features that are linear combinations of the original features
d. PCA and Lasso are the same as far as their feature selection ability is concerned.
11. Which of the following are TRUE about Generative Models? (Multiple options are correct)
a. They model the joint distribution P(class = C AND sample = x)
b. An Artificial Neural Network is a generative model
c. They can be used for classification
d. Linear Discriminant Analysis is a generative model
12. Which of the following assumptions do we make while deriving linear regression
parameters? (Multiple options are correct)
a. The true relationship between the response variable y and the predictor variable x is
linear
b. The model errors are statistically independent
c. The error is normally distributed with 0 mean and constant standard deviation
d. The predictor x is non-stochastic and is measured error-free
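To make these assumptions concrete, here is a small stdlib-Python simulation (illustrative only; the constants are arbitrary) that generates data satisfying them and recovers the parameters with the closed-form least-squares estimates:

```python
import random

random.seed(42)

# Data satisfying the stated assumptions: a fixed (non-stochastic) x,
# a linear true relationship, and independent N(0, sigma^2) errors.
beta0, beta1, sigma = 1.0, 2.0, 0.5
x = [i / 10 for i in range(100)]                       # fixed design
y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]

# Closed-form simple linear regression (ordinary least squares).
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = ybar - b1 * xbar
print(round(b0, 2), round(b1, 2))  # estimates near (1.0, 2.0)
```

When the assumptions hold, the least-squares estimates land close to the true (beta0, beta1) used to generate the data.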
14. As the model complexity increases, bias will decrease while variance will increase. This
statement is:
a. Always TRUE
b. Always FALSE
c. Sometimes TRUE and sometimes FALSE
d. Model complexity has got nothing to do with model variance
15. Likelihood
a. is the same as a p-value
b. is the probability of observing a particular parameter value given a set of data
c. attempts to find the parameter value which is the most likely given the observed
data
d. minimizes the difference between the model and the data.
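A tiny maximum-likelihood sketch may make the distinction clear (illustrative stdlib Python; the coin-toss data are made up):

```python
import math

# Observed data: 7 heads out of 10 tosses. The likelihood treats the data as
# fixed and asks which parameter value p makes that data most probable.
heads, n = 7, 10

def log_likelihood(p: float) -> float:
    return heads * math.log(p) + (n - heads) * math.log(1 - p)

# Scan candidate values of p; the maximum sits at the sample proportion 0.7.
grid = [i / 100 for i in range(1, 100)]
best = max(grid, key=log_likelihood)
print(best)  # 0.7
```

Note the direction: the data are held fixed and the parameter varies, which is what separates a likelihood from an ordinary probability statement.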
19. In Principal Component Analysis, the correlation coefficients between the variables and the
components are known as:
a. Component scores
b. Component loadings
c. Correlation loadings
d. None of the above
20. Imagine you are solving a classification problem with two highly imbalanced classes. The
majority class accounts for 99% of the records in the training dataset, and your model has
99% accuracy on the test data. Which of the following is TRUE in such a case?
a. Classification accuracy, Precision, and Recall are all good metrics
b. None of Classification accuracy, Precision, and Recall are good metrics
c. Classification accuracy is not a good metric, while Precision and Recall are
d. Classification accuracy is a good metric, while Precision and Recall are not
21. Which of the following is TRUE for a White Noise series? (Hint: Multiple options are
correct.)
a. Zero mean
b. Zero auto-covariances
c. Zero autocorrelations except at lag zero
d. Stationary time series
22. Suppose we fit Lasso Regression to a data set which has 100 features (X1, X2, ....X100). Now,
we rescale one of these features, say X1, by multiplying it by 10, and then refit Lasso
Regression with the same regularization parameter. Which of the following options is now
correct?
a. It is more likely that X1 will be excluded from the model
b. It is more likely that X1 will be included in the model
c. Nothing can be said beforehand
d. None of the above
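The intuition can be seen in the one-predictor case, where the lasso solution has a closed soft-thresholding form (a sketch assuming squared-error loss, no intercept, and no re-standardization after rescaling; the data values are made up):

```python
def lasso_1d(x, y, lam):
    """Closed-form lasso for a single predictor (no intercept):
    minimize 0.5 * sum((y - b*x)^2) + lam * |b|  via soft-thresholding."""
    xty = sum(xi * yi for xi, yi in zip(x, y))
    xtx = sum(xi * xi for xi in x)
    if abs(xty) <= lam:          # coefficient is zeroed out (excluded)
        return 0.0
    sign = 1.0 if xty > 0 else -1.0
    return sign * (abs(xty) - lam) / xtx

x = [0.1, 0.2, 0.3, 0.4]
y = [0.1, 0.2, 0.3, 0.4]
lam = 0.5

print(lasso_1d(x, y, lam))                      # 0.0 -> feature excluded
print(lasso_1d([10 * xi for xi in x], y, lam))  # nonzero -> feature included
```

Multiplying x by 10 grows |x'y| tenfold while the penalty stays fixed, so the coefficient is more likely to survive the threshold, i.e. the rescaled feature is more likely to be included.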
23. Which of the following is TRUE about "Ridge" or "Lasso" Regression methods in case of
feature selection?
a. Ridge Regression uses subset selection of features
b. Lasso Regression uses subset selection of features
c. Both use subset selection of features
d. None of the above
24. Suppose you have fitted a complex regression model on a dataset. Now you are using Ridge
Regression with the tuning parameter lambda to reduce its complexity. Which of the
following statements is correct?
a. In case of very small lambda, bias is low and variance is high
b. In case of very small lambda, bias is high and variance is low
c. In case of very high lambda, bias is high and variance is low
d. In case of very high lambda, bias is low and variance is high
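The one-predictor ridge estimate has a closed form that makes the role of lambda visible (an illustrative sketch with made-up data, no intercept):

```python
def ridge_1d(x, y, lam):
    """Closed-form ridge estimate for one predictor (no intercept):
    minimize sum((y - b*x)^2) + lam * b^2  =>  b = x'y / (x'x + lam)."""
    xty = sum(xi * yi for xi, yi in zip(x, y))
    xtx = sum(xi * xi for xi in x)
    return xty / (xtx + lam)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]        # true slope is 2

print(ridge_1d(x, y, 1e-6))     # ~2.0: tiny lambda, near the OLS fit
print(ridge_1d(x, y, 1000.0))   # heavily shrunk toward 0: high bias
```

A tiny lambda leaves the estimate essentially at the unbiased OLS value (low bias, high variance); a huge lambda shrinks it toward zero (high bias, low variance).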
25. Suppose we have generated a synthetic dataset with a predictor and a response using
polynomial regression of degree 3 (i.e., a degree-3 polynomial will perfectly fit the data).
Now consider the following statements and identify which of them are CORRECT. (Multiple
options are correct; identify all of them.)
a. A simple linear regression model fitted on the data will have a high bias and low
variance
b. A simple linear regression model will have a low bias and a high variance
c. A polynomial regression model with degree 3 will have a low bias and a high
variance
d. A polynomial regression model with degree 3 will have a low bias and a low
variance
27. Suppose your model is exhibiting high variance across different training sets. Which of the
following is NOT a valid way to try and reduce the variance?
a. Increase the amount of training data in each training set
b. Improve the optimization algorithm being used for error minimization
c. Decrease the model complexity
d. Reduce the noise in the training data
30. Which of the following statements are TRUE about subset selection? (Multiple options are
true)
a. Subset selection can substantially decrease the bias in estimation
b. Ridge regression frequently eliminates some of the features
c. Subset selection can reduce overfitting
d. Finding the best subset involves exponential time complexity
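The exponential cost of best-subset search is easy to verify by enumeration (an illustrative stdlib-Python sketch with hypothetical feature names):

```python
from itertools import combinations

# Best-subset selection must, in the worst case, examine every subset of the
# p features: the sum over k of C(p, k) equals 2^p candidate models.
p = 10
features = [f"X{i}" for i in range(1, p + 1)]
n_subsets = sum(1 for k in range(p + 1)
                for _ in combinations(features, k))
print(n_subsets)  # 1024 == 2**10
```

Even at p = 100 (as in the Lasso question above), 2^100 subsets is far beyond enumeration, which is why convex relaxations like the lasso are attractive.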
34. Which of the following techniques would perform better for reducing dimensions of a data
set?
a. Removing columns which have too many missing values
b. Removing columns which have high variance in data
c. Removing columns with dissimilar data trends
d. None of the above statements are correct
35. Which of the following statement(s) is/are TRUE about Principal Component Analysis?
(Multiple options are correct)
a. PCA is an unsupervised learning method
b. PCA searches for the directions in which the data have the largest variance
c. Maximum number of principal components is the number of features in the data
d. All principal components are orthogonal to each other
38. Which of the following statements are TRUE for the Latent Dirichlet Allocation (LDA)
technique of Topic Modeling? (Multiple options are correct)
a. It is an unsupervised learning method
b. Selection of number of topics in a model does not depend on the size of the text
data
c. Number of topic terms is directly proportional to the size of the text data
d. It is used for sentiment analysis in the text data
39. When building a regression or classification model, which of the following is the correct
sequence to follow?
a. Removal of NAs → Normalize the data → PCA → Training the model
b. Removal of NAs → PCA → Normalize the data → Training the model
c. Normalize the data → Removal of NAs → Training the model → PCA
d. Normalize the data → PCA → Training the model → PCA
For the two short-answer questions (each carrying five marks), study the following topics from the
materials that I shared with you:
1. Principal Component Analysis (the PPT and the Jupyter notebook that I shared)
2. LASSO and Ridge Regression (the Jupyter notebook that I shared)
3. Topic Modelling (from the Jupyter notebook for text mining that I shared)
4. Artificial Neural Networks (from the PDF document that I shared)