ML Question Bank
5. For a simple linear regression model, the null hypothesis H_0 in hypothesis testing states:
A) beta_1 = 0
B) beta_1 != 0
C) beta_0 = 0
D) beta_0 != 0
Answer: A) beta_1 = 0
6. If the p-value for the t-test of a regression coefficient is 0.03, what can we conclude at
the 5% significance level?
A) Reject the null hypothesis
B) Fail to reject the null hypothesis
C) The coefficient is not significant
D) The model is not valid
Answer: A) Reject the null hypothesis
7. In the context of bias and variance, which of the following scenarios typically results in
high bias?
A) Using a very complex model
B) Using a very simple model
C) Using a large amount of training data
D) Using a high-dimensional dataset
Answer: B) Using a very simple model
8. Which of the following scenarios typically results in high variance?
A) Using a very complex model
B) Using a very simple model
C) Using a small training dataset
D) Using a low-dimensional dataset
Answer: A) Using a very complex model
9. Suppose you have a dataset and you apply a linear regression model and a
polynomial regression model to it. You observe that the linear model has a higher
training error but a lower test error compared to the polynomial model. This indicates
that:
A) The linear model has high bias and low variance
B) The linear model has low bias and high variance
C) The polynomial model has high bias and low variance
D) The polynomial model has low bias and high variance
Answer: D) The polynomial model has low bias and high variance
10. Given a dataset, if you increase the complexity of your model (e.g., by adding more
features), what typically happens to bias and variance?
A) Both bias and variance increase
B) Bias increases and variance decreases
C) Bias decreases and variance increases
D) Both bias and variance decrease
Answer: C) Bias decreases and variance increases
11. The coefficient of determination (R^2) in a Simple Linear Regression model
measures:
A. The slope of the regression line.
B. The proportion of variance in the independent variable explained by the
dependent variable
C. The proportion of variance in the dependent variable explained by the
independent variable.
D. The strength and direction of the linear relationship.
Answer: C) The proportion of variance in the dependent variable explained by the
independent variable
A. R^2 = 0.15
B. R^2 = 0.85
C. r = 0.2
D. r = -0.1
13. When performing Simple Linear Regression, a large p-value for the slope coefficient
suggests:
Answer: C. The relationship between the independent and dependent variable is weak or
nonexistent.
14. Which loss function does logistic regression minimize?
Correct Answer:
C. Cross-Entropy Loss
Explanation: Logistic regression minimizes the cross-entropy loss, which measures the
difference between predicted probabilities and actual binary outcomes.
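The explanation above can be made concrete with a minimal pure-Python sketch of the binary cross-entropy loss (the function name is illustrative, not from any particular library):

```python
import math

def binary_cross_entropy(y_true, p_pred):
    """Average cross-entropy loss over binary labels and predicted probabilities."""
    n = len(y_true)
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, p_pred)) / n

# Confident correct predictions give a low loss; confident wrong ones, a high loss.
print(binary_cross_entropy([1, 0], [0.9, 0.1]))  # low loss
print(binary_cross_entropy([1, 0], [0.1, 0.9]))  # high loss
```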
15. The logit function used in logistic regression is defined as:
A. ln(p)
B. ln(1 - p)
C. ln(p / (1 - p))
D. p / (1 - p)
Correct Answer:
C. ln(p / (1 - p))
Explanation: The logit function is the natural logarithm of the odds of success.
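A quick numerical check of the logit and its inverse, the logistic (sigmoid) function (helper names are illustrative):

```python
import math

def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1 - p))

def sigmoid(x):
    """Inverse of the logit: maps any real number back into (0, 1)."""
    return 1 / (1 + math.exp(-x))

print(logit(0.5))           # 0.0 -- even odds
print(sigmoid(logit(0.8)))  # recovers approximately 0.8
```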
19. What happens when ridge regression is applied to standardized data (zero mean and
unit variance)?
Correct Answer:
The penalty term is applied uniformly across all predictors.
Explanation: Standardizing the data ensures that the penalty term is applied consistently
across predictors, as all variables are on the same scale.
20. How is the optimal value of lambda typically chosen in ridge regression?
Correct Answer:
By cross-validation: fit the model over a grid of lambda values and select the one with the lowest validation error.
22. What is the key difference between ridge regression and lasso regression?
Correct Answer:
C. Ridge regression reduces coefficients but does not set them to zero, unlike lasso
regression.
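The contrast is easiest to see in the one-predictor, orthonormal-design case, where both estimators have simple closed forms (a simplifying assumption for illustration; real solvers operate on the full design matrix):

```python
def ridge_1d(beta_ols, lam):
    # Ridge: shrink the OLS estimate toward zero; never exactly zero for finite lambda.
    return beta_ols / (1 + lam)

def lasso_1d(beta_ols, lam):
    # Lasso: soft-threshold; estimates smaller than lambda are set exactly to zero.
    sign = 1.0 if beta_ols >= 0 else -1.0
    return sign * max(abs(beta_ols) - lam, 0.0)

print(ridge_1d(0.3, 0.5))  # about 0.2 -- shrunk but still nonzero
print(lasso_1d(0.3, 0.5))  # 0.0 -- eliminated from the model
```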
23. Which of the following is true about the predicted output of logistic regression?
Correct Answer:
B. It ranges from 0 to 1.
Explanation: Logistic regression outputs probabilities, which are constrained between 0 and 1
after applying the logistic function to the linear combination of predictors.
24. Which of the following regularization techniques can be applied to logistic
regression?
A. Lasso (L1)
B. Ridge (L2)
C. Elastic Net
D. All of the above
Correct Answer:
D. All of the above
Explanation: Regularization techniques like L1 (Lasso), L2 (Ridge), and Elastic Net can
prevent overfitting in logistic regression by adding penalty terms to the loss function.
MSQ
11. Which of the following are assumptions of simple linear regression?
A) Linearity
B) Homoscedasticity
C) Independence of errors
D) Multicollinearity
Answer: A) Linearity, B) Homoscedasticity, C) Independence of errors
12. In a simple linear regression Y = beta_0 + beta_1X + e, which of the following statements
are true?
A) beta_0 is the intercept
B) beta_1 is the slope
C) e represents the residual error
D) X is the dependent variable
Answer: A) beta_0 is the intercept, B) beta_1 is the slope, C) e represents the residual error
13. In multiple linear regression, which of the following are true regarding the
coefficients?
A) Each coefficient represents the change in the dependent variable for a one-unit
change in the corresponding independent variable, holding other variables constant.
B) Multicollinearity can make the coefficients unstable.
C) Coefficients are estimated using the method of maximum likelihood.
D) Coefficients are estimated using the method of least squares.
Answer: A) Each coefficient represents the change in the dependent variable for a one-unit
change in the corresponding independent variable, holding other variables constant, B)
Multicollinearity can make the coefficients unstable, D) Coefficients are estimated using the
method of least squares
14. Which of the following can indicate the presence of multicollinearity in a multiple
linear regression model?
A) High variance inflation factor (VIF) values
B) Large changes in the estimated coefficients when adding or removing a variable
C) High correlation between independent variables
D) High R-squared value
Answer: A) High variance inflation factor (VIF) values, B) Large changes in the estimated
coefficients when adding or removing a variable, C) High correlation between independent
variables
15. Which of the following are steps in hypothesis testing for regression coefficients?
A) Formulate the null and alternative hypotheses
B) Calculate the test statistic
C) Determine the p-value or critical value
D) Make a decision to reject or not reject the null hypothesis
Answer: A) Formulate the null and alternative hypotheses, B) Calculate the test statistic,
C) Determine the p-value or critical value, D) Make a decision to reject or not reject the null
hypothesis
19. Which of the following strategies can help to reduce overfitting in a regression
model?
A) Use cross-validation
B) Increase the complexity of the model
C) Use regularization techniques such as Lasso or Ridge
D) Reduce the size of the training dataset
Answer: A) Use cross-validation, C) Use regularization techniques such as Lasso or Ridge
20. In the context of the bias-variance tradeoff, which of the following statements are
true?
A) Increasing model complexity decreases bias but increases variance
B) Decreasing model complexity increases bias but decreases variance
C) The optimal model is one that balances bias and variance
D) The bias-variance tradeoff is irrelevant for model selection
Answer: A) Increasing model complexity decreases bias but increases variance, B)
Decreasing model complexity increases bias but decreases variance, C) The optimal model is
one that balances bias and variance
21. Which of the following trade-offs exist when choosing the significance level
(alpha)? (Select all that apply)
A. A lower alpha reduces the chance of Type I errors but increases the chance
of Type II errors.
B. A higher alpha increases the power of the test but risks more Type I
errors.
C. The significance level does not impact the power of the test.
D. The choice of alpha depends on the context and consequences of errors.
Correct Answers:
A. A lower alpha reduces the chance of Type I errors but increases the chance of
Type II errors.
B. A higher alpha increases the power of the test but risks more Type I errors.
D. The choice of alpha depends on the context and consequences of errors.
Explanation: The significance level directly impacts the likelihood of both errors and test
power, and its choice should be context-sensitive.
25. Which of the following metrics are appropriate for evaluating a logistic
regression (binary classification) model? (Select all that apply)
A. R-squared
B. Accuracy
C. Precision, Recall, and F1 Score
D. Area Under the ROC Curve (AUC-ROC)
Correct Answers:
B. Accuracy
C. Precision, Recall, and F1 Score
D. Area Under the ROC Curve (AUC-ROC)
Explanation: R-squared is used for linear regression, not logistic regression. Metrics like
accuracy, precision, recall, and AUC-ROC are more appropriate for binary classification.
26. Which of the following are true about Type I and Type II errors? (Select
all that apply)
Correct Answers:
Explanation: The significance level (alpha) controls the probability of a Type I error, not
a Type II error.
Explanation:
Trade-offs are an integral part of hypothesis testing: balancing error types, test sensitivity,
and practical considerations like sample size and robustness.
NAT
21. In a simple linear regression model Y = 5X + 2, ___________ is the predicted value of Y
when X = 3
Answer: 17
22. Given the simple linear regression equation Y = 3X + 7, and a data point (X, Y) = (4, 19),
the residual error for this data point is ______.
Answer: 0
23. In a multiple linear regression model Y = 2 + 3X_1 - 4X_2, _______ is the predicted
value of Y when X_1 = 2 and X_2 = 1.
Answer: 4
24. Suppose in a multiple linear regression model Y = 1 + 2X_1 + 3X_2 + 4X_3, the values
X_1 = 1, X_2 = 2, and X_3 = 3 are given. _________ is the predicted value of Y.
Answer: 21
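The predictions in questions 21-24 all use the same formula: intercept plus the dot product of coefficients and feature values. A small sketch (the helper name is illustrative):

```python
def predict(intercept, coefs, xs):
    """Linear-model prediction: intercept + sum(coefficient * feature)."""
    return intercept + sum(c * x for c, x in zip(coefs, xs))

print(predict(2, [5], [3]))              # 5*3 + 2 = 17
print(predict(2, [3, -4], [2, 1]))       # 2 + 6 - 4 = 4
print(predict(1, [2, 3, 4], [1, 2, 3]))  # 1 + 2 + 6 + 12 = 21
```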
25. In a simple linear regression model, if the t-statistic for a coefficient beta_1 is 2.5 and the
standard error is 0.4, _______ is the value of beta_1.
Answer: 1
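Since t = beta_1 / SE, the coefficient can be recovered by multiplying the t-statistic by the standard error (helper name is illustrative):

```python
def coef_from_t(t_stat, std_error):
    """Invert t = beta / SE to recover the coefficient estimate."""
    return t_stat * std_error

print(coef_from_t(2.5, 0.4))  # 2.5 * 0.4 = 1
```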
26. For a regression model, if the p-value for the F-test is 0.03, at ________ significance level
(in percentage) we reject the null hypothesis.
Answer: 5
27. If the mean squared error (MSE) of a model is 25, the variance of the model is 16,
______ is the bias squared.
Answer: 9
28. In a dataset, the actual values are [3, 5, 7, 9], and the predicted values from a linear
regression model are [2.5, 5.5, 6.5, 9.5]. _________ is the mean squared error (MSE).
Answer: 0.25
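The MSE computation in question 28 can be checked directly (helper name is illustrative):

```python
def mse(actual, predicted):
    """Mean of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Residuals are 0.5, -0.5, 0.5, -0.5; each squared is 0.25, so the mean is 0.25.
print(mse([3, 5, 7, 9], [2.5, 5.5, 6.5, 9.5]))  # 0.25
```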
29. A regression model has a bias of 2 and a variance of 3. __________ is the total error if the
irreducible error is 1.
Answer: 8
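Questions 27 and 29 both use the standard decomposition of expected squared error; note that the bias enters squared, while variance and irreducible error do not (function name is illustrative):

```python
def total_error(bias, variance, irreducible):
    """Expected squared error = bias^2 + variance + irreducible error."""
    return bias ** 2 + variance + irreducible

print(total_error(2, 3, 1))  # 2**2 + 3 + 1 = 8
# Question 27 rearranges the same identity: bias^2 = MSE - variance = 25 - 16 = 9.
```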
30. In a regression model, the sum of squared errors (SSE) is 50 and the total sum of squares
(SST) is 100. _________ is the R-squared value of the model.
Answer: 0.5
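R-squared in question 30 follows directly from SSE and SST (helper name is illustrative):

```python
def r_squared(sse, sst):
    """Coefficient of determination: R^2 = 1 - SSE / SST."""
    return 1 - sse / sst

print(r_squared(50, 100))  # 1 - 50/100 = 0.5
```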