Unit 3 MCQ
1) True-False: Linear Regression is a supervised machine learning algorithm?
A) TRUE
B) FALSE
Solution: (A)
Yes, linear regression is a supervised learning algorithm because it uses true labels for
training. A supervised learning algorithm requires an input variable (x) and an output
variable (Y) for each example.
A) TRUE
B) FALSE
Solution: (A)
A) TRUE
B) FALSE
Solution: (A)
4) Which of the following methods do we use to find the best fit line for data in Linear
Regression?
A) Least Square Error
B) Maximum Likelihood
C) Logarithmic Loss
D) Both A and B
Solution: (A)
In linear regression, we minimize the least-squares error of the model to identify the
line of best fit.
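As an illustrative sketch (not part of the quiz), the least-squares line for a single feature has a closed form based on means; the helper name and data below are hypothetical:

```python
def least_squares_line(xs, ys):
    """Closed-form ordinary least squares for y = a0 + a1*x.

    Minimizes the sum of squared vertical errors.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a1 = cov_xy / var_x
    a0 = mean_y - a1 * mean_x
    return a0, a1

# Points lying on the exact line y = 1 + 2x recover the coefficients:
a0, a1 = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
```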
5) Which of the following evaluation metrics can be used to evaluate a model while
modeling a continuous output variable?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: (D)
Since linear regression outputs continuous values, we use the mean squared error metric
to evaluate model performance. The remaining options are used for classification
problems.
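A minimal sketch of the mean squared error metric (the function name and values are illustrative):

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; suited to continuous targets."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])  # (1 + 4) / 2 = 2.5
```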
A) TRUE
B) FALSE
Solution: (A)
True. Lasso regression applies an absolute (L1) penalty, which drives some of the
coefficients to exactly zero.
A) Lower is better
B) Higher is better
C) A or B depend on the situation
D) None of these
Solution: (A)
Residuals refer to the error values of the model. Therefore lower residuals are desired.
8) Suppose that we have N independent variables (X1, X2, ..., Xn) and the dependent
variable is Y. Now imagine that you are applying linear regression by fitting the best fit
line using least-squares error on this data.
You found that the correlation coefficient for one of the variables (say X1) with Y is -0.95.
Solution: (B)
The absolute value of the correlation coefficient denotes the strength of the relationship.
Since absolute correlation is very high it means that the relationship is strong between X1
and Y.
If you are given two variables V1 and V2 that follow the two characteristics below.
9) Looking at the above two characteristics, which of the following options is correct
for the Pearson correlation between V1 and V2?
Solution: (D)
10) Suppose Pearson correlation between V1 and V2 is zero. In such case, is it right
to conclude that V1 and V2 do not have any relation between them?
A) TRUE
B) FALSE
Solution: (B)
The Pearson correlation coefficient between two variables can be zero even when they have
a relationship between them. A zero correlation coefficient only means that they do not
move together linearly. Examples include y = |x| or y = x^2.
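The y = x^2 example can be verified numerically; the Pearson helper below is an illustrative sketch, not part of the quiz:

```python
def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # deterministic relationship y = x^2
r = pearson(xs, ys)        # correlation is 0 despite the perfect relationship
```

With x symmetric around zero, the covariance between x and x^2 cancels exactly, which is why r comes out to zero.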
11) Which of the following offsets, do we use in linear regression’s least square line
fit? Suppose horizontal axis is independent variable and vertical axis is dependent
variable.
A) Vertical offset
B) Perpendicular offset
C) Both, depending on the situation
D) None of above
Solution: (A)
Ordinary least squares minimizes vertical offsets (errors in the dependent variable), not
perpendicular distances.
12) True- False: Overfitting is more likely when you have huge amount of data to
train?
A) TRUE
B) FALSE
Solution: (B)
With a small training dataset, it is easier to find a hypothesis that fits the training data
exactly, i.e., to overfit. Overfitting becomes less likely as the amount of training data
grows.
13) We can also compute the coefficient of linear regression with the help of an
analytical method called “Normal Equation”. Which of the following is/are true about
Normal Equation?
A) 1 and 2
B) 1 and 3
C) 2 and 3
D) 1,2 and 3
Solution: (D)
Instead of gradient descent, the Normal Equation can also be used to find the coefficients
in closed form as theta = (X^T X)^(-1) X^T Y. Refer to this article to read more about the
Normal Equation.
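A minimal sketch of the Normal Equation theta = (X^T X)^(-1) X^T Y for the single-feature case, using a hand-written 2x2 solve (the helper name and data are illustrative, and this is not numerically robust for real use):

```python
def normal_equation_2param(xs, ys):
    """Solve theta = (X^T X)^(-1) X^T y for a design matrix with columns [1, x]."""
    n = len(xs)
    # Entries of the 2x2 matrix X^T X
    s1, sx, sxx = n, sum(xs), sum(x * x for x in xs)
    # Entries of the vector X^T y
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Explicit 2x2 inverse via the determinant
    det = s1 * sxx - sx * sx
    theta0 = (sxx * sy - sx * sxy) / det
    theta1 = (s1 * sxy - sx * sy) / det
    return theta0, theta1

theta0, theta1 = normal_equation_2param([0, 1, 2], [1, 3, 5])  # data on y = 1 + 2x
```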
14) Which of the following statement is true about sum of residuals of A and B?
Below graphs show two fitted regression lines (A & B) on randomly generated data. Now, I
want to find the sum of residuals in both cases A and B.
Note:
A) A has higher sum of residuals than B
B) A has lower sum of residual than B
C) Both have same sum of residuals
D) None of these
Solution: (C)
For least-squares fits with an intercept, the sum of residuals is always zero, therefore
both lines have the same sum of residuals.
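This property can be checked numerically; the helper below (illustrative, not from the quiz) fits an ordinary least-squares line with an intercept and sums its residuals:

```python
def ols_residual_sum(xs, ys):
    """Fit y = a0 + a1*x by least squares and return the sum of residuals."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    a0 = my - a1 * mx
    # Residual = observed y minus fitted y
    return sum(y - (a0 + a1 * x) for x, y in zip(xs, ys))

s = ols_residual_sum([0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8])  # ~0 up to float error
```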
Suppose you have fitted a complex regression model on a dataset. Now, you are using
Ridge regression with penalty x.
Solution: (B)
If the penalty is very large, the model becomes less complex; therefore the bias will be high.
16) What will happen when you apply very large penalty?
A) Some of the coefficients will become zero
B) Some of the coefficients will approach zero but not become exactly zero
C) Both A and B depending on the situation
D) None of these
Solution: (B)
In Lasso, some of the coefficient values become exactly zero, but in the case of Ridge, the
coefficients only approach zero without reaching it.
17) What will happen when you apply very large penalty in case of Lasso?
A) Some of the coefficients will become zero
B) Some of the coefficients will approach zero but not become exactly zero
C) Both A and B depending on the situation
D) None of these
Solution: (A)
As already discussed, lasso applies absolute penalty, so some of the coefficients will
become zero.
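For an orthonormal design, the Lasso solution reduces to soft-thresholding of the least-squares coefficients; the sketch below (an illustration under that assumption, not the quiz's method) shows how coefficients smaller than the penalty are set exactly to zero:

```python
def soft_threshold(beta, lam):
    """Lasso's soft-thresholding operator (orthonormal-design case).

    Shrinks beta toward zero by lam; values within [-lam, lam] become exactly 0.
    """
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0

coeffs = [2.5, -0.3, 0.8, -1.7]
shrunk = [soft_threshold(b, 1.0) for b in coeffs]
# the two small coefficients (-0.3 and 0.8) become exactly 0.0
```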
18) Which of the following statement is true about outliers in Linear regression?
Solution: (A)
The slope of the regression line will change due to outliers in most of the cases. So Linear
Regression is sensitive to outliers.
19) Suppose you plotted a scatter plot between the residuals and predicted values in
linear regression and you found that there is a relationship between them. Which of
the following conclusion do you make about this situation?
Solution: (A)
There should not be any relationship between the predicted values and the residuals. If
such a relationship exists, it means that the model has not fully captured the information
in the data.
Suppose that you have a dataset D1 and you design a linear regression model with a degree 3
polynomial, and you find that the training and testing error is "0", or in other words, it
perfectly fits the data.
20) What will happen when you fit degree 4 polynomial in linear regression?
A) There are high chances that degree 4 polynomial will over fit the data
B) There are high chances that degree 4 polynomial will under fit the data
C) Can’t say
D) None of these
Solution: (A)
Since a degree 4 polynomial is more complex than the degree 3 model, it will again fit the
training data perfectly (i.e., overfit). In that case the training error will be zero, but the
test error may not be zero.
21) What will happen when you fit degree 2 polynomial in linear regression?
A) There are high chances that a degree 2 polynomial will overfit the data
B) There are high chances that a degree 2 polynomial will underfit the data
C) Can’t say
D) None of these
Solution: (B)
If a degree 3 polynomial fits the data perfectly, it is highly likely that a simpler model
(a degree 2 polynomial) will underfit the data.
22) In terms of bias and variance, which of the following is true when you fit a degree 2
polynomial?
A) Bias will be high, variance will be high
B) Bias will be low, variance will be high
C) Bias will be high, variance will be low
D) Bias will be low, variance will be low
Solution: (C)
Since a degree 2 polynomial will be less complex as compared to degree 3, the bias will be
high and variance will be low.
Which of the following is true about the graphs below (A, B, C, from left to right) showing
the cost function against the number of iterations?
23) Suppose l1, l2 and l3 are the three learning rates for A,B,C respectively. Which of
the following is true about l1,l2 and l3?
A) l2 < l1 < l3
B) l1 > l2 > l3
C) l1 = l2 = l3
D) None of these
Solution: (A)
With a high learning rate, the steps are large: the objective function decreases quickly at
first, but it overshoots the minimum and starts increasing after a few iterations, never
settling at the global minimum.
With a low learning rate, the steps are small, so the objective function decreases slowly.
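The effect described above can be sketched with a toy objective f(w) = (w - 3)^2 (this example is not from the quiz; the function and learning rates are illustrative):

```python
def gradient_descent(lr, steps, w=0.0):
    """Minimize f(w) = (w - 3)^2 with fixed-step gradient descent."""
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w -= lr * grad
    return w

slow = gradient_descent(lr=0.01, steps=50)     # still far from the minimum at 3
fast = gradient_descent(lr=0.3, steps=50)      # converges close to 3
diverge = gradient_descent(lr=1.1, steps=50)   # too large: w oscillates and blows up
```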
Question Context 24-25:
We have been given a dataset with n records, with input attribute x and output attribute y.
Suppose we use a linear regression method to model this data. To test our linear regressor,
we randomly split the data into a training set and a test set.
24) Now we increase the training set size gradually. As the training set size increases,
what do you expect will happen with the mean training error?
A) Increase
B) Decrease
C) Remain constant
D) Can’t Say
Solution: (D)
The training error may increase or decrease depending on the values used to fit the model.
For example, if the added training values contain more outliers, the error may increase.
25) What do you expect will happen with bias and variance as you increase the size
of training data?
Solution: (D)
As we increase the size of the training data, the bias would increase while the variance
would decrease.
Consider the following data where one input(X) and one output(Y) is given.
26) What would be the root mean square training error for this data if you run a
Linear Regression model of the form (Y = A0+A1X)?
A) Less than 0
B) Greater than zero
C) Equal to 0
D) None of these
Solution: (C)
We can fit a line perfectly to the given data, so the root mean square error will be zero.
Suppose you have been given the following scenario for training and validation error for
Linear Regression.
Scenario   Learning Rate   Number of iterations   Training Error   Validation Error
1          0.1             1000                   100              110
2          0.2             600                    90               105
3          0.3             400                    110              110
4          0.4             300                    120              130
5          0.4             250                    130              150
27) Which of the following scenarios would give you the right hyperparameters?
A) 1
B) 2
C) 3
D) 4
Solution: (B)
Option B would be the better choice because it gives the lowest training error as well as
the lowest validation error.
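Selecting the scenario with the lowest validation error from the table above can be sketched as follows (the tuple layout is illustrative):

```python
# Scenarios from the table above:
# (scenario, learning_rate, iterations, training_error, validation_error)
scenarios = [
    (1, 0.1, 1000, 100, 110),
    (2, 0.2, 600, 90, 105),
    (3, 0.3, 400, 110, 110),
    (4, 0.4, 300, 120, 130),
    (5, 0.4, 250, 130, 150),
]

# Pick the scenario with the lowest validation error.
best = min(scenarios, key=lambda s: s[4])  # scenario 2
```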
28) Suppose you got the tuned hyperparameters from the previous question. Now imagine
you want to add a variable to the variable space such that this added feature is
important. Which of the following would you observe in such a case?
Solution: (D)
If the added feature is important, the training and validation error would decrease.
Suppose, you got a situation where you find that your linear regression model is under
fitting the data.
29) In such situation which of the following options would you consider?
A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1, 2 and 3
Solution: (A)
In case of underfitting, you need to introduce more variables into the variable space or
add some polynomial-degree variables to make the model more complex, so that it can fit
the data better.
30) What type of regularization would you use in such a situation?
A) L1
B) L2
C) Any
D) None of these
Solution: (D)
No regularization method would be used, because regularization is applied in case of
overfitting, not underfitting.
1) True-False: Is Logistic regression a supervised machine learning algorithm?
A) TRUE
B) FALSE
Solution: A
True. Logistic regression is a supervised learning algorithm because it uses true labels
for training. A supervised learning algorithm requires input variables (x) and a target
variable (Y) when you train the model.
2) True-False: Logistic regression is mainly used for Regression?
A) TRUE
B) FALSE
Solution: B
Logistic regression is a classification algorithm; do not be confused by the name
"regression".
3) True-False: Is it possible to design a logistic regression algorithm using a Neural
Network Algorithm?
A) TRUE
B) FALSE
Solution: A
Yes: a neural network with a single neuron and a sigmoid (logistic) activation is
equivalent to logistic regression.
4) True-False: Is it possible to apply a logistic regression algorithm on a 3-class
classification problem?
A) TRUE
B) FALSE
Solution: A
Yes, we can apply logistic regression to a 3-class classification problem by using the
One-vs-All method.
5) Which of the following methods do we use to best fit the data in Logistic
Regression?
Solution: B
Logistic regression uses maximum likelihood estimation to fit the model parameters.
6) Which of the following evaluation metrics can not be applied in case of logistic
regression output to compare with target?
A) AUC-ROC
B) Accuracy
C) Logloss
D) Mean-Squared-Error
Solution: D
Since Logistic Regression is a classification algorithm, its output is not a real-valued
continuous quantity, so mean squared error cannot be used to evaluate it.
7) One of the very good methods to analyze the performance of Logistic Regression
is AIC, which is similar to R-Squared in Linear Regression. Which of the following is
true about AIC?
Solution: A
In logistic regression we select the model with the lowest AIC. For more information
refer to this source: http://www4.ncsu.edu/~shu3/Presentation/AIC.pdf
8) True-False: Standardization of features is required before training a Logistic
Regression model.
A) TRUE
B) FALSE
Solution: B
Standardization isn’t required for logistic regression. The main goal of standardizing
features is to help convergence of the technique used for optimization.
9) Which of the following algorithms do we use for Variable Selection?
A) LASSO
B) Ridge
C) Both
D) None of these
Solution: A
As discussed earlier, Lasso applies an absolute penalty that can shrink some coefficients
to exactly zero, so it can be used for variable selection.
Context: 10-11
Consider a following model for logistic regression: P (y =1|x, w)= g(w0 + w1x)
where g(z) is the logistic function.
In the above equation, P(y = 1|x; w), viewed as a function of x, is what we can shape by
changing the parameters w.
10) What would be the range of p in such a case?
A) (0, inf)
B) (-inf, 0 )
C) (0, 1)
D) (-inf, inf)
Solution: C
For values of x over the entire real line, from −∞ to +∞, the logistic function gives
outputs between (0, 1).
11) In above question what do you think which function would make p between (0,1)?
A) logistic function
B) Log likelihood function
C) Mixture of both
D) None of them
Solution: A
Context: 12-13
Suppose you train a logistic regression classifier and your hypothesis function H is
12) Which of the following figure will represent the decision boundary as given by
above classifier?
A)
B)
C)
D)
Solution: B
Option B would be the right answer. The decision boundary is given by y = g(−6 + x2),
which matches the lines shown in options A and B. Option B is correct because when you
put x2 = 6 into the equation you get y = g(0) = 0.5, which lies on the boundary; for
values of x2 less than 6 the argument is negative, so the output falls into the region
y = 0.
13) If you replace coefficient of x1 with x2 what would be the output figure?
A)
B)
C)
D)
Solution: D
14) Suppose you have been given a fair coin and you want to find out the odds of
getting heads. Which of the following option is true for such a case?
A) odds will be 0
B) odds will be 0.5
C) odds will be 1
D) None of these
Solution: C
Odds are defined as the ratio of the probability of success to the probability of failure.
For a fair coin the probability of success is 1/2 and the probability of failure is 1/2,
so the odds are (1/2)/(1/2) = 1.
15) The logit function(given as l(x)) is the log of odds function. What could be the
range of logit function in the domain x=[0,1]?
A) (– ∞ , ∞)
B) (0,1)
C) (0, ∞)
D) (- ∞, 0)
Solution: A
For our purposes, the odds function has the advantage of transforming the probability
function, whose values lie between 0 and 1, into an equivalent function with values
between 0 and ∞. Taking the natural log of the odds then gives a range of values from
−∞ to ∞.
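A small sketch (with illustrative probabilities) showing the logit's range growing without bound as p approaches 0 or 1:

```python
import math

def logit(p):
    """Log of odds: logit(p) = ln(p / (1 - p)) for p in (0, 1)."""
    return math.log(p / (1.0 - p))

lo = logit(1e-9)       # large negative, heading toward -inf as p -> 0
hi = logit(1 - 1e-9)   # large positive, heading toward +inf as p -> 1
mid = logit(0.5)       # ln(1) = 0
```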
16) Which of the following option is true?
A) Linear Regression error values have to be normally distributed, but in case of Logistic
Regression it is not the case
B) Logistic Regression error values have to be normally distributed, but in case of Linear
Regression it is not the case
C) Both Linear Regression and Logistic Regression error values have to be normally
distributed
D) Neither Linear Regression nor Logistic Regression error values have to be normally
distributed
Solution: A
Only A is true: classical inference for linear regression assumes normally distributed
errors, while logistic regression makes no such assumption.
17) Which of the following is true regarding the logistic function for any value “x”?
Note:
Logistic(x): is a logistic function of any number "x"
Logit(x): is a logit function of any number "x"
Logit_inv(x): is an inverse logit function of any number "x"
A) Logistic(x) = Logit(x)
B) Logistic(x) = Logit_inv(x)
C) Logit_inv(x) = Logit(x)
D) None of these
Solution: B
The logistic function is the inverse of the logit function, so Logistic(x) = Logit_inv(x).