Introduction to Machine Learning
Assignment - Week 2
TYPE OF QUESTION: MCQ
Number of questions: 10; Total marks: 10 × 2 = 20
MCQ Question
QUESTION 1:
In a binary classification problem, out of 30 data points, 12 belong to class I and 18 belong to
class II. What is the entropy of the data set?
A. 0.97
B. 0
C. 1
D. 0.67
Correct Answer : A. 0.97
Detailed Solution :
Entropy(p+, p−) = − p+ log₂ p+ − p− log₂ p−, where p+ = 12/30 and p− = 18/30.
Entropy = −(12/30) log₂(12/30) − (18/30) log₂(18/30) ≈ 0.971, so option A.
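For reference, a minimal Python sketch of this entropy calculation (the function name binary_entropy is just an illustrative choice):

```python
import math

def binary_entropy(p_pos: float) -> float:
    """Entropy of a binary split, given the fraction of positive examples."""
    p_neg = 1.0 - p_pos
    # By convention 0 * log2(0) = 0, so handle the pure-node cases explicitly.
    if p_pos == 0.0 or p_neg == 0.0:
        return 0.0
    return -p_pos * math.log2(p_pos) - p_neg * math.log2(p_neg)

print(round(binary_entropy(12 / 30), 3))  # -> 0.971, i.e. option A
```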
______________________________________________________________________________
QUESTION 2:
Decision trees can be used for the problems where
______________________________________________________________________________
QUESTION 3:
A. Variance is the error of the trained classifier with respect to the best classifier in the
concept class.
B. Variance depends on the training set size.
C. Variance increases with more training data.
D. Variance increases with more complicated classifiers.
______________________________________________________________________________
QUESTION 4:
In linear regression, our hypothesis is ℎθ(𝑥) = θ0 + θ1𝑥, and the training data is given in the table below.
x     y
6     7
5     4
10    9
3     4
If the cost function is 𝐽(θ) = (1/(2𝑚)) ∑ᵢ₌₁ᵐ (ℎθ(𝑥ᵢ) − 𝑦ᵢ)², where 𝑚 is the number of training
data points, what is the value of 𝐽(θ) when θ = (1, 1)?
A. 0
B. 1
C. 2
D. 0.5
Correct Answer: B. 1
Detailed Solution : Substituting θ0 = 1 and θ1 = 1 gives ℎθ(𝑥) = 1 + 𝑥, so the residuals ℎθ(𝑥ᵢ) − 𝑦ᵢ are 0, 2, 2 and 0, and 𝐽(θ) = (0 + 4 + 4 + 0) / (2 × 4) = 1.
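A short Python sketch of the same computation (variable names here are illustrative, not from the assignment):

```python
# Training data from the table above.
xs = [6, 5, 10, 3]
ys = [7, 4, 9, 4]

theta0, theta1 = 1, 1
m = len(xs)

# J(theta) = (1 / (2m)) * sum_i (h_theta(x_i) - y_i)^2, with h_theta(x) = theta0 + theta1 * x.
cost = sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
print(cost)  # -> 1.0, i.e. option B
```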
______________________________________________________________________________
QUESTION 5:
The value of information gain in the following decision tree is:
A. 0.380
B. 0.620
C. 0.190
D. 0.477
Correct Answer: A
Detailed Solution :
Information Gain = 0.996 - ( (17/30)*0.787 + (13/30)*0.391 ) = 0.380
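A small Python check of this arithmetic, using the parent and child entropies and split sizes quoted above:

```python
# Parent entropy and the two children: (fraction of points reaching the child, child entropy).
parent_entropy = 0.996
children = [(17 / 30, 0.787), (13 / 30, 0.391)]

# Information gain = parent entropy - weighted average of the child entropies.
gain = parent_entropy - sum(weight * h for weight, h in children)
print(round(gain, 2))  # -> 0.38, matching option A (0.380)
```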
___________________________________________________________________
QUESTION 6:
QUESTION 7:
Answer Questions 7-8 with the data given below:
ISRO wants to discriminate between Martians (M) and Humans (H) based on the following
features: Green ∈ {N,Y}, Legs ∈ {2,3}, Height ∈ {S,T}, Smelly ∈ {N,Y}. The training data is as follows:
Class  Green  Legs  Height  Smelly
M      N      3     S       Y
M      Y      2     T       N
M      Y      3     T       N
M      N      2     S       Y
M      Y      3     T       N
H      N      2     T       Y
H      N      2     S       N
H      N      2     T       N
H      Y      2     S       N
H      N      2     T       Y
_____________________________________________________________________________
QUESTION 9:
A. Discrete
B. Continuous and always lies in a finite range
C. Continuous
D. May be discrete or continuous
_____________________________________________________________________________
QUESTION 10:
A. True
B. False
Detailed Solution : With a small training dataset, it is easier to find a hypothesis that fits the
training data exactly, i.e., to overfit.
_____________________________________________________________________________
*****END*****