Sample Question Paper
1. Consider the dataset given below where A and B are attributes which can take the values 0 and 1,
and Y is the classification. The value marked “*” represents a data value that is corrupted. It is
known that during the construction of a decision tree to represent the clean dataset (i.e., one without
any “*”), the attribute B was chosen at the root instead of attribute A using information gain. Is this
information enough to guess the value of the bit that must replace “*”? Give detailed justification
for your answer.
[5 Marks]
A B Y
1 0 no
1 1 no
0 * no
0 1 yes
0 1 yes
1 1 yes
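Answer
The class entropy is H(Y) = 1 bit (3 yes, 3 no). Splitting on A gives two branches with class
counts (1 yes, 2 no) for A = 1 and (2 yes, 1 no) for A = 0, each with entropy 0.918, so
Gain(A) = 1 - 0.918 = 0.082 for either value of *.
If * = 0, the branch B = 0 is pure (2 no) and the branch B = 1 has (3 yes, 1 no) with entropy 0.811,
so Gain(B) = 1 - (4/6)(0.811) = 0.459.
If * = 1, the branch B = 0 is pure (1 no) and the branch B = 1 has (3 yes, 2 no) with entropy 0.971,
so Gain(B) = 1 - (5/6)(0.971) = 0.191.
Thus regardless of whether * = 0 or 1, B has a higher information gain than A and would have been
chosen as the root. The information given is therefore not sufficient to decide the value of *.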
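The comparison of information gains can be checked numerically. The following is a minimal sketch, assuming entropy is measured in bits; the helper names entropy, information_gain and dataset are illustrative and not part of the question paper.

```python
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(rows, attr):
    """Information gain obtained by splitting rows on attribute attr."""
    total = entropy([r["Y"] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r["Y"] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return total - remainder

def dataset(star):
    """The six rows from the question, with the corrupted bit set to star."""
    table = [(1, 0, "no"), (1, 1, "no"), (0, star, "no"),
             (0, 1, "yes"), (0, 1, "yes"), (1, 1, "yes")]
    return [{"A": a, "B": b, "Y": y} for a, b, y in table]

# Compare Gain(A) and Gain(B) for both possible values of the corrupted bit.
gains = {star: (information_gain(dataset(star), "A"),
                information_gain(dataset(star), "B"))
         for star in (0, 1)}

for star, (gain_a, gain_b) in gains.items():
    print(f"* = {star}: Gain(A) = {gain_a:.4f}, Gain(B) = {gain_b:.4f}")
```

Under these assumptions Gain(A) is about 0.082 in both cases, while Gain(B) is about 0.459 for * = 0 and about 0.191 for * = 1, so B wins either way.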
2. Suppose a logistic regression classifier σ(x,y)=1/(1+exp(-w0-7.5*x-7.5*y)) is used to
classify the following dataset. What is the range of values w0 can take for 100%
classification accuracy? Show the steps clearly.
3. For the data given below (Instance, Predicted Value, Actual Value), calculate precision, recall and
the F1 measure.
[5 Marks]
4. Apply logistic regression with gradient descent and show only the first iteration of the
algorithm. Use the learning rate = 0.5, W0 = 0.25, W1 = 2.5, W2 = -3.5, W3 = 2.5,
W^T X = W0 + W1*X1 + W2*X2 + W3*X3 and assume no regularization is used. Find the value
of the cost function and write the complete form of the hypothesis at the end of the first
iteration.
Answer
w0 = 0.243
w1 = -1.531
w2 = -1.8885
w3 = 2.6795
Value of Cost Function: ~-13.88
Final Hypothesis:
IsProductPopular = Yes if the hypothesis below gives a value >= 0.5, else No
1/(1+e^-(0.243 - 1.531*No.of.Clicks - 1.8885*No.of.Purchases + 2.6795*No.of.SavedWishlists))
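The final hypothesis can be evaluated directly once the weights are fixed. The sketch below simply plugs in the weights reported in the answer; the function names and the sample feature values in the usage lines are hypothetical and do not come from the question's dataset.

```python
from math import exp

# Weights reported in the answer after the first iteration.
W0, W1, W2, W3 = 0.243, -1.531, -1.8885, 2.6795

def popularity_probability(clicks, purchases, saved_wishlists):
    """Sigmoid of the linear score W0 + W1*clicks + W2*purchases + W3*wishlists."""
    z = W0 + W1 * clicks + W2 * purchases + W3 * saved_wishlists
    return 1.0 / (1.0 + exp(-z))

def is_product_popular(clicks, purchases, saved_wishlists, threshold=0.5):
    """Return "Yes" when the hypothesis reaches the threshold, else "No"."""
    p = popularity_probability(clicks, purchases, saved_wishlists)
    return "Yes" if p >= threshold else "No"

# Hypothetical feature values, for illustration only.
print(is_product_popular(0, 0, 0))
print(is_product_popular(1, 1, 0))
```

Because the sigmoid equals 0.5 exactly when the linear score is 0, the rule is equivalent to predicting Yes whenever 0.243 - 1.531*No.of.Clicks - 1.8885*No.of.Purchases + 2.6795*No.of.SavedWishlists >= 0.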
5. Consider the following training set with 5 examples and regression model as
Y=3-4X+2X^2
X Y
5 30
8 90
12 250
15 498
20 900
Calculate
i) Root Mean Square Error
[3]
ii) Mean Absolute Error
[2]
Solution:
Y=3-4X+2X^2
X Y Yhat Y-Yhat (Y-Yhat)^2 abs(Y-Yhat)
5 30 33 -3 9 3
8 90 99 -9 81 9
12 250 243 7 49 7
15 498 393 105 11025 105
20 900 723 177 31329 177
Sum of (Y-Yhat)^2 = 42493; sum/n = 8498.6; RMSE = sqrt(8498.6) = 92.1879
Sum of abs(Y-Yhat) = 301; MAE = 301/5 = 60.2
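The RMSE and MAE figures can be reproduced with a few lines of code. This is a minimal sketch using the five training examples and the model Y = 3 - 4X + 2X^2 from the question; the variable and function names are illustrative.

```python
from math import sqrt

# Training examples from the question.
xs = [5, 8, 12, 15, 20]
ys = [30, 90, 250, 498, 900]

def predict(x):
    """Regression model Y = 3 - 4X + 2X^2."""
    return 3 - 4 * x + 2 * x ** 2

# Residuals Y - Yhat for each example.
residuals = [y - predict(x) for x, y in zip(xs, ys)]

mse = sum(r ** 2 for r in residuals) / len(residuals)   # mean squared error
rmse = sqrt(mse)                                        # root mean square error
mae = sum(abs(r) for r in residuals) / len(residuals)   # mean absolute error

print(f"residuals = {residuals}")
print(f"RMSE = {rmse:.4f}, MAE = {mae:.1f}")
```

The two largest residuals (105 and 177) dominate the squared error, which is why the RMSE is noticeably larger than the MAE here.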
6. The city government has collected the following data on annual sales tax collections and new
car registrations, as shown in the table below.