CS2011 AI & ML End Sem
1. (a) Imagine we are playing a game where you have to guess whether the next item is a
ball or not. You know that the item is a ball around half the time. You make a guess,
the item is a ball or not a ball, and you are awarded one or zero points. Here
is an example dataset of the results of 10 guesses, the correct answers and the points
awarded.
(i) item = ball, correct guess, 1 pt
(ii) item = ball, incorrect guess, 0 pts
(iii) item = no ball, correct guess, 1 pt
(iv) item = ball, correct guess, 1 pt
(v) item = no ball, incorrect guess, 0 pts
(vi) item = no ball, incorrect guess, 0 pts
(vii) item = no ball, correct guess, 1 pt
(viii) item = no ball, incorrect guess, 0 pts
(ix) item = ball, correct guess, 1 pt
(x) item = no ball, correct guess, 1 pt
Complete the confusion matrix and evaluate the following: Accuracy, Precision,
Recall and F1-score. [5]
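A hand-worked answer to part (a) can be cross-checked with a short script. This is only a sketch under two assumptions that the question does not state: "ball" is treated as the positive class, and each guess is inferred from the item together with whether the guess was correct. The names `rounds`, `tp`, `fp`, `fn` and `tn` are illustrative.

```python
# Each tuple is (actual item, was the guess correct?), rounds (i) to (x).
rounds = [
    ("ball", True), ("ball", False), ("no ball", True), ("ball", True),
    ("no ball", False), ("no ball", False), ("no ball", True),
    ("no ball", False), ("ball", True), ("no ball", True),
]

tp = fp = fn = tn = 0
for item, correct in rounds:
    is_ball = item == "ball"
    # Assumption: a correct guess matches the item, an incorrect one is its opposite.
    guessed_ball = is_ball if correct else not is_ball
    if guessed_ball and is_ball:
        tp += 1          # guessed ball, was ball
    elif guessed_ball and not is_ball:
        fp += 1          # guessed ball, was not ball
    elif not guessed_ball and is_ball:
        fn += 1          # guessed no ball, was ball
    else:
        tn += 1          # guessed no ball, was not ball

accuracy = (tp + tn) / len(rounds)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print("TP, FP, FN, TN =", tp, fp, fn, tn)
print("accuracy, precision, recall, F1 =", accuracy, precision, recall, f1)
```

If "no ball" is taken as the positive class instead, the roles of TP/TN and FP/FN swap, and precision and recall change accordingly.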
(b) Consider a model y(x) = w^T x + b, where w ∈ R^m is the weight vector of m dimensions and
b is the random error. Assuming a k-fold cross-validation model, find the sum of the
ratios of the training sample size to the test sample size. Also express the generic term
of the ratio in terms of 'm'. [5]
2. (a) Apply the K-means algorithm (K = 2) to the data points: P1(185, 72), P2(170, 56),
P3(168, 60), P4(179, 68), P5(182, 72), P6(188, 77) for up to two iterations and show
the clusters. Choose P1 and P2 as the initial centroids. Give appropriate diagrams. [5]
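The two iterations asked for in part (a) can be traced with a small script. This is a minimal sketch that assumes Euclidean distance and the batch form of K-means, in which centroids are recomputed only after every point has been assigned. The question fixes neither choice, and the variable names are illustrative.

```python
import math

points = {"P1": (185, 72), "P2": (170, 56), "P3": (168, 60),
          "P4": (179, 68), "P5": (182, 72), "P6": (188, 77)}

# Initial centroids are P1 and P2, as stated in the question.
centroids = [points["P1"], points["P2"]]

for iteration in range(2):
    # Assignment step: each point joins the cluster of its nearest centroid.
    clusters = [[] for _ in centroids]
    for name, p in points.items():
        dists = [math.dist(p, c) for c in centroids]
        clusters[dists.index(min(dists))].append(name)

    # Update step: each centroid becomes the mean of its cluster members.
    centroids = [
        (sum(points[n][0] for n in members) / len(members),
         sum(points[n][1] for n in members) / len(members))
        for members in clusters
    ]
    print("iteration", iteration + 1, clusters, centroids)
```

The sketch does not handle an empty cluster, which does not arise for these six points.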
(b) (i) What is the full form of DBSCAN? What is the advantage of DBSCAN, and
how does it differ from the K-means clustering algorithm? [3]
(ii) In the following graph (Figure 1), assume that whenever there is a choice
among multiple nodes, both the BFS and DFS algorithms choose the leftmost
node first. Starting with the node "A" at the top, which algorithm will
visit the fewest nodes before visiting the goal node "G"? Justify your
answer. [2]
3. (a) Consider the following dataset: A1(2, 10)/Class C2, A2(2, 6)/Class C1,
A3(11, 11)/Class C3, A4(6, 9)/Class C2. In the above dataset, we have four data points
with three class labels. Applying the KNN algorithm (K = 3), find the class label
of the point P(5, 7). Numbers within parentheses are coordinates. [5]
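The KNN classification in part (a) can be checked with the sketch below. It assumes Euclidean distance and a simple majority vote among the K nearest points. The question does not name a distance metric, and the variable names are illustrative.

```python
import math
from collections import Counter

# name -> (coordinates, class label), as given in the question
training = {"A1": ((2, 10), "C2"), "A2": ((2, 6), "C1"),
            "A3": ((11, 11), "C3"), "A4": ((6, 9), "C2")}
query = (5, 7)
k = 3

# Rank the training points by Euclidean distance to the query point.
ranked = sorted(training, key=lambda name: math.dist(training[name][0], query))
neighbours = ranked[:k]

# Majority vote over the labels of the k nearest neighbours.
votes = Counter(training[name][1] for name in neighbours)
predicted = votes.most_common(1)[0][0]

print("nearest neighbours:", neighbours)
print("predicted class:", predicted)
```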
(b) Given a linear regression model Y = W^T X + β with the following information.
The initial weight vector: W = [2, 3, 7, 5]^T; Input training sample: X =
[5.6, 6.8, 4, 8]^T; Bias: β = [1.2, 1.3, 2.2, 1.5]^T and Training output: Y =
[y_a1, y_a2, ..., y_a4]^T. Assuming that the loss function L(RMSE) is to be minimized,
find the values of the Y vector, given that the model is trained for the first pass
only. [RMSE = Root Mean Square Error] [5]