
Machine Learning, (CS-3035), Online Spring End Semester Examination 2021

The document outlines the format for an online end semester examination for the Machine Learning course at KIIT Deemed to be University, detailing the structure, marking scheme, and types of questions. It includes multiple-choice questions (MCQs) and descriptive questions, covering various topics in machine learning such as Naive Bayes, SVM, PCA, and neural networks. The examination is divided into two sections, with specific time allocations and marks for each question.


Sample Question Format

(For all courses having end semester Full Mark=50)

KIIT Deemed to be University


Online End Semester Examination (Spring Semester 2021)

Subject Name & Code: Machine Learning (CS-3035)


Applicable to Courses: B.Tech (IT)

Full Marks: 50    Time: 2 Hours

SECTION-A (Answer All Questions. Each question carries 2 Marks)

Time: 30 Minutes (7×2=14 Marks)

Question No | Question Type (MCQ/SAT) | Question | CO Mapping | Answer Key (for MCQ questions only)
Q.No:1 MCQ Which of the following is the odd 1 C
one out in the list below?
A) R Squared
B) RMSE
C) Kappa
D) All of the mentioned
Point out the wrong statement. 1 C
A) Regression through the origin
yields an equivalent slope if you
center the data first
B) Normalizing variables results in
the slope being the correlation
C) Least squares is not an
estimation tool
D) None of the mentioned
Which of the following implies no 1 A
relationship with respect to
correlation?
A) Cor(X, Y) = 0
B) Cor(X, Y) = 1
C) Cor(X, Y) = 2
D) All of the mentioned
Residual ______ plots investigate 1 B
normality of the errors.
A) RR
B) QQ
C) PP
D) None of the mentioned
Q.No:2 MCQ Which of the following is the 2 B
correct formula for total variation?
A) Total Variation = Residual
Variation – Regression Variation
B) Total Variation = Residual
Variation + Regression Variation
C) Total Variation = Residual
Variation * Regression Variation
D) All of the mentioned
Which of the following is required 2 D
by K-means clustering?
A) defined distance metric
B) number of clusters
C) initial guess as to cluster
centroids
D) All of these
Minimizing the likelihood is the 2 A
same as maximizing -2 log
likelihood.
A) True
B) False
Let us say that we have computed 2 B
the gradient of our cost function
and stored it in a vector g. What is
the cost of one gradient descent
update given the gradient?
(A) O(N)
(B) O(D)
(C) O(ND)
(D) O(ND²)
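Since the gradient g (a D-dimensional vector) has already been computed, a single update w ← w − η·g touches each of the D weight components exactly once, so its cost is O(D). A minimal sketch (the variable names are illustrative, not from the paper):

```python
import numpy as np

def gd_update(w, g, lr=0.1):
    # One gradient-descent step: w <- w - lr * g.
    # With g precomputed, this is a single elementwise operation over
    # the D components of w, hence O(D).
    return w - lr * g

w = np.array([1.0, 2.0, 3.0])
g = np.array([0.5, -0.5, 1.0])
w_new = gd_update(w, g)  # [0.95, 2.05, 2.9]
```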
Q.No:3 MCQ Consider a linear-regression model 3 A
with N = 3 and D = 1 with
input-output pairs as follows: y1 =
22, x1 = 1, y2 = 3, x2 = 1, y3 = 3, x3
= 2. What is the gradient of
mean-square error (MSE) with
respect to β1 when β0 = 0 and β1 =
1? Give your answer correct to two
decimal digits.
A) -1.66
B) 1.66
C) 1.39
D) None of these
Gradient of a continuous and 3 D
differentiable function
(A) is zero at a minimum
(B) is non-zero at a maximum
(C) decreases as you get closer to
the minimum
(D) All of these

Which of the following sentences is 3 D
FALSE regarding regression?
(A) It relates inputs to outputs.
(B) It is used for prediction.
(C) It may be used for
interpretation.
(D) It discovers causal
relationships
For the one-parameter model, 3 C
mean-square error (MSE) is
defined as: MSE = (1/(2N)) Σ_{n=1}^{N} (y_n − β0)².
We have the half term in
front because,
(A) scaling MSE by half makes
gradient descent converge faster.
(B) presence of half makes it easy
to do grid search.
(C) it does not matter whether half
is there or not.
(D) none of the above

Q.No:4 MCQ Lasso can be interpreted as 3 A
least-squares linear regression
where
(A) weights are regularized with
the L1 norm
(B) weights are regularized with
the L2 norm
(C) the weights have a Gaussian
prior
(D) the solution algorithm is
simpler

Regarding bias and variance, which 3 B
of the following statements are true?
(Here ‘high’ and ‘low’ are relative to
the ideal model.)
(A) Models which overfit have a
high bias.
(B) Models which overfit have a
low bias.
(C) Models which underfit have a
high variance.
(D) All of these
In K-fold cross-validation, K is 3 B
(A) A float (decimal) value
(B) An integer value
(C) A complex value
(D) None of these
The second principal component 3 A
(PC2) is _______ to the first
principal component (PC1).
A) Orthogonal
B) Inverse
C) Transpose
D) None of these
Q.No:5 MCQ What is the cosine similarity 1 A
between [4, 3, 3, 5] and [2, 0,
0, 0]?
A) 0.52
B) 0.62
C) 0.72
D) 0.74
Find the odd one out from the following 1 C
list: Genetic Algorithm (GA),
Particle Swarm Optimization
(PSO), Stochastic Gradient Descent
(SGD), and Gravitational Search
Algorithm (GSA).
A) GA
B) PSO
C) SGD
D) None of these
What is the cosine similarity 1 D
between [5, 2, 0, 5] and [2, 0,
0, 0]?
A) 0.58
B) 0.55
C) 0.75
D) 0.68
The Manhattan distance between 1 C
two points (10, 10) and (30,30) is:
A) 20
B) 30
C) 40
D) 50
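The keyed answers for the similarity questions above can be checked with a short script. This is an illustrative sketch, not part of the question paper:

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def manhattan(p, q):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(p, q))

print(round(cosine_similarity([4, 3, 3, 5], [2, 0, 0, 0]), 2))  # 0.52
print(round(cosine_similarity([5, 2, 0, 5], [2, 0, 0, 0]), 2))  # 0.68
print(manhattan((10, 10), (30, 30)))                            # 40
```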
Q.No:6 MCQ Suppose we train a hard-margin 5 C
linear SVM on n > 100 data points
in , yielding a hyperplane with
exactly 2 support vectors. If we add
one more data point and retrain
the classifier, what is the maximum
possible number of support vectors
for the new hyperplane (assuming
the n + 1 points are linearly
separable)?
(A) 2
(B) 3
(C) (n+1)
(D) n
A valid Kernel function follows 5 C
the _______ condition.
A) Voronoi
B) Minkowski
C) Mercer's
D) None of them
In Soft-margin Support Vector 5 B
Classifier, the _____ variable
allows few instances within the
margin.
A) Slack
B) Sigma
C) Beta
D) None of these
_______ multipliers are used to 5 B
integrate the inequality constraints
into the main objective function.
A) Euler
B) Lagrangian
C) Lyapunov
D) None of these
Q.No:7 MCQ Neural networks 6 B
(A) optimize a convex cost function
(B) can be used for regression as
well as classification
(C) always output values between 0
and 1
(D) None of these
Sigmoidal activation function maps 6 B
the neuron output between
A) -1 to +1
B) 0 to 1
C) Either 0 or 1
D) None of these
A single-input, single-output neuron 6 A
has an input of 2, a weight of 6 and
a bias of -5.5. What will be the
output if the activation function is
the bipolar sigmoid (where slope
s=0.6; round off to 4 decimal
places)?
A) 0.9802
B) 0.7806
C) 0.9881
D) None of these
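The arithmetic for this question can be checked in a short sketch (net input = 2·6 − 5.5 = 6.5). Note that with slope s=0.6 the bipolar sigmoid (1 − e^(−sx))/(1 + e^(−sx)) evaluates to 0.9603, while the unipolar (logistic) form 1/(1 + e^(−sx)) evaluates to 0.9802, which is the keyed answer A:

```python
import math

def logistic(x, s=0.6):
    # Unipolar (binary) sigmoid with slope s; output in (0, 1).
    return 1.0 / (1.0 + math.exp(-s * x))

def bipolar_sigmoid(x, s=0.6):
    # Bipolar sigmoid with slope s; output in (-1, 1).
    return (1.0 - math.exp(-s * x)) / (1.0 + math.exp(-s * x))

net = 2 * 6 + (-5.5)                   # weighted input plus bias = 6.5
print(round(logistic(net), 4))         # 0.9802
print(round(bipolar_sigmoid(net), 4))  # 0.9603
```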
In a multi-layer neural network, 6 B
the term “multi” suggests
A) Only one hidden layer in
between input and output layers
B) One or more hidden layer(s) in
between input and output layers
C) Always more than one hidden
layers in between input and output
layers
D) None of these

SECTION-B (Answer Any Three Questions. Each Question carries 12
Marks)
Time: 1 Hour and 30 Minutes (3×12=36 Marks)

Question No | Question | CO Mapping (each question should be from the same CO(s))
Q.No:8 A) Why do we call the Naive Bayes classifier “Naive”? State its CO3
advantages and disadvantages. [1+2=3]
B) Explain the following with examples: One-Against-All
(OAA) and One-Against-One (OAO). [2+2=4]
C) Derive the Naive Bayes classification algorithm using the
following toy dataset. [5]

Classify a “Red SUV Domestic” using Naive Bayes Classifier.


A) What is a slack variable and what is its role in the soft-margin
SVM classifier? Explain the concept using a suitable diagram. [2]
B) Find the entropy and information gain, and draw the decision
tree for the following training dataset. [4]

C) Explain the following terms with examples: Confusion


Matrix, Accuracy, Precision, Recall, F1-score and Area Under
the Curve (AUC). [6]
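The metrics in part C (except AUC, which requires ranked scores rather than hard counts) can be computed directly from a confusion matrix. The counts below are hypothetical, chosen only to illustrate the formulas:

```python
# Hypothetical confusion-matrix counts (illustrative, not from the paper):
#              predicted +   predicted -
# actual +     TP = 50       FN = 5
# actual -     FP = 10       TN = 35
TP, FN, FP, TN = 50, 5, 10, 35

accuracy = (TP + TN) / (TP + TN + FP + FN)          # fraction of correct predictions
precision = TP / (TP + FP)                          # correctness among predicted positives
recall = TP / (TP + FN)                             # coverage of actual positives
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(accuracy, precision, recall, f1)
```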
A) Explain the following with examples: One-Against-All
(OAA) and One-Against-One (OAO). [4]
B) State Mercer's conditions for a valid kernel function.
Name at least four valid kernel functions used in SVM
classifier with appropriate equations. [4]
C) State the merits and demerits of the SVM classifier. [4]
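The toy dataset for the “Red SUV Domestic” exercise is not reproduced above, so the sketch below runs Naive Bayes on a hypothetical Color/Type/Origin table (the data rows are assumptions, not the exam's dataset) to show the mechanics of the derivation:

```python
# Hypothetical toy dataset; each row is (Color, Type, Origin, Class).
data = [
    ("Red",    "Sports", "Domestic", "Yes"),
    ("Red",    "Sports", "Domestic", "No"),
    ("Red",    "Sports", "Domestic", "Yes"),
    ("Yellow", "Sports", "Domestic", "No"),
    ("Yellow", "Sports", "Imported", "Yes"),
    ("Yellow", "SUV",    "Imported", "No"),
    ("Yellow", "SUV",    "Imported", "Yes"),
    ("Yellow", "SUV",    "Domestic", "No"),
    ("Red",    "SUV",    "Imported", "No"),
    ("Red",    "Sports", "Imported", "Yes"),
]

def naive_bayes_score(x, label):
    # P(label) * product of per-feature conditional probabilities,
    # assuming (naively) that features are independent given the class.
    rows = [r for r in data if r[-1] == label]
    prior = len(rows) / len(data)
    likelihood = 1.0
    for i, value in enumerate(x):
        likelihood *= sum(1 for r in rows if r[i] == value) / len(rows)
    return prior * likelihood

x = ("Red", "SUV", "Domestic")
scores = {c: naive_bayes_score(x, c) for c in ("Yes", "No")}
print(max(scores, key=scores.get))
```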
Q.No:9 A) What is “curse of dimensionality”? State the possible CO5
remedies. [2+1=3]
B) What is a Covariance Matrix? State its limitation. [2+1=3]
C) Using PCA, reduce the dimensionality of the given data
from 2 to 1. [6]

Feature/Example | Ex1 | Ex2 | Ex3 | Ex4
F1 | 12 | 4 | 8 | 17
F2 | 5 | 9 | 16 | 7
A) Differentiate between lossy and lossless feature/attribute
reduction with suitable examples. [3]
B) What is a Covariance Matrix? State its limitation. [2+1=3]
C) Explain the Principal Component Analysis (PCA) and
reduce the following dataset step-by-step from 2 dimensions
to 1. [6]
Feature/Example | Ex1 | Ex2 | Ex3 | Ex4
F1 | 6 | -3 | -2 | 7
F2 | -4 | 5 | 6 | -3

A) Differentiate between lossy and lossless feature/attribute
reduction with suitable examples. [2]
B) What is a principal component? Explain the
(mathematical) relationship between the first and the second
principal components. [2+2=4]
C) Using PCA, reduce the dimensionality of the given data
from 2 to 1. [6]

Feature/Example | Ex1 | Ex2 | Ex3 | Ex4
F1 | 12 | 4 | 8 | 17
F2 | 5 | 9 | 16 | 7
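The PCA steps for part C (centre the data, form the covariance matrix, eigendecompose, project onto the leading eigenvector) can be sketched with numpy, using the F1/F2 values from the table above:

```python
import numpy as np

# Data from the question: rows are features F1, F2; columns are Ex1..Ex4.
X = np.array([[12.0, 4.0, 8.0, 17.0],
              [5.0, 9.0, 16.0, 7.0]])

# 1) Centre each feature about its mean.
Xc = X - X.mean(axis=1, keepdims=True)

# 2) Covariance matrix of the two features.
C = np.cov(Xc)

# 3) Eigendecomposition of the symmetric covariance matrix; the
#    eigenvector with the largest eigenvalue is PC1.
vals, vecs = np.linalg.eigh(C)
pc1 = vecs[:, np.argmax(vals)]

# 4) Project the centred data onto PC1: a 1-D representation of the
#    four examples.
reduced = pc1 @ Xc
print(reduced.shape)  # (4,)
```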
Q.No:10 A) What are over-fitting and under-fitting? Explain them with CO2, CO5
suitable examples. [4]
B) Explain the Elbow technique to determine an appropriate
“K” value in the K-NN classifier. [2]
C) What are the intra-cluster and inter-cluster distances?
Using K-means Clustering Algorithm, cluster the following
data points: P1=(75,102), P2=(201,16), P3=(68, 80),
P4=(188,36), P5=(165,55) and P6=(100,42), where K=2, and
use Euclidean distance for the purpose. (start the
computation with P2 and P3 as the two initial centroids
points) [1+5=6]
A) What is the significance of bias in the decision boundary?
Explain the Bias-Variance relationship in machine learning
with suitable diagrams. [2+2=4]
B) Why do we use the term “regression” in the Logistic
Regression classification algorithm? [2]
C) Using the KNN algorithm and the given dataset, predict the
class label of the test data point (16,8), where K=3 and
Euclidean distance is used. [6]
X Y Label
---------------------
10 05 0
6.5 11 1
7 15 1
12 05 0
8 10 1
15 8 0
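The K-NN prediction in part C can be sketched as follows, using the table above (the helper name is illustrative):

```python
import math
from collections import Counter

# Training points from the question: (x, y, label).
train = [(10, 5, 0), (6.5, 11, 1), (7, 15, 1),
         (12, 5, 0), (8, 10, 1), (15, 8, 0)]

def knn_predict(query, k=3):
    # Sort training points by Euclidean distance to the query point,
    # then take a majority vote over the k nearest labels.
    nearest = sorted(train,
                     key=lambda p: math.hypot(p[0] - query[0], p[1] - query[1]))
    votes = Counter(label for _, _, label in nearest[:k])
    return votes.most_common(1)[0][0]

print(knn_predict((16, 8)))  # 0 (the three nearest points all carry label 0)
```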
A. What are over-fitting and under-fitting? Explain
them with suitable examples. [3]
B. What are the intra-cluster and inter-cluster distances?
Explain the Elbow technique to determine an appropriate “K”
value in K-NN classifier. [1.5+1.5=3]
C. Using K-means Clustering Algorithm, cluster the following
data points: P1=(75,102), P2=(201,16), P3=(68, 80),
P4=(188,36), P5=(165,55) and P6=(100,42), where K=2, and
use Euclidean distance for the purpose. (start the
computation with P1 and P3 as the two initial centroids
points) [6]
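The K-means computation in part C can be sketched as below, starting from P1 and P3 as the question specifies (the function is an illustrative implementation, not from the paper):

```python
import math

points = [(75, 102), (201, 16), (68, 80), (188, 36), (165, 55), (100, 42)]

def kmeans(points, centroids, iters=10):
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: each centroid moves to its cluster's mean.
        centroids = [tuple(sum(c) / len(c) for c in zip(*cl))
                     for cl in clusters]
    return clusters, centroids

# Initial centroids P1 and P3; the clusters converge to
# {P1, P3, P6} and {P2, P4, P5}.
clusters, centroids = kmeans(points, [(75, 102), (68, 80)])
print(clusters)
```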
Q.No:11 A) What is the Binary Activated Neuron model? State its CO6
limitations. [4]
B) A two-input, single-output neuron model has weight values
[1.5, -2.1] and a bias of -3. It is given the input [2.0, 2.5]. What
will be the output if a binary step function with threshold=1 is
used? [4]
C) Describe Multi-layer Perceptron Neural Network with
suitable mathematical expressions and diagram. [4]
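The arithmetic for part B above reduces to net = 1.5·2.0 + (−2.1)·2.5 − 3 = −5.25, which falls below the threshold of 1, so the binary step neuron outputs 0. A minimal sketch:

```python
def step(net, threshold=1.0):
    # Binary step activation: fire (output 1) only when the net input
    # reaches the threshold.
    return 1 if net >= threshold else 0

weights = [1.5, -2.1]
bias = -3.0
inputs = [2.0, 2.5]

net = sum(w * x for w, x in zip(weights, inputs)) + bias  # -5.25
print(step(net))  # 0
```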
A) What is perceptron? Explain it with an example. [4]
B) Differentiate between linearly and non-linearly separable
datasets. [2]
C) Solve the XOR problem with a two-input artificial neuron
model. [6]
A) What are the learning rate and the momentum in Artificial
Neural Network (ANN) model? State different learning rules
used in ANN. [2+2=4]
B) Draw a diagram of a multiple input single output artificial
neural network and compute its input-output relationship. [4]
C) Describe the Backpropagation algorithm using appropriate
mathematical expressions. [4]
