
Sessional – 2, April 2023

Machine Learning (DSE 2254), IV Sem, DSE

Date: 19/04/2023 Max. Marks: 15


Duration: 1 hr

Instructions to Candidates
- Answer ALL the questions.
- Use of a calculator is allowed. Use of a mobile phone is NOT allowed.

Type: MCQ

Q1. What is the goal of Principal Component Analysis (PCA)? (0.5)


1. **To transform the original features into a new set of uncorrelated features.
2. To find the optimal decision boundary between classes.
3. To identify the most important features in a dataset.
4. To reduce the number of features in a dataset.
Q2. The ____________ probability is one of the quantities involved in Bayes' rule. It is the
conditional probability of a given event, computed after observing a second event whose conditional
and unconditional probabilities were known in advance. It is computed by revising the prior
probability. (0.5)
1. Prior.
2. Zero.
3. **Posterior.
4. None of these.
Q3. How is the number of principal components chosen in PCA? (0.5)
1. Based on the number of features in the dataset.
2. **Based on the amount of variance explained by each component.
3. Based on the correlation between each pair of features.
4. None of these
Q4. What is the relationship between principal components and original features in PCA?(0.5)
1. Each principal component represents a single original feature.
2. Each original feature is a linear combination of all the principal components.
3. ** Each principal component is a linear combination of all the original features.
4. There is no relationship between principal components and original features.
Q5. What is the primary goal of linear discriminant analysis? (0.5)
1. To reduce the dimensionality of a dataset
2. **To find the decision boundary that maximizes class separation
3. To identify the most important features in a dataset
4. To fit a linear regression model to the data
Q6. Which of the following is a use case for linear discriminant analysis? (0.5)
1. **Identifying fraudulent credit card transactions
2. Predicting the price of a house
3. Segmenting customers based on demographics
4. All of these
Q7. Eigenvectors are defined only for a ________ matrix. (0.5)
1. Identity
2. ** Square
3. Orthogonal
4. Diagonal
Q8. Which of the following is false about non-linear SVM? (0.5)
1. It can only handle linearly separable data
2. It always ignores the outliers in data
3. It is only applicable to binary classification problems
4. **All these are false
Q9. Which of the following is true about SVM? (0.5)
1. It is only applicable to binary classification problems
2. **It can handle high-dimensional data
3. It cannot handle nonlinear data
4. It is a type of unsupervised learning algorithm
Q10. Suppose that an individual is extracted at random from a population of men. The
probability of extracting a married individual is 50%. The probability of extracting a childless
individual is 40%. The conditional probability that an individual is childless given that he is
married is equal to 20%. If the individual we extract at random from the population turns out
to be childless, what is the conditional or posterior probability that he is married? (0.5)
1. ** 1/4
2. 2/3
3. 3/8
4. 1/2
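The posterior in Q10 follows directly from Bayes' rule, P(married | childless) = P(childless | married) * P(married) / P(childless). A quick numerical check (variable names are illustrative):

```python
# Bayes' rule check for Q10.
p_married = 0.5                    # P(married)
p_childless = 0.4                  # P(childless)
p_childless_given_married = 0.2    # P(childless | married)

# Posterior: revise the prior P(married) after observing "childless".
p_married_given_childless = p_childless_given_married * p_married / p_childless
print(p_married_given_childless)   # 0.25, i.e. 1/4
```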
Type: DES

Q11. Explain the role of the Kernel function in SVM. Discuss different types of Kernel functions. (2)
Role of the Kernel in SVM: 0.5 Marks
At least 3 different kernel functions with a one-line definition each: 0.5 * 3 = 1.5 Marks
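As a hedged illustration (pure Python; the function names and hyperparameter defaults are my own, chosen to mirror common library conventions), three widely used kernel functions can be sketched as:

```python
import math

# The kernel trick: replace the inner product <x, z> with K(x, z) so the SVM
# can learn non-linear boundaries without an explicit feature mapping.

def linear_kernel(x, z):
    # K(x, z) = <x, z>
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=3, coef0=1.0):
    # K(x, z) = (<x, z> + c)^d
    return (linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    # K(x, z) = exp(-gamma * ||x - z||^2)  (Gaussian / RBF kernel)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = [1.0, 2.0], [2.0, 0.0]
print(linear_kernel(x, z))       # 2.0
print(polynomial_kernel(x, z))   # (2 + 1)^3 = 27.0
print(rbf_kernel(x, z))          # exp(-0.5 * 5) = exp(-2.5) ≈ 0.0821
```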

Q12. Explain any FOUR (4) dimensionality reduction techniques. (2)

- Only list FOUR DR methods: 0.5 marks
- Explain FOUR DR methods: 2.0 marks (0.5 each)

Dimensionality reduction techniques are used to reduce the number of features or variables
in a dataset while still retaining the important information. This is particularly useful when
dealing with high-dimensional data where the number of variables is much larger than the
number of observations.
Principal Component Analysis (PCA): PCA is a linear dimensionality reduction technique that
aims to find a new set of uncorrelated variables, known as principal components, that
capture the maximum amount of variance in the original data. The first principal component
captures the direction of maximum variance in the data, the second captures the direction
of the maximum remaining variance, and so on. PCA is commonly used in data visualization,
feature extraction, and data compression.
t-Distributed Stochastic Neighbor Embedding (t-SNE): t-SNE is a nonlinear dimensionality
reduction technique that maps high-dimensional data onto a low-dimensional space
(typically 2D or 3D) by preserving the pairwise similarities between data points. It uses a
probabilistic approach to model the similarity between points in high-dimensional space and
low-dimensional space, with a focus on preserving the structure of the data. t-SNE is
particularly useful for visualizing high-dimensional data, as it can reveal the underlying
structure and relationships between the data points.
Uniform Manifold Approximation and Projection (UMAP): UMAP is a nonlinear
dimensionality reduction technique that is similar to t-SNE, but uses a different approach to
construct the low-dimensional representation. UMAP works by constructing a high-
dimensional graph of the data points and then using a smooth function to map the points
onto a low-dimensional space. This smooth function is designed to preserve the local
structure of the data, which makes UMAP particularly useful for preserving the cluster
structure of the data.
Locally Linear Embedding (LLE): LLE is a nonlinear dimensionality reduction technique that
works by finding a low-dimensional representation of the data that preserves the local
structure of the data. LLE constructs a graph of the data points and then finds a low-
dimensional representation that minimizes the difference between the distances in the high-
dimensional space and the distances in the low-dimensional space. LLE is particularly useful
for preserving the local structure of the data, which makes it useful for data visualization and
anomaly detection.
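The PCA procedure described above (find uncorrelated directions of maximum variance and project onto the top few) can be sketched numerically. This is a minimal NumPy illustration under my own assumptions (random data, covariance eigen-decomposition), not part of the answer key:

```python
import numpy as np

# Minimal PCA sketch via eigen-decomposition of the covariance matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X = X - X.mean(axis=0)                      # centre the data

cov = np.cov(X, rowvar=False)               # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]           # sort components by variance
components = eigvecs[:, order]

explained = eigvals[order] / eigvals.sum()  # fraction of variance per component
X_reduced = X @ components[:, :2]           # project onto the top-2 components

print(explained.round(3), X_reduced.shape)
```

The `explained` ratios are what Q3 refers to: components are kept until enough cumulative variance is captured.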

Q13. Predict whether the tuple (1.8, 2.1) belongs to Class A or Class B using the principles of
Maximum Likelihood Estimation. (3)
- Formulae: 0.5 marks
- Correct steps: 2.0 marks
- Correct answer: 0.5 marks

              µx       µy      σx      σy
Class A     -0.19     5.03    4.12    1.78
Class B     -2.18    -2.84    2.04    0.85

likelihood_A = (1 / (2 * pi * 4.12 * 1.78)) * exp(-((1.8 - (-0.19)) / 4.12)**2 / 2)
             * exp(-((2.1 - 5.03) / 1.78)**2 / 2) ≈ 0.00498
likelihood_B = (1 / (2 * pi * 2.04 * 0.85)) * exp(-((1.8 - (-2.18)) / 2.04)**2 / 2)
             * exp(-((2.1 - (-2.84)) / 0.85)**2 / 2) ≈ 6.3e-10
Since likelihood_A is greater than likelihood_B, the tuple (1.8, 2.1) belongs to Class A.
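The likelihood comparison can be verified with a short script that evaluates the bivariate Gaussian density (with independent dimensions, the product of two 1-D normal densities; the function name is illustrative):

```python
import math

def gaussian_likelihood(x, y, mu_x, mu_y, sigma_x, sigma_y):
    # Bivariate Gaussian with independent dimensions:
    # product of two 1-D normal densities.
    norm = 1.0 / (2 * math.pi * sigma_x * sigma_y)
    ex = math.exp(-((x - mu_x) / sigma_x) ** 2 / 2)
    ey = math.exp(-((y - mu_y) / sigma_y) ** 2 / 2)
    return norm * ex * ey

# Parameters from the Q13 table.
lA = gaussian_likelihood(1.8, 2.1, -0.19, 5.03, 4.12, 1.78)
lB = gaussian_likelihood(1.8, 2.1, -2.18, -2.84, 2.04, 0.85)

print(lA, lB)
print("Class A" if lA > lB else "Class B")
```

Evaluating the density directly gives likelihood_A on the order of 5e-3 and likelihood_B on the order of 6e-10, so the tuple is assigned to Class A.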
Q14. Answer the following questions: (3)
A) Explain different ways of measuring impurity in data using Decision tree model.
3 methods: Gain Ratio, Entropy and Gini Index – 0.5 * 3 = 1.5 Marks
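A hedged sketch of two of these impurity measures for a list of class probabilities at a node (gain ratio builds on entropy via split information and is omitted here):

```python
import math

def entropy(probs):
    # Entropy: -sum(p * log2(p)); 0 for a pure node, max for a uniform split.
    return -sum(p * math.log2(p) for p in probs if p > 0)

def gini(probs):
    # Gini index: 1 - sum(p^2); 0 for a pure node.
    return 1.0 - sum(p ** 2 for p in probs)

# A node with a 50/50 class split is maximally impure for two classes.
print(entropy([0.5, 0.5]))  # 1.0
print(gini([0.5, 0.5]))     # 0.5
print(entropy([1.0]))       # 0.0 (pure node)
```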

B) Explain different ways of fitting the decision tree model to avoid over-fitting.


Ans: Overfitting is the phenomenon in which the trained model fits the training data too closely and
does not perform well on test or unseen data instances, violating the principle of
Generalization. Definition: 0.5 M
Overfitting can be avoided by pre-pruning (halting tree growth early, e.g. by limiting tree depth or
requiring a minimum number of tuples per node) or by post-pruning (growing the full tree and then
removing branches that do not improve accuracy on held-out data). Removing redundant or irrelevant
attributes from the original data also helps.
2 methods with a one-line explanation each: 0.5 * 2 = 1 M
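As an illustrative sketch (scikit-learn classes assumed to be available; the dataset and hyperparameters are invented for the example), limiting tree depth, a simple form of pre-pruning, looks like:

```python
# Depth limiting as pre-pruning with scikit-learn's DecisionTreeClassifier.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An unrestricted tree memorizes the training set (train accuracy 1.0).
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
# A depth-limited tree trades training accuracy for better generalization.
pruned = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

print("full:  ", full.score(X_tr, y_tr), full.score(X_te, y_te))
print("pruned:", pruned.score(X_tr, y_tr), pruned.score(X_te, y_te))
```

Post-pruning is also available in scikit-learn via cost-complexity pruning (the `ccp_alpha` parameter).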
