Machine Learning Insem-01 QP

The document contains a question paper for a Machine Learning course, featuring multiple-choice questions and descriptive questions related to various concepts in machine learning. Topics include k-Nearest Neighbors, supervised learning, distance metrics, handling outliers, missing data, and regularization techniques. It also includes practical tasks such as data analysis and correlation calculation based on a dataset of cars.


MACHINE LEARNING INSEM-01 QP

1 Which of the following is a disadvantage of k-Nearest Neighbors algorithm? [0.5 M]


a) Low accuracy b) Insensitive to outliers
c) Computationally expensive d) Needs very little memory

2 An instance-based learner is a ____________. [0.5 M]


a) Lazy learner b) Eager learner c) Fast learner d) None of these

3 Which of the following is not a supervised learning algorithm? [0.5 M]


a) Naive Bayesian b) Linear Regression
c) Principal Component Analysis d) Decision Tree

4 The Euclidean distance between the data points A and B is ______. [0.5 M]

a) 0.17 b) 0.38 c) 0.91 d) 0.9

5 The ______ processes are powerful, non-parametric tools that can be used in supervised [0.5 M]
learning, namely in regression but also in classification problems.
(a) Stochastic (b) Markov (c) Gaussian (d) Statistical

6 Machine Learning uses the theory of ___________ in building mathematical models, [0.5 M]
because the core task is making inference from a sample.
(a) Statistics (b) Mathematics (c) Optimization (d) Physics

7 A _________________ is a function that separates the examples of different classes. [0.5 M]


(a) Determinant (b) Discriminant (c) Random Process
(d) Optimization Problem

8 If the values of two variables move in opposite directions, the correlation is [0.5 M]
___________
(a) Strong (b) Weak (c) Positive (d) Negative

9 The _________ is a model assessment technique used to evaluate a machine learning [0.5 M]
algorithm’s performance when making predictions on new datasets it has not been
trained on. This is done by partitioning a dataset and using a subset to train the algorithm
and the remaining data for testing.
(a) Correlation (b) Cross-Validation (c) Generalization (d) Normalization

10 The most important way to characterize a random variable is to associate probabilities [0.5 M]
with the values it can take. If the random variable is discrete, i.e., it takes on a finite
number of values, then this assignment of probabilities is called a ___________. It must
be, by definition, non-negative and must sum to one.

(a) Probability Mass Function (b) Probability Density Function
(c) Cost Function (d) None of the above

11 [3 M]
Consider the following data set, describing CARS:

Display the count of cars, the average MPG, the minimum weight and the maximum
displacement of cars with 8 and 4 cylinders in a tabular format.

Calculate the coefficient of correlation between the attributes “Horse Power” and
“Cylinders” in the above dataset.

Ans First row of the table: 1 M
Second row of the table: 1 M
Correlation Coefficient: 1 M

12 What are outliers? Mention any two strategies to deal with outliers in datasets. [3 M]
Ans An outlier is a data point that deviates markedly from the rest of the data; noise is
a random error or variance in a measured variable.
Strategies to deal with outliers include:
Rule of thumb
• A value more than 1.5 * IQR above Q3 or below Q1 is an outlier
• A value more than 2 standard deviations away from the mean is an outlier

Binning
• Smooths a sorted data value by consulting its neighborhood.
• The sorted values are distributed into a number of buckets or bins.
• Also called local smoothing.

Regression - Data can be smoothed by fitting the data to a function.

Clustering - Outliers can be detected by clustering.

1 mark for outlier definition, 2 marks for any 2 strategies
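The 1.5 * IQR rule of thumb above can be sketched in Python's standard library (a minimal illustration; the sample values are made up, with one value far from the rest):

```python
from statistics import quantiles

# Made-up sample containing one obvious outlier (102).
data = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, _, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Rule of thumb: anything more than 1.5 * IQR beyond a quartile is an outlier.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(outliers)  # [102]
```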

13 In real-world data, tuples with missing values for some attributes are a [2 M]
common occurrence. Describe various methods for handling this problem.
Ans
First, determine the pattern of your missing data.

There are three types of missing data:


• Missing Completely at Random: There is no pattern in the
missing data on any variables. This is the best you can hope for.
• Missing at Random: There is a pattern in the missing data, but not on
your primary dependent variables, such as likelihood to recommend.
• Missing Not at Random: There is a pattern in the missing data that
affects your primary dependent variables. For example, lower-income
participants are less likely to respond, and thus affect your conclusions
about income and likelihood to recommend. Missing not at random is your
worst-case scenario. Proceed with caution.
(a) Replace a missing value with the most commonly occurring value for that
attribute.
(b) Replace missing values with the most probable value based on statistics.
(c) Replace missing values with the mean.
(d) Replace missing values with the median.
(e) Replace missing values with an interpolated estimate.
(f) Replace missing values with a constant.

(g) Replace missing values using imputation. Imputation is a way of using
features to model each other. That way, when one is missing, the others can be
used to fill in the blank in a reasonable way.
(h) Replace missing values with a dummy value and create an indicator variable
for "missing." When a missing value really means that the feature is not
applicable, then that fact can be highlighted. Filling in a dummy value that is
clearly different from actual values, such as a negative rank, is one way to do this.
Another is to create a new true/false feature tracking whether the original feature
is missing.
(i) Replace missing values with 0. A missing numerical value can mean zero.

First, determine the pattern of your missing data. [0.5 M]

Any 3 strategies: 0.5 * 3 = 1.5 Marks
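Several of the replacement strategies listed above (mean, median, most common value) can be sketched with Python's standard library; the feature column here is invented for illustration, with missing entries represented as None:

```python
from statistics import mean, median, mode

# Invented feature column with missing entries represented as None.
ages = [25, 30, None, 22, 30, None, 41]

# Only the observed (non-missing) values inform the replacement.
observed = [a for a in ages if a is not None]

# Replace missing values with the mean of the observed values.
mean_filled = [a if a is not None else mean(observed) for a in ages]

# Replace missing values with the median of the observed values.
median_filled = [a if a is not None else median(observed) for a in ages]

# Replace missing values with the most commonly occurring value.
mode_filled = [a if a is not None else mode(observed) for a in ages]
```

The same pattern extends to the other strategies, e.g. filling with a constant or a dummy value plus an indicator column.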

14 What is Regularization? What is the main application of a Regularizer on cost [2 M]
functions of a Machine Learning model?
Ans Regularization improves predictive models by preventing overfitting.

The performance of a machine learning model can be evaluated through a cost
function.

Generally, a cost function is represented by the sum of squares of the differences
between the actual and predicted values.

This is also called the ‘Sum of squared residuals’ or ‘Sum of squared errors’.

A predictive model when being trained attempts to fit the data in a manner that
minimizes this cost function.

A model begins to overfit when it passes through all the data points.

In such instances, although the value of the cost function is zero, the model,
having fit the noise in the dataset, does not represent the underlying function.

Under such circumstances, the error calculated on the training data is small;
however, the error on the test data remains large.

Essentially, a model overfits the data by employing highly complex curves whose
terms have large degrees of freedom, with a corresponding coefficient providing
the weight of each term.

For higher degrees of freedom the test set error is large when compared to the
train set error.

Regularization is a concept by which machine learning algorithms can be
prevented from overfitting a dataset.

Regularization achieves this by introducing a penalizing term in the cost
function which assigns a higher penalty to complex curves.

Lambda is a hyperparameter determining the severity of the penalty.

As the value of the penalty increases, the coefficients shrink in value in order to
minimize the cost function.

Since these coefficients also act as weights for the polynomial terms, shrinking
these will reduce the weight assigned to them and ultimately reduce its impact.

Therefore, for the case above, the coefficients assigned to higher-degree
polynomial terms shrink to an extent where the value of such terms no longer
impacts the model as severely as before, and so we obtain a simpler curve.

Regularization is an effective technique to prevent a model from overfitting.

It allows us to reduce the variance in a model without a substantial increase in
its bias.

This method allows us to develop a more generalized model even if only a few
data points are available in our dataset.

Ridge regression helps to shrink the coefficients of a model where the parameters
or features that determine the model are already known.

In contrast, lasso regression can be effective in excluding insignificant variables
from the model’s equation. In other words, lasso regression can help with feature
selection.

Overall, it’s an important technique that can substantially improve the
performance of our model.
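The shrinkage effect described above can be sketched with the closed-form ridge solution w = (XᵀX + λI)⁻¹Xᵀy. This is a minimal NumPy illustration on synthetic data; the λ values are arbitrary choices for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y depends linearly on three features plus noise.
X = rng.normal(size=(50, 3))
true_w = np.array([3.0, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=50)

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: w = (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

w_small_penalty = ridge_fit(X, y, lam=0.01)
w_large_penalty = ridge_fit(X, y, lam=100.0)

# As lambda (the penalty) grows, the coefficients shrink toward zero.
print(np.linalg.norm(w_large_penalty) < np.linalg.norm(w_small_penalty))  # True
```

Lasso has no such closed form (the L1 penalty is not differentiable at zero), which is why it is usually solved iteratively, e.g. by coordinate descent.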
Regularization Definition – 0.5 M

Overfitting Definition – 0.5 M

Any one cost function – 0.5 M (Example SSE)

Impact of Regularization on the above cost function – 0.5 M

