
SPRING MID SEMESTER ML EVALUATION SCHEME -2025

School of Computer Engineering


Kalinga Institute of Industrial Technology, Deemed to be University
Subject Name: Machine Learning
[Subject Code: CS31002]

1. Answer all the questions.


a) Given three 2D vectors a = (2, 5), b = (-3, 7), and c = (4, -2), determine
which two vectors are closest to each other based on cosine similarity.

Ans: To determine which two vectors are the closest based on cosine similarity,
we compute the cosine similarity between each pair of vectors. For two vectors
u and v,

    cosine similarity = (u · v) / (‖u‖ ‖v‖)

where u · v is the dot product of the vectors, and ‖u‖ and ‖v‖ are their
magnitudes (norms). We compute cos(a, b), cos(b, c), and cos(a, c); the highest
cosine similarity indicates the closest pair in terms of direction.
Dot products:
    a · b = (2)(-3) + (5)(7) = 29
    b · c = (-3)(4) + (7)(-2) = -26
    a · c = (2)(4) + (5)(-2) = -2

Norms:
    ‖a‖ = √(2² + 5²) = √29 ≈ 5.385
    ‖b‖ = √((-3)² + 7²) = √58 ≈ 7.616
    ‖c‖ = √(4² + (-2)²) = √20 ≈ 4.472

Cosine similarities:
    cos(a, b) = 29 / (5.385 × 7.616) ≈ 0.707
    cos(b, c) = -26 / (7.616 × 4.472) ≈ -0.763
    cos(a, c) = -2 / (5.385 × 4.472) ≈ -0.083

Since cos(a, b) ≈ 0.707 is the largest (compared to -0.083 and -0.763), vectors
a and b are the closest to each other.
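The pairwise calculation above can be checked with a short Python sketch (the helper name `cosine` is chosen here for illustration):

```python
import math

def cosine(u, v):
    # cosine similarity = (u . v) / (||u|| ||v||)
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    return dot / (norm_u * norm_v)

a, b, c = (2, 5), (-3, 7), (4, -2)
print(round(cosine(a, b), 3))  # 0.707
print(round(cosine(b, c), 3))  # -0.763
print(round(cosine(a, c), 3))  # -0.083
```

The largest value, cos(a, b), confirms that a and b point in the most similar directions.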

b) The training accuracy of a classification model you have designed is 100%.
Will you be proud of your design? Justify your answer.

Ans: A training accuracy of 100% usually suggests overfitting; it is not
necessarily a sign of a great model. One should estimate the generalization
error, for example with a cross-validation scheme, to ensure that the model
generalizes well to unseen data.

c) What is the purpose of feature scaling in machine learning?


Ans: The main purpose of feature scaling in machine learning is to bring all
feature values into the same numerical range. Widely used feature scaling
methods are min-max normalization and z-score normalization. The advantages of
feature scaling are as follows:
1. Prevents large-valued features from dominating the learning process
2. Speeds up convergence for gradient-based methods
3. Handles outliers more effectively (depending on the scaling method)
4. Prevents overflow or underflow errors during computation
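The two scaling methods named above can be sketched as follows (the example column `x` is made up for illustration):

```python
import numpy as np

# Hypothetical feature column with widely spread values.
x = np.array([2.0, 5.0, 10.0, 20.0, 100.0])

# Min-max normalization: rescales values into the range [0, 1].
x_minmax = (x - x.min()) / (x.max() - x.min())

# Z-score normalization: shifts to zero mean and unit standard deviation.
x_zscore = (x - x.mean()) / x.std()

print(x_minmax.min(), x_minmax.max())   # 0.0 1.0
```

Min-max scaling preserves the shape of the distribution but is sensitive to extreme values; z-score scaling is usually preferred for gradient-based learners.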

d) Define the log-odds function. What is the range of the log-odds in logistic
regression?
Ans: Logistic regression predicts

    p = σ(wᵀx + b),   where σ(z) = 1 / (1 + e^(−z)).

The log-odds (logit) is

    log(p / (1 − p)) = wᵀx + b.

The main purpose of using the log-odds in logistic regression is to map the
probability p ∈ (0, 1) onto the full real line, so that wᵀx + b can range over
(−∞, ∞).
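The inverse relationship between the sigmoid and the log-odds can be illustrated with a small sketch (the helper names `sigmoid` and `log_odds` are chosen here for illustration):

```python
import math

def sigmoid(z):
    # Maps any real z into the probability range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    # The logit: maps a probability p in (0, 1) back onto the real line.
    return math.log(p / (1.0 - p))

# log_odds is the inverse of sigmoid: log_odds(sigmoid(z)) recovers z.
for z in (-5.0, 0.0, 3.2):
    print(z, round(log_odds(sigmoid(z)), 9))
```

This round-trip shows why the log-odds can take any value in (−∞, ∞) even though p itself is bounded.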
e) A dataset has 10 instances, where 6 belong to the Spam class and 4 belong to
the Not Spam class. Compute the entropy of the dataset.
Ans: The entropy of the dataset is computed as follows:

    H(D) = − Σᵢ₌₁ᶜ pᵢ log₂(pᵢ)

where H(D) is the entropy of the dataset, c is the number of classes, and pᵢ is
the probability of class i.

Step 1: Compute class probabilities
    Spam class: 6 instances; Not Spam class: 4 instances; total: 10 instances.
    p(Spam) = 6/10 = 0.6 and p(Not Spam) = 4/10 = 0.4

Step 2: Compute entropy
    H(D) = −[0.6 log₂(0.6) + 0.4 log₂(0.4)] ≈ 0.97

So, the entropy of the given dataset is 0.97.
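The two steps above can be sketched in Python (the helper name `entropy` is chosen here for illustration):

```python
import math

def entropy(counts):
    # Shannon entropy from raw class counts: H = -sum(p_i * log2(p_i)).
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# 6 Spam instances and 4 Not Spam instances.
print(round(entropy([6, 4]), 2))  # 0.97
```

As a sanity check, a perfectly balanced split such as `entropy([5, 5])` gives the maximum entropy of 1.0 for two classes.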

Note: As desired by the faculty, step marks should be awarded if at least some
of the words in a student's answer match the evaluation scheme; otherwise
award 0 marks.

2. You are given the following data from a simple linear regression model, where
yᵢ represents the true values and ŷᵢ represents the predicted values:

    yᵢ :  -114   -36.5    86    40
    ŷᵢ :  -123   -36     122    50

Evaluate the performance of the linear regression model: calculate the
residuals, MAE, MSE, RMSE, R-squared (R²), and adjusted R-squared values. Based
on the performance metric values, comment on whether the model is a good fit
for the dataset.

Ans: The residuals are eᵢ = yᵢ − ŷᵢ = 9, −0.5, −36, −10.

    MAE  = (|9| + |−0.5| + |−36| + |−10|) / 4 = 55.5 / 4 = 13.875
    MSE  = (9² + 0.5² + 36² + 10²) / 4 = 1477.25 / 4 = 369.3125
    RMSE = √369.3125 ≈ 19.217

    Mean of yᵢ: ȳ = −6.125
    SS_res = 1477.25,  SS_tot = Σ(yᵢ − ȳ)² ≈ 23174.19
    R² = 1 − SS_res / SS_tot ≈ 0.9362
    Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1) ≈ 1 − (0.0638)(3/2) ≈ 0.9043
The R-squared value of 0.9362 indicates that approximately 93.62% of the variance in the
dependent variable is explained by the model, which is quite high. The adjusted R-squared
value of 0.9043 also suggests a good fit, accounting for the number of predictors.

Comment: The linear regression model is a good fit for the dataset, as indicated by the
high R-squared and adjusted R-squared values and relatively low error metrics (MAE, MSE,
RMSE).
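The metrics can be reproduced with a short Python sketch over the four data points (variable names are chosen here for illustration):

```python
import math

y_true = [-114, -36.5, 86, 40]   # true values y_i
y_pred = [-123, -36, 122, 50]    # predicted values y_hat_i
n = len(y_true)

residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
mae = sum(abs(r) for r in residuals) / n
mse = sum(r * r for r in residuals) / n
rmse = math.sqrt(mse)

mean_y = sum(y_true) / n
ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
ss_res = sum(r * r for r in residuals)
r2 = 1 - ss_res / ss_tot                     # ~0.936

p = 1  # one predictor in a simple linear regression
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)  # ~0.904

print(mae, mse, round(rmse, 3), round(r2, 4), round(adj_r2, 4))
```

Note that with only n = 4 observations these metrics are highly sensitive to individual points, which is worth mentioning alongside the "good fit" conclusion.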
Note: Award 1 mark for the comment; for the rest, step marks should be awarded
as desired by the faculty.

3. Consider the following dataset:

Using the Naive Bayes classifier, determine whether a new customer with the following
attributes is likely to default on a loan: Income Level = Low, Credit Score = Average,
Loan Amount = Medium.
4. A retail company wants to classify new customers based on their annual
income and spending behavior. The goal is to identify whether a customer is a
Low Spender or a High Spender in order to tailor marketing strategies
accordingly. The dataset below represents existing customers:

Given a new customer with Annual Income = $17,000 and Spending Score = 50, Classify
this new customer using KNN with k = 3. Use the Euclidean distance for calculations.

Ans:
Calculate Euclidean distances:
Distance to (15, 39): sqrt((17-15)^2 + (50-39)^2) = sqrt(4 + 121) = sqrt(125) ≈ 11.18
Distance to (16, 81): sqrt((17-16)^2 + (50-81)^2) = sqrt(1 + 961) = sqrt(962) ≈ 31.00
Distance to (17, 6): sqrt((17-17)^2 + (50-6)^2) = sqrt(0 + 1936) = sqrt(1936) = 44.00
Distance to (18, 77): sqrt((17-18)^2 + (50-77)^2) = sqrt(1 + 729) = sqrt(730) ≈ 27.02
Distance to (19, 40): sqrt((17-19)^2 + (50-40)^2) = sqrt(4 + 100) = sqrt(104) ≈ 10.20

Nearest Neighbors (k=3):


1. (19, 40) → Low Spender (10.20)
2. (15, 39) → Low Spender (11.18)
3. (18, 77) → High Spender (27.02)
Majority Category: 2 Low Spenders vs 1 High Spender
Hence, the new customer is classified as a Low Spender.
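The distance calculation and majority vote above can be sketched in Python. The labels for the three nearest neighbours are taken from the answer key; the labels assumed for the two remaining points, (16, 81) and (17, 6), are illustrative only, since the original dataset table is not reproduced here (they do not affect the k = 3 result):

```python
import math
from collections import Counter

# Training points from the worked distances above.
train = [
    ((15, 39), "Low Spender"),
    ((16, 81), "High Spender"),  # label assumed for illustration
    ((17, 6),  "Low Spender"),   # label assumed for illustration
    ((18, 77), "High Spender"),
    ((19, 40), "Low Spender"),
]

def knn_classify(query, train, k=3):
    # Sort by Euclidean distance and take a majority vote over the k nearest.
    dists = sorted((math.dist(query, x), label) for x, label in train)
    top_labels = [label for _, label in dists[:k]]
    return Counter(top_labels).most_common(1)[0][0]

print(knn_classify((17, 50), train))  # Low Spender
```

With k = 3 the vote is 2 Low Spenders to 1 High Spender, matching the worked answer.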

Note: As desired by the faculty, step marks should be awarded.

5. Find the distance from the point x₀ = [1 1 1 1 1]ᵀ to the hyperplane
x₁ − x₂ + x₃ − x₄ + x₅ + 1 = 0. [1 Mark]

Ans: For a hyperplane wᵀx + b = 0 with w = [1 −1 1 −1 1]ᵀ and b = 1, the
distance from x₀ is

    d = |wᵀx₀ + b| / ‖w‖ = |1 − 1 + 1 − 1 + 1 + 1| / √5 = 2/√5 ≈ 0.894

Note: If a student has calculated only up to the norm of the weight vector,
he/she should be awarded 0.5 mark.
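The point-to-hyperplane formula d = |wᵀx + b| / ‖w‖ can be sketched as follows (the helper name `hyperplane_distance` is chosen here for illustration):

```python
import math

def hyperplane_distance(w, b, x):
    # Distance from point x to the hyperplane w.x + b = 0.
    numerator = abs(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return numerator / math.sqrt(sum(wi * wi for wi in w))

w = [1, -1, 1, -1, 1]     # coefficients of x1..x5
b = 1                     # constant term of the hyperplane
x0 = [1, 1, 1, 1, 1]      # the query point

print(round(hyperplane_distance(w, b, x0), 3))  # 0.894
```

The numerator |1 − 1 + 1 − 1 + 1 + 1| = 2 and the norm ‖w‖ = √5 give d = 2/√5 ≈ 0.894, matching the worked answer.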

Explain the primal and dual formulations of the Support Vector Machine (SVM)
optimization problem.

[4 Marks]

Ans (outline): The primal formulation of the hard-margin SVM is

    min over w, b of  (1/2)‖w‖²
    subject to  yᵢ(wᵀxᵢ + b) ≥ 1,  i = 1, …, n.

Introducing Lagrange multipliers αᵢ ≥ 0 for the constraints and eliminating w
and b via the stationarity conditions w = Σᵢ αᵢyᵢxᵢ and Σᵢ αᵢyᵢ = 0 gives the
dual formulation:

    max over α of  Σᵢ αᵢ − (1/2) Σᵢ Σⱼ αᵢαⱼ yᵢyⱼ (xᵢᵀxⱼ)
    subject to  αᵢ ≥ 0 and Σᵢ αᵢyᵢ = 0.

Note: Only a short derivation is given here. The student must provide the
complete derivation to be awarded the full 4 marks; based on the derivation,
award marks out of 4.
