
SPRING MID SEMESTER ML EVALUATION SCHEME -2025

School of Computer Engineering


Kalinga Institute of Industrial Technology, Deemed to be University
Subject Name: Machine Learning
[Subject Code: CS31002]

1. Answer all the questions.


a) Given three 2D vectors a = (2, 5), b = (-3, 7), and c = (4, -2), determine
which two vectors are closest to each other based on cosine similarity.

Ans: To determine which two vectors are the closest based on cosine similarity,
we compute the cosine similarity between each pair of vectors. For two vectors
u and v,

    cosine similarity = (u · v) / (‖u‖ ‖v‖)

where u · v is the dot product of the vectors, and ‖u‖ and ‖v‖ are their
magnitudes (norms). We compute cos(a, b), cos(b, c), and cos(a, c); the highest
cosine similarity indicates the closest pair in terms of direction.
Dot products:
    a · b = (2)(-3) + (5)(7) = 29
    b · c = (-3)(4) + (7)(-2) = -26
    a · c = (2)(4) + (5)(-2) = -2

Norms:
    ‖a‖ = √(2² + 5²) = √29 ≈ 5.385
    ‖b‖ = √((-3)² + 7²) = √58 ≈ 7.616
    ‖c‖ = √(4² + (-2)²) = √20 ≈ 4.472

Cosine similarities:
    cos(a, b) = 29 / (5.385 × 7.616) ≈ 0.707
    cos(b, c) = -26 / (7.616 × 4.472) ≈ -0.763
    cos(a, c) = -2 / (5.385 × 4.472) ≈ -0.083

Since cos(a, b) ≈ 0.707 is the largest (compared to -0.083 and -0.763), vectors
a and b are the closest to each other.
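The pairwise calculation above can be checked with a short Python sketch (the helper name `cosine` is chosen here for illustration):

```python
import math

def cosine(u, v):
    # cosine similarity = (u . v) / (||u|| ||v||)
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    return dot / (norm_u * norm_v)

a, b, c = (2, 5), (-3, 7), (4, -2)
print(round(cosine(a, b), 3))  # 0.707
print(round(cosine(b, c), 3))  # -0.763
print(round(cosine(a, c), 3))  # -0.083
```

The largest value, cos(a, b), confirms that a and b point in the most similar directions.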

b) The training accuracy of a classification model you have designed is 100%.
Will you be proud of your design? Justify your answer.

Ans: A training accuracy of 100% usually suggests overfitting; it is not
necessarily a sign of a great model. One should estimate the generalization
error, for example with a cross-validation scheme, to ensure that the model
generalizes well to unseen data.

c) What is the purpose of feature scaling in machine learning?


Ans: The main purpose of feature scaling in machine learning is to bring all
feature values into the same numerical range. Widely used feature scaling
methods are min-max normalization and z-score normalization. The advantages of
feature scaling are as follows:
1. Prevents large-valued features from dominating the learning process
2. Speeds up convergence for gradient-based methods
3. Handles outliers more effectively (depending on the scaling method)
4. Prevents overflow or underflow errors during computation
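The two scaling methods named above can be sketched as follows (the example column `x` is made up for illustration):

```python
import numpy as np

# Hypothetical feature column with widely spread values.
x = np.array([2.0, 5.0, 10.0, 20.0, 100.0])

# Min-max normalization: rescales values into the range [0, 1].
x_minmax = (x - x.min()) / (x.max() - x.min())

# Z-score normalization: shifts to zero mean and unit standard deviation.
x_zscore = (x - x.mean()) / x.std()

print(x_minmax.min(), x_minmax.max())   # 0.0 1.0
```

Min-max scaling preserves the shape of the distribution but is sensitive to extreme values; z-score scaling is usually preferred for gradient-based learners.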

d) Define the log-odds function. What is the range of the log-odds in logistic
regression?
Ans: Logistic regression predicts

    p = σ(wᵀx + b),   where σ(z) = 1 / (1 + e^(−z)).

The log-odds (logit) is

    log(p / (1 − p)) = wᵀx + b.

The main purpose of using the log-odds in logistic regression is to map the
probability p ∈ (0, 1) onto the full real line, so that wᵀx + b can range over
(−∞, ∞).
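The inverse relationship between the sigmoid and the log-odds can be illustrated with a small sketch (the helper names `sigmoid` and `log_odds` are chosen here for illustration):

```python
import math

def sigmoid(z):
    # Maps any real z into the probability range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    # The logit: maps a probability p in (0, 1) back onto the real line.
    return math.log(p / (1.0 - p))

# log_odds is the inverse of sigmoid: log_odds(sigmoid(z)) recovers z.
for z in (-5.0, 0.0, 3.2):
    print(z, round(log_odds(sigmoid(z)), 9))
```

This round-trip shows why the log-odds can take any value in (−∞, ∞) even though p itself is bounded.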
e) A dataset has 10 instances, where 6 belong to the Spam class and 4 belong to
the Not Spam class. Compute the entropy of the dataset.
Ans: The entropy of the dataset is computed as follows:

    H(D) = − Σᵢ₌₁ᶜ pᵢ log₂(pᵢ)

where H(D) is the entropy of the dataset, c is the number of classes, and pᵢ is
the probability of class i.

Step 1: Compute class probabilities
    Spam class: 6 instances; Not Spam class: 4 instances; total: 10 instances.
    p(Spam) = 6/10 = 0.6 and p(Not Spam) = 4/10 = 0.4

Step 2: Compute entropy
    H(D) = −[0.6 log₂(0.6) + 0.4 log₂(0.4)] ≈ 0.97

So, the entropy of the given dataset is 0.97.
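The two steps above can be sketched in Python (the helper name `entropy` is chosen here for illustration):

```python
import math

def entropy(counts):
    # Shannon entropy from raw class counts: H = -sum(p_i * log2(p_i)).
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# 6 Spam instances and 4 Not Spam instances.
print(round(entropy([6, 4]), 2))  # 0.97
```

As a sanity check, a perfectly balanced split such as `entropy([5, 5])` gives the maximum entropy of 1.0 for two classes.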

Note: As desired by the faculty, step marks should be awarded if at least some
of the words in a student's answer match the evaluation scheme; otherwise
award 0 marks.

2. You are given the following data from a simple linear regression model, where
yᵢ represents the true values and ŷᵢ represents the predicted values:

    yᵢ :  -114   -36.5    86    40
    ŷᵢ :  -123   -36     122    50

Evaluate the performance of the linear regression model: calculate the
residuals, MAE, MSE, RMSE, R-squared (R²), and adjusted R-squared values. Based
on the performance metric values, comment on whether the model is a good fit
for the dataset.

Ans: The residuals are eᵢ = yᵢ − ŷᵢ = 9, −0.5, −36, −10.

    MAE  = (|9| + |−0.5| + |−36| + |−10|) / 4 = 55.5 / 4 = 13.875
    MSE  = (9² + 0.5² + 36² + 10²) / 4 = 1477.25 / 4 = 369.3125
    RMSE = √369.3125 ≈ 19.217

    Mean of yᵢ: ȳ = −6.125
    SS_res = 1477.25,  SS_tot = Σ(yᵢ − ȳ)² ≈ 23174.19
    R² = 1 − SS_res / SS_tot ≈ 0.9362
    Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1) ≈ 1 − (0.0638)(3/2) ≈ 0.9043
The R-squared value of 0.9362 indicates that approximately 93.62% of the variance in the
dependent variable is explained by the model, which is quite high. The adjusted R-squared
value of 0.9043 also suggests a good fit, accounting for the number of predictors.

Comment: The linear regression model is a good fit for the dataset, as indicated by the
high R-squared and adjusted R-squared values and relatively low error metrics (MAE, MSE,
RMSE).
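The metrics can be reproduced with a short Python sketch over the four data points (variable names are chosen here for illustration):

```python
import math

y_true = [-114, -36.5, 86, 40]   # true values y_i
y_pred = [-123, -36, 122, 50]    # predicted values y_hat_i
n = len(y_true)

residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
mae = sum(abs(r) for r in residuals) / n
mse = sum(r * r for r in residuals) / n
rmse = math.sqrt(mse)

mean_y = sum(y_true) / n
ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
ss_res = sum(r * r for r in residuals)
r2 = 1 - ss_res / ss_tot                     # ~0.936

p = 1  # one predictor in a simple linear regression
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)  # ~0.904

print(mae, mse, round(rmse, 3), round(r2, 4), round(adj_r2, 4))
```

Note that with only n = 4 observations these metrics are highly sensitive to individual points, which is worth mentioning alongside the "good fit" conclusion.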
Note: Award 1 mark for the comment; for the rest, step marks should be awarded
as desired by the faculty.

3. Consider the following dataset:

Using the Naive Bayes classifier, determine whether a new customer with the following
attributes is likely to default on a loan: Income Level = Low, Credit Score = Average,
Loan Amount = Medium.
4. A retail company wants to classify new customers based on their annual
income and spending behavior. The goal is to identify whether a customer is a
Low Spender or a High Spender in order to tailor marketing strategies
accordingly. The dataset below represents existing customers:

Given a new customer with Annual Income = $17,000 and Spending Score = 50, Classify
this new customer using KNN with k = 3. Use the Euclidean distance for calculations.

Ans:
Calculate Euclidean distances:
Distance to (15, 39): sqrt((17-15)^2 + (50-39)^2) = sqrt(4 + 121) = sqrt(125) ≈ 11.18
Distance to (16, 81): sqrt((17-16)^2 + (50-81)^2) = sqrt(1 + 961) = sqrt(962) ≈ 31.00
Distance to (17, 6): sqrt((17-17)^2 + (50-6)^2) = sqrt(0 + 1936) = sqrt(1936) = 44.00
Distance to (18, 77): sqrt((17-18)^2 + (50-77)^2) = sqrt(1 + 729) = sqrt(730) ≈ 27.02
Distance to (19, 40): sqrt((17-19)^2 + (50-40)^2) = sqrt(4 + 100) = sqrt(104) ≈ 10.20

Nearest Neighbors (k=3):


1. (19, 40) → Low Spender (10.20)
2. (15, 39) → Low Spender (11.18)
3. (18, 77) → High Spender (27.02)
Majority Category: 2 Low Spenders vs 1 High Spender
Hence, the new customer is classified as a Low Spender.
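The distance calculation and majority vote above can be sketched in Python. The labels for the three nearest neighbours are taken from the answer key; the labels assumed for the two remaining points, (16, 81) and (17, 6), are illustrative only, since the original dataset table is not reproduced here (they do not affect the k = 3 result):

```python
import math
from collections import Counter

# Training points from the worked distances above.
train = [
    ((15, 39), "Low Spender"),
    ((16, 81), "High Spender"),  # label assumed for illustration
    ((17, 6),  "Low Spender"),   # label assumed for illustration
    ((18, 77), "High Spender"),
    ((19, 40), "Low Spender"),
]

def knn_classify(query, train, k=3):
    # Sort by Euclidean distance and take a majority vote over the k nearest.
    dists = sorted((math.dist(query, x), label) for x, label in train)
    top_labels = [label for _, label in dists[:k]]
    return Counter(top_labels).most_common(1)[0][0]

print(knn_classify((17, 50), train))  # Low Spender
```

With k = 3 the vote is 2 Low Spenders to 1 High Spender, matching the worked answer.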

Note: As desired by the faculty, step marks should be awarded.

5. Find the distance from the point x₀ = [1 1 1 1 1]ᵀ to the hyperplane
x₁ − x₂ + x₃ − x₄ + x₅ + 1 = 0. [1 Mark]

Ans: For a hyperplane wᵀx + b = 0 with w = [1 −1 1 −1 1]ᵀ and b = 1, the
distance from x₀ is

    d = |wᵀx₀ + b| / ‖w‖ = |1 − 1 + 1 − 1 + 1 + 1| / √5 = 2/√5 ≈ 0.894

Note: If a student has calculated only up to the norm of the weight vector,
he/she should be awarded 0.5 mark.
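The point-to-hyperplane formula d = |wᵀx + b| / ‖w‖ can be sketched as follows (the helper name `hyperplane_distance` is chosen here for illustration):

```python
import math

def hyperplane_distance(w, b, x):
    # Distance from point x to the hyperplane w.x + b = 0.
    numerator = abs(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return numerator / math.sqrt(sum(wi * wi for wi in w))

w = [1, -1, 1, -1, 1]     # coefficients of x1..x5
b = 1                     # constant term of the hyperplane
x0 = [1, 1, 1, 1, 1]      # the query point

print(round(hyperplane_distance(w, b, x0), 3))  # 0.894
```

The numerator |1 − 1 + 1 − 1 + 1 + 1| = 2 and the norm ‖w‖ = √5 give d = 2/√5 ≈ 0.894, matching the worked answer.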

Explain the primal and dual formulations of the Support Vector Machine (SVM)
optimization problem.

[4 Marks]

Ans (outline): The primal formulation of the hard-margin SVM is

    min over w, b of  (1/2)‖w‖²
    subject to  yᵢ(wᵀxᵢ + b) ≥ 1,  i = 1, …, n.

Introducing Lagrange multipliers αᵢ ≥ 0 for the constraints and eliminating w
and b via the stationarity conditions w = Σᵢ αᵢyᵢxᵢ and Σᵢ αᵢyᵢ = 0 gives the
dual formulation:

    max over α of  Σᵢ αᵢ − (1/2) Σᵢ Σⱼ αᵢαⱼ yᵢyⱼ (xᵢᵀxⱼ)
    subject to  αᵢ ≥ 0 and Σᵢ αᵢyᵢ = 0.

Note: Only a short derivation is given here. The student must provide the
complete derivation to be awarded the full 4 marks; based on the derivation,
award marks out of 4.
