
Homework 1

Probability, Linear Algebra, K-NN and Decision Trees

Instructor: Dr. Md Rushdie Ibne Islam


APL 405: Machine Learning for Mechanics
Department of Applied Mechanics, Indian Institute of Technology Delhi, Winter 2025
Jan 31, 2025
Submission due on Feb 09, 2025

Instructions: (I) You must submit your own handwritten homework; except for the coding
portion, no computer-typed submission will be accepted. Copying homework from your friends
is forbidden: if copying is found, the submission will be considered invalid for all students
involved and graded zero. (II) Write out all the steps, including your reasoning and any
formulae you referred to. When submitting code, make sure the code runs, and submit a PDF
report of the plots and results along with your observations. Submit all the scanned copies
in a ZIP file named “HW1 Name EntryNumber”.

Question 1. [Total Marks: 40]


(A) If Q ∈ R^{n×n} is a nonsingular matrix and a perturbation of Q is defined by a
nonsingular matrix Λ ∈ R^{k×k} and a matrix V ∈ R^{n×k}, prove that the inverse of the
rank-k update of Q can be computed by a rank-k correction to the inverse of the original
matrix Q, i.e., [Marks: 3]

(Q + VΛV^⊤)^{-1} = Q^{-1} − Q^{-1} V (Λ^{-1} + V^⊤ Q^{-1} V)^{-1} V^⊤ Q^{-1}
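Before attempting the proof, it can help to check the identity numerically. A minimal NumPy sketch is given below; the random test matrices and the diagonal shift used to keep them well conditioned are illustrative assumptions, not part of the assignment.

import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# Random nonsingular Q and Lambda (adding a multiple of the identity keeps
# them well conditioned), and a random n x k matrix V.
Q = rng.standard_normal((n, n)) + n * np.eye(n)
Lam = rng.standard_normal((k, k)) + k * np.eye(k)
V = rng.standard_normal((n, k))

# Left-hand side: invert the rank-k update directly.
lhs = np.linalg.inv(Q + V @ Lam @ V.T)

# Right-hand side: rank-k correction to the inverse of Q.
Qinv = np.linalg.inv(Q)
rhs = Qinv - Qinv @ V @ np.linalg.inv(np.linalg.inv(Lam) + V.T @ Qinv @ V) @ V.T @ Qinv

print(np.allclose(lhs, rhs))  # expected: True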

(B) Given a block matrix M of the form: [Marks: 5+3+3 = 11]

    M = [ A  B ]
        [ C  D ]

where A and D are square matrices, and assuming that D and A − BD^{-1}C are
invertible:
(a) Derive the inverse of M and show that:

    M^{-1} = [ (A − BD^{-1}C)^{-1}              −A^{-1}B(D − CA^{-1}B)^{-1} ]
             [ −D^{-1}C(A − BD^{-1}C)^{-1}       (D − CA^{-1}B)^{-1}        ]

(b) Show that the determinant of M is given by: det(M) = det(D)·det(A − BD^{-1}C),
and similarly: det(M) = det(A)·det(D − CA^{-1}B).


Question 2. Conditional Expectation [Total Marks: 20]

(A) Consider two random variables x and y with joint distribution p(x, y). You have to
prove the following two results:
(i) Expectation:
E[x] = E_y[ E_x[x | y] ]

(ii) Variance:
var[x] = E_y[ var_x[x | y] ] + var_y[ E_x[x | y] ].

Here E_x[x|y] denotes the expectation of x under the conditional distribution p(x|y),
with a similar notation for the conditional variance.
(B) Use Monte Carlo sampling for the following two problems:
(i) Write a short program that uses random numbers to approximate the value of π.
(ii) Besides simulating random processes, random numbers can also be used to evaluate
integrals. Write computer programs to approximate the following integrals and compare
your estimates with the exact answers. (For the exact answers, you may use built-in
functions.)

(i) ∫_0^1 e^{e^x} dx,   (ii) ∫_0^∞ x(1 + x^2)^{-2} dx,   (iii) ∫_0^1 ∫_0^1 e^{(x+y)^2} dx dy.
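A minimal Python sketch of the sampling approach is given below; the sample size N, the random seed, and the change of variables used for the improper integral are illustrative assumptions, and the estimates should be compared against exact values computed separately.

import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000  # number of random samples (illustrative choice)

# (i) Estimate pi: fraction of uniform points in the unit square that fall
# inside the quarter unit circle, multiplied by 4.
xy = rng.uniform(0.0, 1.0, size=(N, 2))
pi_est = 4.0 * np.mean(np.sum(xy**2, axis=1) <= 1.0)

# (ii)(i) Integral of exp(exp(x)) over [0, 1]: the sample mean of the
# integrand at uniform points (the interval has length 1).
x = rng.uniform(0.0, 1.0, N)
I1 = np.mean(np.exp(np.exp(x)))

# (ii)(ii) Integral of x (1 + x^2)^{-2} over [0, inf): substitute x = t/(1 - t)
# to map [0, 1) onto [0, inf); the Jacobian is 1/(1 - t)^2.
t = rng.uniform(0.0, 1.0, N)
x = t / (1.0 - t)
I2 = np.mean(x * (1.0 + x**2) ** -2 / (1.0 - t) ** 2)

# (ii)(iii) Double integral of exp((x + y)^2) over the unit square.
u, v = rng.uniform(0.0, 1.0, N), rng.uniform(0.0, 1.0, N)
I3 = np.mean(np.exp((u + v) ** 2))

print(pi_est, I1, I2, I3)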

Question 3. Gaussian distribution [Total Marks: 20]

(A) Prove that the convolution of two Gaussian distributions is a Gaussian distribution.
Write a computer program to verify your result. [Marks: 5]
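For the numerical check, a minimal NumPy/SciPy sketch is given below; the chosen means and standard deviations are illustrative assumptions. It compares a grid-based convolution of the two densities against the Gaussian with summed means and variances.

import numpy as np
from scipy.stats import norm

# Illustrative parameters for the two Gaussians (assumed values).
mu1, s1 = 1.0, 0.8
mu2, s2 = -0.5, 1.5

# Evaluate both densities on a common symmetric grid, convolve numerically,
# and compare with N(mu1 + mu2, s1^2 + s2^2).
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
conv = np.convolve(norm.pdf(x, mu1, s1), norm.pdf(x, mu2, s2), mode="same") * dx
theory = norm.pdf(x, mu1 + mu2, np.sqrt(s1**2 + s2**2))

print("max abs difference:", np.max(np.abs(conv - theory)))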
(B) Given the following bivariate distribution, [Marks: 5]

p(x | µ, Σ) = N( [x_a; x_b] | [µ_a; µ_b], [σ_a^2, ρσ_aσ_b; ρσ_aσ_b, σ_b^2] ),

show that the conditional distribution p(x_a | x_b) is given as,

p(x_a | x_b) = N( x_a | µ_a + (ρσ_aσ_b / σ_b^2)(x_b − µ_b),  σ_a^2 − (ρσ_aσ_b)^2 / σ_b^2 )
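A minimal sampling sketch can be used to sanity-check this result before deriving it; the parameter values, the conditioning point xb0, and the tolerance window are illustrative assumptions.

import numpy as np

# Illustrative parameters (assumed values, not from the assignment).
mu_a, mu_b = 1.0, -2.0
sig_a, sig_b, rho = 1.5, 0.7, 0.6

# Draw many samples from the joint bivariate Gaussian.
rng = np.random.default_rng(0)
mean = np.array([mu_a, mu_b])
cov = np.array([[sig_a**2, rho * sig_a * sig_b],
                [rho * sig_a * sig_b, sig_b**2]])
xa, xb = rng.multivariate_normal(mean, cov, size=500_000).T

# Condition (approximately) on x_b being close to a chosen value xb0.
xb0 = -1.5
mask = np.abs(xb - xb0) < 0.02
emp_mean, emp_var = xa[mask].mean(), xa[mask].var()

# Theoretical conditional moments from the formula above.
th_mean = mu_a + rho * sig_a * sig_b / sig_b**2 * (xb0 - mu_b)
th_var = sig_a**2 - (rho * sig_a * sig_b) ** 2 / sig_b**2

print(emp_mean, th_mean, emp_var, th_var)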


Question 4. Maximum Likelihood Estimation (MLE) [Total Marks: 20]


(A) Consider a random experiment where we toss a coin N = 5 times. Let X be the
number of heads. Suppose we observe fewer than 3 heads; however, the exact number is
unknown a priori. We take the prior probability of heads to be p(θ) = Beta(θ | 1, 1).
Your task is to derive the posterior p(θ | X < 3) up to a normalization constant.
(B) Suppose you have a uniform distribution centered on 0 with width 2a. The density
function is given as

p(x) = (1 / 2a) · I(x ∈ [−a, a])
(i) Given a dataset x_1, . . . , x_n, what is the MLE of a?
(ii) What probability would the model assign to x̂_{n+1} using the MLE estimate of a?
(iii) Do you see any problem with the above approach? If yes, briefly suggest a better
alternative.

Question 5. Design and Implementation of Decision Trees with Custom Splitting Criteria
[Total Marks: 25]

You are tasked with designing a decision tree algorithm with support for custom splitting
criteria. Consider the following tasks:

Part A: Theoretical Development. 1. Assume a dataset D with N samples and a target
variable with k classes. Derive the splitting formulas for the following:
• Gini Impurity: Measure the impurity of nodes based on the probability of misclassification.
The formula is given by:

Gini, G = 1 − Σ_{i=1}^{k} P_i^2

where P_i is the proportion of samples belonging to class i.


• Entropy and Information Gain: Compute the entropy of a node and the reduction
in entropy after a split:

Entropy, E = − Σ_{i=1}^{k} P_i log_2(P_i)

and

Information Gain, I_g = E(Parent Node) − Σ_{v ∈ values} (|T_v| / |T|) · E(T_v),

where T_v represents the subset of data for a specific value of the splitting feature and
values represents the unique values or splits of the feature being considered.


• Information Gain Ratio: Normalize the information gain to penalize high-cardinality
features. The formula is:

Information Gain Ratio, I_gr = Information Gain / S_i,

where S_i = − Σ_{i=1}^{n} (|T_i| / |T|) log_2(|T_i| / |T|).
2. Discuss scenarios where each criterion is preferred, particularly in the presence of
high-cardinality features or imbalanced class distributions.
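For reference, a minimal NumPy sketch of the three splitting measures above is given below; the function names are illustrative, and this is only a starting point, not the required from-scratch decision tree implementation.

import numpy as np

def class_proportions(y):
    # Proportion P_i of samples in each class at a node.
    _, counts = np.unique(y, return_counts=True)
    return counts / counts.sum()

def gini(y):
    # Gini impurity: 1 minus the sum of squared class proportions.
    p = class_proportions(y)
    return 1.0 - np.sum(p**2)

def entropy(y):
    # Shannon entropy of the class distribution at a node (base 2).
    p = class_proportions(y)
    return -np.sum(p * np.log2(p))

def information_gain(y_parent, subsets):
    # Parent entropy minus the size-weighted entropy of the child subsets.
    n = len(y_parent)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(y_parent) - weighted

def gain_ratio(y_parent, subsets):
    # Information gain normalized by the split information of the partition.
    n = len(y_parent)
    split_info = -sum(len(s) / n * np.log2(len(s) / n) for s in subsets if len(s) > 0)
    return information_gain(y_parent, subsets) / split_info if split_info > 0 else 0.0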

Part B: Algorithm Design and Implementation. Implement a decision tree algorithm
using all three splitting criteria (Gini Impurity, Information Gain, and Information Gain
Ratio). Your implementation must not use any library functions (such as Scikit-learn's
DecisionTreeClassifier); only use libraries like Pandas or NumPy for data manipulation.
Use the Iris dataset (ref: from sklearn.datasets import load_iris) with a 70:10:20
train:validation:test split. Provide an analysis of the performance of the various splitting
criteria.

Question 6. K Nearest Neighbor [Total Marks: 25]


Problem Statement: You are tasked with implementing and optimizing a K-Nearest Neighbors
(K-NN) classifier on the Digits dataset (ref: from sklearn.datasets import load_digits).
The dataset contains:
• Features: 64 numerical features representing pixel intensity values for an 8x8 grayscale
image.
• Classes: 10 classes, representing digits (0 through 9).
• Samples: 1,797 total samples.
Perform a 70:10:20 split for training, validation, and testing sets, ensuring stratification to
maintain class balance.
Question Components
Part A: Data Preprocessing
1. Split the dataset into train, validation, and test sets using the specified 70:10:20 ratio.
Ensure that class distribution remains consistent across the splits.
2. Normalize the features. Explain why normalization can be essential for K-NN. Will it be
helpful for this particular dataset?
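A minimal sketch of the stratified 70:10:20 split and normalization is given below; it assumes scikit-learn is acceptable for loading and splitting the data (the K-NN itself is to be implemented from scratch), and the min-max scaling and random seed are illustrative choices.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# Carve out the 20% test set first, then split the remaining 80% into
# 70% train and 10% validation (0.125 of the remainder), stratifying on y
# so class proportions stay consistent across all three sets.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.125, stratify=y_tmp, random_state=0)

# Min-max normalization to [0, 1] using training-set statistics only
# (pixel intensities in this dataset range from 0 to 16).
lo, hi = X_train.min(axis=0), X_train.max(axis=0)
scale = np.where(hi > lo, hi - lo, 1.0)  # guard against constant features
X_train = (X_train - lo) / scale
X_val = (X_val - lo) / scale
X_test = (X_test - lo) / scale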
Part B: Custom Distance Metric
1. Implement a custom distance metric with feature-specific weights:

d(x, x_i) = sqrt( Σ_{j=1}^{d} w_j (x_j − x_{i,j})^2 ).


Initially set w_j = 1 (uniform weights). Later, adjust w_j based on the variance of feature j,
assigning higher weights to features with lower variance.
2. Compare the performance of this weighted Euclidean distance metric with the standard
Euclidean distance on the validation set.
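A minimal sketch of the weighted Euclidean distance and a variance-based weighting is given below; the inverse-variance rule and the small epsilon are illustrative assumptions for "higher weights to features with lower variance", and the variable names (X_train, X_ref) are placeholders.

import numpy as np

def weighted_euclidean(x, X_ref, w):
    # Weighted Euclidean distance from a query x to every row of X_ref.
    diff = X_ref - x
    return np.sqrt(np.sum(w * diff**2, axis=1))

def variance_weights(X_train, eps=1e-8):
    # Lower-variance features get higher weight; inverse variance is one
    # illustrative choice, with eps guarding against constant features.
    return 1.0 / (X_train.var(axis=0) + eps)

# Uniform weights for the initial experiment:
# w_uniform = np.ones(X_train.shape[1])
# Variance-based weights for the second experiment:
# w_var = variance_weights(X_train)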
Part C: Hyperparameter Tuning
1. Optimize the number of neighbors k using the validation set. Experiment with k ∈
{1, 3, 5, 10, 15}.
2. Compare the performance of different distance metrics: standard Euclidean distance,
Manhattan distance, and the weighted Euclidean distance (from Part B).
Part D: Weighted Voting
1. Implement a weighted voting mechanism where the weight of each neighbor is inversely
proportional to its distance:

w_i = 1 / (1 + d(x, x_i)).

2. Analyze the impact of weighted voting on the overall performance metrics, including:
• Accuracy.
• Confusion Matrix.
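A minimal sketch of distance-weighted voting for a single query is given below; the 1/(1 + d) weights follow the formula above, while the helper name knn_predict_weighted and the dist_fn parameter (e.g. the weighted_euclidean sketch from Part B with weights fixed) are illustrative assumptions.

import numpy as np

def knn_predict_weighted(x, X_train, y_train, k, dist_fn):
    # Predict the class of query x with distance-weighted K-NN voting.
    d = dist_fn(x, X_train)        # distances from x to all training points
    nn = np.argsort(d)[:k]         # indices of the k nearest neighbours
    w = 1.0 / (1.0 + d[nn])        # weight inversely related to distance
    votes = np.zeros(y_train.max() + 1)
    for cls, weight in zip(y_train[nn], w):
        votes[cls] += weight       # accumulate weighted votes per class
    return int(np.argmax(votes))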
Part E: Model Evaluation
1. Evaluate the final model with the optimal k and distance metric on the test set.
2. Analyze and report:
• Overall accuracy.
• Class-specific performance to detect any biases.
