
Homework 1

Probability, Linear Algebra, K-NN and Decision Trees

Instructor: Dr. Md Rushdie Ibne Islam


APL 405: Machine Learning for Mechanics
Department of Applied Mechanics, Indian Institute of Technology Delhi, Winter 2025
Jan 31, 2025
Submission due on Feb 09, 2025

Instructions: (I) You must submit your own handwritten homework; except for the coding
portion, no computer-typed submission will be accepted. Copying homework from your friends
is forbidden: if copying is found, the submission will be considered invalid for all students
involved and graded zero. (II) Write out all the steps, including your reasoning and any
formulae you referred to. When submitting code, make sure the code runs, and submit a PDF
report of the plots and results along with your observations. Submit all the scanned copies
in a ZIP file named “HW1 Name EntryNumber”.

Question 1. [Total Marks: 40]


(A) If Q ∈ R^{n×n} is a nonsingular matrix and a perturbation of Q is defined by a
nonsingular matrix Λ ∈ R^{k×k} and a matrix V ∈ R^{n×k}, prove that the inverse of the
rank-k update of Q can be computed by a rank-k correction to the inverse of the original
matrix Q, i.e., [Marks: 3]

(Q + VΛV^⊤)^{-1} = Q^{-1} − Q^{-1} V (Λ^{-1} + V^⊤ Q^{-1} V)^{-1} V^⊤ Q^{-1}
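Before attempting the proof, it can help to check the identity numerically. A minimal NumPy sketch is given below; the random test matrices and the diagonal shift used to keep them well conditioned are illustrative assumptions, not part of the assignment.

import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# Random nonsingular Q and Lambda (adding a multiple of the identity keeps
# them well conditioned), and a random n x k matrix V.
Q = rng.standard_normal((n, n)) + n * np.eye(n)
Lam = rng.standard_normal((k, k)) + k * np.eye(k)
V = rng.standard_normal((n, k))

# Left-hand side: invert the rank-k update directly.
lhs = np.linalg.inv(Q + V @ Lam @ V.T)

# Right-hand side: rank-k correction to the inverse of Q.
Qinv = np.linalg.inv(Q)
rhs = Qinv - Qinv @ V @ np.linalg.inv(np.linalg.inv(Lam) + V.T @ Qinv @ V) @ V.T @ Qinv

print(np.allclose(lhs, rhs))  # expected: True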

(B) Given a block matrix M of the form: [Marks: 5+3+3 = 11]

    M = [ A  B ]
        [ C  D ]

where A and D are square matrices, and assuming that D and A − BD^{-1}C are
invertible:
(a) Derive the inverse of M and show that:

    M^{-1} = [ (A − BD^{-1}C)^{-1}              −A^{-1}B(D − CA^{-1}B)^{-1} ]
             [ −D^{-1}C(A − BD^{-1}C)^{-1}       (D − CA^{-1}B)^{-1}        ]

(b) Show that the determinant of M is given by: det(M) = det(D)·det(A − BD^{-1}C),
and similarly: det(M) = det(A)·det(D − CA^{-1}B).


Question 2. Conditional Expectation [Total Marks: 20]

(A) Consider two random variables x and y with joint distribution p(x, y). You have to
prove the following two results:
(i) Expectation:
E[x] = E_y[ E_x[x | y] ]

(ii) Variance:
var[x] = E_y[ var_x[x | y] ] + var_y[ E_x[x | y] ].

Here E_x[x|y] denotes the expectation of x under the conditional distribution p(x|y),
with a similar notation for the conditional variance.
(B) Use Monte Carlo sampling for the following two problems:
(i) Write a short program that uses random numbers to approximate the value of π.
(ii) Besides simulating random processes, random numbers can also be used to evaluate
integrals. Write computer programs to approximate the following integrals and compare
your estimates with the exact answers. (For the exact answers, you may use built-in
functions.)

(i) ∫_0^1 e^{e^x} dx,   (ii) ∫_0^∞ x(1 + x^2)^{-2} dx,   (iii) ∫_0^1 ∫_0^1 e^{(x+y)^2} dx dy.
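A minimal Python sketch of the sampling approach is given below; the sample size N, the random seed, and the change of variables used for the improper integral are illustrative assumptions, and the estimates should be compared against exact values computed separately.

import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000  # number of random samples (illustrative choice)

# (i) Estimate pi: fraction of uniform points in the unit square that fall
# inside the quarter unit circle, multiplied by 4.
xy = rng.uniform(0.0, 1.0, size=(N, 2))
pi_est = 4.0 * np.mean(np.sum(xy**2, axis=1) <= 1.0)

# (ii)(i) Integral of exp(exp(x)) over [0, 1]: the sample mean of the
# integrand at uniform points (the interval has length 1).
x = rng.uniform(0.0, 1.0, N)
I1 = np.mean(np.exp(np.exp(x)))

# (ii)(ii) Integral of x (1 + x^2)^{-2} over [0, inf): substitute x = t/(1 - t)
# to map [0, 1) onto [0, inf); the Jacobian is 1/(1 - t)^2.
t = rng.uniform(0.0, 1.0, N)
x = t / (1.0 - t)
I2 = np.mean(x * (1.0 + x**2) ** -2 / (1.0 - t) ** 2)

# (ii)(iii) Double integral of exp((x + y)^2) over the unit square.
u, v = rng.uniform(0.0, 1.0, N), rng.uniform(0.0, 1.0, N)
I3 = np.mean(np.exp((u + v) ** 2))

print(pi_est, I1, I2, I3)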

Question 3. Gaussian distribution [Total Marks: 20]

(A) Prove that the convolution of two Gaussian distributions is a Gaussian distribution.
Write a computer program to verify your result. [Marks: 5]
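For the numerical check, a minimal NumPy/SciPy sketch is given below; the chosen means and standard deviations are illustrative assumptions. It compares a grid-based convolution of the two densities against the Gaussian with summed means and variances.

import numpy as np
from scipy.stats import norm

# Illustrative parameters for the two Gaussians (assumed values).
mu1, s1 = 1.0, 0.8
mu2, s2 = -0.5, 1.5

# Evaluate both densities on a common symmetric grid, convolve numerically,
# and compare with N(mu1 + mu2, s1^2 + s2^2).
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
conv = np.convolve(norm.pdf(x, mu1, s1), norm.pdf(x, mu2, s2), mode="same") * dx
theory = norm.pdf(x, mu1 + mu2, np.sqrt(s1**2 + s2**2))

print("max abs difference:", np.max(np.abs(conv - theory)))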
(B) Given the following bivariate distribution, [Marks: 5]

p(x | µ, Σ) = N( [x_a; x_b] | [µ_a; µ_b], [σ_a^2, ρσ_aσ_b; ρσ_aσ_b, σ_b^2] ),

show that the conditional distribution p(x_a | x_b) is given as,

p(x_a | x_b) = N( x_a | µ_a + (ρσ_aσ_b / σ_b^2)(x_b − µ_b),  σ_a^2 − (ρσ_aσ_b)^2 / σ_b^2 )
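A minimal sampling sketch can be used to sanity-check this result before deriving it; the parameter values, the conditioning point xb0, and the tolerance window are illustrative assumptions.

import numpy as np

# Illustrative parameters (assumed values, not from the assignment).
mu_a, mu_b = 1.0, -2.0
sig_a, sig_b, rho = 1.5, 0.7, 0.6

# Draw many samples from the joint bivariate Gaussian.
rng = np.random.default_rng(0)
mean = np.array([mu_a, mu_b])
cov = np.array([[sig_a**2, rho * sig_a * sig_b],
                [rho * sig_a * sig_b, sig_b**2]])
xa, xb = rng.multivariate_normal(mean, cov, size=500_000).T

# Condition (approximately) on x_b being close to a chosen value xb0.
xb0 = -1.5
mask = np.abs(xb - xb0) < 0.02
emp_mean, emp_var = xa[mask].mean(), xa[mask].var()

# Theoretical conditional moments from the formula above.
th_mean = mu_a + rho * sig_a * sig_b / sig_b**2 * (xb0 - mu_b)
th_var = sig_a**2 - (rho * sig_a * sig_b) ** 2 / sig_b**2

print(emp_mean, th_mean, emp_var, th_var)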


Question 4. Maximum Likelihood Estimation (MLE) [Total Marks: 20]


(A) Consider a random experiment where we toss a coin N = 5 times. Let X be the
number of heads. Suppose we observe fewer than 3 heads; however, the exact number is
unknown a priori. We take the prior probability of heads to be p(θ) = Beta(θ | 1, 1).
Your task is to derive the posterior p(θ | X < 3) up to a normalization constant.
(B) Suppose you have a uniform distribution centered on 0 with width 2a. The density
function is given as

p(x) = (1 / 2a) · I(x ∈ [−a, a])
(i) Given a dataset x_1, . . . , x_n, what is the MLE of a?
(ii) What probability would the model assign to x̂_{n+1} using the MLE estimate of a?
(iii) Do you see any problem with the above approach? If yes, briefly suggest a better
alternative.

Question 5. Design and Implementation of Decision Trees with Custom Splitting Criteria
[Total Marks: 25]

You are tasked with designing a decision tree algorithm with support for custom splitting
criteria. Consider the following tasks:

Part A: Theoretical Development. 1. Assume a dataset D with N samples and a target
variable with k classes. Derive the splitting formulas for the following:
• Gini Impurity: Measure the impurity of nodes based on the probability of misclassification.
The formula is given by:

Gini, G = 1 − Σ_{i=1}^{k} P_i^2

where P_i is the proportion of samples belonging to class i.


• Entropy and Information Gain: Compute the entropy of a node and the reduction
in entropy after a split:

Entropy, E = − Σ_{i=1}^{k} P_i log_2(P_i)

and

Information Gain, I_g = E(Parent Node) − Σ_{v ∈ values} (|T_v| / |T|) · E(T_v),

where T_v represents the subset of data for a specific value of the splitting feature and
values represents the unique values or splits of the feature being considered.


• Information Gain Ratio: Normalize the information gain to penalize high-cardinality
features. The formula is:

Information Gain Ratio, I_gr = Information Gain / S_i,

where S_i = − Σ_{i=1}^{n} (|T_i| / |T|) log_2(|T_i| / |T|).
2. Discuss scenarios where each criterion is preferred, particularly in the presence of
high-cardinality features or imbalanced class distributions.
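For reference, a minimal NumPy sketch of the three splitting measures above is given below; the function names are illustrative, and this is only a starting point, not the required from-scratch decision tree implementation.

import numpy as np

def class_proportions(y):
    # Proportion P_i of samples in each class at a node.
    _, counts = np.unique(y, return_counts=True)
    return counts / counts.sum()

def gini(y):
    # Gini impurity: 1 minus the sum of squared class proportions.
    p = class_proportions(y)
    return 1.0 - np.sum(p**2)

def entropy(y):
    # Shannon entropy of the class distribution at a node (base 2).
    p = class_proportions(y)
    return -np.sum(p * np.log2(p))

def information_gain(y_parent, subsets):
    # Parent entropy minus the size-weighted entropy of the child subsets.
    n = len(y_parent)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(y_parent) - weighted

def gain_ratio(y_parent, subsets):
    # Information gain normalized by the split information of the partition.
    n = len(y_parent)
    split_info = -sum(len(s) / n * np.log2(len(s) / n) for s in subsets if len(s) > 0)
    return information_gain(y_parent, subsets) / split_info if split_info > 0 else 0.0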

Part B: Algorithm Design and Implementation. Implement a decision tree algorithm
using all three splitting criteria (Gini Impurity, Information Gain, and Information Gain
Ratio). Your implementation must not use any library functions (such as Scikit-learn's
DecisionTreeClassifier); only use libraries like Pandas or NumPy for data manipulation.
Use the Iris dataset (ref: from sklearn.datasets import load_iris) with a 70:10:20
train:validation:test split. Provide an analysis of the performance of the various splitting
criteria.

Question 6. K Nearest Neighbor [Total Marks: 25]


Problem Statement: You are tasked with implementing and optimizing a K-Nearest Neighbors
(K-NN) classifier on the Digits dataset (ref: from sklearn.datasets import load_digits).
The dataset contains:
• Features: 64 numerical features representing pixel intensity values for an 8x8 grayscale
image.
• Classes: 10 classes, representing digits (0 through 9).
• Samples: 1,797 total samples.
Perform a 70:10:20 split for training, validation, and testing sets, ensuring stratification to
maintain class balance.
Question Components
Part A: Data Preprocessing
1. Split the dataset into train, validation, and test sets using the specified 70:10:20 ratio.
Ensure that class distribution remains consistent across the splits.
2. Normalize the features. Explain why normalization can be essential for K-NN. Will it be
helpful for this particular dataset?
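A minimal sketch of the stratified 70:10:20 split and normalization is given below; it assumes scikit-learn is acceptable for loading and splitting the data (the K-NN itself is to be implemented from scratch), and the min-max scaling and random seed are illustrative choices.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)

# Carve out the 20% test set first, then split the remaining 80% into
# 70% train and 10% validation (0.125 of the remainder), stratifying on y
# so class proportions stay consistent across all three sets.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.125, stratify=y_tmp, random_state=0)

# Min-max normalization to [0, 1] using training-set statistics only
# (pixel intensities in this dataset range from 0 to 16).
lo, hi = X_train.min(axis=0), X_train.max(axis=0)
scale = np.where(hi > lo, hi - lo, 1.0)  # guard against constant features
X_train = (X_train - lo) / scale
X_val = (X_val - lo) / scale
X_test = (X_test - lo) / scale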
Part B: Custom Distance Metric
1. Implement a custom distance metric with feature-specific weights:

d(x, x_i) = sqrt( Σ_{j=1}^{d} w_j (x_j − x_{i,j})^2 ).


Initially set w_j = 1 (uniform weights). Later, adjust w_j based on the variance of feature j,
assigning higher weights to features with lower variance.
2. Compare the performance of this weighted Euclidean distance metric with the standard
Euclidean distance on the validation set.
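A minimal sketch of the weighted Euclidean distance and a variance-based weighting is given below; the inverse-variance rule and the small epsilon are illustrative assumptions for "higher weights to features with lower variance", and the variable names (X_train, X_ref) are placeholders.

import numpy as np

def weighted_euclidean(x, X_ref, w):
    # Weighted Euclidean distance from a query x to every row of X_ref.
    diff = X_ref - x
    return np.sqrt(np.sum(w * diff**2, axis=1))

def variance_weights(X_train, eps=1e-8):
    # Lower-variance features get higher weight; inverse variance is one
    # illustrative choice, with eps guarding against constant features.
    return 1.0 / (X_train.var(axis=0) + eps)

# Uniform weights for the initial experiment:
# w_uniform = np.ones(X_train.shape[1])
# Variance-based weights for the second experiment:
# w_var = variance_weights(X_train)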
Part C: Hyperparameter Tuning
1. Optimize the number of neighbors k using the validation set. Experiment with k ∈
{1, 3, 5, 10, 15}.
2. Compare the performance of different distance metrics: standard Euclidean distance,
Manhattan distance, and the weighted Euclidean distance (from Part B).
Part D: Weighted Voting
1. Implement a weighted voting mechanism where the weight of each neighbor is inversely
proportional to its distance:

w_i = 1 / (1 + d(x, x_i)).

2. Analyze the impact of weighted voting on the overall performance metrics, including:
• Accuracy.
• Confusion Matrix.
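A minimal sketch of distance-weighted voting for a single query is given below; the 1/(1 + d) weights follow the formula above, while the helper name knn_predict_weighted and the dist_fn parameter (e.g. the weighted_euclidean sketch from Part B with weights fixed) are illustrative assumptions.

import numpy as np

def knn_predict_weighted(x, X_train, y_train, k, dist_fn):
    # Predict the class of query x with distance-weighted K-NN voting.
    d = dist_fn(x, X_train)        # distances from x to all training points
    nn = np.argsort(d)[:k]         # indices of the k nearest neighbours
    w = 1.0 / (1.0 + d[nn])        # weight inversely related to distance
    votes = np.zeros(y_train.max() + 1)
    for cls, weight in zip(y_train[nn], w):
        votes[cls] += weight       # accumulate weighted votes per class
    return int(np.argmax(votes))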
Part E: Model Evaluation
1. Evaluate the final model with the optimal k and distance metric on the test set.
2. Analyze and report:
• Overall accuracy.
• Class-specific performance to detect any biases.
