
MACHINE LEARNING HOMEWORK 1 SOLUTIONS

HARISH BALAJI BOOMINATHAN (hb2917)

September 16, 2024

Problem 1:
Goal: To find a one-dimensional function that takes a scalar input and outputs a scalar, f : ℝ → ℝ.

Form of the function:

f(x; \theta) = \theta_0 + \theta_1 x + \theta_2 x^2 + \dots + \theta_d x^d

where d is the degree of the polynomial.

Empirical Risk is given by


R_{emp}(\theta) = \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - f(x_i; \theta) \right)^2

On partially differentiating with respect to θ and equating to 0, we get:


\theta^* = (X^T X)^{-1} X^T Y

After splitting the dataset into two random halves, we compute θ* on one half and evaluate the resulting regression model for d ranging from 1 to 100.

[Figure: fitted regression model for d = 100]

Using cross validation and plotting the train error against the test error, the minimum test error occurs at degree d = 9, with an error value of 1.085044e+21.
The test error reaches its minimum at d = 9 and increases afterwards, while the training error keeps decreasing and stays constant once it reaches 0. This shows that the model starts to overfit as the degree increases, which is why the test error grows rapidly.
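A minimal MATLAB sketch of this procedure, assuming the data has already been loaded as N-by-1 column vectors x and y (the variable names are illustrative, not from the original submission):

```matlab
% Polynomial regression via the normal equation, for degrees d = 1..100.
% Assumes x and y are N-by-1 column vectors already in the workspace.
idx      = randperm(numel(x));            % random split into two halves
half     = floor(numel(x)/2);
trainIdx = idx(1:half);
testIdx  = idx(half+1:end);

trainErr = zeros(100,1);
testErr  = zeros(100,1);
for d = 1:100
    Xtr = x(trainIdx).^(0:d);             % Vandermonde matrix: columns x^0 .. x^d
    Xte = x(testIdx).^(0:d);
    theta = (Xtr'*Xtr) \ (Xtr'*y(trainIdx));      % theta* = (X'X)^(-1) X'Y
    trainErr(d) = mean((y(trainIdx) - Xtr*theta).^2) / 2;
    testErr(d)  = mean((y(testIdx)  - Xte*theta).^2) / 2;
end
[~, bestD] = min(testErr);                % degree with minimum test error
plot(1:100, trainErr, 1:100, testErr); legend('Train error','Test error');
```

The backslash solve is used here in place of an explicit matrix inverse, which is numerically safer for high-degree polynomials.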

Problem 2:
Goal: To build a multivariate regression function f : ℝ^100 → ℝ, where the basis functions are of the form

f(x; \theta) = \sum_{i=1}^{k} \theta_i x_i

We add an ℓ2 penalty to the model to reduce the risk of overfitting:

R_{reg}(\theta) = \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - f(x_i; \theta) \right)^2 + \frac{\lambda}{2N} \|\theta\|^2

Partially differentiating the risk with respect to θ and equating it to 0, we get the model with minimum risk:

\theta^* = (X^T X + \lambda I)^{-1} X^T Y

Applying two-fold cross validation for values of λ in the range 0 to 1000 and plotting the resulting errors, we obtain:
The test error is minimized at λ = 422. As λ increases, the entries of θ shrink and the model slowly starts to underfit; hence the training error slowly rises while the test error increases.
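A minimal MATLAB sketch of this two-fold cross validation over λ, assuming the data is already loaded as an N-by-100 design matrix X and an N-by-1 target vector y (variable names are illustrative):

```matlab
% Ridge regression with two-fold cross validation over lambda.
% Assumes X is N-by-100 and y is N-by-1, already in the workspace.
idx   = randperm(size(X,1));
half  = floor(size(X,1)/2);
folds = {idx(1:half), idx(half+1:end)};

lambdas = 0:1000;
cvErr   = zeros(numel(lambdas),1);
for k = 1:numel(lambdas)
    lam = lambdas(k);
    err = 0;
    for f = 1:2
        tr = folds{3-f};  te = folds{f};           % train on one half, test on the other
        theta = (X(tr,:)'*X(tr,:) + lam*eye(size(X,2))) \ (X(tr,:)'*y(tr));
        err   = err + mean((y(te) - X(te,:)*theta).^2) / 2;
    end
    cvErr(k) = err / 2;                            % average test error over the two folds
end
[~, best]  = min(cvErr);
bestLambda = lambdas(best);                        % lambda with minimum test error
```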

Problem 3:

(i) To prove:

For g(z) = \frac{1}{1 + e^{-z}}, the identity g(-z) = 1 - g(z) holds.

Proof:

g(z) = \frac{1}{1 + e^{-z}} \qquad (1)

Then,

g(-z) = \frac{1}{1 + e^{z}}

Multiplying the numerator and denominator by e^{-z}, we get

g(-z) = \frac{e^{-z}}{e^{-z} + 1}

Adding and subtracting 1 in the numerator,

g(-z) = \frac{1 + e^{-z} - 1}{1 + e^{-z}}

g(-z) = \frac{1 + e^{-z}}{1 + e^{-z}} - \frac{1}{1 + e^{-z}}

g(-z) = 1 - \frac{1}{1 + e^{-z}}

Using equation (1):

g(-z) = 1 - g(z)

Hence proved.

(ii) To prove:

g^{-1}(y) = \ln\left(\frac{y}{1 - y}\right)

Proof:

We know that g(g^{-1}(y)) = y.

Therefore,

g(g^{-1}(y)) = \frac{1}{1 + e^{-g^{-1}(y)}} = y

\frac{1}{y} = 1 + e^{-g^{-1}(y)}

\frac{1}{y} - 1 = e^{-g^{-1}(y)}

\frac{1 - y}{y} = e^{-g^{-1}(y)}

Taking the natural logarithm of both sides,

\ln\left(\frac{1 - y}{y}\right) = \ln\left(e^{-g^{-1}(y)}\right)

Using the properties of logarithms,

\ln\left(\frac{1 - y}{y}\right) = -g^{-1}(y)

Therefore,

g^{-1}(y) = -\ln\left(\frac{1 - y}{y}\right)

Again using the properties of logarithms,

g^{-1}(y) = \ln\left(\frac{y}{1 - y}\right)

Hence proved.
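As a quick numerical sanity check (separate from the proofs), both identities can be verified in MATLAB; the function handles below are illustrative:

```matlab
% Numerical check of g(-z) = 1 - g(z) and g^{-1}(y) = ln(y/(1-y)).
g    = @(z) 1 ./ (1 + exp(-z));          % logistic function
ginv = @(y) log(y ./ (1 - y));           % claimed inverse (logit)

z = linspace(-5, 5, 101);
max(abs(g(-z) - (1 - g(z))))             % ~0, up to floating-point error
max(abs(ginv(g(z)) - z))                 % ~0, up to floating-point error
```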

Problem 4:

Goal:
To implement a linear logistic regression algorithm for binary classification in MATLAB using gradient descent.

Classification function:

f(x; \theta) = \left(1 + \exp(-\theta^T x)\right)^{-1}

that minimizes the empirical risk with the logistic loss:


R_{emp}(\theta) = \frac{1}{N} \sum_{i=1}^{N} (y_i - 1)\log\left(1 - f(x_i; \theta)\right) - y_i \log\left(f(x_i; \theta)\right)

R_{emp}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} (1 - y_i)\log\left(1 - f(x_i; \theta)\right) + y_i \log\left(f(x_i; \theta)\right)

We know that,
f(x_i; \theta) = g(\theta^T x_i) \quad \text{and} \quad \frac{\partial}{\partial \theta} g(\theta^T x_i) = g(\theta^T x_i)\left(1 - g(\theta^T x_i)\right) x_i \qquad (1)

Therefore,
R_{emp}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} (1 - y_i)\log\left(1 - g(\theta^T x_i)\right) + y_i \log\left(g(\theta^T x_i)\right)

Hence the gradient is,


\frac{\partial}{\partial \theta} R_{emp}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} (1 - y_i)\left(\frac{1}{1 - g(\theta^T x_i)}\right)\left(-\frac{\partial}{\partial \theta} g(\theta^T x_i)\right) + y_i \left(\frac{1}{g(\theta^T x_i)}\right)\left(\frac{\partial}{\partial \theta} g(\theta^T x_i)\right)

\frac{\partial}{\partial \theta} R_{emp}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \left(\frac{1}{g(\theta^T x_i)}\right) - (1 - y_i)\left(\frac{1}{1 - g(\theta^T x_i)}\right) \right) \left(\frac{\partial}{\partial \theta} g(\theta^T x_i)\right)

Using Equation (1):


\frac{\partial}{\partial \theta} R_{emp}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \left(\frac{1}{g(\theta^T x_i)}\right) - (1 - y_i)\left(\frac{1}{1 - g(\theta^T x_i)}\right) \right) g(\theta^T x_i)\left(1 - g(\theta^T x_i)\right) x_i

\frac{\partial}{\partial \theta} R_{emp}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i\left(1 - g(\theta^T x_i)\right) - (1 - y_i)\, g(\theta^T x_i) \right) x_i

\frac{\partial}{\partial \theta} R_{emp}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i - y_i\, g(\theta^T x_i) + y_i\, g(\theta^T x_i) - g(\theta^T x_i) \right) x_i

\frac{\partial}{\partial \theta} R_{emp}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i - g(\theta^T x_i) \right) x_i

We can apply batch gradient descent to obtain θ*:

\theta^{(t+1)} = \theta^{(t)} - \eta\,\frac{\partial}{\partial \theta} R_{emp}(\theta^{(t)})

Here, η is the step size; the iterations can be stopped when the descent step becomes negligible in size, i.e. smaller than a tolerance ε. Both ε and η are hyperparameters.

We can initialize θ^(0) with small random values.
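A minimal MATLAB sketch of the training loop described above, assuming X is an N-by-d design matrix (with a leading column of ones for the bias) and y is an N-by-1 vector of 0/1 labels; the variable names and the exact stopping rule are illustrative:

```matlab
% Batch gradient descent for linear logistic regression.
% Assumes X is N-by-d (first column all ones) and y is N-by-1 with labels in {0,1}.
g = @(z) 1 ./ (1 + exp(-z));              % logistic function
N = size(X,1);
theta = 0.01 * randn(size(X,2), 1);       % small random initialization of theta

eta     = 2;                              % step size (hyperparameter)
eps_tol = 0.001;                          % tolerance (hyperparameter)
iters   = 0;
while true
    grad  = -(1/N) * X' * (y - g(X*theta));   % gradient derived above
    theta = theta - eta * grad;               % batch gradient descent update
    iters = iters + 1;
    if norm(eta * grad) < eps_tol             % stop when the descent step is negligible
        break
    end
end
```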

For η = 3 and ε = 0.1, the model produces a 12.5% binary classification error (87.5% accuracy) and takes 19 iterations to converge.
For η = 2 and ε = 0.03, the model produces a 10.5% binary classification error (89.5% accuracy) and takes 161 iterations to converge.
For η = 2 and ε = 0.01, the model produces a 2% binary classification error (98% accuracy) and takes 1050 iterations to converge.
For η = 2 and ε = 0.001, the model produces a 0% binary classification error (100% accuracy) and takes 27784 iterations to converge.
This last model performs excellently, with 0% error and 100% accuracy, but it takes 27784 iterations and keeps running for many iterations after it first reaches zero error. We can therefore increase the step size and the tolerance so that the model converges right around the point where it attains zero error.

For η = 4 and ε = 0.0025, the model produces a 0% binary classification error (100% accuracy) and takes just 10,292 iterations to converge.
Here the model achieves 100% accuracy with 0% error, and it converges right after the binary classification error reaches zero, so it runs in the least amount of time.
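For completeness, a sketch of how the reported classification error and accuracy could be computed once θ is fitted, thresholding the logistic output at 0.5 (Xtest, ytest, theta, and iters are assumed to come from the sketch above):

```matlab
% Binary classification error and accuracy on a held-out set.
probs    = 1 ./ (1 + exp(-Xtest*theta));       % predicted probabilities
pred     = probs >= 0.5;                       % predicted labels in {0,1}
errRate  = mean(pred ~= ytest);                % binary classification error
accuracy = 1 - errRate;
fprintf('error = %.2f%%, accuracy = %.2f%%, iterations = %d\n', ...
        100*errRate, 100*accuracy, iters);
```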
