Assignment 1
Notes:
• You should submit all your code as well as any graphs that you might plot. Do not submit answers to
theoretical questions.
• Do not submit the datasets.
• Include a single write-up (pdf) file containing a brief description for each question explaining what you did. Include any observations and/or plots required by the question in this single write-up file.
• You should use Python for all your programming solutions.
• Your code should have appropriate documentation for readability.
• You will be graded based on what you have submitted as well as your ability to explain your code.
• Refer to the course website for assignment submission instructions.
• This assignment is supposed to be done individually. You should carry out all the implementation by
yourself.
• We plan to run Moss on the submissions. We will also include submissions from previous years, since some of the questions may be repeated. Any cheating will result in a zero on the assignment, an additional penalty equal to the negative of the total weightage of the assignment, and possibly much stricter penalties (including a fail grade and/or a referral to the Disciplinary Committee (DisCo)).
• Many of the problems below have been adapted from the Machine Learning course offered by Andrew Ng
at Stanford.
• You should normalize the data (the x's) to have zero mean and unit variance in each dimension for Q1, Q3 and Q4, as described in class (a minimal snippet follows these notes). Do not perform any normalization for Q2.
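For reference, a short sketch of the normalization referred to above, assuming NumPy and a feature matrix with one example per row; the toy data is only illustrative.

```python
import numpy as np

X = np.array([[2.0, 10.0], [4.0, 20.0], [6.0, 30.0]])   # toy data, one example per row
X = (X - X.mean(axis=0)) / X.std(axis=0)                 # zero mean, unit variance per dimension
print(X.mean(axis=0), X.std(axis=0))                     # approximately [0, 0] and [1, 1]
```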
1. Linear Regression

The files linearX.csv and linearY.csv contain the acidity of the wine (x(i)'s, x(i) ∈ R) and its density (y(i)'s, y(i) ∈ R), respectively, with one training example per row. We will implement least squares linear regression to learn the relationship between the x(i)'s and the y(i)'s by minimizing the cost function

$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(y^{(i)} - h_\theta(x^{(i)})\right)^2$$

where hθ(x) = θT x and all the symbols are as discussed in class.
(a) (8 points) Implement the batch gradient descent method for optimizing J(θ). Choose an appropriate learning rate and a stopping criterion (as a function of the change in the value of J(θ)). You can initialize the parameters as θ = 0 (the vector of all zeros). Do not forget to include the intercept term. Report your learning rate, stopping criterion and the final set of parameters obtained by your algorithm. (Reference sketches for this part and for parts (c)–(e) appear after part (e).)
(b) (3 points) Plot the data on a two-dimensional graph and plot the hypothesis function learned by your
algorithm in the previous part.
(c) (3 points) Draw a 3-dimensional mesh showing the error function (J(θ)) on the z-axis and the parameters in the x–y plane. Display the error value for the current set of parameters at each iteration of gradient descent. Include a time gap of 0.2 seconds between iterations in your display so that the change in the function value can be observed by the human eye.
(d) (3 points) Repeat the part above, drawing the contours of the error function at each iteration of the gradient descent. Once again, choose a time gap of 0.2 seconds so that the change can be perceived by the human eye. (Note that this plot will be 2-D.)
(e) (3 points) Repeat the part above (i.e. draw the contours at each learning iteration) for the step size values η = {0.001, 0.025, 0.1}. What do you observe? Comment.
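For reference, a minimal sketch of part (a), assuming NumPy and the file names given above; the learning rate, the tolerance on the change in J(θ) and the iteration cap are illustrative choices, not prescribed values.

```python
import numpy as np

# Load the data (one training example per row) and normalize x to zero mean, unit variance.
x = np.loadtxt("linearX.csv")                     # acidity of the wine, shape (m,)
y = np.loadtxt("linearY.csv")                     # density of the wine, shape (m,)
x = (x - x.mean()) / x.std()

X = np.column_stack([np.ones(x.shape[0]), x])     # add the intercept term -> shape (m, 2)

def cost(theta, X, y):
    """J(theta): half the mean squared error over the training examples."""
    residual = X @ theta - y
    return residual @ residual / (2 * X.shape[0])

def batch_gradient_descent(X, y, eta=0.01, tol=1e-10, max_iters=100_000):
    theta = np.zeros(X.shape[1])                  # initialize at the zero vector
    history = [(theta.copy(), cost(theta, X, y))] # (theta, J) at every iteration
    for _ in range(max_iters):
        grad = X.T @ (X @ theta - y) / X.shape[0] # gradient of J(theta)
        theta = theta - eta * grad
        J = cost(theta, X, y)
        history.append((theta.copy(), J))
        if abs(history[-2][1] - J) < tol:         # stop when J(theta) barely changes
            break
    return theta, history

theta, history = batch_gradient_descent(X, y)
print("learned parameters (theta_0, theta_1):", theta)
```

And a sketch of the contour animation of parts (d) and (e), reusing `cost` and `history` from the sketch above; the grid ranges are illustrative, and for the 3-D mesh of part (c) the `contour` call would be replaced by `plot_surface` on a 3-D axis.

```python
import matplotlib.pyplot as plt

# Evaluate J(theta) on a grid around the learned parameters (grid ranges are illustrative).
t0 = np.linspace(theta[0] - 2, theta[0] + 2, 100)
t1 = np.linspace(theta[1] - 2, theta[1] + 2, 100)
T0, T1 = np.meshgrid(t0, t1)
J_grid = np.array([[cost(np.array([a, b]), X, y) for a, b in zip(r0, r1)]
                   for r0, r1 in zip(T0, T1)])

plt.contour(T0, T1, J_grid, levels=30)            # for part (c), use plot_surface on a 3-D axis instead
plt.xlabel("theta_0"); plt.ylabel("theta_1")
for theta_t, _ in history:
    plt.plot(theta_t[0], theta_t[1], "r.")        # parameter value at this iteration
    plt.pause(0.2)                                # 0.2 second gap so the motion is visible
plt.show()
```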
2. (20 points) Sampling and Stochastic Gradient Descent
In this problem, we will introduce the idea of sampling by adding Gaussian noise to the prediction of a hypothesis in order to generate synthetic training data. Consider a given hypothesis hθ (i.e. known θ0, θ1, θ2) for a data point x = [x0, x1, x2]T. Note that x0 = 1 is the intercept term. The noiseless prediction is

$$y = h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$$

and the sampled target is obtained by adding noise to it:

$$y = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \epsilon, \qquad \text{where } \epsilon \sim \mathcal{N}(0, \sigma^2)$$
To gain a deeper understanding of Stochastic Gradient Descent (SGD), we will use the SGD algorithm to learn the original hypothesis from the data generated using sampling, for varying batch sizes. We will implement the version where we make a complete pass through the data in a round-robin fashion (after initially shuffling the examples). If there are r examples in each batch, then there is a total of m/r batches, assuming m training examples. For batch number b (1 ≤ b ≤ m/r), the set of examples is given as {x(i1), x(i2), · · · , x(ir)}, where ik = (b − 1)r + k. The loss function computed over these r examples is given as:

$$J_b(\theta) = \frac{1}{2r}\sum_{k=1}^{r}\left(y^{(i_k)} - h_\theta(x^{(i_k)})\right)^2$$

(a) (4 points) Sample 1 million data points taking θ = [θ0, θ1, θ2]T = [3, 1, 2]T, x1 ∼ N(3, 4) and x2 ∼ N(−1, 4) independently, and noise variance in y of σ2 = 2.

(b) (6 points) Implement the stochastic gradient descent method for optimizing J(θ). Relearn θ = [θ0, θ1, θ2]T using the sampled data points of part (a), keeping everything the same except the batch size. Keep η = 0.001 and initialize θj = 0 for all j. Report the θ learned each time, separately for the batch size values r = {1, 100, 10000, 1000000}. Carefully decide your convergence criterion in each case. Make sure to watch the online video posted on the course website for deciding the convergence of the SGD algorithm. (A reference sketch appears after part (d).)
(c) (6 points) Do the different algorithms in the part above (for varying values of r) converge to the same parameter values? How different are these from the parameters of the original hypothesis from which the data was generated? Comment on the relative speed of convergence and also on the number of iterations in each case. Next, for each of the learned models above, report the error on a new test set of 10,000 samples provided in the file named q2test.csv. Note that this test set was generated using the same sampling procedure as described in part (a) above. Also, compute the test error with respect to the prediction of the original hypothesis, and compare it with the error obtained using the learned hypothesis in each case. Comment.
(d) (4 points) In the 3-dimensional parameter space (θj on each axis), plot the movement of θ as the parameters are updated (until convergence) for the varying batch sizes. How does the (shape of the) movement compare in each case? Does it make intuitive sense? Argue.
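A minimal sketch for parts (a), (b) and (d), assuming NumPy and matplotlib; the convergence test (comparing the loss averaged over a window of recent batches), its threshold, the epoch cap and the random seed are illustrative choices, not prescribed ones.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Part (a): sample 1 million points from the given hypothesis plus Gaussian noise.
m = 1_000_000
theta_true = np.array([3.0, 1.0, 2.0])
x1 = rng.normal(3.0, np.sqrt(4.0), m)                   # x1 ~ N(3, 4); second argument is the std dev
x2 = rng.normal(-1.0, np.sqrt(4.0), m)                  # x2 ~ N(-1, 4)
X = np.column_stack([np.ones(m), x1, x2])
y = X @ theta_true + rng.normal(0.0, np.sqrt(2.0), m)   # noise variance sigma^2 = 2

# Part (b): mini-batch SGD with round-robin passes after one initial shuffle.
def sgd(X, y, r, eta=0.001, tol=1e-4, max_epochs=50):
    """The tolerance, averaging window and epoch cap are illustrative and should be
    tuned separately for each batch size r, as the question asks."""
    m = X.shape[0]
    perm = rng.permutation(m)
    X, y = X[perm], y[perm]
    theta = np.zeros(X.shape[1])
    path = [theta.copy()]                         # parameter trajectory for part (d)
    window = max(1, min(1000, m // r))            # number of batches averaged per convergence check
    recent, prev_avg = [], None
    for _ in range(max_epochs):
        for b in range(m // r):
            Xb, yb = X[b * r:(b + 1) * r], y[b * r:(b + 1) * r]
            theta = theta - eta * Xb.T @ (Xb @ theta - yb) / r
            path.append(theta.copy())             # subsample this list for small r to save memory
            recent.append(np.mean((yb - Xb @ theta) ** 2) / 2)
            if len(recent) == window:             # compare consecutive averaged batch losses
                avg, recent = float(np.mean(recent)), []
                if prev_avg is not None and abs(prev_avg - avg) < tol:
                    return theta, np.array(path)
                prev_avg = avg
    return theta, np.array(path)

# Part (d): report the learned parameters and plot the trajectory of theta.
for r in [1, 100, 10000, 1000000]:
    theta_hat, path = sgd(X, y, r)
    print(f"r = {r:>7}: theta = {theta_hat}")
    ax = plt.figure().add_subplot(projection="3d")
    ax.plot(path[:, 0], path[:, 1], path[:, 2])
    ax.set_xlabel("theta_0"); ax.set_ylabel("theta_1"); ax.set_zlabel("theta_2")
    ax.set_title(f"batch size r = {r}")
plt.show()
```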
3. (15 points) Logistic Regression
Consider the log-likelihood function for logistic regression:
$$L(\theta) = \sum_{i=1}^{m}\left[\, y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right)\right]$$
For the following, you will need to calculate the value of the Hessian H of the above function.
(a) (10 points) The files logisticX.csv and logisticY.csv contain the inputs (x(i) ∈ R2) and outputs (y(i) ∈ {0, 1}), respectively, for a binary classification problem, with one training example per row. Implement¹ Newton's method for optimizing L(θ), and apply it to fit a logistic regression model to the data. Initialize Newton's method with θ = 0 (the vector of all zeros). What are the coefficients θ resulting from your fit? (Remember to include the intercept term.) (A reference sketch appears after part (b).)
(b) (5 points) Plot the training data (your axes should be x1 and x2, corresponding to the two coordinates of the inputs, and you should use a different symbol for each point plotted to indicate whether that example had label 1 or 0). Also plot, on the same figure, the decision boundary fit by logistic regression (i.e., this should be a straight line showing the boundary separating the region where hθ(x) > 0.5 from the region where hθ(x) ≤ 0.5).
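A minimal sketch of Newton's method for part (a), assuming NumPy, the file formats described above, and the standard gradient and Hessian of the log-likelihood (the derivation itself belongs in the write-up); the normalization step, delimiter and convergence threshold are illustrative.

```python
import numpy as np

# Load and normalize the inputs; add the intercept column. Adjust the delimiter if needed.
X = np.loadtxt("logisticX.csv", delimiter=",")         # shape (m, 2)
y = np.loadtxt("logisticY.csv")                        # labels in {0, 1}
X = (X - X.mean(axis=0)) / X.std(axis=0)
X = np.column_stack([np.ones(X.shape[0]), X])          # shape (m, 3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton(X, y, tol=1e-8, max_iters=50):
    theta = np.zeros(X.shape[1])                       # initialize at the zero vector
    for _ in range(max_iters):
        h = sigmoid(X @ theta)
        grad = X.T @ (y - h)                           # gradient of the log-likelihood
        W = h * (1.0 - h)                              # diagonal weights h(x)(1 - h(x))
        H = -(X.T * W) @ X                             # Hessian of the log-likelihood
        step = np.linalg.solve(H, grad)                # Newton step H^{-1} grad
        theta = theta - step
        if np.linalg.norm(step) < tol:                 # stop when the update is tiny
            break
    return theta

theta = newton(X, y)
print("coefficients:", theta)
```

And a short plotting sketch for part (b), reusing X, y and theta from above; the boundary hθ(x) = 0.5 corresponds to θT x = 0.

```python
import matplotlib.pyplot as plt

plt.scatter(X[y == 0, 1], X[y == 0, 2], marker="o", label="y = 0")
plt.scatter(X[y == 1, 1], X[y == 1, 2], marker="x", label="y = 1")
x1 = np.linspace(X[:, 1].min(), X[:, 1].max(), 100)
x2 = -(theta[0] + theta[1] * x1) / theta[2]            # solve theta^T x = 0 for x2
plt.plot(x1, x2, label="decision boundary")
plt.xlabel("x1"); plt.ylabel("x2"); plt.legend(); plt.show()
```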
4. Gaussian Discriminant Analysis

(a) (6 points) Implement Gaussian Discriminant Analysis (GDA) using the closed-form equations described in class. Assume that both the classes have the same co-variance matrix, i.e. Σ0 = Σ1 = Σ. Report the values of the means, µ0 and µ1, and the co-variance matrix Σ. (A reference sketch for this part and part (d) appears after part (f).)
(b) (2 points) Plot the training data, with the axes corresponding to the two coordinates of the input features, and use a different symbol for each point plotted to indicate whether that example had label Canada or Alaska.
(c) (3 points) Describe the equation of the boundary separating the two regions in terms of the parameters µ0, µ1 and Σ. Recall that GDA results in a linear separator when the two classes have identical co-variance matrices. Along with the data points plotted in the part above, plot (on the same figure) the decision boundary fit by GDA.
(d) (6 points) In general, GDA allows each of the target classes to have its own co-variance matrix. This (in general) results in a quadratic boundary separating the two class regions. In this case, the maximum-likelihood estimate of the co-variance matrix Σ0 can be derived using the equation:

$$\Sigma_0 = \frac{\sum_{i=1}^{m} 1\{y^{(i)} = 0\}\,(x^{(i)} - \mu_{y^{(i)}})(x^{(i)} - \mu_{y^{(i)}})^T}{\sum_{i=1}^{m} 1\{y^{(i)} = 0\}} \qquad (1)$$

and similarly for Σ1. The expressions for the means remain the same as before. Implement GDA for the above problem in this more general setting. Report the values of the parameter estimates, i.e. µ0, µ1, Σ0, Σ1.
(e) (5 points) Describe the equation for the quadratic boundary separating the two regions in terms of
the parameters µ0 , µ1 and Σ0 , Σ1 . On the graph plotted earlier displaying the data points and the
linear separating boundary, also plot the quadratic boundary obtained in the previous step.
(f) (3 points) Carefully analyze the linear as well as the quadratic boundaries obtained. Comment on
your observations.
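A minimal sketch of the parameter estimates for parts (a) and (d), assuming NumPy; the file names q4x.dat and q4y.dat are hypothetical placeholders for whatever data files accompany this question, and the formulas are the standard maximum-likelihood estimates with a shared Σ for part (a) and class-specific Σ0, Σ1 for part (d).

```python
import numpy as np

# Hypothetical file names; substitute the data files provided with the assignment.
X = np.loadtxt("q4x.dat")                        # shape (m, 2)
labels = np.loadtxt("q4y.dat", dtype=str)        # "Alaska" / "Canada"
y = (labels == "Canada").astype(int)             # encode one class as 1, the other as 0

# Normalize each input dimension to zero mean and unit variance.
X = (X - X.mean(axis=0)) / X.std(axis=0)
m = X.shape[0]

phi = y.mean()                                   # estimate of P(y = 1)
mu0 = X[y == 0].mean(axis=0)
mu1 = X[y == 1].mean(axis=0)

# Part (a): shared co-variance matrix Sigma.
centered = X - np.where(y[:, None] == 1, mu1, mu0)
Sigma = centered.T @ centered / m

# Part (d): class-specific co-variance matrices (equation (1) and its analogue for Sigma1).
d0 = X[y == 0] - mu0
d1 = X[y == 1] - mu1
Sigma0 = d0.T @ d0 / d0.shape[0]
Sigma1 = d1.T @ d1 / d1.shape[0]

print("phi =", phi)
print("mu0 =", mu0, "\nmu1 =", mu1)
print("Sigma =\n", Sigma)
print("Sigma0 =\n", Sigma0, "\nSigma1 =\n", Sigma1)
```

The boundaries of parts (c) and (e) can then be plotted by evaluating the corresponding log-density difference between the two classes on a grid of points and drawing its zero contour.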
¹ Write your own version, and do not call a built-in library function.