
CS4442 & CS9542: Artificial Intelligence II – Assignment #1

Due: 23:59, February 24, 2025

Instructions
Your submission should include: 1) a .pdf file containing all the answers to the questions; for the
questions that require plotting figures, include all the figures in the .pdf file; and 2) source code
(e.g., .py files, a Python notebook, .m files, ...).

1 Refreshing Mathematics [25 points]


Let w ∈ R^n be an n-dimensional column vector, and let f(w) ∈ R be a function of w. In Lecture 2, we
defined the gradient ∇f(w) ∈ R^n and the Hessian matrix H ∈ R^{n×n} of f with respect to w.

(a) [5 points] Let f(w) = w^⊤ X b, where X ∈ R^{n×p} is an n × p matrix, and b is a p-dimensional column
vector. Compute ∇f(w) using the definition of the gradient.

(b) [5 points] Let f(w) = tr(B w w^⊤ A), where A, B ∈ R^{n×n} are square matrices of size n × n, and
tr(A) denotes the trace of the square matrix A. Using the definition of the gradient, compute ∇f(w).
(c) [5 points] Let f(w) = tr(B w w^⊤ A). Compute the Hessian matrix H of f with respect to w using
the definition.
   
(d) [5 points] If A = [1 1; 1 2] and B = [2 −1; −1 3], is f(w) a convex function? (Hint: you may use
Python/Matlab/R for this question.)
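The hint in (d) invites a numerical check. One possible approach, sketched below purely for illustration (not the required method, and all helper names are made up here): since f is a quadratic form, its Hessian is constant, so a central-finite-difference approximation of it is essentially exact, and f is convex if and only if the symmetric part of that Hessian is positive semidefinite.

```python
import numpy as np

# Matrices from part (d).
A = np.array([[1.0, 1.0], [1.0, 2.0]])
B = np.array([[2.0, -1.0], [-1.0, 3.0]])

def f(w):
    # f(w) = tr(B w w^T A) for a column vector w.
    w = w.reshape(-1, 1)
    return np.trace(B @ w @ w.T @ A)

def numerical_hessian(func, w, eps=1e-5):
    # Central-difference Hessian; since f is quadratic this is essentially exact.
    n = w.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * eps, np.eye(n)[j] * eps
            H[i, j] = (func(w + ei + ej) - func(w + ei - ej)
                       - func(w - ei + ej) + func(w - ei - ej)) / (4 * eps ** 2)
    return H

H = numerical_hessian(f, np.zeros(2))
# f is convex iff the (symmetrized) Hessian is positive semidefinite.
eigvals = np.linalg.eigvalsh((H + H.T) / 2)
print(eigvals, "=> convex" if np.all(eigvals >= -1e-8) else "=> not convex")
```

You should, of course, verify that the numerical result agrees with the Hessian you derive by hand in part (c).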
(e) [5 points] In Lecture 5, we defined the sigmoid function σ(a) = 1/(1 + e^{−a}). Let f(w) = log(σ(w^⊤ x)),
where log is the natural logarithm. Compute ∇f(w) using the definition of the gradient.

2 Linear and Polynomial Regression [50 points]


For this exercise, you will implement linear and polynomial regression in any programming language of
your choice (e.g., Python/Matlab/R). The training data set consists of the features hw1xtr.dat and their
desired outputs hw1ytr.dat. The test data set consists of the features hw1xte.dat and their desired
outputs hw1yte.dat.

(a) [5 points] Load the training data hw1xtr.dat and hw1ytr.dat into memory and plot it on one
graph. Load the test data hw1xte.dat and hw1yte.dat into memory and plot it on another
graph.
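For part (a), a minimal Python sketch, assuming each .dat file is a single whitespace-separated numeric column that numpy.loadtxt can parse (verify this against the actual files):

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumes each .dat file holds one whitespace-separated numeric column.
x_tr = np.loadtxt("hw1xtr.dat")
y_tr = np.loadtxt("hw1ytr.dat")
x_te = np.loadtxt("hw1xte.dat")
y_te = np.loadtxt("hw1yte.dat")

plt.figure()
plt.scatter(x_tr, y_tr)
plt.title("Training data")

plt.figure()
plt.scatter(x_te, y_te)
plt.title("Test data")
plt.show()
```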
(b) [10 points] Add a column vector of 1’s to the features, then use the linear regression formula
discussed in Lecture 3 to obtain a 2-dimensional weight vector. Plot both the linear regression line
and the training data on the same graph. Also report the average error on the training set using
Eq. (1).
err = (1/m) Σ_{i=1}^{m} (w^⊤ x_i − y_i)^2        (1)
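For parts (b) and (c), one possible sketch, assuming the Lecture 3 formula is the usual least-squares solution w = (X^⊤ X)^{-1} X^⊤ y; the helper names below are invented here, and np.linalg.lstsq is used instead of an explicit matrix inverse for numerical stability:

```python
import numpy as np

def design_matrix(x):
    # Prepend a column of ones so that w[0] plays the role of the bias.
    return np.column_stack([np.ones_like(x), x])

def fit_least_squares(X, y):
    # Solves min_w ||X w - y||^2, i.e. w = (X^T X)^{-1} X^T y.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def avg_error(w, X, y):
    # Eq. (1): average squared error over the m instances.
    return np.mean((X @ w - y) ** 2)
```

With the data from part (a), w = fit_least_squares(design_matrix(x_tr), y_tr) yields the 2-dimensional weight vector; plotting a dense grid of x values against its predictions overlays the regression line on the scatter plot.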

(c) [5 points] Plot both the regression line and the test data on the same graph. Also report the
average error on the test set using Eq. (1).

(d) [10 points] Implement the 2nd-order polynomial regression by adding the new feature x^2 to the inputs.
Repeat (b) and (c). Compare the training error and test error. Is it a better fit than linear
regression? (A shared feature-construction sketch follows part (f).)
(e) [10 points] Implement the 3rd-order polynomial regression by adding the new features x^2, x^3 to the
inputs. Repeat (b) and (c). Compare the training error and test error. Is it a better fit than linear
regression and 2nd-order polynomial regression?

(f) [10 points] Implement the 4th-order polynomial regression by adding the new features x^2, x^3, x^4 to
the inputs. Repeat (b) and (c). Compare the training error and test error. Compared with the
previous results, which order is the best for fitting the data?
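Parts (d)-(f) differ only in the feature map. A sketch of the shared machinery, reusing the data loaded in part (a) and the hypothetical fit_least_squares and avg_error helpers from the part (b) sketch:

```python
import numpy as np

def poly_design_matrix(x, order):
    # Columns [1, x, x^2, ..., x^order]; order=1 recovers plain linear regression.
    return np.column_stack([x ** k for k in range(order + 1)])

# Assumes x_tr, y_tr, x_te, y_te and the part (b) helpers are already defined.
for order in (1, 2, 3, 4):
    X_tr, X_te = poly_design_matrix(x_tr, order), poly_design_matrix(x_te, order)
    w = fit_least_squares(X_tr, y_tr)
    print(order, avg_error(w, X_tr, y_tr), avg_error(w, X_te, y_te))
```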

3 Regularization and Cross-Validation [25 points]


(a) [10 points] Use the training data to implement ℓ2-regularized 4th-order polynomial regression
(page 12 of Lecture 4; note that we do not penalize the bias term w0), varying the regularization
parameter λ ∈ {0.01, 0.1, 1, 10, 100, 1000}. Plot the training and test error (averaged over all in-
stances) using Eq. (1) as a function of λ (you should use a log10 scale for λ). Which λ is the best
for fitting the data?
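A minimal sketch of one standard closed-form ridge solution, assuming it matches the page 12 formulation; the identity matrix is zeroed at the bias position so that w0 is left unpenalized (fit_ridge is an invented name):

```python
import numpy as np

def fit_ridge(X, y, lam):
    # w = (X^T X + lam * I')^{-1} X^T y, where I' is the identity matrix
    # with a 0 in the (0, 0) entry so the bias weight w[0] is not penalized.
    I = np.eye(X.shape[1])
    I[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + lam * I, X.T @ y)
```

Sweeping λ over {0.01, 0.1, 1, 10, 100, 1000} with the 4th-order design matrix and plotting the two error curves with plt.semilogx gives part (a); stacking the weight vectors returned for each λ supplies the data for part (b).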

(b) [10 points] Plot the value of each weight parameter (including the bias term w0 ) as a function of λ.
(c) [5 points] Write a procedure that performs five-fold cross-validation on your training data (page 7
of Lecture 4). Use it to determine the best value for λ. Show the average error on the validation
set as a function of λ. Is it the same as the best λ in (a)? For the best fit, plot the test data and the
ℓ2-regularized 4th-order polynomial regression line obtained.
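One possible five-fold split for part (c) (shuffling and fold boundaries here are assumptions; follow the Lecture 4 protocol if it differs), reusing the hypothetical poly_design_matrix, fit_ridge, and avg_error helpers from the earlier sketches:

```python
import numpy as np

def cv_error(x, y, lam, order=4, n_folds=5, seed=0):
    # Average validation error of ridge regression over n_folds folds.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))          # shuffle once, then split
    folds = np.array_split(idx, n_folds)
    errors = []
    for k in range(n_folds):
        val = folds[k]
        tr = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        X_tr = poly_design_matrix(x[tr], order)
        X_val = poly_design_matrix(x[val], order)
        w = fit_ridge(X_tr, y[tr], lam)
        errors.append(avg_error(w, X_val, y[val]))
    return np.mean(errors)

best_lam = min([0.01, 0.1, 1, 10, 100, 1000],
               key=lambda lam: cv_error(x_tr, y_tr, lam))
```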
