CS209 Practice Problems 1 ML

This problem set introduces concepts from linear algebra and multivariable calculus that are important foundations for machine learning. It contains 10 practice problems related to gradients, Hessians, positive definite matrices, eigenvalues/eigenvectors, and decision trees. The problems are intended for practice and will not be evaluated, but solving them helps prepare for graded problem sets. Submissions will be made using Gradescope.

CS229 Problem Set #0

CS 229, Fall 2018


CS 209: Practice Problem Set I (Machine Learning)
Problem Set #0: Linear Algebra and Multivariable Calculus

Note: These questions are given for your practice and will not be evaluated. Try to solve all questions.

Notes: (1) These questions require thought, but do not require long answers. Please be as concise as possible. (2) If you have a question about this homework, we encourage you to post your question on our Piazza forum, at https://piazza.com/stanford/fall2018/cs229. (3) If you missed the first lecture or are unfamiliar with the collaboration or honor code policy, please read the policy on Handout #1 (available from the course website) before starting work. (4) This specific homework is not graded, but we encourage you to solve each of the problems to brush up on your linear algebra. Some of them may even be useful for subsequent problem sets. It also serves as your introduction to using Gradescope for submissions.

Q1. [0 points] Gradients and Hessians
Recall that a matrix A ∈ R^{n×n} is symmetric if A^T = A, that is, A_{ij} = A_{ji} for all i, j. Also recall the gradient ∇f(x) of a function f : R^n → R, which is the n-vector of partial derivatives

$$\nabla f(x) = \begin{bmatrix} \frac{\partial}{\partial x_1} f(x) \\ \vdots \\ \frac{\partial}{\partial x_n} f(x) \end{bmatrix} \quad \text{where} \quad x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}.$$

The Hessian ∇²f(x) of a function f : R^n → R is the n × n symmetric matrix of second partial derivatives,

$$\nabla^2 f(x) = \begin{bmatrix} \frac{\partial^2}{\partial x_1^2} f(x) & \frac{\partial^2}{\partial x_1 \partial x_2} f(x) & \cdots & \frac{\partial^2}{\partial x_1 \partial x_n} f(x) \\ \frac{\partial^2}{\partial x_2 \partial x_1} f(x) & \frac{\partial^2}{\partial x_2^2} f(x) & \cdots & \frac{\partial^2}{\partial x_2 \partial x_n} f(x) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2}{\partial x_n \partial x_1} f(x) & \frac{\partial^2}{\partial x_n \partial x_2} f(x) & \cdots & \frac{\partial^2}{\partial x_n^2} f(x) \end{bmatrix}.$$

(a) Let f(x) = (1/2) x^T A x + b^T x, where A is a symmetric matrix and b ∈ R^n is a vector. What is ∇f(x)?

(b) Let f(x) = g(h(x)), where g : R → R is differentiable and h : R^n → R is differentiable. What is ∇f(x)?

(c) Let f(x) = (1/2) x^T A x + b^T x, where A is symmetric and b ∈ R^n is a vector. What is ∇²f(x)?

(d) Let f(x) = g(a^T x), where g : R → R is continuously differentiable and a ∈ R^n is a vector. What are ∇f(x) and ∇²f(x)? (Hint: your expression for ∇²f(x) may have as few as 11 symbols, including ′ and parentheses.)
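A finite-difference check can confirm an answer to parts (a) and (c): for symmetric A, ∇f(x) = Ax + b and ∇²f(x) = A. The sketch below verifies the gradient numerically on one small example; the particular A, b, and x0 are arbitrary choices, not values from the problem set.

```python
# Finite-difference sanity check for Q1(a): for f(x) = (1/2) x^T A x + b^T x
# with A symmetric, the gradient is ∇f(x) = Ax + b.

def f(x, A, b):
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad + sum(b[i] * x[i] for i in range(n))

def numeric_grad(x, A, b, h=1e-6):
    # central differences: (f(x + h*e_i) - f(x - h*e_i)) / (2h)
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp, A, b) - f(xm, A, b)) / (2 * h))
    return g

A = [[2.0, 1.0], [1.0, 3.0]]   # arbitrary symmetric matrix
b = [1.0, -1.0]                # arbitrary vector
x0 = [0.5, -2.0]               # arbitrary test point

analytic = [sum(A[i][j] * x0[j] for j in range(2)) + b[i] for i in range(2)]
numeric = numeric_grad(x0, A, b)
print(analytic, numeric)  # the two should agree to ~1e-6
```

Because f is quadratic, central differences are exact up to floating-point rounding, so any visible disagreement would point to an error in the analytic formula.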

Q2. [0 points] Positive definite matrices

A matrix A ∈ R^{n×n} is positive semi-definite (PSD), denoted A ⪰ 0, if A = A^T and x^T A x ≥ 0 for all x ∈ R^n. A matrix A is positive definite, denoted A ≻ 0, if A = A^T and x^T A x > 0 for all x ≠ 0, that is, all non-zero vectors x. The simplest example of a positive definite matrix is the identity I (the diagonal matrix with 1s on the diagonal and 0s elsewhere), which satisfies

$$x^T I x = \|x\|_2^2 = \sum_{i=1}^n x_i^2.$$

(a) Let z ∈ R^n be an n-vector. Show that A = zz^T is positive semidefinite.


(b) Let z ∈ R^n be a non-zero n-vector. Let A = zz^T. What is the null-space of A? What is the rank of A?

(c) Let A ∈ R^{n×n} be positive semidefinite and B ∈ R^{m×n} be arbitrary, where m, n ∈ N. Is BAB^T PSD? If so, prove it. If not, give a counterexample with explicit A, B.
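Part (a) has a one-line numeric illustration: x^T(zz^T)x equals (z^T x)², which is a square and hence non-negative. The sketch below checks the identity; the vector z and the test points are our own arbitrary choices.

```python
# Sanity-check Q2(a): for A = zz^T, the quadratic form x^T A x equals
# (z^T x)^2, which is always >= 0.

def quad_form_zzT(z, x):
    # compute x^T (zz^T) x two ways: expanding the double sum directly,
    # and as the square of the dot product z^T x
    n = len(z)
    direct = sum(x[i] * z[i] * z[j] * x[j] for i in range(n) for j in range(n))
    dot = sum(z[i] * x[i] for i in range(n))
    return direct, dot * dot

z = [1.0, -2.0, 3.0]                                   # arbitrary z
for x in ([1.0, 1.0, 1.0], [-2.0, 0.5, 0.0], [0.0, 0.0, 0.0]):
    direct, squared = quad_form_zzT(z, x)
    assert abs(direct - squared) < 1e-9 and direct >= 0
```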

Q3. [0 points] Eigenvectors, eigenvalues, and the spectral theorem

The eigenvalues of a matrix A ∈ R^{n×n} are the roots of the characteristic polynomial p_A(λ) = det(λI − A), which may (in general) be complex. They are also defined as the values λ ∈ C for which there exists a vector x ∈ C^n such that Ax = λx. We call such a pair (x, λ) an eigenvector, eigenvalue pair. In this question, we use the notation diag(λ_1, …, λ_n) to denote the diagonal matrix with diagonal entries λ_1, …, λ_n, that is,

$$\mathrm{diag}(\lambda_1, \ldots, \lambda_n) = \begin{bmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & 0 & \cdots & 0 \\ 0 & 0 & \lambda_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_n \end{bmatrix}.$$

(a) Suppose that the matrix A ∈ R^{n×n} is diagonalizable, that is, A = T Λ T⁻¹ for an invertible matrix T ∈ R^{n×n}, where Λ = diag(λ_1, …, λ_n) is diagonal. Use the notation t^{(i)} for the columns of T, so that T = [t^{(1)} · · · t^{(n)}], where t^{(i)} ∈ R^n. Show that A t^{(i)} = λ_i t^{(i)}, so that the eigenvector/eigenvalue pairs of A are (t^{(i)}, λ_i).

A matrix U ∈ R^{n×n} is orthogonal if U^T U = I. The spectral theorem, perhaps one of the most important theorems in linear algebra, states that if A ∈ R^{n×n} is symmetric, that is, A = A^T, then A is diagonalizable by a real orthogonal matrix. That is, there are a diagonal matrix Λ ∈ R^{n×n} and an orthogonal matrix U ∈ R^{n×n} such that U^T A U = Λ, or, equivalently,

A = U Λ U^T.

Q4. Let λ_i = λ_i(A) denote the ith eigenvalue of A.

(b) Let A be symmetric. Show that if U = [u^{(1)} · · · u^{(n)}] is orthogonal, where u^{(i)} ∈ R^n and A = U Λ U^T, then u^{(i)} is an eigenvector of A and A u^{(i)} = λ_i u^{(i)}, where Λ = diag(λ_1, …, λ_n).

(c) Show that if A is PSD, then λ_i(A) ≥ 0 for each i.
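Part (b) can be illustrated on one hand-worked 2×2 example. For A = [[2, 1], [1, 2]] (our own choice, not from the problem set), the eigenpairs computed by hand are λ = 3 with u = (1, 1)/√2 and λ = 1 with u = (1, −1)/√2:

```python
import math

# Check A u = λ u and orthonormality of the eigenvectors for the
# hand-worked symmetric example A = [[2, 1], [1, 2]].

A = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigpairs = [(3.0, [s, s]), (1.0, [s, -s])]   # (λ, u) computed by hand

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

for lam, u in eigpairs:
    Au = matvec(A, u)
    # A u should equal λ u componentwise
    assert all(abs(Au[i] - lam * u[i]) < 1e-12 for i in range(2))

# the two eigenvectors are orthogonal (and each has unit norm),
# so stacking them as columns gives an orthogonal U with U^T A U = Λ
u1, u2 = eigpairs[0][1], eigpairs[1][1]
assert abs(sum(a * b for a, b in zip(u1, u2))) < 1e-12
```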

Q5.

Q6.

Q7.

Q8. (Decision Trees)

i. What is the entropy H(Passed)?

ii. What is the entropy H(Passed | GPA)?

iii. What is the entropy H(Passed | Studied)?

iv. Draw the full decision tree that would be learned for this dataset. No need to show detailed calculations.
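The dataset for Q8 is not reproduced in this extract, so the sketch below only shows how the required quantities would be computed, on a small invented table; the Passed and Studied columns are hypothetical, not the course's data.

```python
import math
from collections import Counter

def entropy(labels):
    # H(Y) = -sum_y p(y) log2 p(y)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def cond_entropy(features, labels):
    # H(Y | X) = sum_x p(x) * H(Y | X = x)
    n = len(labels)
    by_value = {}
    for x, y in zip(features, labels):
        by_value.setdefault(x, []).append(y)
    return sum(len(ys) / n * entropy(ys) for ys in by_value.values())

passed  = ['yes', 'yes', 'no', 'no', 'yes', 'no']   # hypothetical labels
studied = ['yes', 'yes', 'no', 'no', 'yes', 'yes']  # hypothetical feature

print(entropy(passed))               # a 3/3 split gives H(Passed) = 1.0
print(cond_entropy(studied, passed))
```

For part iv, the standard ID3-style procedure splits on the attribute giving the largest drop H(Passed) − H(Passed | attribute), then recurses within each branch.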
Q9. Consider f(x) = x² + 3x + 2. Calculate f(1), f(2), f(3), f(4), and f(5) using the definition of f(x). Then estimate f(2.5) using KNN with K = 1, 2, and 3. Repeat the process for f(0). Compare these estimates with the values of f(2.5) and f(0) from the function, and explain which value of K gives the optimal result. Does this algorithm perform better for f(0) or f(2.5)? Give sufficient reasoning for why this is the case.
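One way to set up Q9, assuming the standard KNN-regression rule of averaging the k nearest training targets and treating the five computed points as the training set (check this against the variant defined in class):

```python
# 1-D KNN regression on the training points (1, f(1)), ..., (5, f(5)).
# Ties in distance (e.g. x = 2 and x = 3 are equidistant from 2.5)
# are broken by training order here, since Python's sort is stable.

def f(x):
    return x ** 2 + 3 * x + 2

train = [(x, f(x)) for x in (1, 2, 3, 4, 5)]

def knn_predict(xq, k):
    nearest = sorted(train, key=lambda p: abs(p[0] - xq))[:k]
    return sum(y for _, y in nearest) / k

for k in (1, 2, 3):
    print(k, knn_predict(2.5, k), knn_predict(0, k))
```

The true values are f(2.5) = 15.75 and f(0) = 2; comparing them with the printed estimates shows why interpolation (x = 2.5, inside the training range) fares better than extrapolation (x = 0, outside it).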

Q10. Consider the system characterised by the following equations:

-x + y = 10
2x - y = 5
x - 2y = 20

Find the least-squares solution to the above equations using any two techniques learnt in class.
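One admissible technique is the normal equations, A^T A x̂ = A^T b; since A^T A is only 2×2 here, the resulting system can be solved by Cramer's rule. A minimal sketch (QR or the pseudoinverse would do equally well):

```python
# Least squares for Q10 via the normal equations A^T A x_hat = A^T b.

A = [[-1.0, 1.0], [2.0, -1.0], [1.0, -2.0]]   # coefficient matrix
b = [10.0, 5.0, 20.0]                          # right-hand side

# form A^T A (2x2) and A^T b (2-vector)
AtA = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(2)]
       for i in range(2)]
Atb = [sum(A[k][i] * b[k] for k in range(3)) for i in range(2)]

# solve the 2x2 system by Cramer's rule
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
x_hat = (Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1]) / det
y_hat = (AtA[0][0] * Atb[1] - Atb[0] * AtA[1][0]) / det

print(x_hat, y_hat)   # -> -5.0 -10.0
```

At the solution the residual r = Ax̂ − b is orthogonal to the columns of A (A^T r = 0), which is the defining optimality condition for least squares.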

Q11.

1) Maximize 3x + 2y - x² subject to the constraint 2x + 3y = 12.

2) Minimize f(x, y) = x² + y² subject to the constraint x + 2y - 1 = 0.
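Since both constraints in Q11 are linear, either Lagrange multipliers or direct substitution works. The closed-form answers below were derived by substitution (our own working, so verify the algebra); the code only checks them against the constraints and their neighbours.

```python
# (1) maximize g(x, y) = 3x + 2y - x^2  subject to 2x + 3y = 12.
#     Substituting y = (12 - 2x)/3 gives d/dx [3x + 2(12 - 2x)/3 - x^2]
#     = 3 - 4/3 - 2x = 0, so x = 5/6 and y = 31/9.
x1, y1 = 5.0 / 6.0, 31.0 / 9.0

# (2) minimize f(x, y) = x^2 + y^2  subject to x + 2y - 1 = 0.
#     Lagrange conditions: 2x = lam, 2y = 2*lam  =>  y = 2x;
#     the constraint then gives x = 1/5, y = 2/5.
x2, y2 = 0.2, 0.4

assert abs(2 * x1 + 3 * y1 - 12) < 1e-9   # constraint (1) holds
assert abs(x2 + 2 * y2 - 1) < 1e-9        # constraint (2) holds

def g(x, y): return 3 * x + 2 * y - x * x
def f(x, y): return x * x + y * y

# moving along each constraint away from the candidate should not improve it
for t in (-0.1, 0.1):
    assert g(x1 + t, (12 - 2 * (x1 + t)) / 3) <= g(x1, y1)
    assert f(x2 + t, (1 - (x2 + t)) / 2) >= f(x2, y2)
```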
