
CSE 597 Spring 2019

Exercise 1
Due Sunday 11:59 PM, February 3rd

Instructions:

• There are four problems in this exercise.

• Please mention if you are auditing the course.

• Using this LaTeX template will be helpful for grading purposes.

• Please write down every mathematical fact you use in your derivations (even if it looks obvious
to you).

Problem 1 (25 points). Consider the squared p-norm of a vector $x \in \mathbb{R}^d$ defined as
\[
f(x) = \|x\|_p^2 = \Big( \sum_{i=1}^{d} |x_i|^p \Big)^{2/p}.
\]
Prove that $f(x)$ is $(p-1)$-smooth for $p \in [2, \infty]$.
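One definition the proof presumably relies on (stated here for convenience; the course may use an equivalent convention) is smoothness with respect to a general norm $\|\cdot\|$ with dual norm $\|\cdot\|_*$: a differentiable function $f$ is $L$-smooth if
\[
\|\nabla f(x) - \nabla f(y)\|_* \le L \, \|x - y\| \quad \text{for all } x, y,
\]
which, for convex $f$, is equivalent to the quadratic upper bound $f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + \tfrac{L}{2}\|y - x\|^2$. For this problem the natural choice is presumably $\|\cdot\| = \|\cdot\|_p$ with dual norm $\|\cdot\|_q$, where $\tfrac{1}{p} + \tfrac{1}{q} = 1$.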

Solution. 

Problem 2 (25 points). Assuming the data matrix is $X \in \mathbb{R}^{n \times d}$, the standard Lasso problem is
given by:
\[
\min_{w \in \mathbb{R}^d} \; \|Xw - y\|_2^2 + \lambda \|w\|_1 .
\]
Show that the dual problem is:
\[
\min_{\alpha \in \mathbb{R}^n} \; \frac{1}{2}\|y\|_2^2 - \frac{\lambda^2}{2} \Big\| \alpha - \frac{y}{\lambda} \Big\|_2^2
\qquad \text{subject to} \quad |x_i^\top \alpha| \le 1, \; i = 1, 2, \ldots, d,
\]
where $x_i$, $i = 1, 2, \ldots, d$, are the feature vectors (columns of the data matrix $X$).
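A fact that is typically needed when forming the Lagrangian here (stated as a hint only, under the assumption that the derivation goes through Lagrangian/Fenchel duality) is the conjugate of the scaled $\ell_1$ norm:
\[
\sup_{w \in \mathbb{R}^d} \big( u^\top w - \lambda \|w\|_1 \big) =
\begin{cases}
0 & \text{if } \|u\|_\infty \le \lambda, \\
+\infty & \text{otherwise,}
\end{cases}
\]
which is where box constraints of the form $|x_i^\top \alpha| \le 1$ on the dual variable come from.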

Solution. 

Problem 3 (25 points). The convex envelope of a function $f : C \to \mathbb{R}$ is defined as the point-wise largest
convex function $g$ such that $g(x) \le f(x)$ for all $x \in C$; that is, among all convex under-estimators
of $f$, $g$ is the one closest to $f$ (e.g., the $\ell_1$ norm is the convex envelope of the $\ell_0$ norm on the unit $\ell_\infty$ ball).
To obtain the convex envelope of a non-convex function we can rely on a basic result in convex
analysis, which states that for a non-convex function $f$, the biconjugate $f^{**}$ (the conjugate of the
conjugate) is the convex envelope of $f$. Using this fact, show that the convex envelope of the function
$f(X) = \operatorname{rank}(X)$ on the set
\[
C = \big\{ X \in \mathbb{R}^{n \times d} \;\big|\; \|X\|_2 \le 1 \big\}
\]
is the function
\[
g(X) = \|X\|_* = \sum_{i=1}^{\min(n,d)} \sigma_i(X).
\]

An immediate implication of this result is that the trace (nuclear) norm of a matrix is the tightest convex
relaxation of the rank over the spectral-norm unit ball.
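As a minimal numerical sanity check (a sketch only, assuming NumPy is available; it verifies the under-estimator property $\|X\|_* \le \operatorname{rank}(X)$ on $C$, not that the nuclear norm is the largest such convex function):

import numpy as np

# Minimal check (not a proof): on C = {X : ||X||_2 <= 1} every singular value
# is at most 1, so the nuclear norm never exceeds the rank, as any convex
# under-estimator of rank(X) on C must satisfy.
rng = np.random.default_rng(0)
n, d = 8, 5
for _ in range(1000):
    X = rng.standard_normal((n, d))
    X /= max(np.linalg.norm(X, ord=2), 1.0)      # scale so the spectral norm is <= 1
    sigma = np.linalg.svd(X, compute_uv=False)   # singular values of X
    assert sigma.sum() <= np.linalg.matrix_rank(X) + 1e-8
print("nuclear norm <= rank held on all sampled matrices in C")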
Solution. 

Problem 4 (25 points). Consider the following convex optimization problem over matrices:
\[
\min_{X \in C} \; F(X) = f(X) + \lambda \|X\|_*
\]
where $C = \{ X \in \mathbb{R}^{n \times d} \mid \|X\|_F \le M \}$, $f(X)$ is any convex function (not necessarily differentiable),
$\lambda > 0$ is a regularization parameter, and $\|X\|_*$ denotes the nuclear (trace) norm of a matrix $X$,
which is the sum, or equivalently the $\ell_1$ norm, of the singular values of $X$.

(a) Show that the projection onto the set $C$ is:
\[
\Pi_C(X) = \min\Big( 1, \frac{M}{\|X\|_F} \Big) X .
\]

(b) What is the subdifferential $\partial F(X)$ of the objective function? You might need to
read [AW].

(c) Consider the projected subgradient descent algorithm for solving the above optimization
problem, which iteratively updates the initial solution $X^0 = 0$ by
\[
X^{t+1} = \Pi_C\big( X^t - \eta_t G^t \big),
\]
where $G^t \in \partial F(X^t)$. Show that the convergence rate after $T$ iterations is:
\[
\frac{1}{T} \sum_{t=1}^{T} \mathbb{E}\big[ F(X^t) \big] \;\le\; F(X^*) + \frac{\|X^*\|_F^2 + G^2 \sum_{t=1}^{T} \eta_t^2}{2 \sum_{t=1}^{T} \eta_t},
\]
where $G$ denotes an upper bound on the norms of the subgradients, i.e., $\|G^t\|_F \le G$ for all $t$.

• Decide on an optimal value for the learning rate $\eta_t$ and simplify the convergence rate. (An illustrative numerical sketch of the projection and the update rule from parts (a) and (c) is given below.)
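For illustration only, the following is a minimal Python/NumPy sketch of the projection from part (a) and the projected subgradient update from part (c) on a toy instance. The choices $f(X) = \|X - B\|_1$ (entrywise, with $B$ a hypothetical data matrix), the values of $M$, $\lambda$, $T$, and the step size $\eta_t = 1/\sqrt{t}$ are assumptions made for this sketch, not part of the problem.

import numpy as np

# Illustrative sketch only: projected subgradient descent for
#   min_{||X||_F <= M}  f(X) + lam * ||X||_*
# with the toy choice f(X) = ||X - B||_1 (entrywise).
rng = np.random.default_rng(0)
n, d = 10, 6
B = rng.standard_normal((n, d))           # hypothetical data matrix (assumption)
lam, M, T = 0.5, 5.0, 500                 # regularization, radius, iterations (assumptions)

def project_C(X, M):
    # Part (a): Pi_C(X) = min(1, M / ||X||_F) * X scales X back onto the Frobenius ball.
    norm = np.linalg.norm(X, ord="fro")
    return min(1.0, M / norm) * X if norm > 0 else X

def subgrad_F(X):
    # One element of dF(X): sign(X - B) is a subgradient of the entrywise l1 loss,
    # and lam * U @ Vt (from a thin SVD of X) is a valid subgradient of lam * ||X||_*
    # (cf. the characterization in [AW]).
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return np.sign(X - B) + lam * (U @ Vt)

def objective(X):
    return np.abs(X - B).sum() + lam * np.linalg.svd(X, compute_uv=False).sum()

X = np.zeros((n, d))                      # initial solution X^0 = 0
best = objective(X)
for t in range(1, T + 1):
    eta = 1.0 / np.sqrt(t)                # diminishing step size (an assumption)
    X = project_C(X - eta * subgrad_F(X), M)
    best = min(best, objective(X))
print("best objective value found:", best)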

Solution. 

References
[AW] G. Alistair Watson. Characterization of the subdifferential of some matrix norms. Linear
Algebra and its Applications, 1992.
