
EE364a, Winter 2007-08                                                Prof. S. Boyd

EE364a Homework 6 solutions


6.9 Minimax rational function fitting. Show that the following problem is quasiconvex:

    minimize   max_{i=1,...,k} |p(t_i)/q(t_i) − y_i|

where

    p(t) = a_0 + a_1 t + a_2 t^2 + · · · + a_m t^m,    q(t) = 1 + b_1 t + · · · + b_n t^n,

and the domain of the objective function is defined as

    D = {(a, b) ∈ R^{m+1} × R^n | q(t) > 0 for α ≤ t ≤ β}.

In this problem we fit a rational function p(t)/q(t) to given data, while constraining
the denominator polynomial to be positive on the interval [α, β]. The optimization
variables are the numerator and denominator coefficients a_i, b_i. The interpolation
points t_i ∈ [α, β] and desired function values y_i, i = 1, . . . , k, are given.
Solution. Let’s show the objective is quasiconvex. Its domain is convex. Since
q(t_i) > 0 for i = 1, . . . , k, we have

    max_{i=1,...,k} |p(t_i)/q(t_i) − y_i| ≤ γ

if and only if
    |p(t_i) − y_i q(t_i)| ≤ γ q(t_i),   i = 1, . . . , k,
which defines a convex set in the variables a and b, since the lefthand side is convex,
and the righthand side is linear. We can further express these inequalities as a set of
2k linear inequalities,

    −γ q(t_i) ≤ p(t_i) − y_i q(t_i) ≤ γ q(t_i),   i = 1, . . . , k.

Solutions to additional exercises
1. Minimax rational fit to the exponential. (See exercise 6.9.) We consider the specific
problem instance with data
    t_i = −3 + 6(i − 1)/(k − 1),   y_i = e^{t_i},   i = 1, . . . , k,

where k = 201. (In other words, the data are obtained by uniformly sampling the
exponential function over the interval [−3, 3].) Find a function of the form

    f(t) = (a_0 + a_1 t + a_2 t^2)/(1 + b_1 t + b_2 t^2)

that minimizes max_{i=1,...,k} |f(t_i) − y_i|. (We require that 1 + b_1 t_i + b_2 t_i^2 > 0 for
i = 1, . . . , k.)
Find optimal values of a_0, a_1, a_2, b_1, b_2, and give the optimal objective value, computed
to an accuracy of 0.001. Plot the data and the optimal rational function fit on the same
plot. On a different plot, give the fitting error, i.e., f(t_i) − y_i.
Hint. You can use strcmp(cvx_status,'Solved'), after cvx_end, to check if a feasibility
problem is feasible.
Solution. The objective function (and therefore also the problem) is not convex, but
it is quasiconvex. We have max_{i=1,...,k} |f(t_i) − y_i| ≤ γ if and only if

    | (a_0 + a_1 t_i + a_2 t_i^2)/(1 + b_1 t_i + b_2 t_i^2) − y_i | ≤ γ,   i = 1, . . . , k.

This is equivalent to (since the denominator is positive)


    |a_0 + a_1 t_i + a_2 t_i^2 − y_i (1 + b_1 t_i + b_2 t_i^2)| ≤ γ (1 + b_1 t_i + b_2 t_i^2),   i = 1, . . . , k,
which is a set of 2k linear inequalities in the variables a and b (for fixed γ). In particular,
this shows the objective is quasiconvex. (In fact, it is a generalized linear fractional
function.)
To solve the problem we can use a bisection method, solving an LP feasibility problem
at each step. At each step we select some value of γ and solve the feasibility problem
    find        a, b
    subject to  |a_0 + a_1 t_i + a_2 t_i^2 − y_i (1 + b_1 t_i + b_2 t_i^2)| ≤ γ (1 + b_1 t_i + b_2 t_i^2),   i = 1, . . . , k,

with variables a and b. (Note that as long as γ > 0, the condition that the denominator
is positive is enforced automatically.) This can be turned into the LP feasibility
problem

    find        a, b
    subject to  a_0 + a_1 t_i + a_2 t_i^2 − y_i (1 + b_1 t_i + b_2 t_i^2) ≤ γ (1 + b_1 t_i + b_2 t_i^2),   i = 1, . . . , k,
                a_0 + a_1 t_i + a_2 t_i^2 − y_i (1 + b_1 t_i + b_2 t_i^2) ≥ −γ (1 + b_1 t_i + b_2 t_i^2),   i = 1, . . . , k.
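Since bisection halves the interval [l, u] at each step, the number of LP feasibility problems
solved is easy to bound: with the initial bracket l = 0, u = e^3 and tolerance 10^{-3} used in
the code below, it is ⌈log_2(e^3/10^{-3})⌉ = 15.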

The following Matlab code solves the problem for the particular problem instance.

k=201;
t=(-3:6/(k-1):3)';
y=exp(t);

Tpowers=[ones(k,1) t t.^2];

u=exp(3); l=0; % initial upper and lower bounds


bisection_tol=1e-3; % bisection tolerance

while u-l>= bisection_tol


gamma=(l+u)/2;
cvx_begin % solve the feasibility problem
cvx_quiet(true);
variable a(3);
variable b(2);
subject to
abs(Tpowers*a-y.*(Tpowers*[1;b])) <= gamma*Tpowers*[1;b];
cvx_end

if strcmp(cvx_status,'Solved')
u=gamma;
a_opt=a;
b_opt=b;
objval_opt=gamma;
else
l=gamma;
end
end

y_fit=Tpowers*a_opt./(Tpowers*[1;b_opt]);

figure(1);
plot(t,y,'b', t,y_fit,'r+');
xlabel('t');
ylabel('y');

figure(2);
plot(t, y_fit-y);
xlabel('t');
ylabel('err');
The optimal values are

    a_0 = 1.0099,   a_1 = 0.6117,   a_2 = 0.1134,   b_1 = −0.4147,   b_2 = 0.0485,

and the optimal objective value is 0.0233. We also get the following plots.

[Figure 1: Chebyshev fit of the exponential with a rational function. The line represents
the data and the crosses the fitted points.]

[Figure 2: Fitting error for the Chebyshev fit of the exponential with a rational function.]
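As a quick sanity check, the achieved Chebyshev error can be evaluated directly after the
bisection code above has run; it should come out near the reported optimal value:

% achieved maximum absolute fitting error; should be about 0.023
max(abs(Tpowers*a_opt./(Tpowers*[1;b_opt]) - y))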
2. Maximum likelihood prediction of team ability. A set of n teams compete in a tournament.
We model each team's ability by a number a_j ∈ [0, 1], j = 1, . . . , n. When teams j and k
play each other, the probability that team j wins is equal to prob(a_j − a_k + v > 0), where
v ∼ N(0, σ^2).
You are given the outcome of m past games. These are organized as

    (j^(i), k^(i), y^(i)),   i = 1, . . . , m,

meaning that game i was played between teams j^(i) and k^(i); y^(i) = 1 means that team
j^(i) won, while y^(i) = −1 means that team k^(i) won. (We assume there are no ties.)

(a) Formulate the problem of finding the maximum likelihood estimate of team abilities,
â ∈ R^n, given the outcomes, as a convex optimization problem. You will find the game
incidence matrix A ∈ R^{m×n}, defined as

    A_il = y^(i) if l = j^(i),   A_il = −y^(i) if l = k^(i),   A_il = 0 otherwise,

useful.
The prior constraints â_i ∈ [0, 1] should be included in the problem formulation.
Also, we note that if a constant is added to all team abilities, there is no change in

the probabilities of game outcomes. This means that â is determined only up to


a constant, like a potential. But this doesn’t affect the ML estimation problem,
or any subsequent predictions made using the estimated parameters.
(b) Find â for the team data given in team_data.m, in the matrix train. (This
matrix gives the outcomes for a tournament in which each team plays each other
team once.)
CVX does not support the concave function log Φ, where Φ is the cumulative dis-
tribution of a unit Gaussian, but we have provided a good enough approximation,
log_normcdf, on the course web site. This function is overloaded to handle vector
inputs (elementwise).
You can form A using the commands
A = sparse(1:m,train(:,1),train(:,3),m,n) + ...
sparse(1:m,train(:,2),-train(:,3),m,n);
(c) Use the maximum likelihood estimate â found in part (b) to predict the outcomes of
next year's tournament games, given in the matrix test, using ŷ^(i) = sign(â_{j^(i)} − â_{k^(i)}).
Compare these predictions with the actual outcomes, given in the third column of test.
Give the fraction of correctly predicted outcomes.
The games played in train and test are the same, so another, simpler method for
predicting the outcomes in test is to just assume the team that won last year's match
will also win this year's match. Give the percentage of correctly predicted outcomes
using this simple method.

Solution.

(a) The likelihood of the outcomes y given a is

    p(y|a) = ∏_{i=1}^{m} Φ( (1/σ) y^(i) (a_{j^(i)} − a_{k^(i)}) ),

where Φ is the cumulative distribution of the standard normal. (Each factor follows from
prob(a_j − a_k + v > 0) = prob(v > −(a_j − a_k)) = Φ((a_j − a_k)/σ), since v ∼ N(0, σ^2).)
The log-likelihood function is therefore

    l(a) = log p(y|a) = Σ_{i=1}^{m} log Φ( (1/σ)(Aa)_i ).

This is a concave function.


The maximum likelihood estimate â is any solution of

    maximize    l(a)
    subject to  0 ⪯ a ⪯ 1.

This is a convex optimization problem since the objective, which is maximized, is


concave, and the constraints are 2n linear inequalities.
(b) The following code solves the problem
% Form adjacency matrix
A1 = sparse(1:m,train(:,1),train(:,3),m,n);
A2 = sparse(1:m,train(:,2),-train(:,3),m,n);
A = A1+A2;

% Estimate abilities
cvx_begin
variable a_hat(n)
minimize(-sum(log_normcdf(A*a_hat/sigma)))
subject to
a_hat >= 0
a_hat <= 1
cvx_end
Using this code we get that â = (1.0, 0.0, 0.68, 0.37, 0.79, 0.58, 0.38, 0.09, 0.67, 0.58).
(c) The following code is used to predict the outcomes in the test set
% Estimate errors in test set
res = sign(a_hat(test(:,1))-a_hat(test(:,2)));
Pml = 1-length(find(res-test(:,3)))/m_test
Ply = 1-length(find(train(:,3)-test(:,3)))/m_test
The maximum likelihood estimate gives a correct prediction of 86.7% of the games
in test. On the other hand, 75.6% of the games in test have the same outcome
as the games in train.
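If one also wants the model's predicted win probabilities rather than just the signs, they
follow directly from prob(j beats k) = Φ((a_j − a_k)/σ). A minimal sketch, assuming normcdf
is available (Statistics Toolbox) and a_hat, test, and sigma are as above:

% predicted probability that the first-listed team wins each test game
p_win = normcdf((a_hat(test(:,1)) - a_hat(test(:,2)))/sigma);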

3. Piecewise-linear fitting. In many applications some function in the model is not given
by a formula, but instead as tabulated data. The tabulated data could come from
empirical measurements, historical data, numerically evaluating some complex expres-
sion or solving some problem, for a set of values of the argument. For use in a convex
optimization model, we then have to fit these data with a convex function that is com-
patible with the solver or other system that we use. In this problem we explore a very
simple problem of this general type.
Suppose we are given the data (x_i, y_i), i = 1, . . . , m, with x_i, y_i ∈ R. We will assume
that the x_i are sorted, i.e., x_1 < x_2 < · · · < x_m. Let a_0 < a_1 < a_2 < · · · < a_K be a
set of fixed knot points, with a_0 ≤ x_1 and a_K ≥ x_m. Explain how to find the convex
piecewise-linear function f, defined over [a_0, a_K], with knot points a_i, that minimizes
the least-squares fitting criterion

    Σ_{i=1}^{m} (f(x_i) − y_i)^2.

You must explain what the variables are and how they parametrize f , and how you
ensure convexity of f .
Hints. One method to solve this problem is based on the Lagrange basis, f_0, . . . , f_K,
which are the piecewise-linear functions that satisfy

    f_j(a_i) = δ_ij,   i, j = 0, . . . , K.

Another method is based on defining f(x) = α_i x + β_i for x ∈ (a_{i−1}, a_i]. You then
have to add conditions on the parameters α_i and β_i to ensure that f is continuous and
convex.
Apply your method to the data in the file pwl_fit_data.m, which contains data with
x_j ∈ [0, 1]. Find the best affine fit (which corresponds to a = (0, 1)), and the best
piecewise-linear convex function fit for 1, 2, and 3 internal knot points, evenly spaced
in [0, 1]. (For example, for 3 internal knot points we have a_0 = 0, a_1 = 0.25, a_2 = 0.50,
a_3 = 0.75, a_4 = 1.) Give the least-squares fitting cost for each one. Plot the data and
the piecewise-linear fits found. Express each function in the form

    f(x) = max_{i=1,...,K} (α_i x + β_i).

(In this form the function is easily incorporated into an optimization problem.)
Solution. Following the hint, we will use the Lagrange basis functions f_0, . . . , f_K.
These can be expressed as

    f_0(x) = ( (a_1 − x)/(a_1 − a_0) )_+,

    f_i(x) = ( min( (x − a_{i−1})/(a_i − a_{i−1}), (a_{i+1} − x)/(a_{i+1} − a_i) ) )_+,   i = 1, . . . , K − 1,

and

    f_K(x) = ( (x − a_{K−1})/(a_K − a_{K−1}) )_+.

The function f can be parametrized as

    f(x) = Σ_{i=0}^{K} z_i f_i(x),

where z_i = f(a_i), i = 0, . . . , K. We will use z = (z_0, . . . , z_K) to parametrize f. The
least-squares fitting criterion is then

    J = Σ_{i=1}^{m} (f(x_i) − y_i)^2 = ||F z − y||_2^2,

where F ∈ R^{m×(K+1)} is the matrix

    F_ij = f_j(x_i),   i = 1, . . . , m,   j = 0, . . . , K.

(We index the columns of F from 0 to K here.)


We must add the constraint that f is convex. This is the same as the condition that
the slopes of the segments are nondecreasing, i.e.,
    (z_{i+1} − z_i)/(a_{i+1} − a_i) ≥ (z_i − z_{i−1})/(a_i − a_{i−1}),   i = 1, . . . , K − 1.

This is a set of linear inequalities in z. Thus, the best PWL convex fit can be found
by solving the QP

    minimize    ||F z − y||_2^2
    subject to  (z_{i+1} − z_i)/(a_{i+1} − a_i) ≥ (z_i − z_{i−1})/(a_i − a_{i−1}),   i = 1, . . . , K − 1.

The following code solves this problem for the data in pwl_fit_data.

pwl_fit_data;   % load the data (x, y)
cvx_quiet(true)
figure
plot(x,y,'k:','linewidth',2)
hold on

% Single line
p = [x ones(100,1)]\y;
alpha = p(1)
beta = p(2)
plot(x,alpha*x+beta,'b','linewidth',2)
mse = norm(alpha*x+beta-y)^2

for K = 2:4
% Generate Lagrange basis
a = (0:(1/K):1)';
F = max((a(2)-x)/(a(2)-a(1)),0);
for k = 2:K
a_1 = a(k-1);
a_2 = a(k);
a_3 = a(k+1);
f = max(0,min((x-a_1)/(a_2-a_1),(a_3-x)/(a_3-a_2)));
F = [F f];
end
f = max(0,(x-a(K))/(a(K+1)-a(K)));
F = [F f];

% Solve problem
cvx_begin
variable z(K+1)
minimize(norm(F*z-y))
subject to
(z(3:end)-z(2:end-1))./(a(3:end)-a(2:end-1)) >=...
(z(2:end-1)-z(1:end-2))./(a(2:end-1)-a(1:end-2))
cvx_end

% Calculate alpha and beta


alpha = (z(2:end)-z(1:end-1))./(a(2:end)-a(1:end-1))
beta = z(2:end)-alpha(1:end).*a(2:end)

% Plot solution
y2 = F*z;
mse = norm(y2-y)^2
if K==2
plot(x,y2,'r','linewidth',2)
elseif K==3
plot(x,y2,'g','linewidth',2)
else
plot(x,y2,'m','linewidth',2)
end

end
xlabel('x')
ylabel('y')

[Figure 3: Piecewise-linear approximations for K = 1, 2, 3, 4.]

This generates figure 3. We can see that the approximation improves as K increases.
The following table shows the result of this approximation.

K    α_1, . . . , α_K               β_1, . . . , β_K                  J
1    1.91                           −0.87                             12.73
2    −0.27, 4.09                    −0.33, −2.51                       2.62
3    −1.80, 2.67, 4.25              −0.10, −1.59, −2.65                0.60
4    −3.15, 2.11, 2.68, 4.90        0.03, −1.29, −1.57, −3.23          0.22
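As a check on the max-affine form, f can also be evaluated as max_i (α_i x + β_i) at the
data points and compared with the Lagrange-basis values F z. A minimal sketch, meant to be
run inside the for K = 2:4 loop above, right after alpha and beta are computed:

% evaluate f(x) = max_i (alpha_i*x + beta_i) at the data points; should match F*z
fhat = max(x*alpha' + ones(size(x))*beta', [], 2);
max_affine_gap = norm(fhat - F*z)   % should be (nearly) zero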

There is another way to solve this problem. We are looking for a piecewise linear
function. If we have at least one internal knot (K ≥ 2), the function should satisfy the
two following constraints:
• convexity: α1 ≤ α2 ≤ · · · ≤ αK
• continuity: αi ai + βi = αi+1 ai + βi+1 , i = 1, . . . , K − 1.
Therefore, the optimization problem is

    minimize    Σ_{i=1}^{m} (f(x_i) − y_i)^2
    subject to  α_i ≤ α_{i+1},   i = 1, . . . , K − 1
                α_i a_i + β_i = α_{i+1} a_i + β_{i+1},   i = 1, . . . , K − 1.
Reformulating the problem by representing f(x_i) in matrix form, we get

    minimize    || diag(x) F α + F β − y ||_2
    subject to  α_i ≤ α_{i+1},   i = 1, . . . , K − 1
                α_i a_i + β_i = α_{i+1} a_i + β_{i+1},   i = 1, . . . , K − 1,

where the variables are α ∈ R^K and β ∈ R^K, the problem data are x ∈ R^m and y ∈ R^m, and

    F_ij = 1 if a_{j−1} < x_i ≤ a_j (or if j = 1 and x_i = a_0),   F_ij = 0 otherwise.

% another approach for PWL fitting problem


clear all;
pwl_fit_data;
m = length(x);
xp = 0:0.001:1; % for fine-grained pwl function plot
mp = length(xp);
yp = [];

for K = 1:4 % K segments, i.e., K-1 internal knots (K = 1 gives the affine fit)

a = [0:1/K:1]'; % a_0,...,a_K
% matrix for sum f(x_i)
F = sparse(1:m,max(1,ceil(x*K)),1,m,K);

% solve problem
cvx_begin
variables alpha(K) beta(K)
minimize( norm(diag(x)*F*alpha+F*beta-y) )
subject to
if (K>=2)
alpha(1:K-1).*a(2:K)+beta(1:K-1) == alpha(2:K).*a(2:K)+beta(2:K)
alpha(1:K-1) <= alpha(2:K)
end
cvx_end

fp = sparse(1:mp,max(1,ceil(xp*K)),1,mp,K);
yp = [yp diag(xp)*fp*alpha+fp*beta];
end
plot(x,y,'b.',xp,yp);

4. Robust least-squares with interval coefficient matrix. An interval matrix in R^{m×n} is a
matrix whose entries are intervals:

    A = {A ∈ R^{m×n} | |A_ij − Ā_ij| ≤ R_ij, i = 1, . . . , m, j = 1, . . . , n}.

The matrix Ā ∈ R^{m×n} is called the nominal value or center value, and R ∈ R^{m×n},
which is elementwise nonnegative, is called the radius.

The robust least-squares problem, with interval matrix, is

    minimize   sup_{A∈A} ||Ax − b||_2,

with optimization variable x ∈ R^n. The problem data are A (i.e., Ā and R) and
b ∈ R^m. The objective, as a function of x, is called the worst-case residual norm. The
robust least-squares problem is evidently a convex optimization problem.

(a) Formulate the interval matrix robust least-squares problem as a standard opti-
mization problem, e.g., a QP, SOCP, or SDP. You can introduce new variables
if needed. Your reformulation should have a number of variables and constraints
that grows linearly with m and n, and not exponentially.
(b) Consider the specific problem instance with m = 4, n = 3,
        [ 60 ± 0.05   45 ± 0.05    −8 ± 0.05 ]        [ −6 ]
    A = [ 90 ± 0.05   30 ± 0.05   −30 ± 0.05 ],   b = [ −3 ].
        [  0 ± 0.05   −8 ± 0.05    −4 ± 0.05 ]        [ 18 ]
        [ 30 ± 0.05   10 ± 0.05   −10 ± 0.05 ]        [ −9 ]

(The first part of each entry in A gives Ā_ij; the second gives R_ij, which are all 0.05
here.) Find the solution x_ls of the nominal problem (i.e., minimize ||Āx − b||_2),
and the robust least-squares solution x_rls. For each of these, find the nominal residual
norm, and also the worst-case residual norm. Make sure the results make sense.

Solution:
(a) The problem is equivalent to

        minimize   sup_{A∈A} ||Ax − b||_2^2,

    which can be reformulated as

        minimize    y^T y
        subject to  −y ⪯ Ax − b ⪯ y for all A ∈ A.
    We have

        sup_{A∈A} (Ax − b)_i = Σ_{j=1}^{n} (Ā_ij x_j + R_ij |x_j|) − b_i,
        inf_{A∈A} (Ax − b)_i = Σ_{j=1}^{n} (Ā_ij x_j − R_ij |x_j|) − b_i.

    We can therefore write the problem as

        minimize    y^T y
        subject to  Āx + R|x| − b ⪯ y
                    Āx − R|x| − b ⪰ −y,

    where |x| ∈ R^n is the vector with elements |x|_i = |x_i|, or equivalently as the QP

        minimize    y^T y
        subject to  Āx + Rz − b ⪯ y
                    Āx − Rz − b ⪰ −y
                    −z ⪯ x ⪯ z.

    The variables are x ∈ R^n, y ∈ R^m, z ∈ R^n.


    The problem also has an alternative formulation: the trick is to find an alternative
    expression for the worst-case residual norm as a function of x. Let
    f(x) = sup_{A∈A} ||Ax − b||_2^2. Then

        f(x) = sup{ ||Ax − b||_2^2 | A = Ā + ∆, |∆_ij| ≤ R_ij, i = 1, . . . , m, j = 1, . . . , n }
             = sup{ ||Āx + ∆x − b||_2^2 | |∆_ij| ≤ R_ij, i = 1, . . . , m, j = 1, . . . , n }
             = sup{ ||r + ∆x||_2^2 | |∆_ij| ≤ R_ij, i = 1, . . . , m, j = 1, . . . , n },

    where r = Āx − b.
    Since ||r + ∆x||_2^2 = Σ_{i=1}^{m} (r_i + Σ_{j=1}^{n} ∆_ij x_j)^2 = Σ_{i=1}^{m} |r_i + Σ_{j=1}^{n} ∆_ij x_j|^2,
    it's easy to see that this expression is separable in the rows of ∆. So in order to find an
    expression for f(x), we just need to find an expression for
    sup{ |r_i + Σ_{j=1}^{n} ∆_ij x_j| : |∆_ij| ≤ R_ij, j = 1, . . . , n }. The supremum is achieved by
    taking ∆_ij = R_ij sign(x_j) if r_i ≥ 0 and ∆_ij = −R_ij sign(x_j) if r_i < 0. The supremum
    is equal to |r_i| + Σ_{j=1}^{n} R_ij |x_j|. Therefore

        f(x) = Σ_{i=1}^{m} ( |r_i| + Σ_{j=1}^{n} R_ij |x_j| )^2
             = || |Āx − b| + R|x| ||_2^2.

    The robust least-squares problem can then be reformulated as

        minimize   || |Āx − b| + R|x| ||_2.

    Note that the objective function is convex since the Euclidean norm is convex and
    nondecreasing on R^m_+, and |Āx − b| + R|x| is convex and nonnegative.

(b) The following script computes the least-squares and the robust solutions and also
computes, for each one, the nominal and the worst-case residual norms.

% input data
A_bar = [ 60 45 -8; ...
90 30 -30; ...
0 -8 -4; ...
30 10 -10];
d = .05;
R = d*ones(4,3);

b = [ -6; -3; 18; -9];

% least-squares solution
x_ls = A_bar\b;

% robust least-squares solution


cvx_begin
variables x(3) y(4) z(3)
minimize ( norm( y ) )
A_bar*x + R*z - b <= y
A_bar*x - R*z - b >= -y
x <= z
x + z >= 0
cvx_end

% computing nominal residual norms


nom_res_ls = norm(A_bar*x_ls - b);
nom_res_rls = norm(A_bar*x - b);

% computing worst-case residual norms


r = A_bar*x_ls - b;
Delta = zeros(4,3);
for i=1:length(r)
if r(i) < 0
Delta(i,:) = -d*sign(x_ls');
else
Delta(i,:) = d*sign(x_ls');
end
end
wc_res_ls = norm(r + Delta*x_ls);
wc_res_rls = cvx_optval;

% display
disp('Residual norms for the nominal problem when using LS solution: ');
disp(nom_res_ls);
disp('Residual norms for the nominal problem when using robust solution: ');
disp(nom_res_rls);
disp('Residual norms for the worst-case problem when using LS solution: ');
disp(wc_res_ls);
disp('Residual norms for the worst-case problem when using robust solution: ');
disp(wc_res_rls);

The robust least-square solution can also be found using the following script:

cvx_begin
variables x(3) t(4)
minimize ( norm ( t ) )
abs(A_bar*x - b) + R*abs(x) <= t
cvx_end
This script returns the following results:
Residual norms for the nominal problem when using LS solution:
7.5895

Residual norms for the nominal problem when using robust solution:
17.7106

Residual norms for the worst-case problem when using LS solution:


26.7012

Residual norms for the worst-case problem when using robust solution:
17.7940
We also generated, for fun, the following histograms showing the distribution of
the residual norms for the case where x = x_ls and x = x_rls. Those were obtained
by creating 1000 instances of A by sampling A_ij uniformly between Ā_ij − R_ij and
Ā_ij + R_ij, and then evaluating the residual norm for each A and each of the two
solutions.
[Histogram: residual norm distribution for x = x_ls.]

[Histogram: residual norm distribution for x = x_rls.]

The following script generates these histograms:


% Monte-Carlo simulation
N = 1000;
res_ls = zeros(N,1);
res_rls = zeros(N,1);
for k=1:N
Delta = d*(2*rand(4,3)-1);
A = A_bar + Delta;
res_ls(k) = norm(A*x_ls - b);
res_rls(k) = norm(A*x - b);
end
figure;
hist(res_ls,50);
figure;
hist(res_rls,50);

5. Total variation image interpolation. A grayscale image is represented as an m × n
matrix of intensities U^orig. You are given the values U^orig_ij, for (i, j) ∈ K, where
K ⊂ {1, . . . , m} × {1, . . . , n}. Your job is to interpolate the image, by guessing the missing
values. The reconstructed image will be represented by U ∈ R^{m×n}, where U satisfies
the interpolation conditions U_ij = U^orig_ij for (i, j) ∈ K.
The reconstruction is found by minimizing a roughness measure subject to the interpolation
conditions. One common roughness measure is the ℓ2 variation (squared),

    Σ_{i=2}^{m} Σ_{j=2}^{n} ( (U_ij − U_{i−1,j})^2 + (U_ij − U_{i,j−1})^2 ).

Another method minimizes instead the total variation,

    Σ_{i=2}^{m} Σ_{j=2}^{n} ( |U_ij − U_{i−1,j}| + |U_ij − U_{i,j−1}| ).

Evidently both methods lead to convex optimization problems.


Carry out ℓ2 and total variation interpolation on the problem instance with data given
in tv_img_interp.m. This will define m, n, and matrices Uorig and Known. The matrix
Known is m × n, with (i, j) entry one if (i, j) ∈ K, and zero otherwise. The mfile also
has skeleton plotting code. (We give you the entire original image so you can compare
your reconstruction to the original; obviously your solution cannot access U^orig_ij for
(i, j) ∉ K.)
Solution. The code for the interpolation is very simple. For ℓ2 interpolation, the code
is the following.

cvx_begin
variable Ul2(m, n);
Ul2(Known) == Uorig(Known); % Fix known pixel values.
Ux = Ul2(2:end,2:end) - Ul2(2:end,1:end-1); % x (horiz) differences
Uy = Ul2(2:end,2:end) - Ul2(1:end-1,2:end); % y (vert) differences
minimize(norm([Ux(:); Uy(:)], 2)); % l2 roughness measure
cvx_end

For total variation interpolation, we use the following code.

cvx_begin
variable Utv(m, n);
Utv(Known) == Uorig(Known); % Fix known pixel values.
Ux = Utv(2:end,2:end) - Utv(2:end,1:end-1); % x (horiz) differences
Uy = Utv(2:end,2:end) - Utv(1:end-1,2:end); % y (vert) differences
minimize(norm([Ux(:); Uy(:)], 1)); % tv roughness measure
cvx_end

We get the following images

[Figure: original image, obscured image, ℓ2 reconstructed image, and total variation
reconstructed image.]
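The four panels can be produced with a short script. A minimal sketch of our own (the
skeleton plotting code in tv_img_interp.m may differ), assuming Known can be used as a
logical mask:

% display the original, obscured, and reconstructed images side by side
Uobs = Uorig; Uobs(~Known) = 0;   % black out the unknown pixels for display
figure; colormap(gray);
subplot(2,2,1); imagesc(Uorig); axis image; title('Original image');
subplot(2,2,2); imagesc(Uobs);  axis image; title('Obscured image');
subplot(2,2,3); imagesc(Ul2);   axis image; title('l2 reconstructed image');
subplot(2,2,4); imagesc(Utv);   axis image; title('Total variation reconstructed image');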

6. Relaxed and discrete A-optimal experiment design. This problem concerns the A-
optimal experiment design problem, described on page 387, with data generated as
follows.

n = 5; % dimension of parameters to be estimated


p = 20; % number of available types of measurements
m = 30; % total number of measurements to be carried out
randn('state', 0);
V=randn(n,p); % columns are vi, the possible measurement vectors

Solve the relaxed A-optimal experiment design problem,


    minimize    (1/m) tr( Σ_{i=1}^{p} λ_i v_i v_i^T )^{-1}
    subject to  1^T λ = 1,   λ ⪰ 0,

with variable λ ∈ Rp . Find the optimal point λ⋆ and the associated optimal value of
the relaxed problem. This optimal value is a lower bound on the optimal value of the
discrete A-optimal experiment design problem,
    minimize    tr( Σ_{i=1}^{p} m_i v_i v_i^T )^{-1}
    subject to  m_1 + · · · + m_p = m,   m_i ∈ {0, . . . , m},   i = 1, . . . , p,

with variables m1 , . . . , mp . To get a suboptimal point for this discrete problem, round
the entries in mλ⋆ to obtain integers m̂i . If needed, adjust these by hand or some other
method to ensure that they sum to m, and compute the objective value obtained. This
is, of course, an upper bound on the optimal value of the discrete problem. Give the
gap between this upper bound and the lower bound obtained from the relaxed problem.
Note that the two objective values can be interpreted as mean-square estimation error
E ||x̂ − x||_2^2.
Solution. The objective of the relaxed problem is convex, so it is a convex problem.
Expressing it in cvx requires a little work. We’d like to write the objective as

minimize ((1/m)*trace(inv(V*diag(lambda)*V')))

but this won’t work, because cvx doesn’t know about matrix convex functions. Instead,
we can express the objective as a sum of matrix fractional functions,
    minimize    (1/m) Σ_{k=1}^{n} e_k^T ( Σ_{i=1}^{p} λ_i v_i v_i^T )^{-1} e_k
    subject to  1^T λ = 1,   λ ⪰ 0,

where e_k ∈ R^n is the kth unit vector. (Note that e is defined in exercise 6.9 as the
estimation error vector, so e_k could also mean the kth entry of the error vector. But
here, clearly, e_k is the kth unit vector.)
We can express this in cvx using the function matrix_frac. The following code solves
the problem.

n = 5; % dimension
p = 20; % number of available types of measurements
m = 30; % total number of measurements to be carried out
randn('state', 0);
V=randn(n,p); % columns are vi, the possible measurement vectors

cvx_begin
variable lambda(p)
obj = 0;
for k=1:n
ek = zeros(n,1);
ek(k)=1;
obj = obj + (1/m)*matrix_frac(ek,V*diag(lambda)*V');
end
minimize( obj )
subject to
sum(lambda) == 1
lambda >= 0

cvx_end

lower_bound = cvx_optval

t = -0.00; % small offset chosen by hand to make rounding work out.


% for this problem data, none is needed!
m_rnd = pos(round(m*lambda+t));
sum(m_rnd) % should be == m

% now find objective value of rounded experiment design


upper_bound = trace(inv(V*diag(m_rnd/m)*V'))/m
gap = upper_bound-lower_bound
rel_gap = gap/lower_bound
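As a quick sanity check after cvx_end, the relaxed objective can also be evaluated directly
(numerically, outside of cvx); it should agree with lower_bound:

% direct evaluation of the relaxed objective at the computed lambda
direct_lower_bound = (1/m)*trace(inv(V*diag(lambda)*V'))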

For this problem instance, simple rounding yielded m̂i that summed to m = 30, so
no adjustment of the rounded values is needed. The lower bound is 0.2481; the upper
bound is 0.2483. The gap is 0.00023, which is around 0.1%.
What this means is this: We have found a choice of 30 measurements, each one from
the set of 20 possible measurements, that yields a mean-square estimation error
E ||x̂ − x||_2^2 = 0.2483. We do not know whether this is the optimal choice of 30 measurements.
But we do know that this choice is no more than 0.1% suboptimal; the optimal choice
can achieve a mean-square error that is no smaller than 0.2481. Our experiment design
is, if not optimal, very nearly optimal. (In fact, it is very likely to be optimal.)

