Midterm Exam Solutions
Boyd
Oct. 27 – 28 or Oct. 28 – 29, 2006.
(In other words, dist(z, L) gives the closest distance between the point z and the line
L.)
We seek a point z⋆ ∈ R^n that minimizes the sum of the squares of the distances to the
lines,
\[ \sum_{i=1}^{m} \mathrm{dist}(z, L_i)^2. \]
The point z⋆ that minimizes this quantity is called the point of closest convergence.
(a) Explain how to find the point of closest convergence, given the lines (i.e., given
p1 , . . . , pm and v1 , . . . , vm ). If your method works provided some condition holds
(such as some matrix being full rank), say so. If you can relate this condition to
a simple one involving the lines, please do so.
(b) Find the point z ⋆ of closest convergence for the lines with data given in the Matlab
file line_conv_data.m. This file contains n × m matrices P and V whose columns
are the vectors p1 , . . . , pm , and v1 , . . . , vm , respectively. The file also contains
commands to plot the lines and the point of closest convergence (once you have
found it). Please include this plot with your solution.
Solution.
(a) There are several ways to solve this problem. Our first solution starts by working
out an explicit expression for dist(z, L_i). To find this distance we need to solve
the simple least-squares problem of minimizing ‖z − p_i − t v_i‖^2 over t ∈ R. The
optimal t is given by t⋆ = v_i^T (z − p_i) (which uses ‖v_i‖ = 1), so we have
\[ \mathrm{dist}(z, L_i) = \| z - p_i - v_i v_i^T (z - p_i) \| = \| (I - v_i v_i^T)(z - p_i) \|. \]
This makes sense: we recognize I − v_i v_i^T as projection onto the orthogonal
complement of the line through the origin in the direction v_i, i.e., projection onto the
plane with normal vector v_i.
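We can now set up our problem as a standard least-squares problem. We define
\[ A = \begin{bmatrix} I - v_1 v_1^T \\ \vdots \\ I - v_m v_m^T \end{bmatrix}, \qquad
   b = \begin{bmatrix} (I - v_1 v_1^T) p_1 \\ \vdots \\ (I - v_m v_m^T) p_m \end{bmatrix}, \]
so we can write
\[ \sum_{i=1}^{m} \mathrm{dist}(z, L_i)^2 = \| Az - b \|^2. \]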
Now we can solve the problem, assuming A is full rank (we'll come back to this).
The solution is
\[ z^\star = (A^T A)^{-1} A^T b = \left( mI - \sum_{i=1}^{m} v_i v_i^T \right)^{-1} \sum_{i=1}^{m} (p_i - v_i v_i^T p_i). \]
Finally, let's look at the conditions under which A is not full rank. Each n × n
block of A, i.e., I − v_i v_i^T, has rank exactly n − 1, with nullspace span(v_i). So
Az = 0 for some nonzero z only if z lies in span(v_i) for every i, which requires all
the v_i to be aligned (i.e., v_i = v_j or v_i = −v_j for all i, j). Geometrically, the
v_i are all aligned exactly when the lines are all parallel. So we can say that A
above is full rank, unless all the lines are parallel.
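As a quick sanity check of the closed-form expression above, here is a minimal Matlab sketch on made-up data (the point c, the directions V, and the offsets are hypothetical, not taken from line_conv_data.m). All four lines are constructed to pass through c, so the point of closest convergence should be c itself.

```matlab
% Hypothetical data: m = 4 lines in R^3 that all pass through the point c.
n = 3; m = 4;
c = [1; -2; 0.5];
V = [1 0 0 1; 0 1 0 1; 0 0 1 1];            % direction vectors as columns
V = V*diag(1./sqrt(sum(V.^2, 1)));          % normalize each column to unit length
P = c*ones(1, m) + V*diag([2 -1 3 0.5]);    % p_i = c + s_i*v_i lies on line i

S = zeros(n); r = zeros(n, 1);
for i = 1:m
    Proj = eye(n) - V(:,i)*V(:,i)';         % projection onto the complement of v_i
    S = S + Proj;                           % accumulates m*I - sum_i v_i*v_i'
    r = r + Proj*P(:,i);                    % accumulates sum_i (p_i - v_i*v_i'*p_i)
end
zstar = S \ r;                              % closed-form solution; S is n x n
```

Working with the n × n matrix S avoids forming the mn × n matrix A, which matters only when m is large.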
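Here is another solution of the problem (or really, a variation on the solution given
above). If we define
\[ C = \begin{bmatrix} -v_1 & 0 & \cdots & 0 & I \\ 0 & -v_2 & \cdots & 0 & I \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & -v_m & I \end{bmatrix}, \qquad
   d = \begin{bmatrix} p_1 \\ \vdots \\ p_m \end{bmatrix}, \qquad
   u = \begin{bmatrix} t_1 \\ \vdots \\ t_m \\ z \end{bmatrix}, \]
we have
\[ \sum_{i=1}^{m} \mathrm{dist}(z, L_i)^2 = \min_{t_1, \ldots, t_m} \| Cu - d \|^2, \]
and
\[ \min_{z} \sum_{i=1}^{m} \mathrm{dist}(z, L_i)^2 = \min_{u} \| Cu - d \|^2. \]
In the last expression, we are optimizing over the line parameters t_i and the point
z at the same time.
Therefore, assuming C is full rank, we have
\[ z^\star = \begin{bmatrix} 0 & I \end{bmatrix} (C^T C)^{-1} C^T d, \]
where [ 0 I ], with 0 ∈ R^{n×m} and I ∈ R^{n×n}, selects the last n entries of u.
This expands to the same solution we have above. And of course, C is full rank
if and only if A is, which occurs exactly when the lines are not all parallel.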
(b) The following code solves for the point of closest convergence using the two
different approaches and checks that the solutions are identical.
% assumes P and V (both n x m) have been loaded from line_conv_data.m
[n,m]=size(P);
% first solution
A=[];
b=[];
for i=1:m
A=[A;eye(n)-V(:,i)*V(:,i)'];
b=[b;(eye(n)-V(:,i)*V(:,i)')*P(:,i)];
end
zstar=A\b;
% second solution
C=zeros(n*m,m);
E=[];
d=[];
for i=1:m
E=[E;eye(n)];
C(n*(i-1)+1:n*i,i)=-V(:,i);
d=[d;P(:,i)];
end
C=[C E];
f=C\d;
zstar2=f(m+1:m+n);
% check that the two solutions agree (should be near zero)
norm(zstar-zstar2)
2. Estimating direction and amplitude of a light beam. A light beam with (nonnegative)
amplitude a comes from a direction d ∈ R^3, where ‖d‖ = 1. (This means the beam
travels in the direction −d.) The beam falls on m ≥ 3 photodetectors, each of which
generates a scalar signal that depends on the beam amplitude and direction, and the
direction in which the photodetector is pointed. Specifically, photodetector i generates
an output signal pi , with
pi = aα cos θi + vi ,
where θi is the angle between the beam direction d and the outward normal vector qi
of the surface of the ith photodetector, and α is the photodetector sensitivity. You can
interpret qi ∈ R3 , which we assume has norm one, as the direction the ith photodetector
is pointed. We assume that |θ_i| < 90°, i.e., the beam illuminates the top of the
photodetectors. The numbers vi are small measurement errors.
You are given the photodetector direction vectors q1 , . . . , qm ∈ R3 , the photodetector
sensitivity α, and the noisy photodetector outputs, p1 , . . . , pm ∈ R. Your job is to
estimate the beam direction d ∈ R3 (which is a unit vector), and a, the beam amplitude.
To describe unit vectors q_1, …, q_m and d in R^3 we will use azimuth and elevation,
defined as follows:
\[ q = \begin{bmatrix} \cos\phi \cos\theta \\ \cos\phi \sin\theta \\ \sin\phi \end{bmatrix}. \]
Here φ is the elevation (which will be between 0° and 90°, since all unit vectors in this
problem have positive 3rd component, i.e., point upward). The azimuth angle θ, which
varies from 0° to 360°, gives the direction in the plane spanned by the first and second
coordinates. If q = e_3 (i.e., the direction is directly up), the azimuth is undefined.
(a) Explain how to do this, using a method or methods from this class. The simpler
the method the better. If some matrix (or matrices) needs to be full rank for your
method to work, say so.
(b) Carry out your method on the data given in beam_estim_data.m. This mfile
defines p, the vector of photodetector outputs, a vector det_az, which gives the
azimuth angles of the photodetector directions, and a vector det_el, which gives
the elevation angles of the photodetector directions. Note that both of these are
given in degrees, not radians. Give your final estimate of the beam amplitude a
and beam direction d (in azimuth and elevation, in degrees).
Solution.
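(a) Since cos θ_i = q_i^T d/(‖q_i‖‖d‖) = q_i^T d (using ‖q_i‖ = ‖d‖ = 1), we have
\[ p_i = a \alpha q_i^T d + v_i. \]
Defining x = a d ∈ R^3 and the matrix Q ∈ R^{m×3} with rows q_1^T, …, q_m^T, this is
p = αQx + v, which is linear in x. Provided Q is full rank (i.e., has rank 3), the
least-squares estimate of x is
\[ \hat{x} = (1/\alpha)(Q^T Q)^{-1} Q^T p. \]
Since ‖d‖ = 1 and a ≥ 0, we then take â = ‖x̂‖ and d̂ = x̂/‖x̂‖.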
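(b) The following Matlab code carries out this method.
beam_estim_data
% build Q, whose ith row is the ith photodetector direction
Q=zeros(m,3);
for i=1:m
Q(i,:)=[ cosd(det_el(i))*cosd(det_az(i)),...
cosd(det_el(i))*sind(det_az(i)),...
sind(det_el(i)) ];
end
xhat=(1/alpha)*(Q\p);
ahat=norm(xhat);
dhat=xhat/norm(xhat);
elevation=asind(dhat(3))
% atan2 resolves the quadrant, which acosd(dhat(1)/cosd(elevation)) cannot
azimuth=mod(atan2(dhat(2),dhat(1))*180/pi,360)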
The result is â = 5.0107, φ̂d = 38.7174, and θ̂d = 77.6623.
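To see that the estimator recovers a known beam, here is a minimal sketch on synthetic, noise-free data. Every number below (sensitivity, amplitude, beam and detector angles) is hypothetical and chosen only for illustration; none of it comes from beam_estim_data.m.

```matlab
% Hypothetical test case: five detectors, noise-free outputs.
alpha = 0.5;                       % assumed sensitivity
a = 3;                             % true amplitude
d_az = 40; d_el = 60;              % true beam direction, in degrees
d = [cosd(d_el)*cosd(d_az); cosd(d_el)*sind(d_az); sind(d_el)];

det_az = [0; 90; 180; 270; 45];    % detector azimuths, in degrees
det_el = [70; 60; 80; 65; 85];     % detector elevations, in degrees
Q = [cosd(det_el).*cosd(det_az), cosd(det_el).*sind(det_az), sind(det_el)];

p = a*alpha*Q*d;                   % p_i = a*alpha*q_i'*d, with v_i = 0

xhat = (1/alpha)*(Q\p);            % least-squares estimate of x = a*d
ahat = norm(xhat);
dhat = xhat/ahat;
el_hat = asind(dhat(3));
az_hat = mod(atan2(dhat(2), dhat(1))*180/pi, 360);
```

With no noise the estimate is exact up to rounding, so ahat, el_hat, and az_hat should reproduce a, d_el, and d_az.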
3. Minimum energy input with way-point constraints. We consider a vehicle that moves
in R2 due to an applied force input. We will use a discrete-time model, with time
index k = 1, 2, . . .; time index k corresponds to time t = kh, where h > 0 is the sample
interval. The position at time index k is denoted by p(k) ∈ R2 , and the velocity by
v(k) ∈ R^2, for k = 1, …, K + 1. These are related by the equations
\[ p(k+1) = p(k) + h v(k), \qquad v(k+1) = (1-\alpha) v(k) + (h/m) f(k), \qquad k = 1, \ldots, K, \]
where f(k) ∈ R^2 is the force applied to the vehicle at time index k, m > 0 is the vehicle
mass, and α ∈ (0, 1) models drag on the vehicle: In the absence of any other force, the
vehicle velocity decreases by the factor 1 − α in each time index. (These formulas are
approximations of more accurate formulas that we will see soon, but for the purposes
of this problem, we consider them exact.) The vehicle starts at the origin, at rest, i.e.,
we have p(1) = 0, v(1) = 0. (We take k = 1 as the initial time, to simplify indexing.)
The problem is to find forces f(1), …, f(K) ∈ R^2 that minimize the cost function
\[ J = \sum_{k=1}^{K} \| f(k) \|^2, \]
subject to the way-point constraints
\[ p(k_i) = w_i, \qquad i = 1, \ldots, M, \]
where ki are integers between 1 and K. (These state that at the time ti = hki , the
vehicle must pass through the location wi ∈ R2 .) Note that there is no requirement
on the vehicle velocity at the way-points.
(a) Explain how to solve this problem, given all the problem data (i.e., h, α, m, K,
the way-points w1 , . . . , wM , and the way-point indices k1 , . . . , kM ).
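(b) Carry out your method on the specific problem instance with data h = 0.1, m = 1,
α = 0.1, K = 100, and the M = 4 way-points
\[ w_1 = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \quad w_2 = \begin{bmatrix} -2 \\ 3 \end{bmatrix}, \quad w_3 = \begin{bmatrix} 4 \\ -3 \end{bmatrix}, \quad w_4 = \begin{bmatrix} -4 \\ -2 \end{bmatrix}, \]
with way-point indices k_1 = 10, k_2 = 30, k_3 = 40, and k_4 = 80.
Solution.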
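(a) The equations of motion can be written as the discrete-time linear dynamical
system
\[ x(k+1) = Ax(k) + Bf(k), \qquad p(k) = Cx(k), \qquad x(1) = 0, \]
where
\[ x(k) = \begin{bmatrix} p(k) \\ v(k) \end{bmatrix}, \quad
   A = \begin{bmatrix} I & hI \\ 0 & (1-\alpha)I \end{bmatrix}, \quad
   B = \begin{bmatrix} 0 \\ (h/m)I \end{bmatrix}, \quad
   C = \begin{bmatrix} I & 0 \end{bmatrix}. \]
We can solve these state equations to get x(k) in terms of the input forces
f(1), …, f(k − 1):
\[ x(k) = \begin{bmatrix} A^{k-2}B & A^{k-3}B & \cdots & B \end{bmatrix}
   \begin{bmatrix} f(1) \\ f(2) \\ \vdots \\ f(k-1) \end{bmatrix}. \]
The position of the vehicle at way-point index k_i is therefore
\[ p(k_i) = Cx(k_i) = C \begin{bmatrix} A^{k_i-2}B & A^{k_i-3}B & \cdots & B \end{bmatrix}
   \begin{bmatrix} f(1) \\ f(2) \\ \vdots \\ f(k_i-1) \end{bmatrix}. \]
We can write the way-point constraint p(k_i) = w_i as
\[ w_i = C \begin{bmatrix} A^{k_i-2}B & A^{k_i-3}B & \cdots & B & 0 & \cdots & 0 \end{bmatrix}
   \begin{bmatrix} f(1) \\ \vdots \\ f(K) \end{bmatrix}, \]
or equivalently,
\[ w_i = G_i u, \]
where
\[ u = \begin{bmatrix} f(1) \\ \vdots \\ f(K) \end{bmatrix} \in \mathbf{R}^{2K}, \qquad
   G_i = C \begin{bmatrix} A^{k_i-2}B & A^{k_i-3}B & \cdots & B & 0 & \cdots & 0 \end{bmatrix} \in \mathbf{R}^{2 \times 2K}. \]
Using the notation
\[ w = \begin{bmatrix} w_1 \\ \vdots \\ w_M \end{bmatrix}, \qquad G = \begin{bmatrix} G_1 \\ \vdots \\ G_M \end{bmatrix}, \]
and noting that J = ‖u‖^2, the problem becomes
\[ \begin{array}{ll} \mbox{minimize} & \|u\|^2 \\ \mbox{subject to} & Gu = w. \end{array} \]
This is just a least-norm problem and, assuming G is full rank (i.e., has independent
rows), the optimal u is given by
\[ u = G^\dagger w = G^T (G G^T)^{-1} w. \]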
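The construction of G can be checked on a tiny instance. The sketch below uses a made-up horizon K = 6 and two hypothetical way-points (the indices ki and targets W are not the exam data); it builds G entry by entry, computes the least-norm input, and then simulates the dynamics to confirm that the way-points are hit.

```matlab
% Hypothetical small instance: K = 6 steps, way-points at indices 3 and 5.
h = 0.1; m = 1; alpha = 0.1;
K = 6; ki = [3 5];
W = [1 0; 0 2];                       % columns are the way-points w_1, w_2

A = [eye(2) h*eye(2); zeros(2) (1-alpha)*eye(2)];
B = [zeros(2); (h/m)*eye(2)];
C = [eye(2) zeros(2)];

G = zeros(2*numel(ki), 2*K);
for i = 1:numel(ki)
    for j = 1:ki(i)-1
        % f(j) reaches x(ki) through A^(ki-1-j)*B; later forces have no effect
        G(2*i-1:2*i, 2*j-1:2*j) = C*A^(ki(i)-1-j)*B;
    end
end
w = W(:);
u = G'*((G*G')\w);                    % least-norm solution, G has independent rows

% simulate x(k+1) = A*x(k) + B*f(k) from x(1) = 0 and record p(k)
x = zeros(4,1); p = zeros(2, K+1);
for k = 1:K
    p(:,k) = C*x;
    x = A*x + B*u(2*k-1:2*k);
end
p(:,K+1) = C*x;
```

Because CB = 0 in this model, even f(k_i − 1) has no effect on p(k_i), so the computed u is zero from index ki(end) − 1 onward.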
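(b) The following Matlab script computes the minimum norm input, and plots it and
the associated trajectory.
% problem parameters
h = .1;
m = 1;
M=4;
alpha=0.1;
K = 100;
% system matrices from part (a)
n = 4;   % state dimension
nn = 2;  % input dimension
A = [eye(2) h*eye(2); zeros(2) (1-alpha)*eye(2)];
B = [zeros(2); (h/m)*eye(2)];
C = [eye(2) zeros(2)];
% way-points
k1=10; w1=[ 2; 2];
k2=30; w2=[ -2; 3];
k3=40; w3=[ 4; -3];
k4=80; w4=[-4; -2];
k = [k1 k2 k3 k4];
% build G by stacking Gi = C*[A^(ki-2)*B ... B 0 ... 0]
G = [];
for i = 1:M
ABmatrix = [];
temp = B;
for j=1:k(i)-1
ABmatrix = [temp ABmatrix];
temp = A*temp;
end
Gi = C*[ABmatrix zeros(n, nn*(K-k(i)+1))];
G = [G; Gi];
end
w = [w1; w2; w3; w4];
u = pinv(G)*w;
% Optimal value of J
J = norm(u)^2
% plot the input forces
f = reshape(u,nn,K);
figure;
subplot(2,1,1); plot(f(1,:)); ylabel('f1');
subplot(2,1,2); plot(f(2,:)); ylabel('f2'); xlabel('k');
% simulate the system to get the trajectory p(1),...,p(K+1)
x = zeros(n,1);
p = zeros(2,K+1);
for j = 1:K
p(:,j) = C*x;
x = A*x + B*f(:,j);
end
p(:,K+1) = C*x;
figure;
plot(p(1,:),p(2,:));
hold on
ps = [w1 w2 w3 w4];
plot(ps(1,:),ps(2,:),'*');
Figure 2: f versus k (top panel: f1; bottom panel: f2).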
Figure (2) shows the minimum norm input forces. We see that for k ≥ 80, the
optimal force is zero. This makes perfect sense: for k ≥ 80, the force f (k) does
not affect the vehicle position at any of the way-points, so using any force on the
vehicle for k ≥ 80 just increases the cost J.
Figure 3: Trajectory in R^2 (x versus y).
4. Digital circuit gate sizing. A digital circuit consists of a set of n (logic) gates, intercon-
nected by wires. Each gate has one or more inputs (typically between one and four),
and one output, which is connected via the wires to other gate inputs and possibly
to some external circuitry. When the output of gate i is connected to an input of
gate j, we say that gate i drives gate j, or that gate j is in the fan-out of gate i.
We describe the topology of the circuit by the fan-out list for each gate, which tells
us which other gates the output of a gate connects to. We denote the fan-out list of
gate i as FO(i) ⊆ {1, . . . , n}. We can have FO(i) = ∅, which means that the out-
put of gate i does not connect to the inputs of any of the gates 1, . . . , n (presumably
the output of gate i connects to some external circuitry). It’s common to order the
gates in such a way that each gate only drives gates with higher indices, i.e., we have
FO(i) ⊆ {i + 1, . . . , n}. We’ll assume that’s the case here. (This means that the gate
interconnections form a directed acyclic graph.)
To illustrate the notation, a simple digital circuit with n = 4 gates, each with 2 inputs,
is shown below. For this circuit we have
[Figure: circuit diagram with gates 1, 2, 3, and 4.]
The 3 input signals arriving from the left are called primary inputs, and the 3 output
signals emerging from the right are called primary outputs of the circuit. (You don’t
need to know this, however, to solve this problem.)
Each gate has a (real) scale factor or size xi . These scale factors are the design variables
in the gate sizing problem. They must satisfy 1 ≤ xi ≤ xmax , where xmax is a given
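maximum allowed gate scale factor (typically on the order of 100). The total area of
the circuit has the form
\[ A = \sum_{i=1}^{n} a_i x_i, \]
where the a_i are given constants. The input capacitance of gate i is
\[ C_i^{\mathrm{in}} = \alpha_i x_i, \]
where the α_i are given constants.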
Each gate has a delay d_i, which is given by
\[ d_i = \beta_i + \gamma_i C_i^{\mathrm{load}} / x_i, \]
where β_i and γ_i are positive constants, and C_i^load is the load capacitance of gate i.
Note that the gate delay d_i is always larger than β_i, which can be interpreted as the
minimum possible delay of gate i, achieved only in the limit as the gate scale factor
becomes large.
The load capacitance of gate i is given by
\[ C_i^{\mathrm{load}} = C_i^{\mathrm{ext}} + \sum_{j \in \mathrm{FO}(i)} C_j^{\mathrm{in}}, \]
where C_i^ext is a positive constant that accounts for the capacitance of the interconnect
wires and external circuitry.
We will follow a simple design method, which assigns an equal delay T to all gates in
the circuit, i.e., we have di = T , where T > 0 is given. For a given value of T , there
may or may not exist a feasible design (i.e., a choice of the xi , with 1 ≤ xi ≤ xmax )
that yields di = T for i = 1, . . . , n. We can assume, of course, that T > maxi βi , i.e.,
T is larger than the largest minimum delay of the gates.
Finally, we get to the problem.
(a) Explain how to find a design x⋆ ∈ Rn that minimizes T , subject to a given area
constraint A ≤ Amax . You can assume the fanout lists, and all constants in the
problem description are known; your job is to find the scale factors xi . Be sure to
explain how you determine if the design problem is feasible, i.e., whether or not
there is an x that gives di = T , with 1 ≤ xi ≤ xmax , and A ≤ Amax .
Your method can involve any of the methods or concepts we have seen so far
in the course. It can also involve a simple search procedure, e.g., trying (many)
different values of T over a range.
Note: this problem concerns the general case, and not the simple example shown
above.
(b) Carry out your method on the particular circuit with data given in the file
gate_sizing_data.m. The fan-out lists are given as an n × n matrix F, with
i, j entry one if j ∈ FO(i), and zero otherwise. In other words, the ith row of F
gives the fanout of gate i. The jth entry in the ith row is 1 if gate j is in the
fan-out of gate i, and 0 otherwise.
• You do not need to know anything about digital circuits; everything you need to
know is stated above.
• Yes, this problem does belong on the EE263 midterm.
Solution.
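(a) We define the fanout matrix F as F_ij = 1, if j ∈ FO(i), and F_ij = 0 otherwise.
The matrix F is strictly upper triangular, since FO(i) ⊆ {i + 1, …, n}.
Using the formulas given above, and d_i = T, we have
\[ \begin{aligned}
T &= d_i \\
  &= \beta_i + \gamma_i \frac{C_i^{\mathrm{load}}}{x_i} \\
  &= \beta_i + \gamma_i \frac{C_i^{\mathrm{ext}} + \sum_{j \in \mathrm{FO}(i)} C_j^{\mathrm{in}}}{x_i} \\
  &= \beta_i + \gamma_i \frac{C_i^{\mathrm{ext}} + \sum_{j \in \mathrm{FO}(i)} \alpha_j x_j}{x_i}.
\end{aligned} \]
Multiplying by x_i we get the equivalent equations
\[ T x_i = \beta_i x_i + \gamma_i \Big( C_i^{\mathrm{ext}} + \sum_{j \in \mathrm{FO}(i)} \alpha_j x_j \Big), \qquad i = 1, \ldots, n. \]
Defining
\[ K = \mathbf{diag}(\beta) + \mathbf{diag}(\gamma) F \, \mathbf{diag}(\alpha), \]
we can write the equations as
\[ (TI - K)x = \mathbf{diag}(\gamma) C^{\mathrm{ext}}. \]
Since F is strictly upper triangular, T I − K is upper triangular with diagonal
entries T − β_i, which are positive when T > max_i β_i, so T I − K is nonsingular.
Thus, for each value of T (larger than max_i β_i) there is exactly one possible
choice of gate sizes. Among the ones that are feasible, we have to choose the one
corresponding to the smallest value of T.
We can solve this problem by examining a reasonable range of values of T, and
for each value, finding x. We check whether x is feasible, by looking at min_i x_i,
max_i x_i, and A. We take our final design as the one which is feasible, and has
the smallest value of T. Alternatively, we can start with a value of T just a little bit
larger than max_i β_i, then increase T until we find our first feasible x, which we
take as our solution.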
(b) The following code generates x for a range of values of T, and plots min_i x_i, max_i x_i,
and A, versus T.
gate_sizing_data
deltaT=0.001;
Trange=max(beta)+deltaT:deltaT:6;
% K does not depend on T, so form it once
K=diag(beta)+diag(gamma)*F*diag(alpha);
i=1;
for T=Trange
x=(T*eye(n)-K)\(diag(gamma)*Cext);
maxX(i)=max(x);
minX(i)=min(x);
Area(i)=a'*x;
i=i+1;
end
subplot(3,1,1)
plot(Trange,minX)
ylabel('minx')
axis([2 6 0 4])
line([2,6],[1,1],'Color','r')
grid on
subplot(3,1,2)
plot(Trange,maxX)
ylabel('maxx')
axis([2 6 0 150])
line([2,6],[100,100],'Color','r')
grid on
subplot(3,1,3)
plot(Trange,Area)
xlabel('T')
ylabel('A')
axis([2 6 0 500])
line([2,6],[400,400],'Color','r')
grid on
Figure 4: max_i x_i, min_i x_i, and A versus T.
• Since the matrix T I − K is upper triangular, we can solve for x very quickly.
In fact, if we use sparse matrix operations, we can easily compute x
(in seconds or less) for a problem with n = 10^5 gates or more. You didn't need to
know this; we're just pointing it out for fun.
• The plots above show that as T increases, all of the gate sizes decrease. This implies
that min_i x_i, max_i x_i, and A all decrease as T increases. This means you can use
a more efficient bisection search to find the optimal T. Again, you didn't need to
know this; we're just pointing it out.
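The bisection idea mentioned above can be sketched as follows, on a made-up 3-gate circuit (all data below, including xmax and Amax, are hypothetical and are not the values in gate_sizing_data.m). The sketch relies on the monotonicity observed in the plots: the upper limits max_i x_i ≤ xmax and A ≤ Amax become easier to satisfy as T grows, so we bisect on those, and check the lower limit min_i x_i ≥ 1 at the end.

```matlab
% Hypothetical 3-gate circuit (not the exam data).
n = 3;
F = [0 1 1; 0 0 1; 0 0 0];           % fan-out matrix, strictly upper triangular
alpha = [1; 1; 1]; beta = [1; 1; 1]; gamma = [1; 1; 1];
Cext = [1; 1; 2]; a = [1; 1; 1];
xmax = 100; Amax = 11;

K = diag(beta) + diag(gamma)*F*diag(alpha);
sizes = @(T) (T*eye(n) - K) \ (diag(gamma)*Cext);   % unique x giving d_i = T
fits  = @(x) max(x) <= xmax && a'*x <= Amax;        % upper limits only

lo = max(beta);         % T must exceed this value; it is never evaluated
hi = max(beta) + 10;    % assumed large enough that the upper limits hold
assert(fits(sizes(hi)));
while hi - lo > 1e-10
    mid = (lo + hi)/2;
    if fits(sizes(mid))
        hi = mid;       % upper limits hold at mid: try a smaller T
    else
        lo = mid;       % a gate or the area is too large: increase T
    end
end
T = hi;
x = sizes(T);
feasible = min(x) >= 1; % lower limit, checked at the candidate optimum
```

For this toy circuit the area constraint is the binding one, and the search settles at T = 2 with x = (6, 3, 2). If feasible came out false, the problem would have no feasible design under the monotonicity assumption, since any larger T only shrinks the gates further.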
5. Oh no. It’s the dreaded theory problem. In the list below there are 11 statements
about two square matrices A and B in Rn×n .
(a) R(B) ⊆ R(A).
(b) there exists a matrix Y ∈ Rn×n such that B = Y A.
(c) AB = 0.
(d) BA = 0.
(e) rank([ A B ]) = rank(A).
(f) R(A) ⊥ N(B^T).
(g) rank([ A ; B ]) = rank(A), where [ A ; B ] denotes A stacked on top of B.
(h) R(A) ⊆ N (B).
(i) there exists a matrix Z ∈ Rn×n such that B = AZ.
(j) rank([ A B ]) = rank(B).
(k) N (A) ⊆ N (B).
Your job is to collect them into (the largest possible) groups of equivalent statements.
Two statements are equivalent if each one implies the other. For example, the state-
ment ‘A is onto’ is equivalent to ‘N (A) = {0}’ (when A is square, which we assume
here), because every square matrix that is onto has zero nullspace, and vice versa. Two
statements are not equivalent if there exist (real) square matrices A and B for which
one holds, but the other does not. A group of statements is equivalent if any pair of
statements in the group is equivalent.
We want just your answer, which will consist of lists of mutually equivalent statements.
We will not read any justification. If you add any text to your answer, as in ‘c and e
are equivalent, provided A is nonsingular’, we will mark your response as wrong.
Put your answer in the following specific form. List each group of equivalent statements
on a line, in (alphabetic) order. Each new line should start with the first letter not
listed above. For example, you might give your answer as
a, c, d, h
b, i
e
f, g, j, k.
This means you believe that statements a, c, d, and h are equivalent; statements b and
i are equivalent; and statements f, g, j, and k are equivalent. You also believe that the
first group of statements is not equivalent to the second, or the third, and so on.
We will take points off for false groupings (i.e., listing statements in the same line when
they are not equivalent) as well as for missed groupings (i.e., when you list equivalent
statements in different lines).
Solution. Let bi be the ith column of B.
R(A) ⊥ N (B T ) ⇔ R(A) ⊆ N (B T )⊥
⇔ R(A) ⊆ R(B)
⇔ rank([ A B ]) = rank(B). (4)
but rank([A B]) = 2, groups (2) and (1) are not equivalent. Groups (2) and (4) are
not either.
Take A = B = I (more generally, any A = B with A^2 ≠ 0). Then N(A) = N(B) but
AB = BA = A^2 ≠ 0. Hence groups (2) and (3)
are not equivalent. Group (2) and statement c are not equivalent either.
Take
\[ A = I, \qquad B = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}. \]
Since rank([ A B ]) = rank(A) = 2 but rank(B) = 1, groups (1) and (4) are not
equivalent. Furthermore, since BA ≠ 0, groups (1) and (3) are not equivalent. Since
AB ≠ 0, group (1) and statement c aren't either.
In a similar fashion, taking
\[ A = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \qquad B = I, \]
shows that groups (3) and (4) are not equivalent and that statement c and group (4)
aren't either.
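The two 2 × 2 counterexamples above are easy to confirm numerically. The sketch below evaluates one representative statement from each of groups (1), (4), and (3), plus statement c, on both pairs; the helper names stmt_e, stmt_j, stmt_c, and stmt_d are ours, introduced only for this check.

```matlab
% One representative statement per group, written as a test on (A, B).
stmt_e = @(A,B) rank([A B]) == rank(A);   % group (1): a, e, i
stmt_j = @(A,B) rank([A B]) == rank(B);   % group (4): f, j
stmt_c = @(A,B) all(all(A*B == 0));       % statement c: AB = 0
stmt_d = @(A,B) all(all(B*A == 0));       % group (3): d, h

% First counterexample: A = I, B = [0 0; 1 0]
A1 = eye(2); B1 = [0 0; 1 0];
r1 = [stmt_e(A1,B1) stmt_j(A1,B1) stmt_c(A1,B1) stmt_d(A1,B1)];   % [1 0 0 0]

% Second counterexample: A = [0 0; 1 0], B = I
A2 = [0 0; 1 0]; B2 = eye(2);
r2 = [stmt_e(A2,B2) stmt_j(A2,B2) stmt_c(A2,B2) stmt_d(A2,B2)];   % [0 1 0 0]
```

In the first pair only the group (1) statement holds, and in the second only the group (4) statement holds, which separates each of those groups from the others tested.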
Thus, the final answer is
a, e, i
b, g, k
c
d, h
f, j.
6. Smooth interpolation on a 2D grid. This problem concerns arrays of real numbers on
an m × n grid. Such an array can represent an image, or a sampled description of a
function defined on a rectangle. We can describe such an array by a matrix U ∈ Rm×n ,
where Uij gives the real number at location i, j, for i = 1, . . . , m and j = 1, . . . , n. We
will think of the index i as associated with the y axis, and the index j as associated
with the x axis.
It will also be convenient to describe such an array by a vector u = vec(U) ∈ R^{mn}.
Here vec is the function that stacks the columns of a matrix on top of each other:
\[ \mathbf{vec}(U) = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix}, \]
where U = [u_1 ··· u_n]. To go back to the array representation, from the vector, we have
U = vec^{-1}(u). (This looks complicated, but isn't; vec^{-1} just arranges the elements in
a vector into an array.)
We will need two linear functions that operate on m × n arrays. These are simple
approximations of partial differentiation with respect to the x and y axes, respectively.
The first function takes as argument an m × n array U and returns an m × (n − 1)
array V of forward (rightward) differences:
\[ V_{ij} = U_{i,j+1} - U_{ij}, \qquad i = 1, \ldots, m, \quad j = 1, \ldots, n-1. \]
The roughness measure R is the sum of the squares of the differences of each element
in the array and its neighbors. Small R corresponds to smooth, or smoothly varying,
U. The roughness measure R is zero precisely for constant arrays, i.e., when Uij are
all equal.
Now we get to the problem, which is to interpolate some unknown values in an array
in the smoothest possible way, given the known values in the array. To define this
precisely, we partition the set of indices {1, . . . , mn} into two sets: Iknown and Iunknown .
We let k ≥ 1 denote the number of known values (i.e., the number of elements in Iknown ),
and mn − k the number of unknown values (the number of elements in Iunknown ). We
are given the values ui for i ∈ Iknown ; the goal is to guess (or estimate or assign) values
for ui for i ∈ Iunknown . We’ll choose the values for ui, with i ∈ Iunknown , so that the
resulting U is as smooth as possible, i.e., so it minimizes R. Thus, the goal is to fill in
or interpolate missing data in a 2D array (an image, say), so the reconstructed array
is as smooth as possible.
We give the k known values in a vector wknown ∈ Rk , and the mn − k unknown values
in a vector wunknown ∈ Rmn−k . The complete array is obtained by putting the entries of
wknown and wunknown into the correct positions of the array. We describe these operations
using two matrices Zknown ∈ R^{mn×k} and Zunknown ∈ R^{mn×(mn−k)}, that satisfy
\[ \mathbf{vec}(U) = Z_{\mathrm{known}} w_{\mathrm{known}} + Z_{\mathrm{unknown}} w_{\mathrm{unknown}}. \]
(This looks complicated, but isn't: Each column of these matrices is a unit vector, so
multiplication with either matrix just stuffs the entries of the w vectors into particular
locations in vec(U). In fact, the matrix [Zknown Zunknown] is an mn × mn permutation
matrix.)
In summary, you are given the problem data wknown (which gives the known array
values), Zknown (which gives the locations of the known values), and Zunknown (which
gives the locations of the unknown array values, in some specific order). Your job is
to find wunknown that minimizes R.
(a) Explain how to solve this problem. You are welcome to use any of the operations,
matrices, and vectors defined above in your solution (e.g., vec, vec−1 , Dx , Dy ,
Zknown , Zunknown , wknown , . . . ). If your solution is valid provided some matrix is
(or some matrices are) full rank, say so.
(b) Carry out your method using the data created by smooth_interpolation.m. The
file gives m, n, wknown , Zknown and Zunknown . This file also creates the matrices Dx
and Dy , which you are welcome to use. (This was very nice of us, by the way.)
You are welcome to look at the code that generates these matrices, but you do
not need to understand it. For this problem instance, around 50% of the array
elements are known, and around 50% are unknown.
The mfile also includes the original array Uorig from which we removed elements
to create the problem. This is just so you can see how well your smooth recon-
struction method does in reconstructing the original array. Of course, you cannot
use Uorig to create your interpolated array U.
To visualize the arrays use the Matlab command imagesc(), with matrix argu-
ment. If you prefer a grayscale image, or don’t have a color printer, you can
issue the command colormap gray. The mfile that gives the problem data will
plot the original image Uorig, as well as an image containing the known values,
with zeros substituted for the unknown locations. This will allow you to see the
pattern of known and unknown array values.
Compare Uorig (the original array) and U (the interpolated array found by your
method), using imagesc(). Hand in complete source code, as well as the plots.
Be sure to give the value of roughness R of U.
Hints:
• In Matlab, vec(U) can be computed as U(:);
• vec−1 (u) can be computed as reshape(u,m,n).
Solution.
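(a) We can express our roughness measure directly in terms of the vector of known
values wknown and unknown values wunknown as
\[ R = \|D_x u\|^2 + \|D_y u\|^2
     = \left\| \begin{bmatrix} D_x \\ D_y \end{bmatrix} (Z_{\mathrm{known}} w_{\mathrm{known}} + Z_{\mathrm{unknown}} w_{\mathrm{unknown}}) \right\|^2
     = \| A w_{\mathrm{unknown}} - b \|^2, \]
where
\[ A = \begin{bmatrix} D_x \\ D_y \end{bmatrix} Z_{\mathrm{unknown}}, \qquad
   b = - \begin{bmatrix} D_x \\ D_y \end{bmatrix} Z_{\mathrm{known}} w_{\mathrm{known}}. \]
Assuming A is skinny and full rank, the minimizing wunknown is
\[ \begin{aligned}
w_{\mathrm{unknown}} &= A^\dagger b \\
 &= (A^T A)^{-1} A^T b \\
 &= - \left( Z_{\mathrm{unknown}}^T (D_x^T D_x + D_y^T D_y) Z_{\mathrm{unknown}} \right)^{-1}
      Z_{\mathrm{unknown}}^T (D_x^T D_x + D_y^T D_y) Z_{\mathrm{known}} w_{\mathrm{known}}.
\end{aligned} \]
When is A ∈ R^{(2mn−m−n)×(mn−k)} skinny and full rank? It's always skinny, since
2mn − m − n ≥ mn − k. If A were not full rank, then there would exist some
nonzero w with Aw = 0. This means that Zunknown w is in the nullspace of both
Dx and Dy, which means that Zunknown w is a constant (i.e., its entries are all the
same). But Zunknown w is zero at every known location, so if there is at least one
known array value the constant must be zero, and hence w = 0. In other words,
A is always full rank and skinny!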
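To make the pieces concrete, here is a minimal sketch on a tiny made-up array. It assumes the difference operators act as V_ij = U_{i,j+1} − U_ij and W_ij = U_{i+1,j} − U_ij, and builds Dx and Dy with kron under that assumption (the matrices created by smooth_interpolation.m may be constructed differently). The test array is a hypothetical ramp U_ij = j with only its first and last columns known; the smoothest interpolant is then the ramp itself.

```matlab
% Hypothetical 4 x 5 ramp, U_ij = j, with only columns 1 and n known.
m = 4; n = 5;
Uorig = repmat(1:n, m, 1);
known = false(m, n); known(:, [1 n]) = true;

E = eye(m*n);
Zknown   = E(:, known(:));           % columns are unit vectors at known indices
Zunknown = E(:, ~known(:));          % columns are unit vectors at unknown indices
wknown   = Uorig(known);             % known values, in the same (column-major) order

% Difference operators acting on vec(U) (our construction, see lead-in)
Dx = kron(diff(eye(n)), eye(m));     % vec of rightward differences
Dy = kron(eye(n), diff(eye(m)));     % vec of downward differences
D  = [Dx; Dy];

wunknown = (D*Zunknown) \ (-D*Zknown*wknown);   % least-squares solution
U = reshape([Zknown Zunknown]*[wknown; wunknown], m, n);
R = norm(Dx*U(:))^2 + norm(Dy*U(:))^2;
```

Here every horizontal difference of the result is 1 and every vertical difference is 0, so R equals m(n − 1) = 16.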
(b) wunknown is easily found in Matlab with the command
wunknown = ([Dx; Dy]*Zunknown) \ (-[Dx; Dy]*Zknown*wknown);
Yes, that really is the solution, in just one line.
Next we need to create our complete array by putting the entries of wknown and
wunknown in the correct positions of the array. We use Matlab again:
U = reshape([Zknown Zunknown]*[wknown; wunknown], m, n);
We calculate the roughness of our final array U as
R = norm(Dx*U(:))^2 + norm(Dy*U(:))^2
which for our example is R = 12.8794.
Finally, we graph Uorig, Uobscured and U, with the results shown in Figure (5).
subplot(221);
imagesc(Uorig)
title('Original image');
subplot(222);
imagesc(Uobscured);
title('Obscured image');
subplot(223);
imagesc(U);
title('Reconstructed image');
One thing you notice about the reconstructed image is that it's a really, really good
approximation of the original image. It's very impressive; we've guessed (very well) half
the entries of a (smooth) image, from the remaining half.
Figure 5: Original image (top left), known pixel values (top right), and reconstructed image (bottom left).
[Histogram: ee263 midterm grades, fall 2006 (frequency versus score / 120).]