10 - Regularization
More on tradeoffs
Regularization
Multi-objective tradeoff
Minimum-norm as a regularization
When Ax = b is underdetermined (A is wide), we can
resolve ambiguity by adding a cost function, e.g.
min-norm LS:
$$\begin{aligned} \underset{x}{\text{minimize}} \quad & \|x\|^2 \\ \text{subject to:} \quad & Ax = b \end{aligned}$$
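A minimal numerical sketch of the min-norm problem above, assuming NumPy and a small made-up wide matrix `A` and vector `b` (neither is from the lecture). For a wide `A` with full row rank, the minimizer is $A^T (A A^T)^{-1} b$; every other solution of $Ax = b$ adds a null-space component and is therefore longer.

```python
import numpy as np

# Hypothetical underdetermined system: 2 equations, 4 unknowns.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, 2.0])

# Min-norm solution for full-row-rank A: x = A^T (A A^T)^{-1} b.
x_min = A.T @ np.linalg.solve(A @ A.T, b)

# Any other solution differs from x_min by a null-space vector z (A z = 0).
rng = np.random.default_rng(0)
P_null = np.eye(4) - np.linalg.pinv(A) @ A   # projector onto null(A)
x_other = x_min + P_null @ rng.standard_normal(4)

print("residuals:", np.linalg.norm(A @ x_min - b), np.linalg.norm(A @ x_other - b))
print("norms:    ", np.linalg.norm(x_min), np.linalg.norm(x_other))
```

Both vectors satisfy the constraint, but `x_min` has the smaller norm. It also coincides with `np.linalg.pinv(A) @ b`.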
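Proof of minimum-norm equivalence
Solution of 2-norm regularization (minimize $\|Ax - b\|^2 + \lambda \|x\|^2$) is:
$$\hat{x} = (A^T A + \lambda I)^{-1} A^T b$$
$$\hat{x} = A^T (A A^T + \lambda I)^{-1} b$$
The two forms agree because $A^T (A A^T + \lambda I) = (A^T A + \lambda I) A^T$.
When $A$ has full row rank, letting $\lambda \to 0$ in the second form gives $A^T (A A^T)^{-1} b = A^\dagger b$, the minimum-norm solution.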
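The equivalence can be checked numerically. The sketch below, assuming NumPy and a made-up full-row-rank matrix `A` (not from the lecture), evaluates both formulas for a decreasing sequence of $\lambda$ and measures how far the result is from the pseudoinverse solution. The function names `ridge_primal` and `ridge_dual` are illustrative.

```python
import numpy as np

# Hypothetical wide, full-row-rank system.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, 2.0])
m, n = A.shape

def ridge_primal(lam):
    # (A^T A + lam I)^{-1} A^T b, an n-by-n solve
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

def ridge_dual(lam):
    # A^T (A A^T + lam I)^{-1} b, an m-by-m solve
    return A.T @ np.linalg.solve(A @ A.T + lam * np.eye(m), b)

x_min_norm = np.linalg.pinv(A) @ b
gaps = {}
for lam in [1.0, 1e-2, 1e-4, 1e-6]:
    forms_differ = np.linalg.norm(ridge_primal(lam) - ridge_dual(lam))
    gaps[lam] = np.linalg.norm(ridge_dual(lam) - x_min_norm)
    print(f"lam={lam:g}: forms differ by {forms_differ:.1e}, "
          f"distance to min-norm solution {gaps[lam]:.1e}")
```

For a wide matrix the second form is also the cheaper one, since it solves an $m \times m$ system rather than an $n \times n$ one.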
Tradeoff visualization
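[Figure: tradeoff curve with $\|Ax - b\|^2$ on the horizontal axis and $\|x\|^2$ on the vertical axis. As $\lambda \to 0$ the curve approaches the point $(0, \|A^\dagger b\|^2)$; as $\lambda \to \infty$ it approaches $(\|b\|^2, 0)$.]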
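The tradeoff curve can be traced by sweeping $\lambda$. This is a sketch assuming NumPy, a made-up full-row-rank `A` and `b`, and squared norms on both axes; `tradeoff_point` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical wide, full-row-rank system.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, 2.0])

def tradeoff_point(lam):
    """Return (||Ax - b||^2, ||x||^2) at the regularized solution for this lam."""
    x = A.T @ np.linalg.solve(A @ A.T + lam * np.eye(A.shape[0]), b)
    return np.sum((A @ x - b) ** 2), np.sum(x ** 2)

lams = np.logspace(-6, 6, 25)
points = np.array([tradeoff_point(lam) for lam in lams])
residuals, norms = points[:, 0], points[:, 1]

print("smallest lam:", points[0])    # near (0, ||pinv(A) b||^2)
print("largest lam: ", points[-1])   # near (||b||^2, 0)
```

As $\lambda$ grows the residual term increases while the norm term decreases, so the points move along the curve from the upper-left end to the lower-right end.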
Regularization
$$\begin{aligned} \underset{x}{\text{minimize}} \quad & \|x\|_p \\ \text{subject to:} \quad & Ax = b \end{aligned}$$
Simple example
for $p = 1$, the minimum occurs on one of the axes: sparsifying behavior
for $p = \infty$, the minimum occurs at equal values of the coordinates: equalizing behavior
[Figure: two plots, one for each case.]
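A small numerical sketch of the sparsifying and equalizing behavior, assuming NumPy and a made-up constraint line $x_1 + 2x_2 = 2$ in $\mathbb{R}^2$ (not the line from the lecture's figure). It scans points along the line and keeps the one with the smallest $p$-norm.

```python
import numpy as np

# Hypothetical constraint line in R^2: x1 + 2*x2 = 2, parameterized by s = x2.
s = np.linspace(-1.0, 3.0, 40001)
points = np.column_stack([2.0 - 2.0 * s, s])   # every row satisfies x1 + 2*x2 = 2

best = {}
for p in [1, 2, np.inf]:
    norms = np.linalg.norm(points, ord=p, axis=1)   # p-norm of each row
    best[p] = points[np.argmin(norms)]
    print(f"p = {p}: minimizer is approximately {best[p]}")
```

On this line the minimizers work out to $(0, 1)$ for $p = 1$ (on an axis), $(0.4, 0.8)$ for $p = 2$, and $(2/3, 2/3)$ for $p = \infty$ (equal coordinates), up to the grid resolution.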
Another simple example
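Suppose we have data points $\{y_1, \dots, y_m\} \subset \mathbb{R}$, and we would
like to find the best estimator for the data, according to
different norms. Suppose data is sorted: $y_1 \le \dots \le y_m$.
$$\underset{x}{\text{minimize}} \quad \left\| \begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix} - \begin{bmatrix} x \\ \vdots \\ x \end{bmatrix} \right\|_p$$
$p = 2$: $\hat{x} = \frac{1}{m}(y_1 + \dots + y_m)$. This is the mean of the data.
$p = 1$: $\hat{x} = y_{\lceil m/2 \rceil}$. This is the median of the data.
$p = \infty$: $\hat{x} = \frac{1}{2}(y_1 + y_m)$. This is the mid-range of the data.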
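The three closed forms can be compared against a brute-force search. The sketch below assumes NumPy and a made-up sorted data set with $m = 5$; it scans candidate values of $x$ and minimizes $\|y - x\mathbf{1}\|_p$ directly.

```python
import numpy as np

y = np.array([1.0, 2.0, 2.5, 4.0, 10.0])   # hypothetical sorted data, m = 5
m = len(y)

closed_form = {
    2: y.mean(),                         # mean
    1: y[int(np.ceil(m / 2)) - 1],       # y_{ceil(m/2)} in 1-based indexing: median
    np.inf: 0.5 * (y[0] + y[-1]),        # mid-range
}

# Brute-force check over a grid of candidate estimates x.
grid = np.linspace(y[0], y[-1], 9001)
residuals = y[None, :] - grid[:, None]   # row i holds y - grid[i]
grid_search = {}
for p, x_hat in closed_form.items():
    grid_search[p] = grid[np.argmin(np.linalg.norm(residuals, ord=p, axis=1))]
    print(f"p = {p}: closed form {x_hat:.3f}, grid search {grid_search[p]:.3f}")
```

The outlier at 10 pulls the mean and especially the mid-range upward, while the median stays at the middle data point.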
Example: hovercraft revisited
$$\underset{x_t,\, v_t,\, u_t}{\text{minimize}} \quad \|u\|_p$$
Model simplification
$$\begin{aligned} x_{t+1} &= x_t + v_t \\ v_{t+1} &= v_t + u_t \end{aligned} \qquad \text{for: } t = 1, 2, \dots, 49$$
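With a zero initial state, the recursion above is linear in the thrusts, so the final position and velocity can be written as $Gu$ for some matrix $G$. The sketch below, assuming NumPy, builds `G` by simulating unit impulses and then picks the minimum-2-norm thrust that reaches a target. The scalar position, the zero initial state, and the target `(1, 0)` are illustrative assumptions, not the lecture's hovercraft setup, and `rollout` is a hypothetical helper name.

```python
import numpy as np

def rollout(u, x1=0.0, v1=0.0):
    """Simulate x_{t+1} = x_t + v_t, v_{t+1} = v_t + u_t and return (x, v)."""
    n = len(u)
    x = np.zeros(n + 1)
    v = np.zeros(n + 1)
    x[0], v[0] = x1, v1
    for t in range(n):
        x[t + 1] = x[t] + v[t]
        v[t + 1] = v[t] + u[t]
    return x, v

n_inputs = 49   # u_1, ..., u_49 drive the state from t = 1 to t = 50

# Column k of G is the final (position, velocity) caused by a unit thrust at step k.
G = np.zeros((2, n_inputs))
for k in range(n_inputs):
    impulse = np.zeros(n_inputs)
    impulse[k] = 1.0
    x_imp, v_imp = rollout(impulse)
    G[:, k] = [x_imp[-1], v_imp[-1]]

target = np.array([1.0, 0.0])          # hypothetical: end at position 1, at rest
u_min = np.linalg.pinv(G) @ target     # minimum-2-norm thrust with G u = target
x, v = rollout(u_min)
print("final state:", x[-1], v[-1], " thrust norm:", np.linalg.norm(u_min))
```

Eliminating $x_t$ and $v_t$ this way leaves a problem in $u$ alone. Only the $p = 2$ case has this closed-form solution; other norms generally need an optimization solver.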
Results
1. Minimizing $\|u\|_2^2$ (smooth)
[Figure: three panels plotted against Time (0 to 50); the first panel shows Thrust.]
Tradeoff studies
1. Minimizing $\|u\|_2^2 + \lambda \|u\|_1$ (smooth and sparse)
[Figure: three panels plotted against Time (0 to 50); the first panel shows Thrust.]