Gradient Flow PDF

The document summarizes gradient flows in different metrics, with examples in Euclidean space, L2 space, and Wasserstein space. For each metric, it gives the definition of the gradient, a formula for the gradient, and examples of energies with their gradient flows. A gradient flow is a curve that evolves in the direction of steepest descent of an energy functional.

Uploaded by

Jhon Edison Bravo Buitrago

[Figure: plot with axes labeled Position and Time]

Gradient Flow in the Wasserstein Metric

Katy Craig
University of California, Santa Barbara

NIPS, Optimal Transport & Machine Learning

December 9th, 2017
gradient flow in finite dimensions
A curve x(t): [0,T] → ℝ^d is the gradient flow of an energy E: ℝ^d → ℝ if
$$\frac{d}{dt} x(t) = -\nabla E(x(t))$$
• “x(t) evolves in the direction of steepest descent of E”
• initial value problem: given x(0), find the gradient flow x(t)

Example:
metric: $(\mathbb{R}^d, |\cdot|)$
energy functional: $E(x) = \frac{1}{2} x^2$
gradient flow: $\frac{d}{dt} x(t) = -x(t)$
Given x(0) ∈ ℝ^d, $x(t) = x(0)e^{-t}$ is the unique solution of the gradient flow.
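
As a rough numerical illustration of the example above, the following sketch approximates the gradient flow of E(x) = x²/2 in one dimension with explicit Euler steps and compares the result with x(0)e^{-t}. The function names, the step count, and the choice of explicit Euler are assumptions made for this illustration; they do not come from the slides.

```python
import math

def grad_E(x):
    """Gradient of the example energy E(x) = x**2 / 2."""
    return x

def explicit_euler_flow(x0, t_final, n_steps):
    """Approximate x'(t) = -grad_E(x(t)) with explicit Euler steps (an assumed scheme)."""
    dt = t_final / n_steps
    x = x0
    for _ in range(n_steps):
        x = x - dt * grad_E(x)  # move in the direction of steepest descent
    return x

x0, t_final = 2.0, 1.0
approx = explicit_euler_flow(x0, t_final, n_steps=10_000)
exact = x0 * math.exp(-t_final)  # closed-form solution x(0) * e^{-t}
print(approx, exact)
```

With a smaller step the two printed values should agree more closely, since explicit Euler is first-order accurate.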

2
gradient flow in finite dimensions
Gradient flows often arise when solving optimization problems:
$$\min_{x \in \mathbb{R}^d} E(x)$$

Convexity of the energy determines stability and long time behavior.

Def: An energy E is λ-convex if $D^2 E \geq \lambda I_{d \times d}$ or, equivalently, if
$$E((1 - t)x + ty) \leq (1 - t)E(x) + tE(y) - t(1 - t)\frac{\lambda}{2}|x - y|^2$$
for all x,y ∈ ℝ^d, t ∈ [0,1].
Examples: $f(x) = \frac{x^2}{2}$, λ = 1; $f(x) = \sin(x)$, λ = −1
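
The inequality in the definition of λ-convexity can be spot-checked numerically for the two one-dimensional examples. The sketch below samples random triples (x, y, t) and tests the inequality; the helper name, the sampling interval [-5, 5], and the tolerance are assumptions for illustration. Passing on samples is only evidence, not a proof of λ-convexity.

```python
import math
import random

def is_lambda_convex_on_samples(f, lam, n_samples=10_000, seed=0, tol=1e-9):
    """Test the lambda-convexity inequality at random x, y in [-5, 5] and t in [0, 1]."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x, y, t = rng.uniform(-5, 5), rng.uniform(-5, 5), rng.random()
        lhs = f((1 - t) * x + t * y)
        rhs = (1 - t) * f(x) + t * f(y) - t * (1 - t) * lam / 2 * (x - y) ** 2
        if lhs > rhs + tol:
            return False  # found a triple (x, y, t) that violates the inequality
    return True

quadratic_ok = is_lambda_convex_on_samples(lambda x: x * x / 2, lam=1.0)
sine_ok = is_lambda_convex_on_samples(math.sin, lam=-1.0)
sine_is_convex = is_lambda_convex_on_samples(math.sin, lam=0.0)
print(quadratic_ok, sine_ok, sine_is_convex)
```

The last check uses λ = 0, that is, ordinary convexity, which sin(x) does not satisfy; it is included to show that the test can fail.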

3
gradient flow in finite dimensions
If E(x) is λ-convex, then…

1) Stability: for any gradient flows x(t) and y(t),
$$|x(t) - y(t)| \leq e^{-\lambda t} |x(0) - y(0)|$$

2) long time behavior: if λ>0, there is a unique solution x̄ of $\min_{x \in \mathbb{R}^d} E(x)$
and any gradient flow x(t) converges to x̄ as t → +∞:
$$|x(t) - \bar{x}| \leq e^{-\lambda t} |x(0) - \bar{x}|$$
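
The stability estimate can be observed numerically. The sketch below uses an illustrative energy E(x) = x² + sin(x), which is not taken from the slides; since E''(x) = 2 − sin(x) ≥ 1, it is λ-convex with λ = 1. Two explicit Euler trajectories are run side by side and their gap is compared with the bound e^{-λt}|x(0) − y(0)|. The scheme, step size, and initial points are assumptions.

```python
import math

LAMBDA = 1.0  # E''(x) = 2 - sin(x) >= 1, so the illustrative energy is 1-convex

def grad_E(x):
    """Derivative of the illustrative energy E(x) = x**2 + sin(x)."""
    return 2 * x + math.cos(x)

def flow_pair(x0, y0, t_final, n_steps):
    """Run explicit Euler from two initial points and record |x - y| after every step."""
    dt = t_final / n_steps
    x, y = x0, y0
    gaps = [abs(x - y)]
    for _ in range(n_steps):
        x, y = x - dt * grad_E(x), y - dt * grad_E(y)
        gaps.append(abs(x - y))
    return dt, gaps

dt, gaps = flow_pair(x0=3.0, y0=-1.0, t_final=5.0, n_steps=5_000)
# Compare the gap at time k * dt with the bound e^{-lambda t} |x(0) - y(0)|.
bound_holds = all(
    gap <= math.exp(-LAMBDA * k * dt) * gaps[0] + 1e-12
    for k, gap in enumerate(gaps)
)
print(gaps[-1], math.exp(-LAMBDA * 5.0) * gaps[0], bound_holds)
```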

4
gradient flow

5
gradient flow with different metrics
In general, given a complete metric space (X,d), a curve x(t): ℝ → X is the
gradient flow of an energy E: X → ℝ if
“ $\frac{d}{dt} x(t) = -\nabla_X E(x(t))$ ”

Examples:

Euclidean
  metric (X,d): $(\mathbb{R}^d, |\cdot|)$
  def of ∇X: $\langle \nabla E(x), v \rangle = \lim_{h \to 0} \frac{E(x + hv) - E(x)}{h}$
  formula for ∇X: $\nabla_{\mathbb{R}^d} E(x) = \nabla E(x)$
  energy: $E(x) = \frac{1}{2} x^2$
  gradient flow: $\frac{d}{dt} x(t) = -x(t)$

L2
  metric (X,d): $(L^2(\mathbb{R}^d), \|\cdot\|_{L^2})$
  def of ∇X: $\langle \nabla E(f), g \rangle = \lim_{h \to 0} \frac{E(f + hg) - E(f)}{h}$
  formula for ∇X: $\nabla_{L^2(\mathbb{R}^d)} E(f) = \frac{\partial E}{\partial f}$
  energy: $E(f) = \frac{1}{2} \int |f|^2$, gradient flow: $\frac{d}{dt} f(x,t) = -f(x,t)$
  energy: $E(f) = \frac{1}{2} \int |\nabla f|^2$, gradient flow: $\frac{d}{dt} f(x,t) = \Delta f(x,t)$
6
gradient flow with different metrics
finite difference approximation
$f: \mathbb{R}^d \to \mathbb{R}$ approximated by $\{f_i\}_{i \in h\mathbb{Z}^d}$
approximate values of function
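
A finite difference approximation stores the values of f on a grid. As a minimal sketch, assuming a one-dimensional periodic grid on [0, 1) and explicit Euler time stepping (neither is specified on the slide), the code below evolves the L2 gradient flow of the Dirichlet energy E(f) = ½∫|∇f|², which is the heat equation, and records the discrete energy at each step. The initial profile and all names are illustrative.

```python
import math

def dirichlet_energy(f, dx):
    """Discrete version of E(f) = 1/2 * integral of |f'|^2 on a periodic grid."""
    n = len(f)
    return 0.5 * sum(((f[(i + 1) % n] - f[i]) / dx) ** 2 for i in range(n)) * dx

def heat_step(f, dx, dt):
    """One explicit Euler step of d/dt f = f'' using centered second differences."""
    n = len(f)
    return [
        f[i] + dt * (f[(i + 1) % n] - 2 * f[i] + f[i - 1]) / dx**2
        for i in range(n)
    ]

n = 64
dx = 1.0 / n
dt = 0.4 * dx**2  # the explicit scheme is stable for dt <= dx**2 / 2
f = [math.sin(2 * math.pi * i * dx) + 0.5 * math.cos(6 * math.pi * i * dx) for i in range(n)]

energies = [dirichlet_energy(f, dx)]
for _ in range(500):
    f = heat_step(f, dx, dt)
    energies.append(dirichlet_energy(f, dx))
print(energies[0], energies[-1])
```

The recorded energies should be non-increasing, which is the discrete counterpart of the flow following the steepest descent of E.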

gradient flow with different metrics
Examples:

Euclidean
  metric (X,d): $(\mathbb{R}^d, |\cdot|)$
  def of ∇X: $\langle \nabla E(x), v \rangle = \lim_{h \to 0} \frac{E(x + hv) - E(x)}{h}$
  formula for ∇X: $\nabla_{\mathbb{R}^d} E(x) = \nabla E(x)$
  energy: $E(x) = \frac{1}{2} x^2$
  gradient flow: $\frac{d}{dt} x(t) = -x(t)$

W2
  metric (X,d): $(\mathcal{P}_2(\mathbb{R}^d), W_2)$
  def of ∇X: $\langle \nabla E(\mu), -\nabla \cdot (\xi \mu) \rangle_{\mathrm{Tan}_\mu \mathcal{P}_2(\mathbb{R}^d)} = \lim_{h \to 0} \frac{E((\mathrm{id} + h\xi)\#\mu) - E(\mu)}{h}$
  formula for ∇X: $\nabla_{W_2} E(\rho) = -\nabla \cdot \left( \rho \nabla \frac{\partial E}{\partial \rho} \right)$
  energy: $E(\rho) = \frac{1}{2} \int x^2 \rho(x)\,dx$, gradient flow: $\frac{d}{dt} \rho(x,t) = \nabla \cdot (x \rho(x,t))$
  energy: $E(\rho) = \int \rho(x) \log(\rho(x))\,dx$, gradient flow: $\frac{d}{dt} \rho(x,t) = \Delta \rho(x,t)$
8
gradient flow with different metrics
particle approximation
$f: \mathbb{R}^d \to \mathbb{R}$ approximated by $\sum_{i=1}^N \delta_{x_i} m_i$
approximate mass of function
9
interpolating with different metrics
The same dichotomy between values of a function and mass of a
function is also present in the geodesics.
Def: A constant speed geodesic between two points ρ0 and ρ1 in a
metric space (X,d) is any curve ρ: [0,1] → X s.t.
$$\rho(0) = \rho_0, \quad \rho(1) = \rho_1, \quad d(\rho(t), \rho(s)) = |t - s|\, d(\rho_0, \rho_1)$$

L2 geodesic: $\rho(t) = (1 - t)\rho_0 + t\rho_1$
W2 geodesic: $\rho(t) = \left( (1 - t)\,\mathrm{id} + t\, T_{\rho_0}^{\rho_1} \right) \# \rho_0$
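
The W2 geodesic can be made concrete in one dimension, under the assumption that ρ0 and ρ1 are empirical measures with the same number of equally weighted particles. In that setting the optimal map sends the k-th smallest particle of ρ0 to the k-th smallest particle of ρ1, so the interpolation moves each particle along a straight line. The sketch below checks the constant speed property for one pair (s, t); the sample values and function names are illustrative.

```python
import math

def w2_distance_1d(xs, ys):
    """W2 distance between two equal-weight empirical measures on the line.

    In one dimension the optimal map matches sorted samples to sorted samples.
    """
    xs, ys = sorted(xs), sorted(ys)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def w2_geodesic_1d(xs, ys, t):
    """Displacement interpolation ((1 - t) id + t T) # rho_0 for 1D samples."""
    return [(1 - t) * x + t * y for x, y in zip(sorted(xs), sorted(ys))]

rho0 = [-2.1, -2.0, -1.9, -1.8]  # particles clustered near -2
rho1 = [1.5, 2.0, 2.5, 3.0]      # particles spread out near 2

total = w2_distance_1d(rho0, rho1)
s, t = 0.25, 0.75
partial = w2_distance_1d(w2_geodesic_1d(rho0, rho1, s), w2_geodesic_1d(rho0, rho1, t))
print(partial, abs(t - s) * total)
```

At t = 0.5 the particles sit roughly halfway between the two clusters: the mass is transported. The L2 geodesic would instead keep both clusters in place and shift weight from one to the other.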

10
gradient flow in the Wasserstein metric
Examples (energy functional; gradient flow; velocity):
$E(\rho) = \int \rho \log \rho$;  $\frac{d}{dt}\rho = \Delta \rho$;  $v = -\frac{\nabla \rho}{\rho}$
$E(\rho) = \frac{1}{m-1} \int \rho^m$;  $\frac{d}{dt}\rho = \Delta \rho^m$;  $v = -m\rho^{m-2} \nabla \rho$
$E(\rho) = \int V \rho$;  $\frac{d}{dt}\rho = \nabla \cdot (\nabla V \rho)$;  $v = -\nabla V$
$E(\rho) = \frac{1}{2} \int (K * \rho)\rho$;  $\frac{d}{dt}\rho = \nabla \cdot (\nabla (K * \rho)\rho)$;  $v = -\nabla (K * \rho)$
All Wasserstein gradient flows are of the form
$$\frac{d}{dt}\rho + \nabla \cdot (v\rho) = 0, \qquad v = -\nabla \frac{\partial E}{\partial \rho}$$
(continuity equation)

11
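
To illustrate the continuity equation form, the sketch below discretizes d/dt ρ + ∂x(vρ) = 0 in one dimension for the drift example, with the assumed potential V(x) = x²/2 and hence v = −∇V = −x. It uses a first-order upwind finite-volume scheme on [−2, 2] with zero flux at the boundary. The scheme, grid, time step, and initial density are all choices made for this illustration and are not taken from the slides.

```python
import math

def upwind_step(rho, faces, velocity, dx, dt):
    """One explicit upwind finite-volume step for d/dt rho + d/dx (v * rho) = 0."""
    n = len(rho)
    flux = [0.0] * (n + 1)  # flux[j] crosses face j; the two boundary fluxes stay zero
    for j in range(1, n):
        v = velocity(faces[j])
        # Upwinding: use the density of the cell the mass is flowing out of.
        flux[j] = v * (rho[j - 1] if v > 0 else rho[j])
    return [rho[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(n)]

def velocity(x):
    """v = -grad V for the assumed potential V(x) = x**2 / 2."""
    return -x

n, left, right = 80, -2.0, 2.0
dx = (right - left) / n
faces = [left + j * dx for j in range(n + 1)]
centers = [left + (i + 0.5) * dx for i in range(n)]
dt = 0.2 * dx / max(abs(left), abs(right))  # CFL-type restriction for v(x) = -x

def potential_energy(rho):
    """Discrete version of E(rho) = integral of V * rho."""
    return sum(0.5 * x * x * r for x, r in zip(centers, rho)) * dx

rho = [math.exp(-((x - 1.0) ** 2) / 0.05) for x in centers]  # bump centered at x = 1
mass0 = sum(rho) * dx
energies = [potential_energy(rho)]
for _ in range(400):
    rho = upwind_step(rho, faces, velocity, dx, dt)
    energies.append(potential_energy(rho))
mass1 = sum(rho) * dx
print(mass0, mass1, energies[0], energies[-1])
```

Because the update only moves mass between neighboring cells, the total mass is conserved up to rounding, while the discrete energy decreases as the bump drifts toward the minimum of V.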
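gradient flow in the Wasserstein metric
aggregation, drift, and degenerate diffusion:
$$\frac{d}{dt}\rho = \underbrace{\nabla \cdot ((\nabla K * \rho)\rho)}_{\text{self interaction}} + \underbrace{\nabla \cdot (\nabla V \rho)}_{\text{drift}} + \underbrace{\Delta \rho^m}_{\text{diffusion}}, \qquad K, V: \mathbb{R}^d \to \mathbb{R}, \text{ and } m \geq 1$$
$$E(\rho) = \frac{1}{2} \int K * \rho \, d\rho + \int V \, d\rho + \frac{1}{m-1} \int \rho^m$$

interaction kernels:
• granular media: $K(x) = |x|^3$
• swarming: $K(x) = |x|^a/a - |x|^b/b$, $-d < b < a$
• chemotaxis: $K(x) = \frac{1}{2\pi} \log |x|$ if d = 2, $K(x) = C_d |x|^{2-d}$ otherwise.

degenerate diffusion:
• $\Delta \rho^m = \nabla \cdot (\underbrace{m\rho^{m-1}}_{D} \nabla \rho)$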
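
For the self-interaction term alone, a particle approximation gives a simple system of ODEs. The sketch below assumes N equal-mass particles in one dimension, the granular media kernel K(x) = |x|³, no drift and no diffusion, and explicit Euler time stepping; these simplifications and all names are assumptions for illustration. Each particle moves with velocity −(1/N) Σ_j ∇K(x_i − x_j).

```python
def grad_K(z):
    """Derivative of the granular media kernel K(z) = |z|**3 in one dimension."""
    return 3 * abs(z) * z

def interaction_energy(xs):
    """Discrete version of E(rho) = 1/2 * double integral of K(x - y), equal masses 1/N."""
    n = len(xs)
    return sum(abs(xi - xj) ** 3 for xi in xs for xj in xs) / (2 * n * n)

def particle_step(xs, dt):
    """Explicit Euler step of x_i' = -(1/N) * sum_j grad_K(x_i - x_j)."""
    n = len(xs)
    return [xi - dt * sum(grad_K(xi - xj) for xj in xs) / n for xi in xs]

xs = [-1.0, -0.7, -0.2, 0.1, 0.4, 0.9]  # illustrative initial particle positions
mean0 = sum(xs) / len(xs)
energies = [interaction_energy(xs)]
for _ in range(200):
    xs = particle_step(xs, dt=0.01)
    energies.append(interaction_energy(xs))
mean1 = sum(xs) / len(xs)
print(mean0, mean1, energies[0], energies[-1])
```

Since ∇K is odd, the pairwise forces cancel and the center of mass stays fixed, while the interaction energy decreases as the particles aggregate.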

12
biological chemotaxis
a colony of slime mold [Gregor et al.]

13
gradient flow in the Wasserstein metric
aggregation, drift, and degenerate diffusion:
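$$\frac{d}{dt}\rho = \nabla \cdot ((\nabla K * \rho)\rho) + \nabla \cdot (\nabla V \rho) + \Delta \rho^m, \qquad K, V: \mathbb{R}^d \to \mathbb{R}, \text{ and } m \geq 1$$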