
Decomposition Methods

• separable problems, complicating variables


• primal decomposition
• dual decomposition
• complicating constraints
• general decomposition structures



Separable problem

minimize f1(x1) + f2(x2)


subject to x1 ∈ C1, x2 ∈ C2

• we can solve for x1 and x2 separately (in parallel)

• even if they are solved sequentially, this gives an advantage if computational effort is superlinear in problem size

• called separable or trivially parallelizable

• generalizes to any objective of form Ψ(f1, f2) with Ψ nondecreasing (e.g., max)



Complicating variable

consider unconstrained problem,

minimize f (x) = f1(x1, y) + f2(x2, y)

with variable x = (x1, x2, y)
• y is the complicating variable or coupling variable; when it is fixed
the problem is separable in x1 and x2

• x1, x2 are private or local variables; y is a public or interface or boundary variable between the two subproblems



Primal decomposition
fix y and define

subproblem 1: minimize_{x1} f1(x1, y)

subproblem 2: minimize_{x2} f2(x2, y)

with optimal values φ1(y) and φ2(y)


original problem is equivalent to master problem

minimizey φ1(y) + φ2(y)

with variable y
called primal decomposition since master problem manipulates primal
(complicating) variables



• if original problem is convex, so is master problem

• can solve master problem using


– bisection (if y is scalar)
– gradient or Newton method (if φi differentiable)
– subgradient, cutting-plane, or ellipsoid method

• each iteration of master problem requires solving the two subproblems (in parallel)

• if master algorithm converges fast enough and subproblems are sufficiently easier to solve than original problem, we get savings



Primal decomposition algorithm
(using subgradient algorithm for master)

repeat
1. Solve the subproblems (in parallel).
Find x1 that minimizes f1(x1, y), and a subgradient g1 ∈ ∂φ1(y).
Find x2 that minimizes f2(x2, y), and a subgradient g2 ∈ ∂φ2(y).
2. Update complicating variable.
y := y − αk (g1 + g2).

step length αk can be chosen in any of the standard ways
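To make the loop concrete, here is a minimal Python sketch, assuming hypothetical quadratic subproblems (the data A1, b1, A2, b2 and the objective form are invented for illustration; they are not the PWL example that follows). It uses the fact that a subgradient of φi at y is the partial derivative of fi with respect to y, evaluated at the minimizing xi:

import numpy as np

rng = np.random.default_rng(0)
n = 20
A1, b1 = rng.standard_normal((n, n)), rng.standard_normal(n)
A2, b2 = rng.standard_normal((n, n)), rng.standard_normal(n)

def subproblem(A, b, y):
    # minimize over x: 0.5*||x||^2 + 0.5*||A x - y*b||^2 (closed form)
    x = np.linalg.solve(np.eye(n) + A.T @ A, y * (A.T @ b))
    phi = 0.5 * x @ x + 0.5 * np.sum((A @ x - y * b) ** 2)
    g = -b @ (A @ x - y * b)   # d(phi)/dy = df/dy at the minimizer
    return phi, g

y = 0.0
for k in range(1, 51):
    phi1, g1 = subproblem(A1, b1, y)   # the two solves can run in parallel
    phi2, g2 = subproblem(A2, b2, y)
    y = y - (1.0 / k) * (g1 + g2)      # subgradient step, alpha_k = 1/k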



Example

• x1, x2 ∈ R^20, y ∈ R
• fi are PWL (max of 100 affine functions each); f⋆ ≈ 1.71
[Figure: φ1(y), φ2(y), and φ1(y) + φ2(y) versus y ∈ [−1, 1]]
primal decomposition, using bisection on y

[Figure: f(k) − f⋆ (log scale) versus iteration k = 1, . . . , 8]


Dual decomposition
Step 1: introduce new variables y1, y2

minimize f (x) = f1(x1, y1) + f2(x2, y2)


subject to y1 = y2

• y1, y2 are local versions of complicating variable y


• y1 = y2 is consensus constraint



Step 2: form dual problem

L(x1, y1, x2, y2) = f1(x1, y1) + f2(x2, y2) + νT (y1 − y2)

separable; can minimize over (x1, y1) and (x2, y2) separately

g1(ν) = inf_{x1,y1} ( f1(x1, y1) + νT y1 ) = −f1∗(0, −ν)

g2(ν) = inf_{x2,y2} ( f2(x2, y2) − νT y2 ) = −f2∗(0, ν)

dual problem is: maximize g(ν) = g1(ν) + g2(ν)


• computing gi(ν) is the ith dual subproblem
• can be done in parallel
• a subgradient of −g is y2 − y1 (from solutions of subproblems)



Dual decomposition algorithm
(using subgradient algorithm for master)

repeat
1. Solve the dual subproblems (in parallel).
Find x1, y1 that minimize f1(x1, y1) + νT y1.
Find x2, y2 that minimize f2(x2, y2) − νT y2.
2. Update dual variables (prices).
ν := ν − αk (y2 − y1).

• step length αk can be chosen in standard ways


• at each step we have a lower bound g(ν) on p⋆
• iterates are generally infeasible, i.e., y1 ≠ y2
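As a toy illustration, here is this iteration in Python for a scalar coupling variable, assuming made-up strictly convex quadratic subproblems (the constants c1, c2 are invented, and the private variables xi are suppressed since they play no role in this instance):

c1, c2 = -1.0, 3.0   # hypothetical data: f_i reduces to 0.5*(y_i - c_i)^2
nu = 0.0
for k in range(1, 101):
    y1 = c1 - nu                      # argmin of 0.5*(y - c1)**2 + nu*y
    y2 = c2 + nu                      # argmin of 0.5*(y - c2)**2 - nu*y
    nu = nu - (1.0 / k) * (y2 - y1)   # price update
# iterates approach nu* = (c1 - c2)/2 = -2, y1 = y2 = (c1 + c2)/2 = 1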



Finding feasible iterates

• reasonable guess of feasible point from (x1, y1), (x2, y2):

(x1, ȳ), (x2, ȳ), ȳ = (y1 + y2)/2

– projection onto feasible set y1 = y2


– gives upper bound p⋆ ≤ f1(x1, ȳ) + f2(x2, ȳ)

• a better feasible point: replace y1, y2 with ȳ and solve primal subproblems minimize_{x1} f1(x1, ȳ), minimize_{x2} f2(x2, ȳ)
– gives (better) upper bound p⋆ ≤ φ1(ȳ) + φ2(ȳ)
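Continuing the scalar sketch above, the averaging step and the resulting bound are one line each (in that toy instance the two bounds coincide, because there are no private variables left to re-optimize):

y_bar = 0.5 * (y1 + y2)   # project (y1, y2) onto the set y1 = y2
ub = 0.5 * (y_bar - c1) ** 2 + 0.5 * (y_bar - c2) ** 2   # p* <= ub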



(Same) example

[Figure: g1(ν), g2(ν), and g1(ν) + g2(ν) versus ν ∈ [−1, 1]]


dual decomposition convergence (using bisection on ν)

[Figure: dual lower bound g(ν) with the better and worse upper bounds versus iteration k = 0, . . . , 15]


Interpretation

• y1 is resources consumed by first unit, y2 is resources generated by second unit

• y1 = y2 is consistency condition: supply equals demand

• ν is a set of resource prices

• master algorithm adjusts prices at each step, rather than allocating resources directly (primal decomposition)



Recovering the primal solution from the dual
• iterates in dual decomposition: ν(k), (x1(k), y1(k)), (x2(k), y2(k))
– (x1(k), y1(k)) is a minimizer of f1(x1, y1) + ν(k)T y1, found in subproblem 1
– (x2(k), y2(k)) is a minimizer of f2(x2, y2) − ν(k)T y2, found in subproblem 2

• ν(k) → ν⋆ (i.e., we have price convergence)

• subtlety: we need not have y1(k) − y2(k) → 0

• the hammer: if fi strictly convex, we have y1(k) − y2(k) → 0

• can fix allocation (i.e., compute φi), or add regularization terms ε‖yi‖²



Decomposition with constraints
can also have complicating constraints, as in

minimize f1(x1) + f2(x2)


subject to x1 ∈ C1, x2 ∈ C2
h1(x1) + h2(x2) ⪯ 0

• fi, hi, Ci convex

• h1(x1) + h2(x2) ⪯ 0 is a set of p complicating or coupling constraints, involving both x1 and x2

• can interpret coupling constraints as limits on resources shared between two subproblems



Primal decomposition
fix t ∈ Rp and define

subproblem 1: minimize f1(x1)
              subject to x1 ∈ C1, h1(x1) ⪯ t

subproblem 2: minimize f2(x2)
              subject to x2 ∈ C2, h2(x2) ⪯ −t

• t is the quantity of resources allocated to first subproblem (−t is allocated to second subproblem)

• master problem: minimize φ1(t) + φ2(t) (optimal values of subproblems) over t

• subproblems can be solved separately (in parallel) when t is fixed



Primal decomposition algorithm
repeat
1. Solve the subproblems (in parallel).
Solve subproblem 1, finding x1 and λ1.
Solve subproblem 2, finding x2 and λ2.
2. Update resource allocation.
t := t − αk (λ2 − λ1).
• λi is an optimal Lagrange multiplier associated with resource constraint
in subproblem i

• λ2 − λ1 ∈ ∂(φ1 + φ2)(t)

• αk is an appropriate step size

• all iterates are feasible (when subproblems are feasible)
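Here is a sketch of this loop in Python with CVXPY, assuming invented problem data (strictly convex quadratic objectives, affine hi, and a box standing in for Ci; none of this reproduces the example on the next slide). The multiplier λi is read off the resource constraint's dual_value:

import cvxpy as cp
import numpy as np

n, p = 20, 2
rng = np.random.default_rng(0)
P1 = np.diag(rng.uniform(1.0, 2.0, n))
P2 = np.diag(rng.uniform(1.0, 2.0, n))
H1, H2 = rng.standard_normal((p, n)), rng.standard_normal((p, n))

def subproblem(P, H, t_i):
    x = cp.Variable(n)
    resource = H @ x <= t_i                # h_i(x_i) <= t_i (affine here)
    prob = cp.Problem(cp.Minimize(cp.quad_form(x, P) - cp.sum(x)),
                      [resource, cp.norm(x, "inf") <= 1])   # box plays C_i
    prob.solve()
    return x.value, resource.dual_value    # optimal x_i and multiplier lam_i

t = np.zeros(p)
for k in range(1, 31):   # assumes the subproblems remain feasible
    x1, lam1 = subproblem(P1, H1, t)
    x2, lam2 = subproblem(P2, H2, -t)
    t = t - (0.5 / k) * (lam2 - lam1)      # lam2 - lam1 in d(phi1 + phi2)(t)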



Example

• x1, x2 ∈ R^20, t ∈ R^2; fi are quadratic, hi are affine, Ci are polyhedra defined by 100 inequalities; p⋆ ≈ −1.33; αk = 0.5/k
[Figure: f(k) − p⋆ (log scale) versus iteration k = 0, . . . , 100]


resource allocation t to first subsystem (second subsystem gets −t)

[Figure: the two components of t versus iteration k = 0, . . . , 100]


Dual decomposition
form (separable) partial Lagrangian

L(x1, x2, λ) = f1(x1) + f2(x2) + λT (h1(x1) + h2(x2))
             = (f1(x1) + λT h1(x1)) + (f2(x2) + λT h2(x2))

fix dual variable λ and define

subproblem 1: minimize f1(x1) + λT h1(x1)
              subject to x1 ∈ C1

subproblem 2: minimize f2(x2) + λT h2(x2)
              subject to x2 ∈ C2

with optimal values g1(λ), g2(λ)



• −hi(x̄i) ∈ ∂(−gi)(λ), where x̄i is any solution to subproblem i

• −h1(x̄1) − h2(x̄2) ∈ ∂(−g)(λ)

• the master algorithm updates λ using this subgradient



Dual decomposition algorithm
(using projected subgradient method)
repeat
1. Solve the subproblems (in parallel).
Solve subproblem 1, finding an optimal x̄1.
Solve subproblem 2, finding an optimal x̄2.
2. Update dual variables (prices).
λ := (λ + αk (h1(x̄1) + h2(x̄2)))+.

• αk is an appropriate step size

• iterates need not be feasible

• can again construct feasible primal variables using projection
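The corresponding price iteration, reusing the hypothetical data (n, p, P1, P2, H1, H2) from the primal decomposition sketch above; prices rise on over-used resources and the projection clips them at zero:

def dual_subproblem(P, H, lam):
    x = cp.Variable(n)
    prob = cp.Problem(cp.Minimize(cp.quad_form(x, P) - cp.sum(x)
                                  + lam @ (H @ x)),   # f_i + lam^T h_i
                      [cp.norm(x, "inf") <= 1])       # x_i in C_i
    prob.solve()
    return x.value

lam = np.zeros(p)
for k in range(1, 31):
    xb1 = dual_subproblem(P1, H1, lam)
    xb2 = dual_subproblem(P2, H2, lam)
    lam = np.maximum(lam + (0.5 / k) * (H1 @ xb1 + H2 @ xb2), 0.0)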



Interpretation

• λ gives prices of resources

• subproblems are solved separately, taking income/expense from resource usage into account

• master algorithm adjusts prices

• prices on over-subscribed resources are increased; prices on undersubscribed resources are reduced, but never made negative



(Same) example
subgradient method for master; resource prices λ
[Figure: resource prices λ versus iteration k = 0, . . . , 30]


dual decomposition convergence; f̂ is the objective of the projected feasible allocation

[Figure: g and f̂ versus iteration k = 0, . . . , 30]


[Figure: p⋆ − g(λ) and f̂ − g(λ) (log scale) versus iteration k = 0, . . . , 30]


General decomposition structures

• multiple subsystems

• (variable and/or constraint) coupling constraints between subsets of subsystems

• represent as hypergraph with subsystems as vertices, coupling as hyperedges or nets

• without loss of generality, can assume all coupling is via consistency constraints



Simple example

[Figure: subsystems 1, 2, 3 in a chain, joined by two simple edges]

• 3 subsystems, with private variables x1, x2, x3, and public variables y1,
(y2, y3), and y4
• 2 (simple) edges

minimize f1(x1, y1) + f2(x2, y2, y3) + f3(x3, y4)


subject to (x1, y1) ∈ C1, (x2, y2, y3) ∈ C2, (x3, y4) ∈ C3
y1 = y2, y3 = y4



A more complex example

[Figure: subsystems 1, . . . , 5 coupled by four nets c1, c2, c3, c4]



General form

minimize Σ_{i=1}^K fi(xi, yi)
subject to (xi, yi) ∈ Ci, i = 1, . . . , K
           yi = Ei z, i = 1, . . . , K

• private variables xi, public variables yi

• net (hyperedge) variables z ∈ RN; zi is the common value of the public variables in net i

• matrices Ei give the netlist or hypergraph: row k of Ei is epT, where the kth entry of yi is in net p
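For the simple three-subsystem example a few slides back (nets y1 = y2 and y3 = y4, so z ∈ R^2), the netlist matrices would be as follows (a sketch; any consistent row ordering works):

import numpy as np

E1 = np.array([[1, 0]])            # y1 lies in net 1
E2 = np.array([[1, 0],             # y2 lies in net 1
               [0, 1]])            # y3 lies in net 2
E3 = np.array([[0, 1]])            # y4 lies in net 2

z = np.array([3.0, -1.0])          # arbitrary net values
print(E1 @ z, E2 @ z, E3 @ z)      # y_i = E_i z forces y1 = y2, y3 = y4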



Primal decomposition
φi(yi) is optimal value of subproblem

minimize fi(xi, yi)


subject to (xi, yi) ∈ Ci

repeat
1. Distribute net variables to subsystems.
yi := Eiz, i = 1, . . . , K.
2. Optimize subsystems (separately).
Solve subproblems to find optimal xi, gi ∈ ∂φi(yi), i = 1, . . . , K.
3. Collect and sum subgradients for each net.
g := Σ_{i=1}^K EiT gi.
4. Update net variables.
z := z − αk g.
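A toy run of this loop, reusing E1, E2, E3 above and pretending each φi is the explicit quadratic 0.5‖yi − ai‖² (the targets ai are invented, and the xi and subproblem solves are suppressed):

a1, a2, a3 = np.array([1.0]), np.array([2.0, -3.0]), np.array([0.5])
z = np.zeros(2)
for k in range(1, 51):
    g = np.zeros(2)
    for Ei, ai in ((E1, a1), (E2, a2), (E3, a3)):
        yi = Ei @ z                # 1. distribute net variables
        gi = yi - ai               # 2. gradient of phi_i at y_i
        g += Ei.T @ gi             # 3. collect and sum per net
    z = z - (1.0 / k) * g          # 4. update net variables
# net 1 approaches (1 + 2)/2 = 1.5, net 2 approaches (-3 + 0.5)/2 = -1.25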



Dual decomposition
gi(νi) is optimal value of subproblem

minimize fi(xi, yi) + νiT yi


subject to (xi, yi) ∈ Ci

given initial price vector ν that satisfies ETν = 0 (e.g., ν = 0).


repeat
1. Optimize subsystems (separately).
Solve subproblems to obtain xi, yi.
2. Compute average value of public variables over each net.
ẑ := (ETE)−1ETy.
3. Update prices on public variables.
ν := ν + αk (y − E ẑ).
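The averaging step has a simple interpretation: with E the stacked netlist, ETE is diagonal and its ith entry counts the public variables on net i, so ẑ is the per-net mean. A small check with the E1, E2, E3 defined above (the stacked y is made-up data):

E = np.vstack([E1, E2, E3])                # stacked netlist matrix
y = np.array([3.2, 2.8, -0.9, -1.1])       # stacked (y1, y2, y3, y4)
z_hat = np.linalg.solve(E.T @ E, E.T @ y)  # per-net means: [3.0, -1.0]
nu = np.zeros(4)
nu = nu + 0.5 * (y - E @ z_hat)            # update preserves E^T nu = 0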



A more complex example
subsystems: quadratic plus PWL objective with 10 private variables;
9 public variables and 4 nets; p⋆ ≈ 11.1; α = 0.5
[Figure: g(ν), f(x̂, ŷ), and f(x, ŷ) versus iteration k = 0, . . . , 10]


consistency constraint residual ‖y − Eẑ‖ versus iteration number

[Figure: ‖y − Eẑ‖ (log scale) versus iteration k = 0, . . . , 10]
