François Delarue Lectures
Ramesh Kadambi
May 9, 2021
Contents

1 Motivation
2 Asymptotic Formulation
  2.1 Paradigm
  2.2 Program
4 Interaction Model
6 Propagation of Chaos
  6.1 Distance in Measure Spaces
  6.2 Taking N to ∞
10 Other Approaches
  10.1 The fixed point problem
  10.2 Solvability
15 Convergence Problem
  15.1 Approaches
  15.2 The Master Equation Method
  15.3 Finding the Nash Equilibrium
  15.4 The Classical Solution
  15.5 The Ansatz
  15.6 From Master Equation to Nash
  15.7 Comparison
1 Motivation
Mean field games are an extension of finite-player games to infinitely many players. The main characteristics of mean field models are described in what follows.
2 Asymptotic Formulation
Passing from the finite to the infinite game reduces complexity via the notions of propagation of chaos and the law of large numbers (LLN). We do this in two steps, and we are interested in how equilibria are formed and in the nature of these equilibria.
2.1 Paradigm
1. The main ideas used are mean field/symmetry arguments, justified by propagation of chaos and the LLN.
2. We reduce the asymptotic analysis to one typical player interacting with a theoretical distribution of the population.
3. This decreases the complexity of the asymptotic formulation.
2.2 Program
1. Are there asymptotic equilibria? Are they unique? If they exist, what is the shape of this equilibrium?
2. We then want to go backward to finite games, using asymptotic equilibria as quasi-equilibria in the finite game.
3. Prove the convergence of equilibria in finite-player games.
The state is $X_t^i \in \mathbb{R}^d$, and the $W_t^i$ are $N$ independent Brownian motions. The cost function for each player, for a finite horizon, is
$$J^i = \mathbb{E}\left[ g(X_T^i, <>) + \int_0^T \Big( f(X_t^i, <>) + \frac{1}{2}|\alpha_t^i|^2 \Big)\, dt \right]$$
Above, the $W_t^i$ are independent Brownian motions. By the same token, the $X_0^i$ are independent. The next issue is the interaction of the agents; that is where the placeholder $<>$ comes into play. The functions $g : \mathbb{R}^d \times <> \to \mathbb{R}$ and, similarly, $f : \mathbb{R}^d \times <> \to \mathbb{R}$ take in the interaction term, specified by the empirical distribution of the agents. The empirical distribution is essentially given by
$$\mu_t^N = \frac{1}{N} \sum_{j=1}^N \delta_{X_t^j}.$$
The collective state of the population is described by this measure. The cost function now becomes
$$J^i = \mathbb{E}\left[ g(X_T^i, \mu_T^N) + \int_0^T \Big( f(X_t^i, \mu_t^N) + \frac{1}{2}|\alpha_t^i|^2 \Big)\, dt \right].$$
If we now consider $\mu = \mu_X^N = \frac{1}{N}\sum_j \delta_{x_j}$, we can write $f\big(x_i, \frac{1}{N}\sum_{j=1}^N \delta_{x_j}\big) = \frac{1}{N}\sum_{j=1}^N F(x_i - x_j)$. This is an interesting function that indeed depends on the distribution of the agent states. If $F$ is the distance between states, then it captures the average distance of all the points from $x_i$. This is comparable to the potential energy in physics.
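As an aside, here is a minimal numerical sketch of this pairwise form; the kernel $F(x) = |x|$ and the Gaussian sample of states are assumptions made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000
states = rng.normal(size=N)            # hypothetical agent states X_t^j

def f_empirical(x_i, states):
    """f(x_i, mu^N) = (1/N) sum_j F(x_i - x_j), here with F = |.|,
    i.e. the average distance of all agents from x_i."""
    return np.mean(np.abs(x_i - states))

print(f_empirical(states[0], states))  # interaction cost felt by agent 0
```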
In the above definition the equilibrium is based only on the equilibrium values of the remaining agents. It is fixed and solved as an open-loop control problem. There is also a closed-loop solution, where the optimal control is a function of the current state of the system; in the open-loop setup the optimal control is a function only of the noises, while in the closed-loop setup the control or strategy is a feedback function of the state.
Consider the optimal control problem for a robot: the open-loop control is just a minimization problem that yields a function $u(t)$ driving the robot from point $a$ to point $b$ given the performance criteria. In the case of feedback control, $u(t) = f(X_d - X_t)$, and it does not require path planning in which one specifies the tracking trajectory.
4 Interaction Model
The interaction can be modeled as a function of the mutual states and the empirical distribution of the states of the agents, or it can be modeled through the empirical distribution of their actions. In the case of price formation and price setting, one probably sets one's price based on the mean of all the prices being set. The latter problem, where the interaction is modeled via the strategies/actions of the agents, is called an MFG of control type.
Here $\mu_t^{*,N} = \frac{1}{N}\sum_{j=1}^N \delta_{X_t^{*,j}}$. The question we would like to answer now is what happens to this new particle system, with the modifications using symmetry, as $N \to \infty$. Enter propagation of chaos.
6 Propagation of Chaos
Now that we have the asymptotic formulation with a fixed $\alpha$ as in (5.2), our questions for this large system of particles are as follows.

1. Existence.
2. Well-posedness.

The Brownian motions, it turns out, make things easier and are helpful. The conditions required are similar to those for first-order ODEs. That is,
$$[0,T] \times \mathbb{R}^{d\times N} \ni (t, (x_1, x_2, \cdots, x_N)) \mapsto \alpha\Big(t, x_i, \frac{1}{N}\sum_{j=1}^N \delta_{x_j}\Big) \tag{6.1}$$
is Lipschitz, where $x_i \in \mathbb{R}^d$. Note that here the empirical measure is fixed. It also needs to be bounded in time. The issue is that $\alpha$ is defined on a space of probability measures, not on a Euclidean space, so the smoothness and Lipschitz conditions have to be specified in the measure space. Looking at the definition (6.1), we have replaced $\{x_{-i}\}$ by the empirical measure; essentially, the function $\alpha$ takes the states and maps them through a measure-space argument.
6.1 Distance in Measure Spaces

Given $\mu, \nu \in \mathcal{P}_p(\mathbb{R}^d)$, we define a measure $\Pi$ on $\mathbb{R}^d \times \mathbb{R}^d$ such that $\mu, \nu$ are the marginals of $\Pi$. We basically have
$$\mu = \Pi \circ e_1^{-1}; \qquad e_1 : (x_1, x_2) \in \mathbb{R}^d \times \mathbb{R}^d \mapsto x_1 \tag{6.2}$$
$$\nu = \Pi \circ e_2^{-1}; \qquad e_2 : (x_1, x_2) \in \mathbb{R}^d \times \mathbb{R}^d \mapsto x_2 \tag{6.3}$$
The distance function, $\forall \mu, \nu \in \mathcal{P}(\mathbb{R}^d)$, is now given as
$$W_p(\mu, \nu) = \inf_{\Pi} \left( \int_{\mathbb{R}^d \times \mathbb{R}^d} |x_1 - x_2|^p \, d\Pi(x_1, x_2) \right)^{1/p}.$$
Remark 2 (Bound for the distance). Consider two random variables $X_1$ and $X_2$ with laws $P_{X_1}$ and $P_{X_2}$ and joint law $P_{X_1,X_2}$. Clearly $P_{X_1}, P_{X_2} \in \mathcal{P}(\mathbb{R})$. We have $\mathbb{E}[|X_1 - X_2|^p] = \int_{\mathbb{R}\times\mathbb{R}} |x_1 - x_2|^p \, dP_{X_1,X_2}(x_1, x_2)$. Now, to bound $W_p(\mu, \nu)$, take $\Pi = P_{X_1,X_2}$; clearly $P_{X_1,X_2}$ satisfies the conditions (6.2), (6.3). We then have
$$W_p(P_{X_1}, P_{X_2}) \le \mathbb{E}[|X_1 - X_2|^p]^{1/p} = \left( \int_{\mathbb{R}\times\mathbb{R}} |x_1 - x_2|^p \, dP_{X_1,X_2}(x_1, x_2) \right)^{1/p}. \tag{6.4}$$
We see that (6.4) is true by definition. This is a major advantage: if we have random variables, then it is fairly straightforward to construct a bound on the distance.
Remark 3 (Distance between empirical measures). Consider points $x_i, y_i \in \mathbb{R}^d$ and the empirical measures $P_X = \frac{1}{N}\sum_{i=1}^N \delta_{x_i}$ and $P_Y = \frac{1}{N}\sum_{i=1}^N \delta_{y_i}$. Then $W_p(P_X, P_Y)$ is fairly straightforward to bound, using the coupling that pairs $x_i$ with $y_i$:
$$W_p(P_X, P_Y) \le \left( \frac{1}{N} \sum_{i=1}^N |x_i - y_i|^p \right)^{1/p}.$$
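In one dimension this bound is easy to test numerically: for two empirical measures with the same number of atoms, the optimal coupling is the monotone (sorted) pairing, while the identity pairing of Remark 3 gives the upper bound. A minimal sketch; the two Gaussian samples are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 500, 2
x, y = rng.normal(0.0, 1.0, N), rng.normal(2.0, 1.0, N)

# Remark 3 bound: couple x_i with y_i as given (identity coupling)
bound = np.mean(np.abs(x - y) ** p) ** (1 / p)

# In 1-D the optimal coupling pairs sorted samples, giving W_p exactly
wp = np.mean(np.abs(np.sort(x) - np.sort(y)) ** p) ** (1 / p)

print(f"W_{p} = {wp:.3f} <= identity-coupling bound = {bound:.3f}")
```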
Remark 4 (Lipschitz requirement for $\alpha$). The standard assumption on $\alpha(t, x_i, \mu_t^N)$ is that it is Lipschitz continuous in $x$ and in $\mu_t^N \in \mathcal{P}_p(\mathbb{R}^d)$, where $p \in \{1, 2\}$ and $\mathcal{P}_p(\mathbb{R}^d)$ is equipped with the distance $W_p$.
6.2 Taking N to ∞.
When $N \to \infty$, our particle system can be written, independently of the optimization problem, as
$$dX_t^i = \alpha(t, X_t^i, \mu_t^N)\, dt + dB_t^i \tag{6.5}$$
We observe that as $N \to \infty$ the individual particle $X^i$ is impacted less and less by the noises $B_t^{-i}$. Our ansatz is that the particles become uncorrelated as $N \to \infty$. This connects with our earlier assumption that the initial states $X_0^i$ are independent; as $N \to \infty$ we claim they are identically distributed as well. The initial state is that of $N$ i.i.d. particles, and given our symmetry the particles are exchangeable and i.i.d. for all time. The level of decorrelation is strong enough to give independence of the particles, so we can use the LLN. We therefore have
$$\mu_t^N \underset{N\to\infty}{\approx} P_{X_t}.$$
Proposition 6.1 (The same distribution). Denote the particles, the solution to (6.5), by $\{X_t^{N,1}, \cdots, X_t^{N,N}\}_{t\in[0,T]}$. Then, for any fixed $k$,
$$P(X_t^{N,1}, \cdots, X_t^{N,k}) \underset{N\to\infty}{\overset{\mathcal{L}}{\Longrightarrow}} P(\bar{X}_t)^{\otimes k}, \quad t \in [0,T],$$
where $\bar{X}$ solves
$$d\bar{X}_t = \alpha(t, \bar{X}_t, P(\bar{X}_t))\, dt + d\bar{B}_t.$$
This equation is called the McKean-Vlasov stochastic differential equation. The solution here interacts with its own distribution, and we have introduced a new Brownian motion that is independent of the label. This again boils down to the indistinguishability of the particles.
Since this is true for any arbitrary test function, we have the following equation for the density:
$$\partial_t \mu_t(x) = -\nabla\cdot[\alpha(t, x, \mu_t)\,\mu_t(x)] + \frac{1}{2}\nabla^T\nabla \mu_t(x).$$
We solve this as an initial value problem with $m_0$, the density of $P(\bar{X}_0)$. These results are from the notes of Sznitman [2].
Remark 5 ($N \to \infty$ dynamics). We just recall the $N \to \infty$ dynamics and the Fokker-Planck equation:
$$d\bar{X}_t^* = \alpha(t, \bar{X}_t^*, \mu_t)\, dt + dB_t$$
$$\partial_t \mu_t(x) = -\nabla\cdot\big(\alpha(t, x, \mu_t)\,\mu_t(x)\big) + \frac{1}{2}\nabla^T\nabla \mu_t(x)$$
$$\mu_0 = P(\bar{X}_0^*)$$
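A minimal Euler-Maruyama sketch of the particle system (6.5); the mean-reverting interaction $\alpha(t, x, \mu) = -(x - \mathrm{mean}(\mu))$ is an assumption chosen for illustration. As $N$ grows, the empirical moments settle toward their McKean-Vlasov values:

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_steps = 1.0, 100
dt = T / n_steps

def simulate(N):
    """Euler-Maruyama for dX^i = alpha(t, X^i, mu^N) dt + dB^i with the
    illustrative interaction alpha(t, x, mu) = -(x - mean(mu))."""
    X = rng.normal(0.0, 1.0, N)                 # iid initial states
    for _ in range(n_steps):
        drift = -(X - X.mean())                 # interaction via mu^N
        X = X + drift * dt + np.sqrt(dt) * rng.normal(0.0, 1.0, N)
    return X

for N in (10, 100, 10_000):
    X = simulate(N)
    print(f"N={N:6d}  empirical mean={X.mean():+.3f}  variance={X.var():.3f}")
```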
However, we are looking for an intrinsic characterization of the equilibrium. In order to do so, we go back to our optimization problem and try to simplify our original model, subject to
$$dX_t = \alpha_t\, dt + dB_t, \qquad X_0 \sim m_0,$$
with cost
$$J(\alpha, \mu) = \mathbb{E}\left[ g(X_T, m_T) + \int_0^T \Big( f(X_t, m_t) + \frac{1}{2}|\alpha_t|^2 \Big)\, dt \right].$$
Model 2 (The tagged player). The value function $u(t,x)$ is given as
$$u(t, x) = \inf_{(\alpha_s)_{s\in[t,T]}} \mathbb{E}\left[ g(X_T, m_T) + \int_t^T \Big( f(X_s, m_s) + \frac{1}{2}|\alpha_s|^2 \Big)\, ds \right]$$
subject to
$$dX_s = \alpha_s\, ds + dB_s, \quad t \le s \le T, \quad X_t = x,$$
where $m_t$ is the equilibrium distribution. Note that we use $m$ here because later we need to distinguish between $m$ and the $\mu$ we have used all along.
The minimization $\inf_{\beta\in\mathbb{R}^d} \big[\beta\cdot\partial_x u(t,x) + \frac{1}{2}|\beta|^2\big]$ is attained at $\beta^* = -\partial_x u(t,x)$ and yields the term $-\frac{1}{2}|\partial_x u(t,x)|^2$; this makes the above system nonlinear. If $f, g$ are bounded and smooth, there are classical solutions to this problem. The solution is obtained through linearization via the Cole-Hopf transformation. On solving the system, the optimal trajectory is obtained to be
$$dX_t^* = -\partial_x u(t, X_t^*)\, dt + dB_t.$$
Well-posedness requires, as usual, Lipschitz continuity and boundedness of $\partial_x u(t, X_t^*)$. We still have to identify the distribution of $X_t^*$. This is done with the Fokker-Planck equation, after appropriately replacing the $\alpha$ term:
$$\partial_t \mu_t(x) = -\nabla\cdot[-\partial_x u(t, <>)\,\mu_t(x)] + \frac{1}{2}\nabla^T\nabla \mu_t(x).$$
Note that $u(t, <>)$ depends on $m_t$, our theoretical distribution. We resolve this by setting $\mu_t = m_t$. Also, taking into account that $f$ does not depend on all the $X^i$, the Hessian $\nabla^T\nabla$ reduces to just the Laplacian. We now have the following coupled system:
$$\partial_t u + \frac{1}{2}\Delta u - \frac{1}{2}|\partial_x u|^2 + f(x, m_t) = 0, \qquad u(T, x) = g(x, m_T),$$
$$\partial_t m_t - \frac{1}{2}\Delta m_t - \nabla\cdot(\partial_x u\, m_t) = 0, \qquad m_0 \text{ given}.$$
The understanding here is that, as in a typical optimal control problem, we have a backward-forward pair. The distribution equations (for $m_t$) are forward equations; the value equations are backward equations. The distribution represents the equilibrium distribution of the population at $t$, and the function $u(t,x)$ is the cost of the representative agent, the tagged player. The classical solution to the HJB when $u, \partial_x u$ are bounded is obtained using the transformation $v(t,x) = e^{-u(t,x)}$, which linearizes the system; this is the Cole-Hopf transformation. It may not work in all situations.
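In the special case $f = 0$ (an assumption made here for a concrete sketch), $v$ solves the backward heat equation with $v(T,\cdot) = e^{-g}$, so $u(t,x) = -\log \mathbb{E}[\exp(-g(x + B_{T-t}))]$ can be evaluated by plain sampling; the terminal cost $g$ below is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1.0
g = lambda x: np.tanh(x) ** 2          # hypothetical bounded terminal cost

def u_cole_hopf(t, x, n_samples=200_000):
    """u(t, x) = -log E[exp(-g(x + B_{T-t}))]; valid when f = 0, since then
    v = exp(-u) solves the backward heat equation with v(T, .) = exp(-g)."""
    xi = rng.standard_normal(n_samples)
    return -np.log(np.mean(np.exp(-g(x + np.sqrt(T - t) * xi))))

print(u_cole_hopf(0.0, 0.5))           # value of the tagged player at x = 0.5
```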
The problem we have is that $f : \mathbb{R}^d \times \mathcal{P}(\mathbb{R}^d) \to \mathbb{R}$ and $g : \mathbb{R}^d \times \mathcal{P}(\mathbb{R}^d) \to \mathbb{R}$ need to be bounded uniformly in the measure argument, because the Cole-Hopf transformation is an exponential transformation. Depending on the requirements, more and more conditions are needed on these functions.
Once the HJB equation is solved, we can design a mapping, as below, for the measure flow. The environment is the entire path $\mu = (\mu_t)_{t\in[0,T]} : [0,T] \to \mathcal{P}_2(\mathbb{R}^d)$. We require the measure flow to be continuous for the $W_2$-topology, a weak topology on the space of measures built from the Wasserstein distance. The interesting point is that this continuity comes for free in our setting. We are also looking for time continuity of the cost function; this requires $f$ to be continuous in the measure argument with respect to $W_2$. We call the classical solution that meets these criteria $u^\mu$.
Proposition 1. $P(X_t)$ is automatically $\frac{1}{2}$-Hölder continuous in $W_2$, with a constant $C$ that is independent of $\mu$.
Proof. Consider $W_2(P(X_t), P(X_s))$, $0 \le s \le t$. We know that $W_2(P(X_t), P(X_s)) \le \mathbb{E}[|X_t - X_s|^2]^{1/2}$. We now substitute for $X_t, X_s$:
$$\mathbb{E}[|X_t - X_s|^2]^{1/2} = \mathbb{E}\left[ \left| \int_s^t -\partial_x u(r, X_r)\, dr + B_t - B_s \right|^2 \right]^{1/2}.$$
Since $\partial_x u(t, X_t)$ is bounded, we have
$$W_2(P(X_t), P(X_s)) \le \mathbb{E}\left[ \left| \int_s^t -\partial_x u(r, X_r)\, dr + B_t - B_s \right|^2 \right]^{1/2}$$
$$= \mathbb{E}\left[ \Big( \int_s^t \partial_x u(r, X_r)\, dr \Big)^2 - 2(B_t - B_s)\int_s^t \partial_x u(r, X_r)\, dr + (B_t - B_s)^2 \right]^{1/2}$$
$$\le \alpha + \beta + \gamma\sqrt{t - s} \le C\sqrt{t - s}.$$
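This estimate can be sanity-checked by simulation: with a bounded stand-in drift ($-\tanh$, an assumption; any bounded $\partial_x u$ behaves the same way), the ratio $W_2(P(X_t), P(X_s))/\sqrt{t-s}$ stays bounded. A sketch, using the 1-D sorted-sample formula for $W_2$:

```python
import numpy as np

rng = np.random.default_rng(4)
N, dt, n_steps = 50_000, 1e-3, 1_000
snap_at, snaps = {200, 250, 400, 800}, {}

X = rng.normal(0.0, 1.0, N)
for k in range(1, n_steps + 1):
    X = X - np.tanh(X) * dt + np.sqrt(dt) * rng.normal(0.0, 1.0, N)  # bounded drift
    if k in snap_at:
        snaps[k] = X.copy()

def w2(a, b):                          # 1-D W2 via the monotone (sorted) coupling
    return np.sqrt(np.mean((np.sort(a) - np.sort(b)) ** 2))

s = 200
for t in (250, 400, 800):
    ratio = w2(snaps[t], snaps[s]) / np.sqrt((t - s) * dt)
    print(f"t-s={(t - s) * dt:.2f}   W2/sqrt(t-s)={ratio:.3f}")      # stays bounded
```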
Despite all this, it is not easy to solve this fixed point problem. To show that there is a fixed point, we need to show that the mapping $\Phi : \mathcal{P}(\mathbb{R}^d) \to \mathcal{P}(\mathbb{R}^d)$ that produces the probability law of $X_t$ is a contraction mapping in a suitable topology, the $W_2$ topology. This is difficult to prove unless the horizon $T$ is small.
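The structure of the fixed point can still be illustrated on a toy model. Below, the environment is reduced to a flow of means, and the best response is replaced by the stand-in feedback $\alpha(t, x, \bar\mu) = -(x - \bar m_t)$ (an assumption; in the real problem each iteration would require solving the HJB). Picard iteration $\mu \mapsto \Phi(\mu)$ then converges; in this linear toy the map happens to contract for every $T$, whereas in general one only expects contraction for small $T$:

```python
import numpy as np

T, n = 1.0, 200
dt = T / n
m0 = 1.0                                    # hypothetical initial mean E[X_0]

def Phi(mbar):
    """Map an environment flow (means mbar_t) to the mean flow of the
    controlled state: d/dt E[X_t] = -(E[X_t] - mbar_t)."""
    m = np.empty(n + 1)
    m[0] = m0
    for k in range(n):
        m[k + 1] = m[k] - (m[k] - mbar[k]) * dt
    return m

mbar = np.zeros(n + 1)                      # initial guess for the environment
for it in range(8):
    new = Phi(mbar)
    print(f"iter {it}:  sup|Phi(mu) - mu| = {np.max(np.abs(new - mbar)):.2e}")
    mbar = new                              # Picard update mu <- Phi(mu)
```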
This is a two-point boundary value problem in one dimension. Even when things are linear, there are examples where a solution does not exist. However, if $b, f, g$ are Lipschitz, there exists, by contraction mapping arguments using Cauchy-Lipschitz theory, a solution to the problem if $T$ is small. The meaning of "small" is interpreted relative to the Lipschitz constants of $b, f, g$.
A simple counterexample can be found in [3]. If there is no contraction argument, then existence and uniqueness are handled separately, as separate questions, precisely because we do not have a direct contraction argument. In conclusion, we need to separate uniqueness and existence. Considering, for example, the terminal condition
$$g(x_T) = \begin{cases} 0 & x_T < -l \\ x_T & -l \le x_T \le l \\ 0 & x_T > l \end{cases}$$
you will see how the solution behaves. In the case of monotonically increasing $g$, the solutions are indeed unique. When $g$ is not monotonically increasing, the solution is unique only depending on the time $T$, namely if $T$ is small relative to the Lipschitz constant.
Definition 5 (Lasry-Lions monotonicity). We say that a function $f$ satisfies the Lasry-Lions monotonicity condition if, $\forall m, m' \in \mathcal{P}(\mathbb{R}^d)$,
$$\int_{\mathbb{R}^d} [f(x, m) - f(x, m')]\, d(m - m')(x) \ge 0, \quad \forall m \ne m'.$$
This is a fairly intuitive definition. All it is saying is that there is a unique probability measure that maximizes the expected value of the given function. Stated simply, this means
$$\mathbb{E}^m[f] \ge \mathbb{E}^{m'}[f], \quad \forall m \ne m'.$$
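For a concrete instance, take $f(x, m) = (\rho * m)(x)$ with a positive-definite kernel $\rho$; then $\int [f(x,m) - f(x,m')]\, d(m - m')(x) = \iint \rho(x - y)\, d(m - m')(y)\, d(m - m')(x) \ge 0$, so $f$ is Lasry-Lions monotone. A numerical sketch on a grid; the Gaussian kernel and the random densities are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
grid = np.linspace(-5.0, 5.0, 201)
K = np.exp(-(grid[:, None] - grid[None, :]) ** 2)   # Gaussian kernel rho(x - y)

def random_density():
    w = rng.random(grid.size)
    return w / w.sum()                               # probability weights on grid

for _ in range(5):
    m, mp = random_density(), random_density()
    d = m - mp
    # int [f(x,m) - f(x,m')] d(m - m')(x)  with  f(x, m) = (rho * m)(x)
    print(float(d @ K @ d) >= 0.0)                   # True: K is positive definite
```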
Example 1 (MFG monotonicity). Consider $h(x, \mu) = \int_{\mathbb{R}^d} L(z, (\rho * \mu)(z))\, \rho(x - z)\, dz$; we are trying to minimize this integral. Given a $\mu$ that describes the empirical distribution, the minimum occurs only when $z$ is not in a dense neighborhood of $x$. This is interesting; the question to ask is how the choice of this distance function impacts the MFG solution. This works for some problems such as seat selection when you enter a bus or train: you tend to find a spot where there is no one else. This phenomenon is expected to provide uniqueness.
I am not sure the above condition is the right way to look at it. Since we solve a minimization problem, the right way to state it would be
$$\mathbb{E}^{\mu}[u^{\mu}(t, X_t)] \le \mathbb{E}^{\mu}[u^{\mu'}(t, X_t)], \qquad \mathbb{E}^{\mu'}[u^{\mu'}(t, X_t')] \le \mathbb{E}^{\mu'}[u^{\mu}(t, X_t')].$$
Now take expectations of the performance criteria; since we are at a minimum, we have
$$\int_{\mathbb{R}^d} \left[ g(x, \mu) + \int_0^T f(x, t, \mu) + \frac{1}{2}\alpha_s^2\, ds \right] d\mu \le \int_{\mathbb{R}^d} \left[ g(x, \mu) + \int_0^T f(x, t, \mu) + \frac{1}{2}\alpha_s^2\, ds \right] d\mu'.$$
Noting that $\alpha_s$ is not a function of $\mu$, adding the two inequalities gives the following result:
$$\int_0^T \int_{\mathbb{R}^d} f(x, t, \mu)\, d(\mu - \mu')\, ds - \int_0^T \int_{\mathbb{R}^d} f(x, t, \mu')\, d(\mu - \mu')\, ds + \int_{\mathbb{R}^d} \big( g(x, \mu) - g(x, \mu') \big)\, d(\mu - \mu') \le 0.$$
We still do not know the structure of the utility function that actually gives uniqueness. The issue is that contraction mapping arguments cannot be made, so we need roundabout conditions that facilitate uniqueness and existence. We make the following choices: $\Phi$ plays the role of the operator $T$ in Schauder's theorem. This can be found in [4], Chapter 4. We choose
$$E = \left\{ \nu : \text{finite signed measure on } \mathbb{R}^d \ \Big|\ \int_{\mathbb{R}^d} |x|\, d|\nu|(x) < \infty \right\}.$$
Clearly $C = \mathcal{P}_1(\mathbb{R}^d)$.
Theorem 2 (Kantorovich-Rubinstein duality). This can be found in Villani [5]. Given probability measures $\mu$ and $\mu'$ on $\mathbb{R}^d$ such that
$$\int_{\mathbb{R}^d} |x|\, d\mu(x) < \infty, \qquad \int_{\mathbb{R}^d} |x|\, d\mu'(x) < \infty,$$
we have
$$W_1(\mu, \mu') = \sup_{\|\varphi\|_{\mathrm{Lip}} \le 1} \int_{\mathbb{R}^d} \varphi\, d(\mu - \mu').$$
These differential equations should map to $P(X_T)$, with $dX_t^{\mu} = -\partial_x u^{\mu}(t, X_t^{\mu})\, dt + dB_t$. Our operator $T$ in the Schauder theorem is the PDE operator. We also have that $C = \mathcal{P}_1(\mathbb{R}^d)$ is closed and convex under $W_1$. What we need to prove is that our PDE operator $L^{\mu} = \partial_t u^{\mu} + \frac{1}{2}\nabla^2 u^{\mu} - \frac{1}{2}|\partial_x u^{\mu}|^2$ is continuous.
9.6 Continuity

Given $\mu, \mu' \in \mathcal{P}_1(\mathbb{R}^d)$; there is no time dependence, so these are simple measures. We look at the distance between $P(X_T^{\mu})$ and $P(X_T^{\mu'})$, in the Wasserstein distance:
$$W_1(P(X_T^{\mu}), P(X_T^{\mu'})) \le \mathbb{E}[|X_T^{\mu} - X_T^{\mu'}|].$$
The above means that we compare $\partial_x u^{\mu}$ and $\partial_x u^{\mu'}$:
$$\mathbb{E}[|X_T^{\mu} - X_T^{\mu'}|] \le \mathbb{E}\int_0^T |\partial_x u^{\mu}(t, X_t^{\mu}) - \partial_x u^{\mu'}(t, X_t^{\mu'})|\, dt.$$
We use the fact that, given $f = 0$, the HJB system gives $\partial_t u^{\mu} + \frac{1}{2}\partial_{xx} u^{\mu} - \frac{1}{2}|\partial_x u^{\mu}|^2 = 0$, from which $\partial_x u$ is a martingale along the optimal trajectory. (Note that $\partial_x[(\partial_x u)^2] = 2\,\partial_x u\,\partial_{xx} u$, which gives the result. So is the delta of an option a martingale? Not really; we need it to satisfy this HJB. That is interesting.) Now we can do something interesting:
$$\mathbb{E}\int_0^T |\partial_x u^{\mu}(t, X_t^{\mu}) - \partial_x u^{\mu'}(t, X_t^{\mu'})|\, dt \le \mathbb{E}\int_0^T |\partial_x g(X_T^{\mu}, \mu) - \partial_x g(X_T^{\mu'}, \mu')|\, dt.$$
We have not gained much, since we still have to compare $X_T^{\mu}$ and $X_T^{\mu'}$. Instead we compare
$$d(X_t^{\mu} - X_t^{\mu'}) = -\big(\partial_x u^{\mu}(t, X_t^{\mu}) - \partial_x u^{\mu'}(t, X_t^{\mu'})\big)\, dt \le \big( \|\partial_x u^{\mu} - \partial_x u^{\mu'}\|_{\infty} + C\,|X_t^{\mu} - X_t^{\mu'}| \big)\, dt,$$
where $C \approx \|\partial_{xx} u^{\mu'}\|_{\infty}$. We now have to show that $W_1(\mu_n, \mu) \to 0 \Rightarrow \|\partial_x u^{\mu_n} - \partial_x u^{\mu}\|_{\infty} \to 0$. The latter relation is the regularity of the HJB equation with respect to the parameter. The remainder is in the references provided.
Since $X_0 \sim \mu_0$ and $|\partial_x u^{\mu}(t, X_t^{\mu})| \le C$, we can see that
$$\mathbb{E}[|X_T|^2] \le C, \quad \text{if } \mathbb{E}[|X_0|^2] = \int_{\mathbb{R}^d} |x|^2\, d\mu_0(x) < \infty.$$
The above result implies that $\int_{\mathbb{R}^d} |x|^2\, dP(X_T^{\mu})(x) \le C$. We now use the result from [5] that states that $\{\mu \in \mathcal{P}(\mathbb{R}^d) : \int_{\mathbb{R}^d} |x|^2\, d\mu(x) \le C\}$ is relatively compact in $\mathcal{P}(\mathbb{R}^d)$ equipped with $W_1$. We now have all the pieces for applying the Schauder fixed point theorem.
9.8 If f ≠ 0?

Here we cannot just work with the terminal value of the population; we need to proceed as follows. Define $M_1 = \{\nu : \text{finite signed measure on } \mathbb{R}^d \mid \int_{\mathbb{R}^d} |x|\, d|\nu|(x) < \infty\}$, as before. We now define $E = C^0([0,T], M_1)$ (the set of continuous mappings from $[0,T]$ to $M_1$) and $C = C^0([0,T], \mathcal{P}_1(\mathbb{R}^d))$, and we define $\Phi : C \to C$. We do the same thing as before: the continuity of $\Phi$ is the continuity of the HJB differential operator with respect to a parameter. Relative compactness is similar to the case $f = 0$: we prove that $\sup_{0\le t\le T} \int_{\mathbb{R}^d} |x|^2\, dP(X_t^{\mu})(x) \le C$, which essentially requires the variance to be bounded for the entire process $X_t^{\mu}$. We then show that $W_1(P(X_t^{\mu}), P(X_s^{\mu})) \le C\sqrt{t - s}$, $0 \le s \le t \le T$. This gives equicontinuity and uniform continuity. For more details, see Chapter 4 of [4].
10 Other Approaches
So far we have looked at the PDE approach to the solution of mean field games. There is also a probabilistic approach, in which we:

1. Reformulate the MFG system as a McKean-Vlasov FBSDE.
2. Unlike in the PDE approach, follow the cost along the evolution of the optimal trajectory.

Consider the given MFG system and the optimal trajectory in the environment $\mu = \{\mu_t\}_t$.
We define $Y_t = u^{\mu}(t, X_t)$ as the cost of the remaining path of the system at time $t$, and expand $Y_t$ using Itô's lemma:
$$dY_t = \partial_t u^{\mu}(t, X_t)\, dt + \partial_x u^{\mu}(t, X_t)\big[-\partial_x u^{\mu}(t, X_t)\, dt + dB_t\big] + \frac{1}{2}\partial_{xx} u^{\mu}(t, X_t)\, dt$$
$$= \Big[ \partial_t u^{\mu}(t, X_t) + \frac{1}{2}\partial_{xx} u^{\mu}(t, X_t) - |\partial_x u^{\mu}(t, X_t)|^2 \Big]\, dt + \partial_x u^{\mu}(t, X_t)\, dB_t.$$
We use the PDE (9.1) to replace the drift term by $-\big( f(X_t, \mu_t) + \frac{1}{2}|\partial_x u^{\mu}(t, X_t)|^2 \big)$ and obtain
$$dY_t = -\Big( f(X_t, \mu_t) + \frac{1}{2}|\partial_x u^{\mu}(t, X_t)|^2 \Big)\, dt + \partial_x u^{\mu}(t, X_t)\, dB_t,$$
that is, with $Z_t = \partial_x u^{\mu}(t, X_t)$,
$$dY_t = -\Big( f(X_t, \mu_t) + \frac{1}{2}|Z_t|^2 \Big)\, dt + Z_t\, dB_t, \qquad Y_T = g(X_T, \mu_T).$$
The theory for these BSDEs is due to Pardoux and Peng in the 1990s. We note that there are two unknowns in this BSDE, the value function $Y_t = u^{\mu}(t, X_t)$ and the gradient of the value function $Z_t = \partial_x u^{\mu}(t, X_t)$, which is unusual. The option pricing system is a BSDE. It should be noted that $Y_t$ is an adapted process; this implies that $Y_t$ is $\sigma(X_0, B_s, s \le t)$-measurable. The Brownian motion makes the process non-anticipative.
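A minimal numerical check of this representation, under the simplifying assumptions $f = 0$, $d = 1$, and a hypothetical terminal cost $g$: $u$ comes from the Cole-Hopf formula (via Gauss-Hermite quadrature), $Z_t = \partial_x u$ by finite differences, and rolling the BSDE forward along one simulated optimal path should land near $g(X_T)$:

```python
import numpy as np

rng = np.random.default_rng(6)
T, n_steps = 1.0, 200
dt = T / n_steps
g = lambda x: np.tanh(x)                 # hypothetical terminal cost

# Cole-Hopf closed form for f = 0:  u(t, x) = -log E[exp(-g(x + B_{T-t}))]
zq, wq = np.polynomial.hermite_e.hermegauss(60)     # weight exp(-z^2 / 2)
def u(t, x):
    s = np.sqrt(T - t)
    return -np.log(np.sum(wq * np.exp(-g(x + s * zq))) / np.sqrt(2 * np.pi))

def ux(t, x, h=1e-4):                    # Z_t = du/dx by central difference
    return (u(t, x + h) - u(t, x - h)) / (2 * h)

X, Y = 0.0, u(0.0, 0.0)
for k in range(n_steps):
    t = k * dt
    Z = ux(t, X)
    dB = np.sqrt(dt) * rng.standard_normal()
    Y += -0.5 * Z ** 2 * dt + Z * dB     # BSDE dynamics with driver f = 0
    X += -Z * dt + dB                    # forward optimal dynamics
print(f"Y_T = {Y:.4f}  vs  g(X_T) = {g(X):.4f}")    # close up to discretization
```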
10.2 Solvability

We assume that $f, g$ have bounded derivatives uniformly in the measure argument and are Lipschitz continuous in the measure argument w.r.t. $W_1$. These are the same type of constraints as in the PDE approach; this is again Chapter 4 of [4]. The equations above are for the cost. For the gradient, set
$$Y_t = \partial_x u(t, X_t^{\mu})$$
and apply Itô's lemma; here we have a non-zero $f$, unlike in 9.6, (9.3). We have
$$dY_t = \partial_x\Big[ \partial_t u^{\mu} + \frac{1}{2}\partial_{xx} u^{\mu} - \frac{1}{2}|\partial_x u^{\mu}|^2 \Big]\, dt + \partial_{xx} u^{\mu}\, dB_t = -\partial_x f(X_t^{\mu}, \mu_t)\, dt + \partial_{xx} u^{\mu}(t, X_t)\, dB_t,$$
$$dX_t = -\partial_x u^{\mu}(t, X_t^{\mu})\, dt + dB_t.$$
Achdou and Laurière are the references for this; see Chapter 6 of [6].
References: Gangbo, Mészáros, Zhang, Mou, Daniel Lacker.
The requirement of uniqueness of the equilibrium is subtle and important. Mean field games in finite state space and potential games facilitate solutions to master equations.
4. HJB:
$$\partial_t u_s^m + \frac{1}{2}\nabla^2 u_s^m - \frac{1}{2}|\partial_x u_s^m|^2 + f(x, m_s) = 0,$$
$$u^m(T, x) = g(x, m_T). \tag{14.1}$$
5. The marginal laws $m$ of the optimal trajectory flow according to the Fokker-Planck equation,
$$\partial_s m_s - \frac{1}{2}\nabla^2 m_s - \mathrm{div}(\partial_x u_s^m\, m_s) = 0.$$
The above system is solvable if $f, g$ are smooth in $x$, bounded with bounded derivatives, and Lipschitz in the measure. In order to avoid shocks in the system (not sure what this means) we require the Lasry-Lions monotonicity:
$$\int_{\mathbb{R}^d} [f(x, m) - f(x, m')]\, d(m - m')(x) > 0, \qquad \int_{\mathbb{R}^d} [g(x, m) - g(x, m')]\, d(m - m')(x) > 0.$$
Goal 1. Construct an equilibrium from some initial time $t$, given an initial population $\mu_t$. We would like to construct $(\mu_s)_{t\le s\le T}$ given by the MFG system.

We know the optimal control solved by the tagged player. The best cost under $\mu$ is $u^{\mu}(t, x)$ if $X_t = x$. Since $u$ is determined uniquely by $\mu$, we denote it $u^{\mu}(t, x)$, the value of the game. Here, implicitly, $t$ is the initial time for both $x$ and $\mu$.
Definition 6 (Value of the Game). We call $u^{\mu}(t, x)$ the value of the game. Mathematically,
$$U : [0,T] \times \mathbb{R}^d \times \mathcal{P}_2(\mathbb{R}^d) \to \mathbb{R}, \qquad (t, x, \mu) \mapsto u^{\mu}(t, x).$$
We rewrite our dynamics with a new notation to denote the time at which the population distribution was initialized:
$$dX_s = -\partial_x u^{t,\mu}(s, X_s)\, ds + dB_s.$$
Here we know that $u^{t,\mu}$ solves the HJB (14.1) on $[t, T]$. We now take an incremental step $h$ and restrict $s \in [t+h, T]$, where $0 < h \le T - t$. Now we change the environment from $(\mu_s)_{t\le s\le T}$ to $(\mu_s)_{t+h\le s\le T}$. What we are essentially doing is moving the environment along time using the uniqueness property, instead of keeping the environment fixed at a chosen time. Our notation is $\mu_t = \mu$ and $\mu_{t+h} = \mu_{t+h}^{t,\mu}$. This has an impact on our value function notation. We claim that
$$u^{t,\mu}(t+h, \cdot) = u^{t+h,\, \mu_{t+h}^{t,\mu}}(t+h, \cdot).$$
The above is notationally heavy. $u^{t+h,\, \mu_{t+h}^{t,\mu}}(t+h, \cdot)$ is the value function at $t+h$ using the environment $\mu_{t+h}^{t,\mu}$ at $t+h$; it is obtained as a result of the flow from $(t, \mu_t)$ to $(t+h, \mu_{t+h}^{t,\mu})$. This latter term is essentially our newly defined function $U$: we have $U(t+h, x, \mu_{t+h}^{t,\mu})$. Summarizing, we have
$$u^{t,\mu}(t+h, x) = U(t+h, x, \mu_{t+h}^{t,\mu}).$$
We now go back to the optimal trajectory and rewrite the dynamics in terms of $U$:
$$dX_s = -\partial_x U(s, X_s, \mu_s^{t,\mu})\, ds + dB_s.$$
If we initialize $X_t \sim \mu$, then $P(X_t) = \mu$. Since we are at equilibrium at each step,
$$dX_s = -\partial_x U(s, X_s, P(X_s))\, ds + dB_s.$$
In the above equation $\mu_s = P(X_s)$. We have another interesting observation to make: the action function that we had in the original McKean-Vlasov equation, termed $\alpha^i$, has an excellent candidate, namely $-\partial_x U$.
15 Convergence Problem
Going from the infinite to the finite problem is one approach. The other is to solve the $N$-player game and see if it converges as $N \to \infty$. The question is, in the limit do we get a solution to the MFG? This is called the convergence problem, and it is a difficult problem.
15.1 Approaches
1. The Master equation comes to the rescue; a rate of convergence is the benefit of this approach.
2. Compactness of the equilibrium controls of the $N$-player game. This is too demanding; there is no proof at this time using compactness.
3. Relaxed controls: a compactification of standard controls, where the controls are replaced by measures (a relaxation process); see Daniel Lacker. This is less demanding in terms of assumptions, but there is no rate of convergence.

All the above methods are complementary.
$$u^i(T, x) = g(x_i, \mu_x^N)$$
We now argue that the equilibrium occurs when the best response is the optimal response. So we go back to our optimal response $\alpha^i = -\partial_{x_i} u^i(t, x)$ and substitute it back into our Nash system, under the assumption that everyone's optimal action is given by the partial derivative of $u^i$ w.r.t. $x_i$. We now have
$$\partial_t u^i(t, x) + \frac{1}{2}\nabla^2 u^i(t, x) - \sum_{j\ne i} \partial_{x_j} u^j(t, x)\cdot\partial_{x_j} u^i(t, x) - \frac{1}{2}|\partial_{x_i} u^i(t, x)|^2 + f(x_i, \mu_x^N) = 0,$$
$$u^i(T, x) = g(x_i, \mu_x^N).$$
This is a PDE system, and the curse of dimensionality damns it. It is a non-degenerate PDE system that becomes degenerate in the limit. There is a series of papers by Bensoussan and Frehse in the 90s that look at these problems; Carmona's book gives all these references, Chapter 6 [6].
$$\partial_t u^i(t, x) \approx \partial_t U(t, x_i, \mu_x^N)$$
$$-\sum_{j\ne i} \partial_{x_j} u^j(t, x)\cdot\partial_{x_j} u^i(t, x) \approx -\sum_{j=1}^N \partial_{x_j}\big[U(t, x_j, \mu_x^N)\big]\cdot\partial_{x_j}\big[U(t, x_i, \mu_x^N)\big]$$
$$= -\sum_{j\ne i} \Big( \partial_{x_j} U(t, x_j, \mu_x^N) + \frac{1}{N}\partial_\mu U(t, x_j, \mu_x^N)(x_j) \Big)\cdot\frac{1}{N}\partial_\mu U(t, x_i, \mu_x^N)(x_j)$$
$$\quad - \Big( \partial_{x_i} U(t, x_i, \mu_x^N) + \frac{1}{N}\partial_\mu U(t, x_i, \mu_x^N)(x_i) \Big)\cdot\Big( \partial_{x_i} U(t, x_i, \mu_x^N) + \frac{1}{N}\partial_\mu U(t, x_i, \mu_x^N)(x_i) \Big)$$
$$= -|\partial_{x_i} U(t, x_i, \mu_x^N)|^2 - \frac{1}{N}\sum_j \partial_\mu U(t, x_i, \mu_x^N)(x_j)\cdot\partial_{x_j} U(t, x_j, \mu_x^N) + O\Big(\frac{1}{N}\Big)$$
Note that $\partial_{x_i}\big[\frac{1}{N}\sum_j \delta_{x_j}\big](x_i) = \frac{1}{N}$. We now write the middle term $\frac{1}{N}\sum_j \partial_\mu U(t, x_i, \mu_x^N)(x_j)\cdot\partial_{x_j} U(t, x_j, \mu_x^N)$ as an integral; this was shown in the master equation lectures. We then obtain
$$\frac{1}{N}\sum_j \partial_\mu U(t, x_i, \mu_x^N)(x_j)\cdot\partial_{x_j} U(t, x_j, \mu_x^N) = \int_{\mathbb{R}^d} \partial_x U(t, \nu, \mu_x^N)\cdot\partial_\mu U(t, x_i, \mu_x^N)(\nu)\, d\mu_x^N(\nu).$$
$$\frac{1}{2}\nabla^2 u^i(t, x) \approx \frac{1}{2}\sum_{j=1}^N \mathrm{Tr}\big\{\partial_{x_j}\big[\partial_{x_j} U(t, x_i, \mu_x^N)\big]\big\} = \frac{1}{2}\sum_{j=1}^N \mathrm{Tr}\Big\{\partial_{x_j}\Big[\partial_{x_i} U(t, x_i, \mu_x^N)\,\delta_{ij} + \frac{1}{N}\partial_\mu U(t, x_i, \mu_x^N)(x_j)\Big]\Big\}$$
$$= \frac{1}{2}\nabla^2 U(t, x_i, \mu_x^N) + \frac{1}{2}\int_{\mathbb{R}^d} \mathrm{Tr}\big[\partial_x \partial_\mu U(t, x_i, \mu_x^N)(v)\big]\, d\mu_x^N(v) + O\Big(\frac{1}{N}\Big)$$
We again note that this is a term from the master equation. Essentially, when we consider the Nash system and apply the DPP, we recover the terms of the master equation. The two remaining pieces, $f$ and $g$, fit in directly.

Proposition 4 (Close solution to Nash). The claim is that the projection $(u^i)_{1\le i\le N}$, $u^i(t, x) = U(t, x_i, \mu_x^N)$, is almost a solution to the Nash system.
$$u^i(T, x) = g(x_i, \mu_x^N)$$
The remainder term $r^i(t, x)$ is of order $O\big(\frac{1}{N}\big)$.
15.7 Comparison
References

[1] I. Swiecicki, T. Gobron, and D. Ullmo, "Schrödinger approach to mean field games," Physical Review Letters, vol. 116, no. 12, p. 128701, 2016.
[2] A.-S. Sznitman, "Topics in propagation of chaos," Lecture Notes in Mathematics, pp. 165–251, 1991.
[3] J. Ma and J. Yong, Forward-Backward Stochastic Differential Equations and Their Applications. No. 1702, Springer Science & Business Media, 1999.
[4] R. Carmona and F. Delarue, Probabilistic Theory of Mean Field Games: Vol. I, Mean Field FBSDEs, Control, and Games. Springer, 2018.
[5] C. Villani, Optimal Transport: Old and New, vol. 338. Springer, 2009.
[6] R. Carmona and F. Delarue, Probabilistic Theory of Mean Field Games: Vol. II, Mean Field Games with Common Noise and Master Equations. Springer, 2017.