Convex Duality and Financial Mathematics
SpringerBriefs in Mathematics
Series Editors
Nicola Bellomo
Michele Benzi
Palle Jorgensen
Tatsien Li
Roderick Melnik
Otmar Scherzer
Benjamin Steinberg
Lothar Reichel
Yuri Tschinkel
George Yin
Ping Zhang
Peter Carr
Department of Finance and Risk Engineering, Tandon School of Engineering
New York University, New York, NY, USA
Qiji Jim Zhu
Department of Mathematics, Western Michigan University
Kalamazoo, MI, USA
Mathematics Subject Classification: 26B25, 49N15, 52A41, 60J60, 90C25, 91B16, 91B25, 91B26,
91B30, 91G10, 91G20
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To Carol and Olivia
To Lilly and Charles.
And in memory of Jonathan Borwein
(1951–2016) with respect.
Preface
the portfolio, the dual problem has only two variables related to the two constraints
on the initial endowment and the expected return. In fact, the key observation of
Markowitz is that one can evaluate the performance of a portfolio in the dual space
using the variance-expected return pair. Second, the duality relationship between the
primal Markowitz portfolio problem and its dual helps us to understand that the set
of optimal portfolios is an affine set, which leads to the important two-fund theorem.
The core methodology of optimizing a quadratic function with linear constraints was
also used in the capital asset pricing model, which leads to the widely used Sharpe
ratio. Duality also plays a crucial role in this problem.
Next, we consider portfolio optimization from the perspective of maximizing
expected utility. There has been a very long history of using utility functions in
economics. In financial problems, utility functions are increasing concave functions
of wealth. The concavity of the utility function captures the risk aversion of an
investor. Arrow and Pratt introduced widely used measures of the level of risk
aversion. It turns out that there is a precise way of using generalized convexity to
characterize Pratt–Arrow risk aversion. This application illustrates the relevance of
generalized convexity in dealing with financial problems. It is even more interesting
to consider the dual of the expected utility maximization problem. It turns out
that in the absence of arbitrage, solutions to the dual problem are in essence
the equivalent martingale measures (also called risk-neutral probabilities), which
are widely used in pricing financial derivatives. Considering the expected utility
maximization problem along with its dual leads us to rediscover the fundamental
theorem of asset pricing. An added benefit of this alternative approach is that
martingale measures can be related to the risk aversion of agents in the market.
The last application that we cover in Chapter 2 concerns the dual representation
of coherent risk measures. Coherent risk measures are motivated by the common
regulatory practice of assigning to each position in a risky asset an appropriate
amount of cash reserves. Hence, they are widely used to analyze risks. Mathematically,
a coherent risk measure is characterized by a sublinear function: a convex
function with positive homogeneity. It is well known that the dual of a sublinear
function is an indicator function. Thus, using dual representation, a coherent risk
measure is just the support function of a closed convex set. Financially, we can view
the generating set of a coherent risk measure as the probabilities assigned to risky
scenarios in a stress test. Duality also generates numerical methods for calculating
some important coherent risk measures such as the conditional value at risk.
We expand our discussion to a more general multiperiod financial market model
in Chapter 3. This more general setting allows us to model dynamic trading. The
added complexity in dealing with a multiperiod model mainly involves capturing
the increase in information using an information structure. After laying out the
multiperiod financial market model, we show that the fundamental theorem of asset
pricing also arises in a multiperiod financial market model. After that we also discuss
two new topics: super (sub) hedging and conic finance. In general, the absence
of arbitrage leads to multiple (usually infinitely many) pricing martingale measures
we thank Monty Essid, Tom Li, Matthew Foreman, Sanjay Karanth, Jay Treiman,
Mehdi Vazifadan, and Guolin Yu whose detailed comments on various parts of our
lecture notes have been incorporated into the text.
Contents
1 Convex Duality
1.1 Convex Sets and Functions
1.1.1 Definitions
1.1.2 Convex Programming
1.2 Subdifferential and Lagrange Multiplier
1.2.1 Definition
1.2.2 Nonemptiness of Subdifferential
1.2.3 Calculus
1.2.4 Role in Convex Programming
1.3 Fenchel Conjugate
1.3.1 The Fenchel Conjugate
1.3.2 The Fenchel–Young Inequality
1.3.3 Graphic Illustration and Generalizations
1.4 Convex Duality Theory
1.4.1 Rockafellar Duality
1.4.2 Fenchel Duality
1.4.3 Lagrange Duality
1.4.4 Generalized Fenchel–Young Inequality
1.5 Generalized Convexity, Conjugacy and Duality
2 Financial Models in One Period Economy
2.1 Portfolio
2.1.1 Markowitz Portfolio
2.1.2 Capital Asset Pricing Model
2.1.3 Sharpe Ratio
2.2 Utility Functions
2.2.1 Utility Functions
2.2.2 Measuring Risk Aversion
2.2.3 Growth Optimal Portfolio Theory
2.2.4 Efficiency Index
Comments
References
Index
Chapter 1
Convex Duality
1.1.1 Definitions
Definition 1.1.1 (Convex Sets and Functions) Let X be a Banach space. We say
that a subset C of X is a convex set if, for any x, y ∈ C and any λ ∈ [0, 1],
λx + (1 − λ)y ∈ C. We say an extended-valued function f : X → R ∪ {+∞} is a
convex function if its domain, dom f := {x ∈ X | f(x) < ∞}, is convex and for
any x, y ∈ dom f and any λ ∈ [0, 1], one has
$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y).$$
characterization is very useful in many situations. For instance, it is easy to see that
the intersection of a class of convex sets is convex. Now let $\{f_\alpha\}$ be a class of convex
functions; we can see that $\sup_\alpha f_\alpha$
is always convex. Note that allowing the extended value +∞ in the definition of
convex function is important in establishing those relations.
An important property of convex functions related to applications in economics
and finance is the Jensen inequality.
Proposition 1.1.2 (Jensen’s Inequality) Let f be a convex function. Then, for any
random variable X on a finite probability space,
$$f(E[X]) \le E[f(X)].$$
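As a quick numerical illustration of Jensen's inequality f(E[X]) ≤ E[f(X)], the following minimal sketch evaluates both sides on a four-point probability space. The outcomes, the probabilities, and the choice f = exp are hypothetical and serve only as an example.

```python
import math

def expectation(values, probs):
    """Expected value of a random variable on a finite probability space."""
    return sum(v * p for v, p in zip(values, probs))

# Hypothetical data: four outcomes and their probabilities (summing to 1).
outcomes = [-1.0, 0.5, 2.0, 3.0]
probs = [0.1, 0.4, 0.3, 0.2]

f = math.exp  # a convex function chosen for illustration

lhs = f(expectation(outcomes, probs))               # f(E[X])
rhs = expectation([f(x) for x in outcomes], probs)  # E[f(X)]
jensen_gap = rhs - lhs                              # nonnegative by Jensen's inequality
print(f"f(E[X]) = {lhs:.4f}, E[f(X)] = {rhs:.4f}, gap = {jensen_gap:.4f}")
```

The gap vanishes only when f is affine on the range of X or X is almost surely constant; for a strictly convex f and a nondegenerate X it is strictly positive.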
We will often encounter various forms of the general convex programming problems
below in financial applications in subsequent chapters. Let X, Y , and Z be finite
dimensional Banach spaces. Assume that Y has a partial order ≤K generated by the
pointed convex cone K. We will use X∗, Y∗, and Z∗ to denote the dual spaces of
X, Y , and Z, respectively, and denote the polar cone of K by
which may take values ±∞ (in infeasible or unbounded below cases), and S(y, z)
the (possibly empty) solution set of problem P (y, z).
A concrete example is
spaces, for applications in this book we will often need to consider the Banach space
of random variables.
It turns out that the optimal value function of a convex programming problem is
convex.
This is a very potent result that can help us to recognize the convexity of many
other functions. For example, let C be a convex set. Then $d_C$, the distance function
to C defined by $d_C(z) := \inf\{\|z - c\| : c \in C\}$, is a convex function because we can
rewrite it as the optimal value of the following special case of problem (1.1.2)
1.2.1 Definition
A natural and important question is when we can ensure that the subdifferential
is nonempty. The following Fenchel-Rockafellar theorem provides a basic form of
sufficient conditions.
Theorem 1.2.4 (Fenchel-Rockafellar Theorem on Nonemptiness of Subdiffer-
ential) Let f : X → R ∪ {+∞} be a convex function. Suppose that x̄ ∈
int(dom f ), the interior of dom f . Then the subdifferential ∂f (x̄) is nonempty.
Proof We observe that (x̄, f (x̄)) is a boundary point of the closed set epi f which
has a nonempty interior. Thus by the Hahn–Banach extension theorem there exists
1.2.3 Calculus
For more complicated convex functions we need the help of a convenient calculus
for calculating or estimating their subdifferentials. It turns out that the key for
developing such a calculus is to combine a decoupling mechanism with the existence
of subgradient. We summarize this idea in the following lemma.
Lemma 1.2.7 (Decoupling Lemma) Let X and Y be Banach spaces. Let the
functions f : X → R and g : Y → R be convex and let A : X → Y be a linear
transform. Suppose that f , g, and A satisfy the condition
Thus,
We now use the tools established above to deduce calculus rules for the convex
functions. We start with a sum rule playing a role similar to the sum rule for
derivatives in calculus.
Theorem 1.2.9 (Convex Subdifferential Sum Rule) Let f : X → R ∪ {+∞}
and g : Y → R ∪ {+∞} be convex functions and let A : X → Y be a linear map.
Then at any point x in X, we have the sum rule
$$x \mapsto f(x) + g(Ax) - \langle x^*, x\rangle$$
attains its minimum 0 at x = x̄. By the sandwich theorem there exists an affine
function $\alpha(x) := \langle A^* y^*, x\rangle + r$ with $-y^* \in \partial g(A\bar{x})$ such that
Note that when A is the identity mapping and both f and g are differentiable
Theorem 1.2.9 recovers sum rules in calculus. The geometrical interpretation of this
is that one can find a hyperplane in X × R that separates the epigraph of f and
hypograph of −g, i.e. {(x, r) : −g(x) ≥ r}.
By applying the subdifferential sum rule to the indicator functions of two convex
sets we have parallel results for the normal cones to the intersection of convex sets.
Theorem 1.2.10 (Normals to an Intersection) Let C1 and C2 be two convex
subsets of X and let x ∈ C1 ∩ C2. Suppose that C1 ∩ int C2 ≠ ∅. Then
$$f(x) - f(\bar{x}) \ge 0 = \langle 0, x - \bar{x}\rangle,$$
subject to x ∈ C ⊂ X,
Proof Apply the convex subdifferential sum rule of Theorem 1.2.9 to f + ιC at x̄.
Proof The “only if” part. Suppose that −λ ∈ ∂v(0, 0). It is easy to see that v(y, 0)
is non-increasing with respect to the partial order ≤K . Thus, for any y ∈ K,
so that $\lambda \in K^+ \times Z^*$, verifying (i). Conclusion (ii) follows from the fact that for all
x ∈ C,
The “if” part. Suppose λ satisfies conditions (i) and (ii). Then we have, for any
x ∈ C, g(x) ≤K y and h(x) = z,
Taking the infimum of the leftmost term under the constraints x ∈ C, g(x) ≤K y
and h(x) = z, we arrive at
If we denote by Λ(y, z) the multipliers satisfying (i) and (ii) of Theorem 1.2.13,
then we may write the useful set equality
When an optimal solution for the problem P (0, 0) exists, we can also derive a
so-called complementary slackness condition.
Theorem 1.2.15 (Lagrange Multiplier when Optimal Solution Exists) Let
v(y, z) be the optimal value function of the constrained optimization problem
P (y, z). Then the pair (x̄, λ) satisfies −λ ∈ ∂v(0, 0) and x̄ ∈ S(0, 0) if and only
if
(i) (nonnegativity) $\lambda \in K^+ \times Z^*$;
(ii) (unconstrained optimum) the function
We can deduce from Theorems 1.2.13 and 1.2.15 that ∂v(0, 0) completely
characterizes the set of Lagrange multipliers.
This is an elementary but important result that relates the conjugate operation to the
subgradient.
Proposition 1.3.1 (Fenchel–Young Inequality) Let f : X → R ∪ {+∞} be a
convex function. Suppose that $x^* \in X^*$ and x ∈ dom f. Then
$$f(x) + f^*(x^*) \ge \langle x^*, x\rangle. \tag{1.3.2}$$
$$f(x) + f^*(x^*) = \langle x^*, x\rangle,$$
$$f(x) + \langle x^*, y\rangle - f(y) \le \langle x^*, x\rangle.$$
That is
$$f(y) - f(x) \ge \langle x^*, y - x\rangle,$$
or $x^* \in \partial f(x)$.
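The Fenchel–Young inequality and its equality case can be checked numerically on a concrete pair. The sketch below assumes the one-dimensional example f(x) = eˣ, whose conjugate is f∗(y) = y ln y − y for y > 0 (with f∗(0) = 0 and f∗(y) = +∞ for y < 0); the grid points are arbitrary and the function names are illustrative.

```python
import math

def f(x):
    """Convex function f(x) = exp(x)."""
    return math.exp(x)

def f_conj(y):
    """Fenchel conjugate of exp: y ln y - y for y > 0, 0 at y = 0, +inf for y < 0."""
    if y < 0:
        return math.inf
    if y == 0:
        return 0.0
    return y * math.log(y) - y

def fenchel_young_gap(x, y):
    """f(x) + f*(y) - x*y, which the Fenchel-Young inequality says is nonnegative."""
    return f(x) + f_conj(y) - x * y

xs = [-2.0, -0.5, 0.0, 1.0, 2.5]
ys = [0.0, 0.1, 1.0, 3.0, 10.0]

# The inequality on an arbitrary grid of (x, y) pairs.
min_gap = min(fenchel_young_gap(x, y) for x in xs for y in ys)

# Equality holds when y is a subgradient of f at x, here y = f'(x) = exp(x).
equality_gaps = [fenchel_young_gap(x, math.exp(x)) for x in xs]
print("smallest gap on the grid:", min_gap)
print("gaps at y = f'(x):", equality_gaps)
```

The gaps in the second list are zero up to rounding, matching the characterization x∗ ∈ ∂f(x) of the equality case.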
where $D_x$ is the differential operator with respect to x, I is the identity operator, and
[A, B] = AB − BA represents the commutator of operators A and B. Symmetrically
we also have
$$D_{x^*} f\big((f^*)'(x^*)\big) = \big[D_{x^*}, (f^*)'(x^*) I\big] x^*.$$
[Figure: graphs of φ and φ⁻¹ in the (s, t)-plane with origin O, with x marked on the s-axis and x∗ on the t-axis.]
$$v(0) = p \ge d = v^{**}(0).$$
This is called weak duality and the non-negative number $p - d = v(0) - v^{**}(0)$ is
called the duality gap, which we aspire to be small or zero.
Let $F(x, (y, z)) := f(x) + \iota_{\operatorname{epi}(g)}(x, y) + \iota_{\operatorname{graph}(h)}(x, z)$. Then problem P(y, z)
in (1.1.2) becomes problem (1.4.1) with parameters (y, z). On the other hand, we
can rewrite (1.4.1) as
1 The use of the term “primal” is much more recent than the term “dual” and was suggested by
George Dantzig’s father Tobias when linear programming was being developed in the 1940s.
Proof If the primal problem has a Lagrange multiplier λ, then −λ ∈ ∂v(0). By the
Fenchel–Young equality
Since
$$v(0) + v^*(-\lambda) = 0.$$
Letting u = Ax + y we have
When N < dim X (often infinite) the dual problem is typically much easier to solve
than the primal.
Example 1.4.5 (Boltzmann–Shannon Entropy in Euclidean Space) Let
$$f(x) := \sum_{n=1}^{N} p(x_n), \tag{1.4.9}$$
where
$$p(t) := \begin{cases} t \ln t - t & \text{if } t > 0,\\ 0 & \text{if } t = 0,\\ +\infty & \text{if } t < 0. \end{cases}$$
Example 1.4.4 can help us conveniently derive an explicit formula for solutions
of (1.4.10) in terms of the solution to its dual problem.
First we note that the sublevel sets of the objective function are compact,
thus ensuring the existence of solutions to problem (1.4.10). We can also see by
direct calculation that the directional derivative of the cost function is −∞ on any
boundary point x of dom f = $\mathbb{R}^N_+$, the domain of the cost function, in the direction
of z − x. Thus, any solution of (1.4.10) must be in the interior of $\mathbb{R}^N_+$. Since the cost
function is strictly convex on int($\mathbb{R}^N_+$), the solution is unique.
Let us denote this unique solution of (1.4.10) by x̄. Then the duality result in
Example 1.4.4 implies that
Now let φ̄ be a solution to the dual problem, i.e., a Lagrange multiplier for the
constrained minimization problem (1.4.10). We have
Indeed, we can use the existence of the dual solution to prove that the primal
problem has the given solution without direct appeal to compactness—we deduce
the existence of the primal from the duality theory.
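The recovery of the primal solution from a dual solution can be sketched numerically. The sketch assumes that (1.4.10) is the problem of minimizing the entropy f of (1.4.9) subject to a linear constraint Ax = b; under that assumption the conjugate of p is p∗(s) = eˢ, the dual maximizes ⟨b, φ⟩ − Σₙ exp((Aᵀφ)ₙ), and stationarity gives x̄ₙ = exp((Aᵀφ̄)ₙ). The matrix A, the vector b, and the damped Newton solver are hypothetical choices made only for illustration.

```python
import math

# Hypothetical data: two linear constraints on x in R^3, with b = A x for a positive x.
A = [[1.0, 1.0, 1.0],
     [1.0, 2.0, 3.0]]
b = [3.5, 8.5]

def primal_from_dual(phi):
    """x_n = exp((A^T phi)_n), from the stationarity condition ln x_n = (A^T phi)_n."""
    return [math.exp(A[0][n] * phi[0] + A[1][n] * phi[1]) for n in range(3)]

def dual_objective(phi):
    """Negative dual value sum_n exp((A^T phi)_n) - <b, phi>, convex in phi."""
    return sum(primal_from_dual(phi)) - (b[0] * phi[0] + b[1] * phi[1])

def solve_dual(tol=1e-12, max_iter=100):
    """Damped Newton iteration on the two-dimensional dual problem."""
    phi = [0.0, 0.0]
    for _ in range(max_iter):
        x = primal_from_dual(phi)
        # Gradient of the convex objective: A x - b.
        g = [sum(A[i][n] * x[n] for n in range(3)) - b[i] for i in range(2)]
        if max(abs(g[0]), abs(g[1])) < tol:
            break
        # Hessian A diag(x) A^T, a symmetric positive definite 2x2 matrix.
        h00 = sum(A[0][n] * A[0][n] * x[n] for n in range(3))
        h01 = sum(A[0][n] * A[1][n] * x[n] for n in range(3))
        h11 = sum(A[1][n] * A[1][n] * x[n] for n in range(3))
        det = h00 * h11 - h01 * h01
        step = [-(h11 * g[0] - h01 * g[1]) / det,
                -(-h01 * g[0] + h00 * g[1]) / det]
        # Backtracking (Armijo) line search to guarantee descent.
        t, current = 1.0, dual_objective(phi)
        slope = g[0] * step[0] + g[1] * step[1]
        while (t > 1e-10 and
               dual_objective([phi[0] + t * step[0], phi[1] + t * step[1]])
               > current + 1e-4 * t * slope):
            t *= 0.5
        phi = [phi[0] + t * step[0], phi[1] + t * step[1]]
    return phi

phi_bar = solve_dual()
x_bar = primal_from_dual(phi_bar)
residual = [sum(A[i][n] * x_bar[n] for n in range(3)) - b[i] for i in range(2)]
print("dual solution:", phi_bar)
print("primal solution:", x_bar)
```

The dual here has only two variables regardless of the primal dimension, which is the practical point of the remark that the dual is typically much easier to solve when N < dim X.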
Remark 1.4.6 In view of Remark 1.2.6, when both f and g are polyhedral functions
the constraint qualification condition (1.2.2) simplifies to
This is very useful in dealing with polyhedral cone programming and, in particular,
linear programming problems. One can also similarly handle a subset of polyhedral
constraints, see [7, 8].
Then
$$\sup_{\lambda \in K^+ \times Z^*} L(\lambda, x; (y, z)) = \begin{cases} f(x) & \text{if } g(x) \le_K y,\ h(x) = z,\\ +\infty & \text{otherwise.} \end{cases}$$
We can calculate
We can see that the weak duality inequality v(0) ≥ v ∗∗ (0) is simply the familiar
fact that
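$$\begin{aligned} \max\quad & \langle c, x\rangle\\ \text{subject to}\quad & Ax \le b,\ x \ge 0 \end{aligned} \tag{1.4.15}$$
$$\begin{aligned} \min\quad & \langle b, \lambda\rangle\\ \text{subject to}\quad & A^*\lambda \ge c,\ \lambda \ge 0. \end{aligned} \tag{1.4.16}$$
$$L(\lambda, x) = -\langle c, x\rangle + \langle \lambda, Ax - b\rangle$$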
So we have
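$$\max[\langle c, x\rangle : Ax \le b,\ x \ge 0] = -\max_{\lambda \ge 0}[-\langle \lambda, b\rangle : A^*\lambda \ge c] = \min[\langle \lambda, b\rangle : A^*\lambda \ge c,\ \lambda \ge 0].$$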
Clearly all the functions involved here are polyhedral. Applying the constraint
qualification condition for polyhedral functions we can conclude that if either
the primal problem or the dual problem is feasible then there is no duality gap.
Moreover, when the common optimal value is finite then both problems have
optimal solutions.
The hard work in Example 1.4.7 was hidden in establishing that the constraint
qualification (1.4.12) is sufficient, but unlike many applied developments we have
rigorously recaptured linear programming duality within our framework.
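The absence of a duality gap for the pair (1.4.15) and (1.4.16) can be observed on a tiny instance. The following sketch solves a two-variable primal and its two-variable dual by brute-force vertex enumeration; the data c, A, b and the helper `maximize_2d` are hypothetical, and the enumeration assumes the optimum is attained at a vertex of a nonempty feasible set.

```python
from itertools import combinations

def maximize_2d(objective, constraints, tol=1e-9):
    """Maximize objective . x over {x in R^2 : a . x <= r for (a, r) in constraints}
    by enumerating vertices (pairwise intersections of boundary lines)."""
    best_value, best_point = None, None
    for (a1, r1), (a2, r2) in combinations(constraints, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel boundary lines do not meet in a vertex
        # Cramer's rule for the intersection of the two boundary lines.
        x = ((r1 * a2[1] - a1[1] * r2) / det, (a1[0] * r2 - r1 * a2[0]) / det)
        if all(a[0] * x[0] + a[1] * x[1] <= r + tol for a, r in constraints):
            value = objective[0] * x[0] + objective[1] * x[1]
            if best_value is None or value > best_value:
                best_value, best_point = value, x
    return best_value, best_point

# Hypothetical data for max <c, x> subject to A x <= b, x >= 0.
c = (3.0, 2.0)
A = ((1.0, 1.0), (1.0, 3.0))
b = (4.0, 6.0)
nonneg = [((-1.0, 0.0), 0.0), ((0.0, -1.0), 0.0)]

primal_constraints = [(A[0], b[0]), (A[1], b[1])] + nonneg
primal_value, primal_point = maximize_2d(c, primal_constraints)

# Dual: min <b, lam> subject to A^T lam >= c, lam >= 0, rewritten as
# max <-b, lam> subject to -A^T lam <= -c, -lam <= 0.
dual_constraints = [((-A[0][j], -A[1][j]), -c[j]) for j in range(2)] + nonneg
neg_dual_value, dual_point = maximize_2d((-b[0], -b[1]), dual_constraints)
dual_value = -neg_dual_value

print("primal:", primal_value, "at", primal_point)
print("dual:  ", dual_value, "at", dual_point)
```

Vertex enumeration is used here only because it needs no external solver; it does not scale beyond toy problems.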
Note that the primal Lagrange multiplier λ is the dual solution and vice versa.
Table 1.3 can help us formulate the dual problem.
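$$\begin{aligned} &\int_0^x\!\!\int_0^{\phi(s)} K(s,t)\,dt\,ds + \int_0^{x^*}\!\!\int_0^{\phi^{-1}(t)} K(s,t)\,ds\,dt\\ &\quad\ge \int_0^x\!\!\int_0^{x^*} K(s,t)\,ds\,dt + \int_{\phi^{-1}(x^*)}^{x}\!\int_{x^*}^{\phi(s)} K(s,t)\,dt\,ds\\ &\quad\ge \int_0^x\!\!\int_0^{x^*} K(s,t)\,dt\,ds. \end{aligned} \tag{1.4.17}$$
$$\begin{aligned} &\int_0^x\!\!\int_0^{\phi(s)} K(s,t)\,dt\,ds + \int_0^{x^*}\!\!\int_0^{\phi^{-1}(t)} K(s,t)\,ds\,dt\\ &\quad\ge \int_0^x\!\!\int_0^{x^*} K(s,t)\,ds\,dt + \int_{x}^{\phi^{-1}(x^*)}\!\int_{\phi(s)}^{x^*} K(s,t)\,dt\,ds\\ &\quad\ge \int_0^x\!\!\int_0^{x^*} K(s,t)\,dt\,ds. \end{aligned} \tag{1.4.18}$$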
The condition φ(0) = 0 merely conveniently locates the lower left corner of the
graph at the coordinate origin and is clearly not essential. In general we can always
shift this corner to any point (a, φ(a)). More substantively, the requirement that φ
be a continuous increasing function can be relaxed to nondecreasing as long as
φ⁻¹ is replaced appropriately by
$$\phi^{-1}_{\inf}(t) = \inf\{s : \phi(s) \ge t\}.$$
Now we can state a more general Fenchel–Young inequality whose proof is an easy
exercise.
Theorem 1.4.9 (Weighted Fenchel–Young Inequality) Let K(x, y) be a
bounded essentially positive measurable function and let φ be a nondecreasing
function. Then
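$$\int_a^x\!\!\int_{\phi(a)}^{x^*} K(s,t)\,ds\,dt \le \int_a^x\!\!\int_{\phi(a)}^{\phi(s)} K(s,t)\,dt\,ds + \int_{\phi(a)}^{x^*}\!\!\int_a^{\phi^{-1}_{\inf}(t)} K(s,t)\,ds\,dt$$
with equality attained when $x^* \in [\phi(x-), \phi(x+)]$, $x \in [\phi^{-1}_{\inf}(x^*-), \phi^{-1}_{\inf}(x^*+)]$.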
The above idea can be further pushed in two different directions in the next two
sections.
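A simple special case makes the weighted inequality concrete. Assuming K ≡ 1, a = 0, and φ(s) = s^(p−1) with p > 1 (choices made here purely for illustration), the two integrals on the right evaluate to x^p/p and (x∗)^q/q with 1/p + 1/q = 1, and Theorem 1.4.9 reduces to the classical Young inequality x x∗ ≤ x^p/p + (x∗)^q/q. The sketch below checks this on a grid.

```python
def young_gap(x, x_star, p):
    """x**p / p + x_star**q / q - x * x_star for conjugate exponents 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    return x ** p / p + x_star ** q / q - x * x_star

p = 3.0                                   # hypothetical exponent, so q = 1.5
grid = [0.25 * k for k in range(0, 13)]   # points in [0, 3]

# The inequality on the whole grid.
min_gap = min(young_gap(x, y, p) for x in grid for y in grid)

# Equality is attained on the graph of phi, that is, when x_star = phi(x) = x**(p - 1).
equality_gaps = [young_gap(x, x ** (p - 1.0), p) for x in grid]
print("smallest gap on the grid:", min_gap)
print("largest |gap| on the graph of phi:", max(abs(g) for g in equality_gaps))
```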
[Figure: the curve (φ1, φ2) starting at (φ1(a), φ2(a)), with φ1(b1) marked on the φ1-axis and φ2(b2) on the φ2-axis.]
$$\int_{\phi_2(a)}^{\phi_2(b_2)}\!\!\int_{\phi_1(a)}^{\phi_1(b_1)} K(s_1, s_2)\,ds_1\,ds_2 \tag{1.4.19}$$
Similarly,
(1.4.20)
φ N (bN+1 )
LH S = K(s N +1 )ds N +1
φ N+1 (a·1N+1 )
Applying the induction hypothesis to the inner layer of the integration we have
The last equality groups the two outer layers of the integration together. Now applying
the Fenchel–Young inequality with N = 2, we get
−1
φN+1 (bN+1 ) N φn (φN+1 (sN+1 )) φnN (φn−1 (sn )·1N−1 )
+ K(s N +1 )dsnN dsn dsN +1
φN+1 (a) n=1 φn (a) φnN (a·1N−1 )
Combining the inner layers of the integration in the first sum and applying the
equality part of the induction hypothesis for the second sum we arrive at
−1
φN+1 (bN+1 ) φ N (φN+1 (sN+1 )·1N )
+ K(s N +1 )ds N dsN +1
φN+1 (a) φnN (a·1N )
$$\int_{\phi_2(a)}^{\phi_2(b_2)}\!\!\int_{\phi_1(a)}^{\phi_1(b_1)} K(s_1, s_2)\,ds_1\,ds_2$$
Note that the graphic illustrations in Section 1.4.3 only work when $x, x^* \in \mathbb{R}$.
When, in general, $(x, x^*) \in X \times X^*$ we can imitate the general definition of the
Fenchel conjugate. In such a generalization a nonlinear function $c(x, x^*)$ replaces
the role of $\langle x^*, x\rangle$, just as in Theorem 1.4.8 $\int_0^x\!\int_0^{x^*} K(s,t)\,ds\,dt$ replaces the
product $x^* x$. In fact, $x^*$ does not even have to be in $X^*$. This is a more significant
generalization. To implement this idea, one needs to first revise the concept of
convexity.
Definition 1.5.1 (Generalized Convexity) Let Φ be a set of extended real valued
functions. We say f is Φ-convex if
It is easy to verify that Φ-convex functions are closed under supremum. Thus, every
function has a largest Φ-convex minorant called its Φ-convex hull. Moreover, if f is
Φ-convex then it coincides with its Φ-convex hull. By setting Φ to be the class of
affine functions we get the usual convexity within the class of lower semicontinuous
functions.
Similar to Fenchel conjugate we define:
Remark 1.5.4 We see from the discussion about generalized Fenchel conjugate that
what is essential in dealing with conjugate operation is the closedness with respect
to the sup operation. For simple convexity the key link is that a convex function is
the sup of all the affine functions it dominates. It is a fact based on the fundamental
convex separation theorem.
Generalized convexity can characterize many classes of functions. The following
are a few examples that showcase the power of this concept.
Example 1.5.5 Let $\langle\cdot,\cdot\rangle$ be the dual pairing between X and $X^*$. Define $c(x, x^*) =
\ln\langle x, x^*\rangle$, with ln t = −∞ for t ≤ 0. Then a function f : X → R ∪ {+∞} is
$\Phi_{c(1)}$-convex if and only if $e^f$ (with the convention $e^{-\infty} = 0$) is sublinear.
Example 1.5.6 Let X = Y = [0, +∞] and define c(x, y) = xy, with the
convention a(+∞) = +∞. Then a function f : X → R ∪ {+∞} is $\Phi_{c(1)}$-convex
if and only if it is convex and nondecreasing.
Example 1.5.7 Let X be a Hilbert space and $Y = \mathbb{R}_+ \times X$. Define $c(x, (\rho, y)) =
-\rho\|x - y\|^2$. Then f : X → R ∪ {+∞} is $\Phi_{c(1)}$-convex if and only if it is lower
semicontinuous and has a finite minorant $\phi \in \Phi_{c(1)}$.
The concept of subdifferential and its relationship with Fenchel conjugate can
also be generalized.
Definition 1.5.8 (Generalized Subdifferential) Let c be a function on X × Y .
We say y0 (x0 ) is a c(1)(c(2))-subdifferential of f (g) at x0 (y0 ) if
and noticing all the terms on the left-hand side are cancelled.
Proof The “if ” part: The two properties can be derived from direct computation.
For property (i)
where
Rockafellar Duality
as one of the family v(y) = infx F (x, y) on the perturbation space Y . Let Z be
the “dual parameter space” and let c(y, z) be a coupling function. Define the dual
problem as
This definition is the same as the Rockafellar duality. However, since now c(0, z) is
not necessarily 0 the problem is more involved.
Theorem 1.5.12 (Dual Solution Set) If $d = v^{c(1)c(2)}(0) < \infty$, then the optimal
solution set to the dual problem is $\partial_c v^{c(1)c(2)}(0)$.
Proof It follows directly from definition and is left as an exercise.
Lagrange Duality
where $F_x(y) := F(x, y)$. Then we have the Lagrange form of the primal: If $F_x(y)$
is $\Phi_c$-convex for all x ∈ X at y = 0, then
$$\sup_z L(x, z) = \sup_z\{c(0, z) - F_x^{c(1)}(z)\} = F_x^{c(1)c(2)}(0) = F_x(0) = F(x, 0).$$
Next we consider the Lagrange form of the dual. If c < +∞, we have
We see that the primal and dual values are equal if and only if
Abstract This chapter focuses on financial models in a one period economy with
a finite sample space. Mathematically, these models involve only finite dimensional
spaces yet they still illustrate the main patterns.
In modeling the behavior of agents in a financial market, we usually use
concave utility functions and convex risk measures to characterize their attitude
towards risk. These agents are subject to various constraints ranging from the
availability of capital and contractual obligations to clients to mandates from regulators.
Thus, the theory regarding constrained (convex) optimization discussed in the
previous chapter is most relevant. The Lagrange multipliers in such financial models
often carry a special financial meaning and are worthy of attention. Moreover, as
illustrated in the previous chapter, they also provide the key link between the primal
and the dual problems.
2.1 Portfolio
Portfolio theory considers the one period financial model in which transactions can
only take place at either the beginning of the period or the end of the period
represented by t = 0 or 1, respectively. We use probability space (Ω, F, P ) to
represent an economy where the σ -algebra F is generated by finitely many atoms
F = σ ({B1 , . . . , BN }). We use RV (Ω, F, P ) to denote the Hilbert space of all
F-measurable random variables endowed with the inner product
$$\langle x, y\rangle = E^P[xy] = \sum_{\omega\in\Omega} x(\omega)y(\omega)P(\omega) = \sum_{i=1}^{N} x(B_i)y(B_i)P(B_i), \tag{2.1.1}$$
where $x(B_i)$ and $y(B_i)$ signify the common value of F-measurable random
variables x and y on atom $B_i$, respectively. We use $\|\cdot\|_{RV}$ to denote the norm on
RV(Ω, F, P) induced by the inner product in (2.1.1). Elements in RV(Ω, F, P)
represent the price or payoff of assets. In a one period economy we may think
Markowitz portfolio theory considers only risky assets and is based on the idea that
for a fixed expected return one should choose portfolios with minimum variance,
which serves as a measure of the risk. In general, a portfolio with a higher expected
return is also accompanied by a higher variance (risk). The tradeoff is left to the
individual agent.
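Use $\hat{S} = (S^1, \ldots, S^M)$ to denote the price process of the risky assets and $\hat{\Theta} =
(\theta_1, \ldots, \theta_M)$ to denote the portfolio. For a given expected payoff $r_0$ and an initial
wealth $w_0$ we can formulate the Markowitz portfolio problem as
where Var is the variance and σ signifies the standard deviation. Regarding $\hat{S}$
as a row vector of random variables and $\hat{\Theta}$ as a row vector, denoting $E[\hat{S}_1] =
[E[\hat{S}_1^1], \ldots, E[\hat{S}_1^M]]$,
$$A = \begin{bmatrix} E[\hat{S}_1] \\ \hat{S}_0 \end{bmatrix}, \quad\text{and}\quad b = \begin{bmatrix} r_0 \\ w_0 \end{bmatrix}, \tag{2.1.3}$$
$$\begin{aligned} \text{minimize}\quad & f(x) := \tfrac{1}{2} x^\top \Sigma x\\ \text{subject to}\quad & Ax = b. \end{aligned} \tag{2.1.4}$$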
Here x = Θ̂ and
The coefficient 1/2 is added to the risk function to make the computation easier.
Clearly, Σ is a symmetric positive semidefinite matrix. We will assume that it is in
fact positive definite. Then the Fenchel conjugate f ∗ of f (see (1.3.1)) is
$$f^*(y) = \tfrac{1}{2} y^\top \Sigma^{-1} y. \tag{2.1.6}$$
The constraint qualification condition for strong duality here is b ∈ range A,
which is to say that (r0, w0) is feasible for the constraint. Assuming that this constraint
qualification condition is satisfied, it follows from Theorem 1.4.3 on the strong
duality that the value of problem (2.1.4) equals that of its dual:
$$\text{maximize}\quad b^\top y - \tfrac{1}{2} y^\top A \Sigma^{-1} A^\top y \;=\; \tfrac{1}{2} b^\top (A \Sigma^{-1} A^\top)^{-1} b. \tag{2.1.7}$$
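The equality of the primal value of (2.1.4) and the dual value in (2.1.7) can be checked on a small example. The sketch below makes two assumptions that are not taken from the text: the three risky assets are uncorrelated (so Σ is diagonal and easy to invert by hand), and α, β, γ denote the entries of the 2×2 matrix AΣ⁻¹Aᵀ, a reading inferred from formula (2.1.12). All numerical data are hypothetical.

```python
import math

# Hypothetical market with three uncorrelated risky assets.
mean_payoff = [1.10, 1.20, 1.35]   # E[S_1^m]
price = [1.00, 1.00, 1.00]         # S_0^m
variances = [0.04, 0.09, 0.25]     # diagonal of Sigma (assumed diagonal)
r0, w0 = 1.25, 1.0                 # target expected payoff and initial wealth

# Entries of A Sigma^{-1} A^T = [[alpha, beta], [beta, gamma]].
alpha = sum(m * m / s for m, s in zip(mean_payoff, variances))
beta = sum(m * p / s for m, p, s in zip(mean_payoff, price, variances))
gamma = sum(p * p / s for p, s in zip(price, variances))
det = alpha * gamma - beta ** 2

# Dual solution y = (A Sigma^{-1} A^T)^{-1} b and primal solution x = Sigma^{-1} A^T y.
y = [(gamma * r0 - beta * w0) / det, (alpha * w0 - beta * r0) / det]
x = [(y[0] * m + y[1] * p) / s for m, p, s in zip(mean_payoff, price, variances)]

# Primal value (1/2) x^T Sigma x and dual value (1/2) b^T (A Sigma^{-1} A^T)^{-1} b.
primal_value = 0.5 * sum(s * xi * xi for s, xi in zip(variances, x))
dual_value = 0.5 * (gamma * r0 ** 2 - 2 * beta * r0 * w0 + alpha * w0 ** 2) / det
sigma = math.sqrt(2.0 * primal_value)  # standard deviation of the optimal portfolio

# Feasibility of x: expected payoff r0 and cost w0.
expected_payoff = sum(m * xi for m, xi in zip(mean_payoff, x))
cost = sum(p * xi for p, xi in zip(price, x))
print("portfolio:", x)
print("primal value:", primal_value, "dual value:", dual_value, "sigma:", sigma)
```

Whatever the number of assets, the dual is a two-variable quadratic problem, which is why the optimal portfolio can be written in closed form.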
Here the optimal solution to the dual is
It follows that
Let x̄ and ȳ be the solutions of (2.1.4) and (2.1.7), respectively. By the strong
duality in Theorem 1.4.3 we have $f(\bar{x}) = b^\top \bar{y} - f^*(A^\top \bar{y})$. Since $b^\top \bar{y} = \langle \bar{y}, A\bar{x}\rangle$
it follows that
The equality (2.1.10) via the Fenchel–Young equality in Proposition 1.3.1 tells us
Theorem 2.1.2 (Markowitz Portfolio Theorem) For given initial wealth w0 and
expected payoff r0 , the Markowitz portfolio Θ and the minimum risk in terms of
standard deviation σ are determined by
$$\sigma(r_0, w_0) = \sqrt{\frac{\gamma r_0^2 - 2\beta r_0 w_0 + \alpha w_0^2}{\alpha\gamma - \beta^2}} \tag{2.1.12}$$
and
Noting that both σ(r0, w0) and Θ(r0, w0) are positive homogeneous functions, we
have
Corollary 2.1.3 Use μ to denote the expected return on unit initial wealth and let
σ = σ (μ, 1) and Θ = Θ(μ, 1). Then
$$\sigma = \sqrt{\frac{\gamma\mu^2 - 2\beta\mu + \alpha}{\alpha\gamma - \beta^2}} \tag{2.1.15}$$
and
Every point inside the Markowitz bullet represents a portfolio that can be moved
horizontally to the left to a point on the boundary of the bullet. This point on the
boundary represents a portfolio with the same expected return but less risk. For
every point on the lower half of the boundary of the Markowitz bullet, one can find
a corresponding point on the upper half of the boundary with the same variance and
a higher expected return. Thus, preferred portfolios are represented by points on the
upper boundary of the Markowitz bullet. We note that the upper boundary of the
Markowitz bullet has an asymptote whose slope can be determined by
$$\lim_{\sigma\to\infty} \frac{\mu}{\sigma} = \sqrt{\frac{\alpha\gamma - \beta^2}{\gamma}}. \tag{2.1.17}$$
By taking the limit of the tangent line of points on the boundary of the Markowitz
bullet one can show that the μ-intercept of this asymptote is at β/γ . This number
will play an important role in our discussion of the capital asset pricing model. In
fact, the asymptote for the upper boundary of the Markowitz bullet passes through
this point.
Although the Markowitz bullet is nonlinear, the Markowitz portfolio (2.1.16) is
an affine function of the return. This leads to
Theorem 2.1.4 (Two Fund Theorem) Given two distinct portfolios on the
Markowitz bullet (2.1.15), then any portfolio on the Markowitz bullet can be
represented as their linear combination.
Proof This follows directly from the affine structure of the Markowitz optimal
portfolio (2.1.16). In fact, suppose that
are two distinct Markowitz portfolios so that $\mu_1 \ne \mu_2$. Then any Markowitz efficient
portfolio described in (2.1.16) can be explicitly represented as
$$\Theta = \frac{\mu - \mu_2}{\mu_1 - \mu_2}\,\Theta_1 + \frac{\mu - \mu_1}{\mu_2 - \mu_1}\,\Theta_2.$$
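The two fund representation can be verified numerically. The sketch below computes minimum-variance portfolios for unit initial wealth at three expected returns and checks that the third is the stated combination of the first two. It assumes uncorrelated assets (a diagonal Σ) and uses the helper `markowitz_portfolio`, which is a hypothetical name introduced here; the market data are illustrative.

```python
def markowitz_portfolio(mu, mean_payoff, price, variances):
    """Minimum-variance portfolio with unit initial wealth and expected payoff mu.

    Assumes uncorrelated assets, so Sigma is diagonal with the given variances.
    """
    alpha = sum(m * m / s for m, s in zip(mean_payoff, variances))
    beta = sum(m * p / s for m, p, s in zip(mean_payoff, price, variances))
    gamma = sum(p * p / s for p, s in zip(price, variances))
    det = alpha * gamma - beta ** 2
    # Dual solution for b = (mu, 1); it is affine in mu, hence so is the portfolio.
    y = [(gamma * mu - beta) / det, (alpha - beta * mu) / det]
    return [(y[0] * m + y[1] * p) / s for m, p, s in zip(mean_payoff, price, variances)]

# Hypothetical market data.
mean_payoff = [1.10, 1.20, 1.35]
price = [1.00, 1.00, 1.00]
variances = [0.04, 0.09, 0.25]

mu1, mu2, mu = 1.15, 1.30, 1.22
theta1 = markowitz_portfolio(mu1, mean_payoff, price, variances)
theta2 = markowitz_portfolio(mu2, mean_payoff, price, variances)
theta = markowitz_portfolio(mu, mean_payoff, price, variances)

# Weights from the proof of the two fund theorem; they sum to one.
w1 = (mu - mu2) / (mu1 - mu2)
w2 = (mu - mu1) / (mu2 - mu1)
combination = [w1 * a + w2 * b for a, b in zip(theta1, theta2)]
max_error = max(abs(c - t) for c, t in zip(combination, theta))
print("weights:", w1, w2, "max error:", max_error)
```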
Remark 2.1.5 In pointing out that all portfolios on the Markowitz frontier are
generated by just two such portfolios, the two fund theorem has great practical
significance. One can often use two broad based indices to approximate the two
basic generating portfolios for the Markowitz frontier. This can be viewed as a
theoretical foundation for the passive investment strategy of buying and holding broad
based indices.
If our sole goal is to minimize the risk, then our problem becomes
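$$\begin{aligned} \text{minimize}\quad & f(x) := \tfrac{1}{2} x^\top \Sigma x\\ \text{subject to}\quad & \hat{S}_0 x = w_0. \end{aligned} \tag{2.1.18}$$
$$\Theta_{\min} = \gamma^{-1} w_0 \hat{S}_0 \Sigma^{-1}$$
$$\sigma_{\min} = \gamma^{-1/2} w_0.$$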
The capital asset pricing model (CAPM) is an equilibrium model for determining the
price of risky assets. It is based on the Markowitz mean-variance analysis, extended to
include a riskless bond. The mathematical model is
$$\begin{aligned} \text{minimize}\quad & \operatorname{Var}(\Theta \cdot S_1)\\ \text{subject to}\quad & E[\Theta \cdot S_1] = \mu\\ & \Theta \cdot S_0 = 1. \end{aligned} \tag{2.1.19}$$
It turns out that the efficient portfolios determined by (2.1.19) all lie on a straight
line in the σ μ-plane. This line is called the capital market line. Then the model
prices a risky asset according to the principle that adding it to the market does not
change the capital market line.
We derive the capital market line using convex duality first. Recall that $S_1 =
(S_1^0, \hat{S}_1)$. Since $\operatorname{Var}(S_1^0) = 0$ one can show that
Relation (2.1.20) suggests a strategy of solving problem (2.1.19) in two steps. First,
for a portfolio with $\theta = \theta_0 \ge 0$, denoting by $R = S_1^0/S_0^0$ the return on the risk free asset,
we solve the problem
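$$f(\theta) = [\sigma(\mu - \theta R, 1 - \theta)]^2 = \frac{\gamma(\mu - \theta R)^2 - 2\beta(\mu - \theta R)(1 - \theta) + \alpha(1 - \theta)^2}{\alpha\gamma - \beta^2} \tag{2.1.22}$$
$$\bar{\theta} = \frac{\alpha - \beta(\mu + R) + \gamma\mu R}{\alpha - 2\beta R + \gamma R^2}, \tag{2.1.23}$$
$$1 - \bar{\theta} = \frac{\mu - R}{\Delta}\,(\beta - \gamma R) \tag{2.1.24}$$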
We observe that only μ > R makes sense because by including risky assets we
always expect to get a higher return than that of the risk free asset. Note that the
risky assets are involved in the minimum variance portfolio only when 1 − θ̄ > 0.
This implies
\[ R < \frac{\beta}{\gamma} \tag{2.1.25} \]
by (2.1.24). Let us focus on the case when R satisfies (2.1.25). We can calculate
\[ \mu - \bar\theta R = \frac{\mu - R}{\Delta}\,(\alpha - \beta R). \tag{2.1.26} \]
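By the positive homogeneity of σ we have
\[ \sigma = \sigma(\mu - \bar\theta R, 1 - \bar\theta) = \frac{\mu - R}{\Delta}\,\sigma(\alpha - \beta R, \beta - \gamma R). \tag{2.1.27} \]
It is easy to verify that $\sigma(\alpha - \beta R, \beta - \gamma R) = \sqrt{\Delta}$. Thus, all the optimal portfolios
lie on the line
\[ \mu = R + \sqrt{\Delta}\,\sigma. \tag{2.1.28} \]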
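The two-step derivation can be verified numerically. The sketch below assumes the usual Markowitz constants α = E[Ŝ1]Σ^{-1}E[Ŝ1]^⊤, β = E[Ŝ1]Σ^{-1}Ŝ0^⊤, γ = Ŝ0Σ^{-1}Ŝ0^⊤ and takes Δ = α − 2βR + γR²; the market data and helper names are hypothetical.

```python
import numpy as np

def frontier_constants(mean1, s0, sigma):
    """alpha, beta, gamma of the Markowitz analysis (assumed definitions)."""
    inv_mean = np.linalg.solve(sigma, mean1)
    inv_s0 = np.linalg.solve(sigma, s0)
    return mean1 @ inv_mean, mean1 @ inv_s0, s0 @ inv_s0

def sigma_risky(mu, w, alpha, beta, gamma):
    """Minimum standard deviation of a risky portfolio with expected payoff mu and cost w."""
    var = (gamma * mu**2 - 2 * beta * mu * w + alpha * w**2) / (alpha * gamma - beta**2)
    return np.sqrt(var)

# Hypothetical market: two risky assets with unit prices and a bond with return R.
mean1 = np.array([1.08, 1.12])
s0 = np.array([1.0, 1.0])
sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
R = 1.02
alpha, beta, gamma = frontier_constants(mean1, s0, sigma)
delta = alpha - 2 * beta * R + gamma * R**2

mu = 1.06                                                        # target return, mu > R
theta_bar = (alpha - beta * (mu + R) + gamma * mu * R) / delta   # (2.1.23)
sd = sigma_risky(mu - theta_bar * R, 1 - theta_bar, alpha, beta, gamma)
cml_mu = R + np.sqrt(delta) * sd                                 # (2.1.28)
print(theta_bar, sd, cml_mu)
```

With these inputs the value `cml_mu` recovered from the line (2.1.28) agrees with the target `mu` up to rounding, which is what the derivation predicts.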
This line in the σμ-plane is usually referred to as the capital market line. This
linear structure of the optimal portfolios suggests that we can derive all the optimal
portfolios as linear combinations of two distinct portfolios. Taking the risk free
bond and a portfolio of pure risky assets we have the following.
Theorem 2.1.7 (Two Fund Separation Theorem) All the optimal portfolios on
the capital market line can be represented as the linear combination of the riskless
bond and the capital market portfolio
Proof Clearly the riskless bond is on the capital market line and can be represented
in the σμ-plane as (0, R). We now seek a portfolio on the capital market line that
contains only risky assets. We denote its coordinates by (σM, μM). Note that such a
portfolio corresponds to θ̄ = 0. It follows from (2.1.24) that
\[ \mu_M = R + \frac{\Delta}{\beta - \gamma R} = \frac{\alpha - \beta R}{\beta - \gamma R}. \tag{2.1.31} \]
Thus, we can find the risky part of the capital market portfolio by solving
\[ \hat\Theta_M = \frac{E[\hat S_1] - R\hat S_0}{\beta - \gamma R}\,\Sigma^{-1}. \tag{2.1.33} \]
Noting that the weight on the riskless bond is 0 for the capital market portfolio we
arrive at the representation in (2.1.29): ΘM = (0, Θ̂M ).
Finally, comparing (2.1.28) and (2.1.31), we derive
\[ \sigma_M = \frac{\sqrt{\Delta}}{\beta - \gamma R}. \tag{2.1.34} \]
Clearly, the point (σM, μM) lies on the boundary of the Markowitz bullet.
Moreover, since the capital market line represents optimal portfolios, the Markowitz
frontier must lie below it. Thus, the capital market line must be tangent to the
Markowitz frontier at (σM, μM) (see Figure 2.2). As a result, if R ≥ β/γ, there
is no capital market line (see Figure 2.3), which confirms what has been derived
analytically in (2.1.25).
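The capital market portfolio and its coordinates can also be computed directly. This is a sketch under the same assumed definitions of α, β, γ and Δ = α − 2βR + γR²; the market data are hypothetical.

```python
import numpy as np

# Hypothetical market: two risky assets with unit prices and a bond with return R.
mean1 = np.array([1.08, 1.12])
s0 = np.array([1.0, 1.0])
sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
R = 1.02

inv_mean = np.linalg.solve(sigma, mean1)
inv_s0 = np.linalg.solve(sigma, s0)
alpha, beta, gamma = mean1 @ inv_mean, mean1 @ inv_s0, s0 @ inv_s0
delta = alpha - 2 * beta * R + gamma * R**2
if not R < beta / gamma:                        # condition (2.1.25)
    raise ValueError("no capital market line for this R")

theta_m = (inv_mean - R * inv_s0) / (beta - gamma * R)   # risky part, (2.1.33)
mu_m = theta_m @ mean1                                   # expected return
sigma_m = np.sqrt(theta_m @ sigma @ theta_m)             # standard deviation
slope = (mu_m - R) / sigma_m                             # slope of the capital market line
print(theta_m, mu_m, sigma_m, slope)
```

The computed `mu_m` and `sigma_m` can be compared with (2.1.31) and (2.1.34), and `slope` with the coefficient √Δ in (2.1.28).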
Using the fact that both (0, R) and (σM , μM ) belong to the capital market line
we can rewrite the capital market line as
\[ \mu = \frac{\mu_M - R}{\sigma_M}\,\sigma + R. \tag{2.1.35} \]
The theorem below tells us how to use this capital market line to price a risky asset
in terms of its expected return.
Theorem 2.1.8 (Capital Asset Pricing Model) Suppose that we know a financial
market S with a riskless bond returning R. Let $a^i$ be a fairly priced risky asset with
[Figure: plot in the σμ-plane marking the points (0, R) and (σM, μM).]
44 2 Financial Models in One Period Economy
Denoting the expected return and the standard deviation of p(α) by μα and σα,
respectively, we have
and
where μi and σi are the expected return and standard deviation of asset $a^i$,
respectively. The parametric curve (σα, μα) must lie below the capital market line
because the latter consists of optimal portfolios. On the other hand, it is clear that
when α = 0 this curve meets the capital market line. Thus, the capital
market line is a tangent line of the parametric curve (σα, μα) at α = 0. It follows that
\[ \frac{\mu_M - R}{\sigma_M} = \left.\frac{d\mu_\alpha}{d\sigma_\alpha}\right|_{\alpha=0} = \frac{\sigma_M(\mu_i - \mu_M)}{\sigma_{iM} - \sigma_M^2}. \tag{2.1.40} \]
2.1 Portfolio 45
On reflection, we realize that constructing the capital market portfolio
theoretically requires every risky asset available to us. Given
the huge number of available equities, constructing the capital market portfolio is
practically impossible even if we have accurate probability distribution information
on all the available risky assets (which is another impossible task). Thus, we have
to deal with suboptimal situations. What happens if we mix the risk free asset with an
arbitrary portfolio of risky assets (not necessarily the capital market portfolio)? Let
$\hat\Theta = (\theta_1, \ldots, \theta_M)$ be such a portfolio corresponding to risky assets $(a^1, \ldots, a^M)$
with price random vector $\hat S = (S^1, \ldots, S^M)$. Again we standardize the portfolio so
that $\hat\Theta \cdot \hat S_0 = 1$. Denote $\mu^* = E[\hat\Theta \cdot \hat S_1]$ and $\sigma^* = \sqrt{\mathrm{Var}(\hat\Theta \cdot \hat S_1)}$. Then any mix of
this portfolio with a risk free asset having return R will produce a portfolio whose
expected return μ and standard deviation σ lie on the line
\[ \mu = \frac{\mu^* - R}{\sigma^*}\,\sigma + R. \tag{2.1.42} \]
Portfolios of risky assets with larger $(\mu^* - R)/\sigma^*$ have the potential of generating
higher return for a fixed level of risk (see Figure 2.4). Sharpe proposed a ratio based
on this idea to compare risky portfolios such as those maintained by mutual funds.
As an illustration, suppose that $R_1, \ldots, R_N$ are the monthly returns of a mutual fund
a in the past N months and the monthly return of the risk free asset is R. Define a
random variable X with finite values $\{R_n - R \mid n = 1, \ldots, N\}$ and
$\mathrm{prob}(X = R_n - R) = 1/N$. Then the Sharpe ratio of a is defined as
\[ s(a) = \frac{E[X]}{\sqrt{\mathrm{Var}(X)}}. \tag{2.1.43} \]
We can see that the Sharpe ratio is, in fact, a statistical estimate of $(\mu^* - R)/\sigma^*$.
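The definition (2.1.43) translates directly into a few lines of code. The sketch below uses made-up monthly returns; the function name `sharpe_ratio` is only illustrative. Since each observation carries probability 1/N, the variance is the population variance (`ddof=0`), not the sample variance.

```python
import numpy as np

def sharpe_ratio(fund_returns, risk_free_return):
    """Sharpe ratio E[X] / sqrt(Var(X)) with X = R_n - R, each with probability 1/N."""
    excess = np.asarray(fund_returns, dtype=float) - risk_free_return
    return excess.mean() / excess.std(ddof=0)

# Hypothetical monthly returns of a fund and of the risk free asset.
monthly = [0.012, -0.004, 0.021, 0.008, -0.010, 0.015]
s = sharpe_ratio(monthly, risk_free_return=0.002)
print(s)
```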
In financial problems maximizing utilities and minimizing risks are constant themes.
In the Markowitz portfolio theory, one uses expected return to measure performance
and the variance to measure the risk. They are among the simplest of such measures.
Since utility functions are concave and risk measures are convex, convex analysis is
a natural tool in dealing with financial modeling.
In 1713 Nicolas Bernoulli posed the following problem, later known as the St.
Petersburg Wager paradox:
“Peter tosses a coin and continues to do so until it should land ‘heads’ when it
comes to the ground. He agrees to give Paul one ducat¹ if he gets ‘heads’ on the
very first throw, two ducats if he gets it on the second, four if on the third, eight if
on the fourth, and so on, so that with each additional throw the number he must
pay is doubled. Suppose we seek to determine the value of Paul’s expectation.”
Assuming a fair coin we can easily calculate the expectation to be
\[ \sum_{n=1}^{\infty} 2^{n-1} \cdot P(\text{getting the first head on the } n\text{th throw}) = \sum_{n=1}^{\infty} 2^{n-1}\,\frac{1}{2^n} = \sum_{n=1}^{\infty} \frac{1}{2} = \infty. \]
¹ Currency unit.
2.2 Utility Functions 47
The paradox is that, according to this computation, the right to play such a game
would have infinite value. In other words, one would be willing to pay any price
to play it, which is obviously absurd.
Daniel Bernoulli, Nicolas's cousin, suggested a solution in 1738 which later became
highly influential. Observing that an extra 100 ducats may be a small fortune to a
poor person but mean little to a rich one, Daniel Bernoulli argued that people
intuitively value money not according to its face value but according to its relative
usefulness. Mathematically, he introduced a utility function to capture this. For the
St. Petersburg Wager problem, Bernoulli suggested using u(x) = ln(x) as the utility
function. Bernoulli chose ln as a utility function because of two properties of this
function. First, the ln function is increasing, signaling that more is better. Second,
the derivative of the ln function is 1/x, which is decreasing. This matches the intuition
that the more you have the less you care about additional money. Abstractly, let
us denote a utility function by u(x). For convenience let us assume u is twice
differentiable. Then we can characterize the above two properties as u′(x) ≥ 0
and u″(x) ≤ 0. Alternatively, without assuming differentiability of u, we can
encode the intuition above mathematically by requiring a utility function to be an
increasing concave function. We say a function f : R → R is concave if and
only if −f is convex. Usually we assume
rational agents maximize their expected utility when making decisions. Thus,
convex optimization becomes important in analyzing financial problems.
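Bernoulli's resolution can be illustrated with a short computation. This is a simplified sketch that applies u(x) = ln(x) to the payoff of the game alone (ignoring the player's existing wealth, which is an assumption made here for brevity); the function names are illustrative. The expected payoff grows without bound as more terms are included, while the expected log utility converges.

```python
import math

def expected_payoff(n_terms):
    """Partial sum of the expected payoff: each term contributes 2^(n-1) / 2^n = 1/2."""
    return sum(2 ** (n - 1) * 0.5 ** n for n in range(1, n_terms + 1))

def expected_log_utility(n_terms):
    """Partial sum of E[ln(payoff)] = sum_n 2^(-n) * ln(2^(n-1))."""
    return sum(0.5 ** n * math.log(2 ** (n - 1)) for n in range(1, n_terms + 1))

for n_terms in (10, 20, 40):
    print(n_terms, expected_payoff(n_terms), expected_log_utility(n_terms))

# Sure amount with the same log utility as the game.
certainty_equivalent = math.exp(expected_log_utility(200))
print(certainty_equivalent)
```

Since the sum of (n − 1)/2^n over n ≥ 1 equals 1, the expected log utility converges to ln 2, so under this simplification the game is worth the same as a sure payment of two ducats.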
There are many increasing concave functions. A few are listed below.
• Power utility: $(x^{1-\gamma} - 1)/(1 - \gamma)$, γ > 0, γ ≠ 1.
• Log utility: ln(x).
• Exponential utility: $-e^{-\alpha x}$, α > 0.
In dealing with a particular application problem the choice of the utility function
is often based on economic or tractability considerations. Different agents can have
different utility functions that reflect their own attitude towards rewards and risks of
various degree.
For our mathematical model, it is important to know what kind of general
conditions we should impose on a utility function. We consider a general extended
valued upper semicontinuous utility function u. The following is a collection of
additional conditions that are often used in financial models to accommodate
different levels of tolerance to risk:
(u1) (Risk aversion) u is strictly concave,
(u2) (Profit seeking) u is strictly increasing and $\lim_{t\to+\infty} u(t) = +\infty$,
(u3) (Bankruptcy forbidden) For any t < 0, u(t) = −∞.
\[ A(x) = -\frac{u''(x)}{u'(x)}. \]
\[ u(x) = \frac{(x - x_0)^{1-\gamma}}{1-\gamma} \]
\[ R(x) = -\frac{x\,u''(x)}{u'(x)}. \]
When ARA decreases the investor will increase risky investment in absolute
amount. Similarly, when RRA decreases the investor will increase risky investment
in percentage.
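To make the two coefficients concrete, the following sketch estimates A(x) = −u″(x)/u′(x) and R(x) = xA(x) by central finite differences for the three utility functions listed earlier. The parameter values and the helper name `risk_aversion` are illustrative choices, not taken from the text.

```python
import math

def risk_aversion(u, x, h=1e-4):
    """Finite-difference estimates of A(x) = -u''(x)/u'(x) and R(x) = x * A(x)."""
    d1 = (u(x + h) - u(x - h)) / (2 * h)
    d2 = (u(x + h) - 2 * u(x) + u(x - h)) / h**2
    a = -d2 / d1
    return a, x * a

gamma_, alpha_ = 2.0, 0.5   # illustrative parameter values
utilities = {
    "power": lambda x: (x ** (1 - gamma_) - 1) / (1 - gamma_),
    "log": math.log,
    "exponential": lambda x: -math.exp(-alpha_ * x),
}
for name, u in utilities.items():
    for x in (1.0, 2.0, 4.0):
        a, r = risk_aversion(u, x)
        print(f"{name:12s} x={x:3.1f}  ARA={a:.4f}  RRA={r:.4f}")
```

Analytically, the power utility has constant RRA equal to γ, the log utility has constant RRA equal to 1, and the exponential utility has constant ARA equal to α; the numerical estimates should agree with these values up to discretization error.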
The property that a utility function has bounded ARA and RRA can be
characterized by generalized convexity. We showcase the proof for RRA.
Theorem 2.2.3 (Characterization of Bounded Relative Risk Aversion) Let u :
R+ → R be an increasing (decreasing) function with continuous second order
derivative. Then, for any p ∈ R, u has a coefficient of relative risk aversion
R(x) ≤ (≥) 1 − p if and only if u is Φ(x^p y)(1)-convex.
Proof We focus on the case that u is increasing; the decreasing case is similar.
The “If” part. Assume u is Φ(x^p y)(1)-convex. Then, for any x > 0 we can find
y(x), b(x) such that
\[ R(x) = -\frac{x\,u''(x)}{u'(x)} \le 1 - p. \]
Similarly, we have
Theorem 2.2.4 (Characterization of Bounded Absolute Risk Aversion) Let u :
R+ → R be an increasing (decreasing) function with continuous second order
derivative. Then, for any p ∈ R, u has a coefficient of absolute risk aversion
A(x) ≤ (≥) p if and only if u is Φ(e^{−px} y)(1)-convex.
Remark 2.2.5 It is not hard to see that the above two theorems are also valid
for functions with piecewise continuous second order derivatives. As a concrete
example, Figure 2.5 illustrates that the function $f(x) = \sqrt{x}$ is Φ(x^{−1/2} y)(1)-convex.
We can see there how the top curve $\sqrt{x}$ is represented as an envelope of a class of
functions of the form $x^{-1/2} y - b$ for different parameters (y, b).
Now consider investing for the long run (multiple periods) and trying to maximize
the compounded return, assuming that the financial market behaves the same in
each period as described in Section 2.1. The compounded return is much easier to
handle in percentage terms. We standardize the financial market by assuming
$S_0 = \mathbf{1} := (1, 1, \ldots, 1)$ so that $g = \hat S_1 - \hat S_0$ represents the vector of percentage
returns of the risky assets in the market. We also assume the risk free rate is 0 so
that $S_1^0 = 1$. Similarly, we focus on portfolios that represent a percentage allocation
of the initial endowment into the financial market, i.e., we require
$\Theta \cdot S_0 = \Theta \cdot \mathbf{1} = 1$. When the initial endowment is w0 the portfolio will be
implemented as w0Θ. The advantage of focusing on the percentage portfolio is that,
when dealing with an investment over multiple periods that repeat an identical one
period market model, the percentage portfolio in each period is the same. The growth
portfolio theory seeks the portfolio that maximizes the average compounded return
in the above setting. This can be phrased as the utility maximization problem
or equivalently
In fact, consider investing the initial endowment w0 for l periods and rebalancing
with a fixed (percentage) portfolio Θ in each period. Let $w_k$ denote the balance
at the kth period. Assuming sample $\omega_n = B_n \in \Omega$ occurs $l_n$ times, the total gain will be
\[ \frac{w_l}{w_0} = \prod_{n=1}^{N} \bigl(1 + \hat\Theta \cdot g(B_n)\bigr)^{l_n}. \tag{2.2.3} \]
Observing that when l → ∞, $l_n/l \to P(B_n)$, the average gain per period in the
long run is
\[ \prod_{n=1}^{N} \bigl(1 + \hat\Theta \cdot g(B_n)\bigr)^{P(B_n)}. \tag{2.2.5} \]
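The limit behind (2.2.5) can be illustrated by simulation. The sketch below assumes a single risky asset, so that Θ̂ · g reduces to s·g for a scalar leverage s; the returns, probabilities, seed, and horizon are hypothetical choices.

```python
import numpy as np

g = np.array([-0.5, 0.1, 0.6])     # hypothetical per-period returns of the risky asset
p = np.array([0.3, 0.4, 0.3])      # probabilities of the three states
s = 0.5                            # fraction of wealth held in the risky asset each period

# Long-run average gain per period predicted by (2.2.5).
predicted = np.prod((1 + s * g) ** p)

rng = np.random.default_rng(0)
l = 200_000
states = rng.choice(len(g), size=l, p=p)          # simulated state in each period
# Sum logs instead of multiplying gains to avoid overflow or underflow in w_l / w_0.
log_total_gain = np.sum(np.log1p(s * g[states]))
simulated = np.exp(log_total_gain / l)            # (w_l / w_0)^(1/l)
print(predicted, simulated)
```

The simulated geometric average approaches the predicted value as the number of periods grows, reflecting the convergence of the frequencies l_n / l to P(B_n).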
We will call f(s) a log return function. We refer to the portfolio weight s on the
risky asset as leverage. The leverage level that maximizes the log return function
f(s) is called the optimal leverage.
Theorem 2.2.6 (Compute the Optimal Leverage) Assume without loss of generality
that $g_1 < g_2 < \ldots < g_N$. Then the optimal leverage s̄ is determined by the
unique solution of the (N − 1)th order polynomial equation
\[ 0 = \Bigl(\prod_{n=1}^{N} (1 + s g_n)\Bigr) \sum_{n=1}^{N} \frac{p_n g_n}{1 + s g_n} \tag{2.2.7} \]
on $(-1/g_N, -1/g_1)$.
which is the optimal leverage.
Finally, we observe that the polynomial $\prod_{n=1}^{N} (1 + s g_n)$ has no root in the
interval $(-1/g_N, -1/g_1)$, which shows that s̄ must be the unique solution of the
(N − 1)th order polynomial equation
\[ 0 = \Bigl(\prod_{n=1}^{N} (1 + s g_n)\Bigr) \sum_{n=1}^{N} \frac{p_n g_n}{1 + s g_n} \]
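Since the product factor does not vanish on the interval, the optimal leverage is the root of the sum alone, which is the derivative of the log return function f(s) = Σ p_n ln(1 + s g_n). The following is a minimal numerical sketch using bisection rather than polynomial root finding; the function name and sample data are hypothetical, and it assumes min(g) < 0 < max(g) so that the interval is bounded.

```python
def optimal_leverage(p, g, tol=1e-12):
    """Maximize f(s) = sum p_n ln(1 + s g_n) by bisection on f'(s).

    Assumes min(g) < 0 < max(g); f is strictly concave on
    (-1/max(g), -1/min(g)), so f' has exactly one root there.
    """
    def fprime(s):
        return sum(pn * gn / (1 + s * gn) for pn, gn in zip(p, g))

    lo, hi = -1 / max(g), -1 / min(g)
    # f' tends to +infinity at the left end and -infinity at the right end;
    # step slightly inside to stay within the domain of f.
    eps = 1e-9 * (hi - lo)
    lo, hi = lo + eps, hi - eps
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fprime(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical three-state market.
s_bar = optimal_leverage([0.3, 0.4, 0.3], [-0.5, 0.1, 0.6])
print(s_bar)
```

Bisection is used here only for simplicity; any bracketing root finder would serve, because f′ is strictly decreasing on the interval.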
When the market has only two or three states, explicit solutions are not hard to
derive. These results are very useful for analyzing betting on games and are,
therefore, presented below.
Proposition 2.2.7 (Two States) Consider a market with two distinct states
represented by $g_1 < g_2$ corresponding to probabilities $p_1$ and $p_2$, respectively. Then
the optimal leverage is
\[ \bar s = -\frac{p_1 g_1 + p_2 g_2}{g_1 g_2}. \tag{2.2.9} \]
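Proof The log return function for such an investment system is
$f(s) = p_1 \ln(1 + s g_1) + p_2 \ln(1 + s g_2)$. By Theorem 2.2.6, the optimal leverage s̄ is
the solution of the equation
\[ 0 = (1 + s g_1)(1 + s g_2)\left[\frac{p_1 g_1}{1 + s g_1} + \frac{p_2 g_2}{1 + s g_2}\right] = p_1 g_1 + p_2 g_2 + s g_1 g_2, \]
where the last equality uses $p_1 + p_2 = 1$. Solving for s gives (2.2.9).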
Proposition 2.2.8 (Three States) Consider a market with three distinct states
represented by $g_1 < g_2 < g_3$ corresponding to probabilities $p_1$, $p_2$, and $p_3$,
respectively. Then the optimal leverage s̄ is given by
\[ \bar s = \begin{cases} 0 & \text{if } C = 0, \\[4pt] -\dfrac{p_1 g_1 + p_3 g_3}{(p_1 + p_3)\, g_1 g_3} & \text{if } g_2 = 0, \\[8pt] \dfrac{-B + \sqrt{B^2 - 4AC}}{2A} & \text{if } C < 0,\ g_2 \ne 0, \\[8pt] \dfrac{-B - \sqrt{B^2 - 4AC}}{2A} & \text{if } C > 0,\ g_2 \ne 0. \end{cases} \tag{2.2.10} \]
Remark 2.2.9 (The Kelly Criterion and the Shannon Information Rate) In
Proposition 2.2.7, if the payoffs are symmetric and standardized, $-g_1 = g_2 = 1$, then
the optimal leverage is
\[ \bar s = p_2 - p_1 \]
and the corresponding log return is
\[ f(\bar s) = p_1 \ln p_1 + p_2 \ln p_2 + \ln 2. \]
This is Shannon’s information rate for a communication channel with noise [49].
Note that when $g_1 = -1$ and $g_2 = 1$ our portfolio is equivalent to a game with
symmetric payoffs. This says that Shannon’s information rate can be explained as
the best possible outcome of using a communication channel with noise when the
signal is used for a game with symmetric payoffs.
Let us apply Proposition 2.2.7 to a simplified Blackjack game.
Example 2.2.10 (Money Management in Blackjack) In a certain version of
Blackjack, a skilled player who counts cards has a winning probability
of 51% over the house. We simplify the problem by assuming that the win and the
loss are always equal to the bet, and apply Proposition 2.2.7 to determine the best
betting size s as a percentage of the player's bankroll. In this case $g_2 = 1$ (winning
100% of the bet), $g_1 = -1$ (losing 100% of the bet), $p_2 = 51\%$ and $p_1 = 49\%$.
Thus, the optimal leverage indicates that the best betting size is
\[ \bar s = -\frac{p_1 g_1 + p_2 g_2}{g_1 g_2} = 2\%. \]
The game of Blackjack has changed a lot and the player’s advantage has mostly
slipped away due to the use of multiple decks of cards and frequent shuffling.
However, even if the assumption in Example 2.2.10 were correct, the optimal betting
size s̄ is too aggressive, as explained in the next example.
Example 2.2.11 Now consider playing a game with symmetric payoff t = −c = 1
with a winning probability of 90%. We can easily calculate that the best betting size
(optimal leverage) is s̄ = 80%. Putting 80% of your wealth on the line is clearly too
aggressive no matter how favorable the game is to you.
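The two betting examples and the identity in Remark 2.2.9 can be reproduced with a few lines. This sketch implements formula (2.2.9) directly; the function names are illustrative.

```python
import math

def two_state_leverage(p1, g1, p2, g2):
    """Optimal leverage (2.2.9) for a two-state market."""
    return -(p1 * g1 + p2 * g2) / (g1 * g2)

def log_return(s, p1, g1, p2, g2):
    """Log return function f(s) = p1 ln(1 + s g1) + p2 ln(1 + s g2)."""
    return p1 * math.log(1 + s * g1) + p2 * math.log(1 + s * g2)

for p_win in (0.51, 0.90):
    p_lose = 1 - p_win
    s_bar = two_state_leverage(p_lose, -1.0, p_win, 1.0)
    growth = log_return(s_bar, p_lose, -1.0, p_win, 1.0)
    shannon = p_lose * math.log(p_lose) + p_win * math.log(p_win) + math.log(2)
    print(f"p_win={p_win:.2f}  s_bar={s_bar:.2f}  growth={growth:.6f}  shannon={shannon:.6f}")
```

For symmetric unit payoffs the optimal leverage equals p2 − p1, and the log return at that leverage coincides with the expression p1 ln p1 + p2 ln p2 + ln 2.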
Despite the shortcomings of the growth portfolio theory, as with the Markowitz
portfolio theory the idea can also be used to construct a criterion for evaluating
investment performance. The key is to realize, by examining, e.g., Proposition 2.2.7,
that the effectiveness of an investment strategy must be evaluated at an appropriate
leverage level.
Example 2.2.12 We consider two simplified investment strategies labeled S1
(Strategy 1) and S2 (Strategy 2), respectively. We assume that each strategy has two
possible returns with the corresponding probabilities specified below:
For illustration let’s assume each strategy is used over ten periods with the same
initial endowment of $100 at two different leverage levels of 20% and 100%. The
first column in Table 2.1 represents the periods. The next two columns represent
the returns in each period for the two strategies S1 and S2, respectively. The last
four columns are the balances of strategies S1 and S2 at different periods when used
with the two different leverage levels 100% and 20%, respectively. The results show
that with a leverage level of 100% of the available capital for each strategy, Strategy
2 is better than Strategy 1, but with a leverage level of 20% Strategy 1 becomes the
better one.
\[ \gamma = \max_{s \in [-1/\max(g_n),\, -1/\min(g_n)]} \sum_{n=1}^{N} p_n \ln(1 + s g_n), \tag{2.2.11} \]
[Figure 2.6: curves for System 1 and System 2 plotted against the leverage s.]
Table 2.2 simultaneously in Figure 2.6 we can understand the reasons behind the
phenomenon observed in Example 2.2.12. Moreover, we see that neither strategy
was tested in Example 2.2.12 at the optimal leverage.
Using Proposition 2.2.7 we can calculate that, for Strategy 1, s̄ = 50%, γ = 0.035
and for Strategy 2, s̄ = 83%, γ = 0.02. Comparing the efficiency indices we can
see that Strategy 1 is the better one. Yet this fact is hard to unveil without the help
of the efficiency index.
We can see that increasing or reducing the share of cash in the portfolio clearly
swings the leverage level as measured by the magnitude of Θ, yet does nothing to
the gain (2.3.1). The following example shows that even if we fix the share of the
cash, such a phenomenon can still happen.
Example 2.3.1 (Infinitely Many Portfolios with Equivalent Gain) Consider a state
space Ω = {0, 1} and a financial market with three risky assets whose prices
at times 0 and 1 are given by S0 = (1, 1, 1, 1), S1(0) = (1, 0.8, 0.9, 1), and S1(1) =
(1, 1.1, 1.2, 1.1). We can easily verify that for the portfolio Θ̄ = (1, 1, −2, 3),
Θ̄ · (S1 − S0)(i) = 0 for both i = 0 and i = 1. It follows that, for any portfolio Θ and
any r ∈ R, all the portfolios Θ + rΘ̄ have the same gain.
\[ \Theta^1 \cdot S_0 = \Theta^2 \cdot S_0, \qquad \Theta^1 \cdot (S_1 - S_0) = \Theta^2 \cdot (S_1 - S_0). \tag{2.3.2} \]
We will use S[Θ] to denote all the portfolios that are equivalent to Θ in market S.
Since all the portfolios in S[Θ] are equivalent we prefer those that have low
leverage as measured by ‖·‖, the Euclidean norm on $R^{M+1}$. The following lemma
provides us with an optimally leveraged portfolio in each equivalence class.
Lemma 2.3.3 For any portfolio Θ in S, the optimization problem
Here $\|\cdot\|_{RV}$ is the norm on RV(Ω, F, P) introduced in Section 2.1 induced by the
inner product defined in (2.1.1).
Proof Note that problem (2.3.3) and the following problem (2.3.5) have the same
solution
Denote
\[ A = \begin{bmatrix} S_1(B_1) - S_0 \\ S_1(B_2) - S_0 \\ \vdots \\ S_1(B_N) - S_0 \end{bmatrix}, \]
2.3 Fundamental Theorem of Asset Pricing 57
where $\{B_1, \ldots, B_N\}$ is the set of atoms of the probability space (Ω, F, P). Then
A is an N × (M + 1) matrix.
We observe that x ∈ S[Θ] amounts to requiring
Ax = Θ · (S1 − S0 ). (2.3.6)
we have (2.3.4).
If rank(A) < min(M + 1, N), then we can first remove the rows or columns in
A that are dependent on others and then apply the above special case to the reduced
matrix A.
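As a numerical illustration, the sketch below builds the matrix A for the market of Example 2.3.1 and computes the portfolio of smallest Euclidean norm among those satisfying the gain condition (2.3.6). Using the Moore-Penrose pseudoinverse for this is an assumption made here (it is one standard way to obtain a minimum norm solution and also handles rank deficiency); the starting portfolio `theta` is hypothetical.

```python
import numpy as np

s0 = np.array([1.0, 1.0, 1.0, 1.0])
s1 = np.array([[1.0, 0.8, 0.9, 1.0],      # S1 in state 0
               [1.0, 1.1, 1.2, 1.1]])     # S1 in state 1
A = s1 - s0                               # N x (M+1) matrix of gains, one row per atom

theta = np.array([0.0, 1.0, 1.0, 1.0])    # hypothetical starting portfolio
theta_bar = np.array([1.0, 1.0, -2.0, 3.0])
print(A @ theta_bar)                      # zero gain in both states, up to rounding

# Minimum Euclidean norm x with A x = A theta (pseudoinverse solution).
x_min = np.linalg.pinv(A) @ (A @ theta)
print(x_min, np.linalg.norm(x_min), np.linalg.norm(theta))
```

Every portfolio of the form theta + r * theta_bar has the same gain as theta, and none of them has a smaller norm than `x_min`.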
Definition 2.3.4 (Portfolio Space) We call the quotient space of $R^{M+1}$ with
respect to the portfolio equivalence relation in market S the portfolio space on
S and denote it port[S]. For Θ ∈ port[S] we define its norm by
\[ \|\Theta\|_p = \|\Theta\|. \]
Gain without risk is what every investor desires. Arguably, such opportunities will
not last once everyone tries to chase them. Based on this observation, a guiding
principle in a financial market is that such a “free lunch” should not exist. The
following is a formal definition.
Definition 2.3.5 (Arbitrage) We say that a portfolio Θ is an arbitrage if it involves
no risk, so that Θ · (S1 − S0) ≥ 0, and has an opportunity to gain something:
Θ · (S1 − S0) ≠ 0.
A rational investor with a utility function u satisfying conditions (u1)–(u3) will
try to maximize the expected utility of the final wealth among all portfolios in
port[S]. In other words, if w0 > 0 is the initial wealth of the investor, he wants
to solve the following portfolio utility maximization problem. Find:
Therefore, Θ ∗ is an arbitrage.
The fundamental theorem of asset pricing (FTAP) links no arbitrage with the
existence of certain type of measures defined below:
Definition 2.3.7 (Equivalent Martingale Measure) We say that Q is an equivalent
martingale measure (EMM) on economy (Ω, F, P) for financial market S
provided that, for any atom $B_i$ of F, $Q(B_i) = 0$ if and only if $P(B_i) = 0$, and
\[ E^Q[S_1] = S_0. \]
Given an initial wealth w0 > 0, the set of all achievable wealth outcomes at the
end of the one period economy t = 1 using all possible portfolios is
W ∩ RV (Ω, F, P )+ \{0} = ∅.
Define f(y) = −E[u(y)] and $g(y) = \iota_{w_0 + W}(y)$. Then we can rewrite
problem (2.3.8) as
(corresponding to (1.2.2)) holds, Fenchel strong duality implies that (2.3.9) and its
dual (2.3.10) have the same value.
By Theorem 2.3.6, port[S] contains no arbitrage if and only if the optimal value
of problem (2.3.7) is finite and, therefore, the values of the dual problems (2.3.9)
and (2.3.10) are finite. Since W is a subspace, the fact that the optimal value of
(2.3.10) is not −∞ implies that its solution z satisfies z ⊥ W. Moreover,
$E[(-u)^*(-z)] > -\infty$ implies that $z(B_i) > 0$ for all $B_i$ with $P(B_i) \ne 0$. Thus,
Q = z/E[z] is an S-martingale measure equivalent to P.
That is, (i) implies (ii).
On the other hand, the existence of an equivalent S-martingale measure
implies that the constraint qualification condition for (2.3.10) holds. In fact,
problem (2.3.10) can be viewed as minimizing the convex function
$z \mapsto E[(-u)^*(-z)] + \langle w_0, z\rangle$ over the entire subspace $W^\perp$ (z > 0 is merely a
consequence of the domain of $E[(-u)^*(\cdot)]$ being a subset of $\mathrm{int}[-RV(\Omega, F, P)_+]$
and, therefore, is not a separate constraint). Thus, the constraint qualification
condition for (2.3.10) is satisfied (see, e.g., [62, Theorem 2.7.1]). It follows that
problem (2.3.7), which is equivalent to (2.3.9) as the dual of (2.3.10), has a finite
value and attains its solution, which is to say (ii) implies (iii).
Finally, if (iii) is true, then there cannot be any arbitrage in port[S] because
adding an arbitrage to the optimal solution of (2.3.7) will improve it. Thus, (iii)
implies (i) and we have completed a cyclic proof of the equivalence of (i), (ii), and
(iii).
We already know from the proof of Theorem 2.3.8 that this problem has a
solution (x*, Θ*). Moreover, strong duality holds and the dual problem has a
solution, which implies that problem (2.3.12) has a Lagrange multiplier.
Let λ be the Lagrange multiplier of problem (2.3.12). Then the Lagrangian is
\[ E[(\lambda/E[\lambda])\,S_1] = S_0. \]
Assume there is no arbitrage. Then Theorem 2.3.6 implies that the optimal value
of problem (2.3.13) is finite and is attained at (x*, β*, Θ*). As in the previous
section, we can check that the constraint qualification condition for
problem (2.3.13) is satisfied and, therefore, problem (2.3.13) has a Lagrange multiplier
λ ∈ RV(Ω, F, P) such that the Lagrangian
\[ E[(\lambda/E[\lambda])\,S_1] = S_0. \]
\[ \phi_0 = E^Q[\phi_1], \]
in other words, if there is no arbitrage then the price of the contingent claim must be
the expectation of its payoff under one of the martingale measures that are equivalent
to P.
We can see from above that martingale measures and, therefore, the resulting
prices of the contingent claim depend on the choice of utility functions. We now
give a simple example that explicitly calculates the martingale measures in terms of
a class of utility functions.
Example 2.3.11 Consider a market S containing only one risky asset. Assume that
the market has N states $\Omega = \{\omega_1, \ldots, \omega_N\}$ and state $\omega_n$ happens with probability
$p_n$. Assume for simplicity that $S_0 = 1$ and denote $x_n := S_1(\omega_n) - S_0$. In this
case a trading strategy Θ is simply a constant θ indicating the share of S that the
trader holds. Given a utility function u satisfying properties (u1)–(u3) the utility
optimization problem is
\[ \max_{\theta}\, E[u(1 + \theta \cdot (S_1 - S_0))] = \max_{\theta} \sum_{n=1}^{N} p_n u(1 + \theta x_n). \tag{2.3.14} \]
\[ \begin{aligned} \min\quad & -\sum_{n=1}^{N} p_n u(y_n) \\ \text{subject to}\quad & y_n - 1 - \theta x_n = 0, \quad n = 1, \ldots, N. \end{aligned} \tag{2.3.15} \]
\[ L((y, \theta), \lambda) = -\sum_{n=1}^{N} p_n \bigl[u(y_n) + \lambda_n (y_n - 1 - \theta x_n)\bigr]. \]
\[ \sum_{n=1}^{N} p_n \lambda_n x_n = 0, \tag{2.3.16} \]
and
\[ \lambda_n = u'(y_n) = u'(1 + \theta x_n). \tag{2.3.17} \]
Equation (2.3.16) clearly shows that a scaled λ gives us the martingale measure. To
solve for θ so as to derive the solution to the utility optimization problem (2.3.14)
we can substitute (2.3.17) into (2.3.16) to get the following equation for θ ,
\[ \sum_{n=1}^{N} p_n u'(1 + \theta x_n)\, x_n = 0. \tag{2.3.18} \]
Equation (2.3.17) clearly shows that the martingale measure depends on the choice
of utility function.
\[ u_c(x) = \begin{cases} \ln x + cx & x > 0, \\ -\infty & x \le 0, \end{cases} \]
\[ L((y, \theta), \lambda) = -\sum_{n=1}^{N} p_n \bigl[\ln(y_n) + c y_n + \lambda_n (y_n - 1 - \theta x_n)\bigr]. \]
Numerically solving (2.3.19) and (2.3.20) and scaling the Lagrange multipliers
yields Table 2.3, which relates c to the optimal portfolio θ̄ and the risk neutral
measure π:
We can see that, for fixed w0, θ̄ increases with c, a fact that is not
hard to verify in general from Equation (2.3.20). Note that in our family
of utility functions depending on the parameter c, a decrease of c corresponds to
an increase of risk aversion. On the other hand, for a fixed utility function (fixed c),
a decrease of w0 corresponds to an increase of risk aversion (see Table 2.4). This
is consistent with an intuitive explanation of the change in the martingale measure:
the weight in the middle (π2) increases while the weights on both
extremes (π1 and π3) decrease.
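The numerical procedure can be sketched as follows. The code solves the first order condition of the form (2.3.18) for the utility u_c(x) = ln x + cx by bisection and then scales p_n λ_n into a probability vector. The three-state data, the function names, and the generalization from initial wealth 1 to a parameter w0 are assumptions made for illustration; the book's own tables are not reproduced here.

```python
def marginal_utility(y, c):
    """u_c'(y) for u_c(y) = ln(y) + c*y with y > 0."""
    return 1.0 / y + c

def solve_theta(p, x, c, w0=1.0, tol=1e-12):
    """Solve sum_n p_n u_c'(w0 + theta x_n) x_n = 0 for theta by bisection.

    Assumes min(x) < 0 < max(x); the left side is strictly decreasing in theta.
    """
    def h(theta):
        return sum(pn * marginal_utility(w0 + theta * xn, c) * xn for pn, xn in zip(p, x))

    lo, hi = -w0 / max(x), -w0 / min(x)   # keeps w0 + theta x_n > 0 in every state
    eps = 1e-9 * (hi - lo)
    lo, hi = lo + eps, hi - eps
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def martingale_measure(p, x, c, w0=1.0):
    """Optimal theta and the measure proportional to p_n * lambda_n, as in (2.3.16)-(2.3.17)."""
    theta = solve_theta(p, x, c, w0)
    weights = [pn * marginal_utility(w0 + theta * xn, c) for pn, xn in zip(p, x)]
    total = sum(weights)
    return theta, [w / total for w in weights]

# Hypothetical three-state market: probabilities and price changes x_n.
p = [0.3, 0.4, 0.3]
x = [-0.5, 0.1, 0.6]
for c in (0.0, 0.2, 0.5):
    theta, q = martingale_measure(p, x, c)
    print(c, round(theta, 4), [round(qn, 4) for qn in q])
```

By construction, each resulting measure q satisfies Σ q_n x_n = 0, so the risky asset is a martingale under q, and different values of c produce different measures.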
[Figure 2.7: expected utility $f_p$ as a function of θ for prices p = 0.296, 0.297, and 0.298.]
Fixing the utility ln(x) + 0.2x and pricing C using the equivalent martingale measure
from the previous example gives the results in Table 2.5:
With u(x) = ln(x) + 0.2x and w0 = 3, the table gives p = 0.297. This is the
private price of the agent corresponding to his/her risk aversion. The meaning of
this private price is that, to improve his/her utility, the agent should buy (long) when
the market price is lower than p = 0.297 and sell (short) when the market price is
higher. Figure 2.7 shows the expected utility
The utility optimization point of view also explains that trading will happen
between agents with different risk aversion determined by utility and initial
endowment. For example, assume the same utility u(x) = ln(x) + 0.2x for all
agents. If the market price is 0.297, then agents with w0 = 1 will sell, agents with
w0 = 6 will buy, while agents with w0 = 3 will take no action.
We have seen that in general the martingale measure is not unique and is
related to the investor’s utility function. One exception is when the financial market
is complete as defined below:
Definition 2.3.15 (Complete Market) We say a financial market S is complete if
{Θ · S1 | Θ ∈ port[S]} = RV (Ω, F, P ),
or equivalently
\[ \{1_B : B \in F\} \subset \{\Theta \cdot S_1 \mid \Theta \in \mathrm{port}[S]\}. \]
which implies that Q = (λ/E[λ])P is the unique risk neutral measure. Moreover,
since x* satisfies the constraint x* − Θ* · (S1 − S0) − w0 = 0 we also know
that $\langle \lambda, x^* - w_0 \rangle = E^Q[x^* - w_0] = 0$. Thus, x* is also a solution to the constrained
minimization problem
On the other hand, since −u is strictly convex, the solution to (2.3.21) is unique and,
therefore, must be x*. Thus, problems (2.3.12) and (2.3.21) have the same solution.
Remark 2.3.17
1. Problem (2.3.21) only provides a solution x*. To get the optimal portfolio one
has to do additional work using the constraint.
2. The equivalence of the solutions of the two problems breaks down if martingale
measures are not unique and, therefore, the above result only holds in a complete
market.
sup{E[u(x)] : x ∈ W }.
sup{E[u(x)] : x ∈ W ∩ RV (Ω, F, P )+ }.
W ∩ RV (Ω, F, P )+ = {0}.
In fact,
Since this is true for all states ω ∈ Ω, it tells us the equivalent martingale measure
is proportional to a vector in $[0, k]^N$, where N is the number of states in Ω. This
amounts to a constraint on the martingale measure. We also note that in this case
nothing is lost by picking the utility function $u(t) = t - \iota_{(-\infty,0)}(t)$ so that the utility
maximization problem becomes a linear programming problem. This way one can
use the more widely known linear programming duality instead of Fenchel duality.
This approach, however, loses the information relating to the agent’s risk aversion.
Definition 2.4.1 (Risk Measure) Let RV (Ω, F, P ) represent the payoff space.
We say a lower semicontinuous function ρ : RV (Ω, F, P ) → R ∪ {+∞} is a risk
measure if ρ is convex and decreasing, i.e., ρ(x) ≤ ρ(y) for any x ≥ y.
Convexity of risk measures reflects the belief that diversification reduces risk.
The decreasing property says that a dominant payoff is less risky. We will focus on
the following:
Definition 2.4.2 (Coherent Risk Measure) Let RV (Ω, F, P ) represent the pay-
off space. We say a lower semicontinuous function ρ : RV (Ω, F, P ) → R∪{+∞}
is a coherent risk measure if, for any x, y ∈ RV (Ω, F, P ), ρ has the following
properties:
(r1) (Positive homogeneity) ρ(rx) = rρ(x) for any r > 0,
(r2) (Subadditivity) ρ(x + y) ≤ ρ(x) + ρ(y),
(r3) (Translation property) ρ(x + c1) = ρ(x) − c for all x ∈ RV(Ω, F, P) and c ∈ R,
(r4) (Monotonicity) ρ(x) ≤ ρ(y) for any x ≥ y.
Properties (r1) and (r2) imply that a coherent risk measure is convex. Property
(r4) says a coherent risk measure is decreasing. Thus, a coherent risk measure is a
special type of risk measure. Property (r1) says that the risk measure scales
proportionally with the position. With this property a coherent risk measure is
actually sublinear. The idea of (r3) is that one may measure the risk of x by the
minimum amount of additional capital reserve needed to ensure that there is no risk
of bankruptcy. This is very important
in practice. A coherent risk measure as defined above has a simple structure and
affords several equivalent characterizations which we will discuss below.
2.4 Risk Measures 69
A coherent risk measure is convex. Any l.s.c. convex function on a finite dimensional
Banach space has the dual representation
\[ \rho^* = \iota_M, \]
where
Thus,
\[ \rho^* = \iota_M. \]
We note that the characterization of M in Proposition 2.4.3 depends on ρ. Thus
we cannot use it to describe ρ. Information that leads to a ρ-independent restriction
is useful. The axioms (r3) and (r4) provide such information.
such that
\[ \rho^* = \iota_M. \]
\[ M \subset -RV(\Omega, F, P)_+, \]
such that
\[ \rho^* = \iota_M. \]
such that
\[ \rho = \sigma_M, \]
essence picking a particular “test” set of typical losses represented by the set M
to determine the level of cash reserve for a certain investment. There are infinitely
many possibilities in choosing the set M and thus determining particular coherent
risk measures. The larger the set M, the more conservative the risk measure
(requiring higher cash reserves). In fact, this is the original motivation for the
definition of the coherent risk measure. The Chicago Mercantile Exchange margin
system is an example of using this method with a finite set M. The idea is rather
similar to a “stress” test. In implementation, it is clear that what is important is not
how many elements one includes in M but how “diversified” the elements in M are.
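A finite test set leads to a risk measure that is easy to compute. The sketch below assumes that each element of M can be represented as the negative of a probability vector q on a finite state space, so that the support function becomes ρ(x) = max over q of E_q[−x]; this representation, the scenario vectors, and the function name are illustrative assumptions.

```python
import numpy as np

def scenario_risk(x, scenarios):
    """rho(x) = max over scenario measures q of E_q[-x] (worst expected loss)."""
    return max(float(-(q @ x)) for q in scenarios)

# Hypothetical test set: three probability vectors on a four-state space.
scenarios = [np.array([0.25, 0.25, 0.25, 0.25]),
             np.array([0.70, 0.10, 0.10, 0.10]),
             np.array([0.10, 0.10, 0.10, 0.70])]
x = np.array([-2.0, 1.0, 0.5, 3.0])     # payoff of one position
y = np.array([1.0, -1.0, 2.0, 0.0])     # payoff of another position

print(scenario_risk(x, scenarios), scenario_risk(y, scenarios))
print(scenario_risk(x + y, scenarios))          # at most the sum of the two risks
print(scenario_risk(x, scenarios[:1]))          # smaller test set, smaller reserve
```

Enlarging the list of scenarios can only increase the value of `scenario_risk`, which matches the observation that a larger set M gives a more conservative risk measure.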
Definition 2.4.8 (Acceptance Cone) Let ρ be a risk measure satisfying (r1), (r2),
and (r3) in Definition 2.4.2 and define
What is interesting is that any cone that has properties (a1)–(a3) must be the
acceptance set of some coherent risk measure. This leads to the following definition.
Definition 2.4.10 (Coherent Acceptance Cone) We say a set A ⊂ RV (Ω, F, P )
is a coherent acceptance cone provided that it has the following properties:
(a1) A is a closed convex cone,
(a2) 1 ∈ A,
(a3) RV (Ω, F, P )+ ⊂ A.
Theorem 2.4.11 (Coherent Risk and Acceptance Cone) Let A ⊂ RV (Ω, F, P )
be a coherent acceptance cone. Then there exists a coherent risk measure ρA such
that
All the desired properties then follow naturally. We leave checking the details as an
exercise.
It is natural to ask about the relationship between the acceptance cone and the
generating set of a coherent risk measure.
Theorem 2.4.12 (Acceptance Cone and the Generating Set) Let ρ be a coherent
risk measure with a generating set M, i.e. ρ = σM where σM is the support function
of M as defined in (1.1.1). Let Aρ be its acceptance cone. Then
Aρ = −(cone M)+ ,
where cone M is the cone generated by M, i.e. the smallest cone containing M.
Proof We only need to observe that x ∈ −(cone M)+ if and only if ⟨x, m⟩ ≤ 0 for all m ∈ M,
which holds if and only if ρ(x) = σM (x) ≤ 0, i.e. x ∈ Aρ .
[Figure: the cone RV (Ω, F, P )+ , the acceptance cone Aρ and the generating set M.]
Coherent Preference
We know that any closed convex cone induces a continuous partial order. Denote by ≤A
the linear partial order defined by a cone A, that is, x ≤A y if and only if y − x ∈ A.
Proposition 2.4.13 Let A be a coherent acceptance cone and define partial order
≤A by x ≤A y if and only if y − x ∈ A. Then ≤A has the following properties:
(o1) (Positive homogeneous) 0 ≤A x implies 0 ≤A tx for any t > 0,
(o2) (Additive) x ≤A y and u ≤A v imply x + u ≤A y + v,
(o3) (Reflexive) x ≤A x,
(o4) (Monotone) 0 ≤A x for any x ∈ RV (Ω, F, P )+ .
Proof Exercise.
A = {x ∈ RV (Ω, F, P ) | 0 ≤ x}.
⟨π, x⟩ ≥ 0.
We say π is normalized if ⟨π, 1⟩ = 1.
Definition 2.4.18 (Consistent Price Operator) Consider a one period financial
market S on RV (Ω, F, P ). We say π ∈ RV (Ω, F, P )∗ \{0} is a consistent price
operator for S, provided that
⟨π, S1⟩ = ⟨π, S0⟩.
Viewing price operators as elements of the dual space is consistent with the one
price principle. The definition of price operators recognizes the relative value of any
payoff 0 ≤ x, or x ∈ A, where A is the coherent acceptance cone generating the
partial order ≤. A normalized price is consistent with the value of cash implied by
the translation property of the coherent risk measure. A consistent price operator is,
in fact, a way of looking at martingale measures from the perspective of a pricing system.
The next proposition explains the meaning of valuation bounds and follows directly
from the definition.
Proposition 2.4.19 (Bounds for Normalized Price) Let π be a normalized price
operator. Then, for any x ∈ RV (Ω, F, P ),
\[ \underline{\pi}(x) \le \langle \pi, x \rangle \le \overline{\pi}(x). \]
Proof Exercise.
While the concepts of valuation bounds and prices provide different perspectives
they are closely related to the coherent risk and its equivalent description in terms of
its coherent acceptance cone and coherent partial order as evidenced in the theorem
below.
Theorem 2.4.20 (Valuation Bounds and Coherent Risk Measure) Let ≤ be the
coherent partial order generated by the coherent risk measure ρ and let $\underline{\pi}$ and $\overline{\pi}$ be
the price bounds induced by the partial order ≤. Then, for any x ∈ RV (Ω, F, P ),
\[ \rho(x) \le \overline{\pi}(-x). \]
On the other hand, ρ(x + ρ(x)1) = ρ(x) − ρ(x) = 0 implies that
\[ \rho(x) \ge \overline{\pi}(-x). \]
x − r 1 ∈ A.
The above characterization of the existence of a good deal is from the perspective
of payoffs. We now relate it to prices and price bounds. Mathematically, this is a process
of scalarization. What we do here is to consider the potential price of a payoff z in
the market. First we discuss price bounds for a good deal.
Definition 2.4.23 (Good Deal Bounds) Let A be a coherent acceptance cone and
let z ∈ W the gain space of financial market S. We define the upper and lower good
deal bounds with respect to A by
\[ \overline{\pi}_W(z) = \inf_{r \in \mathbb{R},\, x \in W} \{ r : x + r 1 - z \in A \} \]
and
\[ \underline{\pi}_W(z) = \sup_{r \in \mathbb{R},\, x \in W} \{ r : x - r 1 + z \in A \}. \]
As the name suggests, good deal bounds reveal prices for good deals. The interval
$[\underline{\pi}_W(z), \overline{\pi}_W(z)]$ is the interval of normalized admissible prices that are consistent
with the absence of a good deal. In fact, if z has a normalized admissible price
$P > \overline{\pi}_W(z)$, then there exist x = Θ · (S1 − S0 ) ∈ W and 0 < r < P such that
x + r 1 − z ∈ A. We can then sell z short at price P and assemble the portfolio Θ · S0 at
time t = 0. At t = 1 the value of the portfolio gives us y = x + P 1 − z. Since
y − (P − r)1 = x + r 1 − z ∈ A, it is a good deal.
The good deal bounds are actually coherent valuation bounds.
Proposition 2.4.24 (Good Deal Bounds as Valuation Bounds) The upper and
lower good deal bounds $\overline{\pi}_W(z)$ and $\underline{\pi}_W(z)$ defined in Definition 2.4.23 are actually
coherent valuation bounds.
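Proof It is easy to check that $\overline{\pi}_W(-z) = -\underline{\pi}_W(z)$. Moreover, rewrite $-\underline{\pi}_W(z)$ as
\[
\begin{aligned}
-\underline{\pi}_W(z) &= -\sup_{r \in \mathbb{R},\, x \in W} \{ r : x - r 1 + z \in A \} \\
&= \inf_{-r \in \mathbb{R}} \{ -r : -r 1 + z \in A - W \} \\
&= \inf_{r \in \mathbb{R}} \{ r : z + r 1 \in A - W \}.
\end{aligned}
\]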
⟨x, y⟩ ≤ ⟨a, y⟩, ∀x ∈ W and a ∈ A.
We discuss several useful risk measures below, paying particular attention to how
many of the standard assumptions of a coherent risk measure in Definition 2.4.2 they
satisfy.
Standard Deviation
Variance or, equivalently, standard deviation has been used as a risk measure since
Markowitz proposed modern portfolio theory. It satisfies (r1) and (r2) but fails
(r3) and (r4). The failure of the standard deviation to satisfy axiom (r4) has long
been criticized as unreasonable. Some remedies have been suggested, such as counting
the deviation only on losses. It turns out that
\[ \rho_s(x) = \sqrt{E\big[((x - E[x])^-)^2\big]} - E[x] \]
is actually a coherent risk measure that is faithful to the idea of using downside
deviation as a measure of risk.
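As a small numerical illustration, the downside-deviation risk measure can be evaluated on a finite sample space. This sketch assumes the square-root form of the downside deviation, ρs(x) = sqrt(E[((x − E[x])−)²]) − E[x], which is what makes the measure positively homogeneous; the payoff vector and probabilities are hypothetical.

```python
import math


def downside_risk(x, p):
    """Downside deviation minus the mean, on a finite sample space.

    x: payoff in each state; p: probability of each state.
    Only deviations below the mean contribute to the deviation term.
    """
    mean = sum(pi * xi for pi, xi in zip(p, x))
    semivariance = sum(pi * min(xi - mean, 0.0) ** 2 for pi, xi in zip(p, x))
    return math.sqrt(semivariance) - mean


p = [0.5, 0.3, 0.2]       # hypothetical state probabilities
x = [10.0, 0.0, -20.0]    # hypothetical payoff
risk = downside_risk(x, p)
```

Adding a cash amount c to every payoff lowers the result by exactly c, and scaling the payoff by t > 0 scales the result by t, which can be checked directly with this function.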
Both implementations suggested by the dual representation Theorem 2.4.6 and
the acceptance cone formulation in Theorem 2.4.11 are viable. For example, if one
uses the acceptance cone to implement, then each security is paired with a margin
requirement equal to its modified standard deviation, if that can be estimated.
Drawdown
The maximum absolute drawdown, denoted dd(x) in a given period of time is often
used by traders. This risk measure also satisfies axioms (r1) and (r2) but fails (r3)
and (r4).
As in the case of standard deviation, we can also subtract E(x) to make it satisfy
(r3). One way to adjust it so that it has property (r4) is to fix the reference point
for the maximum down move at the beginning wealth. But this completely distorts
the intention of drawdown as a risk measure.
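The two reference points discussed above can be compared on a single wealth path. This is a minimal sketch with hypothetical function names and data: `max_drawdown` measures the largest drop from the running maximum, while `max_loss_from_start` fixes the reference point at the beginning wealth.

```python
def max_drawdown(wealth):
    """Largest drop from the running maximum along a wealth path."""
    peak = wealth[0]
    worst = 0.0
    for w in wealth:
        peak = max(peak, w)
        worst = max(worst, peak - w)
    return worst


def max_loss_from_start(wealth):
    """Largest drop below the fixed beginning wealth."""
    return max(max(wealth[0] - w, 0.0) for w in wealth)


path = [100.0, 120.0, 90.0, 110.0, 80.0]  # hypothetical wealth path
dd = max_drawdown(path)                   # measured from the peak of 120
loss = max_loss_from_start(path)          # measured from the start of 100
```

On this path the drawdown is twice the loss from the start, which shows how much the fixed reference point changes what is being measured.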
Both implementations suggested by the dual representation Theorem 2.4.6 and
the acceptance cone formulation in Theorem 2.4.11 are viable without axiom (r4).
The only difference is that the acceptance cone may not contain the entire cone
RV (Ω, F, P )+ . This is not unreasonable in practice.
Value at Risk
The value at risk of a portfolio in a given period is a gauge for the risk of the
portfolio that is important for both portfolio managers and regulators. It is defined
on the random variable of loss, the negative of the payoff.
Definition 2.4.26 (Value at Risk) Let L be the random variable representing the
loss of a portfolio in a given period. The value at risk with confidence level α ∈
(0, 1), denoted by VaRα (L), is defined as
The risk measure defined below can be viewed as a remedy for the fact that VaR is
not convex.
Definition 2.4.28 (Conditional Value at Risk) Let L be the random variable that
represents the loss of a portfolio in a given period. The conditional value at risk
with confidence level α ∈ (0, 1), denoted by CVaRα (L), is defined as
\[ CVaR_\alpha(L) = \frac{1}{1-\alpha} \int_\alpha^1 VaR_s(L)\, ds. \]
We can see that CVaRα is the expected, or average, loss over the worst outcomes,
which together have probability 1 − α of happening.
Example 2.4.29 (CVaR of a Discrete Loss Distribution) Suppose again that the loss
L is discretely distributed as in Table 2.6. Then CVaR0.95 (L) = (50 · 0.03 + 600 ·
0.02)/0.05 = 270, CVaR0.9 (L) = (40 · 0.05 + 50 · 0.03 + 600 · 0.02)/0.1 = 155, and
CVaR0.8 (L) = (30 · 0.1 + 40 · 0.05 + 50 · 0.03 + 600 · 0.02)/0.2 = 92.5 (Table 2.7).
Table 2.7 Compares VaR and CVaR.
We can see that VaR has the effect of giving an unreasonable incentive to insurance
writers in general and Credit Default Swap (CDS) writers in particular.
It is not hard to see that both VaRα (L) and CVaRα (L) are increasing functions
of α and that VaRα (L) is dominated by CVaRα (L).
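The tail averages in the discrete example can be reproduced with a short computation. The sketch below makes two assumptions that are not taken from the text: VaR is computed as the lower α-quantile of the loss, and the probability mass not listed in the example (0.8) is placed on a loss of 0, since Table 2.6 is not reproduced here. CVaR is obtained by integrating the quantile function over [α, 1], which is exact for a step function.

```python
def var(losses, probs, alpha):
    """Lower alpha-quantile: the smallest l with P(L <= l) >= alpha (assumed convention)."""
    cum = 0.0
    for loss, prob in sorted(zip(losses, probs)):
        cum += prob
        if cum >= alpha - 1e-12:  # small tolerance for floating point sums
            return loss
    return max(losses)


def cvar(losses, probs, alpha):
    """(1 / (1 - alpha)) times the integral of VaR_s(L) over s in [alpha, 1]."""
    total, cum = 0.0, 0.0
    for loss, prob in sorted(zip(losses, probs)):
        lo, hi = max(cum, alpha), cum + prob
        if hi > lo:
            total += loss * (hi - lo)  # this loss is the quantile on (lo, hi]
        cum = hi
    return total / (1.0 - alpha)


# Tail of the distribution from the example; the 0.8 mass at loss 0 is hypothetical.
losses = [0.0, 30.0, 40.0, 50.0, 600.0]
probs = [0.8, 0.1, 0.05, 0.03, 0.02]

cvar_95 = cvar(losses, probs, 0.95)
cvar_90 = cvar(losses, probs, 0.90)
cvar_80 = cvar(losses, probs, 0.80)
```

For confidence levels of 0.8 and above only the listed tail matters, so the placement of the remaining mass does not affect these three values.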
The following representation reveals that the conditional value at risk is convex
with respect to L.
Theorem 2.4.30 (Representation as an Expectation)
\[
\begin{aligned}
CVaR_\alpha(L) &= \min_{r \in \mathbb{R}} \Big\{ r + \frac{1}{1-\alpha} E[(L - r)^+] \Big\} \qquad (2.4.3) \\
&= VaR_\alpha(L) + \frac{1}{1-\alpha} E[(L - VaR_\alpha(L))^+].
\end{aligned}
\]
[Figure 2.9: graphs of FL and QL between the levels 0 and 1, with the level α and the point rα = VaRα (L) marked.]
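In particular (see Figure 2.9, in which the shaded area represents E[(L − rα )+ ]),
letting r = rα = VaRα (L) we have
\[
\begin{aligned}
\frac{1}{1-\alpha} E[(L - r_\alpha)^+] &= \frac{1}{1-\alpha} \int_{r_\alpha}^{\infty} P(L \ge t)\, dt \qquad (2.4.5) \\
&= \frac{1}{1-\alpha} \int_\alpha^1 (VaR_t(L) - r_\alpha)\, dt \\
&= \frac{1}{1-\alpha} \int_\alpha^1 VaR_t(L)\, dt - r_\alpha \\
&= CVaR_\alpha(L) - r_\alpha .
\end{aligned}
\]
This proves
\[ CVaR_\alpha(L) = VaR_\alpha(L) + \frac{1}{1-\alpha} E[(L - VaR_\alpha(L))^+]. \]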
[Figure 2.10: graphs of FL and QL between the levels 0 and 1, with the level α and the point rα = VaRα (L) marked.]
and we need only show the easy fact that, for any r,
\[ (1-\alpha) D = (1-\alpha)(r - r_\alpha) + \int_r^{r_\alpha} P(L \ge t)\, dt \ge 0. \qquad (2.4.7) \]
The intuition is illustrated in Figure 2.10, in which the short vertical bars signify
r < rα and r > rα , respectively.
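Proof We can write the conditional value at risk with confidence level α as the value
function of the following linear programming problem:
\[ CVaR_\alpha(L) = \inf_{r \in \mathbb{R},\, u \in RV(\Omega, F, P)} \Big\{ r + \frac{1}{1-\alpha} E[u] : u \ge 0,\ u + r 1 \ge L \Big\}. \]
The Lagrangian is
\[ L((r, u), (s, v)) = r + \frac{1}{1-\alpha} \langle 1, u \rangle + \langle s, u \rangle + \langle v, u + r 1 - L \rangle , \]
where s, v ≤ 0. For a linear programming problem, strong duality holds as long as both
the primal and dual problems are feasible. Thus, we have
\[
\begin{aligned}
CVaR_\alpha(L) &= \sup_{s \le 0,\, v \le 0}\ \inf_{r, u} \Big[ r (1 + \langle v, 1 \rangle) + \Big\langle \frac{1}{1-\alpha} 1 + s + v,\ u \Big\rangle + \langle v, -L \rangle \Big] \\
&= \sup_{s \le 0,\, v \le 0} \Big\{ \langle v, -L \rangle : -\langle v, 1 \rangle = 1,\ \frac{1}{1-\alpha} 1 + s + v \ge 0 \Big\} \\
&= \sup \Big\{ \langle v, -L \rangle : E[-v] = 1,\ \frac{1}{1-\alpha} \ge -v \ge 0 \Big\}.
\end{aligned}
\]
Since the dual solution exists the sup is, in fact, a max.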
Estimating CVaR
It follows that
\[ CVaR_\alpha(\Theta \cdot R) \approx \min_{r \in \mathbb{R}} \Big\{ r + \frac{1}{(1-\alpha) m} \sum_{k=1}^{m} (\Theta \cdot R^k - r)^+ \Big\}. \qquad (2.4.9) \]
\[ 0 \le v_k \le \frac{1}{(1-\alpha) m},\quad k = 1, \dots, m,\quad \sum_{k=1}^{m} v_k = 1 \Big\}. \]
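The sample approximation of CVaR as a minimum over r can be sketched as follows. The code assumes m equally weighted scenarios and treats Θ · R^k as the portfolio loss in scenario k; the scenario data and function names are hypothetical. Because the objective is convex and piecewise linear in r with kinks at the sample losses, it is enough to evaluate it at those points.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cvar_sample(theta, scenarios, alpha):
    """Minimize r + sum_k (theta . R^k - r)^+ / ((1 - alpha) m) over r."""
    losses = [dot(theta, r_k) for r_k in scenarios]
    m = len(losses)

    def objective(r):
        excess = sum(max(loss - r, 0.0) for loss in losses)
        return r + excess / ((1.0 - alpha) * m)

    # A convex piecewise linear function attains its minimum at a kink.
    return min(objective(r) for r in losses)


theta = [0.5, 0.5]                                     # hypothetical portfolio weights
scenarios = [[float(k), float(k)] for k in range(1, 11)]  # hypothetical scenario losses
estimate = cvar_sample(theta, scenarios, 0.8)
```

With ten scenarios and α = 0.8 the estimate equals the average of the two largest scenario losses, which is the sample analogue of averaging over the worst 1 − α fraction of outcomes.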
3.1.1 An Example
wi = w0 + X1 + . . . + Xi . (3.1.1)
game with n = 3 to get some feeling. We use H to represent a head and T , tail. The
information we can get at each stage can be illustrated with the following binary
tree.
F0      F1    F2    F3
{Ω}     H     HH    HHH
                    HHT
              HT    HTH
                    HTT
        T     TH    THH
                    THT
              TT    TTH
                    TTT
Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
{HH, HT, TH, TT} = {{HHH, HHT}, {HTH, HTT}, {THH, THT}, {TTH, TTT}}.
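The sample space and the partitions that generate the information at each stage can be enumerated directly. This is a small sketch for the three-toss case; the function name `partition` is an illustrative choice, and each atom collects the outcomes that share the same first t tosses.

```python
from itertools import product

# All outcomes of three coin tosses, in the order HHH, HHT, ..., TTT.
omega = ["".join(w) for w in product("HT", repeat=3)]


def partition(t):
    """Atoms of the information at stage t: outcomes grouped by their first t tosses."""
    atoms = {}
    for w in omega:
        atoms.setdefault(w[:t], []).append(w)
    return atoms


stage_two = partition(2)  # keys HH, HT, TH, TT, each holding two outcomes
```

At stage 0 there is a single atom, the whole sample space, and at stage 3 every outcome is its own atom.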
\[ \{ \Theta \in ts[S] \mid \Theta_t^m = \Theta_0^m,\ m = k+1, \dots, M,\ t = 1, \dots, T-1 \}. \]
Θt−1 · St = Θt · St , t = 1, 2, . . . , T − 1.
\[ G_t(\Theta) := \sum_{s=1}^{t} \Theta_{s-1} \cdot (S_s - S_{s-1}) = \Theta_{t-1} \cdot S_t - \Theta_0 \cdot S_0 . \]
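The self-financing condition and the two expressions for the gain can be checked along a single price path. This is a minimal sketch with hypothetical data: asset 0 is cash with constant price 1, `theta[t]` is the portfolio held from time t to t + 1, and `S[t]` is the price vector at time t.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def is_self_financing(theta, S, tol=1e-12):
    """Check theta[t-1] . S[t] == theta[t] . S[t] at every rebalancing time."""
    return all(
        abs(dot(theta[t - 1], S[t]) - dot(theta[t], S[t])) <= tol
        for t in range(1, len(theta))
    )


def gain(theta, S, t):
    """G_t = sum over s = 1..t of theta[s-1] . (S[s] - S[s-1])."""
    return sum(
        dot(theta[s - 1], [a - b for a, b in zip(S[s], S[s - 1])])
        for s in range(1, t + 1)
    )


S = [[1.0, 100.0], [1.0, 110.0], [1.0, 99.0]]  # one price path, T = 2 (hypothetical)
theta = [[0.0, 1.0], [55.0, 0.5]]              # sell half the stock at t = 1

g2 = gain(theta, S, 2)
terminal_minus_initial = dot(theta[1], S[2]) - dot(theta[0], S[0])
```

For a self-financing strategy the accumulated gain agrees with terminal portfolio value minus initial cost, as in the identity above.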
In practical trading there is always a limit on how much one can lose. This
leads to the concept of admissible trading strategies described below.
Definition 3.2.2 (Admissible Trading Strategy) Let a > 0 be a constant. We
say that a self-financing trading strategy Θ ∈ T is a-admissible if, for all t =
1, 2, . . . , T ,
We use A(a) to denote the (convex) set of all a-admissible trading strategies.
An arbitrage trading strategy is a-admissible for any a > 0. Thus, we have
Lemma 3.2.3 For a > 0, T contains no arbitrage if and only if A(a) contains no
arbitrage.
The next lemma shows that, when T contains no arbitrage, to show that Θ is
a-admissible we need only check condition (3.2.1) at t = T .
Lemma 3.2.4 If T contains no arbitrage, then Θ ∈ T is a-admissible if and only
if
Θt−1 · St = b < −a
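\[ \bar{\Theta}_t \cdot S_t = \Theta_t^0 - b + \sum_{n=1}^{M} \Theta_t^n S_t^n = \Theta_t \cdot S_t - b = \Theta_{t-1} \cdot S_t - b = 0 = \bar{\Theta}_{t-1} \cdot S_t . \qquad (3.2.5) \]
\[ {} = \begin{cases} \Theta_t \cdot S_{t+1} - b > -a - b > 0 & \text{for } \omega \in A, \\ 0 & \text{for } \omega \notin A. \end{cases} \]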
We can also show that when there is no arbitrage the set of admissible trading
strategies A(a) is compact.
Lemma 3.2.5 For any a > 0, if A(a) contains no arbitrage, then it is bounded and
compact.
Proof We first show that A is bounded. For t = 1, 2, . . . , T , let us denote At =
{Θ ∈ A : Θs contains only cash position for s > t − 1}. We note that AT = A
and prove by induction on t. Again without loss of generality we assume the initial
endowment is always 0.
For t = 1, assume that there is no arbitrage but A1 is unbounded. By
Corollary 3.1.3 there exists a sequence of trading strategies Θ(m) ∈ A1 such that
Θ(m)0 · S1 is unbounded. Without loss of generality we may assume that, for all
m, Θ(m)0 · S1 > 1 and Θ(m)0 · S1 → +∞ then Θ(m)/Θ(m)0 · S1 ∈ A1
and is bounded by Corollary 3.1.3. Selecting a subsequence if necessary we may
assume that Θ(m)/Θ(m)0 · S1 converges to Θ ∗ ∈ A1 . Since Θ(m)0 · S1 ≥ −a,
taking limit we have
On the other hand, we also know from the above limiting process that Θ1∗ ·S1 = 1.
This means Θ ∗ is an arbitrage, a contradiction.
3.3 Fundamental Theorem of Asset Pricing
We now turn to proving the FTAP in the multiperiod market model and discuss related
applications.
sup{E[u(ΘT −1 · ST )] : Θ0 · S0 = w0 , Θ ∈ T }, (3.3.1)
where T is the set of self-financing trading strategies. We show that a solution to the
dual of (3.3.1), when scaled, gives us a martingale measure, thus linking the
fundamental theorem of asset pricing to the utility maximization problem (3.3.1).
Theorem 3.3.1 Let S be a financial market. Then the following statements are
equivalent:
(i) There exists no arbitrage trading strategy in T ;
(ii) For every utility function u with properties (u1), (u2), and (u3), the finite
optimal value of the trading strategy utility optimization problem (3.3.1) is
attained.
(iii) There is an equivalent S-martingale measure proportional to an element of the
subdifferential of the utility function at the optimal portfolio.
Proof First observe that the utility optimization problem (3.3.1) can be written
equivalently as
where W = {GT (Θ) : Θ ∈ T } is the linear subspace of all achievable gains using
self-financing trading strategies.
Defining f (y) = −E[u(y)] and g(y) = ιw0 +W (y), we can rewrite prob-
lem (3.3.2) as
holds, (3.3.3) and its dual (3.3.4) have the same value.
When T contains no arbitrage, by property (u2) of the utility function,
E[u(ΘT −1 ·ST )] > −∞ implies ΘT −1 ·ST ≥ 0 or GT (Θ) ≥ −w0 . By Lemma 3.2.4,
we must have Θ ∈ A(w0 ). Thus, the utility maximization problem (3.3.1) is
equivalent to
By Lemma 3.2.5 problem (3.3.6) and, therefore, (3.3.2) has a finite solution. By
the strong duality, the dual problem (3.3.4) has a finite optimal value and attains
its solution. Condition (u2) forces the domain of E[(−u)∗ (·)] to be a subset
of int (−RV (Ω, F, P )+ ). Thus, we only need to consider z > 0 in the dual
problem (3.3.4). Moreover, we must have ⟨z, GT (Θ)⟩ = 0 in (3.3.4) since σW (z) <
∞ and W is a subspace of RV (Ω, F, P ). Hence we can write problem (3.3.4) as
Finally, if (iii) is true, then there cannot be any arbitrage in T because adding an
arbitrage to the optimal solution of (3.3.1) will improve it. Thus, (iii) implies (i) and
we have completed a cyclic proof of the equivalence of (i), (ii), and (iii).
The existence of the solution to the dual of (3.3.8) implies the existence of a
Lagrange multiplier λ ∈ RV (Ω, F, P ) such that the Lagrangian
attains its minimum at the solution (x ∗ , Θ ∗ ) to the problem (3.3.1). It follows that, for any
ω with P (ω) ≠ 0,
⟨λ, GT (Θ)⟩ = 0, ∀Θ ∈ T . (3.3.10)
Using the same argument as in the previous subsection, we can show that there
exists a Lagrange multiplier λ ∈ RV (Ω, F, P ) such that, for any ω with P (ω) ≠ 0,
φ0 = EQ [φT ]. (3.3.13)
The above argument can also be used to derive a no arbitrage price for φT at any t <
T in terms of a martingale measure. Formula (3.3.12) indicates that the martingale
measure used to price a contingent claim, in general, relies on the risk aversion of
an agent. Thus, agents with different risk aversions and, therefore, different utility
functions may reasonably price the same contingent claim differently.
{ΘT −1 · ST | Θ ∈ T } = RV (Ω, F, P ).
As we have seen in the one period case, this is merely calculating the optimal end
wealth using the Lagrangian. The proof is similar to that of the one period case and is
omitted.
If the market price of an asset violates those specified by the fundamental theorem of
asset pricing, then in theory an arbitrage opportunity arises. We turn to the problem
of how to take advantage of such an arbitrage opportunity.
and
give us upper and lower bounds for the price of ψ. If the price of ψ falls outside
of these bounds, an arbitrage becomes possible. We call them super- and
sub-hedging bounds, respectively. We focus on the super-hedging bound. The discussion
94 3 Finite Period Financial Models
about the sub-hedging bound can be reduced to that of a super hedging bound for
−ψ because
If the market price of ψ is above this super hedging bound, how can we find an
arbitrage strategy? It turns out that the key is to view (3.4.1) as a linear programming
problem and consider its dual. As discussed before, for a linear programming
problem and its dual, the constraint qualification condition ensuring strong
duality is, in fact, the feasibility condition. So the key is to correctly formulate
the dual problem of (3.4.1). We will use the Lagrange formulation. Let us assume
that {Θn }, n = 1, . . . , N, is a basis for the finite dimensional Banach space T of
self-financing trading strategies. Then we can rewrite (3.4.1) as
where M + signifies the set of all positive measures. We can see that (3.4.4) is a
linear programming problem. Moreover, the Lagrangian of (3.4.4) is
\[ L(Q, \lambda) = E^Q[\psi] + \sum_{n=1}^{N} \lambda_n E^Q[G_T(\Theta_n)] + \lambda_0 (E^Q[1] - 1), \qquad (3.4.5) \]
\[ \Theta = \sum_{n=1}^{N} \lambda_n \Theta_n \]
min t (3.4.9)
s.t. t − GT (Θ)(ω) ≥ ψ(ω), ω ∈ Ω
Θ ∈ T , t ∈ R.
Let Θ̄ and t̄ = ψ̄ be the solution of (3.4.9). If the market price of the contingent
claim at t = 0 is
ψ0 > ψ̄,
then we can short one share of the contingent claim and follow the trading strategy
−Θ̄ (or equivalently, short the trading strategy Θ̄). By time t = T , we have
t̄ − GT (Θ̄)(ω) ≥ ψ(ω), ∀ω ∈ Ω.
That is to say, the gain from the trading and the cash amount ψ̄ safely cover the short
position in any possible economic state, and the difference ψ0 − ψ̄ becomes our
arbitrage profit.
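The linear program (3.4.9) can be worked out by hand in a small case. The sketch below assumes a one period market with a single risky asset and finitely many states, so that the strategy reduces to one number θ and GT (Θ)(ω) = θ · dS(ω); it also assumes dS takes both signs, so that the minimum is finite. The function name and all numbers are hypothetical, and the brute-force search over kinks stands in for a general LP solver.

```python
def superhedge(psi, dS):
    """Solve min t subject to t - theta * dS[w] >= psi[w] for all states w.

    For fixed theta the smallest feasible t is max_w (psi[w] + theta * dS[w]),
    a convex piecewise linear function of theta, so its minimum is attained
    where two of the lines cross.
    """
    def cost(theta):
        return max(p + theta * d for p, d in zip(psi, dS))

    n = len(psi)
    candidates = [
        (psi[j] - psi[i]) / (dS[i] - dS[j])
        for i in range(n)
        for j in range(i + 1, n)
        if dS[i] != dS[j]
    ]
    theta = min(candidates, key=cost)
    return cost(theta), theta


# Stock at 100 moving to 80, 100 or 120; call option struck at 100 (hypothetical).
dS = [-20.0, 0.0, 20.0]
psi = [0.0, 0.0, 20.0]
bound, theta_bar = superhedge(psi, dS)

market_price = 12.0  # hypothetical quote above the bound
# Short the claim, keep the bound in cash and follow the strategy -theta_bar.
terminal = [market_price - theta_bar * d - p for d, p in zip(dS, psi)]
```

In every state the terminal position is at least the difference between the market price and the bound, which is the arbitrage profit described above.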
L(Q, (Θ, λ0 , b)) = EQ [ψ] + EQ [GT (Θ)] + λ0 (EQ [1] − 1) + b · (EQ [φ] − c),
where (Θ, λ0 , b) ∈ T × R × RK .
Similar to the previous section we can verify that, by the strong Lagrange duality,
Since φ 1 (C11 ) is arbitrary this is to say that P1 is the marginal probability measure
of π on Ω1 . Similarly, P2 must be the marginal probability measure of π on Ω2 .
Clearly, product measure π that satisfies such marginal requirements is not unique.
We see that despite the completeness of the financial markets on Ω1 and Ω2 , in
pricing a contingent claim with payoff as a random variable on the product measure
space (Ω1 × Ω2 , F1 × F2 ), we face an incomplete market.
To find the upper bound for the price of ψ that is consistent with the no arbitrage
principle we face the optimization problem
where Π (P1 , P2 ) signifies the set of all probability measures on the product measure
space (Ω1 × Ω2 , F1 × F2 ) whose marginals on Ω1 and Ω2 are P1 and P2 ,
respectively.
Convex duality again plays an important role in dealing with problem (3.4.15).
We illustrate by an example.
Example 3.4.1 (Estimate Upper No Arbitrage Bound in Finite Sample Spaces)
Then the problem of finding an upper bound for the contingent claim ψ(C 1 , C 2 )
can be formulated as
\[ \sum_i \mu_i = 1, \qquad \sum_j \nu_j = 1. \]
Again this shows that in principle one can implement the upper no arbitrage price
bound ψ̄ using the sum of two contingent claims φ 1 and φ 2 on sample spaces Ω1
and Ω2 , respectively.
Real financial markets have frictions. When trading a financial asset one faces two
different prices: ask and bid. Usually, the ask is strictly larger than the bid and one
can only buy at the ask price and sell at the bid price. This violation of the one price
principle complicates the modeling. The set of attainable gains from trading assets in such
a more realistic market model is not a subspace but rather, in general, a cone. This
leads to the name conic finance.
Note that it is important to specify the trading time t for the bid and ask prices.
Trading [x]^t at time s > t would use information that is not available for the random
variables x_k , k = t, . . . , s − 1, and is impossible. Trading [x]^t at s ≤ t is legitimate
but the prices for different s are different. Paying a_t([S^m]^t) at t one will get the cash
stream [S^m]^t = (0, . . . , 0, S^m_{t+1}, . . . , S^m_T). Similarly, receiving b_t([S^m]^t) one sells
the cash stream [S^m]^t or, in other words, gets the cash stream −[S^m]^t. The riskless
Symmetrically, selling the above cash stream at the bid price b_t([S^m]^t) yields the
zero cost cash stream S̃^{mt} := b_t([S^m]^t)1^t − [S^m]^t, i.e.
\[ \tilde{S}^{mt}_s = \begin{cases} 0 & s < t \\ b_t([S^m]^t) & s = t \\ -S^m_s & s > t. \end{cases} \qquad (3.5.3) \]
We observe that S̃^{it} is different from −S^{it} due to the spread between the ask and
bid prices. Similarly, buying and selling bonds maturing at u at time t generate zero
cost cash streams 1^{ut} := 1^u − a_t(1^u)1^t and 1̃^{ut} := b_t(1^u)1^t − 1^u, respectively, i.e.
\[ 1^{ut}_s = \begin{cases} 0 & s \ne u, t \\ -a_t(1^u) & s = t \\ 1 & s = u, \end{cases} \quad \text{and} \quad \tilde{1}^{ut}_s = \begin{cases} 0 & s \ne u, t \\ b_t(1^u) & s = t \\ -1 & s = u. \end{cases} \qquad (3.5.4) \]
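The asymmetry between buying and selling can be seen on a single price path. This is a minimal sketch with hypothetical names and numbers: `buy_stream` pays the ask at time t and receives the later cash flows, `sell_stream` receives the bid at time t and pays the later cash flows, and both are zero before t.

```python
def buy_stream(path, t, ask):
    """Zero cost stream from buying at time t: -ask at t, then +S_s for s > t."""
    return [0.0 if s < t else (-ask if s == t else path[s]) for s in range(len(path))]


def sell_stream(path, t, bid):
    """Zero cost stream from selling at time t: +bid at t, then -S_s for s > t."""
    return [0.0 if s < t else (bid if s == t else -path[s]) for s in range(len(path))]


path = [100.0, 104.0, 98.0, 110.0]  # hypothetical cash flows S_0, ..., S_3
ask, bid = 105.0, 103.0             # hypothetical quotes at t = 1

bought = buy_stream(path, 1, ask)
sold = sell_stream(path, 1, bid)
round_trip = [b + s for b, s in zip(bought, sold)]  # buy and sell at the same time
```

The round trip leaves only the loss of the spread at time t, so the sold stream is the negative of the bought stream exactly when bid and ask coincide.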
Assuming that one can buy or sell any fraction of the cash streams alluded to
above, suppose α_t^i , α̃_t^i , β_t^u , β̃_t^u , i = 1, . . . , M, u = 1, . . . , T, are nonnegative Ft
measurable random variables. Then
\[ z = \sum_{t=0}^{T} \sum_{i=1}^{M} \big[ \alpha_t^i S^{it} + \tilde{\alpha}_t^i \tilde{S}^{it} \big] + \sum_{t=0}^{T} \sum_{u=1}^{T} \big[ \beta_t^u 1^{ut} + \tilde{\beta}_t^u \tilde{1}^{ut} \big] \qquad (3.5.5) \]
is a cash stream that can be implemented by trading the available zero cost cash
streams.
Definition 3.5.2 (Trading Strategies) A cash stream z of the form in (3.5.5)
is called an implementable cash stream and we say αti , α̃ti , βtu , and β̃tu is a
trading strategy that implements z. We use A(C) to denote the collection of all
implementable cash streams.
3.5 Conic Finance 101
It is clear that A(C) is a closed cone. If all the bid and ask prices coincide, then
S it = −S̃ it and 1ut = −1̃ut . In this case we recover the one price economy model
as a special case and A(C) becomes a linear subspace of X .
Definition 3.5.3 (Super Implementation) We say a cash stream x ∈ X is super
implementable if there exists a cash stream z ∈ A(C) of the form in (3.5.5) such
that z ≥ x. In this case we say αti , α̃ti , βtu , and β̃tu is a trading strategy that super
implements x. We use Ā(C) to denote the collection of all super implementable cash
streams.
It is easy to see that Ā(C) is also a closed cone and A(C) ⊂ Ā(C).
Using the model described in the previous section, we can extend the fundamental
theorem of asset pricing to markets with a bid-ask spread. First we define arbitrage
in such a market.
Definition 3.5.4 (Arbitrage Trading Strategy) We say that a cash stream x ∈
A(C) is an arbitrage if x ≥ 0 and x ≠ 0. If x ≤ z ∈ A(C) where z has the
representation in (3.5.5), then we say αti , α̃ti , βtu , and β̃tu is an arbitrage trading
strategy. We say that the conic financial market has no arbitrage if A(C) does not
contain any arbitrage.
Denote by X + the cone in X consisting of the cash streams all of whose components are
nonnegative. Then there is no arbitrage trading strategy in the financial market described
in the previous section if and only if
Proof Since one can always scale an arbitrage cash stream by an arbitrarily large
positive number, p < +∞ implies that there is no arbitrage. Similar to
Lemma 3.2.5 we can show that in this case the finite optimal value is attained.
On the other hand, if p = +∞, without loss of generality we assume that there
is a sequence z^n ∈ A(C) such that
\[ \sum_{t=0}^{T} E[u(w_t^0 + z_t^n)] \to +\infty. \qquad (3.5.8) \]
Clearly ‖z^n‖ → +∞. Then, taking a subsequence if necessary, we can assume that
z^n /‖z^n‖ → z∗ ∈ A(C)\{0}. By property (u3), z_t^n ≥ −w_t^0 , t = 0, 1, . . . , T . Thus,
z_t^∗ ≥ 0, which implies that z∗ is an arbitrage.
We turn to the dual characterization of no arbitrage and its implication for
the price of financial assets. For this purpose, we will often need to consider the
conditional expectation with respect to Ft , which we will denote Et . Similarly we
use the notation
\[ \langle x, y \rangle_t = E_t \Big[ \sum_{t=0}^{T} x_t y_t \Big]. \]
\[ \langle \pi, x \rangle_t \le 0. \qquad (3.5.9) \]
\[ \langle \pi, \chi_A x \rangle \le 0 \qquad (3.5.10) \]
\[ \langle \pi, x \rangle_t \le 0. \qquad (3.5.11) \]
To see the relationship between a consistent price operator and the bid and ask prices
of a cash stream, we observe that 0 ≥ ⟨π, S^{mt}⟩_t = ⟨π, [S^m]^t − a_t([S^m]^t)1^t⟩_t implies
that ⟨π, [S^m]^t⟩_t ≤ a_t([S^m]^t)⟨π, 1^t⟩_t = a_t([S^m]^t)π_t . Similarly, 0 ≥ ⟨π, S̃^{mt}⟩_t
implies that ⟨π, [S^m]^t⟩_t ≥ b_t([S^m]^t)π_t . That is to say
\[ b_t([S^m]^t)\, \pi_t \le \langle \pi, [S^m]^t \rangle_t \le a_t([S^m]^t)\, \pi_t . \qquad (3.5.12) \]
In a one price one period financial market, for t = 0, [S^m]^0 = S_1^m and a_0([S^m]^0) =
b_0([S^m]^0) = S_0^m . Since (3.5.12) holds for all m = 1, . . . , M we have ⟨π, S1⟩ =
⟨π, S0⟩. Thus, we recover the consistent price operator in Definition 2.4.18 as a special
case. Clearly, a consistent price operator is, in general, not normalized in the sense
of Definition 2.4.17. We can see from (3.5.12) that, for any fixed t, dividing π by
⟨π, 1^t⟩_t = π_t normalizes it for the purpose of deriving prices at time t. Clearly, it is
impossible to uniformly normalize a consistent price operator.
In Section 2.4 we have seen that a consistent price operator is closely related to a
martingale measure. Next we derive a version of the FTAP for a conic financial market
in which consistent price operators play the role of martingale measures in
the FTAP for a one price financial market.
Theorem 3.5.8 (FTAP in Conic Financial Market) Let C be a conic financial
market as in Definition 3.5.1 and let u be a utility function that satisfies properties
(u1), (u2), and (u3). Then the following statements are equivalent:
(i) The conic financial market C has no arbitrage;
(ii) The utility optimization problem (3.5.7) is finite and attained.
(iii) There exists a C-consistent price operator which is an element of the subdiffer-
ential of the utility function at the optimal cash stream.
Proof The equivalence of (i) and (ii) follows from Theorem 3.5.5.
We show the equivalence of (ii) and (iii). Define, for x ∈ X ,
\[ f(x) = \sum_{t=0}^{T} E[(-u)(x_t)], \qquad (3.5.13) \]
Let x ∗ , π be solutions to the primal and dual problem (3.5.14) and (3.5.16),
respectively. Condition (u2) implies that dom(−u)∗ = (−∞, 0) so that πt > 0.
Moreover,
πt ∈ −∂(−u)(xt∗ ). (3.5.17)
Finally, if the market has no arbitrage trading strategy, then p < +∞ in (3.5.16)
which implies that σA(C ) (π ) < ∞ or π ∈ A(C)◦ . Thus, by Proposition 3.5.7, π
is a C-consistent price operator. Moreover, we can see from (3.5.17) that π is a
subgradient of the utility function at the optimal solution. Thus, (ii) implies (iii).
On the other hand, when (iii) is satisfied, there is a C-consistent price operator
π ∈ A(C)◦ \{0} that satisfies (3.5.17). Thus, π must be a solution to the convex
optimization problem (3.5.16). That is to say p < +∞ so that (iii) implies (ii)
and, therefore, they are equivalent.
By Proposition 3.5.7, we see that to use consistent price operators for pricing we
must normalize them. However, (3.5.12) shows that, in general, the appropriate
normalizing factor for different t is different. For this reason a general discussion of
pricing and hedging in a conic financial market is technical. In this section we are
satisfied with a brief discussion of the one period model.
determines a super hedging bound. Moreover, the solution to the dual linear
programming of (3.5.18) determines a super-hedging trading strategy. A sub-
hedging bound can be derived symmetrically.
We denote the finite sample space Ω = {ω1 , . . . , ωN }. We regard a random
variable r on Ω as a vector r = [r(ω1 ), . . . , r(ωN )] and use · to signify the dot
product between such vectors. Defining x = π1 P we can write (3.5.18) explicitly
as a linear programming problem
u0 = max c1 · x (3.5.19)
subject to [S ] · x ≤ a0 ([S ] ), − [S ] · x ≤ −b0 ([S ] ), m = 1, . . . , M,
m 1 m 1 m 1 m 1
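For $(\Lambda, \gamma) \in \mathbb{R}_+^{2M+2}$ consider the zero cost portfolio of cash flows:
\[ (z_0(\Lambda, \gamma), z_1(\Lambda, \gamma)) = \gamma_0^1 1^{10} + \tilde{\gamma}_0^1 \tilde{1}^{10} + \sum_{m=1}^{M} \big( \lambda_0^m S^{m0} + \tilde{\lambda}_0^m \tilde{S}^{m0} \big). \qquad (3.5.24) \]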
We see that
Fs ⊂ F t .
© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2018
P. Carr, Q. J. Zhu, Convex Duality and Financial Mathematics,
SpringerBriefs in Mathematics, https://doi.org/10.1007/978-3-319-92492-2_4
108 4 Continuous Financial Models
are independent,
3. for 0 ≤ s ≤ t ≤ T , Bt −Bs has a Gaussian distribution with mean 0 and variance
t − s,
4. for ω in a set of probability one, the path Bt (ω) is continuous.
Definition 4.1.5 (Multi-Dimensional Brownian Motion) A vector stochastic
process {Bt : t ∈ [0, T ]} in Rn is called a standard Brownian motion if
Bt = (Bt1 , Bt2 , . . . , Btn ) where Bti , i = 1, 2, . . . , n are independent standard
one-dimensional Brownian motions. If Bt is a standard Brownian motion, then
x + Bt is called a Brownian motion starting from x.
4.1 Continuous Stochastic Processes 109
Remark 4.1.6 The existence of a stochastic process satisfying all the conditions laid
out in Definition 4.1.4 is not automatically guaranteed. By and large, there are two
ways to prove the existence:
• by construction pioneered by Wiener (see, e.g., [54]), or
• by Kolmogorov’s extension theorem (see, e.g., [42]).
For our applications we are satisfied with knowing that Brownian motions exist.
If in a given probability space there is a Brownian motion, then one can
also define a Brownian motion in a different yet similar probability space. Thus,
Brownian motion is not uniquely defined. However, since every Brownian motion
has the same properties laid out in Definition 4.1.4, their effects are equivalent. We
usually pick a “convenient” version for the purpose of a concrete application.
For each Brownian motion Bt , denoting by Ft the σ -algebra that represents the information
contained in the Brownian motion up to time t, we get a natural filtration associated with Bt . In
fact, we can take Ft to be the σ -algebra generated by the collection of preimages of
Borel sets under Bs , s < t. In the sequel, whenever we discuss a Brownian motion
we always assume that it is accompanied by this filtration.
Somewhat more general than a Brownian motion is the martingale process.
Definition 4.1.7 (Martingale) Let Ft be a filtration for the probability space
(Ω, F, P ). We say Mt is a (P , Ft )-martingale if Mt is adapted to the filtration
Ft , E[|Mt |] < ∞ for all t > 0, and, for all s < t,
EP [Mt |Fs ] = Ms .
Similar to the discrete case, a martingale can be thought of as representing the wealth
process in playing a fair game. A Brownian motion Bt is clearly a martingale and
it is also easy to check that Mt = Bt² − t is a martingale. So a martingale is
not necessarily a Brownian motion. However, martingales are only slightly more
general than Brownian motion, as the following Levy’s theorem shows (which
we state without proof).
Theorem 4.1.8 (The Levy Characterization of Brownian Motion) Let X(t) =
(X1 (t), . . . , Xn (t)) be a continuous stochastic process on (Ω, F, Q). Then X(t) is
a Brownian motion with respect to Q if and only if
(i) X(t) is a martingale w.r.t. Q, and
(ii) Xi (t)Xj (t) − δij t is a martingale w.r.t. Q for all i, j = 1, . . . , n.
Here δij is the Kronecker delta defined by δij = 0 when i ≠ j and δii = 1.
For n = 1 we have the characterization of one-dimensional Brownian motion.
Theorem 4.1.9 (The Levy Characterization of Brownian Motion) Let X(t) be
a scalar continuous stochastic process on (Ω, F, Q). Then X(t) is a Brownian
motion with respect to Q if and only if
(i) X(t) is a martingale w.r.t. Q, and
(ii) X(t)² − t is a martingale w.r.t. Q.
This formula (4.1.1) looks like a usual chain rule except for the last term. A rigorous
proof is beyond the scope of this short book. Below are some heuristics that can help
in understanding the Itô formula.
We know that $f(B_t, t) - f(0, 0) = \int_0^t df(B_s, s)$. Expand $df(B_t, t)$ using
Taylor's expansion. Since terms of order $o(dt)$ will vanish in the integration process,
we need only do this to the second order. That gives us
$$df(B_t, t) = f_t(B_t, t)\,dt + f_x(B_t, t)\,dB_t + \frac12 f_{xx}(B_t, t)(dB_t)^2 + \frac12 f_{tt}(B_t, t)(dt)^2 + f_{tx}(B_t, t)\,dt\,dB_t.$$
Since $(dt)^2$ and $dt\,dB_t$ are $o(dt)$, the last two terms can be omitted and we have
$$df(B_t, t) = f_t(B_t, t)\,dt + f_x(B_t, t)\,dB_t + \frac12 f_{xx}(B_t, t)(dB_t)^2.$$
By the properties of the Brownian motion, we can replace $(dB_t)^2$ by $dt$, giving us the
Itô formula (4.1.1).
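The replacement of $(dB_t)^2$ by $dt$ can be illustrated numerically. The following is a minimal sketch, not taken from the text: it simulates Brownian increments on a uniform grid and sums their squares, which should be close to $t$. The function name `quadratic_variation`, the grid size, and the seed are illustrative assumptions.

```python
import math
import random


def quadratic_variation(t=1.0, n_steps=100_000, seed=0):
    """Sum of squared Brownian increments over [0, t] on a uniform grid."""
    rng = random.Random(seed)
    dt = t / n_steps
    sd = math.sqrt(dt)
    total = 0.0
    for _ in range(n_steps):
        db = rng.gauss(0.0, sd)  # increment B_{s+dt} - B_s ~ N(0, dt)
        total += db * db         # accumulate (dB)^2
    return total


if __name__ == "__main__":
    # The printed value should be close to t = 1 for a fine grid.
    print(quadratic_variation())
```

The sum fluctuates around $t$ with a standard deviation of order $t\sqrt{2/n}$ for $n$ steps, which is why the heuristic "$(dB_t)^2 = dt$" becomes exact only in the limit.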
Graphically, we can illustrate this by drawing the graph of $f_x$ around the point $B_t$; then
$df(B_t, t)$ is the area under the graph of $f_x$ (see Figure 4.1). We can see that
$f_x(B_t, t)\,dB_t$ represents the approximation of the area using Euler's method, while
$\frac12 f_{xx}(B_t, t)(dB_t)^2 \sim \frac12 f_{xx}\,dt$ corrects the “triangle” part to get to an approximation
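[Figure 4.1: graph of $f_x(\cdot, t)$ against $B$ over the interval from $B_t$ to $B_t + dB_t$, with the rectangle of area $f_x\,dB_t$ marked.]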
  ·    | dt   dB_t
  dt   | 0    0
  dB_t | 0    dt
Example 4.1.11 Below is a nice application illustrating the power of the Itô
formula. Define $\beta_k(t) = E[B_t^k]$. The Itô formula gives us
$$\beta_k(t) = \frac12 k(k-1)\int_0^t \beta_{k-2}(s)\,ds.$$
We can use this to easily get $E[B_t^3] = 0$ and $E[B_t^4] = 3t^2$. These are the moments most often used in
financial applications. By induction, in general $E[B_t^{2k+1}] = 0$ and
$$E[B_t^{2k}] = \frac{(2k)!\,t^k}{2^k k!}.$$
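The induction in this example can be checked mechanically. The sketch below is an illustration under our own naming (`even_moment_coefficients` and `closed_form_coefficient` are hypothetical helpers): it iterates the recursion for $\beta_{2k}(t) = c_k t^k$ in exact rational arithmetic and compares the coefficients with $(2k)!/(2^k k!)$.

```python
from fractions import Fraction
from math import factorial


def even_moment_coefficients(max_k):
    """Return [c_0, ..., c_max_k] with E[B_t^(2k)] = c_k * t**k.

    Uses beta_n(t) = (1/2) n (n-1) * integral_0^t beta_{n-2}(s) ds with n = 2k;
    integrating c_{k-1} * s**(k-1) over [0, t] contributes the factor 1/k.
    """
    coeffs = [Fraction(1)]  # beta_0(t) = E[1] = 1
    for k in range(1, max_k + 1):
        n = 2 * k
        coeffs.append(Fraction(n * (n - 1), 2) * coeffs[-1] / k)
    return coeffs


def closed_form_coefficient(k):
    """Coefficient (2k)! / (2^k k!) from the closed-form expression."""
    return Fraction(factorial(2 * k), 2 ** k * factorial(k))


if __name__ == "__main__":
    for k, c in enumerate(even_moment_coefficients(5)):
        print(k, c, closed_form_coefficient(k))
```

Exact fractions are used so that agreement is an identity rather than a floating-point coincidence.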
Itô Processes
and
$$P\left(\int_0^t |\mu(s, \omega)|\,ds < \infty \text{ for all } t \ge 0\right) = 1.$$
In shorthand we write
Here μ is a drift and σ indicates the magnitude of the variation of the random part. It is
often useful to write a stochastic process in this form if we can. A Brownian motion
is an example of an Itô process where μ = 0 and σ = 1. The Itô formula can be
generalized to Itô processes with $dX_t$ replacing $dB_t$.
Theorem 4.1.12 (The General Itô Formula) Let $f(t, x) \in C^2$ and let $X_t$ be an
Itô process. Then
$$df(X_t, t) = f_t(X_t, t)\,dt + f_x(X_t, t)\,dX_t + \frac12 f_{xx}(X_t, t)(dX_t)^2.$$
Example 4.1.15 Here is an example of using the general Itô formula. Let $X_t = \mu t + \sigma B_t$.
Then $dX_t = \mu\,dt + \sigma\,dB_t$. Using the box algebra we have
$$\begin{aligned}
df(X_t, t) &= f_t\,dt + f_x\,dX_t + \frac12 f_{xx}(dX_t)^2 \\
&= f_t\,dt + \mu f_x\,dt + \sigma f_x\,dB_t + \frac12 \sigma^2 f_{xx}\,dt.
\end{aligned}$$
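As a quick sanity check of this computation, consider the particular choice $f(x, t) = x^2$ (our choice, for illustration only). Assuming the stochastic integral against $dB_t$ has zero expectation, the formula reproduces the familiar second moment of $X_t = \mu t + \sigma B_t$:

```latex
\begin{aligned}
d(X_t^2) &= 2X_t\,dX_t + (dX_t)^2
          = (2\mu X_t + \sigma^2)\,dt + 2\sigma X_t\,dB_t, \\
E[X_t^2] &= \int_0^t \bigl(2\mu E[X_s] + \sigma^2\bigr)\,ds
          = \int_0^t \bigl(2\mu^2 s + \sigma^2\bigr)\,ds
          = \mu^2 t^2 + \sigma^2 t .
\end{aligned}
```

This agrees with $E[X_t^2] = (E[X_t])^2 + \operatorname{Var}(X_t) = (\mu t)^2 + \sigma^2 t$.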
Example 4.1.16 Letting f (t, x) = tx we have
$$tB_t = \int_0^t B_s\,ds + \int_0^t s\,dB_s$$
4.1 Continuous Stochastic Processes 113
or
$$\int_0^t s\,dB_s = tB_t - \int_0^t B_s\,ds.$$
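The identity $\int_0^t s\,dB_s = tB_t - \int_0^t B_s\,ds$ has an exact discrete counterpart (summation by parts), which the following sketch checks on one simulated path. The function name `integration_by_parts_gap` and the discretization choices (left-point sums for the stochastic integral, right-point sums for the time integral) are our assumptions, not the book's.

```python
import math
import random


def integration_by_parts_gap(t=1.0, n_steps=1000, seed=0):
    """Return (left-point sum of s dB) - (t B_t - right-point sum of B ds)."""
    rng = random.Random(seed)
    dt = t / n_steps
    times = [i * dt for i in range(n_steps + 1)]
    path = [0.0]  # B_0 = 0
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    # Left-point (Ito-type) sum approximating the integral of s dB_s.
    ito_sum = sum(times[i] * (path[i + 1] - path[i]) for i in range(n_steps))
    # Right-point Riemann sum approximating the integral of B_s ds.
    riemann_sum = sum(
        path[i + 1] * (times[i + 1] - times[i]) for i in range(n_steps)
    )
    return ito_sum - (times[-1] * path[-1] - riemann_sum)


if __name__ == "__main__":
    # The gap is zero up to floating-point rounding.
    print(integration_by_parts_gap())
```

No correction term appears here because $s$ is deterministic and of finite variation, so the product $ds\,dB_s$ vanishes in the box algebra.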
In integral form, the general integration by parts formula reads
$$\int_0^t X_s\,dY_s = X_tY_t - X_0Y_0 - \int_0^t Y_s\,dX_s - \int_0^t dX_s\,dY_s.$$
Remark 4.1.18 The term $dX_t\,dY_t$ is called the quadratic covariation of $X_t$ and $Y_t$
and is often denoted $d\langle X, Y\rangle_t$.
Martingale Representation
The Itô formula is a crucial tool in proving the following important martingale
representation theorem. This representation theorem further highlights the close
relationship between martingales and Brownian motions. Since our focus is on
applications, we omit the proof and present the result directly.
Theorem 4.1.19 (Martingale Representation) Let $B_t$ be an n-dimensional Brownian
motion generating the filtration $\mathcal{F}_t^n$. Suppose that $M_t$ is a $(P, \mathcal{F}_t^n)$-martingale
and that $E[M_t^2] < +\infty$ for all t ≥ 0. Then there exists a unique stochastic process
$v \in \mathcal{V}^n$ such that
$$M_t = E[M_0] + \int_0^t v\,dB_s.$$
Let $f(x, t) \in C^{2,1}$ and let $X_t$ be an Itô process. Then using the quadratic covariation
in Remark 4.1.18 we can write the general Itô formula in Theorem 4.1.12 as
$$df(X_t, t) = f_t(X_t, t)\,dt + f_x(X_t, t)\,dX_t + \frac12\,d\langle f_x(X, t), X\rangle_t. \qquad (4.1.4)$$
Now assume that f is convex in x for all t. We use $f^*(y, t)$ to signify the conjugate
of f with respect to the variable x. Define $Y_t = f_x(X_t, t)$. We see that $X_t$, $Y_t$ satisfy
the Fenchel equality
Combining (4.1.4), (4.1.6), and (4.1.8) we derive the following Dual Itô formula
$$df(X_t, t) = f_t(X_t, t)\,dt + Y_t\,dX_t + \frac12\,d\langle Y, X\rangle_t \qquad (4.1.9)$$
$$df^*(Y_t, t) = f^*_t(Y_t, t)\,dt + X_t\,dY_t + \frac12\,d\langle X, Y\rangle_t.$$
In financial applications, prices of stocks and other assets are often described by an
Itô process of the form
where μ models a drift reflecting the overall trend of the asset price and σ describes
the volatility of the random fluctuation of the price process. In analyzing the price
process, the important part is the impact of σ . The Girsanov theorem allows us to
“absorb” the drift μ by using a change of the probability measure. This is very
similar to the equivalent martingale measure that absorbs the excess gains for the
risky assets in the discrete model.
Theorem 4.1.20 (Removal of Drift via Girsanov’s Theorem) Let St be an Itô
process of the form
is a probability measure on $\mathcal{F}_T$ and
2. $\hat{B}(t) = \int_0^t u(s, \omega)\,ds + B(t)$ is a standard Brownian motion w.r.t. Q and
3.
$$dX_t = u\,dB_t + \frac12 u^2\,dt.$$
By direct calculation we have
To show that $\hat{B}_t$ is a standard Brownian motion, we check the conditions
in the Lévy characterization of Theorem 4.1.8. We check only Theorem 4.1.8 (i)
since (ii) is similar. Using the product rule we can verify that $M_t\hat{B}_t$ is a martingale
with respect to P. Now for s < t and $A \in \mathcal{F}_s$ we have
We note that, by (4.1.10), $M_t = \exp\left(-\int_0^t u(s, \omega)\,dB_s - \frac12\int_0^t u^2(s, \omega)\,ds\right)$ is
always a local martingale. Novikov's condition
$$E\left[e^{\frac12\int_0^T u_t^2\,dt}\right] < \infty$$
that represents the price process of a certain financial asset. Here Bt is a Brownian
motion in a probability measure space (Ω, F, P ) with filtration Ft . Assume for
simplicity that the risk free interest rate is 0 and that μ, σ are bounded and σ ≥ c >
0 for some constant c. Suppose that we want to price a European style contingent
claim on St with the payoff f (ST ) at the maturity T . We can proceed as follows.
First using the Girsanov theorem we can write
Next we explicitly calculate the price function for call options under the Bachelier
and Black–Scholes models.
Bachelier Formula
Bachelier modeled the price of a stock in his pioneering 1900 paper [3] by
where μ and σ are constant. This model was thought unrealistic because a stock price
cannot become negative. However, we can now see it as a good approximation for
pairs trading or for forwards on currency swap contracts. Consider the price of a call
option with a strike K maturing at T. Then formula (4.2.1) reduces to
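$$dS_t = \sigma\,dW_t$$
$$\begin{aligned}
&= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\left(x - K + \sqrt{T-t}\,\sigma y\right)^+ e^{-y^2/2}\,dy \\
&= \frac{1}{\sqrt{2\pi}}\int_{\frac{K-x}{\sigma\sqrt{T-t}}}^{\infty}\left(x - K + \sqrt{T-t}\,\sigma y\right) e^{-y^2/2}\,dy \\
&= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\frac{x-K}{\sigma\sqrt{T-t}}}\left(x - K - \sqrt{T-t}\,\sigma z\right) e^{-z^2/2}\,dz \qquad (z = -y)
\end{aligned}$$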
where
$$N(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{t} e^{-z^2/2}\,dz.$$
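To make the integral above concrete, here is a small sketch that evaluates it two ways: by a midpoint rule applied to the last integral of the derivation, and by the standard Bachelier closed form $(x-K)N(d) + \sigma\sqrt{T-t}\,N'(d)$ with $d = (x-K)/(\sigma\sqrt{T-t})$, where $N'$ is the standard normal density. The function names, the zero interest rate, and the truncation of the lower limit at $-10$ are assumptions made for illustration.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density N'."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def bachelier_call(x, strike, sigma, tau):
    """Zero-rate Bachelier call price, tau = T - t > 0."""
    spread = sigma * math.sqrt(tau)
    d = (x - strike) / spread
    return (x - strike) * norm_cdf(d) + spread * norm_pdf(d)


def bachelier_call_quadrature(x, strike, sigma, tau, n=20_000, lower=-10.0):
    """Midpoint rule for the integral over z from -infinity (truncated) to d."""
    spread = sigma * math.sqrt(tau)
    upper = (x - strike) / spread
    if upper <= lower:
        return 0.0
    h = (upper - lower) / n
    total = 0.0
    for i in range(n):
        z = lower + (i + 0.5) * h
        total += (x - strike - spread * z) * norm_pdf(z)
    return total * h


if __name__ == "__main__":
    print(bachelier_call(105.0, 100.0, 2.0, 1.0))
    print(bachelier_call_quadrature(105.0, 100.0, 2.0, 1.0))
```

At the money ($x = K$) the closed form collapses to $\sigma\sqrt{T-t}/\sqrt{2\pi}$, a convenient spot check.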
Black–Scholes Formula
Black and Scholes modeled the price of a stock as a geometric Brownian motion
where μ and σ are constant. Consider the price of a call option with a strike K
maturing at T . Again formula (4.2.1) reduces to
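$$dS_t = \sigma S_t\,dW_t$$
$$\begin{aligned}
C(x, t) &= E^Q\left[\left(x\exp\left(-\frac{\sigma^2(T-t)}{2} + \sqrt{T-t}\,\sigma W_1\right) - K\right)^+\right] \qquad (4.2.7) \\
&= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\left(x\exp\left(-\frac{\sigma^2(T-t)}{2} + \sqrt{T-t}\,\sigma y\right) - K\right)^+ e^{-y^2/2}\,dy \\
&= \frac{1}{\sqrt{2\pi}}\int_{\frac{\ln\frac{K}{x} + \frac{\sigma^2(T-t)}{2}}{\sigma\sqrt{T-t}}}^{\infty}\left(x\exp\left(-\frac{\sigma^2(T-t)}{2} + \sqrt{T-t}\,\sigma y\right) - K\right) e^{-y^2/2}\,dy
\end{aligned}$$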
where
$$d_\pm = \frac{\ln\frac{x}{K} \pm \frac{\sigma^2(T-t)}{2}}{\sigma\sqrt{T-t}}.$$
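The following sketch evaluates expectation (4.2.7) numerically and compares it with the standard zero-rate Black–Scholes closed form $C = xN(d_+) - KN(d_-)$, using $d_\pm$ as defined above. It is an illustration only: the function names, the midpoint rule, and the truncation of the upper limit at $y = 12$ are our assumptions.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def black_scholes_call(x, strike, sigma, tau):
    """Zero-rate Black-Scholes call, C = x N(d+) - K N(d-), tau = T - t > 0."""
    vol = sigma * math.sqrt(tau)
    d_plus = (math.log(x / strike) + 0.5 * vol * vol) / vol
    d_minus = d_plus - vol
    return x * norm_cdf(d_plus) - strike * norm_cdf(d_minus)


def black_scholes_call_quadrature(x, strike, sigma, tau, n=20_000, upper=12.0):
    """Midpoint rule for the last integral in the derivation of (4.2.7)."""
    vol = sigma * math.sqrt(tau)
    lower = (math.log(strike / x) + 0.5 * vol * vol) / vol
    if lower >= upper:
        return 0.0
    h = (upper - lower) / n
    total = 0.0
    for i in range(n):
        y = lower + (i + 0.5) * h
        payoff = x * math.exp(-0.5 * vol * vol + vol * y) - strike
        total += payoff * math.exp(-0.5 * y * y)
    return total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    print(black_scholes_call(100.0, 110.0, 0.2, 1.0))
    print(black_scholes_call_quadrature(100.0, 110.0, 0.2, 1.0))
```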
4.2.2 Convexity
Convexity and generalized convexity play important roles in dealing with option
pricing and hedging. Both the Bachelier and Black–Scholes formulae involve interesting
convexity with respect to their various parameters.
We start with the Bachelier formula and use $I = \sqrt{T-t}\,\sigma$ and forward price
$X = x - K$ to simplify notation. We will also use their ratio, the moneyness $m = X/I$.
Using these new variables, we can write the Bachelier formula (4.2.3) as
$$B(X, I) = E^Q[(X + IW_1)^+] = XN\left(\frac{X}{I}\right) + IN'\left(\frac{X}{I}\right). \qquad (4.2.9)$$
We see that the sublinear property of the Bachelier formula brings us much
convenience in calculating BX and BI .
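One way to see why sublinearity helps is a direct computation, sketched here under the reading $B(X, I) = XN(m) + IN'(m)$ with $m = X/I$ and $N'$ the standard normal density, so that $N''(m) = -mN'(m)$. The displayed formulas for $B_X$ and $B_I$ are not reproduced above, so this is offered as a check rather than as the book's derivation:

```latex
\begin{aligned}
B_X &= N(m) + \frac{X}{I}N'(m) + N''(m) = N(m) + mN'(m) - mN'(m) = N(m), \\
B_I &= -\frac{X^2}{I^2}N'(m) + N'(m) - \frac{X}{I}N''(m)
     = -m^2N'(m) + N'(m) + m^2N'(m) = N'(m), \\
XB_X + IB_I &= XN(m) + IN'(m) = B(X, I).
\end{aligned}
```

The last line is Euler's identity for a positively homogeneous function of degree one: the coefficients of $X$ and $I$ in the pricing formula are exactly the two partial derivatives.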
$B = \sigma_M$ and $B^* = \iota_M$.
We observe that the variable x appears in the expression for C(x, t) in three separate
places. Yet, curiously, the partial derivative with respect to x
contains only the contribution from differentiating the linear term in x. This is rather
similar to the simple formula for $B_X$ in (4.2.11). In the next section we will show that
the reason is related to the convexity of C in x, and that the Fenchel–Legendre transform of
C in x is related to delta hedging. It is natural to ask whether C is also convex
with respect to σ. It turns out that the answer is negative. Yet if we compensate C by a
multiple of an at-the-money call it becomes convex.
We start by calculating the partial derivative of C with respect to σ:
$$C_\sigma = xN'(d_+)\frac{\partial d_+}{\partial\sigma} - KN'(d_-)\frac{\partial d_-}{\partial\sigma}. \qquad (4.2.14)$$
Observing that
$$xN'(d_+) = KN'(d_-) = \sqrt{\frac{xK}{2\pi}}\exp\left(-\frac{(\ln(x/K))^2}{2\tau\sigma^2} - \frac{\tau\sigma^2}{8}\right) \qquad (4.2.15)$$
and
$$d_+ - d_- = \sigma\sqrt{\tau} \qquad (4.2.16)$$
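$$C_{\sigma\sigma} = \sqrt{\frac{xK\tau}{2\pi}}\exp\left(-\frac{(\ln(x/K))^2}{2\tau\sigma^2} - \frac{\tau\sigma^2}{8}\right)\left[\frac{(\ln(x/K))^2}{\tau\sigma^3} - \frac{\tau\sigma}{4}\right]. \qquad (4.2.18)$$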
4.2 Bachelier and Black–Scholes Formulae 121
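Defining
$$f(\sigma) := C - \sqrt{xK}\left[N\left(\frac{\sqrt{\tau}\sigma}{2}\right) - N\left(-\frac{\sqrt{\tau}\sigma}{2}\right)\right]$$
(note that the expression inside the square brackets is the percentage premium of an
at-the-money call option) we have
$$\begin{aligned}
f''(\sigma) &= C_{\sigma\sigma} + \frac{\tau\sigma}{4}\sqrt{xK\tau}\,N'\left(\frac{\sqrt{\tau}\sigma}{2}\right) \qquad (4.2.19) \\
&= \sqrt{xK\tau}\,N'\left(\frac{\sqrt{\tau}\sigma}{2}\right)\exp\left(-\frac{(\ln(x/K))^2}{2\tau\sigma^2}\right)\frac{(\ln(x/K))^2}{\tau\sigma^3} \\
&\quad + \frac{\tau\sigma}{4}\sqrt{xK\tau}\,N'\left(\frac{\sqrt{\tau}\sigma}{2}\right)\left[1 - \exp\left(-\frac{(\ln(x/K))^2}{2\tau\sigma^2}\right)\right] \ge 0.
\end{aligned}$$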
We note that
$$N\left(\frac{\sqrt{\tau}\sigma}{2}\right) - N\left(-\frac{\sqrt{\tau}\sigma}{2}\right)$$
is the price of an at-the-money call. Thus, the Black–Scholes call price C
compensated by a multiple $(-\sqrt{x/K})$ of an at-the-money call as a function of σ
is convex. We can also phrase this in terms of generalized convexity. Note that f
is convex and, therefore, can be supported from below by an affine function. Thus,
the Black–Scholes call price C as a function of σ can be supported from below by
a function of the form
$$\sqrt{xK}\left[N\left(\frac{\sqrt{\tau}\sigma}{2}\right) - N\left(-\frac{\sqrt{\tau}\sigma}{2}\right)\right] + y\sigma - b.$$
Define
$$c(\sigma, y) = \sqrt{xK}\left[N\left(\frac{\sqrt{\tau}\sigma}{2}\right) - N\left(-\frac{\sqrt{\tau}\sigma}{2}\right)\right] + y\sigma.$$
Then the Black–Scholes call price C as a function of σ is $\Phi_{c(1)}$-convex using the
notation in Section 1.5.
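The two claims of this subsection, that C is not convex in σ while the compensated price f is, can be probed numerically. The sketch below uses central second differences on a zero-rate Black–Scholes call; the function names, the sample parameters (x = 100, K = 120, τ = 1), and the step size are illustrative assumptions.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def black_scholes_call(x, strike, sigma, tau):
    """Zero-rate Black-Scholes call, C = x N(d+) - K N(d-)."""
    vol = sigma * math.sqrt(tau)
    d_plus = (math.log(x / strike) + 0.5 * vol * vol) / vol
    return x * norm_cdf(d_plus) - strike * norm_cdf(d_plus - vol)


def compensated_call(x, strike, sigma, tau):
    """f(sigma) = C - sqrt(xK) [N(sqrt(tau) sigma/2) - N(-sqrt(tau) sigma/2)]."""
    a = 0.5 * sigma * math.sqrt(tau)
    atm_premium = norm_cdf(a) - norm_cdf(-a)
    return black_scholes_call(x, strike, sigma, tau) - math.sqrt(x * strike) * atm_premium


def second_difference(func, sigma, h=1e-3):
    """Central finite-difference approximation of the second derivative."""
    return (func(sigma + h) - 2.0 * func(sigma) + func(sigma - h)) / (h * h)


if __name__ == "__main__":
    x, strike, tau = 100.0, 120.0, 1.0
    for sigma in (0.2, 0.6, 1.0, 1.5):
        c2 = second_difference(lambda s: black_scholes_call(x, strike, s, tau), sigma)
        f2 = second_difference(lambda s: compensated_call(x, strike, s, tau), sigma)
        print(f"sigma={sigma:.1f}  C''={c2:+.4f}  f''={f2:+.4f}")
```

By (4.2.18), $C_{\sigma\sigma}$ changes sign where $\sigma^2 = 2|\ln(x/K)|/\tau$, so for these parameters the second difference of C turns negative for σ above roughly 0.6, while that of f stays non-negative.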
4.2.3 Duality
We turn to explore the reason why the Black–Scholes call formula
C has a simple derivative with respect to x. To understand this phenomenon we need
They want to choose Nt in such a way that the resulting portfolio (4.2.20) has
riskless gains, that is
It follows that
$$N_t = \frac{\partial C}{\partial x} \qquad (4.2.23)$$
and C must satisfy the Black–Scholes partial differential equation
$$\frac{\partial C}{\partial t} + \frac12\sigma^2x^2\frac{\partial^2 C}{\partial x^2} = 0, \qquad (4.2.24)$$
with terminal condition
The Black–Scholes partial differential equation (4.2.24) with the terminal condition (4.2.25)
provides an alternative derivation of the Black–Scholes formula (4.2.8)
via the Feynman–Kac formula.
Relationships (4.2.20) and (4.2.23) reveal that when portfolio (4.2.20) has
riskless gains, its value equals the Fenchel–Legendre transform of the no-arbitrage
option price. Since Merton has shown that the Black–Scholes option price $C(S_t, t)$
is convex in $S_t$, we have the following duality:
and
where the conjugate operation is with respect to the first variable. These relation-
ships reveal that for each fixed t the option value is a convex function of the stock
price and the cash borrowed C ∗ (Nt , t) is a convex function of the share of the stock
in the hedging portfolio. The same relationship also holds for the Bachelier formula.
Thus, the simple form of the partial derivative of C in (4.2.13) is a consequence
of the Fenchel-Young equality in Proposition 1.3.1. This duality argument also
explains the simplicity of BX but as mentioned before BX can be derived more
directly using the sublinear property of the Bachelier formula B.
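The duality can be illustrated numerically for the zero-rate Black–Scholes call. Taking the conjugate in the usual sense, $C^*(n, t) = \sup_y [ny - C(y, t)]$ (the displayed duality relations are not reproduced above, so this definition is our assumption), the sketch below checks that at $n = C_x(x, t) = N(d_+)$ the supremum is attained at $y = x$ and that the cash borrowed $N_t x - C$ equals $KN(d_-)$. All function names and parameter values are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def black_scholes_call(x, strike, sigma, tau):
    """Zero-rate Black-Scholes call, C = x N(d+) - K N(d-)."""
    vol = sigma * math.sqrt(tau)
    d_plus = (math.log(x / strike) + 0.5 * vol * vol) / vol
    return x * norm_cdf(d_plus) - strike * norm_cdf(d_plus - vol)


def call_delta(x, strike, sigma, tau):
    """Hedge ratio N_t = dC/dx = N(d+)."""
    vol = sigma * math.sqrt(tau)
    return norm_cdf((math.log(x / strike) + 0.5 * vol * vol) / vol)


def cash_borrowed(x, strike, sigma, tau):
    """Fenchel equality at the hedge ratio: C*(N_t) = N_t x - C(x)."""
    return call_delta(x, strike, sigma, tau) * x - black_scholes_call(x, strike, sigma, tau)


def conjugate_on_grid(n_shares, strike, sigma, tau, grid):
    """Grid approximation of sup_y [n y - C(y)]."""
    return max(n_shares * y - black_scholes_call(y, strike, sigma, tau) for y in grid)


if __name__ == "__main__":
    x, strike, sigma, tau = 100.0, 110.0, 0.2, 1.0
    n = call_delta(x, strike, sigma, tau)
    grid = [0.5 * i for i in range(1, 601)]  # stock prices 0.5, 1.0, ..., 300.0
    print(cash_borrowed(x, strike, sigma, tau))
    print(conjugate_on_grid(n, strike, sigma, tau, grid))
```

Because C is convex in the stock price, $y \mapsto ny - C(y)$ is concave and its maximizer is characterized by $C_x(y) = n$, which is exactly the delta-hedging condition (4.2.23).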
The duality relationship in delta hedging observed in the previous section for the
Bachelier and Black–Scholes formulae also holds in a more general setting.
4.3.2 Duality
and
Here the conjugate $v^*$ is the cash borrowed process when we maintain a self-financing
hedging portfolio. Relationship (4.3.8) says that the hedging
portfolio has riskless gain and relationship (4.3.9) shows that the hedging portfolio
$S_tN_t - v^*(N_t, t)$ is self-financing.
4.3 Duality and Delta Hedging 125
[Figures: graphs of $v_x(\cdot, t)$ and $v_x^{-1}(\cdot, t)$ in the $sn$-plane, with the point $(S_t, N_t)$ marked.]
$$-\frac{\partial v^*}{\partial t} + \frac{\sigma^2x^2\tilde{v}_{xx}^2}{2}v^*_{nn} = 0. \qquad (4.3.11)$$
$$\frac{\partial v^*}{\partial\tau} + \frac{\sigma^2x^2\tilde{v}_{xx}^2}{2}v^*_{nn} = 0. \qquad (4.3.12)$$
Since Equations (4.3.12) and (4.3.5) have the same form this suggests that in reverse
time the cash borrowed process v ∗ should be a martingale just like v is a martingale
in time t.
Let us fix the notation first. We use τ to denote the reversed time. For a stochastic
process $P_t$, t ∈ [0, T], we define its time reversal by $\hat{P}_\tau = P_t$ provided that t + τ =
T. Let Δ denote an infinitesimal increment of time. Setting τ + t + Δ = T,
we have
N̂τ = vx (Ŝτ , τ ).
The time reversal of the differential of a product of stochastic processes needs to be
handled with caution. For example, we can write (4.3.1) as
St+Δ − St = σ St (Wt+Δ − Wt ).
Letting t + τ + Δ = T we have
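$$\begin{aligned}
d\hat{N}_\tau &= \frac{\partial v_x}{\partial t}\,d\tau + \frac12\frac{\partial^2 v_x}{\partial x^2}(d\hat{S}_\tau)^2 + \frac{\partial v_x}{\partial x}\,d\hat{S}_\tau \qquad (4.3.15) \\
&= \left(\frac{\partial v_x}{\partial t} + \frac{\partial v_x}{\partial x}\sigma^2\hat{S}_\tau + \frac12\frac{\partial^2 v_x}{\partial x^2}\sigma^2\hat{S}_\tau^2\right)d\tau + \frac{\partial v_x}{\partial x}\sigma S_\tau\,d\hat{W}_\tau.
\end{aligned}$$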
$$\frac{\partial v_x}{\partial t} + \frac{\partial v_x}{\partial x}\sigma^2x + \frac12\frac{\partial^2 v_x}{\partial x^2}\sigma^2x^2 = 0.$$
It follows that
$$d\hat{N}_\tau = \frac{\partial v_x}{\partial x}\sigma S_\tau\,d\hat{W}_\tau \qquad (4.3.16)$$
is a martingale.
Finally we consider the time reversal of the hedging portfolio (cash borrowed)
process Ht = v ∗ (Nt , t). Using the dual Itô formula (4.1.9) we have
$$dv = v_t\,dt + N_t\,dS_t + \frac12\,d\langle S, N\rangle_t \qquad (4.3.17)$$
$$dH_t = dv^* = v^*_t\,dt + S_t\,dN_t + \frac12\,d\langle S, N\rangle_t.$$
Combining (4.3.17) with the riskless gain condition dv = Nt dSt and vt + vt∗ = 0
from (4.1.6) we have
Letting t + τ + Δ = T we have
or
$$d\hat{H}_\tau = \hat{S}_\tau\,d\hat{N}_\tau = \frac{\partial v_x}{\partial x}\sigma S_\tau^2\,d\hat{W}_\tau. \qquad (4.3.19)$$
Financial innovations in the past several decades have led to the creation of many
new types of financial derivatives. They have become increasingly liquid and, thus, can
also be used as hedging devices. What happens when we use a contingent claim
instead of the underlying to construct a hedging portfolio for the purpose of pricing and
hedging a target contingent claim? It turns out that a duality also emerges between
the value of the target contingent claim and the cash borrowed process in terms of
generalized duality which naturally corresponds to a generalized convexity concept
(see, e.g., Section 1.5). Moreover, similar to the classical option pricing theory, the
no arbitrage value of the contingent claim derived this way preserves the generalized
convexity of the terminal payoff.
where Wt is a standard Brownian motion. We assume again that the risk free rate
is 0. Consider a target contingent claim on St of European style with maturity at
T > 0 and a terminal payoff f (ST ) at t = T . Suppose that a different contingent
claim, we call it hedging claim, on St is traded on the market with price p(St , t) at
all times t ∈ [0, T]. For uniqueness, in what follows we always assume that p and v
are smooth functions bounded by $\alpha e^{\beta x^2}$ for some α, β > 0. Our main result is:
Theorem 4.4.1 (Consistency of Generalized Convexity) Define $c_t(x, y) =
p(x, t)y$ and assume that f is $\Phi_{c_T(1)}$-convex. Then
(i) the partial differential equation $v_t + \frac{\sigma^2}{2}v_{xx} = 0$, $v(x, T) = f(x)$, uniquely
determines an arbitrage free price for the target claim;
(ii) for any t ∈ [0, T], $v(\cdot, t)$ is $\Phi_{c_t(1)}$-convex; and
(iii) $N_t$ determined by
makes the portfolio of the hedging instrument and the riskless asset
$p(S_t, t)N_t - v^{c_t(1)}(N_t, t)$ riskless.
4.4 Generalized Duality and Hedging with Contingent Claims 129
is the cash borrowed resulting from this portfolio. Self-financing implies that
$$p_t + \frac{\sigma^2}{2}p_{xx} = 0.$$
Thus, v must also satisfy the Black–Scholes PDE
$$v_t + \frac{\sigma^2}{2}v_{xx} = 0 \qquad (4.4.6)$$
with terminal condition
We show that $v^{c_t(1)c_t(2)}$ satisfies the same Black–Scholes PDE as v does. Observe
that $x \to p(x, T)$ is strictly monotone, which implies that $x \to p(x, t)$ is invertible,
i.e., $x = x(p, t)$. We can define
Then we have
Thus, we need only to show that ṽ and ṽ ∗∗ satisfy the same Black–Scholes PDE.
We do so through the PDE for the cash borrowed ṽ ∗ . Changing variables we have
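$$\begin{aligned}
\frac{\partial v}{\partial t} &= \frac{\partial\tilde{v}}{\partial t} + \tilde{v}_p\frac{\partial p}{\partial t}, \\
v_x &= \tilde{v}_p p_x, \\
v_{xx} &= \tilde{v}_p p_{xx} + \tilde{v}_{pp} p_x^2.
\end{aligned}$$
Substituting these into
$$\frac{\partial v}{\partial t} + \frac{\sigma^2}{2}v_{xx} = 0$$
and using
$$\frac{\partial p}{\partial t} + \frac{\sigma^2}{2}p_{xx} = 0$$
we have
$$\frac{\partial\tilde{v}}{\partial t} + \frac{\sigma^2p_x^2}{2}\tilde{v}_{pp} = 0. \qquad (4.4.8)$$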
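Thus, using the Fenchel equality
$$\tilde{v}(P_t, t) + \tilde{v}^*(N_t, t) = P_tN_t$$
we have
$$n = \tilde{v}_p, \quad p = \tilde{v}^*_n, \quad \frac{\partial\tilde{v}}{\partial t} = -\frac{\partial\tilde{v}^*}{\partial t}$$
and
$$\tilde{v}_{pp}\tilde{v}^*_{nn} = 1.$$
$$-\frac{\partial\tilde{v}^*}{\partial t} + \frac{\sigma^2p_x^2\tilde{v}_{pp}^2}{2}\tilde{v}^*_{nn} = 0. \qquad (4.4.9)$$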
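For readers who want the intermediate steps, here is one way to obtain the relations used above, assuming $\tilde{v}(\cdot, t)$ is smooth and strictly convex so that $n = \tilde{v}_p(p, t)$ can be inverted as $p = \tilde{v}^*_n(n, t)$ (a sketch, not the book's own derivation):

```latex
\begin{aligned}
p = \tilde v^*_n\bigl(\tilde v_p(p,t),t\bigr)
  &\;\Longrightarrow\; 1 = \tilde v^*_{nn}\,\tilde v_{pp}
  \quad\text{(differentiate in } p\text{)},\\
\tilde v^*(n,t) = p\,n - \tilde v(p,t),\ p = p(n,t)
  &\;\Longrightarrow\;
  \frac{\partial \tilde v^*}{\partial t}
  = \bigl(n - \tilde v_p\bigr)\frac{\partial p}{\partial t}
    - \frac{\partial \tilde v}{\partial t}
  = -\frac{\partial \tilde v}{\partial t},\\
\frac{\partial \tilde v}{\partial t} + \frac{\sigma^2 p_x^2}{2}\tilde v_{pp} = 0
  &\;\Longrightarrow\;
  -\frac{\partial \tilde v^*}{\partial t}
  + \frac{\sigma^2 p_x^2\,\tilde v_{pp}^2}{2}\,\tilde v^*_{nn} = 0 .
\end{aligned}
```

The last implication substitutes $\tilde{v}_{pp} = \tilde{v}_{pp}^2\,\tilde{v}^*_{nn}$, which follows from the first line.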
To derive the PDE for ṽ ∗∗ we start from Pt and Nt satisfying the Fenchel equality
ṽ ∗∗ (Pt , t) + ṽ ∗ (Nt , t) = Pt Nt .
Then we have
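$$n = \tilde{v}^{**}_p, \quad p = \tilde{v}^*_n, \quad \frac{\partial\tilde{v}^{**}}{\partial t} = -\frac{\partial\tilde{v}^*}{\partial t}$$
and
$$\tilde{v}^{**}_{pp}\tilde{v}^*_{nn} = 1.$$
$$\frac{\partial\tilde{v}^{**}}{\partial t} + \frac{\sigma^2p_x^2}{2}\tilde{v}^{**}_{pp} = 0.$$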
We see that $\tilde{v}$ and $\tilde{v}^{**}$ satisfy the same Black–Scholes differential equation. Since
$v(x, t) = \tilde{v}(p, t)$ and $\tilde{v}^{**}(p, t) = v^{c_t(1)c_t(2)}(x, t)$ for $x = x(p, t)$ we conclude that
$v(x, t)$ and $v^{c_t(1)c_t(2)}(x, t)$ also satisfy the same Black–Scholes differential equation.
Finally, since $v(\cdot, T)$ is $\Phi_{c_T(1)}$-convex we have $v(x, T) = v^{c_T(1)c_T(2)}(x, T)$. That
is, $v$ and $v^{c_T(1)c_T(2)}$ satisfy the same terminal condition. Thus, they must be the same
for all t, i.e. $v(x, t) = v^{c_t(1)c_t(2)}(x, t)$, so that $v(\cdot, t)$ is $\Phi_{c_t(1)}$-convex.
Remark 4.4.2 Function ct (x, y) = p(x, t)y is known when we know the price of
claim p that we use to hedge.
Fixing t and defining ṽ(p, t) = v(x(p, t), t), we can represent the portfolio
$p(S_t, t)N_t - v(S_t, t)$ graphically in Figures 4.4 and 4.5.
[Figures 4.4 and 4.5: graphs of $\tilde{v}_p(\cdot, t)$ in the $sn$-plane, with the point $(S_t, N_t)$ marked.]
We see that these graphs are almost exact replications of the graphic repre-
sentation of the hedging portfolio St Nt − v(St , t). The only difference is that the
sn-plane is weighted by px (·, t). This implies the following generalized Fenchel
duality relationship.
and
While in principle the PDE with terminal condition (4.4.6) and (4.4.7) determines
an arbitrage free and Φct (1) -convexity preserving contingent claim pricing function
v, to determine the hedging process one must know the dynamics of $N_t$ and $H_t =
v(\cdot, t)^{c_t(1)}(N_t)$.
Defining n(x, t) := vx (x, t)/px (x, t), Equation (4.4.5) implies that the hedging
process is
In general Ht is not a martingale. However, in some special case it could be. For
example, if p(x, t) = x, i.e. the hedging is done with the price process St itself,
then px = 1, pxx = 0 and Equation (4.4.17) is simplified to
Exchange traded funds (ETFs) are securities that can be traded in a financial market
like a stock. These financial products are created to provide investors the flexibility
to invest in a specific sector, such as real estate or technology, or in a broad index
such as the S&P 500. Some of them also enable investors to leverage. For example,
one can buy ETFs that double or triple the daily percentage movement of, say, the
popular S&P 500 index and many other indices. There are also short ETFs that mimic
the effect of selling borrowed shares of the corresponding ETFs. Buying an ETF itself is
referred to as going long. They provide convenient tools for hedging. We discuss in this
section the general p-multiple ETF, which mimics p times the percentage
movement of the underlying, as a hedging tool. We will need the following special
case of Theorem 2.2.3.
Proposition 4.4.3 The function $x^q$, x ≥ 0, is $\Phi_{[x^p y](1)}$-convex if either q > 0 and
p < q or q < 0 and q < p. Similarly, the function $-x^q$, x ≥ 0, is $\Phi_{[x^p y](1)}$-convex
if either p > q > 0 or p < q < 0.
Proof We prove only the case $x^q$. The discussion for $-x^q$ is similar. Let $u(x) =
x^q$, x ≥ 0. It is easy to calculate that
$$R(x) = -\frac{xu''(x)}{u'(x)} = 1 - q. \qquad (4.4.19)$$
When q > 0 and p < q, u is an increasing function and R(x) = 1 − q < 1 − p,
and when q < 0 and p > q, u is a decreasing function and R(x) = 1 − q > 1 − p.
Now the conclusion of the proposition follows directly from Theorem 2.2.3.
$$\frac{dv(S_t, t)}{v(S_t, t)} = q\frac{dS_t}{S_t}.$$
Theorem 4.4.4 (Hedging with Multiple of ETF) Let $S_t$ be the price of an asset
satisfying the diffusion equation (4.4.20). Suppose that either q > 0 and p < q
or q < 0 and q < p. Then a q-multiple long ETF of $S_t$, t ∈ [0, T], can always
be dynamically hedged with an arbitrage free self-financing portfolio involving a
p-multiple ETF of $S_t$. Moreover, for any t ∈ [0, T], the arbitrage free price of the
q-multiple ETF is $\Phi_{[x^p y](1)}$-convex.
Proof By Theorem 4.4.1 we need only to check that $v(x, T) = x^q$ is $\Phi_{[x^p y](1)}$-convex.
This follows directly from Proposition 4.4.3.
$$H_t = \frac{q-p}{p}v(S_t, t).$$
Note that the cash borrowed process is always a martingale. In particular, for q = 4
and p = 2, we see that the no arbitrage price of the quadruple long ETF at any given
time t ∈ [0, T] is $\Phi_{[x^2 y](1)}$-convex and such a process can be hedged by a double
ETF.
Remark 4.4.5 It is worth observing that when q ∈ (0, 1) and p < q the $\Phi_{[x^p y](1)}$-convex
functions are, in fact, concave. We can see that $\Phi_{[x^p y](1)}$-convex functions
represent a wide spectrum of convex and concave functions with different strengths.
A few graphic illustrations are included in Figures 4.6, 4.7, 4.8, and 4.9.
The above discussion can be applied to a q-multiple short ETF of $S_t$. We
summarize the result in the following theorem.
Theorem 4.4.6 Let $S_t$ be the price of an asset satisfying the diffusion equation (4.4.20).
Suppose that either p > q > 0 or p < q < 0. Then
$$da_t = \sigma a_t\,dW_t,$$
where σ is a constant. Assume that the risk free rate is r and that there is no dividend.
Let's first view the stock price $S(a_t)$ as a perpetual claim on $a_t$. Then $S(a_t)$ satisfies
the ordinary differential equation
$$\frac{\sigma^2x^2}{2}S_{xx} + rxS_x - rS = 0.$$
So that
$$S(a_t) = ba_t - ca_t^q,$$
$$-\frac{xu''(x)}{u'(x)} \le 1 - q.$$
Thus, for K sufficiently large $u(x, T)$ is a $\Phi_{[x^q y](1)}$-convex function. It follows
from Theorem 4.4.1 that $u(\cdot, t)$ is also $\Phi_{[x^q y](1)}$-convex.
Example 4.4.8 (Normal Kernel) Consider the scaled normal kernel
$$n(x) = e^{-kx^2/2}, \quad x \ge 0,\ k > 0.$$
We can verify that $-xn''(x)/n'(x) = kx^2 - 1 \ge -1$ but there is no upper bound.
Thus, the decreasing function $e^{-kx^2/2}$, x ≥ 0, is $\Phi_{[x^p y](1)}$-convex for any p ≥ 2.
Due to the symmetry of both $e^{-kx^2/2}$ and $|x|^p y - b$ with respect to the vertical axis
we conclude that this property also holds when x < 0. So that $e^{-x^2/2}$ is $\Phi_{[|x|^p y](1)}$-convex
When there are multiple hedging claims available in the market, it is usually the
case that for a given target contingent claim there are many different ways to hedge.
Choosing an appropriate hedging device that fits better in generalized convexity
can often help reduce the volatility of the hedging process.
Example 4.4.9 (Hedging q-Multiple Long ETF Using p-Multiple) Suppose that St
is a diffusion process
Let v be the value of the q-multiple long ETF of $S_t$. Suppose either q > 0, p < q
or q < 0, p > q. Then the process for the hedging shares has been explicitly
calculated as
$$N_t = \frac{q}{p}S_t^{q-p}e^{\left[\frac{q(q-1)}{2} - \frac{p(p-1)}{2}\right]\sigma^2(T-t)}$$
$$H_t = \frac{q-p}{p}v(S_t, t).$$
Note that the closer p is to q, the smoother the cash borrowed process $H_t$, which is
a proxy for the value of the hedging portfolio.
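The formulas for $N_t$ and $H_t$ can be cross-checked numerically. The sketch below assumes geometric dynamics $dS_t = \sigma S_t\,dW_t$ with zero interest rate (the diffusion equation itself is not displayed above, so this is an assumption), under which an m-multiple ETF with terminal value $x^m$ has price $x^m e^{m(m-1)\sigma^2(T-t)/2}$. The hedge ratio is then recovered as $v_x/p_x$ by finite differences and compared with the closed form. All function names are hypothetical.

```python
import math


def etf_value(x, multiple, sigma, tau):
    """x**m * exp(m (m-1) sigma^2 tau / 2); solves v_t + sigma^2 x^2 v_xx / 2 = 0
    with terminal value x**m (assumed zero-rate geometric Brownian motion)."""
    return x ** multiple * math.exp(0.5 * multiple * (multiple - 1.0) * sigma ** 2 * tau)


def hedge_shares(x, q, p, sigma, tau):
    """Closed form N_t = (q/p) x^(q-p) exp([q(q-1)/2 - p(p-1)/2] sigma^2 tau)."""
    rate = 0.5 * (q * (q - 1.0) - p * (p - 1.0)) * sigma ** 2
    return (q / p) * x ** (q - p) * math.exp(rate * tau)


def hedge_shares_numeric(x, q, p, sigma, tau, h=1e-5):
    """N_t = v_x / p_x computed with central finite differences."""
    v_x = (etf_value(x + h, q, sigma, tau) - etf_value(x - h, q, sigma, tau)) / (2 * h)
    p_x = (etf_value(x + h, p, sigma, tau) - etf_value(x - h, p, sigma, tau)) / (2 * h)
    return v_x / p_x


def cash_borrowed(x, q, p, sigma, tau):
    """H_t = p(S_t, t) N_t - v(S_t, t)."""
    return etf_value(x, p, sigma, tau) * hedge_shares(x, q, p, sigma, tau) - etf_value(
        x, q, sigma, tau
    )


if __name__ == "__main__":
    x, q, p, sigma, tau = 1.3, 4.0, 2.0, 0.2, 0.5
    print(hedge_shares(x, q, p, sigma, tau), hedge_shares_numeric(x, q, p, sigma, tau))
    print(cash_borrowed(x, q, p, sigma, tau), (q - p) / p * etf_value(x, q, sigma, tau))
```

With q = 4 and p = 2 the cash borrowed equals the value of the quadruple ETF itself, and it shrinks toward zero as p approaches q, which is the smoothing effect noted above.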
Example 4.4.10 (Normal Kernel) Now consider $S_t$ following a Bachelier model
$S_t = \sigma W_t$ and let $v(S_t, t)$, t ∈ [0, T], be the no arbitrage price of a contingent
claim with payoff $f(x) = e^{-x^2/2}$ at T.
Let us assume that the volatility $\sigma_t^2$ is unknown. We further assume that the market
implies a constant volatility $\sigma_h^2$ which is, say, known to be too high by a certain
trader. Can he take advantage of the situation? Carr and Madan have shown in [11]
that the answer is yes if there is a contingent claim whose no arbitrage price $v(S_t, t)$
is convex in $S_t$.
In this example we show that generalized convexity can help us to derive a similar
volatility trade when $v(S_t, t)$ has certain generalized convexity properties. Let
$p(S_t, t)$ be the no arbitrage price of a hedging claim with $p(\cdot, t)$ strictly monotone.
Let $c_t(x, y) = p(x, t)y$. We assume that $v(\cdot, T)$ is $\Phi_{c_T(1)}$-convex but not necessarily
convex in $S_t$, such as in Examples 4.4.7 and 4.4.8.
Denote again
The left hand is the trading portfolio and the right hand is the P&L. Since the trader
follows the constant volatility σh implied by the market in trading
where $v_{pp} > 0$. We see that the trader can take advantage of the overestimation of
volatility by the market by dynamically trading the portfolio
$$v(S_T, T) - v(S_t, t) - \int_t^T \tilde{v}_p\,dP_t.$$
Comments
© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2018 141
P. Carr, Q. J. Zhu, Convex Duality and Financial Mathematics,
SpringerBriefs in Mathematics, https://doi.org/10.1007/978-3-319-92492-2
Growth optimal portfolio theory [32] and Kelly's criterion [27, 34, 35, 55–57]
as a money management tool in investment in general and in games in particular
are discussed as an illustration of such utility optimization problems. In particular,
following [27, 49, 63] we highlight that optimizing the expected log utility for a
portfolio of cash and a given investment strategy on historical performance data
amounts to measuring the useful information implied by the investment strategy and
can be used as a measure to compare different investment strategies. In practice
the growth optimal portfolio and its special case the Kelly criterion are often too
risky as illustrated in Example 2.2.11. Various fractional Kelly money management
schemes, often ad hoc, were proposed to limit the risk. Recently Vince and Zhu
[60] and Lopez de Prado, Vince and Zhu [33] provided theoretical justification for
such more conservative betting strategies. They use more realistic finite investment
horizon and select betting size based on risk adjusted returns. The analysis involves,
however, nonconvex functions.
Fundamental theorem of asset pricing (FTAP) relates no arbitrage to the existence
of a martingale measure that can be used to price assets in a financial market. Cox,
Ross, and Rubinstein observed such a principle in their classical work related to
option pricing in complete markets [12, 13]. General FTAPs were discussed in
[15, 21, 22, 29] with progressing generality, usually with a proof based on separation
arguments. Dybvig and Ross [17] observed that in an incomplete market the
martingale measures are related to the risk aversion of market agents. In Section 2.3
we approach the FTAP from the perspective of convex duality. We show that in an
incomplete market, a martingale measure is, in fact, a scaling of the dual solution
to a portfolio utility maximization problem. We also illustrate with an example that
this relationship helps us to understand that in an incomplete market, a martingale
measure provides a reference price for a certain agent to improve their utility
rather than arbitrage. In a finite dimensional space, the linear programming duality
approach in Section 2.3.4 (see e.g. [28]) is equivalent to the Kreps–Yan cone
separation theorem which is used by Harrison and Kreps [21], Harrison and Pliska
[22], Delbaen and Schachermayer [15], and many others in their proofs of FTAP in
different settings.
Section 2.4 deals with risk measures, a concept that plays important roles for
both financial institutions and regulatory agencies. Diversification reduces risk
which implies the convexity of risk measures. We focus on coherent risk measures
proposed by Artzner, Delbaen, Eber, and Heath in [2]. Coherent risk measures are
sublinear, a particular type of convex function. Duality is involved in providing a
dual characterization of a coherent risk measure as the conjugate of an indicator
function of a cone, called acceptance cone. Interestingly, the generating set for the
acceptance cone is closely related to the practice of stress tests. Convex duality
also provides several equivalent descriptions of the coherent risk measures in terms
of linear preference and value bounds. Moreover, the same argument is at the core
of the discussion of good deal in financial markets as explained in Jaschke and
Küchler [24]. Besides providing a framework to understand risk measures and their
relationship with other important financial concepts, convex duality methods also
help to amend the widely used nonconvex risk measure value at risk [25] to the convex
conditional value at risk proposed by Rockafellar and Uryasev in [46, 47].
Chapter 3 Sections 3.1–3.3 demonstrate that many of the results in the previous
chapter also persist in the more general setting of a multiperiod economy. We use
the general model laid out in S. Roman’s textbook [48].
Section 3.4 discusses super hedging (and symmetrically subhedging) bounds
in incomplete markets. This is a classical topic in financial mathematics (see
[22, 23, 26]). We emphasize that the super hedging bound of a given contingent
claim is a linear programming problem. Linear programming duality allows us to
view the super hedging bound from two different perspectives. On one hand it is the
supremum of all the prices derived through martingale measures and on the other
hand it can be represented as the cost of the smallest super hedging portfolio. When
the sample space is finite, the super hedging portfolio in the second representation
can be derived by solving a linear programming problem. The linear programming
duality can also be used to analyze narrowing the gap between the super and sub-
hedging bounds by adding contingent claims with known prices. When discussing
contingent claims related to currency spread, incomplete markets may arise from
complete markets. Considering super hedging bounds in this kind of problem, in
general, leads to a Kantorovich mass transportation problem [59]. We illustrate the
solution process with an example on a finite sample space using linear programming
duality.
Section 3.5 discusses a model for financial markets with bid and ask spread. The
main difference with a simplified one price financial market is that the attainable
payoff set due to trading is, in general, a convex cone rather than a subspace.
This leads to the title conic finance as coined by Madan in [36, 37]. Besides a
concise representation of the basic conic finance model, we also discuss new refined
fundamental theorem of asset pricing as well as super and sub-hedging price bounds.
These results are taken from [58] emphasizing the role of convex duality.
Chapter 4 Section 4.1 summarizes facts on continuous models that we need later.
To be concise we are satisfied with a heuristic description of most of the material.
Readers interested in further details may consult [5, 42, 52–54]. The dual Itô formula
is a first taste of the role of duality in continuous models. It develops the generalized
Itô formula using quadratic covariation in [19].
Section 4.2 discusses convexity and generalized convexity that emerge in the Bachelier
[3] and Black–Scholes [6, 40] formulae. The importance of these convexity properties
is highlighted by applying them in the computation of Greeks and by illustrating
that delta hedging is, in fact, the Fenchel–Legendre transform of the pricing formula.
This is the observation in Carr [10] for more general settings and discussed in greater
detail in Section 4.3.
It turns out that if one hedges using a contingent claim rather than the underlying
itself, similar duality still persists in the sense of generalized duality that we discuss
in Section 4.4. The general principles are summarized in Sections 4.4.1 and 4.4.2. A
number of examples are included to illustrate their applications in financial practice.
How to hedge with the popular multiple ETFs of indices is discussed in detail in
Section 4.4.3. Also discussed in this section are examples of generalized
convexity of Leland’s model of stock price as contingent claims of company’s assets
[31] and the general convexity of the normal kernel. The common theme here is
that they all follow from characterizations of the generalized convexity using the
relative risk aversion coefficient and the absolute risk aversion coefficient. Hedging
with derivatives can help to reduce the risk and to expand the range of volatility
trading which is proposed in [11]. These are discussed in Sections 4.4.4 and 4.4.5,
respectively. Much of the material regarding these duality and generalized duality
relationships appear here for the first time. We believe that this is an area that is
worthy of further attention.
In addition, survey papers [14, 43, 61, 64] have also been valuable references.
References
1. Arrow, K.J.: Aspects of the Theory of Risk Bearing. The Theory of Risk Aversion. Yrjo
Jahnssonin Saatio, Helsinki (1965). Reprinted In: Essays in the Theory of Risk Bearing, pp.
90–109. Markham, Chicago (1971)
2. Artzner, P., Delbaen, F., Eber, J.-M., Heath, D.: Coherent measures of risk. Math. Financ. 9,
203–228 (1999)
3. Bachelier, L.: Théorie de la spéculation. Ann. Sci. Éc. Norm. Supér. 3(17), 21–86 (1900)
4. Bernoulli, D.: Exposition of a new theory on the measurement of risk. Econometrica 22, 23–36
(1954/1738)
5. Bjork, T.: Arbitrage Theory in Continuous Time. Oxford University Press, New York (2009)
6. Black, F., Scholes, M.: The pricing of options and corporate liabilities. J. Polit. Econ. 81,
637–645 (1973)
7. Borwein, J.M., Lewis, A.S.: Convex Analysis and Nonlinear Optimization. Springer, New York
(2000). Second edition (2005)
8. Borwein, J.M., Zhu, Q.J.: Techniques of Variational Analysis. Springer, New York (2005)
9. Borwein, J.M., Zhu, Q.J.: A variational approach to Lagrange multipliers. J. Optim. Theory
Appl. 171, 727–756 (2016). https://doi.org/10.1007/s10957-015-0756-2
10. Carr, P.: Option as Optimization: A Dual Approach to Derivatives Pricing. Quant USA, New
York (2014)
11. Carr, P., Madan, D.: Toward a theory of volatility trading. In: Jarrow, R. (ed.) Volatility
Estimation Techniques for Pricing Derivatives, pp. 417–427. Risk Books, London (1998)
12. Cox, J., Ross, S.: The valuation of options for alternative stochastic processes. J. Financ. Econ.
3, 144–166 (1976)
13. Cox, J., Ross, S., Rubinstein, M.: Option pricing: a simplified approach. J. Financ. Econ. 7,
229–263 (1979)
14. Dahl, K.R.: Convex duality and mathematical finance. Thesis for M.Sci., University of Oslo
(2012)
15. Delbaen, F., Schachermayer, W.: A general version of the fundamental theorem of asset pricing.
Math. Ann. 300, 463–520 (1994)
16. Dolecki, S., Kurcyusz, S.: On Φ-convexity in extremal problems. SIAM J. Control Optim.
16, 277–300 (1978)
17. Dybvig, P., Ross, S.A.: Arbitrage, state prices and portfolio theory. In: Handbook of the
Economics of Finance. North-Holland, Amsterdam (2003)
18. Fenchel, W.: Convex Cones, Sets and Functions. Lecture Notes. Princeton University,
Princeton (1951)
© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2018
P. Carr, Q. J. Zhu, Convex Duality and Financial Mathematics,
SpringerBriefs in Mathematics, https://doi.org/10.1007/978-3-319-92492-2
19. Föllmer, H., Protter, P., Shiryaev, A.N.: Quadratic covariation and an extension of Itô’s formula.
Bernoulli 1, 149–169 (1995)
20. Gale, D.: A geometric duality theorem with economic applications. Rev. Econ. Stud. 34, 19–24
(1967)
21. Harrison, J.M., Kreps, D.M.: Martingales and arbitrage in multiperiod securities markets. J.
Econ. Theory 20, 381–408 (1979)
22. Harrison, J.M., Pliska, S.: Martingales and stochastic integrals in the theory of continuous
trading. Stoch. Process. Appl. 11, 215–260 (1981)
23. Jacka, S.D.: A martingale representation result and an application to incomplete financial
markets. Math. Financ. 2, 239–250 (1992)
24. Jaschke, S., Küchler, U.: Coherent risk measures and good-deal bounds. Financ. Stochast. 5,
181–200 (2001)
25. Jorion, P.: Value at Risk. McGraw-Hill, New York (1997)
26. Kahalé, N.: Sparse calibrations of contingent claims. Math. Financ. 20, 105–115 (2010)
27. Kelly, J.L.: A new interpretation of information rate. Bell Syst. Tech. J. 35, 917–926 (1956)
28. King, A.J.: Duality and martingales: a stochastic programming perspective on contingent
claims. Math. Program. Ser. B 91, 543–562 (2002)
29. Kramkov, D., Schachermayer, W.: The asymptotic elasticity of utility functions and optimal
investment in incomplete markets. Ann. Appl. Probab. 9, 904–950 (1999)
30. Kutateladze, S.S., Rubinov, A.M.: Minkowski duality and its applications. Russ. Math. Surv.
27, 137–192 (1972)
31. Leland, H.: Corporate debt value, bond covenants, and optimal capital structure. J. Financ.
49(4), 1213–1252 (1994)
32. Lintner, J.: The valuation of risk assets and the selection of risky investments in stock portfolios
and capital budgets. Rev. Econ. Stat. 47, 13–37 (1965)
33. Lopez de Prado, M., Vince, R., Zhu, Q.J.: Optimal Risk Budgeting Under a Finite Investment
Horizon. SSRN 2364092 (2013)
34. Maclean, L.C., Thorp, E.O., Ziemba, W.T.: Good and bad properties of the Kelly criterion.
In: Maclean, L.C., Thorp, E.O., Ziemba, W.T. (eds.) The Kelly Capital Growth Investment
Criterion, Theory and Practice, pp. 563–574. World Scientific, Singapore (2010)
35. Maclean, L.C., Thorp, E.O., Ziemba, W.T. (eds.): The Kelly Capital Growth Investment
Criterion, Theory and Practice. World Scientific Handbook in Financial Economics Series,
vol. 3. World Scientific, Singapore (2011)
36. Madan, D.: Asset pricing theory for two price economies. Ann. Financ. 11, 1–35 (2014)
37. Madan, D., Schoutens, W.: Applied Conic Finance. Cambridge University Press, Cambridge
(2016)
38. Markowitz, H.: Portfolio Selection. Cowles Monograph, vol. 16. Wiley, New York (1959)
39. Martinez-Legaz, J.E.: Generalized Convex Duality and Its Economic Applications. Pontificia
Universidad Catolica del Peru (2002)
40. Merton, R.: Theory of rational option pricing. Bell J. Econ. Manag. Sci. 4, 141–183 (1973)
41. Moreau, J.J.: Fonctionnelles Convexes. Lecture Notes. College de France, Paris (1967)
42. Oksendal, B.: Stochastic Differential Equations, 6th edn. Springer, New York (2003)
43. Pennanen, T.: Convex duality in stochastic optimization and mathematical finance. Math. Oper.
Res. 36, 340–362 (2011)
44. Pratt, J.W.: Risk aversion in the small and in the large. Econometrica 32(1–2), 122–136 (1964)
45. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
46. Rockafellar, R.T., Uryasev, S.: Optimization of conditional value at risk. J. Risk 2, 21–41
(2000)
47. Rockafellar, R.T., Uryasev, S.: Conditional value-at-risk for general loss distributions. J. Bank.
Financ. 26, 1443–1471 (2002)
48. Roman, S.: Introduction to the Mathematics of Finance. Springer, New York (2004)
49. Shannon, C., Weaver, W.: The Mathematical Theory of Communication. University of Illinois
Press, Urbana (1949)
50. Sharpe, W.F.: Capital asset prices: a theory of market equilibrium under conditions of risk. J.
Finance 19, 425–442 (1964)
51. Sharpe, W.F.: Mutual fund performance. J. Bus. 39, 119–138 (1966)
52. Shreve, S.E.: Stochastic Calculus for Finance I. Springer, New York (2004)
53. Shreve, S.E.: Stochastic Calculus for Finance II. Springer, New York (2004)
54. Steele, J.M.: Stochastic Calculus and Financial Applications. Springer, New York (2001)
55. Thorp, E.O.: Beat the Dealer. Random House, New York (1962)
56. Thorp, E.O.: Portfolio choice and the Kelly criterion. In: Proceedings of the Business and
Economic Statistics, pp. 215–224. American Statistical Association, Washington (1971)
57. Thorp, E.O., Kassouf, S.T.: Beat the Market. Random House, New York (1967)
58. Vazifedan, M., Zhu, Q.J.: No Arbitrage Principle in Conic Finance. Working Paper (2018)
59. Villani, C.: Topics in Optimal Transportation. Graduate Studies in Mathematics, vol. 58.
American Mathematical Society, Providence (2003)
60. Vince, R., Zhu, Q.J.: Optimal betting sizes for the game of blackjack. Risk J. Portf. Manag. 4,
53–75 (2015)
61. Xia, J., Yan, J.A.: Convex duality theory for optimal investment (2006). Preprint
62. Zǎlinescu, C.: On duality gaps in linear conic problems. School of Industrial and Systems
Engineering, Georgia Institute of Technology, Atlanta, GA (2010). Preprint.
www.optimization-online.org/DB_HTML/2010/09/2737.html
63. Zhu, Q.J.: Mathematical analysis of investment systems. J. Math. Anal. Appl. 326, 708–720
(2007)
64. Zhu, Q.J.: Convex analysis in mathematical finance. Nonlinear Anal. Theory Methods Appl.
75, 1719–1736 (2012)
Index
Symbols
(Ω, F, P), 35
CVaR, 79
I, 13
K+, 3
RV(Ω, F, P), 35
S, 36
VaR, 78
[A, B], 13
Λ, 10
Θ, 36
χA, 102
Θ̂, 36
ιC, 1, 13
⟨·, ·⟩, 35
epi f, 1
int, 5
∂, 5
ρs, 77
σ-algebra, 35
σC, 2, 13
dC, 4
dd, 77
f∗, 12
f−1, 1
port[S], 57
ts[S], 85
E, 35
E[X], 2

A
acceptance cone, 71
arbitrage, 54, 57
  trading strategy, 101

B
Bachelier formula, 116, 117
  convexity, 119
beta, 44
biconjugate, 13, 14
Black–Scholes formula, 116, 118
  as Fenchel–Legendre transform, 120
  convexity, 120
  delta hedging, 124
  dual, 124
  generalized convexity, 121
  time reversal, 126
Blackjack, 53
Boltzmann–Shannon entropy, 20
box algebra, 110
Brownian motion, 108

C
capital asset pricing model, 39, 40, 43
capital market line, 41, 42
capital market portfolio, 42
cash stream, 99
  implementable, 100
  super implementable, 101
chain rule, 8
coherent
  acceptance cone, 71
  partial order, 73
  preference, 73
  price, 74
  risk measure, 68
coherent partial order, 73
coherent preference, 73
information structure, 85
interior of the domain, 5
Itô formula, 110
  basic form, 110
  dual, 114
  graphic illustration, 110
  multidimensional, 113
Itô process, 111

J
Jensen’s inequality, 2

K
Kelly criterion, 52

L
Lagrange multiplier, 4, 9, 15, 94
leverage, 51
linear programming, 6
log return function, 51
long, 134

M
market
  complete, 66, 92
  incomplete, 66, 92
Markowitz
  bullet, 39
  frontier, 38
  portfolio, 35, 36
martingale, 109
  representation, 114
martingale measure, 58
  unique, 66, 92

N
necessary optimality condition, 9
norm
  portfolio, 85
  trading strategy, 85
normal cone, 8
  and subgradients, 8
  to intersection, 8
Novikov’s condition, 116

O
optimal leverage, 51, 52
optimal value function, 4
order-reversing, 12

P
partial order, 3, 73
  coherent, 73
payoff, 61
polar cone, 3
polyhedral
  function, 6, 23
  set, 6
portfolio, 36, 85
  equivalent, 56
  growth optimal, 50
  Markowitz, 38
  minimum risk, 40
  space, 57
price operator, 74
  consistent, 74, 102
  normalized, 74, 105
Pshenichnii–Rockafellar condition, 9

R
relative interior, 6
return, 36
risk
  aversion, 47
  coefficient (absolute), 48
  coefficient (relative), 48
risk free asset, 36
risk measure, 68
  coherent, 68
  conditional value at risk, 79
  drawdown, 77
  dual representation, 69, 70
  standard deviation, 77
  value at risk, 78
risky assets, 36
rule
  Fermat, 9

S
sandwich theorem, 7
Sharpe ratio, 45
short, 134
span of the domain, 6
stochastic processes, 107
subdifferential, 5
  calculus, 8
  chain rule, 8
  generalized, 30
  nonemptiness, 5
  sum rule, 8, 9
subgradient, 5
sum rule, 8
T
trading strategy, 85, 100, 101
  admissible, 86, 87
  arbitrage, 86, 95, 101
  leverage level, 86
  norm, 86
  self-financing, 86
two fund separation theorem, 42
two fund theorem, 39

U
utility optimization, 58

V
valuation bounds, 73
  coherent, 73