Convex Duality and Financial Mathematics

This document provides an introduction to convex duality and its applications in financial mathematics. It summarizes a book that uses a simple one-period financial market model to showcase how convex duality arises in important problems like Markowitz portfolio theory, capital asset pricing model, utility maximization, and coherent risk measures. It then expands the discussion to a more general multiperiod model and discusses additional topics like superhedging and conic finance. The goal is to provide graduate students and researchers an accessible introduction to the growing field of convex duality in financial problems.

SPRINGER BRIEFS IN MATHEMATICS

Peter Carr · Qiji Jim Zhu

Convex Duality
and Financial
Mathematics

SpringerBriefs in Mathematics

Series Editors
Nicola Bellomo
Michele Benzi
Palle Jorgensen
Tatsien Li
Roderick Melnik
Otmar Scherzer
Benjamin Steinberg
Lothar Reichel
Yuri Tschinkel
George Yin
Ping Zhang

SpringerBriefs in Mathematics showcases expositions in all areas of mathematics
and applied mathematics. Manuscripts presenting new results or a single new result
in a classical field, new field, or an emerging topic, applications, or bridges between
new results and already published works, are encouraged. The series is intended for
mathematicians and applied mathematicians.

More information about this series at http://www.springer.com/series/10030

Peter Carr
Department of Finance and Risk Engineering
Tandon School of Engineering
New York University
New York, NY, USA

Qiji Jim Zhu
Department of Mathematics
Western Michigan University
Kalamazoo, MI, USA

ISSN 2191-8198 ISSN 2191-8201 (electronic)

SpringerBriefs in Mathematics
ISBN 978-3-319-92491-5 ISBN 978-3-319-92492-2 (eBook)
https://doi.org/10.1007/978-3-319-92492-2

Library of Congress Control Number: 2018946786

Mathematics Subject Classification: 26B25, 49N15, 52A41, 60J60, 90C25, 91B16, 91B25, 91B26,
91B30, 91G10, 91G20

© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2018

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, express or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

To Carol and Olivia
To Lilly and Charles.
And in memory of Jonathan Borwein
(1951–2016) with respect.

Preface

Convex duality plays an essential role in many important financial problems.
For example, it arises both in the minimization of convex risk measures and in
the maximization of concave utility functions. Together with generalized convex
duality, it also appears when an optimization is not immediately apparent, for
instance in implementing dynamic hedging of contingent claims. Recognizing the
role of convex duality in financial problems is crucial for several reasons. First,
considering the primal and dual problem together gives the financial modeler the
option to tackle the more accessible problem first. Usually, knowledge of the
solution of one helps in solving the other. Moreover, the solution to the dual problem
can usually be given a financial interpretation. As a result, the dual problem often
illuminates an alternative perspective, which is not easily achieved by examining the
primal problem in isolation. When flipping from the primal to the dual, a surprising
insight typically awaits, irrespective of past experience. Finally, as an added benefit,
the primal and the dual can often be paired together to provide better numerical
solutions than when either side is considered in isolation.
The goal of this book is to provide a concise introduction to this growing research
field. Our target audience is graduate students and researchers in related areas. We
begin in Chapter 1 with a quick introduction of convex duality and related tools.
We emphasize the relationship between convex duality and the Lagrange multiplier
rule for constrained optimization problems. We then give a quick overview of the
intrinsic duality relationship in several diverse financial problems.
In Chapter 2, we consider the simplest possible financial market model. In
particular, we consider a one-period economy with a finite number of possible states.
Using this simple financial market model, we showcase convex duality in a number
of important financial problems. We begin with the Markowitz portfolio theory,
which involves a particularly simple convex programming problem: optimizing a
quadratic function with linear constraints. Duality plays two important roles in
Markowitz portfolio theory. First, while the primal problem may involve hundreds
or even thousands of variables representing the risky assets potentially included in
the portfolio, the dual problem has only two variables related to the two constraints
on the initial endowment and the expected return. In fact, the key observation of
Markowitz is that one can evaluate the performance of a portfolio in the dual space
using the variance-expected return pair. Second, the duality relationship between the
primal Markowitz portfolio problem and its dual helps us to understand that the set
of optimal portfolios is an affine set, which leads to the important two-fund theorem.
The core methodology of optimizing a quadratic function with linear constraints was
also used in the capital asset pricing model, which leads to the widely used Sharpe
ratio. Duality also plays a crucial role in this problem.
Next, we consider portfolio optimization from the perspective of maximizing
expected utility. There has been a very long history of using utility functions in
economics. In financial problems, utility functions are increasing concave functions
of wealth. The concavity of the utility function captures the risk aversion of an
investor. Arrow and Pratt introduced widely used measures of the level of risk
aversion. It turns out that there is a precise way of using generalized convexity to
characterize Pratt–Arrow risk aversion. This application illustrates the relevance of
generalized convexity in dealing with financial problems. It is even more interesting
to consider the dual of the expected utility maximization problem. It turns out
that in the absence of arbitrage, solutions to the dual problem are in essence
the equivalent martingale measures (also called risk-neutral probabilities), which
are widely used in pricing financial derivatives. Considering the expected utility
maximization problem along with its dual leads us to rediscover the fundamental
theorem of asset pricing. An added benefit of this alternative approach is that
martingale measures can be related to the risk aversion of agents in the market.
The last application that we cover in Chapter 2 concerns the dual representation
of coherent risk measures. Coherent risk measures are motivated by the common
regulatory practice of assigning to each position in a risky asset an appropriate
amount of cash reserves. Hence, they are widely used to analyze risks. Mathematically,
a coherent risk measure is characterized by a sublinear function: a convex
function with positive homogeneity. It is well known that the dual of a sublinear
function is an indicator function. Thus, using dual representation, a coherent risk
measure is just the support function of a closed convex set. Financially, we can view
the generating set of a coherent risk measure as the probabilities assigned to risky
scenarios in a stress test. Duality also generates numerical methods for calculating
some important coherent risk measures such as the conditional value at risk.
We expand our discussion to a more general multiperiod financial market model
in Chapter 3. This more general setting allows us to model dynamic trading. The
added complexity in dealing with a multiperiod model mainly involves capturing
the increase in information using an information structure. After laying out the
multiperiod financial market model, we show that the fundamental theorem of asset
pricing also arises in a multiperiod financial market model. After that we also discuss
two new topics: super- (sub-) hedging and conic finance. In general, the absence
of arbitrage leads to multiple (usually infinitely many) pricing martingale measures
in an incomplete market. Thus, the no arbitrage principle usually determines a price
range for a contingent claim with upper and lower bounds, which are given by the
supremum and the infimum of the expectation of the payoff under the martingale
measures, respectively. If a market price falls outside of these bounds, then an
arbitrage opportunity occurs. It turns out that the dual solution to the optimization
problem of finding the upper or lower no arbitrage bounds provides a trading
strategy that one can use to take advantage of such an arbitrage opportunity. Conic
finance is used to describe financial markets for which the absolute value of the
price depends on whether one is buying or selling. In other words, conic finance
describes realistic financial markets with a strictly positive bid-ask spread. In such
a model, the cash flows that can be achieved from implementing acceptable trading
strategies form a convex cone. This observation provides the rationale for the name
conic finance. Despite the added complication of dealing with a conic constraint,
we show that most of the duality relationships that are observed under zero bid–ask
spread still prevail when the spread is positive.
We then move to continuous-time financial models in Chapter 4. The most
noteworthy duality relationship developed in this chapter is the observation that the
classical Black-Scholes formula for pricing a contingent claim with a convex payoff
is, in fact, a Fenchel-Legendre transform. We show that the function describing
cash borrowings while delta hedging a short position in a contingent claim is just
the Fenchel conjugate of the contingent claim pricing function. The flip side is that
the contingent claim pricing function can itself be viewed as a Fenchel conjugate
of the function describing these cash borrowings. This provides a new perspective
on the convex function linking the price of the contingent claim to the underlying
spot price. With the availability of many tradable contingent claims such as those
embedded in ETFs, the ability to dynamically hedge a contingent claim with other
contingent claims is increasingly becoming a financial reality. Interestingly, when
using contingent claims as hedging instruments, one discovers a similar duality
relationship between the contingent claim pricing function and the cash borrowings
function in terms of generalized convexity. Many useful applications are also
discussed in this chapter. We examine the convexity and generalized convexity of
the Bachelier and Black-Scholes option pricing formulae with respect to volatility
as well. Generalizations of these properties might be useful in dealing with financial
products related to volatility and be a potentially fruitful future research direction.
The material in this book grew out of slides used to teach a joint doctoral
seminar at New York University’s Courant Institute in the fall of 2015. Part of the
material has also been used previously for graduate topics courses on optimization
and modeling at Western Michigan University. We thank our colleagues at both
NYU and WMU for providing us with supportive research environments. Professor
Robert Kohn helped to arrange us becoming neighbors, which facilitated our
collaboration in no small part. Conversations with Professors Marco Avellaneda,
Jonathan Goodman, and Fang-Hua Lin have been most helpful. We are also indebted
to the participants of these courses for many stimulating discussions. In particular,
we thank Monty Essid, Tom Li, Matthew Foreman, Sanjay Karanth, Jay Treiman,
Mehdi Vazifadan, and Guolin Yu whose detailed comments on various parts of our
lecture notes have been incorporated into the text.

New York, NY, USA Peter Carr

Kalamazoo, MI, USA Qiji Jim Zhu
April, 2017
Contents

1 Convex Duality
1.1 Convex Sets and Functions
1.1.1 Definitions
1.1.2 Convex Programming
1.2 Subdifferential and Lagrange Multiplier
1.2.1 Definition
1.2.2 Nonemptiness of Subdifferential
1.2.3 Calculus
1.2.4 Role in Convex Programming
1.3 Fenchel Conjugate
1.3.1 The Fenchel Conjugate
1.3.2 The Fenchel–Young Inequality
1.3.3 Graphic Illustration and Generalizations
1.4 Convex Duality Theory
1.4.1 Rockafellar Duality
1.4.2 Fenchel Duality
1.4.3 Lagrange Duality
1.4.4 Generalized Fenchel–Young Inequality
1.5 Generalized Convexity, Conjugacy and Duality

2 Financial Models in One Period Economy
2.1 Portfolio
2.1.1 Markowitz Portfolio
2.1.2 Capital Asset Pricing Model
2.1.3 Sharpe Ratio
2.2 Utility Functions
2.2.1 Utility Functions
2.2.2 Measuring Risk Aversion
2.2.3 Growth Optimal Portfolio Theory
2.2.4 Efficiency Index
2.3 Fundamental Theorem of Asset Pricing
2.3.1 Fundamental Theorem of Asset Pricing
2.3.2 Pricing Contingent Claims
2.3.3 Complete Market
2.3.4 Use Linear Programming Duality
2.4 Risk Measures
2.4.1 Coherent Risk Measure
2.4.2 Equivalent Characterization of Coherent Risk Measures
2.4.3 Good Deal
2.4.4 Several Commonly Used Risk Measures

3 Finite Period Financial Models
3.1 The Model
3.1.1 An Example
3.1.2 A General Model
3.2 Arbitrage and Admissible Trading Strategies
3.3 Fundamental Theorem of Asset Pricing
3.3.1 Fundamental Theorem of Asset Pricing
3.3.2 Relationship Between Dual of Portfolio Utility Maximization, Lagrange Multiplier and Martingale Measure
3.3.3 Pricing Contingent Claims
3.3.4 Complete Market
3.4 Hedging and Super Hedging
3.4.1 Super- and Sub-hedging Bounds
3.4.2 Towards a Complete Market
3.4.3 Incomplete Market Arise from Complete Markets
3.5 Conic Finance
3.5.1 Modeling Financial Markets with an Ask-Bid Spread
3.5.2 Characterization of No Arbitrage by Utility Optimization
3.5.3 Dual Characterization of No Arbitrage
3.5.4 Pricing and Hedging

4 Continuous Financial Models
4.1 Continuous Stochastic Processes
4.1.1 Brownian Motion and Martingale
4.1.2 The Itô Formula
4.1.3 Girsanov Theorem
4.2 Bachelier and Black–Scholes Formulae
4.2.1 Pricing Contingent Claims
4.2.2 Convexity
4.2.3 Duality
4.3 Duality and Delta Hedging
4.3.1 Delta Hedging
4.3.2 Duality
4.3.3 Time Reversal
4.4 Generalized Duality and Hedging with Contingent Claims
4.4.1 Preservation of Generalized Convexity in the Value Function of a Contingent Claim
4.4.2 Determining the Hedging Process
4.4.3 Hedging with p-Multiple ETF
4.4.4 Reducing the Volatility of the Hedging Process
4.4.5 The Volatility Trade

Comments
References
Index

Chapter 1
Convex Duality

Abstract We present a concise description of convex duality theory in this
chapter. The goal is to lay a foundation for later applications to various financial
problems rather than to be comprehensive. We emphasize the role of the subdifferential
of the value function of a convex programming problem. It is both the set of
Lagrange multipliers and the set of solutions to the dual problem. These relationships
are very convenient in financial applications. We also discuss generalized
convexity, conjugacy, and duality.

1.1 Convex Sets and Functions

1.1.1 Definitions

Definition 1.1.1 (Convex Sets and Functions) Let X be a Banach space. We say
that a subset C of X is a convex set if, for any x, y ∈ C and any λ ∈ [0, 1],
λx + (1 − λ)y ∈ C. We say an extended-valued function f : X → R ∪ {+∞} is a
convex function if its domain, dom f := {x ∈ X | f (x) < ∞}, is convex and for
any x, y ∈ dom f and any λ ∈ [0, 1], one has

f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y).

We call f : X → [−∞, +∞) a concave function if −f is convex.

In some sense convex functions are the simplest functions next to linear
functions. Convex sets and functions are intrinsically related. For example, it is
easy to verify that C is a convex set if and only if its indicator function $\iota_C$, defined by
$\iota_C(x) := 0$ if $x \in C$ and $\iota_C(x) := +\infty$ otherwise, is a convex function. On the other
hand, if f is a convex function, then the epigraph of f, $\operatorname{epi} f := \{(x, r) \mid f(x) \le r\}$,
and the sets $f^{-1}((-\infty, a]) := \{x \mid f(x) \in (-\infty, a]\}$, $a \in \mathbb{R}$, are convex sets. In fact,
we can check that the convexity of epi f characterizes that of f. This geometric
characterization is very useful in many situations. For instance, it is easy to see that
the intersection of a class of convex sets is convex. Now let $\{f_\alpha\}$ be a class of convex
functions. We can see that

$$\operatorname{epi} \sup_\alpha f_\alpha = \bigcap_\alpha \operatorname{epi} f_\alpha$$

and, thus, $\sup_\alpha f_\alpha$ is convex. In particular, the support function of a set $C \subset X$,
defined on the dual space $X^*$ by

$$\sigma_C(x^*) = \sigma(C; x^*) := \sup\{\langle x, x^* \rangle \mid x \in C\}, \tag{1.1.1}$$

is always convex. Note that allowing the extended value +∞ in the definition of
convex function is important in establishing those relations.
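As a numerical illustration of the support function, the following minimal sketch takes X = R^2 and a small finite set C, and spot-checks the convexity inequality for σ_C at random pairs of dual vectors. The helper names `support` and `convexity_gap` and the sample points are assumptions made for this illustration; they do not come from the text.

```python
import random

def support(points, x_star):
    """Support function of a finite set C in R^2: max of <x, x*> over x in C."""
    return max(x[0] * x_star[0] + x[1] * x_star[1] for x in points)

def convexity_gap(points, u, v, lam):
    """lam*sigma(u) + (1-lam)*sigma(v) - sigma(lam*u + (1-lam)*v).

    This quantity is nonnegative whenever sigma is convex.
    """
    mix = (lam * u[0] + (1 - lam) * v[0], lam * u[1] + (1 - lam) * v[1])
    return (lam * support(points, u)
            + (1 - lam) * support(points, v)
            - support(points, mix))

# Illustrative data only: C itself is finite and hence not convex,
# yet its support function is still a supremum of linear functions.
C = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0), (0.3, 0.3)]

rng = random.Random(0)  # fixed seed so the check is reproducible

def random_dual_vector():
    return (rng.uniform(-5, 5), rng.uniform(-5, 5))

worst_gap = min(
    convexity_gap(C, random_dual_vector(), random_dual_vector(), rng.random())
    for _ in range(1000)
)
# A small tolerance absorbs floating-point rounding.
print(worst_gap >= -1e-9)
```

A random spot-check of this kind is not a proof; it only shows the inequality holding on the sampled pairs, in line with the epigraph argument above.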
An important property of convex functions related to applications in economics
and finance is the Jensen inequality.
Proposition 1.1.2 (Jensen’s Inequality) Let f be a convex function. Then, for any
random variable X on a finite probability space,

f (E[X]) ≤ E[f (X)],

where E[X] stands for the expectation of X.

When X has only finitely many states this result follows directly from the definition. The
general result can be proven by approximation.
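For a random variable with finitely many states, Jensen's inequality can be checked directly. The sketch below does so for a hypothetical four-state distribution; the states, probabilities, and helper names are assumptions chosen for illustration. It also confirms that an affine function gives equality up to rounding.

```python
import math

def expectation(values, probs):
    """E[X] for a random variable taking values[i] with probability probs[i]."""
    return sum(p * v for v, p in zip(values, probs))

def jensen_gap(f, values, probs):
    """E[f(X)] - f(E[X]); nonnegative when f is convex."""
    return expectation([f(v) for v in values], probs) - f(expectation(values, probs))

# Hypothetical finite probability space (the probabilities sum to 1).
states = [-2.0, 0.5, 1.0, 4.0]
probs = [0.1, 0.4, 0.3, 0.2]

gap_exp = jensen_gap(math.exp, states, probs)                   # convex f
gap_square = jensen_gap(lambda x: x * x, states, probs)         # equals Var[X]
gap_linear = jensen_gap(lambda x: 3.0 * x + 1.0, states, probs) # affine f

print(gap_exp, gap_square, gap_linear)
```

For f(x) = x^2 the gap is exactly the variance of X. For a concave utility function the inequality reverses, which is one way to read risk aversion: the utility of the expected wealth is at least the expected utility.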
A special kind of convex set, the convex cone, is very useful.
Definition 1.1.3 Let X be a finite dimensional Banach space. We say K ⊂ X is a
convex cone if for any x, y ∈ K and any α, β ≥ 0, αx + βy ∈ K. Moreover, we say
K is pointed if K ∩ (−K) = {0}.
A pointed convex cone K induces a partial order ≤K by defining x ≤K y if and
only if y−x ∈ K. We can easily check that ≤K is reflexive (x ≤K x), antisymmetric
(x ≤K y and y ≤K x implies x = y), and transitive (x ≤K y and y ≤K z implies
that x ≤K z). The definition of convexity can easily be extended to mappings whose
image space has such a partial order.
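The role of pointedness can be seen in a small experiment. The sketch below compares the nonnegative orthant of R^2, which is a pointed convex cone, with a closed half-plane, which is a convex cone but not pointed, and tests antisymmetry and transitivity of the induced relation on a small integer grid. The two cones, the grid, and all function names are assumptions made for illustration.

```python
from itertools import product

def in_orthant(v):
    """Membership in K = R^2_+ (pointed: K ∩ (-K) = {0})."""
    return v[0] >= 0 and v[1] >= 0

def in_half_plane(v):
    """Membership in K = {v : v[0] >= 0} (not pointed: K ∩ (-K) is a line)."""
    return v[0] >= 0

def leq(cone, x, y):
    """x <=_K y if and only if y - x belongs to K."""
    return cone((y[0] - x[0], y[1] - x[1]))

# Small integer grid used as a finite test set.
grid = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]

def is_reflexive(cone):
    return all(leq(cone, x, x) for x in grid)

def is_antisymmetric(cone):
    return all(x == y
               for x, y in product(grid, grid)
               if leq(cone, x, y) and leq(cone, y, x))

def is_transitive(cone):
    return all(leq(cone, x, z)
               for x, y, z in product(grid, grid, grid)
               if leq(cone, x, y) and leq(cone, y, z))

for name, cone in (("orthant", in_orthant), ("half-plane", in_half_plane)):
    print(name, is_reflexive(cone), is_antisymmetric(cone), is_transitive(cone))
```

On the half-plane, the points (0, 0) and (0, 1) dominate each other without being equal, so antisymmetry fails. Reflexivity and transitivity only need the cone to contain 0 and be closed under addition.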
Definition 1.1.4 (Convex Mappings) Let X and Y be two Banach spaces. Assume
that Y has a partial order ≤K generated by the pointed convex cone K ⊂ Y . We say
that a mapping f : X → Y is K-convex provided that, for any x, y ∈ dom f and
any λ ∈ [0, 1], one has

f (λx + (1 − λ)y) ≤K λf (x) + (1 − λ)f (y).

1.1.2 Convex Programming

We will often encounter various forms of the general convex programming problems
below in financial applications in subsequent chapters. Let X, Y , and Z be finite
dimensional Banach spaces. Assume that Y has a partial order ≤K generated by the
pointed convex cone K. We will use X∗ , Y ∗ , and Z ∗ to denote the dual spaces of
X, Y , and Z, respectively, and denote the polar cone of K by

$$K^+ := \{y^* \in Y^* : \langle y^*, y \rangle \ge 0 \text{ for all } y \in K\}.$$

Consider the following class of constrained optimization problems

$$
\begin{aligned}
P(y, z) \qquad & \text{Minimize} && f(x) \\
& \text{Subject to} && g(x) \le_K y, \\
& && h(x) = z, \\
& && x \in C,
\end{aligned}
\tag{1.1.2}
$$

where C is a closed set, f : X → R is lower semicontinuous, g : X → Y is lower
semicontinuous with respect to $\le_K$, and h : X → Z is continuous. We will use
v(y, z) to represent the optimal value function

$$v(y, z) := \inf\{f(x) : g(x) \le_K y,\ h(x) = z,\ x \in C\},$$

which may take the values ±∞ (in the infeasible or unbounded-below cases), and S(y, z)
to denote the (possibly empty) solution set of problem P(y, z).
A concrete example is

$$
\begin{aligned}
& \text{Minimize} && f(x) \\
& \text{Subject to} && g_m(x) \le y_m, \quad m = 1, 2, \ldots, M, \\
& && h_l(x) = z_l, \quad l = 1, 2, \ldots, L, \\
& && x \in C \subset \mathbb{R}^N,
\end{aligned}
\tag{1.1.3}
$$

where C is a closed subset, $f, g_m : \mathbb{R}^N \to \mathbb{R}$ are lower semicontinuous, and
$h_l : \mathbb{R}^N \to \mathbb{R}$ are continuous. Defining the vector-valued functions $g = (g_1, g_2, \ldots, g_M)$ and
$h = (h_1, h_2, \ldots, h_L)$, problem (1.1.3) becomes problem (1.1.2) with $\le_K = \le_{\mathbb{R}^M_+}$,
where $\mathbb{R}^M_+ := \{x \in \mathbb{R}^M \mid x \ge 0\}$ is the positive orthant in $\mathbb{R}^M$. Besides Euclidean
spaces, for applications in this book we will often need to consider the Banach space
of random variables.
It turns out that the optimal value function of a convex programming problem is
convex.
Proposition 1.1.5 (Convexity of Optimal Value Function) Suppose that in the
constrained optimization problem (1.1.2), the function f is convex, the mapping g is
K-convex, the mapping h is affine, and the set C is convex. Then the optimal value function
v is convex.
Proof Consider $(y^i, z^i)$, $i = 1, 2$, in the domain of v and an arbitrary ε > 0. We can
find $x^i_\varepsilon$ feasible for problem $P(y^i, z^i)$ such that

$$f(x^i_\varepsilon) < v(y^i, z^i) + \varepsilon, \quad i = 1, 2. \tag{1.1.4}$$

Now for any λ ∈ [0, 1], we have

$$
\begin{aligned}
f(\lambda x^1_\varepsilon + (1 - \lambda) x^2_\varepsilon) &\le \lambda f(x^1_\varepsilon) + (1 - \lambda) f(x^2_\varepsilon) \\
&< \lambda v(y^1, z^1) + (1 - \lambda) v(y^2, z^2) + \varepsilon.
\end{aligned}
\tag{1.1.5}
$$

It is easy to check that $\lambda x^1_\varepsilon + (1 - \lambda) x^2_\varepsilon$ is feasible for problem
$P(\lambda(y^1, z^1) + (1 - \lambda)(y^2, z^2))$. Thus,
$v(\lambda(y^1, z^1) + (1 - \lambda)(y^2, z^2)) \le f(\lambda x^1_\varepsilon + (1 - \lambda) x^2_\varepsilon)$. Combining
with inequality (1.1.5) and letting ε → 0 we arrive at

$$v(\lambda(y^1, z^1) + (1 - \lambda)(y^2, z^2)) \le \lambda v(y^1, z^1) + (1 - \lambda) v(y^2, z^2),$$

that is to say v is convex.

This is a very potent result that can help us to recognize the convexity of many
other functions. For example, let C be a convex set. Then $d_C$, the distance function
to C defined by $d_C(z) := \inf[\|z - c\| : c \in C]$, is a convex function because we can
rewrite it as the optimal value of the following special case of problem (1.1.2):

$$d_C(z) = \inf[\|x\| : x + c = z,\ c \in C].$$
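The convexity of the distance function can be spot-checked numerically. The following sketch assumes the Euclidean norm on R^2 and takes C to be the unit box [0, 1]^2, for which the nearest point is obtained by clamping each coordinate. The choice of C, the sampling range, and the function names are illustrative assumptions.

```python
import math
import random

LO, HI = 0.0, 1.0  # C = [LO, HI]^2, a convex set chosen for illustration

def project(z):
    """Nearest point of the box C to z: clamp each coordinate to [LO, HI]."""
    return tuple(min(max(t, LO), HI) for t in z)

def dist(z):
    """Euclidean distance d_C(z) = ||z - project(z)||."""
    c = project(z)
    return math.hypot(z[0] - c[0], z[1] - c[1])

def convexity_gap(z1, z2, lam):
    """lam*d(z1) + (1-lam)*d(z2) - d(lam*z1 + (1-lam)*z2); nonnegative if d is convex."""
    mix = (lam * z1[0] + (1 - lam) * z2[0], lam * z1[1] + (1 - lam) * z2[1])
    return lam * dist(z1) + (1 - lam) * dist(z2) - dist(mix)

rng = random.Random(1)  # fixed seed so the check is reproducible

def sample():
    return (rng.uniform(-3, 4), rng.uniform(-3, 4))

worst_gap = min(convexity_gap(sample(), sample(), rng.random()) for _ in range(2000))
# A small tolerance absorbs floating-point rounding.
print(dist((2.0, 0.5)), worst_gap >= -1e-9)
```

Replacing the box by a nonconvex set, for example two disjoint points, would make the gap negative for suitable pairs, which shows that convexity of C is what drives the result.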

While the value function of a convex programming problem is always convex, it
is not necessarily smooth even if all the data involved are smooth. The following is
an example:

$$
v(y) = \inf[x : x^2 \le y] =
\begin{cases}
-\sqrt{y}, & y \ge 0, \\
+\infty, & y < 0.
\end{cases}
$$
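This example can be reproduced by brute force. The sketch below approximates v(y) = inf{x : x^2 ≤ y} on a uniform grid, compares it with the closed form −√y, and prints difference quotients of v at 0 to show that the slope is unbounded there. The grid radius, the grid size, and the function names are assumptions for illustration.

```python
import math

def value_brute_force(y, radius=10.0, steps=200_001):
    """Approximate v(y) = inf{x : x^2 <= y} on a uniform grid over [-radius, radius].

    The grid is scanned in increasing order, so the first feasible point is the
    smallest feasible grid point. Returns +inf when no grid point is feasible.
    """
    for k in range(steps):
        x = -radius + 2.0 * radius * k / (steps - 1)
        if x * x <= y:
            return x
    return math.inf

def value_closed_form(y):
    """Closed form: -sqrt(y) for y >= 0 and +inf for y < 0."""
    return -math.sqrt(y) if y >= 0 else math.inf

for y in (-1.0, 0.0, 0.25, 1.0, 4.0):
    print(y, value_brute_force(y), value_closed_form(y))

# Difference quotients (v(h) - v(0)) / h = -1 / sqrt(h) for shrinking h > 0.
slopes = [(value_closed_form(h) - value_closed_form(0.0)) / h
          for h in (1e-2, 1e-4, 1e-6)]
print(slopes)
```

The difference quotients decrease without bound as h shrinks, so v has no finite one-sided derivative at y = 0 even though both the objective x and the constraint function x^2 are smooth.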

1.2 Subdifferential and Lagrange Multiplier

Nonsmooth convex functions arise naturally in many problems. This motivates the
subdifferential as a replacement for the derivative at points where the derivative
does not exist.
1.2.1 Definition

Definition 1.2.1 (Subdifferential) Let X be a finite dimensional Banach space.
The subdifferential of a lower semicontinuous function φ : X → R ∪ {+∞} at
x ∈ dom φ is defined by

$$\partial\varphi(x) = \{x^* \in X^* : \varphi(y) - \varphi(x) \ge \langle x^*, y - x \rangle \ \ \forall y \in X\}.$$

We define the domain of the subdifferential of φ by

$$\operatorname{dom} \partial\varphi = \{x \in X \mid \partial\varphi(x) \ne \emptyset\}.$$

An element of ∂φ(x) is called a subgradient of φ at x.

Definition 1.2.2 (Normal Cone) For a closed convex set C ⊂ X, we define the
normal cone of C at x̄ ∈ C by N(C; x̄) = ∂ιC (x̄).
Sometimes we will also use the notation NC (x̄) = N (C; x̄). A useful characterization
of the normal cone is that $x^* \in N(C; x)$ if and only if $\langle x^*, y - x \rangle \le 0$ for all $y \in C$.
It is easy to verify that if f has a continuous derivative at x then $\partial f(x) = \{f'(x)\}$.
At a nondifferentiable point a convex function’s subdifferential is usually a set. Here
are a few examples.
Example 1.2.3 We can easily verify that
• ∂| · |(0) = [−1, 1].
• ∂(·)+ (0) = [0, 1], where (x)+ := max(x, 0).
• ∂(·)− (0) = [−1, 0], where (x)− := max(−x, 0).
In general, if $\|\cdot\|$ is the Euclidean norm on $\mathbb{R}^N$, then $\partial\|\cdot\|(0) = B_1(0)$, where $B_1(0)$ is
the closed unit ball of $\mathbb{R}^N$.
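The subgradient inequality in Definition 1.2.1 can be tested on sample points. The sketch below checks candidate subgradients of the absolute value at 0, and of the Euclidean norm on R^2 at 0, against a finite set of test points. Passing such a test is only a necessary condition, and the test points, tolerance, and function names are assumptions for illustration.

```python
import math

def is_subgradient_at_zero(f, s, test_points, tol=1e-12):
    """Check f(y) - f(0) >= s*y on the test points (necessary condition only)."""
    return all(f(y) - f(0.0) >= s * y - tol for y in test_points)

ys = [k / 10.0 for k in range(-50, 51)]

# Candidates inside [-1, 1] pass, candidates outside fail.
inside = [is_subgradient_at_zero(abs, s, ys) for s in (-1.0, -0.3, 0.0, 0.7, 1.0)]
outside = [is_subgradient_at_zero(abs, s, ys) for s in (-1.5, 1.01)]
print(inside, outside)

def is_norm_subgradient_at_zero(x_star, directions, tol=1e-12):
    """Check ||y|| >= <x*, y> on the given directions in R^2."""
    return all(math.hypot(y[0], y[1]) >= x_star[0] * y[0] + x_star[1] * y[1] - tol
               for y in directions)

# Unit vectors at one-degree spacing serve as test directions.
circle = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
          for k in range(360)]

print(is_norm_subgradient_at_zero((0.6, 0.8), circle))  # norm 1, in the unit ball
print(is_norm_subgradient_at_zero((1.2, 0.0), circle))  # norm 1.2, outside the ball
```

For the Euclidean norm, the inequality for vectors in the unit ball is the Cauchy–Schwarz inequality, and a vector x* of norm greater than 1 violates it at y = x*.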

1.2.2 Nonemptiness of Subdifferential

A natural and important question is when we can ensure that the subdifferential
is nonempty. The following Fenchel-Rockafellar theorem provides a basic form of
sufficient conditions.
Theorem 1.2.4 (Fenchel-Rockafellar Theorem on Nonemptiness of Subdifferential)
Let f : X → R ∪ {+∞} be a convex function. Suppose that x̄ ∈
int(dom f ), the interior of dom f . Then the subdifferential ∂f (x̄) is nonempty.
Proof We observe that $(\bar{x}, f(\bar{x}))$ is a boundary point of the closed set epi f, which
has a nonempty interior. Thus by the Hahn–Banach extension theorem there exists
a supporting hyperplane of epi f at $(\bar{x}, f(\bar{x}))$ whose normal vector is
$(x^*, r) \in X^* \times \mathbb{R}$ with $(x^*, r) \neq (0, 0)$. Now, for any $x \in \operatorname{dom} f$ and $u \ge f(x)$, we have

$$r(u - f(\bar{x})) + \langle x^*, x - \bar{x}\rangle \ge 0. \tag{1.2.1}$$

Since $u \ge f(x)$ is arbitrary, $r \ge 0$. Moreover, if r = 0, then $\bar{x} \in \operatorname{int} \operatorname{dom} f$ would
also imply $x^* = 0$, which yields a contradiction. Thus, r > 0. Letting u = f(x)
in (1.2.1) we see that $-x^*/r \in \partial f(\bar{x})$.
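The interior assumption in Theorem 1.2.4 cannot simply be dropped. As a small worked illustration (the function is our own choice, modeled on the value function example at the end of Section 1.1), take f(x) = −√x for x ≥ 0 and f(x) = +∞ for x < 0, and look for a subgradient s at the boundary point 0 of dom f:

```latex
\begin{aligned}
s \in \partial f(0)
  &\iff f(y) - f(0) \ge s\,(y - 0) \quad \forall y \in \mathbb{R} \\
  &\iff -\sqrt{y} \ge s\,y \quad \forall y \ge 0 \\
  &\iff s \le -\frac{1}{\sqrt{y}} \quad \forall y > 0 .
\end{aligned}
```

Since −1/√y → −∞ as y decreases to 0, no real s qualifies, so ∂f(0) = ∅. At every interior point x̄ > 0 the theorem applies and ∂f(x̄) = {−1/(2√x̄)}.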

Remark 1.2.5 (Constraint Qualification: Relative Interior) The Fenchel-Rockafellar
Theorem is a fundamental result that we will use often in the sequel. The condition
$\bar{x} \in \operatorname{int}(\operatorname{dom} f)$ is sufficient but can be weakened. Notice that we
do not need to worry about points at which $f = +\infty$. Thus, we need only check the
condition of Theorem 1.2.4 on span(dom f), the span of dom f. Consequently, the condition
$\bar{x} \in \operatorname{int}(\operatorname{dom} f)$ can be relaxed to: $\bar{x} \in \operatorname{ri}(\operatorname{dom} f)$ and f is lower semicontinuous,
where ri signifies the relative interior, i.e., the interior relative to span(dom f).
Remark 1.2.6 (Constraint Qualification: Polyhedral Problem) Recall that a set is
polyhedral if it is the intersection of finitely many closed half-spaces. A function
is polyhedral if its epigraph is a polyhedral set. The subdifferential of a polyhedral
function is nonempty at every point of its domain (see, e.g., [7]). This sufficient
condition is very useful in dealing with linear programming problems.
The conclusion $\partial f(\bar{x}) \neq \emptyset$ can be stated alternatively as: there exists a linear
functional $x^*$ such that $f - x^*$ attains its minimum at $\bar{x}$. This is a very useful
perspective on the use of variational arguments: deriving results by observing that a
certain auxiliary function attains a minimum or maximum.

1.2.3 Calculus

For more complicated convex functions we need a convenient calculus
for calculating or estimating their subdifferentials. It turns out that the key to
developing such a calculus is to combine a decoupling mechanism with the existence
of a subgradient. We summarize this idea in the following lemma.
Lemma 1.2.7 (Decoupling Lemma) Let X and Y be Banach spaces. Let the
functions $f : X \to \mathbb{R} \cup \{+\infty\}$ and $g : Y \to \mathbb{R} \cup \{+\infty\}$ be convex and let $A : X \to Y$ be a linear
transform. Suppose that f, g, and A satisfy the condition

$$0 \in \operatorname{ri}[\operatorname{dom} g - A \operatorname{dom} f]. \tag{1.2.2}$$

Then there is a $y^* \in Y^*$ such that for any $x \in X$ and $y \in Y$,

$$p \le [f(x) - \langle y^*, Ax\rangle] + [g(y) + \langle y^*, y\rangle], \tag{1.2.3}$$

where $p = \inf_{x \in X}\{f(x) + g(Ax)\}$.

Proof Define an optimal value function $v : Y \to [-\infty, +\infty]$ by

$$v(u) = \inf_{x \in X}\{f(x) + g(Ax + u)\} = \inf_{x \in X}\{f(x) + g(y) : y - Ax = u\}. \tag{1.2.4}$$

Proposition 1.1.5 implies that v is convex. Moreover, it is easy to check that
$\operatorname{dom} v = \operatorname{dom} g - A \operatorname{dom} f$ so that by Theorem 1.2.4 and Remark 1.2.5 the
constraint qualification condition (1.2.2) ensures that $\partial v(0) \neq \emptyset$. Let $-y^* \in \partial v(0)$.
By definition we have

$$v(0) = p \le v(y - Ax) + \langle y^*, y - Ax\rangle \le f(x) + g(y) + \langle y^*, y - Ax\rangle. \tag{1.2.5}$$

Regrouping the terms on the right-hand side gives (1.2.3).

We apply the decoupling lemma of Lemma 1.2.7 to establish a sandwich

theorem.
Theorem 1.2.8 (Sandwich Theorem) Let $f : X \to \mathbb{R} \cup \{+\infty\}$ and $g : Y \to \mathbb{R} \cup \{+\infty\}$
be convex functions and let $A : X \to Y$ be a linear map. Suppose that
$f \ge -g \circ A$ and f, g, and A satisfy condition (1.2.2). Then there is an affine
function $\alpha : X \to \mathbb{R}$ of the form $\alpha(x) = \langle A^* y^*, x\rangle + r$ satisfying $f \ge \alpha \ge -g \circ A$.
Moreover, for any $\bar{x}$ satisfying $f(\bar{x}) = -g \circ A(\bar{x})$, we have $-y^* \in \partial g(A\bar{x})$.
Proof By Lemma 1.2.7 there exists $y^* \in Y^*$ such that for any $x \in X$ and $y \in Y$,

$$0 \le p \le [f(x) - \langle y^*, Ax\rangle] + [g(y) + \langle y^*, y\rangle]. \tag{1.2.6}$$

For any $z \in X$, setting y = Az in (1.2.6) we have

$$f(x) - \langle A^* y^*, x\rangle \ge -g(Az) - \langle A^* y^*, z\rangle. \tag{1.2.7}$$

Thus,

$$a := \inf_{x \in X}[f(x) - \langle A^* y^*, x\rangle] \ \ge\ b := \sup_{z \in X}[-g(Az) - \langle A^* y^*, z\rangle].$$

Picking any $r \in [b, a]$, $\alpha(x) := \langle A^* y^*, x\rangle + r$ is an affine function that separates f
and $-g \circ A$. Finally, when $f(\bar{x}) = -g \circ A(\bar{x})$, it follows from (1.2.6) that
$-y^* \in \partial g(A\bar{x})$.

We now use the tools established above to deduce calculus rules for
convex functions. We start with a sum rule playing a role similar to the sum rule for
derivatives in calculus.
Theorem 1.2.9 (Convex Subdifferential Sum Rule) Let f : X → R ∪ {+∞}
and g : Y → R ∪ {+∞} be convex functions and let A : X → Y be a linear map.
Then at any point x in X, we have the sum rule

∂(f + g ◦ A)(x) ⊃ ∂f (x) + A∗ ∂g(Ax), (1.2.8)

with equality if condition (1.2.2) holds.

Proof Inclusion (1.2.8) is easy and left to the reader as an exercise. We prove the
reverse inclusion under condition (1.2.2). Suppose $x^* \in \partial(f + g \circ A)(\bar{x})$. Since
shifting by a constant does not change the subdifferential of a convex function, we
may assume without loss of generality that

$$x \mapsto f(x) + g(Ax) - \langle x^*, x\rangle$$

attains its minimum 0 at $x = \bar{x}$. By the sandwich theorem there exists an affine
function $\alpha(x) := \langle A^* y^*, x\rangle + r$ with $-y^* \in \partial g(A\bar{x})$ such that

$$f(x) - \langle x^*, x\rangle \ge \alpha(x) \ge -g(Ax).$$

Clearly equality is attained at $x = \bar{x}$. It is now an easy matter to check that
$x^* + A^* y^* \in \partial f(\bar{x})$, so that
$x^* = (x^* + A^* y^*) + A^*(-y^*) \in \partial f(\bar{x}) + A^* \partial g(A\bar{x})$.

Note that when A is the identity mapping and both f and g are differentiable,
Theorem 1.2.9 recovers the sum rule of calculus. The geometrical interpretation of the
underlying sandwich theorem is that one can find a hyperplane in X × R that separates
the epigraph of f and the hypograph of −g, i.e., {(x, r) : −g(x) ≥ r}.
By applying the subdifferential sum rule to the indicator functions of two convex
sets we have parallel results for the normal cones to the intersection of convex sets.
Theorem 1.2.10 (Normals to an Intersection) Let $C_1$ and $C_2$ be two convex
subsets of X and let $x \in C_1 \cap C_2$. Suppose that $C_1 \cap \operatorname{int} C_2 \neq \emptyset$. Then

$$N(C_1 \cap C_2; x) = N(C_1; x) + N(C_2; x).$$

Proof Apply the subdifferential sum rule to the indicator functions of $C_1$
and $C_2$.

The condition (1.2.2) is often referred to as a constraint qualification. Without it

the equality in the convex subdifferential sum rule may not hold.
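A standard way to see the equality fail without a constraint qualification is to take two closed disks in the plane that touch at a single point. The sets below are our own illustrative choice, not an example from the text:

```latex
\begin{aligned}
C_1 &= \{(u,v) \in \mathbb{R}^2 : (u+1)^2 + v^2 \le 1\}, \qquad
C_2 = \{(u,v) \in \mathbb{R}^2 : (u-1)^2 + v^2 \le 1\}, \\
C_1 \cap C_2 &= \{(0,0)\}, \qquad N(C_1 \cap C_2; 0) = \mathbb{R}^2, \\
N(C_1; 0) &= \{(t, 0) : t \ge 0\}, \qquad N(C_2; 0) = \{(t, 0) : t \le 0\}, \\
N(C_1; 0) + N(C_2; 0) &= \mathbb{R} \times \{0\} \subsetneq \mathbb{R}^2 .
\end{aligned}
```

Here $C_2 - C_1$ is the closed disk of radius 2 centered at (2, 0), and the origin lies on its boundary, so condition (1.2.2) applied to the two indicator functions fails. Only the inclusion (1.2.8) survives, and it is strict.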

1.2.4 Role in Convex Programming

The subdifferential plays an important role in convex programming. First, for
unconstrained convex minimization problems we have Fermat's rule:
Proposition 1.2.11 (Fermat’s Rule) Let X be a Banach space and let f : X →
R ∪ {+∞} be a proper convex function. Then the point x̄ ∈ X is a (global)
minimizer of f if and only if the condition 0 ∈ ∂f (x̄) holds.
Proof We only need to observe that $\bar{x} \in X$ is a minimizer of f if and only if, for all $x \in X$,

$$f(x) - f(\bar{x}) \ge 0 = \langle 0, x - \bar{x}\rangle,$$

which by definition is equivalent to $0 \in \partial f(\bar{x})$.

Alternatively put, minimizers of f correspond exactly to “zeroes” of ∂f .

Consider the constrained convex optimization problem

$$\mathrm{CP} \qquad \text{minimize } f(x) \quad \text{subject to } x \in C \subset X, \tag{1.2.9}$$

where C is a closed convex subset of X and $f : X \to \mathbb{R} \cup \{+\infty\}$ is a convex lower
semicontinuous function. Combining Fermat's rule with the subdifferential sum
rule we derive a characterization for solutions to CP.
Theorem 1.2.12 (Pshenichnii–Rockafellar Conditions) Let C be a closed convex
subset of X and let $f : X \to \mathbb{R} \cup \{+\infty\}$ be a convex function. Suppose that
$0 \in \operatorname{ri}[\operatorname{dom} f - C]$ and f is bounded from below on C. Then $\bar{x}$ is a solution of CP if and
only if it satisfies

$$0 \in \partial f(\bar{x}) + N(C; \bar{x}).$$

Proof Apply the convex subdifferential sum rule of Theorem 1.2.9 to $f + \iota_C$ at $\bar{x}$.
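To see the condition at work, here is a one-dimensional worked instance (our own choice of data): X = R, C = [0, 1] and f(x) = (x − 2)²/2, so that the unconstrained minimizer 2 lies outside C. Checking the inclusion at each point of C:

```latex
\begin{aligned}
\partial f(x) &= \{x - 2\}, \qquad
N(C; x) = \begin{cases} (-\infty, 0], & x = 0,\\ \{0\}, & 0 < x < 1,\\ [0, +\infty), & x = 1, \end{cases} \\
x = 0:&\quad 0 \in -2 + (-\infty, 0] \ \text{ fails}, \\
0 < x < 1:&\quad 0 = x - 2 \ \text{ fails}, \\
x = 1:&\quad 0 \in -1 + [0, +\infty) \ \text{ holds, so } \bar{x} = 1 .
\end{aligned}
```

The normal cone absorbs the nonzero slope f′(1) = −1 at the active endpoint, which is exactly the role a multiplier plays in the theorems that follow.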

Finally we turn to the relationship between the subdifferential of the optimal value
function in convex programming and Lagrange multipliers. We shall see from
the two versions of the Lagrange multiplier rule given below that the subdifferential of
the optimal value function completely characterizes the set of Lagrange multipliers
(denoted λ in these theorems).
Theorem 1.2.13 (Lagrange Multiplier Without Existence of Optimal Solution)
Let v(y, z) be the optimal value function of the constrained optimization problem
P(y, z). Then $-\lambda \in \partial v(0, 0)$ if and only if

(i) (nonnegativity) $\lambda \in K^+ \times Z^*$; and

(ii) (unconstrained optimum) for any $x \in C$,

$$f(x) + \langle \lambda, (g(x), h(x))\rangle \ge v(0, 0).$$

Proof The “only if” part. Suppose that $-\lambda \in \partial v(0, 0)$. It is easy to see that v(y, 0)
is non-increasing with respect to the partial order $\le_K$. Thus, for any $y \in K$,

$$0 \ge v(y, 0) - v(0, 0) \ge \langle -\lambda, (y, 0)\rangle$$

so that $\lambda \in K^+ \times Z^*$, verifying (i). Conclusion (ii) follows from the fact that for all
$x \in C$,

$$f(x) + \langle \lambda, (g(x), h(x))\rangle \ge v(g(x), h(x)) + \langle \lambda, (g(x), h(x))\rangle \ge v(0, 0). \tag{1.2.10}$$

The “if” part. Suppose λ satisfies conditions (i) and (ii). Then we have, for any
$x \in C$, $g(x) \le_K y$ and $h(x) = z$,

$$f(x) + \langle \lambda, (y, z)\rangle \ge f(x) + \langle \lambda, (g(x), h(x))\rangle \ge v(0, 0). \tag{1.2.11}$$

Taking the infimum of the leftmost term under the constraints $x \in C$, $g(x) \le_K y$
and $h(x) = z$, we arrive at

$$v(y, z) + \langle \lambda, (y, z)\rangle \ge v(0, 0). \tag{1.2.12}$$

Therefore, $-\lambda \in \partial v(0, 0)$.

If we denote by Λ(y, z) the multipliers satisfying (i) and (ii) of Theorem 1.2.13,
then we may write the useful set equality

Λ(0, 0) = −∂v(0, 0).

The next corollary is now immediate.

Corollary 1.2.14 (Lagrange Multiplier Without Existence of Optimal Solution)
Let v(y, z) be the optimal value function of the constrained optimization problem
P (y, z). Then −λ ∈ ∂v(0, 0) if and only if
(i) (nonnegativity) λ ∈ K + × Z ∗ ;
(ii) (unconstrained optimum) for any $x \in C$ satisfying $g(x) \le_K y$ and $h(x) = z$,

$$f(x) + \langle \lambda, (y, z)\rangle \ge v(0, 0).$$


When an optimal solution for the problem P (0, 0) exists, we can also derive a
so-called complementary slackness condition.
Theorem 1.2.15 (Lagrange Multiplier when Optimal Solution Exists) Let
v(y, z) be the optimal value function of the constrained optimization problem
P(y, z). Then the pair $(\bar{x}, \lambda)$ satisfies $-\lambda \in \partial v(0, 0)$ and $\bar{x} \in S(0, 0)$ if and only
if
(i) (nonnegativity) $\lambda \in K^+ \times Z^*$;
(ii) (unconstrained optimum) the function

$$x \mapsto f(x) + \langle \lambda, (g(x), h(x))\rangle$$

attains its minimum over C at $\bar{x}$;

(iii) (complementary slackness) $\langle \lambda, (g(\bar{x}), h(\bar{x}))\rangle = 0$.
Proof The “only if” part. Suppose that $\bar{x} \in S(0, 0)$ and $-\lambda \in \partial v(0, 0)$. As in the
proof of Theorem 1.2.13 we can show that $\lambda \in K^+ \times Z^*$. By the definition of the
subdifferential and the fact that $v(g(\bar{x}), h(\bar{x})) = v(0, 0)$, we have

$$0 = v(g(\bar{x}), h(\bar{x})) - v(0, 0) \ge \langle -\lambda, (g(\bar{x}), h(\bar{x}))\rangle \ge 0,$$

so that the complementary slackness condition $\langle \lambda, (g(\bar{x}), h(\bar{x}))\rangle = 0$ holds.

Observing that $v(0, 0) = f(\bar{x}) + \langle \lambda, (g(\bar{x}), h(\bar{x}))\rangle$, the strengthened unconstrained
optimum condition follows directly from that of Theorem 1.2.13.
The “if” part. Let λ and $\bar{x}$ satisfy conditions (i), (ii), and (iii). Then, for any $x \in C$
satisfying $g(x) \le_K 0$ and $h(x) = 0$,

$$f(x) \ge f(x) + \langle \lambda, (g(x), h(x))\rangle \ge f(\bar{x}) + \langle \lambda, (g(\bar{x}), h(\bar{x}))\rangle = f(\bar{x}). \tag{1.2.13}$$

That is to say $\bar{x} \in S(0, 0)$.

Moreover, for any $g(x) \le_K y$, $h(x) = z$, we have
$f(x) + \langle \lambda, (y, z)\rangle \ge f(x) + \langle \lambda, (g(x), h(x))\rangle$. Since $v(0, 0) = f(\bar{x})$, by (1.2.13) we have

$$f(x) + \langle \lambda, (y, z)\rangle \ge f(\bar{x}) = v(0, 0). \tag{1.2.14}$$

Taking the infimum on the left-hand side of (1.2.14) yields

$$v(y, z) + \langle \lambda, (y, z)\rangle \ge v(0, 0),$$

which is to say, $-\lambda \in \partial v(0, 0)$.

We can deduce from Theorems 1.2.13 and 1.2.15 that ∂v(0, 0) completely
characterizes the set of Lagrange multipliers.
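A small worked instance may help fix the sign conventions. The data are our own illustrative choice: X = C = R, K = R₊, no equality constraint, f(x) = x² and g(x) = 1 − x, so that P(y) reads: minimize x² subject to 1 − x ≤ y. Under these assumptions,

```latex
\begin{aligned}
v(y) &= \inf\{x^2 : 1 - x \le y\}
      = \begin{cases} (1-y)^2, & y \le 1,\\ 0, & y > 1, \end{cases}
\qquad \partial v(0) = \{v'(0)\} = \{-2\}, \quad \lambda = 2, \\
\text{(ii)}&\quad x^2 + 2\,(1 - x) = (x-1)^2 + 1 \ \ge\ 1 = v(0) \quad \forall x \in \mathbb{R}, \\
\text{(iii)}&\quad \bar{x} = 1, \qquad \lambda\, g(\bar{x}) = 2\,(1 - 1) = 0 .
\end{aligned}
```

The multiplier λ = 2 is the negative of the slope of the value function at 0: relaxing the constraint by a small amount y lowers the optimal value by about 2y.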

1.3 Fenchel Conjugate

Obtaining Lagrange multipliers by using the convex subdifferential is closely related

to convex duality theory based on the concept of conjugate functions introduced by
Fenchel.

1.3.1 The Fenchel Conjugate

The Fenchel conjugate of a function (not necessarily convex) $f : X \to [-\infty, +\infty]$
is the function $f^* : X^* \to [-\infty, +\infty]$ defined by

$$f^*(x^*) := \sup_{x \in X}\{\langle x^*, x\rangle - f(x)\}. \tag{1.3.1}$$

The operation $f \mapsto f^*$ is also called a Fenchel–Legendre transform. The function
$f^*$ is convex and if the domain of f is nonempty then $f^*$ never takes the value
$-\infty$. Clearly the conjugate operation is order-reversing: for functions $f, g : X \to [-\infty, +\infty]$,
the inequality $f \ge g$ implies $f^* \le g^*$.
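Definition (1.3.1) can be approximated on the real line by replacing the supremum with a maximum over sample points. The sketch below is only illustrative: the helper `conjugate_on_grid` and the grid are our own choices, and the grid maximum is always a lower bound for the true conjugate. It is compared with two closed forms, x²/2 (which is its own conjugate) and eˣ (whose conjugate is y log y − y for y > 0).

```python
import math

def conjugate_on_grid(f, y, xs):
    """Approximate f*(y) = sup_x { x*y - f(x) } by a maximum over the sample points xs."""
    return max(x * y - f(x) for x in xs)

xs = [k / 1000 for k in range(-10000, 10001)]   # grid on [-10, 10], spacing 0.001

def quad(x):
    return 0.5 * x * x          # conjugate: y**2 / 2

for y in (-2.0, 0.0, 1.5):
    print(y, conjugate_on_grid(quad, y, xs), 0.5 * y * y)

for y in (0.5, 1.0, 3.0):       # exp has conjugate y*log(y) - y for y > 0
    print(y, conjugate_on_grid(math.exp, y, xs), y * math.log(y) - y)
```

The two columns agree up to the discretization error of the grid. The approach is only reliable when the maximizing x lies well inside the sampled interval.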

1.3.2 The Fenchel–Young Inequality

This is an elementary but important result that relates the conjugate operation to the
subgradient.
Proposition 1.3.1 (Fenchel–Young Inequality) Let $f : X \to \mathbb{R} \cup \{+\infty\}$ be a
convex function. Suppose that $x^* \in X^*$ and $x \in \operatorname{dom} f$. Then

$$f(x) + f^*(x^*) \ge \langle x^*, x\rangle. \tag{1.3.2}$$

Equality holds if and only if $x^* \in \partial f(x)$.

Proof The inequality (1.3.2) follows directly from the definition. We have the
equality

$$f(x) + f^*(x^*) = \langle x^*, x\rangle$$

if and only if, for any $y \in X$,

$$f(x) + \langle x^*, y\rangle - f(y) \le \langle x^*, x\rangle.$$

That is,

$$f(y) - f(x) \ge \langle x^*, y - x\rangle,$$

or $x^* \in \partial f(x)$.

Remark 1.3.2 When f is differentiable, taking the derivative with respect to x in the
Fenchel–Young equality we have $x^* = f'(x)$. Then the Fenchel–Legendre transform has
the following explicit form as a function of x:

$$f^*(f'(x)) = \langle x, f'(x)\rangle - f(x).$$

In Chapter 4 we will see that when f is the price of a contingent claim as a
function of a forward price x, the Fenchel–Legendre transform is related to
delta hedging. Its derivative is also relevant when we deal with dynamic hedging.
We can directly verify the following representation of the derivative of the
Fenchel–Legendre transform:

$$D_x f^*(f'(x)) = D_x[\langle x, f'(x)\rangle - f(x)] = [D_x, f'(x) I]\,x,$$

where $D_x$ is the differential operator with respect to x, I is the identity operator, and
[A, B] = AB − BA represents the commutator of the operators A and B. Symmetrically
we also have

$$D_{x^*} f((f^*)'(x^*)) = [D_{x^*}, (f^*)'(x^*) I]\,x^*.$$

We can consider the conjugate of $f^*$, called the biconjugate of f and denoted
$f^{**}$. This is a function on $X^{**}$. When X is a reflexive Banach space, i.e., $X = X^{**}$, it
follows from the Fenchel–Young inequality (1.3.2) that $f^{**} \le f$. The function $f^{**}$
is the largest among all the lower semicontinuous convex functions dominated by f and is
called the closed convex hull of f. Many important convex functions f on $X = \mathbb{R}^N$ are
equal to their biconjugate $f^{**}$. Such functions thus occur as natural pairs, $f = g^*$ and $f^* = g$,
where both f and g are lsc convex functions. Table 1.1 shows some elegant examples on R.
Checking the calculations in Table 1.1 is a good exercise for getting familiar with the concept
of conjugate functions.
Note that the first four functions in Table 1.1 are special cases of indicator
functions on R. A more general result is:
Example 1.3.3 (Conjugate of Indicator Function) Let C be a closed convex set in
the reflexive Banach space X. Then $\iota_C^* = \sigma_C$ and $\sigma_C^* = \iota_C$. The first four lines of
Table 1.1 describe four different indicator functions and their conjugate functions.

Example 1.3.4 (Conjugate of Transform) Next let us assume that h is a lower
semicontinuous function. The effect of some simple transformations on the conjugate
is summarized in Table 1.2.

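Table 1.1 Conjugate pairs of convex functions f and g on R

| $f(x) = g^*(x)$ | dom f | $g(y) = f^*(y)$ | dom g |
|---|---|---|---|
| $0$ | $\mathbb{R}$ | $0$ | $\{0\}$ |
| $0$ | $\mathbb{R}_+$ | $0$ | $-\mathbb{R}_+$ |
| $0$ | $[-1, 1]$ | $\lvert y\rvert$ | $\mathbb{R}$ |
| $0$ | $[0, 1]$ | $y^+$ | $\mathbb{R}$ |
| $\lvert x\rvert^p/p,\ p > 1$ | $\mathbb{R}$ | $\lvert y\rvert^q/q$ $\ (\tfrac{1}{p} + \tfrac{1}{q} = 1)$ | $\mathbb{R}$ |
| $\lvert x\rvert^p/p,\ p > 1$ | $\mathbb{R}_+$ | $\lvert y^+\rvert^q/q$ $\ (\tfrac{1}{p} + \tfrac{1}{q} = 1)$ | $\mathbb{R}$ |
| $-x^p/p,\ 0 < p < 1$ | $\mathbb{R}_+$ | $-(-y)^q/q$ $\ (\tfrac{1}{p} + \tfrac{1}{q} = 1)$ | $-\operatorname{int} \mathbb{R}_+$ |
| $-\log x$ | $\operatorname{int} \mathbb{R}_+$ | $-1 - \log(-y)$ | $-\operatorname{int} \mathbb{R}_+$ |
| $e^x$ | $\mathbb{R}$ | $y \log y - y$ for $y > 0$; $\ 0$ for $y = 0$ | $\mathbb{R}_+$ |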

Table 1.2 Transformed conjugates

| $f(x)$ | $f^*(y)$ |
|---|---|
| $h(ax)$ $\ (a \neq 0)$ | $h^*(y/a)$ |
| $h(x + b)$ | $h^*(y) - by$ |
| $a\,h(x)$ $\ (a > 0)$ | $a\,h^*(y/a)$ |

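Combining the Fenchel–Young inequality and the sandwich theorem we can show
that $f^{**} = f$ for a convex lower semicontinuous (lsc) function f.
Theorem 1.3.5 (Biconjugate) Let X be a finite dimensional Banach space and let
$f : X \to \mathbb{R} \cup \{+\infty\}$ be a convex function. Then
$f^{**} \le f$ in dom f and equality holds at every point $x \in \operatorname{int} \operatorname{dom} f$.

Proof It is easy to check $f^{**} \le f$ and we leave it as an exercise. For any
$\bar{x} \in \operatorname{int} \operatorname{dom} f$, $\partial f(\bar{x}) \neq \emptyset$. Let $x^* \in \partial f(\bar{x})$. By the Fenchel–Young equality we have

$$f(\bar{x}) = \langle x^*, \bar{x}\rangle - f^*(x^*) \le \sup_{y^*}[\langle y^*, \bar{x}\rangle - f^*(y^*)] = f^{**}(\bar{x}) \le f(\bar{x}).$$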

1.3.3 Graphic Illustration and Generalizations

For an increasing function $\phi$ with $\phi(0) = 0$, the function $f(x) = \int_0^x \phi(s)\,ds$ is convex and
$f^*(x^*) = \int_0^{x^*} \phi^{-1}(t)\,dt$. The graphs in Figure 1.1 illustrate the Fenchel–Young inequality
graphically. The additional areas enclosed by the graph of $\phi^{-1}$, $s = x$ and $t = x^*$,
or by that of $\phi$, $s = x$ and $t = x^*$, beyond the area of the rectangle $[0, x] \times [0, x^*]$
generate the additional area that leads to a strict inequality. We also see that equality
holds when $x^* = \phi(x) = f'(x)$ and $x = \phi^{-1}(x^*) = (f^*)'(x^*)$ in Figure 1.2.

Fig. 1.1 Fenchel–Young inequality (two panels with axes s and t, showing the curve labeled φ and φ⁻¹ with the points x and x* marked)

Fig. 1.2 Fenchel–Young equality (axes s and t, showing the curve labeled φ and φ⁻¹ with the points x and x* marked)
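As a worked instance of the area picture (our own choice of φ), take the power function φ(s) = s^{p−1} with p > 1 and let q be the conjugate exponent, 1/p + 1/q = 1, so that φ⁻¹(t) = t^{1/(p−1)} = t^{q−1}. The two integrals can be computed explicitly and the Fenchel–Young inequality becomes the classical Young inequality:

```latex
\begin{aligned}
f(x) &= \int_0^x s^{p-1}\,ds = \frac{x^p}{p}, \qquad
f^*(x^*) = \int_0^{x^*} t^{q-1}\,dt = \frac{(x^*)^q}{q}, \\
x\,x^* &\le \frac{x^p}{p} + \frac{(x^*)^q}{q} \qquad (x,\ x^* \ge 0),
\end{aligned}
```

with equality exactly when x* = x^{p−1}. This agrees with the pair |x|^p/p and |y|^q/q listed in Table 1.1.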

1.4 Convex Duality Theory

Using the Fenchel–Young inequality, for each constrained optimization problem we
can write its companion dual problem. There are several different but equivalent
perspectives.

1.4.1 Rockafellar Duality

We start with the Rockafellar formulation of the biconjugate. It is very general and, as
we shall see, other perspectives can easily be written as its special cases.
Consider a two-variable function F(x, y) on X × Y where X, Y are Banach
spaces. Treating y as a parameter, consider the parameterized optimization problem

$$v(y) = \inf_{x \in X} F(x, y). \tag{1.4.1}$$

Our associated primal optimization problem$^1$ is

$$p = v(0) = \inf_{x \in X} F(x, 0) \tag{1.4.2}$$

and the dual problem is

$$d = v^{**}(0) = \sup_{y^* \in Y^*} -F^*(0, -y^*). \tag{1.4.3}$$

Since v dominates $v^{**}$, as the Fenchel–Young inequality establishes, we have

$$v(0) = p \ge d = v^{**}(0).$$

This is called weak duality and the non-negative number $p - d = v(0) - v^{**}(0)$ is
called the duality gap, which we aspire to be small or zero.
Let $F(x, (y, z)) := f(x) + \iota_{\operatorname{epi}(g)}(x, y) + \iota_{\operatorname{graph}(h)}(x, z)$. Then problem P(y, z)
in (1.1.2) becomes problem (1.4.1) with parameters (y, z). On the other hand, we
can rewrite (1.4.1) as

$$v(y) = \inf\{F(x, u) : u = y\},$$

which is problem P(0, y) with x = (x, u), C = X × Y, f(x, u) = F(x, u),
h(x, u) = u and g(x, u) = 0. So where we start is a matter of taste and
predisposition.
Theorem 1.4.1 (Duality and Lagrange Multipliers) The following are
equivalent:
(i) the primal problem has a Lagrange multiplier λ;
(ii) there is no duality gap, i.e., d = p is finite, and the dual problem has solution λ.

1 The use of the term “primal” is much more recent than the term “dual” and was suggested by
George Dantzig's father Tobias when linear programming was being developed in the 1940s.

Proof If the primal problem has a Lagrange multiplier λ, then $-\lambda \in \partial v(0)$. By the
Fenchel–Young equality

$$v(0) + v^*(-\lambda) = \langle -\lambda, 0\rangle = 0.$$

Direct calculation yields

$$v^*(-\lambda) = \sup_{y}\{\langle -\lambda, y\rangle - v(y)\} = \sup_{y, x}\{\langle -\lambda, y\rangle - F(x, y)\} = F^*(0, -\lambda).$$

Since

$$-F^*(0, -\lambda) \le v^{**}(0) \le v(0) = -v^*(-\lambda) = -F^*(0, -\lambda), \tag{1.4.4}$$

λ is a solution to the dual problem and $p = v(0) = v^{**}(0) = d$.

On the other hand, if $v^{**}(0) = v(0)$ and λ is a solution to the dual problem, then
all the quantities in (1.4.4) are equal. In particular,

$$v(0) + v^*(-\lambda) = 0.$$

This implies that $-\lambda \in \partial v(0)$, so that λ is a Lagrange multiplier of the primal
problem.

Example 1.4.2 (Finite Duality Gap) Consider

$$v(y) = \inf\Big\{|x_2 - 1| : \sqrt{x_1^2 + x_2^2} - x_1 \le y\Big\}.$$

We can easily calculate

$$v(y) = \begin{cases} 0, & y > 0, \\ 1, & y = 0, \\ +\infty, & y < 0, \end{cases}$$

and $v^{**}(0) = 0$, i.e., there is a finite duality gap $v(0) - v^{**}(0) = 1$.

In this example neither the primal nor the dual problem has a Lagrange multiplier
yet both have solutions. Hence, even in two dimensions, existence of a Lagrange
multiplier is only a sufficient condition for the dual to attain a solution and is far
from necessary.
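The value $v^{**}(0) = 0$ in Example 1.4.2 can be checked by hand from the three-branch formula for v (v = 0 for y > 0, v(0) = 1, v = +∞ for y < 0). The following short computation is our own verification, using only definition (1.3.1):

```latex
\begin{aligned}
v^*(y^*) &= \sup_{y}\{y^* y - v(y)\}
          = \max\Big\{-1,\ \sup_{y>0} y^* y\Big\}
          = \begin{cases} 0, & y^* \le 0,\\ +\infty, & y^* > 0, \end{cases} \\
v^{**}(0) &= \sup_{y^*}\{0 \cdot y^* - v^*(y^*)\} = 0 \ <\ 1 = v(0).
\end{aligned}
```

The same formula shows that ∂v(0) is empty: a subgradient −λ would have to satisfy v(y) − 1 ≥ −λy for every y > 0, that is, λ ≥ 1/y for all y > 0, which is impossible.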

1.4.2 Fenchel Duality

Let us specify F(x, y) := f(x) + g(Ax + y), where A : X → Y is a linear operator.
We then get the Fenchel formulation of duality. Now the primal problem is

$$p = v(0) = \inf_{x}[f(x) + g(Ax)]. \tag{1.4.5}$$

To derive the dual problem we calculate

$$F^*(0, -y^*) = \sup_{x, y}[\langle -y^*, y\rangle - f(x) - g(Ax + y)].$$

Letting u = Ax + y we have

$$\begin{aligned}
F^*(0, -y^*) &= \sup_{x, u}[\langle -y^*, u - Ax\rangle - f(x) - g(u)] \\
&= \sup_{x}[\langle y^*, Ax\rangle - f(x)] + \sup_{u}[\langle -y^*, u\rangle - g(u)] \\
&= f^*(A^* y^*) + g^*(-y^*).
\end{aligned}$$

Thus, the dual problem is

$$d = v^{**}(0) = \sup_{y^*}[-f^*(A^* y^*) - g^*(-y^*)]. \tag{1.4.6}$$

If both f and g are convex functions, then so is

$$v(y) = \inf_{x}[f(x) + g(Ax + y)],$$

as shown in the proof of Lemma 1.2.7. Moreover, dom v = dom g − A dom f.
Thus, a sufficient condition for the existence of Lagrange multipliers for the primal
problem, i.e., $\partial v(0) \neq \emptyset$, is (1.2.2).
Figure 1.3 illustrates the Fenchel duality theorem for $f(x) := x^2/2 + 1$ and
$g(x) := (x - 1)^2/2 + 1/2$. The upper function is f and the lower one is −g. The
minimum gap occurs at x = 1/2 and equals 7/4.
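The numbers quoted for Figure 1.3 can be reproduced with a few lines of code. This is a sketch under our own assumptions: A is taken to be the identity on R, the conjugates `f_conj` and `g_conj` are closed forms we derived by hand from (1.3.1) for these two quadratics, and a plain grid search stands in for the infimum and supremum.

```python
def f(x):
    return 0.5 * x * x + 1.0

def g(x):
    return 0.5 * (x - 1.0) ** 2 + 0.5

# Hand-derived conjugates of the two quadratics above (assumed, not from the text):
def f_conj(y):
    return 0.5 * y * y - 1.0          # sup_x { x*y - f(x) }, attained at x = y

def g_conj(y):
    return 0.5 * y * y + y - 0.5      # sup_x { x*y - g(x) }, attained at x = 1 + y

grid = [k / 1000 for k in range(-5000, 5001)]   # [-5, 5], spacing 0.001

# Primal (1.4.5): minimize f(x) + g(x).  Dual (1.4.6): maximize -f*(y) - g*(-y).
primal_value, primal_x = min((f(x) + g(x), x) for x in grid)
dual_value, dual_y = max((-f_conj(y) - g_conj(-y), y) for y in grid)
print(primal_value, primal_x)
print(dual_value, dual_y)
```

Both searches return the value 7/4, the primal one at x = 1/2 and the dual one at y* = 1/2, so there is no duality gap for this pair.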
Condition (1.2.2) is often referred to as a constraint qualification or a transver-
sality condition. Enforcing such constraint qualification conditions we can write
Theorem 1.4.1 in the following form:
Theorem 1.4.3 (Strong Duality) If the lower semicontinuous convex functions f ,
g and the linear operator A satisfy the constraint qualification conditions (1.2.2),
then there is a zero duality gap between the primal and dual problems, (1.4.5)
and (1.4.6), and the dual problem has a solution.
Fig. 1.3 The Fenchel duality sandwich

A really illustrative example is the application to entropy optimization.

Example 1.4.4 (Entropy Optimization Problem) Entropy maximization refers to

$$\text{minimize } f(x) \quad \text{subject to } Ax = b \in \mathbb{R}^N, \tag{1.4.7}$$

with the lower semicontinuous convex function f defined on a Banach space of
signals, emulating the negative of an entropy, and A emulating a finite number
of continuous linear constraints representing conditions on some given moments.
A wide variety of applications can be covered by this model due to its physical
relevance.
Applying Theorem 1.4.3 with $g = \iota_{\{b\}}$, we have that if $b \in \operatorname{ri}(A \operatorname{dom} f)$ then

$$\inf_{x \in X}\{f(x) \mid Ax = b\} = \max_{\phi \in \mathbb{R}^N}\{\langle \phi, b\rangle - f^*(A^* \phi)\} = (f^* \circ A^*)^*(b). \tag{1.4.8}$$

When N < dim X (often infinite) the dual problem is typically much easier to solve
than the primal.
Example 1.4.5 (Boltzmann–Shannon Entropy in Euclidean Space) Let

$$f(x) := \sum_{n=1}^{N} p(x_n), \tag{1.4.9}$$

where

$$p(t) := \begin{cases} t \ln t - t & \text{if } t > 0, \\ 0 & \text{if } t = 0, \\ +\infty & \text{if } t < 0. \end{cases}$$

The functions p and f defined above are (negatives of) Boltzmann–Shannon
entropy functions on $\mathbb{R}$ and $\mathbb{R}^N$, respectively. For $c \in \mathbb{R}^N$, $b \in \mathbb{R}^M$ and a linear
mapping $A : \mathbb{R}^N \to \mathbb{R}^M$ consider the entropy optimization problem

$$\text{minimize } \{f(x) + \langle c, x\rangle : Ax = b\}. \tag{1.4.10}$$

Example 1.4.4 can help us conveniently derive an explicit formula for solutions
of (1.4.10) in terms of the solution to its dual problem.
First we note that the sublevel sets of the objective function are compact,
thus ensuring the existence of solutions to problem (1.4.10). We can also see by
direct calculation that the directional derivative of the cost function is $-\infty$ at any
boundary point x of $\operatorname{dom} f = \mathbb{R}^N_+$, the domain of the cost function, in the direction
of z − x for any $z \in \operatorname{int}(\mathbb{R}^N_+)$. Thus, any solution of (1.4.10) must be in the interior of $\mathbb{R}^N_+$.
Since the cost function is strictly convex on $\operatorname{int}(\mathbb{R}^N_+)$, the solution is unique.
Let us denote this unique solution of (1.4.10) by $\bar{x}$. Then the duality result in
Example 1.4.4 implies that

$$f(\bar{x}) + \langle c, \bar{x}\rangle = \inf_{x \in \mathbb{R}^N}\{f(x) + \langle c, x\rangle : Ax = b\} = \max_{\phi \in \mathbb{R}^M}\{\langle \phi, b\rangle - (f + c)^*(A^\top \phi)\}.$$

Now let $\bar{\phi}$ be a solution to the dual problem, i.e., a Lagrange multiplier for the
constrained minimization problem (1.4.10). We have

$$f(\bar{x}) + \langle c, \bar{x}\rangle + (f + c)^*(A^\top \bar{\phi}) = \langle \bar{\phi}, b\rangle = \langle \bar{\phi}, A\bar{x}\rangle = \langle A^\top \bar{\phi}, \bar{x}\rangle.$$

It follows from the Fenchel–Young equality that $A^\top \bar{\phi} \in \partial(f + c)(\bar{x})$. Since
$\bar{x} \in \operatorname{int}(\mathbb{R}^N_+)$, where f is differentiable, we have $A^\top \bar{\phi} = f'(\bar{x}) + c$. Explicit computation
shows that $\bar{x} = (\bar{x}_1, \dots, \bar{x}_N)$ is determined by

$$\bar{x}_n = \exp(A^\top \bar{\phi} - c)_n, \quad n = 1, \dots, N. \tag{1.4.11}$$

Indeed, we can use the existence of the dual solution to prove that the primal
problem has the given solution without direct appeal to compactness: we deduce
the existence of the primal solution from the duality theory.
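Formula (1.4.11) suggests a practical recipe: solve the low-dimensional dual problem and read off the primal solution. The sketch below does this for a single linear constraint (M = 1), so the dual variable is a scalar. The data `a`, `b`, `c`, the function name `solve_entropy_dual` and the use of undamped Newton steps are our own illustrative assumptions. The dual objective φ·b − Σₙ exp(aₙφ − cₙ) is obtained from the conjugate pair p(t) = t ln t − t and eˢ (compare the last row of Table 1.1).

```python
import math

def solve_entropy_dual(a, b, c, iters=100, tol=1e-13):
    """Maximize phi*b - sum_n exp(a_n*phi - c_n) over scalar phi by Newton's method,
    then recover the primal solution via x_n = exp(a_n*phi - c_n), as in (1.4.11)."""
    phi = 0.0
    for _ in range(iters):
        x = [math.exp(an * phi - cn) for an, cn in zip(a, c)]
        grad = b - sum(an * xn for an, xn in zip(a, x))     # derivative of the dual objective
        curv = sum(an * an * xn for an, xn in zip(a, x))    # minus its second derivative (> 0)
        if abs(grad) < tol:
            break
        phi += grad / curv                                  # Newton step for a concave function
    x = [math.exp(an * phi - cn) for an, cn in zip(a, c)]
    return phi, x

a = [1.0, 2.0, 3.0]       # one moment constraint: x_1 + 2*x_2 + 3*x_3 = b
b = 2.0
c = [0.0, 0.5, -0.2]
phi_bar, x_bar = solve_entropy_dual(a, b, c)
print(phi_bar, x_bar, sum(an * xn for an, xn in zip(a, x_bar)))
```

The recovered point is strictly positive and satisfies the constraint to rounding accuracy. For several constraints the same idea applies with a vector φ and a linear solve in place of the scalar division.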
Remark 1.4.6 In view of Remark 1.2.6, when both f and g are polyhedral functions
the constraint qualification condition (1.2.2) simplifies to

$$\operatorname{dom} g \cap A \operatorname{dom} f \neq \emptyset. \tag{1.4.12}$$

This is very useful in dealing with polyhedral cone programming and, in particular,
linear programming problems. One can also similarly handle a subset of polyhedral
constraints, see [7, 8].

1.4.3 Lagrange Duality

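For problem (1.1.2) define the Lagrangian

$$L(\lambda, x; (y, z)) = f(x) + \langle \lambda, (g(x) - y, h(x) - z)\rangle.$$

Then

$$\sup_{\lambda \in K^+ \times Z^*} L(\lambda, x; (y, z)) = \begin{cases} f(x) & \text{if } g(x) \le_K y,\ h(x) = z, \\ +\infty & \text{otherwise.} \end{cases}$$

Then problem (1.1.2) can be written as

$$p = v(0) = \inf_{x \in C} \sup_{\lambda \in K^+ \times Z^*} L(\lambda, x; 0). \tag{1.4.13}$$

We can calculate

$$\begin{aligned}
v^*(-\lambda) &= \sup_{y, z}[\langle -\lambda, (y, z)\rangle - v(y, z)] \\
&= \sup_{y, z}\big[\langle -\lambda, (y, z)\rangle - \inf_{x \in C}\{f(x) : g(x) \le_K y,\ h(x) = z\}\big] \\
&= \sup_{x \in C,\, y,\, z}\{\langle -\lambda, (y, z)\rangle - f(x) : g(x) \le_K y,\ h(x) = z\}.
\end{aligned}$$

Letting $\xi = y - g(x) \in K$ we can rewrite the expression above as

$$\begin{aligned}
v^*(-\lambda) &= \sup_{x \in C,\, \xi \in K}[\langle -\lambda, (g(x), h(x))\rangle - f(x) + \langle -\lambda, (\xi, 0)\rangle] \\
&= -\inf_{x \in C,\, \xi \in K}[L(\lambda, x; 0) + \langle \lambda, (\xi, 0)\rangle] \\
&= \begin{cases} -\inf_{x \in C} L(\lambda, x; 0) & \text{if } \lambda \in K^+ \times Z^*, \\ +\infty & \text{otherwise.} \end{cases}
\end{aligned}$$

Thus, the dual problem is

$$d = v^{**}(0) = \sup_{\lambda} -v^*(-\lambda) = \sup_{\lambda \in K^+ \times Z^*} \inf_{x \in C} L(\lambda, x; 0). \tag{1.4.14}$$

We can see that the weak duality inequality $v(0) \ge v^{**}(0)$ is simply the familiar
fact that

$$\inf \sup \ge \sup \inf.$$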

Example 1.4.7 (Classical Linear Programming Duality) Consider a linear
programming problem

$$\max\ \langle c, x\rangle \quad \text{subject to } Ax \le b,\ x \ge 0, \tag{1.4.15}$$

where $x \in \mathbb{R}^N$, $b \in \mathbb{R}^M$, A is an M × N matrix and $\le$ stands for $\le_{\mathbb{R}^M_+}$. Then by the
Lagrange duality, the dual problem is

$$\min\ \langle b, \lambda\rangle \quad \text{subject to } A^* \lambda \ge c,\ \lambda \ge 0. \tag{1.4.16}$$

In fact, we need to deal with the minimizing problem

$$\min[\langle -c, x\rangle : Ax \le b,\ x \ge 0] = -\max[\langle c, x\rangle : Ax \le b,\ x \ge 0].$$

We write the Lagrangian

$$L(\lambda, x) = \langle -c, x\rangle + \langle \lambda, Ax - b\rangle.$$

Then the primal problem is

$$\inf_{x \ge 0} \sup_{\lambda \ge 0} L(\lambda, x).$$

The dual problem is

$$\sup_{\lambda \ge 0} \inf_{x \ge 0} L(\lambda, x).$$

We can see that

$$\inf_{x \ge 0} L(\lambda, x) = \inf_{x \ge 0} \langle -c + A^* \lambda, x\rangle - \langle \lambda, b\rangle = \begin{cases} -\langle \lambda, b\rangle & \text{if } A^* \lambda \ge c, \\ -\infty & \text{otherwise.} \end{cases}$$

So we have

$$\max[\langle c, x\rangle : Ax \le b,\ x \ge 0] = -\max_{\lambda \ge 0}[-\langle \lambda, b\rangle : A^* \lambda \ge c] = \min[\langle \lambda, b\rangle : A^* \lambda \ge c,\ \lambda \ge 0].$$

Clearly all the functions involved here are polyhedral. Applying the constraint
qualification condition for polyhedral functions we can conclude that if either
the primal problem or the dual problem is feasible then there is no duality gap.
Moreover, when the common optimal value is finite then both problems have
optimal solutions.
The hard work in Example 1.4.7 was hidden in establishing that the constraint
qualification (1.4.12) is sufficient, but unlike many applied developments we have
rigorously recaptured linear programming duality within our framework.
Note that the primal Lagrange multiplier λ is the dual solution and vice versa.
Table 1.3 can help us formulate the dual problem.

Table 1.3 Transformed conjugates

| Primal constraint | Dual variable | Primal variable | Dual constraint |
|---|---|---|---|
| $Ax \le b$ | $\lambda \ge 0$ | $x \ge 0$ | $A^* \lambda \ge c$ |
| $Ax = b$ | $\lambda$ free | $x$ free | $A^* \lambda = c$ |
| $Ax \ge b$ | $\lambda \le 0$ | $x \le 0$ | $A^* \lambda \le c$ |
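Weak duality gives a cheap optimality certificate for the pair (1.4.15) and (1.4.16): if a primal feasible x and a dual feasible λ have equal objective values, both are optimal. The sketch below checks such a pair on a small textbook-style instance. The data and the candidate points are our own illustrative choices, and no LP solver is involved.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Primal: max <c, x> subject to A x <= b, x >= 0 (illustrative data).
A = [[1.0, 0.0],
     [0.0, 2.0],
     [3.0, 2.0]]
b = [4.0, 12.0, 18.0]
c = [3.0, 5.0]

x = [2.0, 6.0]            # candidate primal solution
lam = [0.0, 1.5, 1.0]     # candidate dual solution

Ax = [dot(row, x) for row in A]
At_lam = [dot([row[j] for row in A], lam) for j in range(len(c))]   # A^T lambda

primal_feasible = all(xi >= 0 for xi in x) and all(l <= r for l, r in zip(Ax, b))
dual_feasible = all(li >= 0 for li in lam) and all(l >= r for l, r in zip(At_lam, c))
primal_value = dot(c, x)      # <c, x>
dual_value = dot(b, lam)      # <b, lambda>
slackness = dot(lam, [r - l for l, r in zip(Ax, b)])   # <lambda, b - A x>
print(primal_feasible, dual_feasible, primal_value, dual_value, slackness)
```

Both points are feasible with common value 36, so each certifies the optimality of the other. The multiplier of the first constraint is zero because that constraint is inactive at x, which is complementary slackness in the sense of Theorem 1.2.15.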

1.4.4 Generalized Fenchel–Young Inequality

Reexamining the graphic representation of the Fenchel–Young inequality we also

realize that the underlying inequality relationship remains valid when the area is
weighted by a positive “density” function K(s, t). Thus, we have
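Theorem 1.4.8 (Weighted Fenchel–Young Inequality) Let K(x, y) be a continuous
positive function and let $\phi$ be a continuous increasing function with $\phi(0) = 0$.
Then

$$\int_0^x \int_0^{x^*} K(s, t)\,dt\,ds \le \int_0^x \int_0^{\phi(s)} K(s, t)\,dt\,ds + \int_0^{x^*} \int_0^{\phi^{-1}(t)} K(s, t)\,ds\,dt$$

and equality holds when $x^* = \phi(x)$ and $x = \phi^{-1}(x^*)$.

Proof If $\phi(x) \ge x^*$, we have

$$\begin{aligned}
&\int_0^x \int_0^{\phi(s)} K(s, t)\,dt\,ds + \int_0^{x^*} \int_0^{\phi^{-1}(t)} K(s, t)\,ds\,dt \\
&\qquad \ge \int_0^x \int_0^{x^*} K(s, t)\,dt\,ds + \int_{\phi^{-1}(x^*)}^{x} \int_{x^*}^{\phi(s)} K(s, t)\,dt\,ds \\
&\qquad \ge \int_0^x \int_0^{x^*} K(s, t)\,dt\,ds.
\end{aligned} \tag{1.4.17}$$

Otherwise, $\phi(x) < x^*$ and we have

$$\begin{aligned}
&\int_0^x \int_0^{\phi(s)} K(s, t)\,dt\,ds + \int_0^{x^*} \int_0^{\phi^{-1}(t)} K(s, t)\,ds\,dt \\
&\qquad \ge \int_0^x \int_0^{x^*} K(s, t)\,dt\,ds + \int_{x}^{\phi^{-1}(x^*)} \int_{\phi(s)}^{x^*} K(s, t)\,dt\,ds \\
&\qquad \ge \int_0^x \int_0^{x^*} K(s, t)\,dt\,ds.
\end{aligned} \tag{1.4.18}$$

Clearly equality holds if and only if $\phi(x) = x^*$.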
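A quick numerical sanity check of Theorem 1.4.8 is possible whenever the three double integrals have closed forms. The sketch below uses the weight K(s, t) = 1 + st and φ(s) = s², both our own illustrative choices. The antiderivatives in `lhs` and `rhs` were computed by hand and are assumptions of this sketch, not formulas from the text.

```python
def lhs(x, xs):
    # Integral of (1 + s*t) over the rectangle [0, x] x [0, xs]
    return x * xs + (x * xs) ** 2 / 4

def rhs(x, xs):
    # Region under phi(s) = s**2 for s in [0, x]
    under_phi = x ** 3 / 3 + x ** 6 / 12
    # Region to the left of phi^{-1}(t) = sqrt(t) for t in [0, xs]
    under_inverse = (2 / 3) * xs ** 1.5 + xs ** 3 / 6
    return under_phi + under_inverse

points = [(i / 10, j / 10) for i in range(0, 21) for j in range(0, 41)]
min_gap = min(rhs(x, xs) - lhs(x, xs) for x, xs in points)
gap_on_curve = max(abs(rhs(x, x * x) - lhs(x, x * x)) for x in [i / 10 for i in range(0, 21)])
print(min_gap, gap_on_curve)
```

The gap between the two sides is nonnegative at every sampled pair (x, x*) and vanishes, up to rounding, along the curve x* = φ(x) = x², as the theorem predicts.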

The condition $\phi(0) = 0$ merely conveniently locates the lower left corner of the
graph at the coordinate origin and is clearly not essential. In general we can always
shift this corner to any point $(a, \phi(a))$. More substantively, the requirement that $\phi$
be a continuous increasing function can be relaxed to nondecreasing as long as
$\phi^{-1}$ is replaced appropriately by

$$\phi^{-1}_{\inf}(t) = \inf\{s : \phi(s) \ge t\}.$$

Now we can state a more general Fenchel–Young inequality whose proof is an easy
exercise.
Theorem 1.4.9 (Weighted Fenchel–Young Inequality) Let K(x, y) be a
bounded essentially positive measurable function and let $\phi$ be a nondecreasing
function. Then

$$\int_a^x \int_{\phi(a)}^{x^*} K(s, t)\,dt\,ds \le \int_a^x \int_{\phi(a)}^{\phi(s)} K(s, t)\,dt\,ds + \int_{\phi(a)}^{x^*} \int_a^{\phi^{-1}_{\inf}(t)} K(s, t)\,ds\,dt$$

with equality attained when $x^* \in [\phi(x-), \phi(x+)]$, $x \in [\phi^{-1}_{\inf}(x^*-), \phi^{-1}_{\inf}(x^*+)]$.
The above idea can be further pushed in two different directions in the next two
sections.

Fig. 1.4 Fenchel–Young inequality (two panels with axes φ1 and φ2, showing the parameterized curve (φ1(t), φ2(t)) starting at (φ1(a), φ2(a)), with φ1(b1) and φ2(b2) marked)

Fig. 1.5 Fenchel–Young equality (axes φ1 and φ2, showing the curve (φ1(t), φ2(t)) starting at (φ1(a), φ2(a)), with φ1(b1) and φ2(b1) marked)
Multidimensional Fenchel–Young Inequality

It is easier to understand and to formulate the N-dimensional Fenchel–Young inequality
by first re-examining the graphs presented above with a parameterization
$(\phi_1, \phi_2)$ of the graph of $\phi$, as in Figures 1.4 and 1.5.
Let $K(s_1, s_2)$ be a nonnegative function and let $\phi_1, \phi_2$ be increasing functions.
To avoid technical complications we assume that $\phi_1, \phi_2$ are invertible. Then we can
rewrite the Fenchel–Young inequality as

$$\begin{aligned}
&\int_{\phi_2(a)}^{\phi_2(b_2)} \int_{\phi_1(a)}^{\phi_1(b_1)} K(s_1, s_2)\,ds_1\,ds_2 \\
&\qquad \le \int_{\phi_1(a)}^{\phi_1(b_1)} \int_{\phi_2(a)}^{\phi_2(\phi_1^{-1}(s_1))} K(s_1, s_2)\,ds_2\,ds_1 + \int_{\phi_2(a)}^{\phi_2(b_2)} \int_{\phi_1(a)}^{\phi_1(\phi_2^{-1}(s_2))} K(s_1, s_2)\,ds_1\,ds_2
\end{aligned} \tag{1.4.19}$$

with equality attained when $b_1 = b_2$.

This form of the Fenchel–Young inequality can easily be generalized to N
dimensions with an induction argument. We will use the following vector notation:
$s^N = (s_1, \dots, s_N)$, $1^N = (1, 1, \dots, 1)$ and

$$s^N_n = (s_1, \dots, s_{n-1}, s_{n+1}, \dots, s_N).$$

When $\phi^N = (\phi_1, \dots, \phi_N)$ is a vector valued function we define

$$\phi^N(s^N) = (\phi_1(s_1), \dots, \phi_N(s_N)).$$

Similarly,

$$\int_{\phi^N(a^N)}^{\phi^N(b^N)} K(s^N)\,ds^N = \int_{\phi_1(a_1)}^{\phi_1(b_1)} \cdots \int_{\phi_N(a_N)}^{\phi_N(b_N)} K(s_1, \dots, s_N)\,ds_N \cdots ds_1.$$

Now we can state and prove the multidimensional Fenchel–Young inequality.

Theorem 1.4.10 (Multidimensional Generalized Fenchel–Young Inequality)
Let $K : \mathbb{R}^N \to \mathbb{R}$ be a nonnegative function and let $\phi^N$ be a vector function with
all the components increasing and invertible. We have

$$\int_{\phi^N(a \cdot 1^N)}^{\phi^N(b^N)} K(s^N)\,ds^N \le \sum_{n=1}^{N} \int_{\phi_n(a)}^{\phi_n(b_n)} \int_{\phi^N_n(a \cdot 1^{N-1})}^{\phi^N_n(\phi_n^{-1}(s_n) \cdot 1^{N-1})} K(s^N)\,ds^N_n\,ds_n \tag{1.4.20}$$

with equality attained when $b_1 = b_2 = \dots = b_N$.

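Proof We prove by induction. The case N = 2 has already been established. We
focus on the induction step. By separating the integration with respect to $ds_{N+1}$, we
can write the left-hand side of the inequality as

$$\begin{aligned}
LHS &= \int_{\phi^{N+1}(a \cdot 1^{N+1})}^{\phi^{N+1}(b^{N+1})} K(s^{N+1})\,ds^{N+1} \\
&= \int_{\phi_{N+1}(a)}^{\phi_{N+1}(b_{N+1})} \int_{\phi^N(a \cdot 1^N)}^{\phi^N(b^N)} K(s^{N+1})\,ds^N\,ds_{N+1}.
\end{aligned}$$

Applying the induction hypothesis to the inner layer of the integration we have

$$\begin{aligned}
LHS &\le \int_{\phi_{N+1}(a)}^{\phi_{N+1}(b_{N+1})} \sum_{n=1}^{N} \int_{\phi_n(a)}^{\phi_n(b_n)} \int_{\phi^N_n(a \cdot 1^{N-1})}^{\phi^N_n(\phi_n^{-1}(s_n) \cdot 1^{N-1})} K(s^{N+1})\,ds^N_n\,ds_n\,ds_{N+1} \\
&= \sum_{n=1}^{N} \int_{\phi_{N+1}(a)}^{\phi_{N+1}(b_{N+1})} \int_{\phi_n(a)}^{\phi_n(b_n)} \int_{\phi^N_n(a \cdot 1^{N-1})}^{\phi^N_n(\phi_n^{-1}(s_n) \cdot 1^{N-1})} K(s^{N+1})\,ds^N_n\,ds_n\,ds_{N+1}.
\end{aligned}$$

The last equality groups the two outer layers of the integration together. Now applying
the Fenchel–Young inequality with N = 2 we get

$$\begin{aligned}
LHS &\le \sum_{n=1}^{N} \int_{\phi_n(a)}^{\phi_n(b_n)} \int_{\phi_{N+1}(a)}^{\phi_{N+1}(\phi_n^{-1}(s_n))} \int_{\phi^N_n(a \cdot 1^{N-1})}^{\phi^N_n(\phi_n^{-1}(s_n) \cdot 1^{N-1})} K(s^{N+1})\,ds^N_n\,ds_{N+1}\,ds_n \\
&\quad + \int_{\phi_{N+1}(a)}^{\phi_{N+1}(b_{N+1})} \sum_{n=1}^{N} \int_{\phi_n(a)}^{\phi_n(\phi_{N+1}^{-1}(s_{N+1}))} \int_{\phi^N_n(a \cdot 1^{N-1})}^{\phi^N_n(\phi_n^{-1}(s_n) \cdot 1^{N-1})} K(s^{N+1})\,ds^N_n\,ds_n\,ds_{N+1}.
\end{aligned}$$

Combining the inner layers of the integration in the first sum and applying the
equality part of the induction hypothesis for the second sum we arrive at

$$\begin{aligned}
LHS &\le \sum_{n=1}^{N} \int_{\phi_n(a)}^{\phi_n(b_n)} \int_{\phi^{N+1}_n(a \cdot 1^N)}^{\phi^{N+1}_n(\phi_n^{-1}(s_n) \cdot 1^N)} K(s^{N+1})\,ds^{N+1}_n\,ds_n \\
&\quad + \int_{\phi_{N+1}(a)}^{\phi_{N+1}(b_{N+1})} \int_{\phi^N(a \cdot 1^N)}^{\phi^N(\phi_{N+1}^{-1}(s_{N+1}) \cdot 1^N)} K(s^{N+1})\,ds^N\,ds_{N+1} \\
&= \sum_{n=1}^{N+1} \int_{\phi_n(a)}^{\phi_n(b_n)} \int_{\phi^{N+1}_n(a \cdot 1^N)}^{\phi^{N+1}_n(\phi_n^{-1}(s_n) \cdot 1^N)} K(s^{N+1})\,ds^{N+1}_n\,ds_n = RHS.
\end{aligned}$$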

A three-dimensional graphical illustration of the multidimensional Fenchel–Young
inequality is presented in Figure 1.6. In this figure we illustrate the simple
case where $K(s_1, s_2, s_3) = 1$ so that the left-hand side of the inequality (1.4.20)
is the volume of a rectangular region. We set $(\phi_1(t), \phi_2(t), \phi_3(t)) = (t, t^2, t)$,
$(a_1, a_2, a_3) = (0, 0, 0)$, and $(b_1, b_2, b_3) = (0.9, 1, 0.8)$. The light lines are the
edges of the rectangular region and the dark lines outline the boundaries of the three
regions corresponding to the three integrals on the right-hand side of the Fenchel–Young
inequality (1.4.20).
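The three-dimensional example can also be checked numerically. The following is a minimal sketch, assuming $K = 1$ so that each inner integral reduces to a product of side lengths; the helper names (`midpoint_integral`, `both_sides`) and the midpoint rule are illustrative choices, not part of the original text.

```python
# Numerical check of inequality (1.4.20) for K = 1, N = 3,
# phi = (t, t^2, t) and a = 0, as in the example above.

def midpoint_integral(g, lo, hi, steps=4000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

PHIS = [lambda t: t, lambda t: t * t, lambda t: t]          # phi_n
INVERSES = [lambda s: s, lambda s: s ** 0.5, lambda s: s]   # phi_n^{-1} on [0, 1]
A = 0.0

def both_sides(b):
    """Return (LHS, RHS) of (1.4.20) for K = 1 and upper limits b."""
    # LHS: volume of the box with sides phi_n(b_n) - phi_n(a).
    lhs = 1.0
    for phi, bn in zip(PHIS, b):
        lhs *= phi(bn) - phi(A)

    def inner_volume(n, s_n):
        # With K = 1 the inner (N-1)-fold integral is a product of lengths.
        t = INVERSES[n](s_n)
        vol = 1.0
        for m, phi in enumerate(PHIS):
            if m != n:
                vol *= phi(t) - phi(A)
        return vol

    rhs = sum(
        midpoint_integral(lambda s, n=n: inner_volume(n, s),
                          PHIS[n](A), PHIS[n](b[n]))
        for n in range(len(b))
    )
    return lhs, rhs

lhs, rhs = both_sides([0.9, 1.0, 0.8])
print(f"LHS = {lhs:.6f}, RHS = {rhs:.6f}")
```

For $b = (0.9, 1, 0.8)$ the right-hand side exceeds the box volume, while calling `both_sides` with three equal entries gives two values that agree up to the quadrature error, in line with the equality case of the theorem.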
Remark 1.4.11 We also have the following alternative form of estimations by
changing the way of integration. Let $K(s_1, s_2)$ be a nonnegative function and let
$\phi_1, \phi_2$ be nondecreasing functions. Then

$$\begin{aligned}
& \int_{\phi_2(a)}^{\phi_2(b_2)} \int_{\phi_1(a)}^{\phi_1(b_1)} K(s_1, s_2)\,ds_1\,ds_2 \\
& \quad \le \int_{\phi_1(a)}^{\phi_1(b_2)} \int_{\phi_2(\phi_1^{-1}(s_1))}^{\phi_2(b_2)} K(s_1, s_2)\,ds_2\,ds_1 + \int_{\phi_2(a)}^{\phi_2(b_1)} \int_{\phi_1(\phi_2^{-1}(s_2))}^{\phi_1(b_1)} K(s_1, s_2)\,ds_1\,ds_2
\end{aligned}$$

with equality attained when $b_1 = b_2$.

Fig. 1.6 Three-dimensional Fenchel–Young inequality

1.5 Generalized Convexity, Conjugacy and Duality

Note that the graphic illustrations in Section 1.4.3 only work when $x, x^* \in \mathbb{R}$.
When, in general, $(x, x^*) \in X \times X^*$ we can imitate the general definition of the
Fenchel conjugate. In such a generalization a nonlinear function $c(x, x^*)$ replaces
the role of $\langle x^*, x \rangle$, just as in Theorem 1.4.8 $\int_0^{x}\int_0^{x^*} K(s, t)\,ds\,dt$ replaces the
product $x^* x$. In fact, $x^*$ does not even have to be in $X^*$. This is a more significant
generalization. To implement this idea, one needs to first revise the concept of
convexity.
Definition 1.5.1 (Generalized Convexity) Let Φ be a set of extended real valued
functions. We say f is Φ-convex if

f (x) = sup{φ(x) : φ ∈ Φ, f ≥ φ}.

It is easy to verify that Φ-convex functions are closed under supremum. Thus, every
function has a largest Φ-convex minorant called its Φ-convex hull. Moreover, if f is
Φ-convex then it coincides with its Φ-convex hull. By setting Φ to be the class of
affine functions we get the usual convexity within the class of lower semicontinuous
functions.
Similar to the Fenchel conjugate we define:

Definition 1.5.2 (Generalized Fenchel Conjugate) Let $c$ be a function on $X \times Y$.
We define

$$f^{c(1)}(y) = \sup_{x}\,[c(x, y) - f(x)] \quad\text{and}\quad g^{c(2)}(x) = \sup_{y}\,[c(x, y) - g(y)].$$

They are generalizations of the Fenchel conjugate. When the function $c$ is not
symmetric with respect to its two variables, the $c(1)$ and $c(2)$ conjugates are different.
It is easy to see that the generalized Fenchel conjugate also has the order reversing
property. Define $\Phi_{c(1)} = \{c(\cdot, y) - b : y \in Y, b \in \mathbb{R}\}$ and $\Phi_{c(2)} = \{c(x, \cdot) - b : x \in X, b \in \mathbb{R}\}$.
Then $f^{c(1)}$ is $\Phi_{c(2)}$-convex and $g^{c(2)}$ is $\Phi_{c(1)}$-convex.
Next we discuss some basic properties of generalized Fenchel conjugate.
Theorem 1.5.3 (Fenchel Inequality and Duality) Let $f : X \to \mathbb{R} \cup \{+\infty\}$ and
$g : Y \to \mathbb{R} \cup \{+\infty\}$. Then
(i) (Fenchel inequality) $f^{c(1)}(y) \ge c(x, y) - f(x)$, $g^{c(2)}(x) \ge c(x, y) - g(y)$,
(ii) (Convex hull) The $\Phi_{c(1)}$-convex hull of $f$ is $f^{c(1)c(2)}$, and the $\Phi_{c(2)}$-convex hull of $g$ is $g^{c(2)c(1)}$,
(iii) (Duality) $f^{c(1)} = f^{c(1)c(2)c(1)}$, $g^{c(2)} = g^{c(2)c(1)c(2)}$.
Proof (i) follows directly from the definitions.
To prove (ii) we observe that by (i) $f(x) \ge c(x, y) - f^{c(1)}(y)$. Taking sup over
$y$ we get $f \ge f^{c(1)c(2)}$. On the other hand, if for some $y, b$, $f(x) \ge c(x, y) - b$ for
all $x$, then $b \ge c(x, y) - f(x)$. Taking sup over $x$ we have $b \ge f^{c(1)}(y)$. Thus,

$$f(x) \ge f^{c(1)c(2)}(x) \ge c(x, y) - f^{c(1)}(y) \ge c(x, y) - b,$$

establishing $f^{c(1)c(2)}$ as the largest $\Phi_{c(1)}$-convex function dominated by $f$. The
proof that $g^{c(2)c(1)}$ is the $\Phi_{c(2)}$-convex hull of $g$ is similar.
(iii) follows from (ii) since $f^{c(1)}$ is $\Phi_{c(2)}$-convex and $g^{c(2)}$ is $\Phi_{c(1)}$-convex.
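Because Theorem 1.5.3 holds for arbitrary sets $X$ and $Y$, it can be illustrated on finite grids where every sup is a max. The sketch below is only an illustration under assumptions: the coupling `c`, the grids, and the function `f` are hypothetical choices made for the example.

```python
# Generalized conjugates on finite grids (a hypothetical toy example).
X = [i / 4 for i in range(-8, 9)]      # grid playing the role of X
Y = [j / 4 for j in range(-8, 9)]      # grid playing the role of Y

def c(x, y):
    """Illustrative coupling that is neither bilinear nor symmetric."""
    return -(x - y) ** 2 + 0.5 * x

def conj1(f):
    """f^{c(1)}(y) = max_x [c(x, y) - f(x)], stored as a dict over Y."""
    return {y: max(c(x, y) - f[x] for x in X) for y in Y}

def conj2(g):
    """g^{c(2)}(x) = max_y [c(x, y) - g(y)], stored as a dict over X."""
    return {x: max(c(x, y) - g[y] for y in Y) for x in X}

f = {x: abs(x) ** 3 - x for x in X}    # an arbitrary test function
f1 = conj1(f)                          # f^{c(1)}
f12 = conj2(f1)                        # f^{c(1)c(2)}, the hull of f
f121 = conj1(f12)                      # f^{c(1)c(2)c(1)}

gap = max(f[x] - f12[x] for x in X)
print("largest gap between f and its hull:", gap)
print("f^{c(1)} equals f^{c(1)c(2)c(1)}:",
      all(abs(f1[y] - f121[y]) < 1e-12 for y in Y))
```

The dictionaries make the three statements of the theorem directly checkable: the Fenchel inequality, the minorant property of the biconjugate, and the identity in (iii).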

Remark 1.5.4 We see from the discussion about generalized Fenchel conjugate that
what is essential in dealing with conjugate operation is the closedness with respect
to the sup operation. For simple convexity the key link is that a convex function is
the sup of all the affine functions it dominates. It is a fact based on the fundamental
convex separation theorem.
Generalized convexity can characterize many classes of functions. The following
are a few examples that showcase the potential of this concept.
Example 1.5.5 Let $\langle \cdot, \cdot \rangle$ be the dual pairing between $X$ and $X^*$. Define $c(x, x^*) = \ln \langle x, x^* \rangle$,
with $\ln t = -\infty$ for $t \le 0$. Then a function $f : X \to \mathbb{R} \cup \{+\infty\}$ is
$\Phi_{c(1)}$-convex if and only if $e^f$ (with the convention $e^{-\infty} = 0$) is sublinear.
Example 1.5.6 Let $X = Y = [0, +\infty]$ and define $c(x, y) = xy$, with the
convention $a(+\infty) = +\infty$. Then a function $f : X \to \mathbb{R} \cup \{+\infty\}$ is $\Phi_{c(1)}$-convex
if and only if it is convex and nondecreasing.
Example 1.5.7 Let $X$ be a Hilbert space and $Y = \mathbb{R}_+ \times X$. Define $c(x, (\rho, y)) = -\rho \|x - y\|^2$.
Then $f : X \to \mathbb{R} \cup \{+\infty\}$ is $\Phi_{c(1)}$-convex if and only if it is lower
semicontinuous and has a finite minorant $\phi \in \Phi_{c(1)}$.
The concept of subdifferential and its relationship with Fenchel conjugate can
also be generalized.
Definition 1.5.8 (Generalized Subdifferential) Let $c$ be a function on $X \times Y$.
We say $y_0$ ($x_0$) is a $c(1)$ ($c(2)$)-subdifferential of $f$ ($g$) at $x_0$ ($y_0$) if

$$f(x) - c(x, y_0) \quad (g(y) - c(x_0, y))$$

attains its minimum at $x_0$ ($y_0$).

Notation: $y_0 \in \partial_{c(1)} f(x_0)$ ($x_0 \in \partial_{c(2)} g(y_0)$).
Theorem 1.5.9 (Generalized Fenchel–Young Equality)
(i) (Fenchel equality) $y_0 \in \partial_{c(1)} f(x_0)$ iff $f(x_0) + f^{c(1)}(y_0) = c(x_0, y_0)$.
(ii) (Symmetry) $y_0 \in \partial_{c(1)} f^{c(1)c(2)}(x_0)$ iff $x_0 \in \partial_{c(2)} f^{c(1)}(y_0)$.
(iii) ($\Phi$ convexity) $\partial_{c(1)} f(x_0) \ne \emptyset$ implies that $f$ is $\Phi_{c(1)}$ convex at $x_0$.
On the other hand, $f$ is $\Phi_{c(1)}$ convex at $x_0$ implies that $\partial_{c(1)} f(x_0) = \partial_{c(1)} f^{c(1)c(2)}(x_0)$.
Proof The argument for proving Fenchel equality applies to (i) with $\langle y_0, x_0 \rangle$
replaced by $c(x_0, y_0)$. The rest follows from this generalized Fenchel equality.
Details are left as an exercise.

Similar to the usual subdifferential we have

Theorem 1.5.10 (Cyclical Monotonicity) The subdifferential $\partial_{c(1)} f$ is $c(1)$-cyclically
monotone, that is, for any $m + 1$ pairs of points $y_i \in \partial_{c(1)} f(x_i)$, $i = 0, 1, \dots, m$, we have

$$(c(x_1, y_0) - c(x_0, y_0)) + (c(x_2, y_1) - c(x_1, y_1)) + \dots + (c(x_0, y_m) - c(x_m, y_m)) \le 0.$$

Proof Since $x_i$ minimizes $f - c(\cdot, y_i)$ for each $i$, we have the inequalities

$$\begin{aligned}
f(x_1) - f(x_0) &\ge c(x_1, y_0) - c(x_0, y_0), \\
f(x_2) - f(x_1) &\ge c(x_2, y_1) - c(x_1, y_1), \\
&\ \ \vdots \\
f(x_0) - f(x_m) &\ge c(x_0, y_m) - c(x_m, y_m).
\end{aligned}$$

Adding them and noticing that all the terms on the left-hand side cancel gives the claim.

Next we look at an axiomatic approach to the c-conjugate.

Theorem 1.5.11 (Characterization of c-Conjugate) Define an operator $\Delta$ that
maps an extended valued function $f$ on $X$ to an extended valued function $\Delta f$ on
$Y$. Then $\Delta$ is a $c$-conjugate if and only if
(i) (Duality) $\Delta \inf_\alpha f_\alpha = \sup_\alpha \Delta f_\alpha$,
(ii) (Shift reversing) $\Delta(f + d) = \Delta(f) - d$, $\forall d \in \mathbb{R}$,
where

$$c(x, y) = \Delta(\iota_{\{x\}})(y).$$

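Proof The “only if” part: If $\Delta f = f^{c(1)}$, the two properties can be derived from direct computation.
For property (i)

$$\begin{aligned}
(\inf_\alpha f_\alpha)^{c(1)}(y) &= \sup_x\,[c(x, y) - \inf_\alpha f_\alpha(x)] \\
&= \sup_x \sup_\alpha\,[c(x, y) - f_\alpha(x)] \\
&= \sup_\alpha \sup_x\,[c(x, y) - f_\alpha(x)] = \sup_\alpha f_\alpha^{c(1)}(y).
\end{aligned}$$

For property (ii)

$$\begin{aligned}
(f + d)^{c(1)}(y) &= \sup_x\,[c(x, y) - (f(x) + d)] \\
&= \sup_x\,[c(x, y) - f(x)] - d = f^{c(1)}(y) - d.
\end{aligned}$$

The “if” part: The key is the representation

$$f(\cdot) = \inf_x\,[\iota_{\{x\}}(\cdot) + f(x)].$$

Applying the $\Delta$ operator to the above representation and using (i) and then (ii) we have

$$\begin{aligned}
(\Delta f)(y) &= \Delta\Big(\inf_x\,[\iota_{\{x\}} + f(x)]\Big)(y) \\
&= \sup_x \Delta[\iota_{\{x\}} + f(x)](y) \\
&= \sup_x\,[\Delta(\iota_{\{x\}})(y) - f(x)] = f^{c(1)}(y)
\end{aligned}$$

where

$$c(x, y) = \Delta(\iota_{\{x\}})(y).$$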

Rockafellar Duality

Consider the bi-conjugate setting again. The primal problem is

$$p = v(0) = \inf_{x \in X} F(x, 0) \tag{1.5.1}$$

as one of the family $v(y) = \inf_x F(x, y)$ on the perturbation space $Y$. Let $Z$ be
the “dual parameter space” and let $c(y, z)$ be a coupling function. Define the dual
problem as

$$d = v^{c(1)c(2)}(0) = \sup_{z \in Z}\,\{c(0, z) - v^{c(1)}(z)\}. \tag{1.5.2}$$

This definition is the same as the Rockafellar duality. However, since now c(0, z) is
not necessarily 0 the problem is more involved.
Theorem 1.5.12 (Dual Solution Set) If $d = v^{c(1)c(2)}(0) < \infty$, then the optimal
solution set to the dual problem is $\partial_{c(1)} v^{c(1)c(2)}(0)$.
Proof It follows directly from the definition and is left as an exercise.

Also similar to the Rockafellar duality we have

Theorem 1.5.13 (Weak and Strong Duality) We always have the weak duality
$d = v^{c(1)c(2)}(0) \le v(0) = p$. Equality holds if and only if $v$ is $\Phi_{c(1)}$-convex at 0.
In this case if $d = p$ is finite then the optimal solution set to the dual problem is
$\partial_{c(1)} v(0)$.
Proof As before the weak duality follows easily from the Fenchel–Young inequality.
To prove strong duality notice that $v$ is $\Phi_{c(1)}$-convex at 0 implies that
$\partial_{c(1)} v(0) \ne \emptyset$. Then we can check that each element of $\partial_{c(1)} v(0)$ is a solution to the
dual problem.

Lagrange Duality

Define the Lagrangian for the primal problem as

$$L(x, z) = c(0, z) - F_x^{c(1)}(z)$$

where $F_x(y) := F(x, y)$. Then we have the Lagrange form of the primal: If $F_x(y)$
is $\Phi_{c(1)}$-convex for all $x \in X$ at $y = 0$, then

$$\sup_z L(x, z) = \sup_z\,\{c(0, z) - F_x^{c(1)}(z)\} = F_x^{c(1)c(2)}(0) = F_x(0) = F(x, 0).$$

Thus, the primal problem becomes

$$\inf_x \sup_z L(x, z).$$

Next we consider the Lagrange form of the dual. If $c < +\infty$, we have

$$\begin{aligned}
\inf_x L(x, z) &= \inf_x\,\{c(0, z) - F_x^{c(1)}(z)\} \\
&= \inf_x\,\{c(0, z) - \sup_y\,(c(y, z) - F_x(y))\} \\
&= c(0, z) - \sup_y\,\{c(y, z) - \inf_x F(x, y)\} \\
&= c(0, z) - \sup_y\,\{c(y, z) - v(y)\} = c(0, z) - v^{c(1)}(z).
\end{aligned}$$

Therefore, the dual problem becomes

$$\sup_z \inf_x L(x, z).$$

We see that the primal and dual values are equal if and only if

$$\inf_x \sup_z L(x, z) = \sup_z \inf_x L(x, z).$$
Chapter 2
Financial Models in One Period Economy

Abstract This chapter focuses on financial models in a one period economy with
a finite sample space. Mathematically, these models involve only finite dimensional
spaces yet they still illustrate the main patterns.
In modeling the behavior of agents in a financial market, we usually use
concave utility functions and convex risk measures to characterize their attitudes
towards risk. These agents are subject to various constraints ranging from the
availability of capital and contractual obligations to clients to mandates from regulators.
Thus, the theory regarding constrained (convex) optimization discussed in the
previous chapter is most relevant. The Lagrange multipliers in such financial models
often carry a special financial meaning and are worthy of attention. Moreover, as
illustrated in the previous chapter, they also provide the key link between the primal
and the dual problems.

2.1 Portfolio

Portfolio theory considers the one period financial model in which transactions can
only take place at either the beginning of the period or the end of the period,
represented by $t = 0$ or $1$, respectively. We use a probability space $(\Omega, \mathcal{F}, P)$ to
represent an economy where the $\sigma$-algebra $\mathcal{F}$ is generated by finitely many atoms,
$\mathcal{F} = \sigma(\{B_1, \dots, B_N\})$. We use $RV(\Omega, \mathcal{F}, P)$ to denote the Hilbert space of all
$\mathcal{F}$-measurable random variables endowed with the inner product

$$\langle x, y \rangle = E^P[xy] = \sum_{\omega \in \Omega} x(\omega)y(\omega)P(\omega) = \sum_{i=1}^{N} x(B_i)y(B_i)P(B_i), \tag{2.1.1}$$

where $x(B_i)$ and $y(B_i)$ signify the common value of $\mathcal{F}$-measurable random
variables $x$ and $y$ on atom $B_i$, respectively. We use $\|\cdot\|_{RV}$ to denote the norm on
$RV(\Omega, \mathcal{F}, P)$ induced by the inner product in (2.1.1). Elements in $RV(\Omega, \mathcal{F}, P)$
represent the price or payoff of assets. In a one period economy we may think of
the sample space as simply consisting of the atoms of $\mathcal{F}$. Denoting $\omega_i = B_i$, we have
$\Omega = \{\omega_1, \dots, \omega_N\}$, $P(\omega_i) = P(B_i)$ and $\mathcal{F}$ contains all subsets of $\Omega$.

© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2018
P. Carr, Q. J. Zhu, Convex Duality and Financial Mathematics,
SpringerBriefs in Mathematics, https://doi.org/10.1007/978-3-319-92492-2_2
A financial market is modeled by random vectors $S_t = (S_t^0, S_t^1, \dots, S_t^M)$, $t = 0, 1$,
on $\Omega$ in which $S_t^0$ represents the price of a risk free asset, which for simplicity
is assumed to be cash here so that $S_t^0 = 1$ for $t = 0, 1$, and $\hat{S}_t = (S_t^1, \dots, S_t^M)$
represents the prices of risky assets at time $t$. For each asset $i > 0$, we also assume
that its price $S_0^i$ is a constant and $S_1^i$ is an $\mathcal{F}$-measurable random variable.
Definition 2.1.1 (Portfolio) A portfolio is a vector $\Theta = (\theta^0, \theta^1, \dots, \theta^M) \in \mathbb{R}^{M+1}$
whose $i$th component $\theta^i$ signifies the share of the $i$th asset (with price at $t$
represented by $S_t^i$) in the portfolio. The value of a portfolio $\Theta$ at time $t$ is $\Theta \cdot S_t$,
where the notation “·” signifies the dot product in $\mathbb{R}^{M+1}$.
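The objects introduced so far translate directly into a few lines of code. The sketch below is illustrative only: the probabilities, prices, and portfolio are hypothetical numbers, and random variables are stored as lists indexed by the atoms $B_1, \dots, B_N$.

```python
# A one period market on a finite sample space (all numbers hypothetical).
probs = [0.3, 0.5, 0.2]                 # P(B_i) for the atoms B_1, B_2, B_3

def inner(x, y):
    """Inner product (2.1.1): E^P[xy] = sum_i x(B_i) y(B_i) P(B_i)."""
    return sum(xi * yi * pi for xi, yi, pi in zip(x, y, probs))

def expectation(x):
    """E^P[x], the inner product of x with the constant 1."""
    return inner(x, [1.0] * len(probs))

# Prices S_t = (S_t^0, S_t^1, S_t^2); asset 0 is cash with price 1.
S0 = [1.0, 10.0, 20.0]                  # constants at t = 0
S1 = [                                  # one list per asset, one entry per atom
    [1.0, 1.0, 1.0],
    [8.0, 11.0, 14.0],
    [25.0, 20.0, 16.0],
]
theta = [5.0, 2.0, 1.0]                 # portfolio Theta (shares of each asset)

# Portfolio value Theta . S_t at t = 0 (a number) and t = 1 (a random variable).
value0 = sum(t * s for t, s in zip(theta, S0))
value1 = [sum(theta[i] * S1[i][k] for i in range(len(theta)))
          for k in range(len(probs))]

print("value at t = 0:", value0)
print("value at t = 1 on each atom:", value1)
print("expected value at t = 1:", expectation(value1))
```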
The question is what is the best portfolio. Since different agents have different
preferences there is no unique answer to this question.

2.1.1 Markowitz Portfolio

Markowitz portfolio theory considers only risky assets and is based on the idea that
for a fixed expected return one should choose portfolios with minimum variance,
which serves as a measure for the risk. In general, a portfolio with a higher expected
return is also accompanied by a higher variance (risk). The tradeoff is left to the
individual agent.
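Use $\hat{S} = (S^1, \dots, S^M)$ to denote the price process of the risky assets and $\hat{\Theta} = (\theta_1, \dots, \theta_M)$
to denote the portfolio. For a given expected payoff $r_0$ and an initial
wealth $w_0$ we can formulate the Markowitz portfolio problem as

$$\begin{array}{ll}
\text{minimize} & \mathrm{Var}(\hat{\Theta} \cdot \hat{S}_1) = \sigma^2(\hat{\Theta} \cdot \hat{S}_1) \\
\text{subject to} & E[\hat{\Theta} \cdot \hat{S}_1] = r_0 \\
& \hat{\Theta} \cdot \hat{S}_0 = w_0,
\end{array} \tag{2.1.2}$$

where Var is the variance and $\sigma$ signifies the standard deviation. Regarding $\hat{S}$
as a row vector of random variables and $\hat{\Theta}$ as a row vector, denoting $E[\hat{S}_1] = [E[\hat{S}_1^1], \dots, E[\hat{S}_1^M]]$,

$$A = \begin{bmatrix} E[\hat{S}_1] \\ \hat{S}_0 \end{bmatrix}, \quad\text{and}\quad b = \begin{bmatrix} r_0 \\ w_0 \end{bmatrix}, \tag{2.1.3}$$

we can rewrite (2.1.2) as an entropy maximization problem

$$\begin{array}{ll}
\text{minimize} & f(x) := \frac{1}{2} x^\top \Sigma x \\
\text{subject to} & Ax = b.
\end{array} \tag{2.1.4}$$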
Here $x = \hat{\Theta}^\top$ and

$$\Sigma = E[(\hat{S}_1 - E(\hat{S}_1))^\top (\hat{S}_1 - E(\hat{S}_1))] = \big(E[(S_1^i - E(S_1^i))(S_1^j - E(S_1^j))]\big)_{i,j=1,\dots,M}. \tag{2.1.5}$$

The coefficient 1/2 is added to the risk function to make the computation easier.
Clearly, $\Sigma$ is a symmetric positive semidefinite matrix. We will assume that it is in
fact positive definite. Then the Fenchel conjugate $f^*$ of $f$ (see (1.3.1)) is

$$f^*(y) = \frac{1}{2} y^\top \Sigma^{-1} y. \tag{2.1.6}$$
The constraint qualification condition for strong duality here is $b \in \operatorname{range} A$,
which is to say $(r_0, w_0)$ is feasible for the constraint. Assuming that this constraint
qualification condition is satisfied, it follows from Theorem 1.4.3 on the strong
duality that the value of problem (2.1.4) equals that of its dual:

$$\max_y \Big\{ b^\top y - \frac{1}{2} y^\top A \Sigma^{-1} A^\top y \Big\} = \frac{1}{2} b^\top (A \Sigma^{-1} A^\top)^{-1} b. \tag{2.1.7}$$

Here the optimal solution to the dual is

$$\bar{y} = (A \Sigma^{-1} A^\top)^{-1} b. \tag{2.1.8}$$

It follows that

$$\sigma^2 = b^\top (A \Sigma^{-1} A^\top)^{-1} b. \tag{2.1.9}$$

Let $\bar{x}$ and $\bar{y}$ be the solutions of (2.1.4) and (2.1.7), respectively. By the strong
duality in Theorem 1.4.3 we have $f(\bar{x}) = b^\top \bar{y} - f^*(A^\top \bar{y})$. Since $b^\top \bar{y} = \langle \bar{y}, A\bar{x} \rangle$
it follows that

$$f(\bar{x}) + f^*(A^\top \bar{y}) - \langle \bar{y}, A\bar{x} \rangle = 0. \tag{2.1.10}$$

The equality (2.1.10) via the Fenchel–Young equality in Proposition 1.3.1 tells us

$$\bar{x} = (f^*)'(A^\top \bar{y}) = \Sigma^{-1} A^\top \bar{y}.$$

Thus the optimal portfolio is

$$\bar{x} = \Sigma^{-1} A^\top (A \Sigma^{-1} A^\top)^{-1} b. \tag{2.1.11}$$

Define $\alpha = E[\hat{S}_1] \Sigma^{-1} E[\hat{S}_1]^\top$, $\beta = E[\hat{S}_1] \Sigma^{-1} \hat{S}_0^\top$ and $\gamma = \hat{S}_0 \Sigma^{-1} \hat{S}_0^\top$. We have

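Theorem 2.1.2 (Markowitz Portfolio Theorem) For given initial wealth $w_0$ and
expected payoff $r_0$, the Markowitz portfolio $\Theta$ and the minimum risk in terms of
standard deviation $\sigma$ are determined by

$$\sigma(r_0, w_0) = \sqrt{\frac{\gamma r_0^2 - 2\beta r_0 w_0 + \alpha w_0^2}{\alpha\gamma - \beta^2}} \tag{2.1.12}$$

and

$$\Theta(r_0, w_0) = \frac{E[\hat{S}_1](\gamma r_0 - \beta w_0) + \hat{S}_0(\alpha w_0 - \beta r_0)}{\alpha\gamma - \beta^2}\, \Sigma^{-1}. \tag{2.1.13}$$

Proof Using (2.1.3) and the definition of $\alpha$, $\beta$, and $\gamma$ we have

$$A \Sigma^{-1} A^\top = \begin{bmatrix} \alpha & \beta \\ \beta & \gamma \end{bmatrix}.$$

Thus, (2.1.9) becomes

$$\sigma^2 = [r_0, w_0] \begin{bmatrix} \gamma & -\beta \\ -\beta & \alpha \end{bmatrix} \begin{bmatrix} r_0 \\ w_0 \end{bmatrix} \Big/ (\alpha\gamma - \beta^2) \tag{2.1.14}$$

which verifies (2.1.12). Similarly (2.1.11) leads to (2.1.13).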
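The closed-form expressions (2.1.12) and (2.1.13) are easy to evaluate. The following sketch assumes, purely for simplicity, three uncorrelated risky assets so that $\Sigma$ is diagonal and $\Sigma^{-1}$ needs no linear solver; the data and the helper names are hypothetical.

```python
import math

# Hypothetical data: three uncorrelated risky assets (diagonal Sigma).
mean1 = [11.0, 23.0, 36.0]      # E[S_1^i]
price0 = [10.0, 20.0, 30.0]     # S_0^i
var = [4.0, 9.0, 25.0]          # diagonal entries of Sigma

def weighted_dot(u, v):
    """u Sigma^{-1} v^T for the diagonal Sigma above."""
    return sum(ui * vi / s for ui, vi, s in zip(u, v, var))

alpha = weighted_dot(mean1, mean1)
beta = weighted_dot(mean1, price0)
gamma = weighted_dot(price0, price0)
det = alpha * gamma - beta ** 2

def sigma(r0, w0):
    """Minimum standard deviation, formula (2.1.12)."""
    return math.sqrt((gamma * r0 ** 2 - 2 * beta * r0 * w0 + alpha * w0 ** 2) / det)

def theta(r0, w0):
    """Markowitz portfolio, formula (2.1.13), as a list of shares."""
    return [
        (m * (gamma * r0 - beta * w0) + p * (alpha * w0 - beta * r0)) / (det * s)
        for m, p, s in zip(mean1, price0, var)
    ]

def variance(x):
    """x Sigma x^T for the diagonal Sigma above."""
    return sum(xi * xi * s for xi, s in zip(x, var))

r0, w0 = 115.0, 100.0
x = theta(r0, w0)
print("portfolio:", [round(v, 4) for v in x])
print("std dev from (2.1.12):", sigma(r0, w0))
print("std dev of portfolio :", math.sqrt(variance(x)))
```

The two printed standard deviations should agree, and the portfolio should satisfy both constraints of (2.1.2) up to rounding.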

Since both $\sigma(r_0, w_0)$ and $\Theta(r_0, w_0)$ are positive homogeneous functions, we
have
Corollary 2.1.3 Use $\mu$ to denote the expected return on unit initial wealth and let
$\sigma = \sigma(\mu, 1)$ and $\Theta = \Theta(\mu, 1)$. Then

$$\sigma = \sqrt{\frac{\gamma \mu^2 - 2\beta\mu + \alpha}{\alpha\gamma - \beta^2}} \tag{2.1.15}$$

and

$$\Theta = \frac{E[\hat{S}_1](\gamma\mu - \beta) + \hat{S}_0(\alpha - \beta\mu)}{\alpha\gamma - \beta^2}\, \Sigma^{-1}. \tag{2.1.16}$$

Moreover, $\sigma(\mu w_0, w_0) = w_0 \sigma$ and $\Theta(\mu w_0, w_0) = w_0 \Theta$.

We now turn to a graphical interpretation of the Markowitz portfolio theory. Note
that (2.1.15) also determines $\mu$ as a function of $\sigma$. Drawing this function on the $\sigma\mu$-plane
we get a curve called a Markowitz bullet because of its shape. It is
also often referred to as the Markowitz frontier (Figure 2.1).
Fig. 2.1 Markowitz bullet

Every point inside the Markowitz bullet represents a portfolio that can be moved
horizontally to the left to a point on the boundary of the bullet. This point on the
boundary represents a portfolio with the same expected return but less risk. For
every point on the lower half of the boundary of the Markowitz bullet, one can find
a corresponding point on the upper half of the boundary with the same variance and
a higher expected return. Thus, preferred portfolios are represented by points on the
upper boundary of the Markowitz bullet. We note that the upper boundary of the
Markowitz bullet has an asymptote whose slope can be determined by

$$\lim_{\sigma \to \infty} \frac{\mu}{\sigma} = \sqrt{\frac{\alpha\gamma - \beta^2}{\gamma}}. \tag{2.1.17}$$

By taking the limit of the tangent line of points on the boundary of the Markowitz
bullet one can show that the $\mu$-intercept of this asymptote is at $\beta/\gamma$. This number
will play an important role in our discussion of the capital asset pricing model. In
fact, the asymptote for the upper boundary of the Markowitz bullet passes through
this point.
Although the Markowitz bullet is nonlinear, the Markowitz portfolio (2.1.16) is
an affine function of the return. This leads to
Theorem 2.1.4 (Two Fund Theorem) Given two distinct portfolios on the
Markowitz bullet (2.1.15), any portfolio on the Markowitz bullet can be
represented as their linear combination.
Proof This follows directly from the affine structure of the Markowitz optimal
portfolio (2.1.16). In fact, suppose that

$$\Theta_i = \frac{E[\hat{S}_1](\gamma\mu_i - \beta) + \hat{S}_0(\alpha - \beta\mu_i)}{\alpha\gamma - \beta^2}\, \Sigma^{-1}, \quad i = 1, 2$$

are two distinct Markowitz portfolios so that $\mu_1 \ne \mu_2$. Then any Markowitz efficient
portfolio described in (2.1.16) can be explicitly represented as

$$\Theta = \frac{\mu - \mu_2}{\mu_1 - \mu_2}\, \Theta_1 + \frac{\mu - \mu_1}{\mu_2 - \mu_1}\, \Theta_2.$$
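The representation in the proof can be checked numerically. This is a minimal sketch under the same simplifying assumption of three hypothetical uncorrelated assets (diagonal $\Sigma$); `frontier_portfolio` is an illustrative name for formula (2.1.16).

```python
# Hypothetical data: three uncorrelated risky assets (diagonal Sigma).
mean1 = [11.0, 23.0, 36.0]      # E[S_1^i]
price0 = [10.0, 20.0, 30.0]     # S_0^i
var = [4.0, 9.0, 25.0]          # diagonal entries of Sigma

def weighted_dot(u, v):
    """u Sigma^{-1} v^T for the diagonal Sigma above."""
    return sum(ui * vi / s for ui, vi, s in zip(u, v, var))

alpha = weighted_dot(mean1, mean1)
beta = weighted_dot(mean1, price0)
gamma = weighted_dot(price0, price0)
det = alpha * gamma - beta ** 2

def frontier_portfolio(mu):
    """Portfolio (2.1.16) for expected return mu on unit initial wealth."""
    return [
        (m * (gamma * mu - beta) + p * (alpha - beta * mu)) / (det * s)
        for m, p, s in zip(mean1, price0, var)
    ]

mu1, mu2, mu = 1.10, 1.20, 1.16
theta1, theta2 = frontier_portfolio(mu1), frontier_portfolio(mu2)

# Weights from the proof of the two fund theorem; they sum to one.
w1 = (mu - mu2) / (mu1 - mu2)
w2 = (mu - mu1) / (mu2 - mu1)
combo = [w1 * t1 + w2 * t2 for t1, t2 in zip(theta1, theta2)]
direct = frontier_portfolio(mu)

print("combination of two funds:", [round(v, 6) for v in combo])
print("direct formula (2.1.16) :", [round(v, 6) for v in direct])
```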

Remark 2.1.5 In pointing out that all portfolios on the Markowitz frontier are
generated by just two such portfolios, the two fund theorem has great practical
significance. One can often use two broad based indices to approximate the two
basic generating portfolios for the Markowitz frontier. This can be viewed as a
theoretical foundation for the passive investment strategy of buying and holding broad
based indices.
If our sole goal is to minimize the risk, then our problem becomes

$$\begin{array}{ll}
\text{minimize} & f(x) := \frac{1}{2} x^\top \Sigma x \\
\text{subject to} & \hat{S}_0 x = w_0.
\end{array} \tag{2.1.18}$$

Using a similar argument one can show

Theorem 2.1.6 (Minimum Risk Portfolio) The minimum risk portfolio is

$$\Theta_{\min} = \gamma^{-1} w_0 \hat{S}_0 \Sigma^{-1}$$

and its standard deviation is

$$\sigma_{\min} = \gamma^{-1/2} w_0.$$

2.1.2 Capital Asset Pricing Model

The capital asset pricing model (CAPM) is an equilibrium model for determining the
price of risky assets. It is based on the Markowitz mean variance analysis extended
to include a riskless bond. The mathematical model is

$$\begin{array}{ll}
\text{minimize} & \mathrm{Var}(\Theta \cdot S_1) \\
\text{subject to} & E[\Theta \cdot S_1] = \mu \\
& \Theta \cdot S_0 = 1.
\end{array} \tag{2.1.19}$$

Here we standardize the initial wealth to 1 and $\mu$ is the expected return.

It turns out that the efficient portfolios determined by (2.1.19) all lie on a straight
line in the $\sigma\mu$-plane. This line is called the capital market line. Then the model
prices a risky asset according to the principle that adding it to the market does not
change the capital market line.
We derive the capital market line using convex duality first. Recall that $S_1 = (S_1^0, \hat{S}_1)$.
Since $\mathrm{Var}(S_1^0) = 0$ one can show that

$$\mathrm{Var}(\Theta \cdot S_1) = \mathrm{Var}(\hat{\Theta} \cdot \hat{S}_1). \tag{2.1.20}$$

Relation (2.1.20) suggests a strategy of solving problem (2.1.19) in two steps. First,
for a portfolio with $\theta = \theta^0 \ge 0$, denoting by $R = S_1^0 / S_0^0$ the return on the risk free asset,
we solve the problem

$$\begin{array}{ll}
\text{minimize} & \mathrm{Var}(\hat{\Theta} \cdot \hat{S}_1) \\
\text{subject to} & E[\hat{\Theta} \cdot \hat{S}_1] = \mu - \theta R \\
& \hat{\Theta} \cdot \hat{S}_0 = 1 - \theta.
\end{array} \tag{2.1.21}$$

Then, we minimize the minimum variance of (2.1.21) as a function of $\theta$.

By Theorem 2.1.2 the minimum variance corresponding to problem (2.1.21) as
a function of $\theta$ is determined by

$$f(\theta) = [\sigma(\mu - \theta R, 1 - \theta)]^2 = \frac{\gamma(\mu - \theta R)^2 - 2\beta(\mu - \theta R)(1 - \theta) + \alpha(1 - \theta)^2}{\alpha\gamma - \beta^2}. \tag{2.1.22}$$

Clearly, the solution of problem (2.1.19) corresponds to the minimum of the function
$f$, if it exists. Since $f$ is a quadratic function of $\theta$, the minimum is attained at

$$\bar{\theta} = \frac{\alpha - \beta(\mu + R) + \gamma\mu R}{\alpha - 2\beta R + \gamma R^2}, \tag{2.1.23}$$

the solution to the equation $f'(\theta) = 0$. Denote $\Delta := \alpha - 2\beta R + \gamma R^2 > 0$. It is
easy to see that the share invested in the risky assets is

$$1 - \bar{\theta} = (\beta - \gamma R)\,\frac{\mu - R}{\Delta}. \tag{2.1.24}$$

We observe that only $\mu > R$ makes sense because by including risky assets we
always expect to get a higher return than the risk free assets. Note that the risky
assets are involved in the minimum variance portfolio only when $1 - \bar{\theta} > 0$. This
implies

$$R < \beta/\gamma \tag{2.1.25}$$

by (2.1.24). Let us focus on the case when $R$ satisfies (2.1.25). We can calculate

$$\mu - \bar{\theta} R = (\alpha - \beta R)\,\frac{\mu - R}{\Delta}. \tag{2.1.26}$$

By the positive homogeneous property of $\sigma$ we have

$$\sigma = \sigma(\mu - \bar{\theta} R, 1 - \bar{\theta}) = \sigma(\alpha - \beta R, \beta - \gamma R)\,\frac{\mu - R}{\Delta}. \tag{2.1.27}$$

It is easy to verify that $\sigma(\alpha - \beta R, \beta - \gamma R) = \sqrt{\Delta}$. Thus, all the optimal portfolios
lie on the line

$$\mu = R + \sqrt{\Delta}\,\sigma. \tag{2.1.28}$$

This line on the $\sigma\mu$-plane is usually referred to as the capital market line. This
linear structure of the optimal portfolios suggests that we can derive all the optimal
portfolios as the linear combinations of two distinct portfolios. Taking the risk free
bond and a portfolio of pure risky assets we have the following
Theorem 2.1.7 (Two Fund Separation Theorem) All the optimal portfolios on
the capital market line can be represented as the linear combination of the riskless
bond and the capital market portfolio

$$\Theta_M = \frac{E[S_1] - R S_0}{\beta - \gamma R}\, \Sigma^{-1} = \frac{E[S_1] - R S_0}{(E[\hat{S}_1] - R \hat{S}_0)\Sigma^{-1} \hat{S}_0^\top}\, \Sigma^{-1}, \tag{2.1.29}$$

whose corresponding coordinates in the $\sigma\mu$-plane are

$$(\sigma_M, \mu_M) = \left( \frac{\sqrt{\Delta}}{\beta - \gamma R}, \frac{\alpha - \beta R}{\beta - \gamma R} \right). \tag{2.1.30}$$

Proof Clearly the riskless bond is on the capital market line and can be represented
in the $\sigma\mu$-plane as $(0, R)$. We now seek a portfolio on the capital market line that
contains only risky assets. We denote its coordinates by $(\sigma_M, \mu_M)$. Note that such a
portfolio corresponds to $\bar{\theta} = 0$. It follows from (2.1.24) that

$$\mu_M = R + \frac{\Delta}{\beta - \gamma R} = \frac{\alpha - \beta R}{\beta - \gamma R}. \tag{2.1.31}$$

Thus, we can find the risky part of the capital market portfolio by solving

$$\begin{array}{ll}
\text{minimize} & \mathrm{Var}(\hat{\Theta} \cdot \hat{S}_1) \\
\text{subject to} & E[\hat{\Theta} \cdot \hat{S}_1] = \dfrac{\alpha - \beta R}{\beta - \gamma R} \\
& \hat{\Theta} \cdot \hat{S}_0 = 1.
\end{array} \tag{2.1.32}$$

By Theorem 2.1.2, we derive the optimal portfolio of (2.1.32) to be

$$\hat{\Theta}_M = \frac{E[\hat{S}_1] - R \hat{S}_0}{\beta - \gamma R}\, \Sigma^{-1}. \tag{2.1.33}$$

Noting that the weight on the riskless bond is 0 for the capital market portfolio we
arrive at the representation in (2.1.29): $\Theta_M = (0, \hat{\Theta}_M)$.
Finally, comparing (2.1.28) and (2.1.31), we derive

$$\sigma_M = \frac{\sqrt{\Delta}}{\beta - \gamma R}. \tag{2.1.34}$$

Clearly, the point $(\sigma_M, \mu_M)$ lies on the boundary of the Markowitz bullet.
Moreover, since the capital market line represents optimal portfolios, the Markowitz
frontier must lie below it. Thus, the capital market line must be tangent to the
Markowitz frontier at $(\sigma_M, \mu_M)$ (see Figure 2.2). As a result, if $R \ge \beta/\gamma$, there
is no capital market line (see Figure 2.3), which confirms what has been derived
analytically in (2.1.25).
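The relations among $\Delta$, the capital market line, and the Markowitz bullet can be verified on a small example. The sketch below assumes three hypothetical uncorrelated risky assets and a hypothetical riskless return `R` chosen to satisfy (2.1.25); it is an illustration, not a general implementation.

```python
import math

# Hypothetical data: three uncorrelated risky assets (diagonal Sigma).
mean1 = [11.0, 23.0, 36.0]      # E[S_1^i]
price0 = [10.0, 20.0, 30.0]     # S_0^i
var = [4.0, 9.0, 25.0]          # diagonal entries of Sigma
R = 1.05                        # riskless return, assumed below beta / gamma

def weighted_dot(u, v):
    """u Sigma^{-1} v^T for the diagonal Sigma above."""
    return sum(ui * vi / s for ui, vi, s in zip(u, v, var))

alpha = weighted_dot(mean1, mean1)
beta = weighted_dot(mean1, price0)
gamma = weighted_dot(price0, price0)
det = alpha * gamma - beta ** 2
delta = alpha - 2 * beta * R + gamma * R ** 2

def sigma(r0, w0):
    """Minimum standard deviation of risky portfolios, formula (2.1.12)."""
    return math.sqrt((gamma * r0 ** 2 - 2 * beta * r0 * w0 + alpha * w0 ** 2) / det)

def capital_market_line(s):
    """Expected return on the capital market line (2.1.28) at risk level s."""
    return R + math.sqrt(delta) * s

# Coordinates (2.1.30) of the capital market portfolio.
sigma_M = math.sqrt(delta) / (beta - gamma * R)
mu_M = (alpha - beta * R) / (beta - gamma * R)

print("beta / gamma =", beta / gamma)
print("(sigma_M, mu_M) =", (sigma_M, mu_M))
print("line at sigma_M  =", capital_market_line(sigma_M))
print("bullet at mu_M   =", sigma(mu_M, 1.0))
```

The last two printed values should reproduce $\mu_M$ and $\sigma_M$ respectively, showing that the capital market portfolio lies both on the line and on the bullet.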
Using the fact that both $(0, R)$ and $(\sigma_M, \mu_M)$ belong to the capital market line
we can rewrite the capital market line as

$$\mu = \frac{\mu_M - R}{\sigma_M}\, \sigma + R. \tag{2.1.35}$$

The theorem below tells us how to use this capital market line to price a risky asset
in terms of its expected return.

Fig. 2.2 Capital market line and Markowitz bullet

Fig. 2.3 No capital market line

Theorem 2.1.8 (Capital Asset Pricing Model) Suppose that we know a financial
market $S$ with a riskless bond returning $R$. Let $a^i$ be a fairly priced risky asset with
expected percentage return $\mu_i$. Then

$$\mu_i = R + \beta_i(\mu_M - R). \tag{2.1.36}$$

Here $\beta_i = \sigma_{iM} / \sigma_M^2$ is called the beta of $a^i$, where $\sigma_{iM} = \mathrm{cov}(a^i, \hat{\Theta}_M \cdot \hat{S}_1)$ is the
covariance of $a^i$ and the market portfolio.
Proof Consider a portfolio depending on the parameter $\alpha$ that consists of the risky asset
$a^i$ and the capital market portfolio:

$$p(\alpha) = \alpha a^i + (1 - \alpha)\hat{\Theta}_M \cdot \hat{S}. \tag{2.1.37}$$

Denoting the expected return and the standard deviation of $p(\alpha)$ by $\mu_\alpha$ and $\sigma_\alpha$,
respectively, we have

$$\mu_\alpha = \alpha\mu_i + (1 - \alpha)\mu_M, \tag{2.1.38}$$

and

$$\sigma_\alpha^2 = \alpha^2 \sigma_i^2 + 2\alpha(1 - \alpha)\sigma_{iM} + (1 - \alpha)^2 \sigma_M^2, \tag{2.1.39}$$

where $\mu_i$ and $\sigma_i$ are the expected return and standard deviation of asset $a^i$,
respectively. The parametric curve $(\sigma_\alpha, \mu_\alpha)$ must lie below the capital market line
because the latter consists of optimal portfolios. On the other hand, it is clear that
when $\alpha = 0$ this curve meets the capital market line at $(\sigma_M, \mu_M)$. Thus, the capital
market line is a tangent line of the parametric curve $(\sigma_\alpha, \mu_\alpha)$ at $\alpha = 0$. It follows that

$$\frac{\mu_M - R}{\sigma_M} = \left. \frac{d\mu_\alpha}{d\sigma_\alpha} \right|_{\alpha = 0} = \frac{\sigma_M(\mu_i - \mu_M)}{\sigma_{iM} - \sigma_M^2}. \tag{2.1.40}$$

Solving for $\mu_i$ we derive

$$\mu_i = R + \beta_i(\mu_M - R). \tag{2.1.41}$$
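As a sanity check, the pricing relation can be tested on the risky assets that make up the market itself. The sketch below assumes three hypothetical uncorrelated assets, and treats asset $j$ as the payoff $S_1^j / S_0^j$ of one unit of wealth invested in it; both the data and this normalization are assumptions made for the illustration.

```python
# Hypothetical data: three uncorrelated risky assets (diagonal Sigma).
mean1 = [11.0, 23.0, 36.0]      # E[S_1^j]
price0 = [10.0, 20.0, 30.0]     # S_0^j
var = [4.0, 9.0, 25.0]          # Var(S_1^j)
R = 1.05                        # hypothetical riskless return

def weighted_dot(u, v):
    """u Sigma^{-1} v^T for the diagonal Sigma above."""
    return sum(ui * vi / s for ui, vi, s in zip(u, v, var))

beta = weighted_dot(mean1, price0)
gamma = weighted_dot(price0, price0)
scale = beta - gamma * R

# Risky part of the capital market portfolio, formula (2.1.33).
theta_M = [(m - R * p) / (scale * s) for m, p, s in zip(mean1, price0, var)]
mu_M = sum(t * m for t, m in zip(theta_M, mean1))
var_M = sum(t * t * s for t, s in zip(theta_M, var))

def capm_return(j):
    """Right-hand side of (2.1.36) for one unit of wealth in asset j."""
    # For uncorrelated assets only the j-th term of the covariance survives.
    cov_jM = theta_M[j] * var[j] / price0[j]
    beta_j = cov_jM / var_M
    return R + beta_j * (mu_M - R)

for j in range(len(mean1)):
    print(f"asset {j + 1}: expected return {mean1[j] / price0[j]:.6f}, "
          f"CAPM value {capm_return(j):.6f}")
```

Each asset already in the market should reproduce its own expected return, which is the consistency the tangency argument in the proof expresses.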

2.1.3 Sharpe Ratio

On further reflection we realize that to construct the capital market portfolio,
theoretically, we need to use every risky asset available to us. Given
the huge number of available equities, constructing the capital market portfolio is
practically impossible even if we have accurate probability distribution information
on all the available risky assets (which is another impossible task). Thus, we have
to deal with suboptimal situations. What happens if we mix the risk free asset with an
arbitrary portfolio of risky assets (not necessarily the capital market portfolio)? Let
$\hat{\Theta} = (\theta_1, \dots, \theta_M)$ be such a portfolio corresponding to risky assets $(a^1, \dots, a^M)$
with price random vector $\hat{S} = (S^1, \dots, S^M)$. Again we standardize the portfolio so
that $\hat{\Theta} \cdot \hat{S}_0 = 1$. Denote $\mu^* = E[\hat{\Theta} \cdot \hat{S}_1]$ and $\sigma^* = \sqrt{\mathrm{Var}(\hat{\Theta} \cdot \hat{S}_1)}$. Then any mix of
this portfolio with a risk free asset having return $R$ will produce a portfolio whose
expected return $\mu$ and standard deviation $\sigma$ lie on the line

$$\mu = \frac{\mu^* - R}{\sigma^*}\, \sigma + R. \tag{2.1.42}$$

Portfolios of risky assets with larger $(\mu^* - R)/\sigma^*$ have the potential of generating higher
return for a fixed level of risk (see Figure 2.4). Sharpe proposed a formula for
comparing risky portfolios, such as those maintained by mutual funds, using this idea.
As an illustration, suppose that $R_1, \dots, R_N$ are the monthly returns of a mutual fund
$a$ in the past $N$ months and the monthly return of the risk free asset is $R$. Define a
random variable $X$ with finite values $\{R_n - R \mid n = 1, \dots, N\}$ and $\mathrm{prob}(X = R_n - R) = 1/N$.
Then the Sharpe ratio of $a$ is defined as

$$s(a) = \frac{E[X]}{\sqrt{\mathrm{Var}(X)}}. \tag{2.1.43}$$

We can see that the Sharpe ratio is, in fact, a statistical estimate of $(\mu^* - R)/\sigma^*$.
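Definition (2.1.43) can be computed directly from a list of returns. In the sketch below the monthly returns and the risk free rate are hypothetical numbers; since each value of $X$ has probability $1/N$, the variance is the population variance (division by $N$).

```python
import math

def sharpe_ratio(fund_returns, risk_free):
    """Sharpe ratio (2.1.43) of the equally likely excess returns R_n - R."""
    excess = [r - risk_free for r in fund_returns]
    n = len(excess)
    mean = sum(excess) / n
    # Population variance, matching prob(X = R_n - R) = 1/N.
    variance = sum((e - mean) ** 2 for e in excess) / n
    return mean / math.sqrt(variance)

monthly = [0.021, -0.013, 0.034, 0.008, -0.004, 0.017]   # hypothetical returns
print("Sharpe ratio:", sharpe_ratio(monthly, risk_free=0.002))
```

Practical conventions often use the sample variance or annualize the ratio; the code follows the definition in the text rather than any such convention.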
Fig. 2.4 Sharpe ratio

2.2 Utility Functions

In financial problems maximizing utilities and minimizing risks are constant themes.
In the Markowitz portfolio theory, one uses expected return to measure performance
and the variance to measure the risk. They are among the simplest of such measures.
Since utility functions are concave and risk measures are convex, convex analysis is
a natural tool in dealing with financial modeling.

2.2.1 Utility Functions

In 1713 Nicolas Bernoulli posed the following problem, later known as the St.
Petersburg Wager paradox:
“Peter tosses a coin and continues to do so until it should land “heads” when it
comes to the ground. He agrees to give Paul one ducat¹ if he gets “heads” on the
very first throw, two ducats if he gets it on the second, four if on the third, eight if
on the fourth, and so on, so that with each additional throw the number he must
pay is doubled. Suppose we seek to determine the value of Paul’s expectation.”

¹ Currency unit.

Assuming a fair coin we can easily calculate the expectation to be

$$\sum_{n=1}^{\infty} 2^{n-1} \cdot P(\text{getting the first head on the } n\text{th throw}) = \sum_{n=1}^{\infty} 2^{n-1}\, \frac{1}{2^n} = \sum_{n=1}^{\infty} \frac{1}{2} = \infty.$$

The paradox lies in the fact that, according to this computation, the value of the right to play
such a game would be infinite. In other words, one would be willing to pay any cost
to play it, which is obviously absurd.
Daniel Bernoulli, Nicolas's cousin, suggested a solution in 1738 which later became
highly influential. Observing that while an extra 100 ducats may be considered a small
fortune by a poor person it may mean little to a rich one, Daniel Bernoulli argued that people
intuitively value money not according to its face value but according to its relative usefulness.
Mathematically, he introduced the utility function to capture this. For the St. Petersburg
Wager problem, Bernoulli suggested using $u(x) = \ln(x)$ as the utility function.
Bernoulli chose ln as a utility function because of two properties of this
function. First, the ln function is increasing, signaling the more the better. Second, the
derivative of the ln function is $1/x$, which is decreasing. This matches the intuition
that the more you have the less you care about additional money. Abstractly, let
us denote a utility function by $u(x)$. For convenience let us assume $u$ is twice
differentiable. Then we can characterize the above two properties as $u'(x) \ge 0$
and $u''(x) \le 0$. Alternatively, without assuming differentiability of $u$ we can also
encode the intuition above mathematically by requiring a utility function to be an
increasing concave function. We say a function $f : \mathbb{R} \to \mathbb{R}$ is concave if and
only if $-f$ is convex. If $-f$ is concave, we say $f$ is convex. Usually we assume that
rational agents maximize their expected utility when making decisions. Thus,
convex optimization becomes important in analyzing financial problems.
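The contrast between the divergent expected payoff and the finite expected log utility is easy to see numerically. The sketch below truncates both series after a given number of throws; the function names are illustrative.

```python
import math

def expected_payoff(max_throws):
    """Partial sum of 2^(n-1) * 2^(-n) over n = 1..max_throws."""
    return sum(2 ** (n - 1) / 2 ** n for n in range(1, max_throws + 1))

def expected_log_utility(max_throws):
    """Partial sum of ln(2^(n-1)) * 2^(-n) over n = 1..max_throws."""
    return sum((n - 1) * math.log(2) / 2 ** n for n in range(1, max_throws + 1))

for throws in (10, 100, 1000):
    print(throws, expected_payoff(throws), expected_log_utility(throws))

# Sure amount of ducats with the same log utility as the (truncated) game.
print("certainty equivalent:", math.exp(expected_log_utility(200)))
```

The partial sums of the expected payoff grow like half the number of throws, whereas the expected log utility converges to $\ln 2$ because $\sum_{n \ge 1} (n-1)/2^n = 1$. Under Bernoulli's utility the game is therefore worth the same as a sure payment of 2 ducats.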
There are many increasing concave functions. A few are listed below.
• Power utility: $(x^{1-\gamma} - 1)/(1 - \gamma)$, $\gamma > 0$, $\gamma \ne 1$.
• Log utility: $\ln(x)$.
• Exponential utility: $-e^{-\alpha x}$, $\alpha > 0$.
In dealing with a particular application problem the choice of the utility function
is often based on economic or tractability considerations. Different agents can have
different utility functions that reflect their own attitude towards rewards and risks of
various degrees.
For our mathematical model, it is important to know what kind of general
conditions we should impose on a utility function. We consider a general extended
valued upper semicontinuous utility function $u$. The following is a collection of
additional conditions that are often used in financial models to accommodate
different levels of tolerance to risk:
(u1) (Risk aversion) $u$ is strictly concave,
(u2) (Profit seeking) $u$ is strictly increasing and $\lim_{t \to +\infty} u(t) = +\infty$,
(u3) (Bankruptcy forbidden) For any $t < 0$, $u(t) = -\infty$.
48 2 Financial Models in One Period Economy

2.2.2 Measuring Risk Aversion

Comparing tendency of risk aversion by directly examining the utility functions is

difficult. The following tools are useful.
Definition 2.2.1 (Arrow-Pratt Absolute Risk Aversion Coefficient (ARA)) The
coefficient of absolute risk aversion is defined as

u (x)
A(x) = − .
u (x)

Constant absolute risk aversion (CARA) refers to A(x) = α is a constant, e.g.

u(x) = 1 − e−αx . Hyperbolic absolute risk aversion (HARA) refers to A(x) =
1/(ax + b) is a hyperbolic function, e.g.

(x − x0 )1−γ
u(x) =
1−γ

where γ = 1/a, x0 = −b/a.

Definition 2.2.2 (Relative Risk Aversion Coefficient (RRA)) The coefficient of
relative risk aversion is defined as

xu (x)
R(x) = − .
u (x)

When ARA decreases the investor will increase risky investment in absolute
amount. Similarly, when RRA decreases the investor will increase risky investment
in percentage.
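The two coefficients can be estimated numerically for any smooth utility. The sketch below uses central finite differences; the step size h and the helper names are assumptions chosen for illustration. It confirms that the exponential utility has constant ARA equal to α and that the power utility has constant RRA equal to γ.

```python
import math

def derivatives(u, x, h=1e-4):
    """Central finite-difference estimates of u'(x) and u''(x)."""
    d1 = (u(x + h) - u(x - h)) / (2 * h)
    d2 = (u(x + h) - 2 * u(x) + u(x - h)) / (h * h)
    return d1, d2

def ara(u, x):
    """Absolute risk aversion A(x) = -u''(x)/u'(x)."""
    d1, d2 = derivatives(u, x)
    return -d2 / d1

def rra(u, x):
    """Relative risk aversion R(x) = -x u''(x)/u'(x)."""
    return x * ara(u, x)

alpha, gamma = 2.0, 3.0
exp_u = lambda x: 1 - math.exp(-alpha * x)               # CARA example
power_u = lambda x: (x ** (1 - gamma) - 1) / (1 - gamma)  # constant RRA example

ara_values = [ara(exp_u, x) for x in (0.5, 1.0, 2.0)]     # each close to alpha
rra_values = [rra(power_u, x) for x in (0.5, 1.0, 2.0)]   # each close to gamma
```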
The property that a utility function has bounded ARA and RRA can be
characterized by generalized convexity. We showcase the proof for RRA.
Theorem 2.2.3 (Characterization of Bounded Relative Risk Aversion) Let u :
R+ → R be an increasing (decreasing) function with continuous second order
derivative. Then, for any p ∈ R, u has a coefficient of relative risk aversion R(x) ≤
(≥) 1 − p if and only if u is Φ(x^p y)(1)-convex.
Proof We focus on the case where u is increasing; the decreasing case is similar.
The “If” part. Assume u is Φ(x^p y)(1)-convex. Then, for any x > 0 we can find
y(x), b(x) such that

$$u(z) \ge y(x) z^p - b(x), \quad \forall z > 0,$$

with equality at z = x. Let

$$z \mapsto f(z) := u(z) - y(x) z^p + b(x).$$

Since f attains its minimum at z = x, we have $f'(x) = 0$ and $f''(x) \ge 0$, which give us

$$R(x) = -\frac{x\,u''(x)}{u'(x)} \le 1 - p.$$

The “Only if” part. Write the condition R(x) ≤ 1 − p as

$$\frac{u''(s)}{u'(s)} \ge \frac{p-1}{s}.$$

Then integrate this inequality over [x, z] and solve for u. Details are left as an exercise.
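One possible way to carry out the exercise is sketched below. It assumes, in addition to the hypotheses of the theorem, that u′ > 0 on (0, ∞) and that p ≠ 0; the labels y(x) and b(x) follow the “If” part.

```latex
% Sketch, assuming u'(s) > 0 on (0, \infty) and p \neq 0.
\begin{align*}
\frac{d}{ds}\ln u'(s) \ge \frac{p-1}{s}
&\;\Longrightarrow\;
u'(s) \ge u'(x)\Bigl(\frac{s}{x}\Bigr)^{p-1} \ (s \ge x), \qquad
u'(s) \le u'(x)\Bigl(\frac{s}{x}\Bigr)^{p-1} \ (s \le x), \\
u(z) - u(x) = \int_x^z u'(s)\,ds
&\;\ge\; u'(x)\,x^{1-p}\int_x^z s^{p-1}\,ds
 = \frac{u'(x)\,x^{1-p}}{p}\,\bigl(z^p - x^p\bigr) \qquad (z > 0).
\end{align*}
% Hence u(z) \ge y(x) z^p - b(x) for all z > 0, with equality at z = x, where
\[
y(x) = \frac{u'(x)\,x^{1-p}}{p}, \qquad b(x) = y(x)\,x^p - u(x).
\]
```

The second line holds for z < x as well, because both the integral and the bound on u′ change direction together.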

Similarly, we have
Theorem 2.2.4 (Characterization of Bounded Absolute Risk Aversion) Let u :
R+ → R be an increasing (decreasing) function with continuous second order
derivative. Then, for any p ∈ R, u has a coefficient of absolute risk aversion
A(x) ≤ (≥) p if and only if u is Φ(e^{−px} y)(1)-convex.
Remark 2.2.5 It is not hard to see that the above two theorems are also valid
for functions with piecewise continuous second order derivatives. As a concrete
example, Figure 2.5 illustrates that the function $f(x) = \sqrt{x}$ is Φ(x^{−1/2} y)(1)-convex.
We can see there how the top curve $\sqrt{x}$ is represented as an envelope of a class of
functions of the form $x^{-1/2} y - b$ for different parameters (y, b).

Fig. 2.5 Generalized convexity
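The envelope picture for √x can be checked numerically. In the sketch below the parameters y = −x0 and b = −2√x0 of the curve touching √x at x0 are derived for this example (they are not given in the text); with them the gap √z − (y z^{−1/2} − b) equals (√z − √x0)²/√z, which is nonnegative.

```python
import math

def support_curve(x0, z):
    """Member y*z**(-1/2) - b of the family that touches sqrt at x0.

    The choice y = -x0, b = -2*sqrt(x0) is worked out for this illustration.
    """
    y = -x0
    b = -2.0 * math.sqrt(x0)
    return y * z ** -0.5 - b

grid = [0.05 * k for k in range(1, 201)]  # z in (0, 10]

# Each curve stays below sqrt(z) on the grid ...
min_gap = min(math.sqrt(z) - support_curve(x0, z)
              for x0 in (0.5, 1.0, 2.0, 4.0) for z in grid)

# ... and the upper envelope of all curves reproduces sqrt(z).
envelope_error = max(
    abs(math.sqrt(z) - max(support_curve(x0, z) for x0 in grid))
    for z in grid
)
```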

2.2.3 Growth Optimal Portfolio Theory

Now consider investing for the long run (multiple periods) and trying to maximize
the compounded return, assuming that the financial market behaves the same in
each period as described in Section 2.1. The compounded return is much easier to
handle in percentage terms. We standardize the financial market by assuming
$S_0 = \mathbf{1} := (1, 1, \ldots, 1)$ so that $g = \hat{S}_1 - \hat{S}_0$ represents the vector of percentage returns of the
risky assets in the market. We also assume the risk free rate is 0 so that $S_1^0 = 1$.
Similarly, we focus on portfolios that represent a percentage allocation of the initial
endowment into the financial market, i.e., we require $\Theta \cdot S_0 = \Theta \cdot \mathbf{1} = 1$. When the
initial endowment is $w_0$ the portfolio will be implemented as $w_0 \Theta$. The advantage
of focusing on the percentage portfolio is that, when dealing with an investment over
multiple periods that repeat an identical one period market model, the percentage
portfolio in each period is the same. The growth portfolio theory seeks the portfolio
that maximizes the average compounded return in the above setting. This can be
phrased as the utility maximization problem

$$\text{maximize } E[\ln(\Theta \cdot S_1)] \quad \text{subject to } \Theta \cdot S_0 = 1,\ S_0 = \mathbf{1}, \tag{2.2.1}$$

or equivalently

$$\text{maximize } E[\ln(1 + \hat{\Theta} \cdot g)]. \tag{2.2.2}$$

In fact, consider investing the initial endowment $w_0$ for l periods and rebalancing
with a fixed (percentage) portfolio Θ in each period. Let $w_k$ denote the balance
at the kth period. Assuming sample $\omega_n = B_n \in \Omega$ occurs $l_n$ times, the total gain will be

$$\frac{w_l}{w_0} = \prod_{n=1}^{N} \bigl(1 + \hat{\Theta} \cdot g(B_n)\bigr)^{l_n}. \tag{2.2.3}$$

We can see that the average gain per period is

$$\prod_{n=1}^{N} \bigl(1 + \hat{\Theta} \cdot g(B_n)\bigr)^{l_n / l}. \tag{2.2.4}$$

Observing that $l_n / l \to P(B_n)$ when $l \to \infty$, the average gain per period in the
long run is

$$\prod_{n=1}^{N} \bigl(1 + \hat{\Theta} \cdot g(B_n)\bigr)^{P(B_n)}. \tag{2.2.5}$$

Thus, pursuing the long-term compounded return or “growth” amounts to maximizing
(2.2.5) or equivalently (2.2.2) among all percentage portfolios Θ. The maximizing
portfolio for (2.2.2) is called the growth optimal portfolio.
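The link between the compounded gain (2.2.3), the average gain per period (2.2.4) and the expected log return (2.2.2) can be checked on a small example with one risky asset. The outcomes, probabilities and leverage below are assumptions chosen for illustration, and the state counts are taken to match the probabilities exactly.

```python
import math

def compounded_gain(s, outcomes, counts):
    """w_l / w_0 when outcome g occurs `count` times at leverage s, as in (2.2.3)."""
    gain = 1.0
    for g, count in zip(outcomes, counts):
        gain *= (1.0 + s * g) ** count
    return gain

def long_run_gain_per_period(s, outcomes, probs):
    """Geometric mean gain prod (1 + s g_n)^{p_n} = exp(E[ln(1 + s g)]), as in (2.2.5)."""
    return math.exp(sum(p * math.log(1.0 + s * g) for g, p in zip(outcomes, probs)))

outcomes = [-0.60, 0.45]   # hypothetical one-period returns
probs = [0.3, 0.7]
counts = [30, 70]          # l_n / l equals P(B_n) exactly with l = 100
s = 0.5                    # hypothetical leverage

per_period = compounded_gain(s, outcomes, counts) ** (1.0 / sum(counts))
limit = long_run_gain_per_period(s, outcomes, probs)
```

Because the geometric mean is the exponential of the expected log return, maximizing (2.2.5) and maximizing (2.2.2) select the same portfolio.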

A growth optimal portfolio has the theoretical advantage of the maximum rate of
growth of one’s wealth. However, in practice it often suffers the drawback of being
too risky. To understand this risk let us look at a simple financial market with only
one risky asset. In this case $s = \hat{\Theta}$ is just one real number. For simplicity of
notation we denote $g_n = g(B_n)$ and $p_n = P(B_n)$. Then the growth portfolio
optimization problem becomes

$$\text{maximize } f(s) = \sum_{n=1}^{N} p_n \ln(1 + s g_n). \tag{2.2.6}$$

We will call f(s) a log return function. We refer to the portfolio weight s on the
risky asset as leverage. The leverage level that maximizes the log return function
f(s) is called the optimal leverage.
Theorem 2.2.6 (Compute the Optimal Leverage) Assume without loss of generality
that $g_1 < g_2 < \ldots < g_N$. Then the optimal leverage $\bar{s}$ is determined by the
unique solution of the (N − 1)th order polynomial equation

$$0 = \prod_{n=1}^{N} (1 + s g_n) \sum_{n=1}^{N} \frac{p_n g_n}{1 + s g_n} \tag{2.2.7}$$

on the interval $(-1/g_N, -1/g_1)$.

Proof Since the log return function,

$$f(s) = \sum_{n=1}^{N} p_n \ln(1 + s g_n),$$

is a strictly concave function on $(-1/g_N, -1/g_1)$, its derivative is strictly
decreasing. Moreover, it is easy to see that $\lim_{s \to (-1/g_N)^+} f'(s) = +\infty$ and
$\lim_{s \to (-1/g_1)^-} f'(s) = -\infty$. Thus, there is a unique solution $\bar{s}$ to the equation

$$0 = f'(s) = \sum_{n=1}^{N} \frac{p_n g_n}{1 + s g_n} \tag{2.2.8}$$

on $(-1/g_N, -1/g_1)$, which is the optimal leverage.
Finally, observe that the polynomial $\prod_{n=1}^{N} (1 + s g_n)$ has no root in the
interval $(-1/g_N, -1/g_1)$. This shows that $\bar{s}$ must be the unique solution of the (N − 1)th
order polynomial equation

$$0 = \prod_{n=1}^{N} (1 + s g_n) \sum_{n=1}^{N} \frac{p_n g_n}{1 + s g_n}$$

on the interval $(-1/g_N, -1/g_1)$.
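Since f′ is strictly decreasing and changes sign on the interval, the root of (2.2.8) can be located by bisection. The following is a minimal sketch; the function names and the tolerance are assumptions, and it presumes min(g_n) < 0 < max(g_n) so that the interval is well defined.

```python
def log_return_derivative(s, outcomes, probs):
    """f'(s) = sum p_n g_n / (1 + s g_n), the right-hand side of (2.2.8)."""
    return sum(p * g / (1.0 + s * g) for g, p in zip(outcomes, probs))

def optimal_leverage(outcomes, probs, tol=1e-12):
    """Bisection for the zero of f' on (-1/max g, -1/min g)."""
    if not (min(outcomes) < 0.0 < max(outcomes)):
        raise ValueError("need both a losing and a winning outcome")
    lo = -1.0 / max(outcomes)   # f' tends to +infinity here
    hi = -1.0 / min(outcomes)   # f' tends to -infinity here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)   # midpoints stay strictly inside the interval
        if log_return_derivative(mid, outcomes, probs) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Two-state check and a three-state example with equal probabilities.
s_two = optimal_leverage([-1.0, 1.0], [0.49, 0.51])
s_three = optimal_leverage([-0.5, 0.5, 1.0], [1 / 3, 1 / 3, 1 / 3])
```

Bisection only uses the sign of f′, so it avoids forming the polynomial in (2.2.7) explicitly.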


When the market has only two or three states, explicit solutions are not hard to
derive. These results are very useful for analyzing betting on games and are, therefore,
presented below.
Proposition 2.2.7 (Two States) Consider a market with two distinct states represented
by $g_1 < g_2$ corresponding to probabilities $p_1$ and $p_2$, respectively. Then the
optimal leverage is

$$\bar{s} = -\frac{p_1 g_1 + p_2 g_2}{g_1 g_2}. \tag{2.2.9}$$

Proof The log return function for such an investment system is
$f(s) = p_1 \ln(1 + s g_1) + p_2 \ln(1 + s g_2)$. By Theorem 2.2.6, the optimal leverage $\bar{s}$ is the solution of
the equation

$$0 = (1 + s g_1)(1 + s g_2) \left[ \frac{p_1 g_1}{1 + s g_1} + \frac{p_2 g_2}{1 + s g_2} \right].$$

Solving this equation produces Equation (2.2.9).
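For completeness, the algebra behind the last step can be spelled out as follows, using only $p_1 + p_2 = 1$.

```latex
\begin{align*}
0 &= (1 + s g_1)(1 + s g_2)\Bigl[\frac{p_1 g_1}{1 + s g_1} + \frac{p_2 g_2}{1 + s g_2}\Bigr] \\
  &= p_1 g_1 (1 + s g_2) + p_2 g_2 (1 + s g_1) \\
  &= (p_1 g_1 + p_2 g_2) + s\, g_1 g_2\, (p_1 + p_2)
   = (p_1 g_1 + p_2 g_2) + s\, g_1 g_2 ,
\end{align*}
% so that s = -(p_1 g_1 + p_2 g_2) / (g_1 g_2).
```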

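Proposition 2.2.8 (Three States) Consider a market with three distinct states
represented by $g_1 < g_2 < g_3$ corresponding to probabilities $p_1$, $p_2$, and $p_3$,
respectively. Then the optimal leverage $\bar{s}$ is given by

$$\bar{s} = \begin{cases}
0 & \text{if } C = 0, \\[4pt]
-\dfrac{p_1 g_1 + p_3 g_3}{(p_1 + p_3) g_1 g_3} & \text{if } g_2 = 0, \\[8pt]
\dfrac{-B - \sqrt{B^2 - 4AC}}{2A} & \text{if } C \ne 0,\ g_2 \ne 0.
\end{cases} \tag{2.2.10}$$

Here $A = g_1 g_2 g_3$, $B = g_2 [p_3 g_3 + p_1 g_1 + p_2 (g_1 + g_3)] + (p_1 + p_3) g_1 g_3$ and
$C = p_1 g_1 + p_2 g_2 + p_3 g_3$. In the last case $\bar{s}$ is the root at which the quadratic
$As^2 + Bs + C$ is decreasing, because $f'$ changes sign from positive to negative at $\bar{s}$.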
Proof The proof is similar to that of Proposition 2.2.7 and is left as an exercise.
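A quick numerical sanity check of the three-state case is sketched below. Expanding (2.2.7) for N = 3 gives the quadratic A s² + B s + C = 0 with A = g1 g2 g3, B = p1 g1 (g2 + g3) + p2 g2 (g1 + g3) + p3 g3 (g1 + g2) and C = p1 g1 + p2 g2 + p3 g3. Because f′ changes sign from positive to negative at the optimum, the sketch selects the root (−B − √(B² − 4AC))/(2A) whenever A ≠ 0. The function names and the test markets are assumptions made for the example.

```python
import math

def three_state_leverage(g, p):
    """Closed-form optimal leverage for outcomes g1 < g2 < g3 with g1 < 0 < g3."""
    g1, g2, g3 = g
    p1, p2, p3 = p
    A = g1 * g2 * g3
    B = p1 * g1 * (g2 + g3) + p2 * g2 * (g1 + g3) + p3 * g3 * (g1 + g2)
    C = p1 * g1 + p2 * g2 + p3 * g3
    if A == 0:                      # g2 = 0: the equation is linear
        return -C / B
    return (-B - math.sqrt(B * B - 4 * A * C)) / (2 * A)

def log_return_derivative(s, g, p):
    """f'(s) = sum p_n g_n / (1 + s g_n)."""
    return sum(pn * gn / (1.0 + s * gn) for gn, pn in zip(g, p))

third = (1 / 3, 1 / 3, 1 / 3)
cases = [
    ((-0.5, 0.5, 1.0), third),            # C > 0
    ((-1.0, 0.2, 0.5), third),            # C < 0
    ((-1.0, 0.0, 1.0), (0.3, 0.2, 0.5)),  # g2 = 0
]
results = [(three_state_leverage(g, p), g, p) for g, p in cases]
```

In each case the returned leverage lies in (−1/g3, −1/g1) and makes f′ vanish up to rounding.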

Remark 2.2.9 (The Kelly Criterion and the Shannon Information Rate) In Proposition
2.2.7, if the returns $-g_1 = g_2 = 1$ are symmetric and standardized, then at the optimal
leverage

$$\bar{s} = p_2 - p_1$$

the value of the log return function is

$$f(\bar{s}) = p_1 \ln p_1 + p_2 \ln p_2 + \ln 2.$$

This is Shannon’s information rate for a communication channel with noise [49].
Note that when $g_1 = -1$ and $g_2 = 1$ our portfolio is equivalent to a game with
symmetric payoffs. This says that Shannon’s information rate can be explained as
the best possible outcome of using a communication channel with noise when the
signal is used for a game with symmetric payoffs.
Let us apply Proposition 2.2.7 to a simplified Blackjack game.
Example 2.2.10 (Money Management in Blackjack) In playing a certain version of
Blackjack, we know that by counting cards a skilled player has a winning probability
of 51% over the house. We simplify the problem by assuming the win and loss are
always equal to the bet and apply Proposition 2.2.7 to determine the best betting
size s as a percentage of the player’s bankroll. In this case $g_2 = 1$ (winning
100% of the bet), $g_1 = -1$ (losing 100% of the bet), $p_2 = 51\%$ and $p_1 = 49\%$.
Thus, the optimal leverage indicates that the best betting size is

$$\bar{s} = -\frac{p_1 g_1 + p_2 g_2}{g_1 g_2} = 2\%.$$

This is actually recommended by Ed Thorp, an expert in the Blackjack game.

The game of Blackjack has changed a lot and the player’s advantage has mostly
slipped away due to the use of multiple decks of cards and frequent shuffling.
However, even if the assumption in Example 2.2.10 were correct, the optimal betting
size $\bar{s}$ is too aggressive, as explained in the next example.
Example 2.2.11 Now consider playing a game with symmetric payoff t = −c = 1
with a winning probability of 90%. We can easily calculate that the best betting size
(optimal leverage) is $\bar{s}$ = 80%. Putting 80% of your wealth on the line is clearly too
aggressive no matter how favorable the game is to you.
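The two betting examples and the identity in Remark 2.2.9 can be reproduced in a few lines. This is a sketch for symmetric payoffs ±1 only; the function names are assumptions.

```python
import math

def kelly_fraction(p_win):
    """Optimal leverage p2 - p1 for symmetric payoffs g1 = -1, g2 = 1."""
    return p_win - (1.0 - p_win)

def log_return(s, p_win):
    """f(s) = p1 ln(1 - s) + p2 ln(1 + s)."""
    return (1.0 - p_win) * math.log(1.0 - s) + p_win * math.log(1.0 + s)

def shannon_rate(p_win):
    """p1 ln p1 + p2 ln p2 + ln 2, the value stated in Remark 2.2.9."""
    p1, p2 = 1.0 - p_win, p_win
    return p1 * math.log(p1) + p2 * math.log(p2) + math.log(2.0)

blackjack_bet = kelly_fraction(0.51)    # Example 2.2.10
favorable_bet = kelly_fraction(0.90)    # Example 2.2.11
rate_gap = abs(log_return(favorable_bet, 0.90) - shannon_rate(0.90))
```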

2.2.4 Efficiency Index

Despite the shortcomings of the growth portfolio theory, similar to the Markowitz
portfolio theory the idea can also be used to construct a criterion for evaluating
investment performance. The key is to realize, by examining, e.g., Proposition 2.2.7,
that the effectiveness of an investment strategy must be evaluated at an appropriate
leverage level.
Example 2.2.12 We consider two simplified investment strategies labeled S1 (Strat-
egy 1) and S2 (Strategy 2), respectively. We assume that each strategy has two
possible returns with the corresponding probability specified below:

Table 2.1 Effects of investment systems under different investment sizes

Period  S1 return  S2 return  100% S1  100% S2  20% S1  20% S2
1        45%        30%       145.00   130.00   109.00  106.00
2       −60%        30%        58.00   169.00    95.92  112.36
3        45%       −20%        84.10   135.20   104.55  107.86
4        45%        30%       121.95   175.76   113.96  114.34
5       −60%       −20%        48.78   140.61   100.28  109.76
6       −60%       −20%        19.51   112.49    88.25  105.37
7        45%        30%        28.29   146.23    96.19  111.69
8        45%       −20%        41.02   116.98   104.85  107.23
9        45%        30%        59.48   152.08   114.29  113.66
10       45%       −20%        86.25   121.66   124.58  109.12

For illustration let’s assume each strategy is used over ten periods with the same
initial endowment of $100 at two different leverage levels of 20% and 100%. The
first column in Table 2.1 represents the periods. The next two columns represent
the returns in each period for the two strategies S1 and S2, respectively. The last
four columns are the balances of strategies S1 and S2 at different periods when used
with the two different leverage levels 100% and 20%, respectively. The results show
that with a leverage level of 100% of the available capital for each strategy, Strategy
2 is better than Strategy 1, but with a leverage level of 20% Strategy 1 becomes the
better one.
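The balances in Table 2.1 follow from compounding each period's return scaled by the leverage. The sketch below recomputes them from the return sequences listed in the table; the helper name is an assumption, and small differences in the last digit can arise because the table appears to round at every step.

```python
s1_returns = [0.45, -0.60, 0.45, 0.45, -0.60, -0.60, 0.45, 0.45, 0.45, 0.45]
s2_returns = [0.30, 0.30, -0.20, 0.30, -0.20, -0.20, 0.30, -0.20, 0.30, -0.20]

def balances(returns, leverage, start=100.0):
    """Balance after each period when a fixed fraction `leverage` is invested."""
    path, balance = [], start
    for r in returns:
        balance *= 1.0 + leverage * r
        path.append(balance)
    return path

final = {
    ("S1", 1.0): balances(s1_returns, 1.0)[-1],
    ("S2", 1.0): balances(s2_returns, 1.0)[-1],
    ("S1", 0.2): balances(s1_returns, 0.2)[-1],
    ("S2", 0.2): balances(s2_returns, 0.2)[-1],
}
```

At full leverage S2 ends ahead of S1, while at 20% leverage the ranking is reversed, which is the observation made in the example.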

How can we place them on a level playing field? One way to do it is to compare
them at their respective optimal leverages. This leads to the following
definition.
Definition 2.2.13 (Efficiency Index) Suppose an investment strategy is characterized
by its returns g ∈ RV(Ω, F, P). We define its efficiency index γ as

$$\gamma = \max_{s \in [-1/\max(g_n),\, -1/\min(g_n)]} \sum_{n=1}^{N} p_n \ln(1 + s g_n), \tag{2.2.11}$$

where $g_n = g(B_n)$ and $p_n = P(B_n)$.

If gn ≥ 0, n = 1, . . . , N or gn ≤ 0, n = 1, . . . , N, then we can derive positive
return without any risk signaled by γ = +∞. This situation will be called an
arbitrage opportunity (see Definition 2.3.5 in the next subsection). Otherwise the
efficiency index γ is the log return of the portfolio of cash and the given investment
strategy at the optimal leverage level. In view of Remark 2.2.9 the efficiency index
gauges the useful information contained in an investment strategy.
Example 2.2.14 Let us re-examine Example 2.2.12 using the efficiency index.
Drawing the log return functions of investment strategies S1 and S2 according to
Table 2.2 simultaneously in Figure 2.6, we can understand the reasons behind the
phenomenon observed in Example 2.2.12. Moreover, we see that neither strategy
was tested in Example 2.2.12 at the optimal leverage.

Table 2.2 Outcomes and probabilities for the two strategies

      g1     p1    g2     p2
S1    45%    0.7   −60%   0.3
S2    30%    0.5   −20%   0.5

Fig. 2.6 Log return functions (log return γ against leverage s for System 1 and System 2)

Using Proposition 2.2.7 we can calculate that, for Strategy 1, $\bar{s}$ = 50%, γ = 0.035
and for Strategy 2, $\bar{s}$ = 83%, γ = 0.02. Comparing the efficiency indices we can
see that Strategy 1 is the better one. Yet this fact is hard to unveil without the help
of the efficiency index.
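The numbers quoted for the two strategies can be recomputed from Table 2.2 with the two-state formula (2.2.9). The sketch below is an illustration; the function name is an assumption.

```python
import math

def efficiency_index(g1, p1, g2, p2):
    """Optimal leverage from (2.2.9) and the log return attained there."""
    s_bar = -(p1 * g1 + p2 * g2) / (g1 * g2)
    gamma = p1 * math.log(1.0 + s_bar * g1) + p2 * math.log(1.0 + s_bar * g2)
    return s_bar, gamma

# Outcomes and probabilities from Table 2.2.
s1_leverage, s1_gamma = efficiency_index(0.45, 0.7, -0.60, 0.3)
s2_leverage, s2_gamma = efficiency_index(0.30, 0.5, -0.20, 0.5)
```

Neither 20% nor 100% is close to both optimal leverages, which is why the ranking in Table 2.1 depends on the leverage chosen.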

2.3 Fundamental Theorem of Asset Pricing

We now turn to optimizing a general utility of the payoff of a portfolio
$\Theta \in R^{M+1}$. We wish to endow the space of portfolios with a norm that reflects the
size of a portfolio. Intuitively, the magnitude of Θ as a vector in $R^{M+1}$ in a sense
indicates the level of capital commitment or leverage level of a portfolio. However,
one needs to be careful here. Holding a portfolio, an investor’s goal is to derive a
risk adjusted gain represented by the random variable

$$\Theta \cdot (S_1 - S_0) \in RV(\Omega, F, P). \tag{2.3.1}$$

We can see that increasing or reducing the share of cash in the portfolio clearly
changes the leverage level as measured by the magnitude of Θ, yet does nothing to
the gain (2.3.1). The following example shows that even if we fix the share of the
cash, such a phenomenon can still happen.
Example 2.3.1 (Infinitely Many Portfolios with Equivalent Gain) Consider a state
space Ω = {0, 1} and a financial market with three risky assets whose prices
at times 0, 1 are given by $S_0 = (1, 1, 1, 1)$, $S_1(0) = (1, 0.8, 0.9, 1)$, and
$S_1(1) = (1, 1.1, 1.2, 1.1)$. We can easily verify that for the portfolio $\bar{\Theta} = (1, 1, -2, 3)$,
$\bar{\Theta} \cdot (S_1 - S_0)(i) = 0$ for both i = 0 and i = 1. It follows that for any portfolio Θ
and any r ∈ R, all the portfolios $\Theta + r\bar{\Theta}$ have the same gain.
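The claim is easy to verify numerically. In the sketch below the base portfolio `theta` and the values of r are hypothetical choices; everything else is taken from the example.

```python
S0 = [1.0, 1.0, 1.0, 1.0]
S1 = {0: [1.0, 0.8, 0.9, 1.0], 1: [1.0, 1.1, 1.2, 1.1]}
theta_bar = [1.0, 1.0, -2.0, 3.0]

def gain(theta, state):
    """Theta . (S1(state) - S0)."""
    return sum(t * (s1 - s0) for t, s1, s0 in zip(theta, S1[state], S0))

zero_gains = [gain(theta_bar, state) for state in (0, 1)]

theta = [2.0, 3.0, -1.0, 4.0]            # hypothetical base portfolio
shifted_gains = {
    r: [gain([t + r * tb for t, tb in zip(theta, theta_bar)], state) for state in (0, 1)]
    for r in (0.0, 10.0, -250.0)
}
```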

Notice that as |r| → ∞, the magnitude of $\Theta + r\bar{\Theta} \in R^{M+1}$ also goes to infinity.
This example demonstrates that the magnitude of a portfolio in $R^{M+1}$ is not an
appropriate measure for the leverage level of the portfolio. Moreover, it clearly does
not make sense in practice to use a portfolio of the form $\Theta + r\bar{\Theta}$ with large |r|. This
is because doing so will greatly increase the risk (as the price of assets in a financial
market is not deterministic) without benefit to the gain. These considerations lead
to the following definitions:
Definition 2.3.2 (Equivalent Portfolios) We say two portfolios $\Theta^1$ and $\Theta^2$ are
equivalent in market S if they have the same initial value and the same gain, that is to
say,

$$\Theta^1 \cdot S_0 = \Theta^2 \cdot S_0 \tag{2.3.2}$$

and, as random variables,

$$\Theta^1 \cdot (S_1 - S_0) = \Theta^2 \cdot (S_1 - S_0).$$

We will use S[Θ] to denote all the portfolios that are equivalent to Θ in market S.
Since all the portfolios in S[Θ] are equivalent, we prefer those that have low
leverage as measured by $\|\cdot\|$, the Euclidean norm on $R^{M+1}$. The following lemma
provides us with an optimally leveraged portfolio in each equivalence class.
Lemma 2.3.3 For any portfolio Θ in S, the optimization problem

$$\min\{\|x\| : x \in S[\Theta]\} \tag{2.3.3}$$

has a unique solution, denoted $\tilde{\Theta}$. Moreover, there exists a constant K = K(S)
depending only on S such that, for any portfolio Θ,

$$\|\tilde{\Theta}\| \le K \|\Theta \cdot (S_1 - S_0)\|_{RV}. \tag{2.3.4}$$

Here $\|\cdot\|_{RV}$ is the norm on RV(Ω, F, P) introduced in Section 2.1 induced by the
inner product defined in (2.1.1).
Proof Note that problem (2.3.3) and the following problem (2.3.5) have the same
solution:

$$\min\{\|x\|^2 : x \in S[\Theta]\}. \tag{2.3.5}$$

Denote

$$A = \begin{bmatrix} S_1(B_1) - S_0 \\ S_1(B_2) - S_0 \\ \vdots \\ S_1(B_N) - S_0 \end{bmatrix},$$

where $\{B_1, \ldots, B_N\}$ is the set of atoms of the probability space (Ω, F, P). Then
A is an N × (M + 1) matrix.
We observe that x ∈ S[Θ] amounts to requiring

$$Ax = \Theta \cdot (S_1 - S_0). \tag{2.3.6}$$

We first consider the special case when rank(A) = min(M + 1, N). If rank(A) =
M + 1, the constraint uniquely determines the minimizer
$\tilde{\Theta} = x = (A^\top A)^{-1} A^\top\, \Theta \cdot (S_1 - S_0)$.
Otherwise, rank(A) = N and the quadratic function $\|x\|^2$ attains a minimum on the
affine set characterized by the linear constraint. It is easy to calculate this solution
to be $\tilde{\Theta} = x = A^\top (A A^\top)^{-1}\, \Theta \cdot (S_1 - S_0)$. In both cases $\tilde{\Theta}$ is unique. Moreover,
defining

$$K = K(S) = \max\bigl(\|(A^\top A)^{-1} A^\top\|,\ \|A^\top (A A^\top)^{-1}\|\bigr),$$

we have (2.3.4).
If rank(A) < min(M + 1, N), then we can first remove the rows or columns in
A that are dependent on others and then apply the above special case to the reduced
matrix A.
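To make the full row rank case concrete, the sketch below computes the minimum norm solution x = Aᵀ(AAᵀ)⁻¹ b for the market of Example 2.3.1, where A has two rows. Following the proof, only the gain constraint (2.3.6) is imposed. The starting portfolio `theta` and the helper names are hypothetical, and the 2 × 2 inverse is written out by hand to keep the example free of external libraries.

```python
import math

# Gain matrix A from Example 2.3.1: one row per state, one column per asset.
S0 = [1.0, 1.0, 1.0, 1.0]
S1 = [[1.0, 0.8, 0.9, 1.0], [1.0, 1.1, 1.2, 1.1]]
A = [[s1 - s0 for s1, s0 in zip(row, S0)] for row in S1]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def min_norm_portfolio(A, gain):
    """x = A^T (A A^T)^{-1} gain for a 2-row matrix A of full row rank."""
    (a, b), (c, d) = [[sum(x * y for x, y in zip(r1, r2)) for r2 in A] for r1 in A]
    det = a * d - b * c
    v = [(d * gain[0] - b * gain[1]) / det, (-c * gain[0] + a * gain[1]) / det]
    return [sum(A[i][j] * v[i] for i in range(2)) for j in range(len(A[0]))]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

theta = [2.0, 3.0, -1.0, 4.0]           # hypothetical portfolio
target_gain = matvec(A, theta)           # its gain in the two states
theta_min = min_norm_portfolio(A, target_gain)
```

The result has the same gain as `theta` in both states but a much smaller Euclidean norm, and its cash component is zero because cash contributes nothing to the gain.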

Definition 2.3.4 (Portfolio Space) We call the quotient space of $R^{M+1}$ with
respect to the portfolio equivalence relation in market S the portfolio space on
S and denote it port[S]. For Θ ∈ port[S] we define its norm by

$$\|\Theta\|_p = \|\tilde{\Theta}\|,$$

where $\tilde{\Theta}$ is the minimum norm representative of Θ given by Lemma 2.3.3.
The portfolio space (port[S], $\|\cdot\|_p$) is a finite dimensional Banach space.

2.3.1 Fundamental Theorem of Asset Pricing

Gain without risk is what every investor desires. Such opportunities arguably will
not last, since everyone tries to chase them. Based on this observation, a guiding
principle in a financial market is that such a “free lunch” should not exist. The following
is a formal definition.
Definition 2.3.5 (Arbitrage) We say that a portfolio Θ is an arbitrage if it involves
no risk, so that $\Theta \cdot (S_1 - S_0) \ge 0$, and has an opportunity to gain something:
$\Theta \cdot (S_1 - S_0) \ne 0$.
A rational investor with a utility function u satisfying conditions (u1)–(u3) will
try to maximize the expected utility of the final wealth among all portfolios in
port[S]. In other words, if $w_0 > 0$ is the initial wealth of the investor, he wants
to solve the following portfolio utility maximization problem. Find:

$$\sup\{E[u(w_0 + \Theta \cdot (S_1 - S_0))] : \Theta \in \mathrm{port}[S]\}. \tag{2.3.7}$$


It turns out that an arbitrage opportunity is exactly characterized by the optimal
value of problem (2.3.7) being +∞.
Theorem 2.3.6 (Characterizing Arbitrage with Utility Optimization) The portfolio
space port[S] contains an arbitrage if and only if the optimal value of the
utility optimization problem (2.3.7) is +∞.
Proof The “only if” part is easy: if Θ ∈ port[S] is an arbitrage, then so is rΘ for
any r > 0. Then it is easy to see that $E[u(w_0 + r\Theta \cdot (S_1 - S_0))] \to +\infty$ as r → +∞.
To prove the “if” part, assume the optimal value for problem (2.3.7) is +∞. Then
there exists a sequence $\Theta^n \in$ port[S] such that $E[u(w_0 + \Theta^n \cdot (S_1 - S_0))] \to +\infty$
as n → +∞. Necessarily, $t_n = \|\Theta^n \cdot (S_1 - S_0)\|_{RV} \to +\infty$ as n goes to ∞. By
Lemma 2.3.3 there exists a constant K = K(S) such that $\|\Theta^n\|_p / t_n \le K$. Without
loss of generality we may assume that $\Theta^n / t_n$ converges to some $\Theta^* \in$ port[S].
Note that, for any n, $\Theta^n \cdot (S_1 - S_0) \ge -w_0$ by property (u3) of the utility function.
Dividing by $t_n$ and letting n → ∞, we get $\Theta^* \cdot (S_1 - S_0) \ge 0$. Also,

$$\|\Theta^* \cdot (S_1 - S_0)\|_{RV} \ge \liminf_{n \to \infty} \|\Theta^n \cdot (S_1 - S_0)\|_{RV} / t_n = 1.$$

Therefore, $\Theta^*$ is an arbitrage.
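The “only if” direction can be seen on a toy market. In the sketch below, the two-state gain profile (0 in one state, 0.1 in the other), the equal probabilities and the log utility are all assumptions chosen for illustration; scaling the arbitrage by r makes the expected utility grow without bound.

```python
import math

probs = [0.5, 0.5]
arbitrage_gain = [0.0, 0.1]   # hypothetical: never loses, sometimes wins
w0 = 1.0

def expected_log_utility(r):
    """E[ln(w0 + r * gain)] for the scaled arbitrage portfolio r * Theta."""
    return sum(p * math.log(w0 + r * g) for p, g in zip(probs, arbitrage_gain))

scales = [1.0, 1e2, 1e4, 1e8, 1e12]
values = [expected_log_utility(r) for r in scales]
```

The growth is only logarithmic in r, but it is unbounded, which is all the argument needs.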

The fundamental theorem of asset pricing (FTAP) links no arbitrage with the
existence of a certain type of measure defined below:
Definition 2.3.7 (Equivalent Martingale Measure) We say that Q is an equivalent
martingale measure (EMM) on economy (Ω, F, P) for financial market S
provided that, for any atom $B_i$ of F, $Q(B_i) = 0$ if and only if $P(B_i) = 0$, and

$$E^Q[S_1] = S_0.$$

Given an initial wealth w0 > 0, the set of all achievable wealth outcomes at the
end of the one period economy t = 1 using all possible portfolios is

w0 + {Θ · (S1 − S0 ) : Θ ∈ port[S]} ⊂ RV (Ω, F, P ).

We denote the set of gains

W := {Θ · (S1 − S0 ) : Θ ∈ port[S]} ⊂ RV (Ω, F, P ).

In fact, W is a subspace of RV(Ω, F, P). It is not hard to see that if Θ is an arbitrage
portfolio then $\Theta \cdot (S_1 - S_0) \in RV(\Omega, F, P)_+ \setminus \{0\}$, where $RV(\Omega, F, P)_+$ is the
cone of nonnegative random variables. Thus, no arbitrage can be described as

$$W \cap \bigl(RV(\Omega, F, P)_+ \setminus \{0\}\bigr) = \emptyset.$$

The traditional proof of the FTAP relies on applying an appropriate version of the
cone separation theorem to ensure that there is a hyperplane separating W and
$RV(\Omega, F, P)_+$. Then, a scaling of the normal vector of such a separating hyperplane
gives us an equivalent martingale measure. This geometric picture is often
interpreted as the no arbitrage price being independent of investors’ preferences.
However, we will give a proof of the FTAP below based on portfolio utility
optimization (2.3.7). We show that the equivalent martingale measure can be viewed
as a scaling of the solution to the dual problem or equivalently the Lagrange
multiplier related to such a utility optimization problem. As a result, a pricing
martingale measure does depend on the utility function of the investor in general.
Theorem 2.3.8 (Refined Fundamental Theorem of Asset Pricing) Let S be a
financial market, let u be a utility function that satisfies properties (u1), (u2), and
(u3) and let w0 ≥ 0 be a given initial endowment. Then the following statements
are equivalent:
(i) port[S] contains no arbitrage.
(ii) The optimal value of the portfolio utility optimization problem (2.3.7) is finite
and attained.
(iii) There is an equivalent S-martingale measure proportional to a subgradient of
−u at the optimal solution of (2.3.7).
Proof First observe that the utility optimization problem (2.3.7) can be written
equivalently as

$$\max\ E[u(y)] \quad \text{subject to } y \in w_0 + W. \tag{2.3.8}$$

Define $f(y) = -E[u(y)]$ and $g(y) = \iota_{w_0 + W}(y)$. Then we can rewrite
problem (2.3.8) as

$$-\min\{f(y) + g(y)\}. \tag{2.3.9}$$

The dual problem of (2.3.9) is

$$-\max\{-f^*(-z) - g^*(z)\} = \min\{E[(-u)^*(-z)] + \langle w_0, z \rangle + \sigma_W(z)\}. \tag{2.3.10}$$

Since we can check that the constraint qualification condition

$$w_0 \in \mathrm{ri}[\mathrm{dom}\, g - \mathrm{dom}\, f] = \mathrm{ri}[w_0 + W - RV(\Omega, F, P)_+ \setminus \{0\}] \tag{2.3.11}$$

(corresponding to (1.2.2)) holds, Fenchel strong duality implies that (2.3.9) and its
dual (2.3.10) have the same value.

By Theorem 2.3.6, port[S] contains no arbitrage if and only if the optimal value
of problem (2.3.7) is finite and, therefore, the optimal values of problems (2.3.9) and (2.3.10)
are both finite. Since W is a subspace, the optimal value of (2.3.10) being finite implies
that its solution z satisfies $z \perp W$. Moreover, finiteness of $E[(-u)^*(-z)]$ implies that $z(B_i) > 0$
for all $B_i$ with $P(B_i) \ne 0$. Thus, Q = z/E[z] is an S-martingale measure equivalent to P.
That is, (i) implies (ii).
On the other hand, the existence of an equivalent S-martingale measure
implies that the constraint qualification condition for (2.3.10) holds. In fact,
problem (2.3.10) can be viewed as minimizing the convex function
$z \mapsto E[(-u)^*(-z)] + \langle w_0, z \rangle$ over the entire subspace $W^\perp$ (z > 0 is merely a
consequence of the domain of $E[(-u)^*(\cdot)]$ being a subset of $\mathrm{int}[-RV(\Omega, F, P)_+]$
and, therefore, is not a separate constraint). Thus, the constraint qualification
condition for (2.3.10) is satisfied (see, e.g., [62, Theorem 2.7.1]). It follows that
problem (2.3.7), which is equivalent to (2.3.9) as the dual of (2.3.10), has a finite
value and attains its solution, which is to say (ii) implies (iii).
Finally, if (iii) is true, then there cannot be any arbitrage in port[S] because
adding an arbitrage to the optimal solution of (2.3.7) will improve it. Thus, (iii)
implies (i) and we have completed a cyclic proof of the equivalence of (i), (ii), and
(iii).

An equivalent martingale measure can also be viewed as a scaling of a Lagrange
multiplier for the portfolio utility optimization problem (2.3.7) due to the relationship
between Lagrange multipliers and dual solutions. To see this let us rewrite
problem (2.3.7) as a constrained minimization problem

$$\text{minimize } E[(-u)(x)] \quad \text{subject to } x - \Theta \cdot (S_1 - S_0) - w_0 = 0. \tag{2.3.12}$$

We already know from the proof of Theorem 2.3.8 that this problem has a
solution $(x^*, \Theta^*)$. Moreover, we know that strong duality holds and the dual problem
has a solution, which implies that problem (2.3.12) has a Lagrange multiplier.
Let λ be the Lagrange multiplier of problem (2.3.12). Then the Lagrangian is

$$\begin{aligned}
L((x, \Theta), \lambda) &= E[(-u)(x)] + \langle \lambda, x - \Theta \cdot (S_1 - S_0) - w_0 \rangle \\
&= E[(-u)(x)] + \langle \lambda, x - w_0 \rangle - \langle \lambda, \Theta \cdot (S_1 - S_0) \rangle \\
&= E[(-u)(x) + \lambda(x - w_0)] - \langle \lambda, \Theta \cdot (S_1 - S_0) \rangle.
\end{aligned}$$

It attains its minimum at $(x^*, \Theta^*)$. Thus, we have $\langle \lambda, S_1 - S_0 \rangle = 0$ and
$-\lambda(B_i) \in \partial(-u)(x^*(B_i))$, i = 1, 2, . . . , N for $P(B_i) > 0$. Since −u is strictly decreasing
we have $\lambda(B_i) > 0$ whenever $P(B_i) > 0$. Moreover, dividing
$\langle \lambda, S_1 - S_0 \rangle = E[\lambda(S_1 - S_0)] = 0$ by E[λ] and noticing that $S_0$ is a constant vector we get

$$E[(\lambda / E[\lambda]) S_1] = S_0.$$

This is to say that Q = (λ/E[λ])P is a martingale measure equivalent to P. We can
see that this martingale measure is indeed a scaling of the Lagrange multiplier.
Condition (u3) can be removed from Theorem 2.3.8 to derive a generalization of
the version of FTAP in [17].
Theorem 2.3.9 (Refined Fundamental Theorem of Asset Pricing) Let S be a
market. Then the following are equivalent:
(i) There exists no arbitrage trading strategy in port[S];
(ii) There is an equivalent S-martingale measure.
(iii) There exists a utility function u with properties (u1) and (u2), such that the
finite optimal value of the trading strategy utility optimization problem (2.3.7)
is attained.
Proof Implication (i) → (ii) → (iii) follows from Theorem 2.3.8. If the finite
optimal value of the trading strategy utility optimization problem (2.3.7) is attained,
then there can be no arbitrage because superposition of an arbitrage to the optimal
solution will improve it. Thus (iii) also implies (i) completing a cyclic proof.

Remark 2.3.10 Although the fundamental result that no arbitrage is equivalent to
the existence of an equivalent martingale measure is well known, as pointed out in
[64] the proof of Theorem 2.3.8 using a class of utility functions says more:
when the martingale measure is not unique, the dual problem actually points to
one particular martingale measure. Thus, in principle, every choice of martingale
measure (corresponding to a particular price of the contingent claim) can be viewed
as a particular portfolio optimization problem with a corresponding concave utility
function.
The useful perspective we can get from this exercise is that pricing contingent
claims either by a replicating portfolio or by using a martingale measure can be
viewed as a special case of portfolio optimization with respect to a certain utility
function. There are many possibilities in selecting the utility functions. Thus, the
pricing of contingent claims does rely on the trader’s preference. There can exist
many different reasonable prices as a result of the differences in trader’s risk-reward
preferences.

2.3.2 Pricing Contingent Claims

A contingent claim is a random variable $\varphi_1 \in RV(\Omega, F, P)$ as a payoff at t = 1.
To find a fair price $\varphi_0$ for this contingent claim we form a portfolio holding one
such contingent claim along with a portfolio of other assets in the market scaled to
the initial wealth of the investor and then (as in the previous section) consider the
portfolio optimization problem of maximizing the utility of the final wealth:

$$\text{maximize } E[u(\beta(\varphi_1 + \Theta \cdot S_1))] \quad \text{subject to } \beta(\varphi_0 + \Theta \cdot S_0) = w_0.$$

Equivalently we can write this portfolio optimization problem as

$$\text{minimize } E[(-u)(x)] \quad \text{subject to } x - \beta(\varphi_1 - \varphi_0 + \Theta \cdot (S_1 - S_0)) - w_0 = 0. \tag{2.3.13}$$

Assume there is no arbitrage. Then Theorem 2.3.6 implies that the optimal value
of problem (2.3.13) is finite and is attained at $(x^*, \beta^*, \Theta^*)$. As in the previous
section, we can check that the constraint qualification condition for
problem (2.3.13) is satisfied and, therefore, problem (2.3.13) has a Lagrange multiplier
λ ∈ RV(Ω, F, P) such that the Lagrangian

$$\begin{aligned}
L((x, \beta, \Theta), \lambda) &= E[(-u)(x)] + \langle \lambda, x - \beta[\varphi_1 - \varphi_0 + \Theta \cdot (S_1 - S_0)] - w_0 \rangle \\
&= E[(-u)(x)] + \langle \lambda, x - w_0 \rangle - \langle \lambda, \beta[\varphi_1 - \varphi_0 + \Theta \cdot (S_1 - S_0)] \rangle \\
&= E[(-u)(x) + \lambda(x - w_0)] - \langle \lambda, \beta[\varphi_1 - \varphi_0 + \Theta \cdot (S_1 - S_0)] \rangle
\end{aligned}$$

attains its minimum at $(x^*, \beta^*, \Theta^*)$. Thus, we have $-\lambda(B_i) \in \partial(-u)(x^*(B_i))$,
i = 1, 2, . . . , N for $P(B_i) > 0$. Since −u is strictly decreasing we have $\lambda(B_i) > 0$
whenever $P(B_i) > 0$. Moreover, $\langle \lambda, S_1 - S_0 \rangle = 0$, which is $E[\lambda(S_1 - S_0)] = 0$.
Dividing by E[λ] and noticing that $S_0$ is a constant vector we get

$$E[(\lambda / E[\lambda]) S_1] = S_0.$$

This is to say that Q = (λ/E[λ])P is a P-equivalent martingale measure. Finally,
$\langle \lambda, \varphi_1 - \varphi_0 \rangle = 0$. That is,

$$\varphi_0 = E^Q[\varphi_1].$$

In other words, if there is no arbitrage then the price of the contingent claim must be
the expectation of its payoff under one of the martingale measures that are equivalent
to P.
We can see from above that martingale measures and, therefore, the resulting
prices of the contingent claim depend on the choice of utility functions. We now
give a simple example that explicitly calculates the martingale measures in terms of
a class of utility functions.
Example 2.3.11 Consider a market S containing only one risky asset. Assume that
the market has N states Ω = {ω1, . . . , ωN} and state $\omega_n$ happens with probability
$p_n$. Assume for simplicity that $S_0 = 1$ and denote $x_n := S_1(\omega_n) - S_0$. In this
case a trading strategy Θ is simply a constant θ indicating the share of S that the
trader holds. Given a utility function u satisfying properties (u1)–(u3) the utility
maximization problem (2.3.7) takes the following concrete form:

$$\max\ E[u(1 + \theta (S_1 - S_0))] = \sum_{n=1}^{N} p_n u(1 + \theta x_n). \tag{2.3.14}$$

Rewrite (2.3.14) as a constrained minimization problem

$$\min\ -\sum_{n=1}^{N} p_n u(y_n) \quad \text{subject to } y_n - 1 - \theta x_n = 0,\ n = 1, \ldots, N. \tag{2.3.15}$$

The Lagrangian is

$$L((y, \theta), \lambda) = \sum_{n=1}^{N} p_n \bigl[-u(y_n) + \lambda_n (y_n - 1 - \theta x_n)\bigr].$$

Setting $\nabla_{y,\theta} L = 0$ we derive, at the optimal solution,

$$\sum_{n=1}^{N} p_n \lambda_n x_n = 0, \tag{2.3.16}$$

and

$$\lambda_n = u'(y_n) = u'(1 + \theta x_n). \tag{2.3.17}$$

Equation (2.3.16) clearly shows that a scaled λ gives us the martingale measure. To
solve for θ so as to derive the solution to the utility optimization problem (2.3.14)
we can substitute (2.3.17) into (2.3.16) to get the following equation for θ,

$$\sum_{n=1}^{N} p_n u'(1 + \theta x_n) x_n = 0. \tag{2.3.18}$$

Equation (2.3.17) clearly shows that the martingale measure depends on the choice
of utility function.

We continue this example by considering a concrete family of utility functions.

Example 2.3.12 (Risk Aversion) Let us consider a class of utility functions that
depend on a parameter c > 0,

$$u_c(x) = \begin{cases} \ln x + cx & x > 0, \\ -\infty & x \le 0, \end{cases}$$

and set N = 3, $p_1 = p_2 = p_3 = 1/3$ and $x_1 = 1$, $x_2 = 0.5$ and $x_3 = -0.5$.
In this case the Lagrangian is

$$L((y, \theta), \lambda) = \sum_{n=1}^{N} p_n \bigl[-\ln(y_n) - c y_n + \lambda_n (y_n - 1 - \theta x_n)\bigr].$$

At the optimal solution $(\bar{y}, \bar{\theta})$, Equation (2.3.17) determines the Lagrange
multiplier as

$$\lambda = (\lambda_1, \lambda_2, \lambda_3) = \left( \frac{1}{1 + \bar{\theta}} + c,\ \frac{1}{1 + 0.5\bar{\theta}} + c,\ \frac{1}{1 - 0.5\bar{\theta}} + c \right). \tag{2.3.19}$$

The optimal portfolio $\bar{\theta}$ can be determined by (2.3.18), that is

$$\left( \frac{1}{1 + \bar{\theta}} + c \right) + \left( \frac{1}{1 + 0.5\bar{\theta}} + c \right) 0.5 - \left( \frac{1}{1 - 0.5\bar{\theta}} + c \right) 0.5 = 0. \tag{2.3.20}$$

Numerically solving (2.3.19) and (2.3.20) and scaling the Lagrange multipliers
yields Table 2.3, which relates c to the optimal portfolio $\bar{\theta}$ and the risk neutral measure π.

Table 2.3 Martingale measures when w0 = 1

c     θ̄      π1     π2     π3
0.0   0.868  0.178  0.232  0.589
0.2   1.023  0.183  0.226  0.591
0.4   1.154  0.185  0.222  0.593
0.6   1.258  0.189  0.219  0.593

We can see that, fixing $w_0$, when c increases so does $\bar{\theta}$, a fact that is not
hard to verify in general from Equation (2.3.20). Note that in our family
of utility functions depending on the parameter c, a decrease of c corresponds to
an increase of risk aversion. On the other hand, fixing a utility function (by fixing c),
a decrease of $w_0$ corresponds to an increase of risk aversion (see Table 2.4). This
is consistent with an intuitive explanation of the change in the martingale measure:
the weight in the middle (π2) increases while the weights on both
extremes (π1 and π3) decrease.

Table 2.4 Martingale measures when c = 0.2

w0   θ̄      π1     π2     π3
1    1.024  0.183  0.226  0.591
3    3.777  0.188  0.218  0.594
6    8.830  0.192  0.212  0.596
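The tabulated values can be reproduced, up to rounding, with a short numerical sketch. It solves the first-order condition (2.3.18) for the utility ln x + cx by bisection, with the budget constraint written as y_n = w0 + θ x_n so that both tables are covered, and then normalizes p_n λ_n to obtain the risk neutral weights. The function names and tolerance are assumptions.

```python
X = (1.0, 0.5, -0.5)            # x_n = S1(omega_n) - S0
P = (1 / 3, 1 / 3, 1 / 3)

def first_order_condition(theta, c, w0):
    """sum p_n u'(w0 + theta x_n) x_n with u'(y) = 1/y + c."""
    return sum(p * (1.0 / (w0 + theta * x) + c) * x for p, x in zip(P, X))

def solve_theta(c, w0, tol=1e-12):
    """Bisection on the interval where final wealth stays positive."""
    lo, hi = -w0 / max(X), -w0 / min(X)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if first_order_condition(mid, c, w0) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def martingale_measure(theta, c, w0):
    """Normalize p_n * lambda_n, where lambda_n = u'(w0 + theta x_n)."""
    weights = [p * (1.0 / (w0 + theta * x) + c) for p, x in zip(P, X)]
    total = sum(weights)
    return [w / total for w in weights]

theta_c0 = solve_theta(0.0, 1.0)
pi_c0 = martingale_measure(theta_c0, 0.0, 1.0)
theta_w3 = solve_theta(0.2, 3.0)
pi_w3 = martingale_measure(theta_w3, 0.2, 3.0)
```

By construction the resulting weights satisfy the martingale condition (2.3.16) after scaling, so the expected increment of the risky asset under π vanishes.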

Table 2.5 Prices of a call option when c = 0.2

w0   Price   π1     π2     π3
1    0.296   0.183  0.226  0.591
3    0.297   0.188  0.218  0.594
6    0.298   0.192  0.212  0.596

Fig. 2.7 Utility on quantity of option for different prices (curves for p = 0.296, 0.297 and 0.298; horizontal axis θ from −0.2 to 0.2)

Example 2.3.13 (Pricing Contingent Claims) We now turn to pricing contingent
claims. We consider the same financial market as in Example 2.3.12, defined by
S0 = (1, 1) and

S1(ω1) = (1, 2), S1(ω2) = (1, 1.5), S1(ω3) = (1, 0.5).

The payoff of a call option with strike 1 is

C(ω1) = 1, C(ω2) = 0.5, C(ω3) = 0.

Fixing the utility ln(x) + 0.2x, pricing C using the equivalent martingale measure
from the previous example gives the results in Table 2.5.
Fixing u(x) = ln(x) + 0.2x and w0 = 3, the table gives p = 0.297. This is the
private price of the agent corresponding to his/her risk aversion. The meaning of
this private price is that, to improve his/her utility, the agent should buy (long)
when the market price is lower than p = 0.297 and sell (short) when the market
price is higher. Figure 2.7 shows the expected utility

\[
f_p(\theta) := E[u(3 + \bar h (S_1 - S_0) + \theta (C - p))]
\]

for different values of the option price around p = 0.297.
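The private prices in Table 2.5 and the shape of the curves in Figure 2.7 can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: the optimal stock position is found by bisection on the first order condition (with the constant 1 in (2.3.20) replaced by w0), the price is the expectation of C under the resulting martingale measure, and all function names are hypothetical.

```python
import math

P = [1 / 3, 1 / 3, 1 / 3]   # physical probabilities
X = [1.0, 0.5, -0.5]        # stock gains S1 - S0 in states w1, w2, w3
CALL = [1.0, 0.5, 0.0]      # payoff C of the call with strike 1
C_PARAM = 0.2               # utility u(x) = ln(x) + 0.2 x

def optimal_theta(w0, tol=1e-12):
    """Optimal stock position: root of sum_n x_n * u'(w0 + theta*x_n)."""
    lo, hi = -w0 + 1e-9, 2.0 * w0 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        slope = sum(x * (1.0 / (w0 + mid * x) + C_PARAM) for x in X)
        lo, hi = (mid, hi) if slope > 0 else (lo, mid)
    return 0.5 * (lo + hi)

def private_price(w0):
    """Return (E_Q[C], optimal terminal wealth) for initial wealth w0."""
    theta = optimal_theta(w0)
    wealth = [w0 + theta * x for x in X]
    weights = [p * (1.0 / y + C_PARAM) for p, y in zip(P, wealth)]  # p_n * u'(y_n)
    total = sum(weights)
    price = sum(w * payoff for w, payoff in zip(weights, CALL)) / total
    return price, wealth

def expected_utility(w0, price, quantity):
    """f_p(quantity) = E[u(optimal wealth + quantity * (C - price))]."""
    _, wealth = private_price(w0)
    value = 0.0
    for p, y, payoff in zip(P, wealth, CALL):
        z = y + quantity * (payoff - price)
        value += p * (math.log(z) + C_PARAM * z)
    return value

for w0 in (1.0, 3.0, 6.0):
    print(w0, round(private_price(w0)[0], 3))
```

Evaluating `expected_utility(3.0, p, q)` for a few quantities q around 0 shows the behaviour described in the text: at the private price the maximum sits at q = 0, while for a higher market price a small short position does better than no position.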

Remark 2.3.14 We can see that when the market price differs from the agent’s private
price, an opportunity of improving utility arises. However, this does not mean an
opportunity for arbitrage. In fact, from the graph we can see that buying (or shorting)
too much will actually reduce the utility. When the market price equals the agent’s
private price, there is no opportunity of improving utility; in this case the agent
should take no position.

The utility optimization point of view also explains that trading will happen
between agents with different risk aversion determined by utility and initial
endowment. For example, assume the same utility u(x) = ln(x) + 0.2x for all
agents. If the market price is 0.297, then agents with w0 = 1 will sell, agents with
w0 = 6 will buy, while agents with w0 = 3 will take no action.

2.3.3 Complete Market

We have seen that in general the martingale measure is not unique and that
martingale measures are related to the investor’s utility function. One exception is
when the financial market is complete, as defined below:
Definition 2.3.15 (Complete Market) We say a financial market S is complete if

{Θ · S1 | Θ ∈ port[S]} = RV (Ω, F, P ),

or equivalently

{1B : B ∈ F} ⊂ {Θ · S1 | Θ ∈ port[S]}.

If S is not complete, then S is said to be incomplete.

The following characterizes the completeness of a financial market.
Proposition 2.3.16 (Unique Martingale Measure) Let S be a complete financial
market. Then there is a unique equivalent martingale measure.
Proof Since W = {Θ · S1 | Θ ∈ port[S], Θ · S0 = 0}, dim W = dim {Θ · S1 | Θ ∈
port[S]} − 1. Thus, for a complete market dim W⊥ = 1. Hence, in a complete market
the equivalent martingale measure is unique.

If we focus only on complete markets, then utility functions are irrelevant to
asset pricing. But, of course, most markets are incomplete. In a complete market the
search for an optimal portfolio can also be simplified.
Suppose that (x*, Θ*) is the solution to the constrained minimization problem
(2.3.12); then it is also the solution to the problem of minimizing the Lagrangian

\[
L((x,\Theta),\lambda) = E[(-u)(x) + \lambda (x - w_0)] - \langle \lambda, \Theta \cdot (S_1 - S_0) \rangle,
\]

which implies that Q = (λ/E[λ])P is the unique risk neutral measure. Moreover,
since x* satisfies the constraint x* − Θ* · (S1 − S0) − w0 = 0, we also know
that ⟨λ, x* − w0⟩ = E[λ] EQ[x* − w0] = 0. Thus, x* is also a solution to the constrained
minimization problem

\[
\begin{aligned}
&\text{minimize} && E[(-u)(x)] \\
&\text{subject to} && E_Q[x] = w_0.
\end{aligned} \tag{2.3.21}
\]

On the other hand, since −u is strictly convex, the solution to (2.3.21) is unique and,
therefore, must be x*. Thus, problems (2.3.12) and (2.3.21) have the same solution.
Remark 2.3.17
1. Problem (2.3.21) only provides a solution x ∗ . To get the optimal portfolio one
has to do additional work using the constraint.
2. The equivalence of the solutions of the two problems breaks down if martingale
measures are not unique and, therefore, the above result only holds in a complete
market.

2.3.4 Use Linear Programming Duality

If we set w0 = 0, then the utility optimization problem becomes

sup{E[u(x)] : x ∈ W }.

Importantly, property (u2) of the utility function forces x ∈ RV (Ω, F, P )+ so that

the problem is, in fact,

sup{E[u(x)] : x ∈ W ∩ RV (Ω, F, P )+ }.

Note that no arbitrage is equivalent to

W ∩ RV (Ω, F, P )+ = {0}.

Thus, for the purpose of characterizing no arbitrage, the problem is trivial.

What do we get from our theory then? We still see that no arbitrage implies
the existence of an equivalent martingale measure. Moreover, the martingale
measure is still proportional to an element of the subdifferential of the negative of
the utility function at the optimal portfolio. This is where we can derive more from our
approach. In this trivial problem the only solution is 0 for all economic states ω ∈ Ω.
Since u(t) = −∞ for t < 0, the subdifferential of −u at 0 is determined by the right
directional derivative:

\[
k := \lim_{t \downarrow 0} \frac{u(t) - u(0)}{t} > 0.
\]

In fact,

\[
-\partial(-u)(0) = [0, k]. \tag{2.3.22}
\]

Since this is true for all states ω ∈ Ω, it tells us the equivalent martingale measure
is proportional to a vector in [0, k]^N, where N is the number of states in Ω. This amounts
to a constraint on the martingale measure. We also note that in this case nothing
is lost by picking the utility function u(t) = t − ι_(−∞,0)(t), so that the utility
maximization problem becomes a linear programming problem. This way one can
use the more widely known linear programming duality instead of Fenchel duality.
This approach, however, loses the information relating to the agent’s risk aversion.

2.4 Risk Measures

We have discussed variance–standard deviation and drawdown as risk measures.

There are many other risk measures. To be systematic, in this section, we take an
axiomatic approach: list desired properties of risk measures. We focus on coherent
risk measures which are sublinear. Since sublinear function is a special type of
convex function, many tools in convex analysis and duality theory are applicable.

2.4.1 Coherent Risk Measure

Definition 2.4.1 (Risk Measure) Let RV (Ω, F, P ) represent the payoff space.
We say a lower semicontinuous function ρ : RV (Ω, F, P ) → R ∪ {+∞} is a risk
measure if ρ is convex and decreasing, i.e., ρ(x) ≤ ρ(y) for any x ≥ y.
Convexity of risk measures reflects the belief that diversification reduces risk.
The decreasing property says that a dominant payoff is less risky. We will focus on
the following:
Definition 2.4.2 (Coherent Risk Measure) Let RV (Ω, F, P ) represent the pay-
off space. We say a lower semicontinuous function ρ : RV (Ω, F, P ) → R∪{+∞}
is a coherent risk measure if, for any x, y ∈ RV (Ω, F, P ), ρ has the following
properties:
(r1) (Positive homogeneity) ρ(rx) = rρ(x) for any r > 0,
(r2) (Subadditivity) ρ(x + y) ≤ ρ(x) + ρ(y),
(r3) (Translation property) ρ(x + c1) = ρ(x) − c for all x ∈ RV (Ω, F, P ) and c ∈ R,
(r4) (Monotonicity) ρ(x) ≤ ρ(y) for any x ≥ y.
Properties (r1) and (r2) imply that a coherent risk measure is convex. Property
(r4) says a coherent risk measure is decreasing. Thus, a coherent risk measure is a
special type of risk measure. Property (r1) says that the risk measure scales
proportionally with the position. With this property a coherent risk measure is actually
sublinear. The idea of (r3) is that one may measure the risk of x by the minimum
amount of additional capital reserve needed to ensure that there is no risk of
bankruptcy. This is very important in practice. A coherent risk measure as defined
above has a simple structure and affords several equivalent characterizations, which
we will discuss below.

2.4.2 Equivalent Characterization of Coherent Risk Measures

Dual Representation

A coherent risk measure is convex. Any l.s.c. convex function on a finite dimensional
Banach space has the dual representation

\[
\rho(x) = \sup_{y \in RV(\Omega,\mathcal{F},P)} [\langle x, y \rangle - \rho^*(y)], \tag{2.4.1}
\]

where ⟨x, y⟩ = E[xy] and ρ* is the Fenchel conjugate of ρ defined in (1.3.1). What
is interesting here is that ρ* for any risk measure ρ satisfying (r1) and (r2) must
be an indicator function. Properties (r3) and (r4) further restrict the support of this
indicator function.
Proposition 2.4.3 (Conjugate of a Sublinear Risk Measure) Let ρ be a risk
measure satisfying axioms (r1) and (r2) in Definition 2.4.2. Then

\[
\rho^* = \iota_M,
\]

where

\[
M = \{ y : \langle x, y \rangle \le \rho(x),\ \forall x \in RV(\Omega,\mathcal{F},P) \}.
\]

Proof Clearly, for any y ∈ RV (Ω, F, P ), we have

\[
\rho^*(y) = \sup_{x \in RV(\Omega,\mathcal{F},P)} [\langle x, y \rangle - \rho(x)] \ge \langle 0, y \rangle - \rho(0) = 0.
\]

For any y ∈ M, ρ*(y) cannot exceed 0 so that it must be equal to 0.

On the other hand, for any y ∉ M, there exists x ∈ RV (Ω, F, P ) such that
⟨x, y⟩ − ρ(x) > 0. Since the function x → ⟨x, y⟩ − ρ(x) is positive homogeneous,
we must have

\[
\rho^*(y) \ge \sup_{r>0} [\langle rx, y \rangle - \rho(rx)] = \sup_{r>0} r[\langle x, y \rangle - \rho(x)] = +\infty.
\]

Thus,

\[
\rho^* = \iota_M.
\]

We note that the characterization of M in Proposition 2.4.3 depends on ρ. Thus
we cannot use it to describe ρ. Information that leads to restrictions on M
independent of ρ is useful. The axioms (r3) and (r4) provide such information.

Proposition 2.4.4 (Effect of the Translation Property) Let ρ be a risk measure
satisfying (r1), (r2), and (r3) in Definition 2.4.2. Then there exists a closed convex
subset

\[
M \subset \{ y \in RV(\Omega,\mathcal{F},P) : E[-y] = 1 \},
\]

such that

\[
\rho^* = \iota_M.
\]

Proof By Proposition 2.4.3, M = {y : ⟨x, y⟩ ≤ ρ(x), ∀x ∈ RV (Ω, F, P )}. If ρ
also satisfies (r3), choosing x = 1 and x = −1 (the constant payoffs), respectively,
we have E[y] ≤ −1 and E[−y] ≤ 1, respectively. Thus, E[−y] = 1 as was to be shown.
Proposition 2.4.5 (Effect of Monotonicity) Let ρ be a risk measure satisfying
(r1), (r2), and (r4) in Definition 2.4.2. Then there exists a closed convex subset

\[
M \subset -RV(\Omega,\mathcal{F},P)_+,
\]

such that

\[
\rho^* = \iota_M.
\]

Proof By Proposition 2.4.3, M = {y : ⟨x, y⟩ ≤ ρ(x), ∀x ∈ RV (Ω, F, P )}. If ρ also
satisfies (r4), then ρ(x) ≤ ρ(0) = 0 for any x ∈ RV (Ω, F, P )+. Hence, for any
y ∈ M and x ∈ RV (Ω, F, P )+ we have ⟨x, y⟩ ≤ 0, so
that y ∈ −RV (Ω, F, P )+.

By Example 1.3.3 the Fenchel conjugate of an indicator function is a support
function. We thus derive the following characterization of a coherent risk measure.
Theorem 2.4.6 (Dual Characterization of Coherent Risk Measure) Let ρ be a
risk measure. Then ρ is a coherent risk measure if and only if there exists a closed
convex subset

\[
M \subset \{ y \in -RV(\Omega,\mathcal{F},P)_+ : E[-y] = 1 \},
\]

such that

\[
\rho = \sigma_M,
\]

where σM is the support function of M as defined in (1.1.1).

Remark 2.4.7 A coherent risk measure is directly related to cash reserve. It is a way
to gauge how much cash reserve one needs to have for investing in a certain risky
asset. The set {y ∈ −RV (Ω, F, P )+ : E[−y] = 1} represents standardized
losses because E[y] = −1. Theorem 2.4.6 tells us that a coherent risk measure in
essence picks a particular “test” set of typical losses, represented by the set M,
to determine the level of cash reserve for a certain investment. There are infinitely
many possibilities in choosing the set M and thus determining particular coherent
risk measures. The larger the set M, the more conservative the risk measure
(requiring higher cash reserves). In fact, this is the original motivation for the
definition of the coherent risk measure. The Chicago Mercantile Exchange margin
system is an example of using this method with a finite set M. The idea is rather
similar to a “stress” test. In implementation, it is clear that what is important is not
how many elements one includes in M but how “diversified” the elements in M are.
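To make the finite “test set” idea concrete, here is a minimal sketch of a risk measure of the form ρ = σM with a finite M on a four-state economy. The states, probabilities, and stress scenarios are hypothetical and chosen only for illustration; each scenario q is a probability vector and the corresponding element of M is y = −q/p, so that y ≤ 0 and E[−y] = 1.

```python
# Hypothetical four-state economy with equally likely states.
P = [0.25, 0.25, 0.25, 0.25]
SCENARIOS = [                 # hypothetical stress scenarios (probability vectors)
    [0.25, 0.25, 0.25, 0.25],
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.70],
]
# Generating set M: y = -q/p, so that y <= 0 and E[-y] = sum(q) = 1.
M = [[-q / p for q, p in zip(scenario, P)] for scenario in SCENARIOS]

def inner(x, y):
    """<x, y> = E[xy]."""
    return sum(p * a * b for p, a, b in zip(P, x, y))

def rho(x):
    """Support function of the finite set M: the worst scenario-expected loss of x."""
    return max(inner(x, y) for y in M)

x = [1.0, -2.0, 0.5, 3.0]
print(rho(x), rho([a + 1.0 for a in x]))   # adding one unit of cash lowers the risk by 1
```

With this construction ρ(x) is the largest expected loss of x over the chosen scenarios, so adding a scenario can only increase the required reserve, in line with the remark that a larger M gives a more conservative measure.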

Coherent Acceptance Cone

Definition 2.4.8 (Acceptance Cone) Let ρ be a risk measure satisfying (r1), (r2),
and (r3) in Definition 2.4.2 and define

Aρ := {x ∈ RV (Ω, F, P ) | ρ(x) ≤ 0}. (2.4.2)

Then Aρ is a cone and we call it the acceptance cone induced by ρ.

The acceptance cone induced by a coherent risk measure has special properties and
such a cone actually characterizes the related coherent risk measure. We lay out the
details below.
Proposition 2.4.9 Let ρ be a coherent risk measure. Then the related acceptance
cone Aρ has the following properties:
(a1) Aρ is a closed convex cone,
(a2) 1 ∈ Aρ ,
(a3) RV (Ω, F, P )+ ⊂ Aρ .
Proof We merely note that (a1) is a consequence of (r1) and (r2), (a2) follows from
the translation property (r3), and (a3) is the result of the monotonicity property (r4).
Details are left as an exercise.

What is interesting is that any cone with properties (a1)–(a3) must be the
acceptance set of some coherent risk measure. This leads to the following definition.
Definition 2.4.10 (Coherent Acceptance Cone) We say a set A ⊂ RV (Ω, F, P )
is a coherent acceptance cone provided that it has the following properties:
(a1) A is a closed convex cone,
(a2) 1 ∈ A,
(a3) RV (Ω, F, P )+ ⊂ A.
Theorem 2.4.11 (Coherent Risk and Acceptance Cone) Let A ⊂ RV (Ω, F, P )
be a coherent acceptance cone. Then there exists a coherent risk measure ρA such
that

A = {x ∈ RV (Ω, F, P ) | ρA (x) ≤ 0}.

Proof The way to construct ρA is

ρA (x) = inf{t ∈ R | x + t 1 ∈ A}.

All the desired properties then follow naturally. We leave checking the details as an
exercise.
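As a quick illustration of this construction (a worked special case, assuming every state ω ∈ Ω has positive probability so that x ≥ 0 is a statewise condition), take the smallest coherent acceptance cone A = RV (Ω, F, P )+. Then

\[
\rho_A(x) = \inf\{ t \in \mathbb{R} \mid x + t\mathbf{1} \ge 0 \} = -\min_{\omega \in \Omega} x(\omega) = \max_{\omega \in \Omega} (-x(\omega)),
\]

the worst-case loss, and indeed

\[
\{ x \mid \rho_A(x) \le 0 \} = \{ x \mid \min_{\omega \in \Omega} x(\omega) \ge 0 \} = RV(\Omega,\mathcal{F},P)_+ = A.
\]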

It is natural to ask about the relationship between the acceptance cone and the
generating set of a coherent risk measure.
Theorem 2.4.12 (Acceptance Cone and the Generating Set) Let ρ be a coherent
risk measure with a generating set M, i.e. ρ = σM where σM is the support function
of M as defined in (1.1.1). Let Aρ be its acceptance cone. Then

\[
A_\rho = -(\operatorname{cone} M)^+,
\]

where cone M is the cone generated by M, i.e. the smallest cone containing M.
Proof We only need to observe that x ∈ −(cone M)+ if and only if ⟨x, m⟩ ≤ 0 for all
m ∈ M, if and only if ρ(x) = σM (x) ≤ 0, i.e. x ∈ Aρ.

Figure 2.8 provides a graphic illustration of the relationship between M and Aρ .

The coherent acceptance cone provides a dual representation of a coherent risk
measure. It provides a different implementation of margin rules that are essentially
the SEC methods adopted by the National Association of Securities Dealers (NASD).
The way they implement it is to consider a portfolio as consisting of a list of
component securities, and for each of these securities there is a corresponding margin
requirement. In the language of coherent acceptance cones, this amounts to specifying a
set of generating elements of the cone.

Fig. 2.8 Generating set M and acceptance set Aρ (diagram showing RV (Ω, F, P )+, Aρ, and M)

Coherent Preference

We know that any closed convex cone induces a continuous partial order. Denote ≤A
the linear partial order defined by a cone A, that is x ≤A y if and only if y − x ∈ A.

Proposition 2.4.13 Let A be a coherent acceptance cone and define the partial order
≤A by x ≤A y if and only if y − x ∈ A. Then ≤A has the following properties:
(o1) (Positive homogeneous) 0 ≤A x implies 0 ≤A tx for any t > 0,
(o2) (Additive) x ≤A y and u ≤A v implies x + u ≤A y + v,
(o3) (Reflexive) x ≤A x,
(o4) (Monotone) 0 ≤A x for any x ∈ RV (Ω, F, P )+ .
Proof Exercise.

Properties (o1)–(o4) also characterize the partial orders generated by a coherent
acceptance set.
Definition 2.4.14 (Coherent Partial Order) We say ≤ is a coherent partial
order provided that it has the following properties:
(o1) (Positive homogeneous) 0 ≤ x implies 0 ≤ tx for any t > 0,
(o2) (Additive) x ≤ y and u ≤ v implies x + u ≤ y + v,
(o3) (Reflexive) x ≤ x,
(o4) (Monotone) 0 ≤ x for any x ∈ RV (Ω, F, P )+ .
Theorem 2.4.15 (Coherent Partial Order and Acceptance Cone) Let ≤ be a
coherent partial order. Then there exists a coherent acceptance cone A such that
x ≤ y if and only if y − x ∈ A.
Proof The coherent acceptance cone can be identified as

A = {x ∈ RV (Ω, F, P ) | 0 ≤ x}.

Verifying the properties of A is not hard and is left as an exercise.

Valuation Bounds and Price System

Definition 2.4.16 (Valuation Bounds) Let ≤ be a coherent partial order as in
Definition 2.4.14. We define the related coherent valuation bounds, for x ∈
RV (Ω, F, P ), by

\[
\overline{\pi}(x) = \inf\{ r : x \le r\mathbf{1} \} \quad \text{and} \quad \underline{\pi}(x) = \sup\{ r : r\mathbf{1} \le x \}.
\]

Definition 2.4.17 (Price Operator) Let ≤ be a coherent partial order as in
Definition 2.4.14. We say π ∈ RV (Ω, F, P )∗ = RV (Ω, F, P ), π ≠ 0, is a price
operator if, for all 0 ≤ x,

\[
\langle \pi, x \rangle \ge 0.
\]

We say π is normalized if ⟨π, 1⟩ = 1.
Definition 2.4.18 (Consistent Price Operator) Consider a one period financial
market S on RV (Ω, F, P ). We say π ∈ RV (Ω, F, P )∗ \{0} is a consistent price
operator for S, provided that

\[
\langle \pi, S_1 \rangle = \langle \pi, S_0 \rangle.
\]

Viewing price operators as elements in the dual space is consistent with the one
price principle. The definition of price operators recognizes the relative value of any
payoff 0 ≤ x, or x ∈ A where A is the coherent acceptance cone generating the
partial order ≤. A normalized price is consistent with the value of cash implied in
the translation property of the coherent risk measure. A consistent price operator is,
in fact, a way of looking at martingale measures from the perspective of a pricing system.
The next proposition explains the meaning of valuation bounds and follows directly
from the definition.
Proposition 2.4.19 (Bounds for Normalized Price) Let π be a normalized price
operator. Then, for any x ∈ RV (Ω, F, P ),

\[
\underline{\pi}(x) \le \langle \pi, x \rangle \le \overline{\pi}(x).
\]

Proof Exercise.

While the concepts of valuation bounds and prices provide different perspectives,
they are closely related to the coherent risk measure and its equivalent descriptions in
terms of its coherent acceptance cone and coherent partial order, as evidenced in the
theorem below.
Theorem 2.4.20 (Valuation Bounds and Coherent Risk Measure) Let ≤ be the
coherent partial order generated by the coherent risk measure ρ and let π̄ and π̲ be
the upper and lower price bounds induced by the partial order ≤. Then, for any
x ∈ RV (Ω, F, P ),

\[
\rho(x) = \overline{\pi}(-x) = -\underline{\pi}(x).
\]

Proof Consider r ∈ R with −x ≤ r1. We have 0 ≤ x + r1 so that ρ(x) − r =
ρ(x + r1) ≤ 0, or ρ(x) ≤ r. Taking the infimum over all such r we have

\[
\rho(x) \le \overline{\pi}(-x).
\]

On the other hand, ρ(x + ρ(x)1) = ρ(x) − ρ(x) = 0 implies that

\[
\rho(x) \ge \overline{\pi}(-x).
\]

The equality π̄(−x) = −π̲(x) follows directly from the definition.

2.4.3 Good Deal

The concept defined below is a relaxation of arbitrage.

Definition 2.4.21 (Good Deal) Consider a one period financial market S on
RV (Ω, F, P ). Let port[S] be the portfolio space and let W = {Θ · (S1 − S0 ) : Θ ∈
port[S]} be the gain space. For a coherent acceptance cone A we say that x ∈ W is
a good deal with respect to A if there exists r > 0 such that

x − r 1 ∈ A.

In particular, a good deal with respect to A = RV (Ω, F, P )+ is an arbitrage.

We have the following characterization of the existence (or absence) of a good deal.
Proposition 2.4.22 (Existence of Good Deals) The portfolio space port[S] contains a
good deal with respect to A if and only if 1 ∈ W − A. Equivalently, port[S] contains
no good deal with respect to A if and only if 1 ∉ W − A.
Proof If 1 ∈ W − A we can find x ∈ W and a ∈ A such that x − 1 = a ∈ A. In
other words, x is a good deal. On the other hand, if x is a good deal, then x − r 1 = a
for some r > 0 and a ∈ A. Now 1 = x/r − a/r ∈ W − A as was to be shown.

The above characterization for the existence of good deal is from the perspective
of payoffs. We now relate it to price and price bounds. Mathematically, it is a process
of scalarization. What we do here is to consider the potential price of a payoff z in
the market. First we discuss price bounds for a good deal.
Definition 2.4.23 (Good Deal Bounds) Let A be a coherent acceptance cone and
let z ∈ W the gain space of financial market S. We define the upper and lower good
deal bounds with respect to A by

\[
\overline{\pi}_W(z) = \inf_{r \in \mathbb{R},\, x \in W} \{ r : x + r\mathbf{1} - z \in A \}
\]

and

\[
\underline{\pi}_W(z) = \sup_{r \in \mathbb{R},\, x \in W} \{ r : x - r\mathbf{1} + z \in A \}.
\]

As the name suggests, good deal bounds reveal prices for good deals. The interval
[π̲W (z), π̄W (z)] is the interval of normalized admissible prices that are consistent
with the absence of a good deal. In fact, if z has a normalized admissible price
P > π̄W (z), then there exist x = Θ · (S1 − S0 ) ∈ W and 0 < r < P such that
x + r1 − z ∈ A. Then we can sell short z at price P and assemble portfolio Θ · S0 at
time t = 0. When t = 1 the value of the portfolio gives us y = x + P 1 − z. Since
y − (P − r)1 = x + r1 − z ∈ A, it is a good deal.
The good deal bounds are actually coherent valuation bounds.
Proposition 2.4.24 (Good Deal Bounds as Valuation Bounds) The upper and
lower good deal bounds π̄W (z) and π̲W (z) defined in Definition 2.4.23 are actually
coherent valuation bounds.
Proof It is easy to check that π̄W (−z) = −π̲W (z). Moreover, rewrite −π̲W (z) as

\[
\begin{aligned}
-\underline{\pi}_W(z) &= -\sup_{r \in \mathbb{R},\, x \in W} \{ r : x - r\mathbf{1} + z \in A \} \\
&= \inf_{-r \in \mathbb{R}} \{ -r : -r\mathbf{1} + z \in A - W \} \\
&= \inf_{r \in \mathbb{R}} \{ r : z + r\mathbf{1} \in A - W \}.
\end{aligned}
\]

Since A − W is a cone containing RV (Ω, F, P )+ , we can see that −π̲W (z) =
ρA−W (z) is the coherent risk measure corresponding to the coherent acceptance
cone A − W .

Actually, one can show that ρA−W (z) = inf_{x∈W} ρA (x + z) (Exercise).

Note that the fundamental theorem of asset pricing is essentially based on the
separation of W and RV (Ω, F, P )+ . The same argument can be applied to yield a
similar result regarding good deal.
Theorem 2.4.25 (Fundamental Theorem of Asset Pricing for Good Deal) Let
A be a coherent acceptance cone and let W = {Θ · (S1 − S0 ) : Θ ∈ port[S]} be
the gain space of financial market S. Then port[S] contains no good deal iff there
exists an admissible consistent normalized price operator (see Definition 2.4.17).
Proof The portfolio space port[S] contains no good deal if and only if W does not
intersect the interior of A, if and only if there exists y ∈ RV (Ω, F, P )∗ =
RV (Ω, F, P ) such that

\[
\langle x, y \rangle \le \langle a, y \rangle, \quad \forall x \in W \text{ and } a \in A.
\]

Since 0 ∈ W , we have, for all a ∈ A, ⟨a, y⟩ ≥ 0. Thus, y is an admissible price.
Since 0 ∈ A, we have, for all x ∈ W , ⟨x, y⟩ ≤ 0. Since W is a subspace, ⟨x, y⟩ =
0 for all x ∈ W . This is equivalent to saying that π = y/⟨1, y⟩ is an admissible
consistent normalized price operator.

2.4.4 Several Commonly Used Risk Measures

We discuss several useful risk measures below, paying particular attention to how
many of the standard assumptions of a coherent risk measure in Definition 2.4.2 they
satisfy.

Standard Deviation

Variance, or equivalently standard deviation, has been used as a risk measure since
Markowitz proposed the modern portfolio theory. It satisfies (r1) and (r2) but fails
(r3) and (r4). The failure of the standard deviation to satisfy axiom (r4) has long
been criticized as unreasonable. Some remedies have been suggested, such as counting
the deviation only on losses. It turns out that

\[
\rho_s(x) = \sqrt{E[((x - E[x])^-)^2]} - E[x]
\]

is actually a coherent risk measure that is faithful to the idea of using downside
deviation as a measure for risk.
Both implementations suggested by the dual representation Theorem 2.4.6 and
the acceptance cone formulation in Theorem 2.4.11 are viable. For example, if one
uses the acceptance cone to implement, then each security is paired with a margin
requirement equal to its modified standard deviation, if that can be estimated.

Drawdown

The maximum absolute drawdown, denoted dd(x) in a given period of time is often
used by traders. This risk measure also satisfies axioms (r1) and (r2) but fails (r3)
and (r4).
As in the case of standard deviation we can also subtract E(x) to make it satisfy
(r3). One way to adjust it so that it has property (r4) is to make the reference point
for maximum down move to the fixed beginning wealth. But this completely distorts
the intention of drawdown as a risk measure.
Both implementations suggested by the dual representation Theorem 2.4.6 and
the acceptance cone formulation in Theorem 2.4.11 are viable without axiom (r4).
The only difference is that the acceptance cone may not contain the entire cone
RV (Ω, F, P )+ . This is not unreasonable in practice.
Table 2.6 A discrete loss distribution

L     Prob
600   0.02
50    0.03
40    0.05
30    0.10
20    0.10
10    0.05
0     0.65

Value at Risk

The value at risk of a portfolio in a given period is a gauge for the risk of the
portfolio that is important for both portfolio managers and regulators. It is defined
on the random variable of loss, the negative of the payoff.
Definition 2.4.26 (Value at Risk) Let L be the random variable representing the
loss of a portfolio in a given period. The value at risk with confidence level α ∈
(0, 1), denoted by VaRα, is defined as

\[
VaR_\alpha(L) = \inf\{ l \in \mathbb{R} \mid P(L > l) \le 1 - \alpha \}.
\]

In other words, VaRα is the smallest loss level that is exceeded with probability
at most 1 − α. The following is an illustration.
Example 2.4.27 (VaR of a Discrete Loss Distribution) Suppose that the loss L is
discretely distributed as in Table 2.6.
Then VaR0.95 (L) = 50, VaR0.9 (L) = 40, and VaR0.8 (L) = 30, where each level is
taken to be the smallest l with P (L ≤ l) > α. Note that in this table the distribution
function reaches each of these α exactly (for instance P (L > 40) = 0.05 = 1 − 0.95),
so the weak inequality in Definition 2.4.26 read literally would return the next lower
levels 40, 30, and 20.
Let FL (l) := P (L ≤ l) be the cumulative distribution function of L. Then

\[
VaR_\alpha(L) = \inf\{ l \in \mathbb{R} \mid F_L(l) \ge \alpha \}.
\]

We define the quantile function of L by

\[
Q_L(p) = \inf\{ l \in \mathbb{R} \mid p \le F_L(l) \}.
\]

When FL is an invertible function, QL = FL^{−1}.
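For a discrete distribution the infimum is attained at one of the loss levels, so VaR can be read off the cumulative distribution directly. The sketch below evaluates it on Table 2.6 in two ways: with the weak inequality FL (l) ≥ α, as in the formula above, and with the strict inequality FL (l) > α, which is the variant that reproduces the values quoted in Example 2.4.27. Probabilities are stored as integer hundredths (an implementation choice, to avoid floating point ties), and the function names are illustrative.

```python
# Table 2.6, with probabilities in integer hundredths to avoid rounding issues.
LOSSES = [0, 10, 20, 30, 40, 50, 600]
PROB_PCT = [65, 5, 10, 10, 5, 3, 2]

def cdf_pct(level):
    """100 * F_L(level) = 100 * P(L <= level)."""
    return sum(p for loss, p in zip(LOSSES, PROB_PCT) if loss <= level)

def var_lower(alpha_pct):
    """inf{l : F_L(l) >= alpha}: the infimum in Definition 2.4.26."""
    return min(l for l in LOSSES if cdf_pct(l) >= alpha_pct)

def var_upper(alpha_pct):
    """inf{l : F_L(l) > alpha}: the strict-inequality variant."""
    return min(l for l in LOSSES if cdf_pct(l) > alpha_pct)

for alpha_pct in (95, 90, 80):
    print(alpha_pct, var_lower(alpha_pct), var_upper(alpha_pct))
```

The two variants differ only when FL hits α exactly, which happens for all three confidence levels used in the example; for any other α between table levels they agree.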

Value at risk satisfies axioms (r1) and (r4). Similar to the maximum drawdown
one can adjust the cash position and define a revised version that also meets the
requirement of (r3). However, missing (r2) is a big drawback for VaR as a risk
measure and the remedy is complicated.
Table 2.7 Comparing VaR and CVaR

L     Prob   α      VaR   CVaR
600   0.02
50    0.03   0.95   50    270
40    0.05   0.9    40    155
30    0.10   0.8    30    92.5
20    0.10
10    0.05
0     0.65

Conditional Value at Risk

The risk measure defined below can be viewed as a remedy for the lack of convexity
of VaR.
Definition 2.4.28 (Conditional Value at Risk) Let L be the random variable that
represents the loss of a portfolio in a given period. The conditional value at risk
with confidence level α ∈ (0, 1), denoted by CVaRα, is defined as

\[
CVaR_\alpha(L) = \frac{1}{1-\alpha} \int_\alpha^1 VaR_s(L)\, ds.
\]

We can see that CVaRα is the expected or average loss over the worst outcomes,
which together have probability 1 − α of happening.
Example 2.4.29 (CVaR of a Discrete Loss Distribution) Suppose again that the loss
L is discretely distributed as in Table 2.6. Then CVaR0.95 (L) = (50 · 0.03 + 600 ·
0.02)/0.05 = 270, CVaR0.9 (L) = (40 · 0.05 + 50 · 0.03 + 600 · 0.02)/0.1 = 155, and
CVaR0.8 (L) = (30 · 0.1 + 40 · 0.05 + 50 · 0.03 + 600 · 0.02)/0.2 = 92.5.
Table 2.7 compares VaR and CVaR.
We can see that VaR has the effect of giving an unreasonable incentive to insurance
writers in general and Credit Default Swap (CDS) writers in particular.
It is not hard to see that both VaRα (L) and CVaRα (L) are increasing functions
of α and VaRα (L) is dominated by CVaRα (L).
The following representation reveals that the conditional value at risk is convex
with respect to L.
Theorem 2.4.30 (Representation as an Expectation)

\[
\begin{aligned}
CVaR_\alpha(L) &= \min_{r \in \mathbb{R}} \left\{ r + \frac{1}{1-\alpha} E[(L - r)^+] \right\} \\
&= VaR_\alpha(L) + \frac{1}{1-\alpha} E[(L - VaR_\alpha(L))^+].
\end{aligned} \tag{2.4.3}
\]

Fig. 2.9 Represent CVaR (graph of FL and QL with the level α and rα = VaRα (L) marked)

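Proof Note that for any r,

\[
\begin{aligned}
\frac{1}{1-\alpha} E[(L - r)^+] &= \frac{1}{1-\alpha} \int_\Omega (L(\omega) - r)^+ P(d\omega) \\
&= \frac{1}{1-\alpha} \int_\Omega \int_r^\infty 1_{[t,\infty)}(L(\omega))\, dt\, P(d\omega) \\
&= \frac{1}{1-\alpha} \int_r^\infty \int_\Omega 1_{[t,\infty)}(L(\omega))\, P(d\omega)\, dt \\
&= \frac{1}{1-\alpha} \int_r^\infty P(L \ge t)\, dt.
\end{aligned} \tag{2.4.4}
\]

In particular (see Figure 2.9 in which the shaded area represents E[(L − rα )+ ]),
letting r = rα = VaRα (L) we have

\[
\begin{aligned}
\frac{1}{1-\alpha} E[(L - r_\alpha)^+] &= \frac{1}{1-\alpha} \int_{r_\alpha}^\infty P(L \ge t)\, dt \\
&= \frac{1}{1-\alpha} \int_\alpha^1 (VaR_t(L) - r_\alpha)\, dt \\
&= \frac{1}{1-\alpha} \int_\alpha^1 VaR_t(L)\, dt - r_\alpha \\
&= CVaR_\alpha(L) - r_\alpha.
\end{aligned} \tag{2.4.5}
\]

This proves

\[
CVaR_\alpha(L) = VaR_\alpha(L) + \frac{1}{1-\alpha} E[(L - VaR_\alpha(L))^+].
\]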

To show that the min with respect to r is attained at r = rα we define

\[
\begin{aligned}
D &= \left[ r + \frac{1}{1-\alpha} E[(L - r)^+] \right] - \left[ r_\alpha + \frac{1}{1-\alpha} E[(L - r_\alpha)^+] \right] \\
&= r - r_\alpha + \frac{1}{1-\alpha} \int_r^{r_\alpha} P(L \ge t)\, dt,
\end{aligned} \tag{2.4.6}
\]

and we need only to show the easy fact that, for any r,

\[
(1-\alpha) D = (1-\alpha)(r - r_\alpha) + \int_r^{r_\alpha} P(L \ge t)\, dt \ge 0. \tag{2.4.7}
\]

The intuition is illustrated in Figure 2.10 in which the short vertical bars signify
r < rα and r > rα , respectively.

Fig. 2.10 Inequality (2.4.7) (graph of FL and QL with the level α and rα = VaRα (L) marked)
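The representation (2.4.3) can be checked on the loss distribution of Table 2.6. The sketch below is illustrative only: since the objective r + E[(L − r)+ ]/(1 − α) is convex and piecewise linear in r with kinks at the loss levels, it searches for the minimum over those levels; the helper names are hypothetical.

```python
LOSSES = [0, 10, 20, 30, 40, 50, 600]
PROBS = [0.65, 0.05, 0.10, 0.10, 0.05, 0.03, 0.02]

def objective(r, alpha):
    """r + E[(L - r)^+] / (1 - alpha), the function minimized in (2.4.3)."""
    excess = sum(p * max(loss - r, 0.0) for loss, p in zip(LOSSES, PROBS))
    return r + excess / (1.0 - alpha)

def cvar(alpha):
    # Convex, piecewise linear objective with kinks at the loss levels,
    # so it suffices to search over those levels.
    return min(objective(r, alpha) for r in LOSSES)

def minimizers(alpha, tol=1e-9):
    """Loss levels at which the minimum in (2.4.3) is attained (up to tol)."""
    best = cvar(alpha)
    return [r for r in LOSSES if objective(r, alpha) - best < tol]

for alpha in (0.95, 0.9, 0.8):
    print(alpha, round(cvar(alpha), 6), minimizers(alpha))
```

For these three confidence levels the minimizer is not unique: two adjacent loss levels (and every r between them) give the same value, which is why the CVaR figures of Example 2.4.29 do not depend on how ties in the definition of VaR are resolved.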

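The representation (2.4.3) can actually be written as a linear programming problem,
which yields the following dual representation.
Theorem 2.4.31 (Dual Representation)

\[
CVaR_\alpha(L) = \max\left\{ \langle v, -L \rangle : E[-v] = 1,\ 0 \le -v \le \frac{1}{1-\alpha}\mathbf{1} \right\}. \tag{2.4.8}
\]

Proof We can write the conditional value at risk with confidence level α as the value
function of the following linear programming problem:

\[
CVaR_\alpha(L) = \inf_{r \in \mathbb{R},\, u \in RV(\Omega,\mathcal{F},P)} \left\{ r + \frac{1}{1-\alpha} E[u] : u \ge 0,\ u + r\mathbf{1} \ge L \right\}.
\]

The Lagrangian of this linear programming problem is

\[
\mathcal{L}((r,u),(s,v)) = r + \frac{1}{1-\alpha} \langle \mathbf{1}, u \rangle + \langle s, u \rangle + \langle v, u + r\mathbf{1} - L \rangle,
\]

where s, v ≤ 0. For a linear programming problem, strong duality holds as long as
both the primal and dual problems are feasible. Thus, we have

\[
\begin{aligned}
CVaR_\alpha(L) &= \inf_{r,u}\ \sup_{s \le 0,\, v \le 0} \mathcal{L}((r,u),(s,v)) \\
&= \sup_{s \le 0,\, v \le 0}\ \inf_{r,u} \mathcal{L}((r,u),(s,v)) \\
&= \sup_{s \le 0,\, v \le 0}\ \inf_{r,u} \left[ r(1 + \langle v, \mathbf{1} \rangle) + \left\langle \frac{1}{1-\alpha}\mathbf{1} + s + v,\ u \right\rangle + \langle v, -L \rangle \right] \\
&= \sup_{s \le 0,\, v \le 0} \left\{ \langle v, -L \rangle : \langle -v, \mathbf{1} \rangle = 1,\ \frac{1}{1-\alpha}\mathbf{1} + s + v \ge 0 \right\} \\
&= \sup \left\{ \langle v, -L \rangle : E[-v] = 1,\ \frac{1}{1-\alpha}\mathbf{1} \ge -v \ge 0 \right\}.
\end{aligned}
\]

Since the dual solution exists, the sup is, in fact, a max.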

As a corollary we see that CV aR is essentially a coherent risk measure.

Corollary 2.4.32 Define ρ(x) = CV aRα (−x). Then ρ is a coherent risk measure.

Estimating CVaR

The dual representation in Theorem 2.4.31 provides a method of estimating the
conditional value at risk. Consider a portfolio Θ. Its corresponding gain is Θ · R
where R = S1 − S0 is the vector of gains of the assets in the financial market. The
loss is then represented by −Θ · R. Now suppose R^1 , . . . , R^m is a sample of the gain
vector of size m; then we can estimate the expectation of the return of the portfolio
Θ by

\[
E[\Theta \cdot R] \approx \frac{1}{m} \sum_{k=1}^m \Theta \cdot R^k.
\]

It follows from (2.4.3), applied to the loss −Θ · R, that

\[
CVaR_\alpha(-\Theta \cdot R) \approx \min_{r \in \mathbb{R}} \left\{ r + \frac{1}{(1-\alpha)m} \sum_{k=1}^m (-\Theta \cdot R^k - r)^+ \right\}. \tag{2.4.9}
\]

Similarly, by discretizing the dual representation we can estimate

\[
CVaR_\alpha(-\Theta \cdot R) \approx \max\left\{ \sum_{k=1}^m -v_k\, \Theta \cdot R^k : 0 \le v_k \le \frac{1}{(1-\alpha)m},\ k = 1, \ldots, m,\ \sum_{k=1}^m v_k = 1 \right\}. \tag{2.4.10}
\]

We can view (vk ) as an alternative probability measure on the sample space
{R^1 , R^2 , . . . , R^m }.
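The two sample estimates can be compared on a small data set. The sketch below is an illustration under stated assumptions: the sample of portfolio gains is made up, the primal estimate minimizes the piecewise linear objective of (2.4.9) over the sample points, and the dual problem (2.4.10) is solved greedily by putting the maximal allowed weight 1/((1 − α)m) on the largest losses until the weights sum to one, which is optimal for a linear objective with box constraints and a single sum constraint.

```python
def cvar_primal(losses, alpha):
    """Sample version of (2.4.9): min over r of r + sum_k (l_k - r)^+ / ((1 - alpha) m)."""
    m = len(losses)

    def objective(r):
        return r + sum(max(l - r, 0.0) for l in losses) / ((1.0 - alpha) * m)

    # The objective is convex and piecewise linear with kinks at the sample losses.
    return min(objective(r) for r in losses)

def cvar_dual(losses, alpha):
    """Sample version of (2.4.10): load the capped weights onto the worst losses."""
    m = len(losses)
    cap = 1.0 / ((1.0 - alpha) * m)
    remaining, value = 1.0, 0.0
    for l in sorted(losses, reverse=True):
        if remaining <= 0.0:
            break
        v = min(cap, remaining)   # weight v_k on this sample point
        value += v * l
        remaining -= v
    return value

# Hypothetical sample of portfolio gains Theta . R^k; losses are their negatives.
gains = [0.03, -0.10, 0.01, 0.04, -0.02, 0.00, 0.02, -0.05, 0.01, 0.06]
losses = [-g for g in gains]
print(cvar_primal(losses, 0.8), cvar_dual(losses, 0.8))
```

With α = 0.8 and m = 10 the cap is 0.5, so the dual weights sit on the two worst outcomes and the estimate is their average; the primal minimization returns the same number, as strong duality suggests.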
Chapter 3
Finite Period Financial Models

Abstract We now expand our discussion to a multi-period economy with finite
status. This setting models trading in the real world quite well, where we always
deal with only a finite number of transactions and a finite number of possible scenarios.
On the technical side, both payoffs and trading strategies still belong to
finite dimensional vector spaces. The first three sections show that the key results
in a one period economy also hold in the more general setting of a multi-period
economy. Section 3.4 discusses super and sub-hedging from the perspective of
duality. Section 3.5 discusses how to model the more practical financial markets
with bid and ask spreads.

3.1 The Model

3.1.1 An Example

Consider the game of betting on the flip of a fair coin.

• Head: the house will double your bet.
• Tail: you lose your bet to the house.
Play the game repeatedly and always bet 1 unit. Denote the outcome of the ith game
by Xi . Then Xi is a random variable and P (Xi = 1) = P (Xi = −1) = 1/2. If we
start with an initial endowment of w0 , then our total wealth after the ith game is

wi = w0 + X1 + . . . + Xi . (3.1.1)

Now (wi ), i = 1, . . . , n, is an example of a discrete stochastic process.

© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2018
P. Carr, Q. J. Zhu, Convex Duality and Financial Mathematics,
SpringerBriefs in Mathematics, https://doi.org/10.1007/978-3-319-92492-2_3

We turn to consider the available information at each stage. Suppose we know
X1 , . . . , Xi . Does this help us to play the (i + 1)th game? In this case we have no
reason to believe so. How do we clearly describe this conclusion? Let us look at the
game with n = 3 to get some feeling. We use H to represent a head and T a tail. The
information we can get at each stage can be illustrated with the following binary
tree.

F0       F1     F2      F3
{Ω} ─┬─ H ─┬─ HH ─┬─ HHH
     │     │      └─ HHT
     │     └─ HT ─┬─ HTH
     │            └─ HTT
     └─ T ─┬─ TH ─┬─ THH
           │      └─ THT
           └─ TT ─┬─ TTH
                  └─ TTT

In this example all the information is represented by F3 = 2^Ω, where

Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

Similarly, after 2 tosses F2 = 2^{HH, HT, TH, TT}, where

{HH, HT, TH, TT} = {{HHH, HHT}, {HTH, HTT}, {THH, THT}, {TTH, TTT}}.

F2 has less information than F3 . Similarly, F1 = 2^{H, T}, where

{H, T} = {{HHH, HHT, HTH, HTT}, {THH, THT, TTH, TTT}}.

At the beginning F0 = {∅, Ω}.

A random variable such as wi relies only on information up to time i. Then, for
any a, (wi < a) ∈ Fi . In other words, wi is Fi -measurable. We say a stochastic
process X = (Xi ) is F-adapted if, for each i, Xi is Fi -measurable. The random
process (wi ) in the coin toss example is F-adapted.
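
The coin toss example can be checked mechanically. The following minimal sketch (the helper names `atoms`, `wealth`, and `is_measurable` are ours, introduced only for illustration) builds the atoms of each Fi for n = 3 and uses the fact that, on a finite sample space, a random variable is Fi -measurable exactly when it is constant on every atom of Fi .

```python
from itertools import product

n = 3
omega = ["".join(p) for p in product("HT", repeat=n)]  # the 8 sample paths

def atoms(i):
    """Atoms of F_i: sample paths grouped by their first i tosses."""
    groups = {}
    for path in omega:
        groups.setdefault(path[:i], set()).add(path)
    return list(groups.values())

def wealth(path, i, w0=0):
    """w_i = w0 + X_1 + ... + X_i with X = +1 for a head and -1 for a tail."""
    return w0 + sum(1 if toss == "H" else -1 for toss in path[:i])

def is_measurable(f, i):
    """On a finite sample space, f is F_i-measurable iff it is constant on each atom."""
    return all(len({f(path) for path in atom}) == 1 for atom in atoms(i))

# Number of atoms of F_0, F_1, F_2, F_3.
atom_counts = [len(atoms(i)) for i in range(n + 1)]

# (w_i) is adapted: each w_i is F_i-measurable.
adapted = all(is_measurable(lambda p, i=i: wealth(p, i), i) for i in range(n + 1))

# w_3 depends on the third toss, so it is not F_2-measurable.
w3_known_at_2 = is_measurable(lambda p: wealth(p, 3), 2)
```

Here `atom_counts` recovers the column sizes of the binary tree above, and the last check illustrates why knowing the first two tosses does not determine the wealth after the third.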

3.1.2 A General Model

We continue using the probability space (Ω, F, P ) to represent an economy where
the sample space Ω is finite. Transactions can now happen at a finite set of
times {0, 1, . . . , T } instead of only {0, 1}. Involving transactions at multiple stages
requires us to be more explicit about the information available at each of the
stages. An information structure is a finite chain of σ -algebras of Ω: F = {{∅, Ω} =
F0 ⊂ F1 ⊂ . . . ⊂ FT = F}. It represents the gradual revelation of information as
illustrated in the previous subsection. Since Ω is finite, each Ft is generated by a
finite number of atoms $B_t = \{B_t^n, n = 1, \ldots, N_t\}$. We model a financial market
with an (M + 1)-dimensional F-adapted stochastic process S = (S0 , S1 , . . . , ST )
where $S_t = (S_t^0, S_t^1, \ldots, S_t^M)$ represents the prices of M + 1 assets at time t and is
Ft -measurable. Again we assume the risk free rate is 0 so that $S_t^0 = 1$.
Definition 3.1.1 (Portfolio) We say Θt is a portfolio on the time interval [t, t + 1)
if Θt is an Ft -measurable vector in $R^{M+1}$. Two portfolios $\Theta_t^1$ and $\Theta_t^2$ are equivalent
on market S if their restrictions to all the atoms $B_t^n$, n = 1, 2, . . . , Nt , of Ft are
equivalent in the sense of Definition 2.3.2. We define the norm of a portfolio Θt by

\[ \|\Theta_t\|_p = \sqrt{\sum_{n=1}^{N_t} \big\|\Theta_t|_{B_t^n}\big\|_p^2}, \]

where $\|\Theta_t|_{B_t^n}\|_p$ is the portfolio norm as in Definition 2.3.4.

We note that the portfolio space and portfolio norm in Definition 2.3.4 are a special
case of Definition 3.1.1.
Definition 3.1.2 (Trading Strategy) A trading strategy Θ = (Θ0 , Θ1 , . . . , ΘT −1 )
is an F-adapted process of (M + 1)-dimensional random vectors where each Θt
is a portfolio on [t, t + 1). The space of all trading strategies is called the trading
strategy space on market S and is denoted by ts[S]. We define the norm of a trading
strategy Θ ∈ ts[S] by

\[ \|\Theta\|_{ts} = \sqrt{\sum_{t=0}^{T-1} \|\Theta_t\|_p^2}. \]

Then $(ts[S], \|\cdot\|_{ts})$ is a finite dimensional Banach space.

In real-world investing, investors often face scenarios in which not all the
trading strategies in ts[S] are available. For example,
• If short selling is not allowed, then the set of admissible trading strategies is
defined by

\[ ts[S]_+ = \{\Theta \in ts[S] \mid \Theta_t \ge 0,\ t = 0, 1, \ldots, T-1\}. \]

• If for a particular investor only a subset of the assets $\{S^0, S^1, \ldots, S^k\}$ is available,
then the set of admissible trading strategies becomes

\[ ts[\{S^0, S^1, \ldots, S^k\}] = \{\Theta \in ts[S] \mid \Theta_t^m = 0,\ m = k+1, \ldots, M,\ t = 0, 1, \ldots, T-1\}. \]

• Suppose a subset of the assets $S^{k+1}, \ldots, S^M$ can only be traded at t = 0 and
t = T . Then the set of admissible trading strategies is defined by

\[ \{\Theta \in ts[S] \mid \Theta_t^m = \Theta_0^m,\ m = k+1, \ldots, M,\ t = 1, \ldots, T-1\}. \]

By choosing different subsets of ts[S] we can conveniently handle different
scenarios of the finite period financial model over the economy (Ω, F, P ). We can
view various questions related to these scenarios as finding suitable admissible
trading strategies to obtain preferred risk adjusted gains. However, the preference
will depend on the agent, who is usually risk averse. By and large, there are two ways
of modeling risk aversion: using concave utility functions and using convex risk
or loss functions. As a result, problems related to these financial models will be
handled in the framework of maximizing expected utility functions or minimizing
convex risk functions. Thus, tools in convex analysis again play essential roles.
We say a trading strategy is self-financing if

\[ \Theta_{t-1} \cdot S_t = \Theta_t \cdot S_t, \quad t = 1, 2, \ldots, T-1. \]

We use $\mathcal{T}$ to denote the set of all self-financing trading strategies on market S. Clearly $\mathcal{T}$ is a
subspace of ts[S]. The gain of a self-financing trading strategy Θ up to time t is the
cumulative gain of the portfolios Θs , s = 0, 1, . . . , t − 1:

\[ G_t(\Theta) := \sum_{s=1}^{t} \Theta_{s-1} \cdot (S_s - S_{s-1}) = \Theta_{t-1} \cdot S_t - \Theta_0 \cdot S_0. \]

We can verify that Gt (Θ) ∈ RV (Ω, F, P ) for all t = 1, 2, . . . , T .
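
The telescoping identity behind the gain formula is easy to check numerically. The sketch below follows a single scenario of a hypothetical market with cash (price 1) and one stock; the prices, the target stock holdings, and the helper names `dot`, `rebalance`, `gain_sum`, and `gain_closed` are illustrative assumptions, not part of the model above. At each rebalancing date the cash position is chosen so that the self-financing condition holds.

```python
# One scenario of a market with cash (asset 0, price 1) and one stock (asset 1).
prices = [(1.0, 100.0), (1.0, 110.0), (1.0, 99.0), (1.0, 105.0)]  # S_0, ..., S_3

def dot(theta, s):
    """Portfolio value theta . S."""
    return sum(a * b for a, b in zip(theta, s))

def rebalance(theta_prev, s_t, new_stock):
    """Choose the cash position so that theta_prev . S_t == theta_new . S_t."""
    value = dot(theta_prev, s_t)
    cash = (value - new_stock * s_t[1]) / s_t[0]
    return (cash, new_stock)

stock_targets = [2.0, -1.0, 3.0]      # hypothetical stock holdings of Θ_0, Θ_1, Θ_2
theta = [(50.0, stock_targets[0])]    # Θ_0: 50 in cash and 2 shares
for t in (1, 2):
    theta.append(rebalance(theta[t - 1], prices[t], stock_targets[t]))

def gain_sum(t):
    """G_t as the cumulative sum of Θ_{s-1} . (S_s - S_{s-1})."""
    return sum(
        dot(theta[s - 1], [x - y for x, y in zip(prices[s], prices[s - 1])])
        for s in range(1, t + 1)
    )

def gain_closed(t):
    """G_t as Θ_{t-1} . S_t - Θ_0 . S_0, valid for self-financing strategies."""
    return dot(theta[t - 1], prices[t]) - dot(theta[0], prices[0])
```

For this scenario both expressions give the gains 20, 31, and 49 at t = 1, 2, 3. If the cash position were not adjusted by `rebalance`, the two expressions would in general disagree.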

The norm of a trading strategy is a good proxy for its leverage level, which is
important for many purposes. As a corollary of Lemma 2.3.3 we have
Corollary 3.1.3 There exists a constant K = K(S) that depends only on market S
such that for any self-financing trading strategy $\Theta \in \mathcal{T}$,

\[ \|\Theta\|_{ts} \le K \max\{\|G_t(\Theta)\|_{RV},\ t = 1, 2, \ldots, T\}. \]

3.2 Arbitrage and Admissible Trading Strategies

We extend the definition of arbitrage in Definition 2.3.5 to trading strategies.

Definition 3.2.1 (Arbitrage Trading Strategy) We say that a self-financing trading
strategy Θ on market S is an arbitrage if $G_t(\Theta) \ge 0$, t = 1, . . . , T ,
and $G_T(\Theta) \ne 0$.

In practical trading there is always a limit on how much one can lose. This
leads to the concept of admissible trading strategies described below.
Definition 3.2.2 (Admissible Trading Strategy) Let a > 0 be a constant. We
say that a self-financing trading strategy Θ ∈ T is a-admissible if, for all t =
1, 2, . . . , T ,

Gt (Θ) ≥ −a. (3.2.1)

We use A(a) to denote the (convex) set of all a-admissible trading strategies.
An arbitrage trading strategy is a-admissible for any a > 0. Thus, we have
Lemma 3.2.3 For a > 0, T contains no arbitrage if and only if A(a) contains no
arbitrage.
The next lemma shows that when T contains no arbitrage, to show that Θ is
a-admissible we need only check condition (3.2.1) at t = T .
Lemma 3.2.4 If T contains no arbitrage, then Θ ∈ T is a-admissible if and only
if

GT (Θ) ≥ −a. (3.2.2)

Proof The “only if” part is obvious.

To prove the “if” part observe first that without loss of generality we may assume
that the initial endowment Θ0 · S0 = 0 so that Gt (Θ) = Θt−1 · St , t = 1, 2, . . . , T .
Now assume that (3.2.2) holds and Θ is not a-admissible. Then there exist t < T
and A ∈ Ft such that on A,

\[ \Theta_{t-1} \cdot S_t = b < -a \]

and $\Theta_{s-1} \cdot S_s \ge -a$ on A for all s > t.

Define a trading strategy Θ̄ as follows: for all s ≤ t − 1, Θ̄s = 0. For ω ∉ A,
Θ̄t (ω) = 0 and for ω ∈ A,

\[ \bar\Theta_t^n(\omega) = \begin{cases} \Theta_t^0(\omega) - b & \text{for } n = 0 \\ \Theta_t^n(\omega) & \text{for } n = 1, 2, \ldots, M. \end{cases} \tag{3.2.3} \]

For s > t define

\[ \bar\Theta_s^n = \begin{cases} \bar\Theta_t \cdot S_{t+1} & \text{for } n = 0 \\ 0 & \text{for } n = 1, 2, \ldots, M. \end{cases} \tag{3.2.4} \]

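We can see that Θ̄ is F-adapted. Moreover, for ω ∈ A,

\[ \begin{aligned} \bar\Theta_t \cdot S_t &= \Theta_t^0 - b + \sum_{n=1}^{M} \Theta_t^n S_t^n \\ &= \Theta_t \cdot S_t - b = \Theta_{t-1} \cdot S_t - b = 0 = \bar\Theta_{t-1} \cdot S_t. \end{aligned} \tag{3.2.5} \]

For ω ∉ A, Θ̄t · St = 0 = Θ̄t−1 · St by definition. For s > t, the portfolios Θ̄s hold
only cash, with value Θ̄s−1 · Ss = Θ̄t · St+1 , and, therefore, Θ̄ is a self-financing trading strategy.
Finally, for all s > t,

\[ \bar\Theta_{s-1} \cdot S_s = \bar\Theta_t \cdot S_{t+1} = \begin{cases} \Theta_t^0 - b + \sum_{n=1}^{M} \Theta_t^n S_{t+1}^n = \Theta_t \cdot S_{t+1} - b \ge -a - b > 0 & \text{for } \omega \in A \\ 0 & \text{for } \omega \notin A. \end{cases} \tag{3.2.6} \]

This implies that Θ̄ is an arbitrage, which leads to a contradiction.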

We can also show that when there is no arbitrage the set of admissible trading
strategies A(a) is compact.
Lemma 3.2.5 For any a > 0, if A(a) contains no arbitrage, then it is bounded and
compact.
Proof We first show that A = A(a) is bounded. For t = 1, 2, . . . , T , let us denote At =
{Θ ∈ A : Θs contains only cash position for s > t − 1}. We note that AT = A
and prove by induction on t. Again without loss of generality we assume the initial
endowment is always 0.
For t = 1, assume that there is no arbitrage but A1 is unbounded. By
Corollary 3.1.3 there exists a sequence of trading strategies Θ(m) ∈ A1 such that
$\|\Theta(m)_0 \cdot S_1\|$ is unbounded. Without loss of generality we may assume that, for all
m, $\|\Theta(m)_0 \cdot S_1\| > 1$ and $\|\Theta(m)_0 \cdot S_1\| \to +\infty$. Then $\Theta(m)/\|\Theta(m)_0 \cdot S_1\| \in A_1$
and is bounded by Corollary 3.1.3. Selecting a subsequence if necessary we may
assume that $\Theta(m)/\|\Theta(m)_0 \cdot S_1\|$ converges to Θ∗ ∈ A1 . Since Θ(m)0 · S1 ≥ −a,
taking limits we have

\[ \lim_{m \to \infty} \frac{\Theta(m)_0 \cdot S_1}{\|\Theta(m)_0 \cdot S_1\|} = \Theta_0^* \cdot S_1 \ge 0. \]

On the other hand, we also know from the above limiting process that $\|\Theta_0^* \cdot S_1\| = 1$.
This means Θ∗ is an arbitrage, a contradiction.
Now under the induction hypothesis that As , s = 1, 2, . . . , t − 1, are all bounded,
we show that At is bounded. Assume that the contrary holds. Then there exists a
sequence of trading strategies Θ(m) ∈ At such that $\|\Theta(m)_{t-1} \cdot S_t\|$ is unbounded.
Since all As , s = 1, 2, . . . , t − 1, are bounded, the portfolio Θ(m)t−1 must
be unbounded. Then the same argument as in the case of t = 1 will yield a
contradiction. This completes the induction proof and, therefore, A is bounded.
Since Θt · St is continuous in Θt , A defined by constraint (3.2.1) is also closed
and, therefore, it is compact.

3.3 Fundamental Theorem of Asset Pricing

Now we turn to proving the FTAP in the multiperiod market model and discuss related
applications.

3.3.1 Fundamental Theorem of Asset Pricing

As in the case of T = 1, we prove the FTAP by considering a pair of dual convex

programming problems in which the primal is maximizing utility among admissible
trading strategies:

sup{E[u(ΘT −1 · ST )] : Θ0 · S0 = w0 , Θ ∈ T }, (3.3.1)

where T is the set of self-financing trading strategies. We show that a solution to the
dual of (3.3.1), when scaled, gives us a martingale measure, thus linking the
fundamental theorem of asset pricing to the utility maximization problem (3.3.1).
Theorem 3.3.1 Let S be a financial market. Then the following statements are
equivalent:
(i) There exists no arbitrage trading strategy in T ;
(ii) For every utility function u with properties (u1), (u2), and (u3), the finite
optimal value of the trading strategy utility optimization problem (3.3.1) is
attained.
(iii) There is an equivalent S-martingale measure proportional to an element of the
subdifferential of the utility function at the optimal portfolio.

Proof First observe that the utility optimization problem (3.3.1) can be written
equivalently as

\[ \max\ E[u(y)] \quad \text{subject to } y \in w_0 + W, \tag{3.3.2} \]

where $W = \{G_T(\Theta) : \Theta \in \mathcal{T}\}$ is the linear subspace of all achievable gains using
self-financing trading strategies.
Defining $f(y) = -E[u(y)]$ and $g(y) = \iota_{w_0 + W}(y)$, we can rewrite problem (3.3.2) as

\[ -\min\{f(y) + g(y)\}. \tag{3.3.3} \]

The dual problem of (3.3.3) is

\[ -\max\{-f^*(-z) - g^*(z)\} = \min\{E[(-u)^*(-z)] + \langle z, w_0 \rangle + \sigma_W(z)\}. \tag{3.3.4} \]
Since we can check that the constraint qualification condition

0 ∈ ri[dom(g) − dom(f )] = (w0 + W ) − RV (Ω, F, P )+ (3.3.5)

holds, (3.3.3) and its dual (3.3.4) have the same value.
When T contains no arbitrage, by property (u2) of the utility function,
E[u(ΘT −1 ·ST )] > −∞ implies ΘT −1 ·ST ≥ 0 or GT (Θ) ≥ −w0 . By Lemma 3.2.4,
we must have Θ ∈ A(w0 ). Thus, the utility maximization problem (3.3.1) is
equivalent to

\[ \sup\{E[u(\Theta_{T-1} \cdot S_T)] : \Theta_0 \cdot S_0 = w_0,\ \Theta \in A(w_0)\}. \tag{3.3.6} \]

By Lemma 3.2.5 problem (3.3.6) and, therefore, (3.3.2) has a finite solution. By
strong duality, the dual problem (3.3.4) has a finite optimal value and attains
its solution. Condition (u2) forces the domain of $E[(-u)^*(\cdot)]$ to be a subset
of $\mathrm{int}(-RV(\Omega, \mathcal{F}, P)_+)$. Thus, we only need to consider z > 0 in the dual
problem (3.3.4). Moreover, we must have $\langle z, G_T(\Theta) \rangle = 0$ in (3.3.4) since $\sigma_W(z) < \infty$
and W is a subspace of RV (Ω, F, P ). Hence we can write problem (3.3.4) as

\[ \min\{E[(-u)^*(-z)] + \langle w_0, z \rangle \mid z > 0,\ \langle z, G_T(\Theta) \rangle = 0,\ \forall \Theta \in \mathcal{T}\}. \tag{3.3.7} \]

Let z̄ be a solution to (3.3.7). It is easy to check that $Q = (\bar z / E[\bar z])P$ is an equivalent
S-martingale measure. Thus, (i) implies (ii).
On the other hand, the existence of an equivalent S-martingale measure implies
that the dual problem (3.3.4) has a finite value and, therefore, is equivalent to
problem (3.3.7), whose dual is the utility maximization problem (3.3.1). Problem (3.3.7)
can be viewed as minimizing the convex function $z \mapsto E[(-u)^*(-z)] + \langle w_0, z \rangle$
over the entire subspace $\{z : \langle z, G_T(\Theta) \rangle = 0,\ \forall \Theta \in \mathcal{T}\}$ (z > 0 is merely a
consequence of the domain of $E[(-u)^*(\cdot)]$ being a subset of $\mathrm{int}[-RV(\Omega, \mathcal{F}, P)_+]$
and, therefore, is not a separate constraint). Thus, the constraint qualification
condition for (3.3.7) is satisfied. It follows that problem (3.3.1), as the dual of (3.3.7),
has a finite value and attains its solution, which is to say that (ii) implies (iii).

Finally, if (iii) is true, then there cannot be any arbitrage in T because adding an
arbitrage to the optimal solution of (3.3.1) will improve it. Thus, (iii) implies (i) and
we have completed a cyclic proof of the equivalence of (i), (ii), and (iii).

3.3.2 Relationship Between Dual of Portfolio Utility

Maximization, Lagrange Multiplier and Martingale
Measure

Although the equivalence between no arbitrage and the existence of an equivalent martingale
measure is well known, the proof of Theorem 3.3.1 using a class of utility functions
says more. It tells us that the risk neutral measure is, in fact, a scaling of the
solution to the dual of the portfolio utility maximization problem. Moreover, since
the dual solution corresponds to the Lagrange multiplier of the primal portfolio
utility maximization problem, we see that the equivalent martingale measure can
also be explained as a scaling of the Lagrange multiplier of the portfolio utility
maximization problem.
To see this relationship explicitly, let us write the utility optimization prob-
lem (3.3.1) as

inf{E[(−u)(x)] : x − GT (Θ) − w0 = 0, Θ ∈ T }. (3.3.8)

The existence of a solution to the dual of (3.3.8) implies the existence of a
Lagrange multiplier λ ∈ RV (Ω, F, P ) such that the Lagrangian

\[ \begin{aligned} L((x, \Theta), \lambda) &= E[(-u)(x)] + \langle \lambda, x - G_T(\Theta) - w_0 \rangle \\ &= E[(-u)(x) + \lambda(x - w_0)] - \langle \lambda, G_T(\Theta) \rangle \end{aligned} \]

attains its minimum at the solution (x∗ , Θ∗ ) to problem (3.3.1). It follows that, for any ω with
P (ω) ≠ 0,

\[ \lambda(\omega) \in -\partial(-u)(x^*(\omega)) \subset (0, +\infty) \tag{3.3.9} \]

and, since $\Theta \mapsto \langle \lambda, G_T(\Theta) \rangle$ is linear,

\[ \langle \lambda, G_T(\Theta) \rangle = 0, \quad \forall \Theta \in \mathcal{T}. \tag{3.3.10} \]

It is easy to deduce from (3.3.10) that $E[\lambda(S_t - S_{t-1}) \mid \mathcal{F}_{t-1}] = 0$. Thus, Q =
(λ/E[λ])P is a martingale probability measure for market S equivalent to P .
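
To make the scaling relation Q = (λ/E[λ])P concrete, the following sketch works through a hypothetical two-period binomial market with u = log. All numbers and names (`p`, `up`, `down`, `fraction`, and the helper functions) are assumptions made for illustration. For log utility it is a standard fact that the optimal strategy keeps a constant fraction of wealth in the stock, and the one-period first-order condition gives that fraction in closed form; the multiplier is then λ = u′(x∗) = 1/x∗.

```python
from itertools import product

# Hypothetical binomial market: each period the stock is multiplied by `up` with
# probability p and by `down` otherwise; the cash account earns zero interest.
p, up, down = 0.6, 1.5, 0.8
w0, s0, periods = 1.0, 1.0, 2

# For u = log the optimal strategy holds a constant fraction of wealth in the stock;
# the one-period first-order condition yields the fraction below.
a, b = up - 1.0, 1.0 - down
fraction = p / b - (1.0 - p) / a

paths = list(product("HT", repeat=periods))

def path_prob(path):
    prob = 1.0
    for toss in path:
        prob *= p if toss == "H" else 1.0 - p
    return prob

def stock_price(path):
    price = s0
    for toss in path:
        price *= up if toss == "H" else down
    return price

def optimal_wealth(path):
    wealth = w0
    for toss in path:
        wealth *= 1.0 + fraction * (a if toss == "H" else -b)
    return wealth

# λ(ω) = u'(x*(ω)) = 1 / x*(ω), and Q = (λ / E[λ]) P.
lam = {path: 1.0 / optimal_wealth(path) for path in paths}
mean_lam = sum(path_prob(path) * lam[path] for path in paths)
Q = {path: path_prob(path) * lam[path] / mean_lam for path in paths}

# Martingale checks: E_Q[S_2] = S_0, and the Q-probability of an up move in the
# first period equals the risk neutral probability (1 - down) / (up - down).
expected_terminal_price = sum(Q[path] * stock_price(path) for path in paths)
q_up_first = sum(Q[path] for path in paths if path[0] == "H")
```

In this complete market the measure obtained from the multiplier coincides with the usual risk neutral binomial measure, whatever the physical probability `p` is.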

3.3.3 Pricing Contingent Claims

A contingent claim is a random variable φT ∈ RV (Ω, F, P ) representing a payoff at time
T . For simplicity, below we will only consider European style contingent claims, for
which the payoff is set at t = T . We consider the problem of finding a price φ0 for
this contingent claim at time t = 0 that does not provide any arbitrage opportunity.
Again we consider the portfolio utility optimization problem

minimize E[(−u)(x)] (3.3.11)

subject to x − β(φ(ST ) − φ0 + GT (Θ)) − w0 = 0,
Θ ∈T.

Using the same argument as in the previous subsection, we can show that there
exists a Lagrange multiplier λ ∈ RV (Ω, F, P ) such that, for any ω with P (ω) ≠ 0,

λ(ω) ∈ −∂(−u)(x ∗ (ω)) ⊂ (0, +∞) (3.3.12)

and Q = (λ/E[λ])P is a martingale probability measure for market S equivalent to

P . Moreover,

φ0 = EQ [φT ]. (3.3.13)

The above argument can also be used to derive a no arbitrage price for φT at any t <
T in terms of a martingale measure. Formula (3.3.12) indicates that the martingale
measure used to price a contingent claim, in general, relies on the risk aversion of
an agent. Thus, agents with different risk aversions and, therefore, different utility
functions may reasonably price the same contingent claim differently.
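
The dependence of the pricing measure on the utility function only shows up in an incomplete market. The sketch below uses a hypothetical one-period trinomial market (three equally likely states, one stock with S0 = 1, zero interest rate) and compares a log utility agent with a power utility agent whose marginal utility is x^(−3). The market data, the bisection bracket, and the function `pricing_measure` are illustrative assumptions. For simplicity the agent optimizes over the stock only, and the claim is then priced at the margin with Q proportional to u′(x∗)P.

```python
# Hypothetical one-period market: S_0 = 1, three equally likely states, zero interest rate.
returns = [-0.2, 0.0, 0.3]                    # S_1 - S_0 in each state
probs = [1 / 3, 1 / 3, 1 / 3]
call_payoff = [max(r, 0.0) for r in returns]  # call struck at S_0: (S_1 - 1)^+

def pricing_measure(marginal_utility):
    """Solve the first-order condition E[r u'(1 + f r)] = 0 for the stock position f
    by bisection, then return Q proportional to u'(x*) P with x* = 1 + f r."""
    def foc(f):
        return sum(p * r * marginal_utility(1 + f * r) for p, r in zip(probs, returns))

    lo, hi = 0.0, 4.999  # terminal wealth 1 + f r stays positive for 0 <= f < 5
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    f = 0.5 * (lo + hi)
    lam = [marginal_utility(1 + f * r) for r in returns]
    norm = sum(p * l for p, l in zip(probs, lam))
    return [p * l / norm for p, l in zip(probs, lam)]

def expectation(q, values):
    return sum(qi * v for qi, v in zip(q, values))

q_log = pricing_measure(lambda x: 1.0 / x)      # u(x) = log x
q_pow = pricing_measure(lambda x: x ** -3.0)    # u(x) = -x^(-2) / 2

price_log = expectation(q_log, call_payoff)
price_pow = expectation(q_pow, call_payoff)
```

Both measures price the stock correctly, that is, the Q-expectation of `returns` vanishes, yet they assign different prices to the call: 0.08 for the log agent and roughly 0.0796 for the power utility agent.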

3.3.4 Complete Market

Similar to Section 2.3.3 we introduce the following definition.

Definition 3.3.2 (Complete Market) We say a financial market S is complete if

{ΘT −1 · ST | Θ ∈ T } = RV (Ω, F, P ).

If S is not complete, then S is said to be incomplete.

Similar to the one period model, the completeness of a multiperiod financial
market is also characterized by the uniqueness of the martingale measure.
Proposition 3.3.3 (Unique Martingale Measure) Let S be a complete financial
market. Then there is a unique equivalent martingale measure.
Proof Let W = {GT (Θ) | Θ ∈ T }. We can see that W = {ΘT −1 · ST | Θ ∈
T , Θ0 · S0 = 0} and, therefore, dim W = dim {ΘT −1 · ST | Θ ∈ T } − 1. Thus, for a
complete market dim W ⊥ = 1. Hence, in a complete market the equivalent martingale
measure is unique.

The discussion in Section 2.3.3 can be extended to the multi-period model.

Theorem 3.3.4 Suppose that the equivalent martingale measure Q on market S is
unique and S has no arbitrage. Then the portfolio optimization problem (3.3.8) is
equivalent to

\[ \text{minimize } E[(-u)(x)] \quad \text{subject to } E^Q[x] = w_0. \tag{3.3.14} \]

As we have seen in the one period case, this is merely calculating the optimal end
wealth using the Lagrangian. The proof is similar to that of the one period case and is
omitted.

3.4 Hedging and Super Hedging

If the market price of an asset violates those specified by the fundamental theorem of
asset pricing, then in theory an arbitrage opportunity arises. We turn to the problem
of how to take advantage of such an arbitrage opportunity.

3.4.1 Super- and Sub-hedging Bounds

Consider a European style contingent claim whose payoff at T is ψ. By the
fundamental theorem of asset pricing, the price of ψ at t = 0 must belong to the
set $\{E^Q[\psi] : Q \in \mathcal{M}\}$ to be arbitrage free. Here $\mathcal{M}$ is the set of all martingale
measures equivalent to P . It follows that

\[ \overline{\psi} = \sup\{E^Q[\psi] : Q \in \mathcal{M}\} \tag{3.4.1} \]

and

\[ \underline{\psi} = \inf\{E^Q[\psi] : Q \in \mathcal{M}\} \tag{3.4.2} \]

give us upper and lower bounds for the price of ψ. If the price of ψ falls outside
these bounds, an arbitrage becomes possible. We call them the super- and sub-hedging
bounds, respectively. We focus on the super-hedging bound. The discussion
of the sub-hedging bound can be reduced to that of the super-hedging bound for
−ψ because

\[ -\underline{\psi} = \sup\{E^Q[-\psi] : Q \in \mathcal{M}\}. \tag{3.4.3} \]

If the market price of ψ is above this super-hedging bound, how can we find an
arbitrage strategy? It turns out that the key is to view (3.4.1) as a linear programming
problem and consider its dual. As discussed before, for a linear programming
problem and its dual, the constraint qualification condition ensuring strong
duality is, in fact, the feasibility condition. So the key is to correctly formulate
the dual problem of (3.4.1). We will use the Lagrange formulation. Let us assume
$\{\Theta_n\}_{n=1}^{N}$ is a basis for the finite dimensional Banach space $\mathcal{T}$ of self-financing
trading strategies. Then we can rewrite (3.4.1) as

\[ \begin{aligned} \overline{\psi} &= \sup_{Q \in M^+} \{E^Q[\psi] : E^Q[G_T(\Theta)] = 0,\ E^Q[1] = 1,\ \Theta \in \mathcal{T}\} \\ &= \sup_{Q \in M^+} \{E^Q[\psi] : E^Q[1] = 1,\ E^Q[G_T(\Theta_n)] = 0,\ n = 1, \ldots, N\}, \end{aligned} \tag{3.4.4} \]

where $M^+$ signifies the set of all positive measures. We can see that (3.4.4) is a
linear programming problem. Moreover, the Lagrangian of (3.4.4) is

\[ L(Q, \lambda) = E^Q[\psi] + \sum_{n=1}^{N} \lambda_n E^Q[G_T(\Theta_n)] + \lambda_0 (E^Q[1] - 1), \tag{3.4.5} \]

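where $\lambda = (\lambda_0, \lambda_1, \ldots, \lambda_N) \in R^{N+1}$ is the Lagrange multiplier. Observing that
elements $\Theta \in \mathcal{T}$ can be represented as

\[ \Theta = \sum_{n=1}^{N} \lambda_n \Theta_n, \]

we can equivalently view (Θ, λ0 ) as a Lagrange multiplier of the linear programming
problem (3.4.4) and write the Lagrangian as

\[ L(Q, (\Theta, \lambda_0)) = E^Q[\psi] + E^Q[G_T(\Theta)] + \lambda_0 (E^Q[1] - 1), \tag{3.4.6} \]

where $(\Theta, \lambda_0) \in \mathcal{T} \times R$. It is easy to verify that

\[ \inf_{(\Theta, \lambda_0) \in \mathcal{T} \times R} L(Q, (\Theta, \lambda_0)) = \begin{cases} E^Q[\psi] & Q \in \mathcal{M} \\ -\infty & \text{otherwise.} \end{cases} \]

Thus, we can write

\[ \overline{\psi} = \sup_{Q \in M^+} \ \inf_{(\Theta, \lambda_0) \in \mathcal{T} \times R} L(Q, (\Theta, \lambda_0)) \tag{3.4.7} \]

and by strong duality we have

\[ \begin{aligned} \overline{\psi} &= \inf_{(\Theta, \lambda_0) \in \mathcal{T} \times R} \ \sup_{Q \in M^+} L(Q, (\Theta, \lambda_0)) \\ &= \inf_{\Theta \in \mathcal{T}} \ \sup_{Q \in M^+} \{E^Q[\psi + G_T(\Theta)] : E^Q[1] = 1\} \\ &= \inf_{\Theta \in \mathcal{T}} \ \sup_{\omega \in \Omega} \{\psi(\omega) + G_T(\Theta)(\omega)\}. \end{aligned} \tag{3.4.8} \]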

The financial interpretation of the last expression in (3.4.8) is that a solution to
problem (3.4.8), if it exists, is a trading strategy that results in a payoff that is always
bounded by the super-hedging bound. Thus, if the market price exceeds the
super-hedging bound, one has an arbitrage strategy.
The arbitrage trading strategy alluded to above can be found by solving the linear
programming problem

\[ \begin{aligned} \min\ & t \\ \text{s.t. } & t - G_T(\Theta)(\omega) \ge \psi(\omega),\ \omega \in \Omega \\ & \Theta \in \mathcal{T},\ t \in R. \end{aligned} \tag{3.4.9} \]

Let Θ̄ and $\bar t = \overline{\psi}$ be the solution of (3.4.9). If the market price of the contingent
claim at t = 0 is

\[ \psi_0 > \overline{\psi}, \]

then we can short one share of the contingent claim and follow the trading strategy
−Θ̄ (or equivalently, short the trading strategy Θ̄). By time t = T , we have

\[ \bar t - G_T(\bar\Theta)(\omega) \ge \psi(\omega), \quad \forall \omega \in \Omega. \]

That is to say, the gain from the trading together with the cash amount $\overline{\psi}$ safely covers the short
position in every possible economic state, and the difference $\psi_0 - \overline{\psi}$ becomes our
arbitrage profit.
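
For a market with a single risky asset the super-hedging problem can be solved without a linear programming solver. The sketch below uses a hypothetical one-period trinomial market (S0 = 1, zero interest rate) and a call struck at S0; the data and the names `worst_case`, `kinks`, and `q_extreme` are illustrative assumptions. It minimizes the last expression in (3.4.8) directly: as a function of the single stock position θ, the worst case value is convex and piecewise linear, so its minimum is attained where two of the state-wise lines intersect.

```python
# Hypothetical one-period market: S_0 = 1, stock gain ΔS per state, zero interest rate.
gains = [-0.2, 0.0, 0.3]
payoff = [max(g, 0.0) for g in gains]      # call struck at S_0

def worst_case(theta):
    """sup over states of ψ(ω) + θ ΔS(ω), the inner problem in (3.4.8)."""
    return max(p + theta * g for p, g in zip(payoff, gains))

# worst_case is convex and piecewise linear in θ, so its minimum sits at a kink,
# i.e. at a point where two of the state-wise lines intersect.
kinks = [
    (payoff[j] - payoff[i]) / (gains[i] - gains[j])
    for i in range(len(gains))
    for j in range(i + 1, len(gains))
    if gains[i] != gains[j]
]
theta_bar = min(kinks, key=worst_case)
upper_bound = worst_case(theta_bar)

# Super-hedge check in the form of (3.4.9): t - G_T(Θ) >= ψ in every state.
covered = all(upper_bound - theta_bar * g >= p - 1e-12 for p, g in zip(payoff, gains))

# Strong duality check against a martingale measure on the boundary of the feasible
# set (found by hand for this example; it puts no mass on the middle state).
q_extreme = [0.6, 0.0, 0.4]
primal_value = sum(q * p for q, p in zip(q_extreme, payoff))
```

In this example the minimizer is θ̄ = −0.6 and the super-hedging bound is 0.12: holding 0.12 in cash and following −θ̄, that is buying 0.6 shares, covers the call in every state. With several assets or periods one would instead pass (3.4.9) to a linear programming solver.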

3.4.2 Towards a Complete Market

If we know the prices of some European contingent claims, say φ1 , . . . , φK at t =

0 to be c1 , . . . , cK , respectively, then to avoid arbitrage the estimate of the upper
bound for a contingent claim ψ is

sup{EQ [ψ] : Q ∈ M, EQ [φk ] = ck , k = 1, . . . , K}. (3.4.10)

Denoting c = (c1 , . . . , cK ) and φ = (φ1 , . . . , φK ), we can write the Lagrangian of the

constrained optimization problem (3.4.10) as

L(Q, (Θ, λ0 , b)) = EQ [ψ] + EQ [GT (Θ)] + λ0 (EQ [1] − 1) + b · (EQ [φ] − c),

where (Θ, λ0 , b) ∈ T × R × RK .
Similar to the previous section we can verify that, by strong Lagrange duality,

\[ \begin{aligned} \overline{\psi}\,|\,\phi &= \inf_{(\Theta, \lambda_0, b) \in \mathcal{T} \times R \times R^K} \ \sup_{Q \in M^+} L(Q, (\Theta, \lambda_0, b)) \\ &= \inf_{(\Theta, b) \in \mathcal{T} \times R^K} \ \sup_{Q \in M^+} \{E^Q[\psi + G_T(\Theta) + b \cdot (\phi - c)] : E^Q[1] = 1\} \\ &= \inf_{(\Theta, b) \in \mathcal{T} \times R^K} \ \sup_{\omega \in \Omega} \{\psi(\omega) + G_T(\Theta)(\omega) + b \cdot (\phi(S_T)(\omega) - c)\}. \end{aligned} \tag{3.4.11} \]

The financial interpretation of the last expression in (3.4.11) is that a solution to
problem (3.4.11), if it exists, is a trading strategy that results in a payoff that is always
bounded by the super-hedging bound. Thus, if the market price exceeds the
super-hedging bound, one has an arbitrage strategy, which can be calculated using a linear
programming problem similar to (3.4.9).
Here, with the additional tradable contingent claims φ1 , . . . , φK , the upper bound
for the no arbitrage price is lowered and correspondingly the lower bound is
increased, so that we get a more accurate estimate of the price. If we add
enough additional contingent claims as tradable assets, the market eventually becomes
complete in the sense that the upper and lower bounds coincide to give us a
unique price. In view of the proof of Proposition 3.3.3, the precise condition for the
uniqueness of the price is that the subspace

W = {GT (Θ) + b · (φ − c) | (Θ, b) ∈ T × RK } (3.4.12)

of RV (Ω, F, P ) has codimension 1 (the dimension of W is exactly 1 less than
that of RV (Ω, F, P )).

3.4.3 Incomplete Markets Arising from Complete Markets

We turn to consider an incomplete market that arises from complete markets. A
motivating example is a call option on a currency spread. For simplicity let us
consider a one period economy where transactions take place at t = 0 and t = 1. The
payoff of a call option on the spread of two different currencies $C^1$, $C^2$ with a strike
K in terms of a third currency at t = 1 is then

\[ (C_1^1 - C_1^2 - K)^+. \tag{3.4.13} \]

Since $C^1$ and $C^2$ are different currencies, it is reasonable to model their values in
terms of the common currency at time t = 1 as random variables on two different
probability spaces $(\Omega_1, \mathcal{F}_1, P_1)$ and $(\Omega_2, \mathcal{F}_2, P_2)$, respectively. We assume that
both markets for $C^1$ and $C^2$ are complete. Moreover, we assume that $P_i$ is the unique
martingale measure for $C^i$, i = 1, 2. If we consider (3.4.13) to be a special form of
the more general contingent claim $\psi = \psi(C_1^1, C_1^2)$, then ψ is a random variable
on the product measure space $(\Omega_1 \times \Omega_2, \mathcal{F}_1 \times \mathcal{F}_2)$. Our problem now is to seek a
martingale measure π on $(\Omega_1 \times \Omega_2, \mathcal{F}_1 \times \mathcal{F}_2)$ which prices ψ so as to be consistent
with the martingale measures $P_1$ and $P_2$. Consider a contingent claim
$\phi^1(C^1)$ that depends only on $C^1$. We can view this payoff both as a random variable
on $(\Omega_1, \mathcal{F}_1, P_1)$ and as a random variable on $(\Omega_1 \times \Omega_2, \mathcal{F}_1 \times \mathcal{F}_2, \pi)$. Thus requiring
π to be consistent with $P_1$ is to require

\[ \int_{\Omega_1} \phi^1(C_1^1)\, dP_1 = \int_{\Omega_1 \times \Omega_2} \phi^1(C_1^1)\, d\pi. \tag{3.4.14} \]

Since $\phi^1(C_1^1)$ is arbitrary, this is to say that $P_1$ is the marginal probability measure
of π on $\Omega_1$. Similarly, $P_2$ must be the marginal probability measure of π on $\Omega_2$.
Clearly, a measure π on the product space that satisfies such marginal requirements is not unique.
We see that despite the completeness of the financial markets on $\Omega_1$ and $\Omega_2$, in
pricing a contingent claim with payoff as a random variable on the product measure
space $(\Omega_1 \times \Omega_2, \mathcal{F}_1 \times \mathcal{F}_2)$, we face an incomplete market.
To find the upper bound for the price of ψ that is consistent with the no arbitrage
principle we face the optimization problem

\[ \bar\psi = \sup_{\pi \in \Pi(P_1, P_2)} E^{\pi}[\psi], \tag{3.4.15} \]

where $\Pi(P_1, P_2)$ signifies the set of all probability measures on the product measure
space $(\Omega_1 \times \Omega_2, \mathcal{F}_1 \times \mathcal{F}_2)$ whose marginals on $\Omega_1$ and $\Omega_2$ are $P_1$ and $P_2$,
respectively.
Convex duality again plays an important role in dealing with problem (3.4.15).
We illustrate by an example.
Example 3.4.1 (Estimate Upper No Arbitrage Bound in Finite Sample Spaces)
Suppose that both sample spaces $\Omega_1$ and $\Omega_2$ are finite. Denote $\Omega_1 = \{i : i = 1, \ldots, L\}$
and $\Omega_2 = \{j : j = 1, \ldots, M\}$, respectively. For brevity of notation
we denote

\[ \psi_{ij} = \psi(C^1(i), C^2(j)). \]

Then the problem of finding an upper bound for the contingent claim $\psi(C^1, C^2)$
can be formulated as

\[ \begin{aligned} \max\ & \sum_{i,j} \psi_{ij} \pi_{ij} \\ \text{s.t. } & \sum_{j} \pi_{ij} - \mu_i = 0, \quad \sum_{i} \pi_{ij} - \nu_j = 0, \\ & \sum_{i} C_1^1(i) \mu_i = C_0^1, \quad \sum_{j} C_1^2(j) \nu_j = C_0^2, \\ & \sum_{i} \mu_i = 1, \quad \sum_{j} \nu_j = 1. \end{aligned} \tag{3.4.16} \]

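The dual of the linear programming problem (3.4.16) is

\[ \begin{aligned} \min\ & \lambda_1 C_0^1 + \lambda_2 C_0^2 + \lambda_3 + \lambda_4 \\ \text{s.t. } & u_i + v_j \ge \psi_{ij} \\ & \lambda_1 C_1^1(i) + \lambda_3 - u_i \ge 0 \\ & \lambda_2 C_1^2(j) + \lambda_4 - v_j \ge 0. \end{aligned} \tag{3.4.17} \]

Defining $\phi^1(C^1) = \lambda_1 C^1 + \lambda_3$ and $\phi^2(C^2) = \lambda_2 C^2 + \lambda_4$ we can rewrite (3.4.17) as

\[ \begin{aligned} \min\ & \phi^1(C_0^1) + \phi^2(C_0^2) \\ \text{s.t. } & \phi^1(C_1^1(i)) + \phi^2(C_1^2(j)) \ge \psi_{ij}. \end{aligned} \tag{3.4.18} \]

Note that $\phi^1$ and $\phi^2$ depend linearly on $C^1$ and $C^2$, respectively. Thus,
problem (3.4.18) is a linear programming problem.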
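
In the smallest nontrivial case the primal problem (3.4.16) can be solved by hand, which makes the incompleteness visible. The sketch below assumes two hypothetical currencies with two states each; all numbers and the helper names are illustrative. With two states each marginal martingale measure is unique, and the joint measures with these marginals form a one parameter family indexed by s = π11. Since the objective is linear in s, the extreme prices sit at the endpoints of the admissible interval for s.

```python
# Hypothetical two-state currencies, quoted in a common third currency.
c1_now, c1_next = 1.0, [1.2, 0.9]
c2_now, c2_next = 1.0, [1.1, 0.8]
strike = 0.1

def martingale_prob(now, nxt):
    """Probability of the first state that makes E[C_1] = C_0 (unique for two states)."""
    return (now - nxt[1]) / (nxt[0] - nxt[1])

mu1 = martingale_prob(c1_now, c1_next)
nu1 = martingale_prob(c2_now, c2_next)

# Spread call payoff ψ_ij = (C^1(i) - C^2(j) - K)^+.
psi = [[max(x - y - strike, 0.0) for y in c2_next] for x in c1_next]

def coupling(s):
    """Joint law with π_11 = s and marginals (mu1, 1 - mu1) and (nu1, 1 - nu1)."""
    return [[s, mu1 - s], [nu1 - s, 1.0 - mu1 - nu1 + s]]

def price(s):
    pi = coupling(s)
    return sum(pi[i][j] * psi[i][j] for i in range(2) for j in range(2))

# Nonnegativity of all four entries pins s to an interval; the objective is linear
# in s, so the extreme prices occur at the endpoints of that interval.
s_min, s_max = max(0.0, mu1 + nu1 - 1.0), min(mu1, nu1)
upper = max(price(s_min), price(s_max))
lower = min(price(s_min), price(s_max))
independent = price(mu1 * nu1)   # price under the product measure P1 x P2
```

Although each currency market is complete, the no arbitrage prices of the spread option fill the whole interval from 0 to 0.1, and the product measure gives just one price in between.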
Remark 3.4.2 In general, problem (3.4.15) has to be dealt with in infinite dimensional
spaces. The dual problem is

\[ \bar\psi = \sup_{\pi \in \Pi(P_1, P_2)} E^{\pi}[\psi] = \min_{(\phi^1, \phi^2) \in G_\psi} \{E^{P_1}[\phi^1] + E^{P_2}[\phi^2]\}, \tag{3.4.19} \]

where $G_\psi := \{(\phi^1, \phi^2) \in (\Omega_1 \times \Omega_2, \mathcal{F}_1 \times \mathcal{F}_2) : \phi^1(\omega_1) + \phi^2(\omega_2) \ge \psi(C_1^1(\omega_1), C_1^2(\omega_2))\}$.
Again this shows that in principle one can implement the upper no arbitrage price
bound ψ̄ using the sum of two contingent claims $\phi^1$ and $\phi^2$ on sample spaces $\Omega_1$
and $\Omega_2$, respectively.

3.5 Conic Finance

Real financial markets have frictions. When trading a financial asset one faces two
different prices: ask and bid. Usually, the ask is strictly larger than the bid and one
can only buy at the ask price and sell at the bid price. This violation of the one price
principle complicates the modeling. The set of attainable gains from trading assets in such
a more realistic market model is not a subspace but rather, in general, a cone. This
leads to the name conic finance.

3.5.1 Modeling Financial Markets with an Ask-Bid Spread

Let F = {{∅, Ω} = F0 ⊂ F1 ⊂ . . . ⊂ FT = F} be an information structure on the
probability space (Ω, F, P ) with a finite sample space that represents the economic
states. Denote by X the space of all F-adapted cash streams x = (x0 , x1 , . . . , xT )
endowed with the inner product

\[ \langle x, y \rangle = E\Big[\sum_{t=0}^{T} x_t y_t\Big]. \]

Then X is a finite dimensional Hilbert space. We say the cash stream x dominates
the cash stream y, denoted x ≥ y, if xt ≥ yt , t = 0, 1, . . . , T . At any time t one can
only trade the part of the cash stream x that comes after t, which we will denote
$[x]^t = (0, \ldots, 0, x_{t+1}, \ldots, x_T)$.
Definition 3.5.1 (Conic Financial Market) A conic financial market C consists
of risky cash streams $S^m \in X$, m = 1, 2, . . . , M and riskless bonds $1^u$, u =
0, 1, 2, . . . , T where $1^u_u = 1$ and $1^u_t = 0$ for $t \ne u$. At time t, to trade the rights to
the cash stream of $[x]^t \in C$, there is a bid and ask price pair:

\[ b_t([x]^t) \le a_t([x]^t). \tag{3.5.1} \]

Note that it is important to specify the trading time t for the bid and ask prices.
Trading $[x]^t$ at time s > t would use information that is not available for the random
variables $x_k$, k = t, . . . , s − 1 and is impossible. Trading $[x]^t$ at s ≤ t is legitimate
but the prices for different s are different. Paying $a_t([S^m]^t)$ at t one will get the cash
stream $[S^m]^t = (0, \ldots, 0, S^m_{t+1}, \ldots, S^m_T)$. Similarly, receiving $b_t([S^m]^t)$ one sells
the cash stream $[S^m]^t$ or, in other words, gets the cash stream $-[S^m]^t$. The riskless
cash stream $1^u$, u = 1, 2, . . . , T can be regarded as bonds maturing at time t = u
and $1^0$ is the unit cash at time t = 0. Thus, $b_t(1^u)$ and $a_t(1^u)$ are the bid and ask
prices, respectively, for a bond issued at t and maturing at u.
A convenient way of thinking about the trading of these income streams is to
incorporate the buying cost or selling revenue into the cash streams to yield zero
cost cash streams. For example, the action of buying cash stream $[S^m]^t$ at time
t with ask price $a_t([S^m]^t)$ is equivalent to acquiring the zero cost cash stream
$S^{mt} := [S^m]^t - a_t([S^m]^t) 1^t$, i.e.

\[ S^{mt}_s = \begin{cases} 0 & s < t \\ -a_t([S^m]^t) & s = t \\ S^m_s & s > t. \end{cases} \tag{3.5.2} \]

Symmetrically, selling the above cash stream at the bid price $b_t([S^m]^t)$ yields the
zero cost cash stream $\tilde S^{mt} := b_t([S^m]^t) 1^t - [S^m]^t$, i.e.

\[ \tilde S^{mt}_s = \begin{cases} 0 & s < t \\ b_t([S^m]^t) & s = t \\ -S^m_s & s > t. \end{cases} \tag{3.5.3} \]

We observe that $\tilde S^{it}$ is different from $-S^{it}$ due to the spread between the ask and
bid prices. Similarly, buying and selling bonds maturing at u at time t generate zero
cost cash streams $1^{ut} := 1^u - a_t(1^u) 1^t$ and $\tilde 1^{ut} := b_t(1^u) 1^t - 1^u$, respectively, i.e.

\[ 1^{ut}_s = \begin{cases} 0 & s \ne u, t \\ -a_t(1^u) & s = t \\ 1 & s = u \end{cases} \quad \text{and} \quad \tilde 1^{ut}_s = \begin{cases} 0 & s \ne u, t \\ b_t(1^u) & s = t \\ -1 & s = u. \end{cases} \tag{3.5.4} \]
Assuming that one can buy or sell any fraction of the cash streams alluded to
above, suppose $\alpha_t^i, \tilde\alpha_t^i, \beta_t^u, \tilde\beta_t^u$, i = 1, . . . , M, u = 1, . . . , T are nonnegative
Ft -measurable random variables. Then

\[ z = \sum_{t=0}^{T} \sum_{i=1}^{M} [\alpha_t^i S^{it} + \tilde\alpha_t^i \tilde S^{it}] + \sum_{t=0}^{T} \sum_{u=1}^{T} [\beta_t^u 1^{ut} + \tilde\beta_t^u \tilde 1^{ut}] \tag{3.5.5} \]

is a cash stream that can be implemented by trading the available zero cost cash
streams.
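
The construction of zero cost cash streams is mechanical, as the following sketch along a single scenario shows. The horizon, the cash stream, the quotes, and the helper names `unit`, `tail`, `buy`, and `sell` are hypothetical and chosen only for illustration. The sketch builds the streams in (3.5.2) and (3.5.3) and shows why the attainable set is a cone rather than a subspace: buying and selling the same stream do not cancel.

```python
T = 2  # hypothetical horizon: cash streams are lists indexed by time 0..T

def unit(t):
    """Riskless unit cash stream 1^t: pays 1 at time t and nothing else."""
    return [1.0 if s == t else 0.0 for s in range(T + 1)]

def tail(x, t):
    """[x]^t: the part of the cash stream x that comes after time t."""
    return [0.0 if s <= t else x[s] for s in range(T + 1)]

def buy(x, t, ask):
    """Zero cost stream [x]^t - ask * 1^t, as in (3.5.2)."""
    return [a - ask * b for a, b in zip(tail(x, t), unit(t))]

def sell(x, t, bid):
    """Zero cost stream bid * 1^t - [x]^t, as in (3.5.3)."""
    return [bid * b - a for a, b in zip(tail(x, t), unit(t))]

stock = [0.0, 5.0, 105.0]            # one scenario of a risky cash stream S^m
bid, ask = 99.0, 101.0               # hypothetical quotes at t = 0

long_stream = buy(stock, 0, ask)     # [-101.0, 5.0, 105.0]
short_stream = sell(stock, 0, bid)   # [99.0, -5.0, -105.0]

# A round trip (the nonnegative combination with both weights equal to 1)
# loses the bid-ask spread at t = 0 and pays nothing afterwards.
round_trip = [a + b for a, b in zip(long_stream, short_stream)]
```

Here `round_trip` equals (−2, 0, 0), so the selling stream is not the negative of the buying stream; with bid = ask the two would cancel exactly and the one price model would be recovered.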
Definition 3.5.2 (Trading Strategies) A cash stream z of the form in (3.5.5)
is called an implementable cash stream and we say $\alpha_t^i$, $\tilde\alpha_t^i$, $\beta_t^u$, and $\tilde\beta_t^u$ is a
trading strategy that implements z. We use A(C) to denote the collection of all
implementable cash streams.
It is clear that the collection of implementable cash streams is a closed cone. If all the bid and ask prices coincide, then
$S^{it} = -\tilde S^{it}$ and $1^{ut} = -\tilde 1^{ut}$. In this case we recover the one price economy model
as a special case and the implementable cash streams form a linear subspace of X .
Definition 3.5.3 (Super Implementation) We say a cash stream x ∈ X is super
implementable if there exists a cash stream z ∈ A(C) of the form in (3.5.5) such
that z ≥ x. In this case we say $\alpha_t^i$, $\tilde\alpha_t^i$, $\beta_t^u$, and $\tilde\beta_t^u$ is a trading strategy that super
implements x. We use A(C) to denote the collection of all super implementable cash
streams.
It is easy to see that the collection of super implementable cash streams is also a closed cone and that it contains every implementable cash stream.

3.5.2 Characterization of No Arbitrage by Utility Optimization

Using the model described in the previous section, we can extend the fundamental
theorem of asset pricing to markets with a bid-ask spread. First we define arbitrage
in such a market.
Definition 3.5.4 (Arbitrage Trading Strategy) We say that a cash stream x ∈
A(C) is an arbitrage if x ≥ 0 and x ≠ 0. If x ≤ z ∈ A(C) where z has the
representation in (3.5.5), then we say $\alpha_t^i$, $\tilde\alpha_t^i$, $\beta_t^u$, and $\tilde\beta_t^u$ is an arbitrage trading
strategy. We say that the conic financial market has no arbitrage if A(C) does not
contain any arbitrage.
Denote by X + the cone of all elements of X whose components are all nonnegative. Then there is
no arbitrage trading strategy in the financial market described in the previous section
if and only if

A(C) ∩ X + = {0}. (3.5.6)

Let u be a utility function satisfying the conditions (u1)–(u3). We consider the
optimal trading problem

$$p = \max\left\{ \sum_{t=0}^{T} E[u(c_t)] : c \in w^0 + A(C) \right\}, \tag{3.5.7}$$

where $w^0 \in X_+$ is an initial endowment cash stream. We can characterize
no arbitrage in terms of the optimal trading problem (3.5.7):
Theorem 3.5.5 (No Arbitrage and Utility Maximization) The conic financial
market C described in the previous section has no arbitrage if and only if the optimal
trading problem (3.5.7) has a finite optimal value p < ∞ which is attained.

Proof Since one can always scale an arbitrage cash stream by an arbitrarily large
positive number, p < +∞ implies that there is no arbitrage. Similar to
Lemma 3.2.5 we can show that in this case the finite optimal value is attained.
On the other hand, if p = +∞, without loss of generality we assume that there
is a sequence $z^n \in A(C)$ such that

$$\sum_{t=0}^{T} E[u(w_t^0 + z_t^n)] \to +\infty. \tag{3.5.8}$$

Clearly $\|z^n\| \to +\infty$. Then, taking a subsequence if necessary, we can assume that
$z^n/\|z^n\| \to z^* \in A(C)\setminus\{0\}$. By property (u3), $z_t^n \ge -w_t^0$, $t = 0, 1, \dots, T$. Dividing
by $\|z^n\|$ and passing to the limit gives $z_t^* \ge 0$, so that $z^*$ is an arbitrage.

3.5.3 Dual Characterization of No Arbitrage

We turn to the dual characterization of no arbitrage and its implication for
the price of financial assets. For this purpose, we will often need to consider the
conditional expectation with respect to $\mathcal{F}_t$, which we will denote $E_t$. Similarly we
use the notation

$$\langle x, y \rangle_t = E_t\left[\sum_{s=0}^{T} x_s y_s\right].$$

Definition 3.5.6 (Consistent Price Operator) Let C be a conic financial market
described in Definition 3.5.1. We say an F-adapted stochastic process $\pi \in
X_+\setminus\{0\}$ is a C-consistent price operator if, for any $t = 1, \dots, T$ and any $x \in
\{S^{mt}, \tilde S^{mt}, 1^{ut}, \tilde 1^{ut} : m = 1, \dots, M,\ u = t + 1, \dots, T\}$,

$$\langle \pi, x \rangle_t \le 0. \tag{3.5.9}$$

Geometrically, a consistent price operator is simply an element of $A(C)^\circ :=
\{\pi \in X : \langle \pi, c \rangle \le 0,\ \forall c \in A(C)\}$, the polar cone of the cone of implementable cash
flows.
Proposition 3.5.7 (Geometrical Characterization of Consistent Price Operator)
Let C be a conic financial market described in Definition 3.5.1. Then the set of all
consistent price operators is A(C)◦ \{0}.
Proof Let π be a C-consistent price operator. For any super implementable cash stream y, there
exists x ∈ A(C) that dominates y, i.e. x ≥ y. By Definition 3.5.6, $\langle \pi, x \rangle \le 0$. Since
$\pi \in X_+$, $\langle \pi, y \rangle \le \langle \pi, x \rangle \le 0$. Thus, $\pi \in A(C)^\circ\setminus\{0\}$.
To show the converse let $\pi \in A(C)^\circ\setminus\{0\}$. Define the characteristic function of a
set by $\chi_A(x) = 1$ if x ∈ A and $\chi_A(x) = 0$ otherwise. For any $t = 1, \dots, T$, since
$\{S^{mt}, \tilde S^{mt}, 1^{ut}, \tilde 1^{ut} : m = 1, \dots, M,\ u = t + 1, \dots, T\} \subset A(C)$, for any $A \in \mathcal{F}_t$
and any $x \in \{S^{mt}, \tilde S^{mt}, 1^{ut}, \tilde 1^{ut} : m = 1, \dots, M,\ u = t + 1, \dots, T\}$ we have

$$\langle \pi, \chi_A x \rangle \le 0, \tag{3.5.10}$$

which implies that

$$\langle \pi, x \rangle_t \le 0. \tag{3.5.11}$$

Thus, π is a C-consistent price operator.

To see the relationship between a consistent price operator and the bid and ask prices
of a cash stream we observe that $0 \ge \langle \pi, S^{mt} \rangle_t = \langle \pi, [S^m]^t - a_t([S^m]^t)1^t \rangle_t$ implies
that $\langle \pi, [S^m]^t \rangle_t \le a_t([S^m]^t)\langle \pi, 1^t \rangle_t = a_t([S^m]^t)\pi_t$. Similarly, $0 \ge \langle \pi, \tilde S^{mt} \rangle_t$
implies that $\langle \pi, [S^m]^t \rangle_t \ge b_t([S^m]^t)\pi_t$. That is to say

$$b_t([S^m]^t)\pi_t \le \langle \pi, [S^m]^t \rangle_t \le a_t([S^m]^t)\pi_t. \tag{3.5.12}$$

In a one price one period financial market, for t = 0, $[S^m]^0 = S_1^m$ and $a_0([S^m]^0) =
b_0([S^m]^0) = S_0^m$. Since (3.5.12) holds for all $m = 1, \dots, M$ we have $\langle \pi, S_1 \rangle =
\langle \pi, S_0 \rangle$. Thus, we recover the consistent price operator in Definition 2.4.18 as a special
case. Clearly, a consistent price operator, in general, is not normalized in the sense
of Definition 2.4.17. We can see from (3.5.12) that, for any fixed t, dividing π by
$\langle \pi, 1^t \rangle_t = \pi_t$ normalizes it for the purpose of deriving prices at time t. Clearly, it is
impossible to uniformly normalize a consistent price operator.
In Section 2.4 we have seen that a consistent price operator is closely related to a
martingale measure. Next we derive a version of FTAP for a conic financial market
in which consistent price operators play the role of martingale measures in
FTAP for a one price financial market.
Theorem 3.5.8 (FTAP in Conic Financial Market) Let C be a conic financial
market as in Definition 3.5.1 and let u be a utility function that satisfies properties
(u1), (u2), and (u3). Then the following statements are equivalent:
(i) The conic financial market C has no arbitrage;
(ii) The utility optimization problem (3.5.7) is finite and attained.
(iii) There exists a C-consistent price operator which is an element of the subdiffer-
ential of the utility function at the optimal cash stream.
Proof The equivalence of (i) and (ii) follows from Theorem 3.5.5.
We show the equivalence of (ii) and (iii). Defining, for x ∈ X,

$$f(x) = \sum_{t=0}^{T} E[(-u)(x_t)], \tag{3.5.13}$$

we can rewrite the optimal trading problem (3.5.7) as

$$p = -\inf\left[f(x) + \iota_{w^0 + A(C)}(x)\right]. \tag{3.5.14}$$

Note that the (CQ) condition

$$0 \in \operatorname{int}\left[\operatorname{dom} \iota_{w^0 + A(C)} - \operatorname{dom} f\right] = \operatorname{int}\left[w^0 + A(C) - X_+\right] \tag{3.5.15}$$

holds. Thus, strong duality implies that

$$p = -\max_{z \in X}\left\{-\sigma_{w^0 + A(C)}(z) - f^*(-z)\right\} = \min_{z \in X}\left\{\sum_{t=0}^{T} E[(-u)^*(-z_t)] + \langle w^0, z \rangle + \sigma_{A(C)}(z)\right\}. \tag{3.5.16}$$

Let $x^*$, π be solutions to the primal and dual problems (3.5.14) and (3.5.16),
respectively. Condition (u2) implies that $\operatorname{dom}(-u)^* = (-\infty, 0)$ so that $\pi_t > 0$.
Moreover,

$$\pi_t \in -\partial(-u)(x_t^*). \tag{3.5.17}$$

Finally, if the market has no arbitrage trading strategy, then p < +∞ in (3.5.16),
which implies that $\sigma_{A(C)}(\pi) < \infty$, or $\pi \in A(C)^\circ$. Thus, by Proposition 3.5.7, π
is a C-consistent price operator. Moreover, we can see from (3.5.17) that π is a
subgradient of the utility function at the optimal solution. Thus, (ii) implies (iii).
On the other hand, when (iii) is satisfied, there is a C-consistent price operator
$\pi \in A(C)^\circ\setminus\{0\}$ satisfying (3.5.17). Thus, π must be a solution to the convex
optimization problem (3.5.16). That is to say p < +∞, so that (iii) implies (ii)
and, therefore, they are equivalent.

3.5.4 Pricing and Hedging

By Proposition 3.5.7, we see that to use consistent price operators for pricing we
must normalize them. However, (3.5.12) shows that, in general, the appropriate
normalizing factor for different t is different. For this reason a general discussion of
pricing and hedging in a conic financial market is technical. In this section we are
satisfied with a brief discussion of the one period model.

Definition 3.5.9 (Normalized Consistent Price Operator) Let C be a one period
conic financial market. We say π is a C-normalized consistent price operator if π is
a C-consistent price operator and $\pi_0 = \langle \pi, 1^0 \rangle = 1$.
The set of normalized consistent price operators plays a role similar to the set of
equivalent martingale measures in a one price economy. We will show that, for any
$c = (0, c_1) \in A(C)$, the linear programming problem

$$u_0 = \max\left\{\langle \pi, c \rangle : \pi \in A(C)^\circ,\ \langle \pi, 1^0 \rangle = 1\right\} \tag{3.5.18}$$

determines a super hedging bound. Moreover, the solution to the dual linear
program of (3.5.18) determines a super-hedging trading strategy. A sub-
hedging bound can be derived symmetrically.
We denote the finite sample space $\Omega = \{\omega_1, \dots, \omega_N\}$. We regard a random
variable r on Ω as a vector $r = [r(\omega_1), \dots, r(\omega_N)]$ and use · to signify the dot
product between such vectors. Defining $x = \pi_1 P$ we can write (3.5.18) explicitly
as a linear programming problem

$$u_0 = \max\ c_1 \cdot x \tag{3.5.19}$$

subject to

$$[S^m]^1 \cdot x \le a_0([S^m]^1), \quad -[S^m]^1 \cdot x \le -b_0([S^m]^1), \quad m = 1, \dots, M,$$

$$1 \cdot x \le a_0(1^1), \quad -1 \cdot x \le -b_0(1^1).$$

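We formulate the dual problem using the Lagrange format. Let

$$(\Lambda, \gamma) = (\lambda^1, \dots, \lambda^M, \tilde\lambda^1, \dots, \tilde\lambda^M, \gamma^1, \tilde\gamma^1) \in \mathbb{R}_+^{2M+2} \tag{3.5.20}$$

be the Lagrange multipliers of the linear programming problem (3.5.19). We consider
the Lagrangian

$$L(x, (\Lambda, \gamma)) = c_1 \cdot x + \gamma^1\left[a_0(1^1) - 1 \cdot x\right] + \tilde\gamma^1\left[1 \cdot x - b_0(1^1)\right] + \sum_{m=1}^{M} \lambda^m\left[a_0([S^m]^1) - [S^m]^1 \cdot x\right] + \sum_{m=1}^{M} \tilde\lambda^m\left[[S^m]^1 \cdot x - b_0([S^m]^1)\right]. \tag{3.5.21}$$

We can see that

$$\inf_{(\Lambda, \gamma) \in \mathbb{R}_+^{2M+2}} L(x, (\Lambda, \gamma)) = \begin{cases} c_1 \cdot x & \pi \in A(C)^\circ,\ x = \pi_1 P,\ \pi_0 = 1, \\ -\infty & \text{otherwise.} \end{cases} \tag{3.5.22}$$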

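Thus, by the strong linear programming duality

$$u_0 = \sup_{(x, \pi_0) \in \mathbb{R}_+^{N+1}}\ \inf_{(\Lambda, \gamma) \in \mathbb{R}_+^{2M+2}} L(x, (\Lambda, \gamma)) = \inf_{(\Lambda, \gamma) \in \mathbb{R}_+^{2M+2}}\ \sup_{(x, \pi_0) \in \mathbb{R}_+^{N+1}} L(x, (\Lambda, \gamma)). \tag{3.5.23}$$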

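For $(\Lambda, \gamma) \in \mathbb{R}_+^{2M+2}$ consider the zero cost portfolio of cash flows:

$$(z_0(\Lambda, \gamma), z_1(\Lambda, \gamma)) = \gamma_0^1 1^{10} + \tilde\gamma_0^1 \tilde 1^{10} + \sum_{m=1}^{M}\left(\lambda_0^m S^{m0} + \tilde\lambda_0^m \tilde S^{m0}\right). \tag{3.5.24}$$

We see that

$$L(x, (\Lambda, \gamma)) = (c_1 - z_1(\Lambda, \gamma)) \cdot x - z_0(\Lambda, \gamma). \tag{3.5.25}$$

Thus, the dual linear program is

$$u_0 = \min\ [-z_0(\Lambda, \gamma)] \quad \text{subject to} \quad z_1(\Lambda, \gamma)(\omega) \ge c_1(\omega),\ \omega \in \Omega. \tag{3.5.26}$$

The solution of (3.5.26) provides us with a trading strategy (Λ, γ) that creates an arbitrage
should the bid price for $c_1$ exceed $u_0$.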
Chapter 4
Continuous Financial Models

Abstract We turn to discuss continuous financial models. These models in general

involve infinite dimensional spaces and are more complex. Our focus here is to
use relatively simple models to illustrate the convex duality between the price
of a contingent claim and the process of cash borrowed in delta hedging. This
reveals the root of the convexity in contingent claims. Interestingly, when hedging
with a contingent claim instead of the underlying, a similar duality in the sense
of generalized Fenchel conjugate holds. Correspondingly, this generalized duality
leads to the generalized convexity of the contingent claims with many interesting
applications. Much of the material presented in this chapter appears here for the first
time.

4.1 Continuous Stochastic Processes

A continuous stochastic process is a generalization of the discrete stochastic process

that we discussed before.
Definition 4.1.1 (Stochastic Process) Let (Ω, F, P ) be a probability space and
let [0, T ] be an interval. We call (Xt ), t ∈ [0, T ] a stochastic process if for every t,
Xt is a random variable on (Ω, F, P ).
In financial applications the parameter t is usually time but not always. For
example, it could be the so-called local time when the calendar time is fixed at a
point and the parameter t, in fact, reflects the change in the price space. Similar to
the discrete case we also need to deal with gradually revealing information.
Definition 4.1.2 (Filtration) Let (Ω, F, P ) be a probability space and let [0, T ]
be an interval. We say (Ft ), t ∈ [0, T ] is a filtration if for every t, Ft ⊂ F is a
σ -algebra and, for any s < t,

Fs ⊂ F t .

© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2018
P. Carr, Q. J. Zhu, Convex Duality and Financial Mathematics,
SpringerBriefs in Mathematics, https://doi.org/10.1007/978-3-319-92492-2_4

As in the discrete case, Ft represents information available up to time t. The
definition implicitly assumes that information, once it becomes available, will never be
forgotten.
Definition 4.1.3 (Adapted Stochastic Process) Let (Ft ), t ∈ [0, T ] be a filtration
on probability space (Ω, F, P ). We say a stochastic process (Xt ) is Ft -adapted
provided that, for every t, Xt is Ft measurable.
Intuitively, the value Xs of an adapted stochastic process becomes deterministic
when the current time t > s.

4.1.1 Brownian Motion and Martingale

Brownian motion is a special continuous stochastic process that plays a crucial

role in financial modeling. It is named after the Scottish botanist Robert Brown
who in 1828 observed such a motion from pollen suspended in liquid. Louis
Bachelier first used it to model the price of financial assets in his 1900 Ph. D. thesis
and derived the famous Bachelier formula for option pricing. The mathematical
properties of Brownian motion were clearly elaborated by Norbert Wiener who also
provided a proof of the existence of a Brownian motion by construction. Paul
Samuelson proposed the widely used geometric Brownian motion model for stock
price movements in 1965, which is more realistic when modeling assets with
nonnegative values. However, the geometric Brownian motion is continuous so that
it does not allow any price jump which does happen to a stock price process from
time to time. As the saying goes “All models are wrong. Some are wronger than
others.” What we need to keep in mind is that models are approximations of the
reality. They are not reality.
Definition 4.1.4 (One-Dimensional Brownian Motion) A stochastic process
{Bt : t ∈ [0, T ]} is called a standard Brownian motion if
1. B0 = 0,
2. for 0 ≤ t1 < t2 < . . . < tk ≤ T , the random variables

Bt2 − Bt1 , Bt3 − Bt2 , . . . , Btk − Btk−1

are independent,
3. for 0 ≤ s ≤ t ≤ T , Bt −Bs has a Gaussian distribution with mean 0 and variance
t − s,
4. for ω in a set of probability one, the path Bt (ω) is continuous.
Definition 4.1.5 (Multi-Dimensional Brownian Motion) A vector stochastic
process {Bt : t ∈ [0, T ]} in Rn is called a standard Brownian motion if
Bt = (Bt1 , Bt2 , . . . , Btn ) where Bti , i = 1, 2, . . . , n are independent standard
one-dimensional Brownian motions. If Bt is a standard Brownian motion, then
x + Bt is called a Brownian motion starting from x.

Remark 4.1.6 The existence of a stochastic process satisfying all the conditions laid
out in Definition 4.1.4 is not automatically guaranteed. By and large, there are two
ways to prove the existence:
• by construction pioneered by Wiener (see, e.g., [54]), or
• by Kolmogorov’s extension theorem (see, e.g., [42]).
For our applications we are satisfied with knowing that Brownian motions exist.
If in a given probability space there is a Brownian motion, then one can
also define a Brownian motion in a different yet similar probability space. Thus,
Brownian motion is not uniquely defined. However, since every Brownian motion
has the same properties laid out in Definition 4.1.4, their effects are equivalent. We
usually pick a “convenient” version for the purpose of a concrete application.
For each Brownian motion Bt , denoting by Ft the σ -algebra that represents the information
contained in the path of the Brownian motion up to time t, we get a natural filtration associated with Bt . In
fact, we can take Ft to be the σ -algebra generated by the collection of preimages of
Borel sets under Bs , s ≤ t. In the sequel whenever we discuss a Brownian motion
we always assume that it is accompanied by this filtration.
Somewhat more general than a Brownian motion is the martingale process.
Definition 4.1.7 (Martingale) Let Ft be a filtration for the probability space
(Ω, F, P ). We say Mt is a (P , Ft )-martingale if Mt is adapted to the filtration
Ft , E[|Mt |] < ∞ for all t > 0, and for all s < t,

$$E^P[M_t \mid \mathcal{F}_s] = M_s.$$

Similar to the discrete case, a martingale can be thought of as representing the wealth
process in playing a fair game. A Brownian motion Bt is clearly a martingale and
it is also easy to check that $M_t = B_t^2 - t$ is a martingale. So a martingale is
not necessarily a Brownian motion. However, martingales are only slightly more
general than Brownian motion, as the following theorem of Lévy shows (which
we state without proof).
Theorem 4.1.8 (The Levy Characterization of Brownian Motion) Let X(t) =
(X1 (t), . . . , Xn (t)) be a continuous stochastic process on (Ω, F, Q). Then X(t) is
a Brownian motion with respect to Q if and only if
(i) X(t) is a martingale w.r.t. Q, and
(ii) Xi (t)Xj (t) − δij t is a martingale w.r.t. Q for all i, j = 1, . . . , n.
Here $\delta_{ij}$ is the Kronecker delta defined by $\delta_{ij} = 0$ when $i \ne j$ and $\delta_{ii} = 1$.
For n = 1 we have the characterization of one-dimensional Brownian motion.
Theorem 4.1.9 (The Levy Characterization of Brownian Motion) Let X(t) be
a scalar continuous stochastic process on (Ω, F, Q). Then X(t) is a Brownian
motion with respect to Q if and only if
(i) X(t) is a martingale w.r.t. Q, and
(ii) $X^2(t) - t$ is a martingale w.r.t. Q.

4.1.2 The Itô Formula

The Itô formula is an important tool in analyzing continuous stochastic processes.

Theorem 4.1.10 (Basic Form of the Itô Formula) Let $f(x, t) \in C^{2,1}$ and let $B_t$
be a one-dimensional Brownian motion. Then

$$df(B_t, t) = f_t(B_t, t)\,dt + f_x(B_t, t)\,dB_t + \frac{1}{2} f_{xx}(B_t, t)\,dt. \tag{4.1.1}$$

The Itô formula presented in (4.1.1) is a shorthand for

$$f(B_t, t) = f(0, 0) + \int_0^t f_t(B_s, s)\,ds + \int_0^t f_x(B_s, s)\,dB_s + \frac{1}{2}\int_0^t f_{xx}(B_s, s)\,ds. \tag{4.1.2}$$

Formula (4.1.1) looks like the usual chain rule except for the last term. A rigorous
proof is beyond the scope of this short book. Below are some heuristics that can help
in understanding the Itô formula.
We know that $f(B_t, t) - f(0, 0) = \int_0^t df(B_s, s)$. Expand $df(B_t, t)$ using
Taylor's expansion. Since terms of order o(dt) will vanish in the integration process
we need only do this to the second order. That gives us

$$df(B_t, t) = f_t(B_t, t)\,dt + f_x(B_t, t)\,dB_t + \frac{1}{2} f_{xx}(B_t, t)(dB_t)^2 + \frac{1}{2} f_{tt}(B_t, t)(dt)^2 + f_{tx}(B_t, t)\,dt\,dB_t.$$

Since $(dt)^2$ and $dt\,dB_t$ are o(dt) the last two terms can be omitted and we have

$$df(B_t, t) = f_t(B_t, t)\,dt + f_x(B_t, t)\,dB_t + \frac{1}{2} f_{xx}(B_t, t)(dB_t)^2.$$

By the properties of Brownian motion, we can replace $(dB_t)^2$ by dt, giving us the
Itô formula (4.1.1).
Graphically we can illustrate this by drawing the graph of $f_x$ around the point $B_t$; then
$df(B_t, t)$ is the area under the graph of $f_x$ (see Figure 4.1). We can see that
$f_x(B_t, t)\,dB_t$ represents the approximation of the area using Euler's method while
$\frac{1}{2} f_{xx}(B_t, t)(dB_t)^2 \sim \frac{1}{2} f_{xx}\,dt$ corrects the "triangle" part to get to an approximation
using the trapezoid rule.

Fig. 4.1 Graphic illustration of the Itô formula (graph of $f_x(\cdot, t)$ over $[B_t, B_t + dB_t]$, showing the rectangle $f_x\,dB_t$ and the triangle $\frac{1}{2} f_{xx}(dB_t)^2$)

The heuristic argument leads us to the following simple rule for handling the
differential terms arising in the Taylor expansion of a function of an Itô process,
usually called box algebra.

        dt     dBt
dt      0      0
dBt     0      dt

Example 4.1.11 Below is a nice application illustrating the power of the Itô
formula. Define $\beta_k(t) = E[B_t^k]$. The Itô formula gives us

$$\beta_k(t) = \frac{1}{2} k(k - 1) \int_0^t \beta_{k-2}(s)\,ds.$$

We can use this to easily get $E[B_t^3] = 0$ and $E[B_t^4] = 3t^2$. Those are the moments most often used in
financial applications. By induction, in general $E[B_t^{2k+1}] = 0$ and

$$E[B_t^{2k}] = \frac{(2k)!\,t^k}{2^k k!}.$$
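
The recursion in Example 4.1.11 can be iterated mechanically. The following is a minimal sketch in exact rational arithmetic; the function name `brownian_moment_coeff` is an illustrative choice and not part of the text. It represents each moment as a monomial $c\,t^p$ and compares the even moments with the closed form above.

```python
from fractions import Fraction
from math import factorial


def brownian_moment_coeff(k):
    """Return (c, p) such that E[B_t^k] = c * t**p.

    Uses beta_k(t) = (1/2) k (k-1) * integral_0^t beta_{k-2}(s) ds,
    starting from beta_0(t) = 1 and beta_1(t) = 0.
    """
    if k == 0:
        return Fraction(1), 0  # E[1] = 1
    if k == 1:
        return Fraction(0), 0  # E[B_t] = 0
    c, p = brownian_moment_coeff(k - 2)
    # Integrating c * s**p over [0, t] gives c * t**(p + 1) / (p + 1).
    return Fraction(k * (k - 1), 2) * c / (p + 1), p + 1


for k in range(5):
    coeff, power = brownian_moment_coeff(2 * k)
    closed_form = Fraction(factorial(2 * k), 2**k * factorial(k))
    print(f"E[B_t^{2 * k}] = {coeff} * t^{power} (closed form coefficient {closed_form})")
```

Odd moments come out with coefficient zero because the recursion starts from $\beta_1 = 0$.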

Itô Processes

Let $B_t$ be a one-dimensional Brownian motion with respect to the filtration $\mathcal{F}_t$ on
(Ω, F, P ). Then

$$X_t = X_0 + \int_0^t \mu(s, \omega)\,ds + \int_0^t \sigma(s, \omega)\,dB_s$$

is called a (1-dim) Itô process if μ, σ are $\mathcal{F}_t$-adapted,

$$P\left[\int_0^t \sigma^2(s, \omega)\,ds < \infty \text{ for all } t \ge 0\right] = 1$$

and

$$P\left[\int_0^t |\mu(s, \omega)|\,ds < \infty \text{ for all } t \ge 0\right] = 1.$$

In shorthand we write

$$dX_t = \mu\,dt + \sigma\,dB_t.$$

Here μ is a drift and σ indicates the magnitude of the variation of the random part. It is
often useful to write a stochastic process in this form if we can. A Brownian motion
is an example of an Itô process where μ = 0 and σ = 1. The Itô formula can be
generalized to an Itô process with $dX_t$ replacing $dB_t$.
Theorem 4.1.12 (The General Itô Formula) Let $f(t, x) \in C^2$ and let $X_t$ be an
Itô process. Then

$$df(X_t, t) = f_t(X_t, t)\,dt + f_x(X_t, t)\,dX_t + \frac{1}{2} f_{xx}(X_t, t)(dX_t)^2.$$

Example 4.1.13 Applying the Itô formula to $f(x) = x^2$ we have

$$\int_0^t B_s\,dB_s = \frac{1}{2}(B_t^2 - t).$$
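
The identity in Example 4.1.13 has an exact discrete counterpart, which a short simulation makes visible. This is a minimal sketch under assumed parameters (the step count, horizon, and seed are arbitrary illustrative choices): the left-endpoint sum $\sum_i B_{t_i}(B_{t_{i+1}} - B_{t_i})$ always equals $\frac{1}{2}(B_T^2 - \sum_i (\Delta B_i)^2)$, and the sum of squared increments is close to T, which is where the term $-t$ comes from.

```python
import random


def ito_sum_check(n_steps=10_000, horizon=1.0, seed=0):
    """Simulate one Brownian path on a uniform grid and return
    (left-endpoint Ito sum, terminal value B_T, sum of squared increments)."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    b = 0.0          # current value of the path, B_0 = 0
    ito_sum = 0.0    # sum of B_{t_i} * (B_{t_{i+1}} - B_{t_i})
    quad_var = 0.0   # sum of squared increments
    for _ in range(n_steps):
        db = rng.gauss(0.0, dt**0.5)  # increment with variance dt
        ito_sum += b * db             # integrand evaluated at the left endpoint
        quad_var += db * db
        b += db
    return ito_sum, b, quad_var


ito_sum, b_T, quad_var = ito_sum_check()
print("Ito sum:               ", ito_sum)
print("(B_T^2 - quad_var) / 2:", 0.5 * (b_T**2 - quad_var))
print("(B_T^2 - T) / 2:       ", 0.5 * (b_T**2 - 1.0))
print("sum of squared increments:", quad_var)
```

Evaluating the integrand at the left endpoint is what makes this an Itô sum; a midpoint rule would converge to a different (Stratonovich) limit.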

Example 4.1.14 (Integration by Parts) The pattern in handling $f(x) = x^2$ holds in a
more general setting. Let g(s) be a continuous function with bounded variation with
respect to s ∈ [0, t]. Applying the Itô formula to f (t, x) = g(t)x we have

$$\int_0^t g(s)\,dB_s = g(t)B_t - \int_0^t g'(s)B_s\,ds.$$

Example 4.1.15 Here is an example of using the general Itô formula. Let $X_t =
\mu t + \sigma B_t$. Then $dX_t = \mu\,dt + \sigma\,dB_t$. Using the box algebra we have

$$df(X_t, t) = f_t\,dt + f_x\,dX_t + \frac{1}{2} f_{xx}(dX_t)^2 = f_t\,dt + \mu f_x\,dt + \sigma f_x\,dB_t + \frac{1}{2}\sigma^2 f_{xx}\,dt.$$
Example 4.1.16 Letting f (t, x) = tx we have

$$tB_t = \int_0^t B_s\,ds + \int_0^t s\,dB_s$$

or

$$\int_0^t s\,dB_s = tB_t - \int_0^t B_s\,ds.$$

The Multidimensional Itô Formula

Let $X_t = (X_t^1, \dots, X_t^n)$ be an n-dimensional Itô process satisfying

$$dX_t = \mu\,dt + \sigma\,dB_t,$$

where μ is an n-dimensional vector, σ an n × m matrix, and $B_t$ an m-dimensional
Brownian motion. We require that the components of μ and σ satisfy conditions similar to those
in the definition of the one-dimensional Itô process. Let $g(t, x) : [0, \infty) \times \mathbb{R}^n \to \mathbb{R}^p$
have continuous second order partial derivatives. Then, for $Y_t = g(t, X_t)$,

$$dY_t^k = \frac{\partial g_k}{\partial t}\,dt + \sum_{i=1}^{n} \frac{\partial g_k}{\partial x_i}\,dX_t^i + \frac{1}{2}\sum_{i,j=1}^{n} \frac{\partial^2 g_k}{\partial x_i \partial x_j}\,dX_t^i\,dX_t^j. \tag{4.1.3}$$

The following multi-dimensional box algebra is a convenient tool in simplifying

the multi-dimensional Itô formula

dt dBt1 dBt2 . . . dBtn

dt 0 0 0 ... 0
dBt1 0 dt 0 . . . 0
dBt2 0 0 dt . . . 0
... ... ... ... ... ...
dBtn 0 0 0 . . . dt

Example 4.1.17 (Integration by Parts) Let $X_t$, $Y_t$ be Itô processes in R. Applying
the Itô formula to $f(X_t, Y_t) = X_t Y_t$ we have

$$d(X_t Y_t) = X_t\,dY_t + Y_t\,dX_t + dX_t\,dY_t.$$

The integral form in the following is the general integration by parts formula

$$\int_0^t X_s\,dY_s = X_t Y_t - X_0 Y_0 - \int_0^t Y_s\,dX_s - \int_0^t dX_s\,dY_s.$$

Remark 4.1.18 The term $dX_t\,dY_t$ is called the quadratic covariation of $X_t$ and $Y_t$
and is often denoted $d\langle X, Y \rangle_t$.

Martingale Representation

The Itô formula is a crucial tool in proving the following important martingale
representation theorem. This representation theorem further highlights the close
relationship between martingales and Brownian motions. As an application oriented
class we will omit the proof and directly present the result.
Theorem 4.1.19 (Martingale Representation) Let $B_t$ be an n-dimensional Brownian
motion generating the filtration $\mathcal{F}_t^n$. Suppose that $M_t$ is a $(P, \mathcal{F}_t^n)$-martingale
and that $E[M_t^2] < +\infty$ for all t ≥ 0. Then there exists a unique stochastic process
$v \in V^n$ such that

$$M_t = E[M_0] + \int_0^t v\,dB_s.$$

Dual Itô Formula

Let $f(x, t) \in C^{2,1}$ and let $X_t$ be an Itô process. Then using the quadratic covariation
in Remark 4.1.18 we can write the general Itô formula in Theorem 4.1.12 as

$$df(X_t, t) = f_t(X_t, t)\,dt + f_x(X_t, t)\,dX_t + \frac{1}{2}\,d\langle f_x(X, t), X \rangle_t. \tag{4.1.4}$$

Now assume that f is convex in x for all t. We use $f^*(y, t)$ to signify the conjugate
of f with respect to the variable x. Define $Y_t = f_x(X_t, t)$. We see that $X_t$, $Y_t$ satisfy
the Fenchel equality

$$f(X_t, t) + f^*(Y_t, t) = X_t Y_t. \tag{4.1.5}$$

Since $X_t Y_t$ does not explicitly depend on t, we have

$$f_t(X_t, t) + f_t^*(Y_t, t) = 0, \tag{4.1.6}$$

$$Y_t = f_x(X_t, t) \quad \text{and} \quad X_t = f_y^*(Y_t, t), \tag{4.1.7}$$

and using Example 4.1.17

$$df(X_t, t) + df^*(Y_t, t) = X_t\,dY_t + Y_t\,dX_t + d\langle X, Y \rangle_t. \tag{4.1.8}$$

Combining (4.1.4), (4.1.6), and (4.1.8) we derive the following dual Itô formula

$$\begin{aligned} df(X_t, t) &= f_t(X_t, t)\,dt + Y_t\,dX_t + \frac{1}{2}\,d\langle Y, X \rangle_t, \\ df^*(Y_t, t) &= f_t^*(Y_t, t)\,dt + X_t\,dY_t + \frac{1}{2}\,d\langle X, Y \rangle_t. \end{aligned} \tag{4.1.9}$$

4.1.3 Girsanov Theorem

In financial applications, prices of stocks and other assets are often described by an
Itô process of the form

dSt = μdt + σ dBt

where μ models a drift reflecting the large trend of the asset price and σ describes
the volatility of the random fluctuation of the price process. In analyzing the price
process, the important part is the impact of σ . The Girsanov theorem allows us to
“absorb” the drift μ by using a change of the probability measure. This is very
similar to the equivalent martingale measure that absorbs the excess gains for the
risky assets in the discrete model.
Theorem 4.1.20 (Removal of Drift via Girsanov’s Theorem) Let $S_t$ be an Itô
process of the form

$$dS_t = \mu(t, \omega)\,dt + \sigma(t, \omega)\,dB_t, \quad t \in [0, T], \quad S_0 = 0,$$

where $B_t$ is a standard $(P, \mathcal{F}_t)$-Brownian motion and μ, σ are bounded and
σ > c > 0 for some constant c. Assume that, for u = μ/σ,

$$M_t = \exp\left(-\int_0^t u(s, \omega)\,dB_s - \frac{1}{2}\int_0^t u^2(s, \omega)\,ds\right), \quad t \in [0, T],$$

is a $(P, \mathcal{F}_t)$-martingale. Then
1. $dQ(\omega) = M_T(\omega)\,dP(\omega)$ is a probability measure on $\mathcal{F}_T$,
2. $\hat B(t) = \int_0^t u(s, \omega)\,ds + B(t)$ is a standard Brownian motion w.r.t. Q, and
3. $dS_t = \sigma(t, \omega)\,d\hat B(t)$.

Proof (Sketch) Let $X_t = \int_0^t u(s, \omega)\,dB_s + \frac{1}{2}\int_0^t u^2(s, \omega)\,ds$. We have

$$dX_t = u\,dB_t + \frac{1}{2} u^2\,dt.$$

By direct calculation we have

$$dM_t = -u \exp(-X_t)\,dB_t. \tag{4.1.10}$$

Since by assumption $M_t$ is a martingale and $M_0 = 1$,

$$Q(\Omega) = E^Q[1] = E^P[M_T] = 1.$$


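Thus, Q is a probability measure on $\mathcal{F}_T$. We note that $dQ = M_t\,dP$ on $\mathcal{F}_t$. In fact,
for any bounded $\mathcal{F}_t$-measurable function f ,

$$\int_\Omega f\,dQ = \int_\Omega f M_T\,dP = E[f M_T] = E[E[f M_T \mid \mathcal{F}_t]] = E[f\,E[M_T \mid \mathcal{F}_t]] = E[f M_t] = \int_\Omega f M_t\,dP.$$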

To show that $\hat B_t$ is a standard Brownian motion, we check the conditions
in the Lévy characterization of Theorem 4.1.8. We check only Theorem 4.1.8 (i)
since (ii) is similar. Using the product rule we can verify that $M_t \hat B_t$ is a martingale
with respect to P . Now for s < t and $A \in \mathcal{F}_s$ we have

$$\begin{aligned} \int_A E^Q[\hat B_t \mid \mathcal{F}_s]\,dQ &= \int_A \hat B_t\,dQ = \int_A \hat B_t M_t\,dP = E^P[1_A M_t \hat B_t] \\ &= E^P[E^P[1_A M_t \hat B_t \mid \mathcal{F}_s]] = E^P[1_A M_s \hat B_s] \\ &= \int_A \hat B_s M_s\,dP = \int_A \hat B_s\,dQ. \end{aligned}$$

Since $A \in \mathcal{F}_s$ is arbitrary, $E^Q[\hat B_t \mid \mathcal{F}_s] = \hat B_s$.

We note that, by (4.1.10), $M_t = \exp\left(-\int_0^t u(s, \omega)\,dB_s - \frac{1}{2}\int_0^t u^2(s, \omega)\,ds\right)$ is
always a local martingale. Novikov’s condition

$$E\left[\exp\left(\frac{1}{2}\int_0^T u_t^2\,dt\right)\right] < \infty$$

is a sufficient condition ensuring that $M_t$ is a martingale. The measure Q is called the
martingale measure for the process $S_t$.

4.2 Bachelier and Black–Scholes Formulae

4.2.1 Pricing Contingent Claims

Let St be an Itô process

dSt = μ(St , t)dt + σ (St , t)dBt


that represents the price process of a certain financial asset. Here Bt is a Brownian
motion in a probability measure space (Ω, F, P ) with filtration Ft . Assume for
simplicity that the risk free interest rate is 0 and that μ, σ are bounded and σ ≥ c >
0 for some constant c. Suppose that we want to price a European style contingent
claim on St with the payoff f (ST ) at the maturity T . We can proceed as follows.
First using the Girsanov theorem we can write

dSt = σ (St , t)dWt

where Wt is a Brownian motion in (Ω, F, Q) with filtration Ft where Q is a

martingale measure for St equivalent to P . Similar to the discrete version of the
fundamental theorem of asset pricing, we can write down the no arbitrage price
function for the contingent claim at any time t ∈ [0, T ] and price x as

v(x, t) = EQ [f (ST )|St = x]. (4.2.1)

Next we explicitly calculate the price function for call options under the Bachelier
and Black–Scholes models.

Bachelier Formula

Bachelier modeled the price of a stock in his 1900 pioneering paper [3] by

dSt = μdt + σ dBt

where μ and σ are constant. This model was thought unrealistic because stock price
cannot become negative. However, now we can see it as a good approximation for
pair trading or forward for currency swap contracts. Consider the price of a call
option with a strike K maturing at T . Then formula (4.2.1) reduces to

B(x, t) = EQ [(ST − K)+ |St = x], (4.2.2)

where Q is an equivalent martingale measure with respect to the price process St .

Since under Q the dynamics of the price process is

dSt = σ dWt

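where $W_t$ is a Q Brownian motion, we have

$$S_T = x + \sqrt{T - t}\,\sigma W_1,$$

where $W_1 \sim N(0, 1)$. Thus,

$$\begin{aligned} B(x, t) &= E^Q\left[(x + \sqrt{T - t}\,\sigma W_1 - K)^+\right] \\ &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (x - K + \sqrt{T - t}\,\sigma y)^+ e^{-\frac{y^2}{2}}\,dy \\ &= \frac{1}{\sqrt{2\pi}} \int_{\frac{K - x}{\sigma\sqrt{T - t}}}^{\infty} (x - K + \sqrt{T - t}\,\sigma y)\,e^{-\frac{y^2}{2}}\,dy \\ &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\frac{x - K}{\sigma\sqrt{T - t}}} (x - K - \sqrt{T - t}\,\sigma z)\,e^{-\frac{z^2}{2}}\,dz \quad (z = -y). \end{aligned} \tag{4.2.3}$$

We can write (4.2.3) concisely as

$$B(x, t) = (x - K)\,N\left(\frac{x - K}{\sigma\sqrt{T - t}}\right) + \sigma\sqrt{T - t}\,N'\left(\frac{x - K}{\sigma\sqrt{T - t}}\right), \tag{4.2.4}$$

where

$$N(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-\frac{z^2}{2}}\,dz.$$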
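
The closed form (4.2.4) can be checked against the integral in (4.2.3). The following is a minimal sketch using only the Python standard library; the function names and the sample parameters are illustrative assumptions, and the quadrature is a plain midpoint rule rather than a production integrator.

```python
from math import erf, exp, pi, sqrt


def norm_cdf(z):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density N'."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def bachelier_call(x, strike, sigma, tau):
    """Bachelier call price (4.2.4) with zero interest rate; tau = T - t."""
    total_vol = sigma * sqrt(tau)
    d = (x - strike) / total_vol
    return (x - strike) * norm_cdf(d) + total_vol * norm_pdf(d)


def bachelier_call_quadrature(x, strike, sigma, tau, n=20_000, width=10.0):
    """Midpoint-rule approximation of E[(x + sqrt(tau) sigma W_1 - K)^+]."""
    total_vol = sigma * sqrt(tau)
    h = 2.0 * width / n
    total = 0.0
    for i in range(n):
        y = -width + (i + 0.5) * h
        total += max(x - strike + total_vol * y, 0.0) * norm_pdf(y) * h
    return total


closed_form = bachelier_call(101.0, 100.0, 2.0, 0.25)
numerical = bachelier_call_quadrature(101.0, 100.0, 2.0, 0.25)
print(closed_form, numerical)
```

At the money (x = K) the first term of (4.2.4) vanishes and the price reduces to $\sigma\sqrt{T - t}/\sqrt{2\pi}$.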

Black–Scholes Formula

Black and Scholes modeled the price of a stock as a geometric Brownian motion

dSt = μSt dt + σ St dBt

where μ and σ are constant. Consider the price of a call option with a strike K
maturing at T . Again formula (4.2.1) reduces to

C(x, t) = EQ [(ST − K)+ |St = x], (4.2.5)

where Q is an equivalent martingale measure with respect to the price process St .

Now under Q the dynamics of the price process is

dSt = σ St dWt

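where $W_t$ is a Q Brownian motion. We have

$$S_T = x \exp\left(-\frac{\sigma^2 (T - t)}{2} + \sqrt{T - t}\,\sigma W_1\right), \tag{4.2.6}$$

where $W_1 \sim N(0, 1)$. Thus,

$$\begin{aligned} C(x, t) &= E^Q\left[\left(x \exp\left(-\frac{\sigma^2 (T - t)}{2} + \sqrt{T - t}\,\sigma W_1\right) - K\right)^+\right] \\ &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \left(x \exp\left(-\frac{\sigma^2 (T - t)}{2} + \sqrt{T - t}\,\sigma y\right) - K\right)^+ e^{-\frac{y^2}{2}}\,dy \\ &= \frac{1}{\sqrt{2\pi}} \int_{\frac{\ln\frac{K}{x} + \frac{\sigma^2 (T - t)}{2}}{\sigma\sqrt{T - t}}}^{\infty} \left(x \exp\left(-\frac{\sigma^2 (T - t)}{2} + \sqrt{T - t}\,\sigma y\right) - K\right) e^{-\frac{y^2}{2}}\,dy, \end{aligned} \tag{4.2.7}$$

which can be represented as

$$C(x, t) = x N(d_+) - K N(d_-), \tag{4.2.8}$$

where

$$d_\pm = \frac{\ln\frac{x}{K} \pm \frac{\sigma^2 (T - t)}{2}}{\sigma\sqrt{T - t}}.$$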
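
The same numerical check works for the Black–Scholes formula. This is a minimal sketch with zero interest rate, as assumed in this section; function names and sample parameters are illustrative, and the integral in (4.2.7) is approximated by a plain midpoint rule.

```python
from math import erf, exp, log, pi, sqrt


def norm_cdf(z):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def black_scholes_call(x, strike, sigma, tau):
    """Black-Scholes call price (4.2.8) with zero interest rate; tau = T - t."""
    d_plus = (log(x / strike) + 0.5 * sigma**2 * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return x * norm_cdf(d_plus) - strike * norm_cdf(d_minus)


def black_scholes_call_quadrature(x, strike, sigma, tau, n=20_000, width=10.0):
    """Midpoint-rule approximation of the expectation in (4.2.7)."""
    h = 2.0 * width / n
    total = 0.0
    for i in range(n):
        y = -width + (i + 0.5) * h
        terminal = x * exp(-0.5 * sigma**2 * tau + sqrt(tau) * sigma * y)
        density = exp(-0.5 * y * y) / sqrt(2.0 * pi)
        total += max(terminal - strike, 0.0) * density * h
    return total


closed_form = black_scholes_call(100.0, 100.0, 0.2, 1.0)
numerical = black_scholes_call_quadrature(100.0, 100.0, 0.2, 1.0)
print(closed_form, numerical)
```

With x = K = 100, σ = 0.2 and T − t = 1 the closed form reduces to $100\,[N(0.1) - N(-0.1)]$, roughly 7.97.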

4.2.2 Convexity

Convexity and generalized convexity play important roles in dealing with option
pricing and hedging. Both the Bachelier and Black–Scholes formulae involve interesting
convexity with respect to their various parameters.
We start with the Bachelier formula and use $I = \sqrt{T - t}\,\sigma$ and the forward price
X = x − K to simplify notation. We will also use their ratio, the moneyness m = X/I .
Using these new variables we can write the Bachelier formula (4.2.3) as

$$B(X, I) = E^Q[(X + I W_1)^+] = X N\left(\frac{X}{I}\right) + I N'\left(\frac{X}{I}\right). \tag{4.2.9}$$

Since for any fixed w, $(X + I w)^+$ is a sublinear function of (X, I ), so is B(X, I ).
Thus, we have the representation

$$B(X, I) = X B_X + I B_I. \tag{4.2.10}$$

Comparing with (4.2.9) we see that

$$B_X = N\left(\frac{X}{I}\right) \quad \text{and} \quad B_I = N'\left(\frac{X}{I}\right). \tag{4.2.11}$$

We see that the sublinear property of the Bachelier formula brings us much
convenience in calculating $B_X$ and $B_I$.
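
The positive homogeneity behind (4.2.10) and the partial derivatives in (4.2.11) are easy to confirm numerically. The sketch below uses central finite differences at an arbitrarily chosen point (X, I); the function names, the point, and the step size are illustrative assumptions.

```python
from math import erf, exp, pi, sqrt


def norm_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def norm_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def bachelier(X, I):
    """Bachelier price in the variables X = x - K and I = sqrt(T - t) * sigma, see (4.2.9)."""
    return X * norm_cdf(X / I) + I * norm_pdf(X / I)


def central_difference(func, point, h=1e-5):
    return (func(point + h) - func(point - h)) / (2.0 * h)


X, I = 0.3, 1.2
B_X = central_difference(lambda v: bachelier(v, I), X)  # numerical dB/dX
B_I = central_difference(lambda v: bachelier(X, v), I)  # numerical dB/dI

print("homogeneity:", bachelier(2.0 * X, 2.0 * I), 2.0 * bachelier(X, I))
print("B_X vs N(X/I): ", B_X, norm_cdf(X / I))
print("B_I vs N'(X/I):", B_I, norm_pdf(X / I))
print("X*B_X + I*B_I vs B:", X * B_X + I * B_I, bachelier(X, I))
```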

The sublinearity of B also means that its conjugate is an indicator function of
some convex set M and we have the representation

$$B = \sigma_M \quad \text{and} \quad B^* = \iota_M.$$

By the definition of the conjugate function we can calculate that

$$M = \{(X^*, I^*) : I^* + mX^* \le mN(m) + N'(m)\} = \{(N(m), I^*) : I^* \le N'(m),\ m \in \mathbb{R}\}. \tag{4.2.12}$$

We now turn to the Black–Scholes formula. First, direct calculation verifies

$$\frac{\partial C(x, t)}{\partial x} = N(d_+). \tag{4.2.13}$$

We observe that the variable x appears in the expression of C(x, t) in three separate
places. Yet curiously the partial derivative with respect to x
contains only the contribution of the term that is linear in x. This is rather
similar to the simple formula for $B_X$ in (4.2.11). In the next section we will show that
the reason is related to the convexity of C in x and that the Fenchel–Legendre transform of
C in x is related to delta hedging. It is natural to ask whether C is also convex
with respect to σ . It turns out the answer is negative. Yet if we compensate C by a
multiple of an at the money call it becomes convex.
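We start by calculating the partial derivative of C with respect to σ :

$$C_\sigma = x N'(d_+) \frac{\partial d_+}{\partial \sigma} - K N'(d_-) \frac{\partial d_-}{\partial \sigma}. \tag{4.2.14}$$

Writing τ = T − t and observing that

$$x N'(d_+) = K N'(d_-) = \sqrt{\frac{xK}{2\pi}} \exp\left(-\frac{(\ln(x/K))^2}{2\tau\sigma^2} - \frac{\tau\sigma^2}{8}\right) \tag{4.2.15}$$

and

$$d_+ - d_- = \sigma\sqrt{\tau} \tag{4.2.16}$$

we can simplify the expression of $C_\sigma$ to

$$C_\sigma = \sqrt{\frac{xK\tau}{2\pi}} \exp\left(-\frac{(\ln(x/K))^2}{2\tau\sigma^2} - \frac{\tau\sigma^2}{8}\right). \tag{4.2.17}$$

It follows that

$$C_{\sigma\sigma} = \sqrt{\frac{xK\tau}{2\pi}} \exp\left(-\frac{(\ln(x/K))^2}{2\tau\sigma^2} - \frac{\tau\sigma^2}{8}\right) \left[\frac{(\ln(x/K))^2}{\tau\sigma^3} - \frac{\tau\sigma}{4}\right]. \tag{4.2.18}$$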

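Defining

$$f(\sigma) := C - \sqrt{xK}\left[N\left(\frac{\sqrt{\tau}\sigma}{2}\right) - N\left(-\frac{\sqrt{\tau}\sigma}{2}\right)\right]$$

(note that the expression inside the square bracket is the percentage premium of an at the money call
option) we have

$$\begin{aligned} f''(\sigma) &= C_{\sigma\sigma} + \frac{\tau\sigma}{4}\sqrt{xK\tau}\,N'\left(\frac{\sqrt{\tau}\sigma}{2}\right) \\ &= \sqrt{xK\tau}\,N'\left(\frac{\sqrt{\tau}\sigma}{2}\right) \exp\left(-\frac{(\ln(x/K))^2}{2\tau\sigma^2}\right) \frac{(\ln(x/K))^2}{\tau\sigma^3} \\ &\quad + \frac{\tau\sigma}{4}\sqrt{xK\tau}\,N'\left(\frac{\sqrt{\tau}\sigma}{2}\right) \left[1 - \exp\left(-\frac{(\ln(x/K))^2}{2\tau\sigma^2}\right)\right] \ge 0. \end{aligned} \tag{4.2.19}$$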
We note that

$$N\left(\frac{\sqrt{\tau}\sigma}{2}\right) - N\left(-\frac{\sqrt{\tau}\sigma}{2}\right)$$

is the price of an at the money call. Thus, the Black–Scholes call price C
compensated by a multiple $(-\sqrt{x/K})$ of an at the money call is convex as a function of σ.
We can also phrase this in terms of generalized convexity. Note that f
is convex and, therefore, can be supported from below by an affine function. Thus,
the Black–Scholes call price C as a function of σ can be supported from below by
a function of the form

$$\sqrt{xK}\left[N\left(\frac{\sqrt{\tau}\sigma}{2}\right) - N\left(-\frac{\sqrt{\tau}\sigma}{2}\right)\right] + y\sigma - b.$$

Define

$$c(\sigma, y) = \sqrt{xK}\left[N\left(\frac{\sqrt{\tau}\sigma}{2}\right) - N\left(-\frac{\sqrt{\tau}\sigma}{2}\right)\right] + y\sigma.$$

Then the Black–Scholes call price C as a function of σ is Φc(1) -convex using the
notation in Section 1.5.
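
Both claims of this subsection, that C itself is not convex in σ while the compensated price f is, can be probed with second differences. The following is a minimal numerical sketch; the spot, strike, maturity, grid of volatilities, and step size are illustrative assumptions, and a finite-difference check is of course not a proof.

```python
from math import erf, log, sqrt


def norm_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def black_scholes_call(x, strike, sigma, tau):
    """Black-Scholes call price (4.2.8) with zero interest rate; tau = T - t."""
    d_plus = (log(x / strike) + 0.5 * sigma**2 * tau) / (sigma * sqrt(tau))
    return x * norm_cdf(d_plus) - strike * norm_cdf(d_plus - sigma * sqrt(tau))


def compensated_call(x, strike, sigma, tau):
    """f(sigma) = C - sqrt(xK) [N(sqrt(tau) sigma / 2) - N(-sqrt(tau) sigma / 2)]."""
    a = 0.5 * sqrt(tau) * sigma
    return black_scholes_call(x, strike, sigma, tau) - sqrt(x * strike) * (norm_cdf(a) - norm_cdf(-a))


def second_difference(func, s, h=1e-2):
    """Central second difference, an approximation of func''(s)."""
    return (func(s + h) - 2.0 * func(s) + func(s - h)) / h**2


tau = 1.0
# At the money, C is concave in sigma: the bracket in (4.2.18) is negative.
c_atm = second_difference(lambda s: black_scholes_call(100.0, 100.0, s, tau), 0.5)
# Away from the money, the compensated price f stays convex on the whole grid.
sigmas = [0.1 * i for i in range(1, 21)]
f_min = min(second_difference(lambda s: compensated_call(120.0, 100.0, s, tau), s) for s in sigmas)
print("C_sigma_sigma at the money:", c_atm)
print("smallest second difference of f:", f_min)
```

When x = K the compensated price f vanishes identically, which is consistent with (4.2.19) giving $f'' = 0$ in that case.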

4.2.3 Duality

We turn to explore the reason why the Black–Scholes call formula
C has a simple derivative with respect to x. To understand this phenomenon we need
to go back to the original derivation of the Black–Scholes formula in [6]. Black
and Scholes derive formula (4.2.8) by considering a portfolio of $N_t$ shares of the
underlying to hedge a short position of one share of the European call option:

St Nt − C(St , t). (4.2.20)

They want to choose Nt in such a way that the resulting portfolio (4.2.20) has
riskless gains, that is

Nt dSt − dC(St , t) = 0. (4.2.21)

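Using the Itô formula we have

$$N_t\,dS_t = \frac{\partial C}{\partial x}\,dS_t + \left(\frac{\partial C}{\partial t} + \frac{1}{2}\frac{\partial^2 C}{\partial x^2}\right)dt. \tag{4.2.22}$$

It follows that

$$N_t = \frac{\partial C}{\partial x} \tag{4.2.23}$$

and C must satisfy the Black–Scholes partial differential equation

$$\frac{\partial C}{\partial t} + \frac{1}{2}\frac{\partial^2 C}{\partial x^2} = 0, \tag{4.2.24}$$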
with terminal condition

C(x, T ) = (x − K)+ . (4.2.25)

The Black–Scholes partial differential equation (4.2.24) with the terminal condition
(4.2.25) provides an alternative derivation of the Black–Scholes formula (4.2.8)
via the Feynman–Kac formula.
Relationships (4.2.20) and (4.2.23) reveal that when portfolio (4.2.20) has
riskless gains its value equals the Fenchel-Legendre transform of the no arbitrage
option price. Since Merton has shown that the Black–Scholes option price $C(S_t, t)$
is convex in $S_t$, we have the following duality:

$$C^*(N_t, t) = \sup_{S_t}\,[N_t S_t - C(S_t, t)], \tag{4.2.26}$$

and

$$C(S_t, t) = \sup_{N_t}\,[N_t S_t - C^*(N_t, t)], \tag{4.2.27}$$

where the conjugate operation is with respect to the first variable. These relationships
reveal that for each fixed t the option value is a convex function of the stock
price and the cash borrowed $C^*(N_t, t)$ is a convex function of the number of shares of the stock
in the hedging portfolio. The same relationship also holds for the Bachelier formula.
Thus, the simple form of the partial derivative of C in (4.2.13) is a consequence
of the Fenchel-Young equality in Proposition 1.3.1. This duality argument also
explains the simplicity of $B_x$, but as mentioned before $B_x$ can be derived more
directly using the sublinear property of the Bachelier formula B.
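
The duality (4.2.26) can be illustrated numerically. The sketch below computes the conjugate of the zero-rate Black–Scholes call price by brute-force maximization over a grid of stock prices and compares the maximizer with the spot price at which the delta $N(d_+)$ was computed. The grid, the parameter values, and the helper names are assumptions made for illustration. The closed form $K N(d_-)$ used for the cash borrowed follows from $C = xN(d_+) - KN(d_-)$ together with $N_t = N(d_+)$.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def d_plus_minus(x, K, tau, sigma):
    s = sigma * math.sqrt(tau)
    d_plus = math.log(x / K) / s + 0.5 * s
    return d_plus, d_plus - s

def bs_call(x, K, tau, sigma):
    """Black-Scholes call price with zero interest rate."""
    d_plus, d_minus = d_plus_minus(x, K, tau, sigma)
    return x * norm_cdf(d_plus) - K * norm_cdf(d_minus)

# Illustrative (assumed) inputs.
K, tau, sigma, S0 = 100.0, 1.0, 0.2, 110.0

d_plus, d_minus = d_plus_minus(S0, K, tau, sigma)
delta = norm_cdf(d_plus)                  # N_t = C_x(S_t, t), the hedge ratio at S0
cash_closed_form = K * norm_cdf(d_minus)  # N_t S_t - C(S_t, t) evaluated at S0

# Brute-force conjugate: sup over x of [delta * x - C(x)] on a price grid.
grid = [0.01 * k for k in range(1, 40001)]  # 0.01, 0.02, ..., 400.00
best_x = max(grid, key=lambda x: delta * x - bs_call(x, K, tau, sigma))
conjugate = delta * best_x - bs_call(best_x, K, tau, sigma)

print("maximizer of N x - C(x):", best_x)
print("conjugate value:", conjugate, "closed form K N(d-):", cash_closed_form)
```

The supremum is attained at the spot price used to compute the delta, which is the Fenchel-Young equality at work.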

4.3 Duality and Delta Hedging

The duality relationship in delta hedging observed in the previous section for the
Bachelier and Black–Scholes formulae also holds in a more general setting.

4.3.1 Delta Hedging

We consider a diffusion process St satisfying

dSt = σ St dWt , (4.3.1)

where $W_t$ is a standard Brownian motion under measure Q (so that Q is a martingale
measure for $S_t$). We assume that the risk free rate is 0. Consider a contingent claim
on $S_t$ of European style with maturity at T > 0 and a terminal payoff $f(S_T)$ at
t = T. Denote the price of the European contingent claim at time t by $v(S_t, t)$.
We use a portfolio of $N_t$ shares of the underlying $S_t$ to hedge a short position of one
share of the European contingent claim:

St Nt − v(St , t). (4.3.2)

The gain of this portfolio is

Nt dSt − dv(St , t). (4.3.3)

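Applying the Itô formula we can rewrite (4.3.3) as

$$N_t\,dS_t - \left[\left(v_t + \frac{\sigma^2 x^2}{2}v_{xx}\right)dt + v_x\,\sigma S_t\,dW_t\right],$$

where the derivatives of v are evaluated at $x = S_t$.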

To ensure a riskless gain we need

Nt = vx (St , t). (4.3.4)

Then the gain in portfolio reduces to

$$-\left(v_t + \frac{\sigma^2 x^2}{2}v_{xx}\right)dt.$$

Now no arbitrage requires this quantity to be 0. Thus, v must satisfy the Black–Scholes PDE

$$v_t + \frac{\sigma^2 x^2}{2}v_{xx} = 0, \tag{4.3.5}$$
with terminal condition

v(x, T ) = f (x), (4.3.6)

where f is the payoff of the target at T .
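
The hedging argument above can be tried out in discrete time. The following Python sketch simulates $dS_t = \sigma S_t\,dW_t$ on a time grid, starts from the premium of a European call, rebalances to $N_t = v_x(S_t, t)$ at each step, and records the gap between the terminal wealth and the payoff. The call payoff, the parameter values, the number of steps and paths, and the random seed are illustrative assumptions; in discrete time the gap is small but not zero.

```python
import math
import random

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_price_delta(x, K, tau, sigma):
    """Zero-rate Black-Scholes call price v and its delta v_x, for tau > 0."""
    s = sigma * math.sqrt(tau)
    d_plus = math.log(x / K) / s + 0.5 * s
    d_minus = d_plus - s
    return x * norm_cdf(d_plus) - K * norm_cdf(d_minus), norm_cdf(d_plus)

def hedge_error(S0, K, T, sigma, n_steps, rng):
    """Terminal wealth of the delta hedge minus the call payoff on one simulated path."""
    dt = T / n_steps
    s = S0
    wealth, _ = call_price_delta(S0, K, T, sigma)  # start with the option premium
    for k in range(n_steps):
        _, delta = call_price_delta(s, K, T - k * dt, sigma)  # N_t = v_x(S_t, t)
        z = rng.gauss(0.0, 1.0)
        s_next = s * math.exp(-0.5 * sigma ** 2 * dt + sigma * math.sqrt(dt) * z)
        wealth += delta * (s_next - s)             # gain N_t dS_t, with zero interest rate
        s = s_next
    return wealth - max(s - K, 0.0)

rng = random.Random(0)  # fixed seed so the run is reproducible
errors = [hedge_error(100.0, 100.0, 1.0, 0.2, 400, rng) for _ in range(100)]
mean_abs_error = sum(abs(e) for e in errors) / len(errors)
premium, _ = call_price_delta(100.0, 100.0, 1.0, 0.2)

print("option premium:", premium)
print("mean absolute hedging error:", mean_abs_error)
```

The mean absolute error should be small relative to the premium and shrink as the number of rebalancing steps grows, which is the discrete counterpart of the riskless gain condition (4.3.4).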

4.3.2 Duality

Using (4.2.1) we know that

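$$\begin{aligned}
v(x, t) &= E^Q[f(S_T)\mid S_t = x] \\
&= E^Q\left[f\left(x\exp\left(-\frac{\sigma^2}{2}(T - t) + \sqrt{T - t}\,\sigma W_1\right)\right)\right],
\end{aligned} \tag{4.3.7}$$

where $W_1 \sim N(0, 1)$ under measure Q. Thus we see that v is convex in x provided
that f is convex.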
Fixing t, vx (·, t) is a monotone increasing function. Thus, we can represent the
pricing portfolio St Nt − v(St , t) graphically in Figures 4.2 and 4.3.
We see from those graphs the similarity with Fenchel duality. Indeed whenever
the terminal payoff f of the European contingent claim is convex we have the
following duality relationship:

$$v^*(N_t, t) = \sup_{S_t}\,[S_t N_t - v(S_t, t)] \tag{4.3.8}$$

and

$$v(S_t, t) = \sup_{N_t}\,[S_t N_t - v^*(N_t, t)]. \tag{4.3.9}$$

Here the conjugate $v^*$ is the cash borrowed process when we maintain a self-financing
hedging portfolio. Relationship (4.3.8) corresponds to the hedging
portfolio having riskless gains, and relationship (4.3.9) shows that the hedging portfolio
$S_t N_t - v^*(N_t, t)$ is self-financing.
[Fig. 4.2 Hedging portfolio. Two panels in the (s, n)-plane showing the curves $v_x(\cdot, t)$ and $v_x^{-1}(\cdot, t)$, with $S_t$ marked on the s-axis and $N_t$ on the n-axis.]

[Fig. 4.3 Equality holds when $N_t = v_x(S_t, t)$, $S_t = v_x^{-1}(N_t, t)$. Same axes and curves as in Fig. 4.2.]

To implement this hedging, Nt must satisfy the Fenchel equality

v(St , t) + v ∗ (Nt , t) = St Nt . (4.3.10)

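Then $N_t = v_x(S_t, t)$ is a function of $S_t$ and $S_t = v^*_n(N_t, t)$ is a function of $N_t$.

Moreover,

$$\frac{\partial v}{\partial t} = -\frac{\partial v^*}{\partial t}$$

and

$$v_{xx}\,v^*_{nn} = 1.$$

Substituting the above into (4.3.5) we derive

$$-\frac{\partial v^*}{\partial t} + \frac{\sigma^2 x^2 v_{xx}^2}{2}\,v^*_{nn} = 0. \tag{4.3.11}$$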
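
The two identities used in this substitution can be recovered from the Fenchel equality (4.3.10) as follows. This is a sketch that assumes v and $v^*$ are smooth and that $v_{xx} > 0$, so that $n = v_x(x, t)$ can be inverted in x.

```latex
% Fenchel equality along the curve n = v_x(x,t):
%   v(x,t) + v^*(v_x(x,t), t) = x\, v_x(x,t).
\begin{align*}
&\text{Differentiate in } t \text{ with } x \text{ fixed, and use } v^*_n = x: \\
&\qquad v_t + v^*_n\, v_{xt} + v^*_t = x\, v_{xt}
  \;\Longrightarrow\; v_t + v^*_t = 0. \\[4pt]
&\text{Differentiate } x = v^*_n(v_x(x,t), t) \text{ in } x: \\
&\qquad 1 = v^*_{nn}\, v_{xx}. \\[4pt]
&\text{Hence } v_{xx} = v_{xx}^2\, v^*_{nn}, \text{ and (4.3.5) becomes} \\
&\qquad -v^*_t + \frac{\sigma^2 x^2 v_{xx}^2}{2}\, v^*_{nn} = 0.
\end{align*}
```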

4.3.3 Time Reversal

In particular, if we reverse the time by setting τ = T − t then Equation (4.3.11)
becomes

$$\frac{\partial v^*}{\partial \tau} + \frac{\sigma^2 x^2 v_{xx}^2}{2}\,v^*_{nn} = 0. \tag{4.3.12}$$

Since Equations (4.3.12) and (4.3.5) have the same form, this suggests that in reverse
time the cash borrowed process $v^*$ should be a martingale just like v is a martingale
in time t.
Let us fix the notation first. We use τ to denote the reversed time. For a stochastic
process $P_t$, t ∈ [0, T], we define its time reversal by $\hat{P}_\tau = P_t$ provided that t + τ = T.
Let Δ denote an infinitesimal increment of time. Setting τ + t + Δ = T,
we have

$$dP_t = P_{t+\Delta} - P_t = \hat{P}_\tau - \hat{P}_{\tau+\Delta} = -d\hat{P}_\tau.$$

We note that if $W_t$ is a Brownian motion under measure Q then so is $\hat{W}_\tau$ under
the same measure. The time reversal of a function of a stochastic process is defined
below using $N_t = v_x(S_t, t)$ as an example:

$$\hat{N}_\tau = v_x(\hat{S}_\tau, \tau).$$

The time reversal of the differential of a product of stochastic processes needs to be
handled with caution. For example, we can write (4.3.1) as

$$S_{t+\Delta} - S_t = \sigma S_t (W_{t+\Delta} - W_t).$$

Letting t + τ + Δ = T we have

$$\begin{aligned}
d\hat{S}_\tau = \hat{S}_{\tau+\Delta} - \hat{S}_\tau &= -(S_{t+\Delta} - S_t) = -dS_t \\
&= -\sigma S_t (W_{t+\Delta} - W_t) = -\sigma \hat{S}_{\tau+\Delta}(\hat{W}_\tau - \hat{W}_{\tau+\Delta}) \\
&= \sigma(\hat{S}_\tau + d\hat{S}_\tau)\,d\hat{W}_\tau.
\end{aligned} \tag{4.3.13}$$

Iterating (4.3.13) and eliminating zero terms we have

$$d\hat{S}_\tau = \sigma^2 \hat{S}_\tau\,d\tau + \sigma \hat{S}_\tau\,d\hat{W}_\tau. \tag{4.3.14}$$

We see that although St is a martingale its time reversal Ŝτ is not.
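
The drift in the reversed dynamics arises because, in reversed time, each increment pairs the Brownian increment with the value of the process at the other end of the time step. The Python sketch below illustrates this on one simulated path: the sum with the integrand at the left endpoint (the usual Itô sum) is compared with the sum with the integrand at the right endpoint, and the gap between them is compared with $\sigma^2\int S_t\,dt$. The path length, parameter values, seed, and variable names are illustrative assumptions.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the run is reproducible
sigma, T, n = 0.2, 1.0, 20000
dt = T / n

# Simulate one path of dS = sigma S dW together with its Brownian increments.
S = [100.0]
dW = []
for _ in range(n):
    w = rng.gauss(0.0, math.sqrt(dt))
    dW.append(w)
    S.append(S[-1] * math.exp(-0.5 * sigma ** 2 * dt + sigma * w))

# Integrand at the left endpoint of each step: the Ito sum, close to S_T - S_0.
forward_sum = sum(sigma * S[k] * dW[k] for k in range(n))
# Integrand at the right endpoint of each step, as it appears in reversed time.
backward_sum = sum(sigma * S[k + 1] * dW[k] for k in range(n))
# Time integral of the drift coefficient sigma^2 S.
drift_integral = sum(sigma ** 2 * S[k] * dt for k in range(n))

gap = backward_sum - forward_sum
print("S_T - S_0:", S[-1] - S[0], "left-endpoint sum:", forward_sum)
print("right minus left endpoint sums:", gap, "sigma^2 * integral of S dt:", drift_integral)
```

The gap between the two sums is close to the time integral of $\sigma^2 S_t$, which matches the drift term appearing in the reversed equation.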

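Now we turn to $\hat{N}_\tau$. Using Itô's formula we have

$$\begin{aligned}
d\hat{N}_\tau &= \frac{\partial v_x}{\partial t}\,d\tau + \frac{1}{2}\frac{\partial^2 v_x}{\partial x^2}(d\hat{S}_\tau)^2 + \frac{\partial v_x}{\partial x}\,d\hat{S}_\tau \\
&= \left(\frac{\partial v_x}{\partial t} + \frac{\partial v_x}{\partial x}\sigma^2 \hat{S}_\tau + \frac{1}{2}\frac{\partial^2 v_x}{\partial x^2}\sigma^2 \hat{S}_\tau^2\right)d\tau + \frac{\partial v_x}{\partial x}\sigma \hat{S}_\tau\,d\hat{W}_\tau.
\end{aligned} \tag{4.3.15}$$

Differentiating (4.3.5) with respect to x we have

$$\frac{\partial v_x}{\partial t} + \frac{\partial v_x}{\partial x}\sigma^2 x + \frac{1}{2}\frac{\partial^2 v_x}{\partial x^2}\sigma^2 x^2 = 0.$$

It follows that

$$d\hat{N}_\tau = \frac{\partial v_x}{\partial x}\sigma \hat{S}_\tau\,d\hat{W}_\tau, \tag{4.3.16}$$

so that $\hat{N}_\tau$ is a martingale.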
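Finally we consider the time reversal of the hedging portfolio (cash borrowed)
process $H_t = v^*(N_t, t)$. Using the dual Itô formula (4.1.9) we have

$$\begin{aligned}
dv &= v_t\,dt + N_t\,dS_t + \frac{1}{2}\,d\langle S, N\rangle_t, \\
dH_t = dv^* &= v^*_t\,dt + S_t\,dN_t + \frac{1}{2}\,d\langle S, N\rangle_t.
\end{aligned} \tag{4.3.17}$$

Combining (4.3.17) with the riskless gain condition $dv = N_t\,dS_t$ and $v_t + v^*_t = 0$
from (4.1.6) we have

$$\begin{aligned}
dH_t = H_{t+\Delta} - H_t &= S_t\,dN_t + d\langle S, N\rangle_t \\
&= (S_t + dS_t)\,dN_t = S_{t+\Delta}(N_{t+\Delta} - N_t).
\end{aligned} \tag{4.3.18}$$

Letting t + τ + Δ = T we have

$$\hat{H}_\tau - \hat{H}_{\tau+\Delta} = \hat{S}_\tau(\hat{N}_\tau - \hat{N}_{\tau+\Delta})$$

or

$$d\hat{H}_\tau = \hat{S}_\tau\,d\hat{N}_\tau = \frac{\partial v_x}{\partial x}\sigma \hat{S}_\tau^2\,d\hat{W}_\tau. \tag{4.3.19}$$

Thus, $\hat{H}_\tau$ is also a martingale.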

4.4 Generalized Duality and Hedging with Contingent Claims

Financial innovations in the past several decades have led to the creation of many
new types of financial derivatives. They have become increasingly liquid and, thus, can
also be used as hedging devices. What happens when we use a contingent claim
instead of the underlying to construct a hedging portfolio for the purpose of pricing and
hedging a target contingent claim? It turns out that a duality also emerges between
the value of the target contingent claim and the cash borrowed process, in terms of a
generalized duality which naturally corresponds to a generalized convexity concept
(see, e.g., Section 1.5). Moreover, similar to the classical option pricing theory, the
no arbitrage value of the contingent claim derived this way preserves the generalized
convexity of the terminal payoff.

4.4.1 Preservation of Generalized Convexity in the Value Function of a Contingent Claim
Consistency of Generalized Convexity

Let St be a diffusion process

dSt = μ(St , t)dt + σ (St , t)dWt , (4.4.1)

where $W_t$ is a standard Brownian motion. We assume again that the risk free rate
is 0. Consider a target contingent claim on $S_t$ of European style with maturity at
T > 0 and a terminal payoff $f(S_T)$ at t = T. Suppose that a different contingent
claim on $S_t$, which we call the hedging claim, is traded on the market with price $p(S_t, t)$ at
all times t ∈ [0, T]. For uniqueness in what follows we always assume that p and v
are smooth functions bounded by $\alpha e^{\beta x^2}$ for some α, β > 0. Our main result is:
Theorem 4.4.1 (Consistency of Generalized Convexity) Define $c_t(x, y) = p(x, t)y$
and assume that f is $\Phi_{c_T(1)}$-convex. Then
(i) the partial differential equation $v_t + \frac{\sigma^2}{2}v_{xx} = 0$, $v(x, T) = f(x)$, uniquely
determines an arbitrage free price for the target claim;
(ii) for any t ∈ [0, T], $v(\cdot, t)$ is $\Phi_{c_t(1)}$-convex; and
(iii) $N_t$ determined by

$$v^{c_t(1)}(N_t, t) + v(S_t, t) = p(S_t, t)N_t$$

makes the portfolio of the hedging instrument and the riskless asset
$p(S_t, t)N_t - v^{c_t(1)}(N_t, t)$ riskless.
Proof We price v by forming a potentially self-financing portfolio that statically
shorts one share of the target contingent claim and holds $N_t$ units of the hedging claim.
Then

p(St , t)Nt − v(St , t). (4.4.2)

is the cash borrowed resulting from this portfolio. Self-financing implies that

Nt dp(St , t) = dv(St , t). (4.4.3)

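Applying the Itô formula to $N_t\,dp(S_t, t) - dv(S_t, t)$ we get

$$N_t\left[\left(p_t + \mu p_x + \frac{\sigma^2}{2}p_{xx}\right)dt + p_x\sigma\,dW_t\right] - \left[\left(v_t + \mu v_x + \frac{\sigma^2}{2}v_{xx}\right)dt + v_x\sigma\,dW_t\right]. \tag{4.4.4}$$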

To ensure riskless gains we need Nt to satisfy the equation

vx (St , t) = Nt px (St , t). (4.4.5)

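Then the gain in portfolio reduces to

$$N_t\left(p_t + \frac{\sigma^2}{2}p_{xx}\right)dt - \left(v_t + \frac{\sigma^2}{2}v_{xx}\right)dt.$$

Now no arbitrage requires this quantity to be 0. Thus

$$N_t\left(p_t + \frac{\sigma^2}{2}p_{xx}\right)dt = \left(v_t + \frac{\sigma^2}{2}v_{xx}\right)dt.$$

Since p is arbitrage free,

$$p_t + \frac{\sigma^2}{2}p_{xx} = 0.$$

Thus, v must also satisfy the Black–Scholes PDE

$$v_t + \frac{\sigma^2}{2}v_{xx} = 0, \tag{4.4.6}$$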
with terminal condition

v(x, T ) = f (x), (4.4.7)

where f is the payoff of the target at T .

We show that $v^{c_t(1)c_t(2)}$ satisfies the same Black–Scholes PDE as v does. Observe
that $x \mapsto p(x, T)$ is strictly monotone, which implies that $x \mapsto p(x, t)$ is invertible,
i.e., x = x(p, t). We can define

$$\tilde{v}(p, t) = v(x(p, t), t) + \iota_{\mathrm{range}(p(\cdot, t))}(p).$$

Then we have

$$\tilde{v}^*(N_t, t) = \sup_{P_t}\,[P_t N_t - \tilde{v}(P_t, t)] = \sup_{S_t}\,[p(S_t, t)N_t - v(S_t, t)] = v^{c_t(1)}(N_t, t).$$

Similarly, for any $P_t = p(S_t, t)$,

$$\tilde{v}^{**}(P_t, t) = \sup_{N_t}\,[P_t N_t - \tilde{v}^*(N_t, t)] = \sup_{N_t}\,[p(S_t, t)N_t - v^{c_t(1)}(N_t, t)] = v^{c_t(1)c_t(2)}(S_t, t).$$

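Thus, we need only to show that $\tilde{v}$ and $\tilde{v}^{**}$ satisfy the same Black–Scholes PDE.
We do so through the PDE for the cash borrowed $\tilde{v}^*$. Changing variables we have

$$\begin{aligned}
\frac{\partial v}{\partial t} &= \frac{\partial \tilde{v}}{\partial t} + \tilde{v}_p\frac{\partial p}{\partial t}, \\
v_x &= \tilde{v}_p\,p_x, \\
v_{xx} &= \tilde{v}_p\,p_{xx} + \tilde{v}_{pp}\,p_x^2.
\end{aligned}$$

Substituting them into

$$\frac{\partial v}{\partial t} + \frac{\sigma^2}{2}v_{xx} = 0$$

and using

$$\frac{\partial p}{\partial t} + \frac{\sigma^2}{2}p_{xx} = 0$$

we have

$$\frac{\partial \tilde{v}}{\partial t} + \frac{\sigma^2 p_x^2}{2}\,\tilde{v}_{pp} = 0. \tag{4.4.8}$$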
Thus, using the Fenchel equality

$$\tilde{v}(P_t, t) + \tilde{v}^*(N_t, t) = P_t N_t$$

we have

$$n = \tilde{v}_p, \qquad p = \tilde{v}^*_n, \qquad \frac{\partial \tilde{v}}{\partial t} = -\frac{\partial \tilde{v}^*}{\partial t}$$

and

$$\tilde{v}_{pp}\,\tilde{v}^*_{nn} = 1.$$

Substituting the above into (4.4.8) we derive

$$-\frac{\partial \tilde{v}^*}{\partial t} + \frac{\sigma^2 p_x^2\,\tilde{v}_{pp}^2}{2}\,\tilde{v}^*_{nn} = 0. \tag{4.4.9}$$

To derive the PDE for $\tilde{v}^{**}$ we start from $P_t$ and $N_t$ satisfying the Fenchel equality

$$\tilde{v}^{**}(P_t, t) + \tilde{v}^*(N_t, t) = P_t N_t.$$

Then we have

$$n = \tilde{v}^{**}_p, \qquad p = \tilde{v}^*_n, \qquad \frac{\partial \tilde{v}^{**}}{\partial t} = -\frac{\partial \tilde{v}^*}{\partial t}$$

and

$$\tilde{v}^{**}_{pp}\,\tilde{v}^*_{nn} = 1.$$

Since $\tilde{v}_{pp}\,\tilde{v}^*_{nn} = 1$, substituting the above relationships into (4.4.9) yields

$$\frac{\partial \tilde{v}^{**}}{\partial t} + \frac{\sigma^2 p_x^2}{2}\,\tilde{v}^{**}_{pp} = 0.$$

We see that $\tilde{v}$ and $\tilde{v}^{**}$ satisfy the same Black–Scholes differential equation. Since
$v(x, t) = \tilde{v}(p, t)$ and $\tilde{v}^{**}(p, t) = v^{c_t(1)c_t(2)}(x, t)$ for x = x(p, t) we conclude that
$v(x, t)$ and $v^{c_t(1)c_t(2)}(x, t)$ also satisfy the same Black–Scholes differential equation.
Finally, since $v(\cdot, T)$ is $\Phi_{c_T(1)}$-convex we have $v(x, T) = v^{c_T(1)c_T(2)}(x, T)$. That
is, v and $v^{c_T(1)c_T(2)}$ satisfy the same terminal condition. Thus, they must be the same
for all t, i.e. $v(x, t) = v^{c_t(1)c_t(2)}(x, t)$, so that $v(\cdot, t)$ is $\Phi_{c_t(1)}$-convex.

Remark 4.4.2 The function $c_t(x, y) = p(x, t)y$ is known once we know the price p of
the claim that we use to hedge.
Fixing t and defining $\tilde{v}(p, t) = v(x(p, t), t)$, we can represent the portfolio
$p(S_t, t)N_t - v(S_t, t)$ graphically in Figures 4.4 and 4.5.

[Fig. 4.4 Hedging portfolio. Two panels in the (s, n)-plane showing the curves labeled $\tilde{v}_p(\cdot, t)$ and $\tilde{v}_p^{-1}(\cdot, t)\,p_x(\cdot, t)$, with $S_t$ marked on the s-axis and $N_t$ on the n-axis.]

[Fig. 4.5 Equality holds when $p_x(S_t, t)N_t = v_x(S_t, t)$. Same axes and curves as in Fig. 4.4.]

We see that these graphs are almost exact replications of the graphic representation
of the hedging portfolio $S_t N_t - v(S_t, t)$. The only difference is that the
sn-plane is weighted by $p_x(\cdot, t)$. This implies the following generalized Fenchel
duality relationship:

$$v^{c_t(1)}(N_t, t) = \sup_{S_t}\,[p(S_t, t)N_t - v(S_t, t)] \tag{4.4.10}$$

and

$$v(S_t, t) = \sup_{N_t}\,[p(S_t, t)N_t - v^{c_t(1)}(N_t, t)]. \tag{4.4.11}$$

Relationship (4.4.10) can be interpreted as a cash borrowed process having the
property of riskless gains, and Equation (4.4.11) shows that the hedging portfolio
$p(S_t, t)N_t - v^{c_t(1)}(N_t, t)$ of the hedging claim and cash is self-financing. The key
of the formal proof of Theorem 4.4.1 is to verify that $v(\cdot, t)$ is $\Phi_{c_t(1)}$-convex.

4.4.2 Determining the Hedging Process

While in principle the PDE (4.4.6) with terminal condition (4.4.7) determines
an arbitrage free and $\Phi_{c_t(1)}$-convexity preserving contingent claim pricing function
v, to determine the hedging process one must know the dynamics of $N_t$ and
$H_t = v(\cdot, t)^{c_t(1)}(N_t)$.
Defining $n(x, t) := v_x(x, t)/p_x(x, t)$, Equation (4.4.5) implies that the hedging
process is

$$N_t = n(S_t, t). \tag{4.4.12}$$

Differentiating (4.4.6) with respect to x we derive the PDE governing n:

$$n_t + \frac{\sigma^2}{2}n_{xx} = -\frac{n_x\sigma}{p_x}(p_x\sigma)_x. \tag{4.4.13}$$

We turn to the hedging process $N_t$. Using Itô's formula we have

$$dN_t = \left(n_t + \mu n_x + \frac{\sigma^2}{2}n_{xx}\right)dt + n_x\sigma\,dW_t. \tag{4.4.14}$$

Using (4.4.13) we can simplify (4.4.14) to

$$dN_t = n_x\left[\left(\mu - \sigma\frac{(p_x\sigma)_x}{p_x}\right)dt + \sigma\,dW_t\right]. \tag{4.4.15}$$

We see that $N_t$ is in general not a martingale unless $\mu - \sigma(p_x\sigma)_x/p_x = 0$.

Next we discuss the dynamics of the cash borrowed process $H_t$. We have seen that
no arbitrage forces $v(\cdot, t) = v(\cdot, t)^{c_t(1)c_t(2)}$. Thus, by (4.4.10) and (4.4.11) we have

$$H_t(N_t) + v(S_t, t) = p(S_t, t)N_t. \tag{4.4.16}$$

Due to the self-financing condition (3.2.3) we have

dHt = pdNt + d p, N t (4.4.17)

1 (px σ )x
= nx pσ dWt + σ 2 px2 nx + pxx px2 n − pnx dt
2 px σ

In general $H_t$ is not a martingale. However, in some special cases it can be. For
example, if p(x, t) = x, i.e. the hedging is done with the price process $S_t$ itself,
then $p_x = 1$, $p_{xx} = 0$ and Equation (4.4.17) simplifies to

$$dH_t = \sigma n_x\left(S_t\,dW_t + [\sigma - S_t\sigma_x]\,dt\right). \tag{4.4.18}$$

Now when $S_t$ follows a geometric Brownian motion where σ(x, t) = σ(t)x, we
have $\sigma = x\sigma_x$ and $H_t$ is a martingale.

4.4.3 Hedging with p-Multiple ETF

Exchange traded funds (ETFs) are securities that can be traded in a financial market
like a stock. These financial products are created to provide investors the flexibility
to invest in a specific sector such as real estate or technology, or in a broad index
such as the S&P 500. Some of them also enable investors to leverage. For example,
one can buy ETFs that double or triple the daily percentage movement of, say, the
popular S&P 500 index and many other indices. There are also short ETFs that mimic
the effect of selling borrowed shares of the corresponding ETFs. Buying an ETF itself is
referred to as long. These products provide convenient tools for hedging. We discuss in this
section the general p-multiple ETF, which mimics p times the percentage
movement of the underlying, as a hedging tool. We will need the following special
case of Theorem 2.2.3.
Proposition 4.4.3 The function $x^q$, x ≥ 0, is $\Phi_{[x^p y](1)}$-convex if either q > 0 and
p < q or q < 0 and q < p. Similarly, the function $-x^q$, x ≥ 0, is $\Phi_{[x^p y](1)}$-convex
if either p > q > 0 or p < q < 0.

Proof We prove only the case of $x^q$; the discussion for $-x^q$ is similar. Let $u(x) = x^q$,
x ≥ 0. It is easy to calculate that

$$R(x) = -\frac{x u''(x)}{u'(x)} = 1 - q. \tag{4.4.19}$$

When q > 0 and p < q, u is an increasing function and R(x) = 1 − q < 1 − p,
and when q < 0 and p > q, u is a decreasing function and R(x) = 1 − q > 1 − p.
Now the conclusion of the proposition follows directly from Theorem 2.2.3.

Suppose $S_t$ satisfies the diffusion process

$$dS_t = \sigma S_t\,dB_t. \tag{4.4.20}$$

Consider a European style contingent claim with payoff $S_T^q$ at t = T. Denote
the value of this contingent claim at time t by $v(S_t, t)$. Solving (4.4.6) with terminal
condition $v(S_T, T) = S_T^q$, we can determine that

$$v(S_t, t) = S_t^q\,e^{\frac{q(q-1)}{2}\sigma^2(T-t)}.$$

It is easy to verify that

$$\frac{dv(S_t, t)}{v(S_t, t)} = q\,\frac{dS_t}{S_t}.$$

Thus, v is a q-multiple of $S_t$. Similarly, a p-multiple of $S_t$ has a no arbitrage price

$$P_t = S_t^p\,e^{\frac{p(p-1)}{2}\sigma^2(T-t)}.$$
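
The closed form for the q-multiple can be checked against the risk neutral expectation $E^Q[S_T^q \mid S_t = s]$ by numerical integration. The Python sketch below uses a plain trapezoidal rule over the standard normal density; the quadrature settings, the test values of q, and the function names are illustrative assumptions.

```python
import math

def multiple_value(s, q, sigma, tau):
    """Closed form s^q exp(q (q - 1) sigma^2 tau / 2) for the q-multiple."""
    return s ** q * math.exp(0.5 * q * (q - 1.0) * sigma ** 2 * tau)

def expected_power(s, q, sigma, tau, n=4000, z_max=12.0):
    """Trapezoidal approximation of E[S_T^q | S_t = s] under dS = sigma S dW."""
    h = 2.0 * z_max / n
    total = 0.0
    for i in range(n + 1):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        s_T = s * math.exp(-0.5 * sigma ** 2 * tau + sigma * math.sqrt(tau) * z)
        total += weight * s_T ** q * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

# Illustrative (assumed) inputs.
s, sigma, tau = 1.5, 0.2, 1.0
rel_errors = {}
for q in (4.0, 2.0, 0.5, -2.0):
    closed = multiple_value(s, q, sigma, tau)
    numeric = expected_power(s, q, sigma, tau)
    rel_errors[q] = abs(numeric - closed) / closed
    print("q =", q, "closed form:", closed, "quadrature:", numeric)
```

The same function with q replaced by p gives the price $P_t$ of the p-multiple.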

Theorem 4.4.4 (Hedging with Multiple of ETF) Let $S_t$ be the price of an asset
satisfying the diffusion equation (4.4.20). Suppose that either q > 0 and p < q
or q < 0 and q < p. Then a q-multiple long ETF of $S_t$, t ∈ [0, T], can always
be dynamically hedged with an arbitrage free self-financing portfolio involving a
p-multiple ETF of $S_t$. Moreover, for any t ∈ [0, T], the arbitrage free price of the
q-multiple ETF is $\Phi_{[x^p y](1)}$-convex.
Proof By Theorem 4.4.1 we need only to check that $v(x, T) = x^q$ is $\Phi_{[x^p y](1)}$-convex.
This follows directly from Proposition 4.4.3.

In this case we can explicitly calculate that the hedging process is

$$N_t = \frac{q}{p}\,S_t^{q-p}\,e^{\left[\frac{q(q-1)}{2} - \frac{p(p-1)}{2}\right]\sigma^2(T-t)}$$

and the cash borrowed process is

$$H_t = \frac{q - p}{p}\,v(S_t, t).$$

Note that the cash borrowed process is always a martingale. In particular, for q = 4
and p = 2, we see that the no arbitrage price of the quadruple long ETF at any given
time t ∈ [0, T] is $\Phi_{[x^2 y](1)}$-convex and such a process can be hedged by a double
ETF.
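
For the case q = 4 and p = 2 the formulas can be checked directly. The Python sketch below compares the closed-form hedge ratio with a finite-difference estimate of $v_x/p_x$, checks the cash borrowed formula, and confirms by brute force that $p(x, t)N_t - v(x, t)$ is maximized at $x = S_t$, as the generalized Fenchel duality (4.4.10) requires. The parameter values, the grid, and the function names are illustrative assumptions.

```python
import math

def multiple_price(s, m, sigma, tau):
    """No arbitrage price s^m exp(m (m - 1) sigma^2 tau / 2) of an m-multiple."""
    return s ** m * math.exp(0.5 * m * (m - 1.0) * sigma ** 2 * tau)

def hedge_ratio(s, q, p, sigma, tau):
    """Closed-form N_t = (q / p) s^(q - p) exp([q(q-1)/2 - p(p-1)/2] sigma^2 tau)."""
    rate = 0.5 * q * (q - 1.0) - 0.5 * p * (p - 1.0)
    return (q / p) * s ** (q - p) * math.exp(rate * sigma ** 2 * tau)

# Illustrative (assumed) inputs: hedge a quadruple ETF with a double ETF.
q, p, sigma, tau, s0 = 4.0, 2.0, 0.2, 0.5, 1.5

N = hedge_ratio(s0, q, p, sigma, tau)
v0 = multiple_price(s0, q, sigma, tau)
P0 = multiple_price(s0, p, sigma, tau)
cash = P0 * N - v0                       # cash borrowed H_t = p N_t - v

# Finite-difference check of N_t = v_x / p_x.
eps = 1e-6
v_x = (multiple_price(s0 + eps, q, sigma, tau) - multiple_price(s0 - eps, q, sigma, tau)) / (2 * eps)
p_x = (multiple_price(s0 + eps, p, sigma, tau) - multiple_price(s0 - eps, p, sigma, tau)) / (2 * eps)

# Brute-force generalized conjugate: sup over x of [p(x) N - v(x)].
grid = [0.001 * k for k in range(1, 5001)]  # 0.001, ..., 5.000
best_s = max(grid, key=lambda x: multiple_price(x, p, sigma, tau) * N - multiple_price(x, q, sigma, tau))
conjugate = multiple_price(best_s, p, sigma, tau) * N - multiple_price(best_s, q, sigma, tau)

print("N_t closed form:", N, "finite difference v_x / p_x:", v_x / p_x)
print("cash borrowed:", cash, "(q - p) / p * v:", (q - p) / p * v0)
print("maximizer:", best_s, "conjugate value:", conjugate)
```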
Remark 4.4.5 It is worth observing that when q ∈ (0, 1) and p < q the $\Phi_{[x^p y](1)}$-convex
functions are, in fact, concave. We can see that $\Phi_{[x^p y](1)}$-convex functions
represent a wide spectrum of convex and concave functions with different strengths.
A few graphic illustrations are included in Figures 4.6, 4.7, 4.8, and 4.9.

[Fig. 4.6 Graphic illustration of q = 4 and p = 2]

[Fig. 4.7 Graphic illustration of q = 1/2 and p = 1/4]

[Fig. 4.8 Graphic illustration of q = 1/2 and p = −1/2]

[Fig. 4.9 Graphic illustration of q = −2 and p = −1/2]

The above discussion can be applied to a q-multiple short ETF of $S_t$. We
summarize the result in the following theorem.
Theorem 4.4.6 Let $S_t$ be the price of an asset satisfying the diffusion equation
(4.4.20). Suppose that either p > q > 0 or p < q < 0. Then
a q-multiple short ETF of $S_t$, t ∈ [0, T], can always be dynamically hedged with
an arbitrage free self-financing portfolio involving a p-multiple long ETF of $S_t$.
Moreover, for any t ∈ [0, T], the arbitrage free price of the q-multiple short ETF is
$\Phi_{[x^p y](1)}$-convex.
Proof The proof is the same as that of Theorem 4.4.4 except that we need
to use the second part of Proposition 4.4.3.

Generalized convexity also shows up in other finance related functions. The
following are two simple examples.
Example 4.4.7 (Stock Price as a Contingent Claim of Company's Asset) Leland
proposed the following perspective of stock price in [31]. Suppose a company's
activity has value $a_t$ at t ∈ [0, +∞) with dynamics

$$da_t = \sigma a_t\,dW_t,$$

where σ is a constant. Assume that the risk free rate is r and that there is no dividend.
Let us first view the stock price $S(a_t)$ as a perpetual claim on $a_t$. Then $S(a_t)$ satisfies
the ordinary differential equation

$$\frac{\sigma^2 x^2}{2}S_{xx} + r x S_x - r S = 0,$$

so that

$$S(a_t) = b a_t - c a_t^q,$$

where $q = -2r/\sigma^2 < 0$ (the negative root of the characteristic equation of the ODE) and b, c > 0.

Now suppose that the company has an outstanding bond maturing at T with a total
amount K. Then the stock price u becomes a contingent claim on $a_t$ with terminal
payoff

$$u(a_T, T) = (b a_T - c a_T^q - K)^+.$$

It is easy to check that for x sufficiently large u is an increasing function and

$$-\frac{x u''(x)}{u'(x)} \le 1 - q.$$

Thus, for K sufficiently large $u(x, T)$ is a $\Phi_{[x^q y](1)}$-convex function. It follows
from Theorem 4.4.1 that $u(\cdot, t)$ is also $\Phi_{[x^q y](1)}$-convex.
Example 4.4.8 (Normal Kernel) Consider the scaled normal kernel

$$n(x) = e^{-kx^2/2}, \qquad x \ge 0,\ k > 0.$$

We can verify that $-x n''(x)/n'(x) = kx^2 - 1 \ge -1$ but there is no upper bound.
Thus, the decreasing function $e^{-kx^2/2}$, x ≥ 0, is $\Phi_{[x^p y](1)}$-convex for any p ≥ 2.
Due to the symmetry of both $e^{-kx^2/2}$ and $|x|^p y - b$ with respect to the vertical axis
we conclude that this property also holds when x < 0, so that $e^{-kx^2/2}$ is
$\Phi_{[|x|^p y](1)}$-convex for any p ≥ 2.

We note that in both Example 4.4.7 and Example 4.4.8 the functions involved are
neither convex nor concave.

4.4.4 Reducing the Volatility of the Hedging Process

When there are multiple hedging claims available in the market, it is usually the
case that for a given target contingent claim there are many different ways to hedge.
Choosing a hedging device that fits better in terms of generalized convexity
can often help reduce the volatility of the hedging process.
Example 4.4.9 (Hedging q-Multiple Long ETF Using p-Multiple) Suppose that $S_t$
is a diffusion process

$$dS_t = \sigma S_t\,dW_t, \qquad t \in [0, T].$$

Let v be the value of the q-multiple long ETF of $S_t$. Suppose either q > 0, p < q
or q < 0, p > q. Then the process for the hedging shares has been explicitly
calculated as

$$N_t = \frac{q}{p}\,S_t^{q-p}\,e^{\left[\frac{q(q-1)}{2} - \frac{p(p-1)}{2}\right]\sigma^2(T-t)}$$

and the cash borrowed process is

$$H_t = \frac{q - p}{p}\,v(S_t, t).$$

Note that the closer p is to q, the smoother the cash borrowed process $H_t$, which is
a proxy for the value of the hedging portfolio.
Example 4.4.10 (Normal Kernel) Now consider $S_t$ following a Bachelier model
$S_t = \sigma W_t$ and let $v(S_t, t)$, t ∈ [0, T], be the no arbitrage price of a contingent
claim with payoff $f(x) = e^{-x^2/2}$ at T.

It is easy to directly calculate that

$$v(S_t, t) = \frac{1}{\sqrt{\sigma^2(T - t) + 1}}\,\exp\left(-\frac{S_t^2}{2(\sigma^2(T - t) + 1)}\right).$$

In this case, we can dynamically replicate v using either $S_t$ (v is not convex in $S_t$)
or its double long ETF $P_t = S_t^2 + \sigma^2(T - t)$, with respect to which v is convex.
When hedging with $S_t$ we can calculate that the share of hedging is
$N_t^S = v_x = -S_t v/(\sigma^2(T - t) + 1)$. The cash borrowed process is

$$H_t^S = S_t N_t^S - v = -\frac{S_t^2 + \sigma^2(T - t) + 1}{\sigma^2(T - t) + 1}\,v.$$

When hedging with $P_t$ we can similarly calculate that the share of hedging is
$N_t^P = v_x/p_x = -v/(2(\sigma^2(T - t) + 1))$. The cash borrowed process becomes

$$H_t^P = P_t N_t^P - v = -\frac{S_t^2/2 + 3\sigma^2(T - t)/2 + 1}{\sigma^2(T - t) + 1}\,v.$$
We can see that hedging with Pt results in a smoother cash borrowed process
because the random change related to the uncertain stock price is only half that
of hedging with St .
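
The formulas of this example can be verified numerically. The Python sketch below checks the price v against a direct quadrature of $E[e^{-S_T^2/2} \mid S_t = s]$ under the Bachelier model, and then computes the cash borrowed for both hedging instruments from their definitions, taking the hedge with the double ETF as $P_t N_t^P - v$. The inputs, quadrature settings, and function names are illustrative assumptions.

```python
import math

def claim_value(s, sigma, tau):
    """v = exp(-s^2 / (2 a)) / sqrt(a) with a = sigma^2 tau + 1."""
    a = sigma ** 2 * tau + 1.0
    return math.exp(-s * s / (2.0 * a)) / math.sqrt(a)

def claim_value_quadrature(s, sigma, tau, n=4000, z_max=12.0):
    """Trapezoidal approximation of E[exp(-S_T^2 / 2)] with S_T = s + sigma sqrt(tau) Z."""
    h = 2.0 * z_max / n
    total = 0.0
    for i in range(n + 1):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        s_T = s + sigma * math.sqrt(tau) * z
        total += weight * math.exp(-0.5 * s_T * s_T) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def cash_borrowed(s, sigma, tau):
    """Cash borrowed when hedging with S_t and with the double ETF P_t = S_t^2 + sigma^2 tau."""
    a = sigma ** 2 * tau + 1.0
    v = claim_value(s, sigma, tau)
    n_s = -s * v / a                 # N^S = v_x
    h_s = s * n_s - v                # H^S = S N^S - v
    p = s * s + sigma ** 2 * tau
    n_p = -v / (2.0 * a)             # N^P = v_x / p_x with p_x = 2 s
    h_p = p * n_p - v                # H^P = P N^P - v
    return h_s, h_p

# Illustrative (assumed) inputs.
s, sigma, tau = 0.8, 0.3, 2.0
a = sigma ** 2 * tau + 1.0
v = claim_value(s, sigma, tau)
h_s, h_p = cash_borrowed(s, sigma, tau)
h_s_closed = -(s * s + sigma ** 2 * tau + 1.0) / a * v
h_p_closed = -(s * s / 2.0 + 1.5 * sigma ** 2 * tau + 1.0) / a * v

print("v closed form:", v, "quadrature:", claim_value_quadrature(s, sigma, tau))
print("H^S:", h_s, "closed form:", h_s_closed)
print("H^P:", h_p, "closed form:", h_p_closed)
```

In the closed forms the coefficient of $S_t^2$ in $H_t^P$ is half of that in $H_t^S$.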

4.4.5 The Volatility Trade

Now consider St following a diffusion process

dSt = σt St dWt , t ∈ [0, T ].

Let us assume that the volatility $\sigma_t^2$ is unknown. We further assume that the market
implies a constant volatility $\sigma_h^2$ which a certain trader knows, say, to be too high.
Can the trader take advantage of the situation? Carr and Madan have shown in [11]
that the answer is yes if there is a contingent claim whose no arbitrage price $v(S_t, t)$
is convex in $S_t$.
In this example we show that generalized convexity can help us to derive a similar
volatility trade when $v(S_t, t)$ has certain generalized convexity properties. Let
$p(S_t, t)$ be the no arbitrage price of a hedging claim with $p(\cdot, t)$ strictly monotone.
Let $c_t(x, y) = p(x, t)y$. We assume that $v(\cdot, T)$ is $\Phi_{c_T(1)}$-convex but not necessarily
convex in $S_t$, as in Examples 4.4.7 and 4.4.8.
Denote again

$$\tilde{v}(p, t, \sigma_h) = v(x(p, t), t, \sigma_h).$$

We have already seen that $\tilde{v}(p, t, \sigma_h)$ is convex in p. Here $\sigma_h$ is added to emphasize
that the trader views $\tilde{v}(p, t, \sigma_h)$ as following the constant volatility $\sigma_h$ implied by
the market in trading.
Itô's formula tells us that

$$v(S_T, T) - v(S_t, t) - \int_t^T \tilde{v}_p\,dP_s = \int_t^T \left[\frac{\partial \tilde{v}}{\partial s} + \frac{\tilde{v}_{pp}\,p_x^2 S_s^2}{2}\sigma_s^2\right]ds.$$

The left hand side is the trading portfolio and the right hand side is the P&L. Since the trader
follows the constant volatility $\sigma_h$ implied by the market in trading,

$$\frac{\partial \tilde{v}}{\partial t} = -\frac{\tilde{v}_{pp}\,p_x^2 S_t^2}{2}\sigma_h^2.$$

Thus,

$$P\&L = \int_t^T \frac{\tilde{v}_{pp}\,p_x^2 S_s^2}{2}\,(\sigma_s^2 - \sigma_h^2)\,ds,$$

where $\tilde{v}_{pp} > 0$. We see that the trader can take advantage of the over estimation of
volatility by the market by dynamically trading the portfolio

$$v(S_T, T) - v(S_t, t) - \int_t^T \tilde{v}_p\,dP_s.$$
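
The sign of the P&L can be illustrated by simulation in the classical special case p(x, t) = x, where the hedging claim is the underlying itself and generalized convexity reduces to ordinary convexity. The Python sketch below values and hedges a European call at an implied volatility $\sigma_h$ while the path is generated with a lower realized volatility, and evaluates the trading portfolio $v(S_T, T) - v(S_0, 0) - \sum \tilde{v}_p\,\Delta P$ on each path. The call payoff, the constant realized volatility, the parameter values, and the seed are illustrative assumptions.

```python
import math
import random

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_price_delta(x, K, tau, sigma):
    """Zero-rate Black-Scholes call price and delta, for tau > 0."""
    s = sigma * math.sqrt(tau)
    d_plus = math.log(x / K) / s + 0.5 * s
    d_minus = d_plus - s
    return x * norm_cdf(d_plus) - K * norm_cdf(d_minus), norm_cdf(d_plus)

def long_claim_pnl(S0, K, T, sigma_h, sigma_real, n_steps, rng):
    """Payoff minus initial value minus hedge gains; value and hedge use sigma_h."""
    dt = T / n_steps
    s = S0
    v0, _ = call_price_delta(S0, K, T, sigma_h)
    hedge_gain = 0.0
    for k in range(n_steps):
        _, delta = call_price_delta(s, K, T - k * dt, sigma_h)
        z = rng.gauss(0.0, 1.0)
        # The path itself evolves with the realized volatility.
        s_next = s * math.exp(-0.5 * sigma_real ** 2 * dt + sigma_real * math.sqrt(dt) * z)
        hedge_gain += delta * (s_next - s)
        s = s_next
    return max(s - K, 0.0) - v0 - hedge_gain

rng = random.Random(2)  # fixed seed so the run is reproducible
sigma_h, sigma_real = 0.3, 0.2
pnls = [long_claim_pnl(100.0, 100.0, 1.0, sigma_h, sigma_real, 250, rng) for _ in range(200)]
mean_pnl = sum(pnls) / len(pnls)

print("mean P&L of the long claim, short hedge portfolio:", mean_pnl)
```

With realized volatility below the implied one the integrand $(\sigma_s^2 - \sigma_h^2)$ is negative, so this portfolio loses on average and the trader would hold the opposite position, short the claim and long the hedge.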
Comments

Chapter 1 Sections 1.1–1.4 give a concise summary of standard convex analysis
duality theory, which was pioneered by Fenchel [18], Moreau [41], and Rockafellar
[45]. Our exposition follows [9, 20], emphasizing the variational approach by
focusing on convex programming. We also highlight the role of the subdifferential of
the optimal value function as the set of Lagrange multipliers and the set of dual
solutions.
Generalized convexity, conjugacy and related duality discussed in Section 1.5
can be traced back to Moreau. It gained more attention recently due to diverse
applications and also due to its role in mass transport theory [59]. Our main
references here are [16, 30, 39]. Their applications in hedging with contingent claims
are discussed in Section 4.4.
Chapter 2 Section 2.1 provides a unified treatment of the classical Markowitz
portfolio theory [38], CAPM model [50], and Sharpe ratio [51]. Following [64]
we emphasize that the underlying mathematical tool for all these applications is
minimizing a quadratic function with linear constraints, the simplest form of convex
programming. Convex duality is essential in revealing the structure of the solutions
with a practical financial meaning.
Section 2.2 deals with the portfolio problem from the perspective of utility
optimization. Utility functions have a long history that goes back to the work of Daniel
Bernoulli [4], who in 1738 addressed the St. Petersburg paradox proposed earlier
by his cousin Nicolas Bernoulli. The relevance to financial problems arises because
optimizing the utility of a portfolio simultaneously accounts for investors' pursuit of
capital growth and their risk aversion. The concavity of utility functions means convex
analysis is essential. Different agents have different degrees of risk aversion. These
can be measured by using either absolute risk aversion coefficients or relative risk
aversion coefficients [1, 44]. Interestingly, utility functions with those risk aversion
coefficients bounded at a given level can be characterized by the generalized convexity
discussed in Section 1.5. These new characterizations are included in Section 2.2.2.

© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2018
P. Carr, Q. J. Zhu, Convex Duality and Financial Mathematics,
SpringerBriefs in Mathematics, https://doi.org/10.1007/978-3-319-92492-2

Growth optimal portfolio theory [32] and Kelly's criterion [27, 34, 35, 55–57]
as a money management tool in investment in general and in games in particular
are discussed as an illustration of such utility optimization problems. In particular,
following [27, 49, 63] we highlight that optimizing the expected log utility for a
portfolio of cash and a given investment strategy on historical performance data
amounts to measuring the useful information implied by the investment strategy and
can be used as a measure to compare different investment strategies. In practice
the growth optimal portfolio and its special case, the Kelly criterion, are often too
risky, as illustrated in Example 2.2.11. Various fractional Kelly money management
schemes, often ad hoc, were proposed to limit the risk. Recently Vince and Zhu
[60] and Lopez de Prado, Vince and Zhu [33] provided theoretical justification for
such more conservative betting strategies. They use a more realistic finite investment
horizon and select the betting size based on risk adjusted returns. The analysis involves,
however, nonconvex functions.
The fundamental theorem of asset pricing (FTAP) relates no arbitrage to the existence
of a martingale measure that can be used to price assets in a financial market. Cox,
Ross, and Rubinstein observed such a principle in their classical work related to
option pricing in complete markets [12, 13]. General FTAPs were discussed in
[15, 21, 22, 29] with increasing generality, usually with a proof based on separation
arguments. Dybvig and Ross [17] observed that in an incomplete market the
martingale measures are related to the risk aversion of market agents. In Section 2.3
we approach the FTAP from the perspective of convex duality. We show that in an
incomplete market, a martingale measure is, in fact, a scaling of the dual solution
to a portfolio utility maximization problem. We also illustrate with an example that
this relationship helps us to understand that in an incomplete market, a martingale
measure provides a reference price for a certain agent to improve their utility
rather than to arbitrage. In a finite dimensional space, the linear programming duality
approach in Section 2.3.4 (see e.g. [28]) is equivalent to the Kreps–Yan cone
separation theorem which is used by Harrison and Kreps [21], Harrison and Pliska
[22], Delbaen and Schachermayer [15], and many others in their proofs of the FTAP in
different settings.
Section 2.4 deals with risk measures, a concept that plays important roles for
both financial institutions and regulatory agencies. Diversification reduces risk,
which implies the convexity of risk measures. We focus on coherent risk measures
proposed by Artzner, Delbaen, Eber, and Heath in [2]. Coherent risk measures are
sublinear, a particular type of convex function. Duality is involved in providing a
dual characterization of a coherent risk measure as the conjugate of an indicator
function of a cone, called the acceptance cone. Interestingly, the generating set for the
acceptance cone is closely related to the practice of stress tests. Convex duality
also provides several equivalent descriptions of the coherent risk measures in terms
of linear preference and value bounds. Moreover, the same argument is at the core
of the discussion of good deals in financial markets as explained in Jaschke and
Küchler [24]. Besides providing a framework to understand risk measures and their
relationship with other important financial concepts, convex duality methods also
help to amend the widely used nonconvex risk measure value at risk [25] to the convex
conditional value at risk proposed by Rockafellar and Uryasev in [46, 47].
Chapter 3 Sections 3.1–3.3 demonstrate that many of the results in the previous
chapter also persist in the more general setting of a multiperiod economy. We use
the general model laid out in S. Roman’s textbook [48].
Section 3.4 discusses super hedging (and, symmetrically, sub-hedging) bounds
in incomplete markets. This is a classical topic in financial mathematics (see
[22, 23, 26]). We emphasize that the super hedging bound of a given contingent
claim is the value of a linear programming problem. Linear programming duality allows us to
view the super hedging bound from two different perspectives. On the one hand, it is the
supremum of all the prices derived from martingale measures; on the other
hand, it can be represented as the cost of the cheapest super hedging portfolio. When
the sample space is finite, the super hedging portfolio in the second representation
can be derived by solving a linear programming problem. Linear programming
duality can also be used to analyze how adding contingent claims with known prices
narrows the gap between the super and sub-hedging bounds. When discussing
contingent claims related to currency spreads, incomplete markets may arise from
complete markets. Considering super hedging bounds in this kind of problem, in
general, leads to a Kantorovich mass transportation problem [59]. We illustrate the
solution process with an example on a finite sample space using linear programming
duality.
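The two descriptions of the super hedging bound can be compared on a tiny finite sample space. The sketch below is a hedged illustration, not the example worked out in Section 3.4: it assumes a one-period market with zero interest rate, a stock priced at 100 today that ends at 80, 100, or 120, and a call option struck at 100. The function names are hypothetical, and enumerating vertices stands in for a linear programming solver, which is only sensible for problems this small.

```python
from fractions import Fraction as F
from itertools import combinations

# Assumed one-period trinomial market with zero interest rate.
S0 = F(100)
states = [F(80), F(100), F(120)]             # terminal stock prices
payoff = [max(s - 100, 0) for s in states]   # call struck at 100


def superhedge_cost():
    """Primal problem: minimize a + b*S0 subject to a + b*S_i >= payoff_i.

    Every vertex of the feasible set has two binding constraints, so we
    solve each 2x2 system and keep the cheapest feasible portfolio.
    Returns (cost, cash a, shares b).
    """
    best = None
    for i, j in combinations(range(len(states)), 2):
        b = (payoff[j] - payoff[i]) / (states[j] - states[i])
        a = payoff[i] - b * states[i]
        if all(a + b * s >= x for s, x in zip(states, payoff)):
            cost = a + b * S0
            if best is None or cost < best[0]:
                best = (cost, a, b)
    return best


def max_martingale_price():
    """Dual problem: maximize E_q[payoff] over q >= 0 with
    sum(q) = 1 and E_q[S] = S0.

    Extreme points of this set are supported on at most two states.
    """
    best = None
    for i, j in combinations(range(len(states)), 2):
        qi = (states[j] - S0) / (states[j] - states[i])
        qj = 1 - qi
        if qi >= 0 and qj >= 0:
            price = qi * payoff[i] + qj * payoff[j]
            best = price if best is None else max(best, price)
    return best


print("cheapest super hedge (cost, cash, shares):", superhedge_cost())
print("largest martingale price:", max_martingale_price())
```

Both computations give 10, attained by borrowing 40 and holding half a share. The maximizing measure puts no weight on the middle state, so in this example the bound is approached, but not attained, by martingale measures equivalent to one charging all three states.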
Section 3.5 discusses a model for financial markets with bid-ask spreads. The
main difference from a simplified one-price financial market is that the set of payoffs
attainable by trading is, in general, a convex cone rather than a subspace.
This leads to the name conic finance, coined by Madan in [36, 37]. Besides a
concise presentation of the basic conic finance model, we also discuss a new refined
fundamental theorem of asset pricing as well as super and sub-hedging price bounds.
These results are taken from [58], emphasizing the role of convex duality.
Chapter 4 Section 4.1 summarizes facts on continuous models that we need later.
To be concise we are satisfied with a heuristic description of most of the material.
Readers interested in further details may consult [5, 42, 52–54]. The dual Itô formula
is a first taste of the role of duality in continuous models. It develops the generalized
Itô formula using quadratic covariation in [19].
Section 4.2 discusses the convexity and generalized convexity that emerge in the Bachelier
[3] and Black–Scholes [6, 40] formulae. The importance of these convexity properties
is highlighted by applying them to the computation of Greeks and by showing that
delta hedging is, in fact, the Fenchel-Legendre transform of the pricing formula.
This observation was made by Carr [10] in more general settings and is discussed in greater
detail in Section 4.3.
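The relation between delta hedging and conjugation can be summarized in a generic form. The display below is a sketch under stated assumptions rather than the formulation used in Sections 4.2 and 4.3: $C$ denotes a pricing formula that is convex and differentiable in the price $S$ of the underlying, with all other parameters held fixed, and $\Delta$ denotes the number of shares held.

```latex
% Fenchel-Legendre transform of a convex pricing formula C in the variable S
C^*(\Delta) = \sup_{S} \{ \Delta S - C(S) \}.
% Whenever \Delta = C'(S) for some S, the supremum is attained at that S,
% and the Fenchel-Young equality gives
C(S) + C^*(\Delta) = \Delta S
\quad\Longleftrightarrow\quad
C(S) = \Delta S - C^*(\Delta), \qquad \Delta = C'(S).
```

Under these assumptions the delta $\Delta = C'(S)$ plays the role of the dual variable, and $-C^*(\Delta)$ is the cash position that, together with $\Delta$ shares, matches the value $C(S)$.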
It turns out that if one hedges using a contingent claim rather than the underlying
itself, a similar duality still persists in the sense of the generalized duality that we discuss
in Section 4.4. The general principles are summarized in Sections 4.4.1 and 4.4.2. A
number of examples are included to illustrate their applications in financial practice.
How to hedge with the popular multiple ETFs of indices is discussed in detail in
Section 4.4.3. Also discussed in this section are examples of the generalized
convexity of Leland’s model of the stock price as a contingent claim on a company’s assets
[31] and the generalized convexity of the normal kernel. The common theme here is
that they all follow from characterizations of generalized convexity using the
relative risk aversion coefficient and the absolute risk aversion coefficient. Hedging
with derivatives can help to reduce risk and to expand the range of volatility
trading proposed in [11]. These are discussed in Sections 4.4.4 and 4.4.5,
respectively. Much of the material regarding these duality and generalized duality
relationships appears here for the first time. We believe that this is an area that is
worthy of further attention.
In addition, survey papers [14, 43, 61, 64] have also been valuable references.
References

1. Arrow, K.J.: Aspects of the Theory of Risk Bearing. The Theory of Risk Aversion. Yrjo
Jahnssonin Saatio, Helsinki (1965). Reprinted In: Essays in the Theory of Risk Bearing, pp.
90–109. Markham, Chicago (1971)
2. Artzner, P., Delbaen, F., Eber, J.-M., Heath, D.: Coherent measures of risk. Math. Financ. 9,
203–228 (1999)
3. Bachelier, L.: Théorie de la spéculation. Ann. Sci. Éc. Norm. Supér. 3(17), 21–86 (1900)
4. Bernoulli, D.: Exposition of a new theory on the measurement of risk. Econometrica 22, 23–36
(1954/1738)
5. Bjork, T.: Arbitrage Theory in Continuous Time. Oxford University Press, New York (2009)
6. Black, F., Scholes, M.: The pricing of options and corporate liabilities. J. Polit. Econ. 81,
637–645 (1973)
7. Borwein, J.M., Lewis, A.S.: Convex Analysis and Nonlinear Optimization. Springer, New York
(2000). Second edition (2005)
8. Borwein, J.M., Zhu, Q.J.: Techniques of Variational Analysis. Springer, New York (2005)
9. Borwein, J.M., Zhu, Q.J.: A variational approach to Lagrange multipliers. J. Optim. Theory
Appl. 171, 727–756 (2016). https://doi.org/10.1007/s10957-015-0756-2
10. Carr, P.: Option as Optimization: A Dual Approach to Derivatives Pricing. Quant USA, New
York (2014)
11. Carr, P., Madan, D.: Toward a theory of volatility trading. In: Jarrow, R. (ed.) Volatility
Estimation Techniques for Pricing Derivatives, pp. 417–427. Risk Books, London (1998)
12. Cox, J., Ross, S.: The valuation of options for alternative stochastic processes. J. Financ. Econ.
3, 144–166 (1976)
13. Cox, J., Ross, S., Rubinstein, M.: Option pricing: a simplified approach. J. Financ. Econ. 7,
229–263 (1979)
14. Dahl, K.R.: Convex duality and mathematical finance. Thesis for M.Sci., University of Oslo
(2012)
15. Delbaen, F., Schachermayer, W.: A general version of the fundamental theorem of asset pricing.
Math. Ann. 300, 463–520 (1994)
16. Dolecki, S., Kurcyusz, S.: On Φ-convexity in extremal problems. SIAM J. Control Optim.
16, 277–300 (1978)
17. Dybvig, P., Ross, S.A.: Arbitrage, state prices and portfolio theory. In: Handbook of the
Economics of Finance. North-Holland, Amsterdam (2003)
18. Fenchel, W.: Convex Cones, Sets and Functions. Lecture Notes. Princeton University,
Princeton (1951)

19. Föllmer, H., Protter, P., Shiryaev, A.N.: Quadratic covariation and an extension of Itô’s formula.
Bernoulli 1, 149–169 (1995)
20. Gale, D.: A geometric duality theorem with economic applications. Rev. Econ. Stud. 34, 19–24
(1967)
21. Harrison, J.M., Kreps, D.M.: Martingales and arbitrage in multiperiod securities markets. J.
Econ. Theory 20, 381–408 (1979)
22. Harrison, J.M., Pliska, S.: Martingales and stochastic integrals in the theory of continuous
trading. Stoch. Process. Appl. 11, 215–260 (1981)
23. Jacka, S.D.: A martingale representation result and an application to incomplete financial
markets. Math. Financ. 2, 239–250 (1992)
24. Jaschke, S., Küchler, U.: Coherent risk measures and good-deal bounds. Financ. Stochast. 5,
181–200 (2001)
25. Jorion, P.: Value at Risk. McGraw-Hill, New York (1997)
26. Kahalé, N.: Sparse calibrations of contingent claims. Math. Financ. 20, 105–115 (2010)
27. Kelly, J.L.: A new interpretation of information rate. Bell Syst. Tech. J. 35, 917–926 (1956)
28. King, A.J.: Duality and martingales: a stochastic programming perspective on contingent
claims. Math. Program. Ser. B 91, 543–562 (2002)
29. Kramkov, D., Schachermayer, W.: The asymptotic elasticity of utility functions and optimal
investment in incomplete markets. Ann. Appl. Probab. 9, 904–950 (1999)
30. Kutateladze, S.S., Rubinov, A.M.: Minkowski duality and its applications. Russ. Math. Surv.
27, 137–192 (1972)
31. Leland, H.: Corporate debt value, bond covenants, and optimal capital structure. J. Financ.
49(4), 1213–1252 (1994)
32. Lintner, J.: The valuation of risk assets and the selection of risky investments in stock portfolios
and capital budgets. Rev. Econ. Stat. 47, 13–37 (1965)
33. Lopez de Prado, M., Vince, R., Zhu, Q.J.: Optimal Risk Budgeting Under a Finite Investment
Horizon. SSRN 2364092 (2013)
34. Maclean, L.C., Thorp, E.O., Ziemba, W.T.: Good and bad properties of the Kelly criterion.
In: Maclean, L.C., Thorp, E.O., Ziemba, W.T. (eds.) The Kelly Capital Growth Investment
Criterion, Theory and Practice, pp. 563–574. World Scientific, Singapore (2010)
35. Maclean, L.C., Thorp, E.O., Ziemba, W.T. (eds.): The Kelly Capital Growth Investment
Criterion, Theory and Practice. World Scientific Handbook in Financial Economics Series,
vol. 3. World Scientific, Singapore (2011)
36. Madan, D.: Asset pricing theory for two price economies. Ann. Financ. 11, 1–35 (2014)
37. Madan, D., Schoutens, W.: Applied Conic Finance. Cambridge University Press, Cambridge
(2016)
38. Markowitz, H.: Portfolio Selection. Cowles Monograph, vol. 16. Wiley, New York (1959)
39. Martinez-Legaz, J.E.: Generalized Convex Duality and Its Economic Applications. Pontificia
Universidade Catolica del Peru (2002)
40. Merton, R.: Theory of rational option pricing. Bell J. Econ. Manag. Sci. 4, 141–183 (1973)
41. Moreau, J.J.: Fonctionelles Convexes. Lecture Notes. College de France, Paris (1967)
42. Oksendal, B.: Stochastic Differential Equations, 6th edn. Springer, New York (2003)
43. Pennanen, T.: Convex duality in stochastic optimization and mathematical finance. Math. Oper.
Res. 36, 340–362 (2011)
44. Pratt, J.W.: Risk aversion in the small and in the large. Econometrica 32(1–2), 122–136 (1964)
45. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
46. Rockafellar, R.T., Uryasev, S.: Optimization of conditional value at risk. J. Risk 2, 21–41
(2000)
47. Rockafellar, R.T., Uryasev, S.: Conditional value-at-risk for general loss distributions. J. Bank.
Financ. 26, 1443–1471 (2002)
48. Roman, S.: Introduction to the Mathematics of Finance. Springer, New York (2004)
49. Shannon, C., Weaver, W.: The Mathematical Theory of Communication. University of Illinois
Press, Urbana (1949)
50. Sharpe, W.F.: Capital asset prices: a theory of market equilibrium under conditions of risk. J.
Finance 19, 425–442 (1964)
51. Sharpe, W.F.: Mutual fund performance. J. Bus. 39, 119–138 (1966)
52. Shreve, S.E.: Stochastic Calculus for Finance I. Springer, New York (2004)
53. Shreve, S.E.: Stochastic Calculus for Finance II. Springer, New York (2004)
54. Steele, J.M.: Stochastic Calculus and Financial Applications. Springer, New York (2001)
55. Thorp, E.O.: Beat the Dealer. Random House, New York (1962)
56. Thorp, E.O.: Portfolio choice and the Kelly criterion. In: Proceedings of the Business and
Economic Statistics, pp. 215–224. American Statistical Association, Washington (1971)
57. Thorp, E.O., Kassouf, S.T.: Beat the Market. Random House, New York (1967)
58. Vazifedan, M., Zhu, Q.J.: No Arbitrage Principle in Conic Finance. Working Paper (2018)
59. Villani, C.: Topics in Optimal Transportation. Graduate Studies in Mathematics, vol. 58.
American Mathematical Society, Providence (2003)
60. Vince, R., Zhu, Q.J.: Optimal betting sizes for the game of blackjack. Risk J. Portf. Manag. 4,
53–75 (2015)
61. Xia, J., Yan, J.A.: Convex duality theory for optimal investment (2006). Preprint
62. Zǎlinescu, C.: On duality gaps in linear conic problems. School of Industrial and Systems
Engineering, Georgia Institute of Technology, Atlanta, GA (2010). Preprint.
www.optimization-online.org/DB_HTML/2010/09/2737.html
63. Zhu, Q.J.: Mathematical analysis of investment systems. J. Math. Anal. Appl. 326, 708–720
(2007)
64. Zhu, Q.J.: Convex analysis in mathematical finance. Nonlinear Anal. Theory Methods Appl.
75, 1719–1736 (2012)

