Quantitative Finance With Python Code
Contents
Foreword
Contributors
Bibliography
Author Bio
Foreword
In March 2018, the Federal Reserve (“Fed”) was in the midst of its first hiking cycle
in over a decade, and the European Central Bank (“ECB”), still reeling from the
Eurozone debt crisis, continued to charge investors for the privilege of borrowing
money. US sovereign bonds (“Treasuries”) were yielding 3% over their German coun-
terparts (“Bunds”), an all-time high, and unconventional monetary policy from the
two Central Banks had pushed the cost of protection to an all-time low.
Meanwhile, across the pond, a sophisticated Canadian pension flipped a rather
esoteric coin: a so-called digital put on Euro/Dollar, a currency pair that trades
over a trillion dollars a day. On this crisp winter morning, the EURUSD exchange
rate (“spot”) was 1.2500. If the flip resulted in heads, and spot ended below 1.2500
in 2 years, the pension would receive $10 million. If the flip were tails, and spot
ended above 1.2500, the pension would have to pay $2.5 million. Naturally, the 4 to
1 asymmetry in the payout suggests that the odds of heads were only 25%. Turns
out, the flip yielded heads, and in 2 years, spot was below 1.2500.
After the trade, I called Chris, reiterated the pitch, and explained that since Jan-
uary 1999, when EURUSD first started trading, the market implied odds of heads
had never been lower. As macroeconomic analysis and empirical realizations suggest
that the coin is fair, and there is about 50% chance of getting heads, should the
client perhaps consider trading the digital put in 10x the size? In his quintessentially
measured manner, Chris noted “we must have a repeatable experiment to isolate a
statistical edge”. Ten separate flips, for instance, could reduce the risk by 2/3. More-
over, as investors are perhaps loath to lose money with 100% probability, negative
bund yields incentivize capital flows to treasuries, and the anomalous rates “carry is
a well-rewarded risk premium”. Furthermore, as “investors value $1 in risk-off more
than $1 in risk-on”, does the limited upside in the payout of the digital put also
harness a well-rewarded tail risk premium?
I wish I were surprised by Chris’s nuance, or objectivity, or spontaneity. Having
known him for 8 years though, I have come to realize that he is the most gifted quant
I have had the privilege to work with, and this book is testament to his ability to
break complex ideas down to first principles, even in the treatment of the most com-
plex financial theory. The balance between rigor and intuition is masterful, and the
textbook is essential reading for graduate students who aspire to work in investment
management. Further, the depth of the material in each chapter makes this book
indispensable for derivative traders and financial engineers at investment banks, and
for quantitative portfolio managers at pensions, insurers, hedge funds and mutual
funds. Lastly, the “investment perspectives” and case studies make this a valuable
guide for practitioners structuring overlays, hedges and absolute return strategies in
fixed income, credit, equities, currencies and commodities.
In writing this book, Chris has also made a concerted effort to acknowledge that
markets are not about what is true, but rather what can be true, and when: With
negative yields, will Bunds decay to near zero in many years? If so, will $1 invested in
Treasuries, compound and buy all Bunds in the distant future? Or will the inflation
differential between the Eurozone and USA lead to a secular decline in the purchasing
power of $1? One may conjecture that no intelligent investor will buy perpetual Bunds
with a negative yield. However, even if the Bund yield in the distant future is positive,
but less than the Treasury yield, market implied odds of heads, for a perpetual flip,
must be zero. As the price of the perpetual digital put is zero, must the intelligent
investor add this option to her portfolio?
Since the global financial crisis, the search for yield has increasingly pushed in-
vestors down the risk spectrum, and negative interest rates, and unconventional mon-
etary policy, are likely just the tip of the iceberg. This book recognizes that unlike
physics, finance has no universal laws, and an asset manager must develop an in-
vestment philosophy to navigate the known knowns, known unknowns and unknown
unknowns. To allow a portfolio manager to see the world as it was, and as it can be,
this book balances the traditional investment finance topics with the more innovative
quant techniques, such as machine learning. Our hope is that the principles in this
book transcend the outcome of the perpetual flip.
– Tushar Arora
Contributors
Tushar Arora
INSEAD
Paris, France
George Kepertis
Boston University
Boston, Massachusetts
Lingyi Xu
Boston University
Boston, Massachusetts
You Xie
Boston University
Boston, Massachusetts
Maximillian Zhang
Boston University
Boston, Massachusetts
I
Foundations of Quant Modeling
CHAPTER 1
This approach can be used either after the fact, when all fixings for the reference
variable have been posted, or when the values for the floating leg can be inferred
from a forward or futures curve for the asset.
4 A Practical Guide to Investment Management, Trading and Financial Engineering
EXERCISES
1.1 Write a piece of pseudo-code that calculates the payoff for a condor option.
1.2 Compare the payoff of a futures contract to a cash position in an equity index.
1.3 Build an option strategy that replicates a short position in the underlying asset.
1.4 Build an options strategy that replicates selling a put option using a call option
and a position in the underlying asset.
1.6 Using put-call parity, devise a strategy using a put option and the underlying
asset that replicates a long position in a call.
1.7 You are asked to build a statistical arbitrage strategy. Describe in detail how you
would approach each of the steps in this project, highlighting the assumptions
and challenges that you might encounter.
CHAPTER 2
Theoretical Underpinnings
of Quant Modeling: Modeling
the Risk Neutral Measure
• Acceleration: The second derivative of the function: $\frac{d^2 f}{dx^2}$
Note that a derivative is the local change to the function and incorporates only
the linear behavior of the function. Thus, a derivative can be found by finding the line
that is tangent to the function at a given point. The non-linear behavior in the
function, if it exists, would then be captured in higher order derivatives. These higher
order derivatives will be an important part of quant finance and options modeling.
The power rule for derivatives, for example, helps us to differentiate a function of
the form $x^n$. In the following list, we summarize some of the most relevant derivative
rules, including how to differentiate powers, exponential functions, and logarithms:
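• Power Rule:
$$\frac{d(x^n)}{dx} = n x^{n-1} \tag{2.1}$$
• Derivative of $\log(x)$:
$$\frac{d(\log(x))}{dx} = \frac{1}{x} \tag{2.2}$$
• Derivative of $e^x$:
$$\frac{d(e^x)}{dx} = e^x \tag{2.3}$$
• Chain Rule:
$$\frac{dz(x, y)}{dx} = \frac{dz}{dy}\frac{dy}{dx} \tag{2.4}$$
• Product Rule:
$$\frac{d(uv)}{dx} = \frac{du}{dx} v + \frac{dv}{dx} u \tag{2.5}$$
$$\frac{\partial f}{\partial x} = \lim_{\Delta x \to 0} \left\{ \frac{f(x + \Delta x, y, \dots) - f(x, y, \dots)}{\Delta x} \right\} \tag{2.6}$$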
• Total Derivative: Estimates the rate of change of a given function with respect
to its entire parameter set [6]. Mathematically, this can be thought of as the
aggregate of the partial derivatives with respect to each variable, and can be
written as:
$$df = \sum_{i=1}^{N} \frac{\partial f}{\partial x_i} dx_i \tag{2.7}$$
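The derivative rules above can be checked numerically. The following is a minimal sketch, not part of the original text: the helper `central_diff`, the test point `x0`, and the example function `f(x, y) = x^2 y` are all illustrative assumptions. It compares finite-difference estimates against rules (2.1), (2.2), (2.3), (2.5) and the total derivative (2.7).

```python
import math

def central_diff(f, x, h=1e-6):
    """Central finite-difference estimate of f'(x) (hypothetical helper)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 1.5
n = 3

# Power rule (2.1): d(x^n)/dx = n * x^(n-1)
power_numeric = central_diff(lambda x: x ** n, x0)
power_exact = n * x0 ** (n - 1)

# Logarithm (2.2): d(log x)/dx = 1/x
log_numeric = central_diff(math.log, x0)
log_exact = 1.0 / x0

# Exponential (2.3): d(e^x)/dx = e^x
exp_numeric = central_diff(math.exp, x0)
exp_exact = math.exp(x0)

# Product rule (2.5) with u = x^2 and v = e^x
prod_numeric = central_diff(lambda x: x ** 2 * math.exp(x), x0)
prod_exact = 2 * x0 * math.exp(x0) + x0 ** 2 * math.exp(x0)

# Total derivative (2.7) for f(x, y) = x^2 * y:
# df is approximated by (df/dx) dx + (df/dy) dy for small dx, dy.
def f(x, y):
    return x ** 2 * y

x, y = 1.5, 2.0
dx, dy = 1e-4, -2e-4
df_total = 2 * x * y * dx + x ** 2 * dy   # sum of partials times increments
df_actual = f(x + dx, y + dy) - f(x, y)   # actual change in the function
```

The small gap between `df_total` and `df_actual` is the non-linear behavior that the first-order terms do not capture.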
2.1.3 Integration
Readers may also recall from their calculus courses that integration is another core
concept, and provides us with the tools to calculate the area under a given function,
f (x). It can be thought of as the reverse operation of a derivative. More formally,
recall that the indefinite integral of a function f (x) can be expressed via the following
equation:
$$F(x) = \int f(x)\,dx \tag{2.8}$$
We say that $F(x)$ is the anti-derivative of $f(x)$ if the following condition holds:
$$\frac{dF(x)}{dx} = f(x) \tag{2.9}$$
That is, if we recover f (x) by taking the derivative of F (x), then we say that
F (x) is the anti-derivative of f (x). In the following list, we review several important
integration techniques, each of which will be leveraged throughout the book.
$$\int u\,dv = uv - \int v\,du \tag{2.10}$$
$$\int f(g(x))\,g'(x)\,dx = \int f(u)\,du \tag{2.11}$$
This technique is often employed to simplify the integrands that we are working
with, and will be employed throughout the text. The reader
should recall that a change of variables, or substitution, in an integral requires
us to change the limits of integration when we are working with a definite
integral.
$$\int_a^b f(x)\,dx = F(b) - F(a) \tag{2.12}$$
where $F(a)$ and $F(b)$ are the antiderivative of $f(x)$ evaluated at $a$ and $b$,
respectively. Therefore, computing a definite integral will require us to first find the
antiderivative of the function in question, and then evaluate it at the integral
limits. Importantly, it should be emphasized from this theorem that the result
of a definite integral is a single number. This means that, if we were then to
differentiate the result of the integral, we would obtain zero as it is a constant.
In the next section we highlight a particular case of differentiating a definite
integral, notably, when the quantity we are differentiating with respect to is found
in the limits of integration.
$$F'(x) = \frac{d}{dx} \int_a^x f(t)\,dt = f(x) \tag{2.13}$$
This is a direct result of the fundamental theorem of calculus and will be lever-
aged throughout the text.
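Both results can be verified numerically. The sketch below is illustrative only: the `trapezoid` helper, the choice of integrand `cos` and the evaluation points are assumptions, not taken from the text. It checks (2.12) by comparing a numerical integral with `F(b) - F(a)`, and checks (2.13) by differentiating the integral with respect to its upper limit.

```python
import math

def trapezoid(f, a, b, n=10_000):
    """Composite trapezoid-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

f = math.cos   # integrand
F = math.sin   # a known antiderivative of the integrand
a, b = 0.0, 1.0

# (2.12): the definite integral equals F(b) - F(a)
numeric_integral = trapezoid(f, a, b)
exact_integral = F(b) - F(a)

# (2.13): differentiating the integral with respect to its upper limit
# recovers the integrand evaluated at that limit.
def G(x):
    return trapezoid(f, a, x)

x0, h = 0.7, 1e-4
dG_dx = (G(x0 + h) - G(x0 - h)) / (2.0 * h)
```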
$$f(x + h) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x)}{n!} h^n \tag{2.14}$$
$$= f(x) + f'(x)h + \frac{1}{2} f''(x)h^2 + \frac{1}{6} f'''(x)h^3 + \dots \tag{2.15}$$
where $f^{(n)}$ is the $n$th derivative of the function $f(x)$. Notice that the sum begins at
$n = 0$, which makes this term simply the value of the function $f(x)$ evaluated at
$x$.¹
As we can see from this representation, the Taylor series expansion is itself an
infinite sum of increasing order derivatives. However, for a small step $h$ and a
well-behaved function, the terms generally decrease in magnitude
as the order of the derivative increases, meaning that in practice we can choose a cutoff
$k$ and truncate the series after $k$ derivatives. For example, in the case where $k = 2$,
this leads to the following Taylor series approximation:
$$f(x + h) \approx f(x) + f'(x)h + \frac{1}{2} f''(x)h^2 \tag{2.16}$$
Taylor series is a concept that we will return to in later chapters in multiple
contexts. For example, later in this chapter, we will leverage a Taylor series expansion
to better understand calculus in a stochastic setting. Later, in chapter 9, we will use a
Taylor series expansion in the context of optimization techniques. Further, in chapter
10, we will use Taylor series expansions to build finite difference estimates.
¹ As 0! equals 1.
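To make the truncation idea concrete, the sketch below approximates the exponential function with Taylor terms up to order $k$. The function name `taylor_exp` and the inputs are illustrative assumptions. The exponential is chosen because every one of its derivatives equals the function itself, which keeps the example short.

```python
import math

def taylor_exp(x, h, k):
    """Approximate exp(x + h) using Taylor terms up to order k around x.

    Every derivative of exp is exp itself, so f^(n)(x) = exp(x).
    """
    return sum(math.exp(x) * h ** n / math.factorial(n) for n in range(k + 1))

x, h = 0.0, 0.1
exact = math.exp(x + h)

# Absolute truncation error for cutoffs k = 0, 1, 2, 3
errors = {k: abs(taylor_exp(x, h, k) - exact) for k in (0, 1, 2, 3)}
```

For this small step, each additional term reduces the error, which is why a second-order cutoff as in (2.16) is often adequate in practice.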
$$\sum_{i=1}^{N} p_i = 1 \tag{2.17}$$
$$p_i \ge 0 \quad \forall i \tag{2.18}$$
In the next section we discuss some of the most common examples of discrete
distributions, such as Bernoulli and Binomial distributions.
• Continuous Probability Distribution: A continuous probability distribution
is one in which any value can occur. Because any value can occur, continuous
probability distributions are defined by an infinite set of possible values.
This means that the probability of any exact value is zero, but the probability of
being within a given range will, generally speaking, not be. In order for a continuous
function to be a probability distribution the following conditions must
hold:
$$\int_{-\infty}^{\infty} f(x)\,dx = 1 \tag{2.19}$$
$$f(x) \ge 0 \quad \forall x \tag{2.20}$$
In the next section, we detail many of the most common continuous probability
distributions. Of particular note is the normal distribution, which is perhaps
the most commonly applied distribution in practical applications.
$$f(x) = P(X = x) \tag{2.21}$$
$$= \sum_{i=1}^{N} 1_{x_i = x}\, p_{x_i} \tag{2.22}$$
Here we calculate the PDF of a discrete distribution by summing over all prob-
abilities where the outcome matches the specified value x. It should be empha-
sized that the PDF measures the probability of an exact value occurring. That
is, it is the probability at a specific point. While this makes sense in the context
of a discrete probability distribution, it is less clear in the context of a continu-
ous probability distribution, where the probability of any specific value is zero
because there are infinite values. In this case, we instead interpret the PDF
as the probability of a random variable X being within an interval [x, x + ∆x ]
for infinitesimally small ∆x . For a continuous distribution, we can express the
probability of obtaining a value for a random variable X that is between a and
b as:
$$P(a \le X \le b) = \int_a^b f(x)\,dx \tag{2.23}$$
$$f(x) = P(x \le X \le x + \Delta x) \tag{2.24}$$
$$F(x) = P(X \le x) \tag{2.25}$$
$$= \sum_{i=1}^{N} 1_{x_i \le x}\, p_{x_i} \tag{2.26}$$
Just as before, we calculate the CDF by summing the probability over the set of
possible outcomes conditional on the value being below our specified threshold
x. Similarly, for a continuous distribution, the CDF can be computed as an
integral of the PDF:
$$F(x) = P(X \le x) \tag{2.27}$$
$$= \int_{-\infty}^{x} f(u)\,du \tag{2.28}$$
Importantly, this means that we can view the PDF of a continuous distribution
as the derivative of the CDF as follows:
$$f(x) = \frac{dF(x)}{dx} \tag{2.29}$$
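The relationships between the PDF and CDF can be illustrated with a short sketch. The fair die and the standard normal used here are illustrative choices, and the function names are hypothetical. The discrete part mirrors the indicator sums in (2.22) and (2.26); the continuous part checks (2.29) by differentiating the normal CDF numerically.

```python
import math

# Discrete case: a fair six-sided die (illustrative assumption)
outcomes = [1, 2, 3, 4, 5, 6]
probs = [1.0 / 6.0] * 6

def discrete_pdf(x):
    """Sum the probabilities of all outcomes equal to x, as in (2.22)."""
    return sum(p for xi, p in zip(outcomes, probs) if xi == x)

def discrete_cdf(x):
    """Sum the probabilities of all outcomes at or below x, as in (2.26)."""
    return sum(p for xi, p in zip(outcomes, probs) if xi <= x)

# Continuous case: the standard normal distribution
def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# (2.29): the PDF is the derivative of the CDF
x0, h = 0.3, 1e-5
pdf_from_cdf = (norm_cdf(x0 + h) - norm_cdf(x0 - h)) / (2.0 * h)

# (2.23): probability of landing in [a, b], computed from the CDF
prob_interval = norm_cdf(1.0) - norm_cdf(-1.0)
```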
$$E[X] = \sum_{i=1}^{N} x_i p_i \tag{2.30}$$
We also often use the square root of this variance, which is known as the
standard deviation of X:
$$\sigma_X = \sqrt{\mathrm{var}(X)} \tag{2.35}$$
Additionally, we can write the variance of the sum of two random variables as
the following equation:
var(X + Y ) = var(X) + var(Y ) + 2cov(X, Y ) (2.37)
Notice that, unlike in the case of expected value, the variance of the sum is
dependent on the covariance, or correlation between the two random variables.
Notably, if the random variables are independent, then this term goes away and
simplifies to the sum of the individual variances. More generally, however, this
is not the case, and we can see that the variance of the sum will be smallest
when the covariance term is as small as possible. This is a fundamental concept
that we will use when learning about portfolio optimization techniques, and the
benefits of diversification.
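A small numerical illustration of (2.37) follows. It is a sketch under stated assumptions: the volatilities of 20% and 10% are made up for illustration, and the covariance is written as the correlation times the two standard deviations.

```python
import math

def var_of_sum(sigma_x, sigma_y, rho):
    """Variance of X + Y from (2.37), with cov(X, Y) = rho * sigma_x * sigma_y."""
    cov_xy = rho * sigma_x * sigma_y
    return sigma_x ** 2 + sigma_y ** 2 + 2.0 * cov_xy

# Hypothetical standard deviations for two random variables
sigma_x, sigma_y = 0.2, 0.1

# Standard deviation of the sum across a range of correlations
vols = {rho: math.sqrt(var_of_sum(sigma_x, sigma_y, rho))
        for rho in (-1.0, -0.5, 0.0, 0.5, 1.0)}
```

The standard deviation of the sum falls as the correlation falls: with perfect positive correlation the two standard deviations simply add, while lower correlations produce a smaller total, which is the diversification effect described above.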
$$\rho = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y} \tag{2.40}$$
$$E[V(X)] = \sum_{i=1}^{N} V(x_i)\, p_i \tag{2.41}$$
As we will soon see, derivatives pricing and options modeling largely comes
down to the calculation of these types of integrals, or sums against different
payoff functions that are determined by the options structures. In the case of
European options, such as the call and put options introduced in chapter 1, the
structures are only a function of the asset’s terminal payoff, meaning they can
be formulated as the type of integral in (2.42).
$$f(k; p) = \begin{cases} p & k = 1 \\ 1 - p & k = 0 \end{cases} \tag{2.43}$$
Using the definition of expectation and variance above, it is easy to see that
the Bernoulli distribution has mean p and variance p(1 − p).
• Binomial Distribution: The Binomial distribution, like the Bernoulli distribution,
is a discrete probability distribution. Whereas the Bernoulli distribution
measures the probability distribution of a single binary event, such as
success or failure, a binomial distribution measures the number of successes in
a specified number of trials. The Binomial distribution is defined by two model
parameters, $p$, the probability of a “success” event, and $n$, the number of independent
trials that we are modeling. The PDF for a Binomial distribution can
be written as:
$$f(k; n, p) = \binom{n}{k} p^k (1 - p)^{n-k} \tag{2.44}$$
Once again, we can use the definitions above to obtain the mean and variance of
the binomial distribution which are known to be np and np(1 − p), respectively.
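The stated mean and variance can be confirmed by summing directly over the distribution. The sketch below is illustrative: the function name `binom_pmf` and the parameter values are assumptions chosen for the example.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials, as in (2.44)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

n, p = 10, 0.3
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed from the definitions of expectation and variance
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

# Closed-form values quoted in the text: n*p and n*p*(1-p)
mean_closed_form = n * p
variance_closed_form = n * p * (1.0 - p)
```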
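$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} x^2} \tag{2.45}$$
$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} \tag{2.46}$$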
The reader should notice that if µ = 0 and σ = 1 then we recover the PDF for
the standard normal distribution presented above. Importantly, we can easily
convert from a standard normal distribution to a normal distribution with an
arbitrary mean and variance using the following well-known identity:
X = µ + σZ (2.47)
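The identity in (2.47) is how normal samples with arbitrary parameters are typically generated in simulation. The sketch below is a minimal illustration: the seed, sample size and the values of `mu` and `sigma` are assumptions, and Python's `random.gauss` supplies the standard normal draws.

```python
import math
import random

random.seed(42)            # fixed seed so the example is reproducible
mu, sigma = 0.05, 0.2      # hypothetical target mean and standard deviation
n_samples = 100_000

# Draw standard normal samples Z, then shift and scale them: X = mu + sigma * Z
z = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
x = [mu + sigma * zi for zi in z]

# Sample moments of X should be close to mu and sigma
sample_mean = sum(x) / n_samples
sample_std = math.sqrt(sum((xi - sample_mean) ** 2 for xi in x) / n_samples)
```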
$$f(k; \lambda) = \frac{\lambda^k e^{-\lambda}}{k!} \tag{2.48}$$
the distribution of the time between events. In fact, the exponential distribu-
tion measures the distribution of the time between independent events in a
Poisson process. Like a Poisson process, an exponential distribution is defined
by a single parameter, λ, which again measures the rate or intensity of event
occurrences. Larger values for the parameter $\lambda$ lead to more frequent events and
smaller times between events, on average. The PDF of an exponential distribution
can be written as:
$$f(x; \lambda) = \lambda e^{-\lambda x}, \quad x \ge 0 \tag{2.49}$$
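$$f(x; \kappa, \theta) = \frac{x^{\kappa - 1} e^{-\frac{x}{\theta}}}{\theta^{\kappa}\, \Gamma(\kappa)} \tag{2.50}$$
$$f(t; \nu) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu\pi}\, \Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + 1}{2}} \tag{2.51}$$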
Note that in this example a European call option is priced via the binomial tree.
The reader is encouraged to modify the function to handle additional payoff
structures.
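Since the listing itself is not reproduced here, the following is a minimal sketch of what a binomial tree pricer for a European call might look like. It is not the book's implementation: it assumes the Cox-Ross-Rubinstein parameterization (`u = exp(sigma * sqrt(dt))`, `d = 1/u`), and the function names, inputs and step count are illustrative. A Black-Scholes closed form is included only as a sanity check on the tree.

```python
import math

def binomial_tree_call(s0, k, r, sigma, tau, n_steps=500):
    """European call priced on a Cox-Ross-Rubinstein tree (assumed setup)."""
    dt = tau / n_steps
    u = math.exp(sigma * math.sqrt(dt))        # up-move multiplier
    d = 1.0 / u                                # down-move multiplier
    p = (math.exp(r * dt) - d) / (u - d)       # risk-neutral up probability
    disc = math.exp(-r * dt)                   # one-step discount factor

    # Terminal payoffs; j counts the number of up moves along the path
    values = [max(s0 * u ** j * d ** (n_steps - j) - k, 0.0)
              for j in range(n_steps + 1)]

    # Step backward through the tree, discounting expected values
    for step in range(n_steps, 0, -1):
        values = [disc * (p * values[j + 1] + (1.0 - p) * values[j])
                  for j in range(step)]
    return values[0]

def black_scholes_call(s0, k, r, sigma, tau):
    """Closed-form Black-Scholes call price, used here only for comparison."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s0 * cdf(d1) - k * math.exp(-r * tau) * cdf(d2)

tree_price = binomial_tree_call(100.0, 100.0, 0.02, 0.2, 1.0)
bs_price = black_scholes_call(100.0, 100.0, 0.02, 0.2, 1.0)
```

To handle another European payoff, only the line that builds the terminal `values` needs to change; the backward induction is unaffected.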
EXERCISES
2.1 Consider the binomial distribution as defined in (2.44). Derive the mean and
variance for this distribution as a function of the model parameter p.
2.2 Consider the standard normal distribution as described in (2.45). Derive the
estimates of the mean and variance.
X = µ + σZ (P2.1)
where µ and σ are known constants. Show that X is normally distributed and
derive its mean and variance.
2.4 Consider the PDF of the exponential distribution as defined in (2.49). Calculate
the cumulative distribution function for this distribution.
a. Using a binomial tree calculate the price of a three-month call option with
strike 105 on an asset that currently trades at 100 for different values of
sigma. Assume the asset pays no dividends and that interest rates are equal
to zero.
b. Suppose you buy one unit of this call option using the price obtained when
σ = 0.1 and the market rallies leading to a new price of the underlying asset
of 110. What is the value of this option now, and how much money have you
made or lost?
c. Now suppose the market instead sold off, leading to a new price for the
underlying asset of 90. How much is your option position worth now, and
what is your profit? Comment on the difference between the amount you
made when the asset price increased, and the amount you lost when the
asset price declined.
d. Finally, assume the volatility in the market increases, leading to a new σ for
the option of 0.2. What is the value of the option now, and what is our profit
or loss? Comment on the implications of this for options holders.
a. Using a binomial tree calculate the price of a one-year digital call option
which pays $1 if the asset finishes above 100 in one-year and zero otherwise.
Assume that interest rates and dividends are equal to zero and that the asset
currently trades at 100. Further, assume that the volatility parameter, σ, is
known to be 0.25.
b. Comment on the difference between the price of this digital call option and
the internal Binomial tree parameter u. Vary σ and discuss whether this
relationship changes. Why does it (or doesn’t it) change?
2.7 Consider the following stochastic differential equation which is known as the
CEV model:
Derive the PDE that corresponds to this SDE using the Feynman-Kac formula
or the replication argument presented in the chapter.
2.8 Compare the properties of the Bachelier and Black-Scholes models introduced
in the chapter. Describe the differences in model parameters and the result-
ing strengths and weaknesses. Which model would you select (and why) for
modeling the following asset classes:
• Equities
• Interest Rates
• Foreign Exchange
CHAPTER 3
Theoretical Underpinnings
of Quant Modeling: Modeling
the Physical Measure
The reader should notice that the first column in the explanatory variables has
a constant value of 1 to represent an intercept. In chapter 5 we will show how to use an
internal Python package to implement regression models in Python, and the reader
is encouraged to compare the coefficients obtained using this method to the built-in
Python package.
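The role of the constant column can be seen in a small self-contained sketch. This is not the book's listing: the function name `ols_with_intercept` and the toy data are assumptions, and the normal equations are solved by hand for the two-parameter case rather than with a linear algebra library.

```python
def ols_with_intercept(x, y):
    """OLS for y = alpha + beta * x via the normal equations (X'X) b = X'y.

    Each row of the design matrix is (1, x_i); the leading 1 is the
    constant column that represents the intercept.
    """
    X = [(1.0, xi) for xi in x]

    # Entries of the 2x2 matrix X'X and the vector X'y
    a11 = sum(row[0] * row[0] for row in X)
    a12 = sum(row[0] * row[1] for row in X)
    a22 = sum(row[1] * row[1] for row in X)
    b1 = sum(row[0] * yi for row, yi in zip(X, y))
    b2 = sum(row[1] * yi for row, yi in zip(X, y))

    # Solve the 2x2 system with the explicit inverse
    det = a11 * a22 - a12 * a12
    alpha = (a22 * b1 - a12 * b2) / det
    beta = (a11 * b2 - a12 * b1) / det
    return alpha, beta

# Toy data generated from y = 1 + 2x with no noise
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0 + 2.0 * xi for xi in x]
alpha, beta = ols_with_intercept(x, y)
```

Dropping the column of ones would force the fitted line through the origin, which is the modified regression considered in exercise 3.2.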
EXERCISES
3.1 Describe the form of market efficiency that persistent excess returns from the
following strategies would violate:
• Technical Analysis
• Value Strategy Based on Price to Earnings Ratio
• Strategy that buys stocks in advance of January and sells them immediately
following month-end
• Machine Learning NLP model based on parsing economic news
• Strategy based on the current slope of the yield curve
• ARMA mean-reversion model
3.2 Derive the OLS estimator for the following modified univariate regression equa-
tion with no intercept:
$$Y = \beta X + \epsilon \tag{P3.1}$$
3.3 Consider an investor who has access to the following investments with the spec-
ified investment properties: Calculate the expected return and volatility of the
following portfolios:
Comment on the properties of the different portfolio weighting schemes and their
sensitivity to correlation. Also discuss how any diversification benefit changes
as we modify the correlation assumptions.
a. Download data for your favorite stock ticker and clean the data as appropri-
ate.
b. Download data for the Fama-French factors from Ken French’s website.
c. Plot the contemporaneous returns of the stock against each Fama-French fac-
tor, with the stock returns on the y-axis and the factor returns on the x-axis.
Are there strong relationships between the variables? Do the relationships
appear to be linear?
d. Find the OLS estimator of the coefficients for each Fama-French factor with
the stock returns as the dependent variable. Comment on the significance
of each factor and what that implies about the properties of your favorite
stock.
e. Calculate the residuals in your regression model with the β’s fixed and
comment on whether their properties match the underlying assumptions.
Are they i.i.d.? Do they contain autocorrelation?
b. Perform a stationarity test on the SPY price series. Are the prices stationary
or non-stationary?
c. Difference the data by computing SPY log-returns. Estimate an AR(1) model
on this differenced series. Is the coefficient you obtain what you would ex-
pect? What does it imply about market efficiency, and mean-reversion in the
SPY index?
3.6 Comment on whether the following statements are true or false. Please justify
each answer.
a. Prices implied from the options market are a forecast of what the asset price
will be at expiry.
b. Drift in the risk-neutral measure incorporates risk preferences and is there-
fore usually higher than drift in the physical measure.
c. Hedging and replication arguments for options structures enable us to work
in the risk-neutral measure. In the absence of these replication arguments, we
must estimate the drift in a world that incorporates investors’ risk aversion.
d. Modeling the risk-neutral and physical measure rely on completely different
skills that have no overlap.
Ticker Description
SPY US Equity Index
EFA Developed Market Equity Index
EEM Emerging Market Equity Index
VXX VIX Futures ETN
b. Calculate the statistical properties of each ETF such as its historical return
and volatility.
c. Consider an equally weighted portfolio of the four assets. Calculate the com-
parable historical risk and return statistics and comment on any differences
in performance.
d. Using bootstrapping, simulate a set of synthetic data that last one year. Gen-
erate 10000 such synthetic data series and calculate the statistical properties
of the assets and the portfolios.
e. Plot a histogram of the distribution of returns over the bootstrapped paths.
Do they look reasonable? What are the best and worst case scenarios for the
equally weighted portfolio? How does that compare to history?
f. Compare the expected return, volatility and correlation between the assets
in the bootstrapped paths, by averaging over all paths, to the properties of
the underlying historical data. Have the returns, volatilities and correlations
converged? Why or why not?
CHAPTER 4
Python Programming
Environment
coordinate and merge their changes to a shared codebase. Additionally, these tools
provide a robust means of version control. That is, git enables us to keep track of all
historical changes made to our codebase in an organized way. It even gives us a way
to revert to a previous version if necessary. Quant teams at both buy-side and
sell-side institutions generally have strict release processes for their models. Using git
enables firms to keep historical code for all previous production versions of the code
and enables a quant to roll back if necessary.
Because of this, git has benefits for both large teams and quants working individ-
ually. Most notably, quants working individually will benefit from the version control
aspect of git, as will larger organizations with formal release processes. Teams using
git will also benefit from easier code coordination and merging of new changes to the
codebase.
Git consists of the following three layers of repositories:
• Local Working Copy
• Local Repository
• Remote Repository
The first layer is a copy of the code files on each user’s machine, which the user
will continually update as they program. This will not be viewable by any other users.
The next layer is also stored on the user’s machine, and is called the local repository.
The local repository is also inaccessible to other users, and is where a user might
store code that has been tested and seems safe but is not necessarily production ready.¹
Moving files from a local working copy to the local repository is called a commit,
and is done using the git commit command. The final layer in a git infrastructure is
the remote repository, which is accessible to all users. The remote repository is meant
to only contain changes to the code base that have been finalized and tested. Moving
files from a local repository to a remote repository is called a push, and is done via
the git push command.
Some of the most common tasks in git are to add new files to a codebase, remove
files from a codebase, to update existing files in the local or remote repository, or to
get the latest changes from the remote repository.
The most common commands used in git are:
• git status: lists all changes you have made in your local files and local reposi-
tory.
• git commit: takes changes to all local files and commits them to your local
repository (where they are still only visible to you).
• git push: pushes all committed changes from the local repository to the remote
repository.
¹ Or ready for others to view.
• git merge: merges your changes with changes from a remote repository.
The merge command in particular is how a user would synchronize changes that
they’ve made to their code if that file has been changed by another user as well.
In some cases, git may be able to complete this merge automatically. In other more
complex cases, git may not know how to conduct the merge, and some manual
intervention may be required.²
EXERCISES
What will be the result of the code? What will be printed to the console?
a. 3.5 3.5
b. 3.5 2
c. 3 3
d. 2 2
e. 3 2
f. The program will produce an exception
² In these situations, the git stash command is of great use.
4.3 Consider the following code that implements the Black-Scholes formula for a
European call:
import math
from scipy.stats import norm

def callPx(s_0, k, r, sigma, tau):
    # Black-Scholes price of a European call
    sigmaRtT = sigma * math.sqrt(tau)
    rSigTerm = (r + sigma * sigma / 2.0) * tau
    d1 = (math.log(s_0 / k) + rSigTerm) / sigmaRtT
    d2 = d1 - sigmaRtT
    term1 = s_0 * norm.cdf(d1)
    term2 = k * math.exp(-r * tau) * norm.cdf(d2)
    return term1 - term2

What happens when a negative value is passed for s_0? How about a negative
sigma? Add the proper exception handling to this function to handle these
parameter values.
4.4 What will be the output of the following Python program?
greeting = "Hello, world"
greeting[0] = "X"
print(greeting)
a. Xello, world
b. Hello, world
c. The program will produce an exception
4.5 Consider the following code containing price data for a few ETFs:
tuple1 = ("SPY", "S&P Index", 290.31)
tuple2 = ("XLF", "Financials", 28.33)
tuple3 = ("XLK", "Technology", 75.60)
tuples = [tuple1, tuple2, tuple3]
spyClose = tuples[x][y]
print(spyClose)
What values of x & y enable us to extract the closing value of the S&P (290.31)
from the variable tuples?
a. x = 2, y = 0
b. x = 1, y = 1
c. x = 0, y = 2
d. None of the above
4.6 A Fibonacci sequence is a sequence that starts with 0 and 1, and then each
number equals the sum of the previous two numbers. That is:
$$\mathrm{Fib}_n = \mathrm{Fib}_{n-1} + \mathrm{Fib}_{n-2}$$
4.7 Explain the difference between a Local Working Copy, a Local Repository and
a Remote Repository in git. What commands would you use to check in your
code to the Local and Remote Repositories respectively? Why might you check
in your code to the Local Repository but not the Remote Repository?
CHAPTER 5
Programming Concepts in
Python
Note that covariance and correlation matrices are symmetric, and positive semi-
definite. Further, we know that the diagonals in a covariance matrix consist of the
variances of each variable, while the other entries represent the covariance between
each pair of variables. Symmetry of a covariance matrix follows directly from the
properties of covariance, that is: COV(A,B) = COV(B,A).
Likewise, the diagonals of the correlation matrix consist of 1’s, which represents
the correlation of each variable with itself. All the other entries are the correlation be-
tween each variable pair. It’s easy to show that CORR(A,B) = CORR(B,A) resulting
in a symmetric matrix.
In practice, the magnitude of variance and covariance terms depends on the mag-
nitude of the underlying data. Series with large values are likely to lead to larger
variances and covariances, all else equal, and vice versa. On the other hand, the cor-
relation coefficient is normalized and is always between -1 and 1. Therefore, it is more
appropriate to compare correlations over a cross-section of variables than covariances.
In these cases, it is best practice to base our analysis on a correlation matrix.
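The properties described above can be demonstrated with a short sketch. The return series and the helper names `covariance_matrix` and `correlation_matrix` are illustrative assumptions, and plain Python lists are used instead of a numerical library to keep the example self-contained. The second series is the first scaled by 100, which changes the covariances but leaves the correlations unaffected.

```python
import math

def covariance_matrix(series):
    """Sample covariance matrix for a list of equally long data series."""
    n_obs = len(series[0])
    means = [sum(s) / n_obs for s in series]
    k = len(series)
    return [[sum((series[i][t] - means[i]) * (series[j][t] - means[j])
                 for t in range(n_obs)) / (n_obs - 1)
             for j in range(k)]
            for i in range(k)]

def correlation_matrix(cov):
    """Normalize a covariance matrix by the standard deviations on its diagonal."""
    k = len(cov)
    vols = [math.sqrt(cov[i][i]) for i in range(k)]
    return [[cov[i][j] / (vols[i] * vols[j]) for j in range(k)]
            for i in range(k)]

# Hypothetical return series; returns_b is returns_a scaled up by 100
returns_a = [0.01, -0.02, 0.015, 0.003, -0.007]
returns_b = [100.0 * r for r in returns_a]
returns_c = [0.004, 0.001, -0.012, 0.009, 0.002]

cov = covariance_matrix([returns_a, returns_b, returns_c])
corr = correlation_matrix(cov)
```

Here `cov[1][1]` is far larger than `cov[0][0]` purely because of the scaling, while `corr[0][1]` equals 1 up to rounding, which is why correlations are easier to compare across variables.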
The reader should note that in this example we assume that we begin with a
DataFrame named df_px containing price data.
This ends up being a very useful feature when working with panel data, as there
may be times when we want all elements of the panel to present as different rows,
and other times when we prefer each element in a panel to be displayed in a separate
set of columns.
The reader should notice that we first conduct an Augmented Dickey-Fuller test on
the price series. Next, we calibrate the differenced return series to an ARMA model
with 1 AR lag and 0 MA lags.
import abc

class AbstractBaseClass(abc.ABC):

    def __init__(self, r: float, q: float, tau: float):
        self.r = r
        self.q = q
        self.tau = tau

    # Subclasses must override this method before they can be instantiated
    @abc.abstractmethod
    def abstract_method(self) -> float:
        pass

class DerivedClass(AbstractBaseClass):

    def __init__(self, r: float, q: float, tau: float, sigma: float):
        super().__init__(r, q, tau)
        self.sigma = sigma

    def abstract_method(self) -> float:
        print("must define abstract_method in the base class")
        return 0.0
from enum import Enum

class BaseStochProc:
    def __init__(self, s0: float) -> None:
        pass

class BlackScholesStochProc(BaseStochProc):
    pass

class BachelierStochProc(BaseStochProc):
    pass

class StochProc(Enum):
    BASE_STOCH_PROC = 1
    BS_STOCH_PROC = 2
    BACHELIER_STOC_PROC = 3

class StochProcFactory:
    @staticmethod
    def create_stoch_proc(sp: StochProc, s0: float) -> BaseStochProc:
        # map the requested enum value to the matching class
        if sp == StochProc.BASE_STOCH_PROC:
            stoch_proc = BaseStochProc(s0)
        elif sp == StochProc.BS_STOCH_PROC:
            stoch_proc = BlackScholesStochProc(s0)
        elif sp == StochProc.BACHELIER_STOC_PROC:
            stoch_proc = BachelierStochProc(s0)
        else:
            # fail loudly instead of returning an unbound variable
            raise ValueError(f"unknown stochastic process type: {sp}")
        return stoch_proc
class DBConnection:

    db_conn = None

    @classmethod
    def get_instance(cls):
        if cls.db_conn is None:
            cls.db_conn = DBConnection()
        return cls.db_conn
Note that the member variable db_conn is defined outside the constructor. This
makes the variable a class attribute, shared by every caller, rather than an attribute
of a specific instance of the class. The variable must be defined at the class level so
that it is accessible in the get_instance class method that creates the DBConnection
object.
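As a quick check of this behavior, the short sketch below repeats the class and calls get_instance twice. The variable names `first`, `second` and `same_object` are illustrative only:

```python
class DBConnection:

    db_conn = None  # class-level attribute shared by every caller

    @classmethod
    def get_instance(cls):
        # create the single shared object on first use only
        if cls.db_conn is None:
            cls.db_conn = DBConnection()
        return cls.db_conn

first = DBConnection.get_instance()
second = DBConnection.get_instance()

# both names refer to the one object stored on the class
same_object = first is second
print(same_object)
```

Because the object is stored on the class rather than on an instance, every call after the first returns the object that already exists.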
    return arr

rand_arr = np.random.rand(10, 1)
arr = merge_sort(rand_arr, 0, len(rand_arr) - 1)
EXERCISES
5.1 You are given a DataFrame with historical closing price data for an equity index.
Write a Python function that takes this DataFrame as an input and computes
rolling overlapping returns for the equity index. This function should take the
period length of the return as an input.
5.2 You are given two DataFrames which contain historical price data for two differ-
ent ETFs. Unfortunately, the dates in the two DataFrames don’t match exactly.
Write the Python code necessary to merge these two DataFrames, returning only
dates where there are prices for both ETFs.

def multiplyObjByFive(obj):
    obj.shares = 5.0 * obj.shares

multiplyNumByFive(x)
multiplyObjByFive(pos)

print(x, pos.shares)

a. 100 500
b. 100 100
c. 500 500
d. 500 100
e. The program will produce an exception
What will be printed to the console when this application runs? What will
happen if the following code is appended to the end?
tple[0] = 7
lst[0] = 7
arr[0] = 7

print(tple)
print(lst)
print(arr)
class BasePosition:
    def __init__(self, shrs, long):
        self.shares = shrs
        self.isLong = long

    def printPos(self):
        print("calling printPos from Base")
        print(self.shares)
        print(self.isLong)

5.6 a. What is the factory pattern, and in what context should it be used?
b. Using pseudocode, write a sketch of the code needed to implement the factory
pattern.
5.7 Using Python or C++, implement an insertion sort that uses a binary search.
Discuss the strengths and weaknesses of this algorithm.
5.9 a. You are given two pandas Series with historical closing price data for two
equity indices. The equity indices are for different geographical regions and
therefore may have different holidays. Write a Python function that merges
the two data frames without losing data for any dates.
b. Write a python function that calculates rolling overlapping annualized re-
turns and rolling overlapping annualized volatilities for the merged data
frame that you constructed above. Join the index rolling return and volatil-
ity series to the price DataFrame that you created above. This function
should take the period length of the return and volatility calculations as an
input.
c. You are given an additional pandas Series identifying the market regime.
Write a Python function that joins this column to the dataframe created
above and calculates the conditional return and volatility by market regime.
Make sure that your code is generic enough to handle different numbers of
market regimes.
5.10 a. Using Python create an abstract base class named OptionsPricer. The ab-
stract base class should contain a constructor, a member function named
price, as well as any attributes you believe are appropriate.
b. Create two derived classes that inherit from your OptionsPricer class, Euro-
peanOptionPricer and LookbackOptionPricer. These classes should assume
that a simulated path for the underlying asset is provided to them.
c. Create an analogous class structure for a stochastic process consisting of a
base class and Bachelier and Black-Scholes derived classes. Each class in the
hierarchy should have a simulation method that implements the appropriate
SDE and returns a simulated path that can be read by the OptionsPricer
class hierarchy.
In addition, implement the appropriate operator to compare whether the
simulated paths in each instance of a class are the same, greater than, or
less than.
d. Write a generic global function that takes an option type and stochastic
process, as well as any additional parameters you need (i.e. strike, expiry)
and dynamically returns the price of the correct option using the desired
model.
CHAPTER 6
Working with Financial Datasets
In general, select queries will be the most common types of queries that we write,
as they enable us to access data from the database and feed it into our models.
An important part of writing SQL statements is being able to retrieve data from
the database in a manner that is accessible to our model or application. Generally,
once the select statement is run, the results are stored in a Python data frame, as
in the example above, and then used in broader calculations. The other three types
of statements, INSERT, UPDATE and DELETE statements, enable us to modify
our database rather than retrieve data. These statements may be useful in our data
collection process if we choose to aggregate data from external sources and store it
in a single centralized database, as is best practice in most production environments.
As a note to the reader, the SQL programming language varies slightly between
database engines, such as SQL Server, Oracle and PostgreSQL. In this text we fo-
cus on the syntax employed by SQL Server. Those who use other databases may
need to tweak the syntax of the queries presented in the text; however,
these differences are almost always minor. Readers are encouraged to consult the
documentation for the specific database engine that they are working with for more
information.
In the code above, the first SQL statement inserts a row into a table named
prices, the second modifies all prices for a given ticker, again in the prices table, and
the final statement deletes all rows for a given ticker. It should be noted that while
insert statements are generally seen as safe operations, because they can easily be
undone if anything unexpected occurs, this is not true for UPDATE and DELETE
statements. Once we have deleted all rows from a table, or set all prices to the same
value in a given table, the process of restoring the original data will be much more
involved. Because of this, UPDATE and DELETE commands are considered risky
and should be executed carefully. It is best practice to always run the UPDATE or
DELETE query initially as a SELECT statement to ensure that the number of rows
and content of the rows is what we expect. This should help catch situations where
we would have accidentally updated or deleted an entire table, instead of a specific
set of rows.
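The dry-run practice described above can be sketched in Python. This is only an illustration under stated assumptions: it uses the standard library's sqlite3 module with an in-memory database rather than SQL Server, and the table contents are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (ticker TEXT, quote_date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [("SPY", "2021-03-01", 389.5),
     ("SPY", "2021-03-02", 386.5),
     ("AAPL", "2021-03-01", 127.8)],
)
conn.commit()

where_clause = "ticker = ?"
params = ("SPY",)

# Step 1: run the statement as a SELECT first to see how many rows it touches
n_expected = conn.execute(
    f"SELECT COUNT(*) FROM prices WHERE {where_clause}", params
).fetchone()[0]

# Step 2: run the DELETE with the same WHERE clause and commit only if the
# number of affected rows matches the dry run; otherwise undo it
cur = conn.execute(f"DELETE FROM prices WHERE {where_clause}", params)
if cur.rowcount == n_expected:
    conn.commit()
else:
    conn.rollback()

remaining = conn.execute("SELECT ticker FROM prices").fetchall()
print(n_expected, remaining)
```

Reusing the same WHERE clause string for both statements removes the risk of the dry run and the real statement silently diverging.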
query. Next, the FROM clause defines the table that we are referencing. In future
examples we will show how to include multiple tables in a single query using a JOIN.
Finally, a WHERE clause restricts the rows that are selected from the table or tables
based on one or more conditions.
To see a select statement at work, let’s start with the following query that contains
only a SELECT and FROM clause:
SELECT ticker, price
FROM Equity_Prices
This query would return the ticker and price columns for ALL rows in the
Equity_Prices table. We may instead want to restrict to only rows from the current day,
for example, which we could do using the following WHERE clause:
SELECT ticker, price
FROM Equity_Prices
WHERE quote_date = 'today'  -- 'today' is a placeholder for the current date
WHERE clauses such as the one above can include simple equality comparisons,
but can also be arbitrarily complex. For example, we could use the results from
another query in a WHERE clause, in which case that query would be referred to as
a subquery.
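A subquery of this kind can be sketched as follows. The example is an assumption-laden illustration: it runs against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server, and the rows are invented. The inner query finds the most recent quote date and the outer query keeps only rows from that date:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Equity_Prices (ticker TEXT, quote_date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Equity_Prices VALUES (?, ?, ?)",
    [("SPY", "2021-03-01", 389.5),
     ("SPY", "2021-03-02", 386.5),
     ("AAPL", "2021-03-01", 127.8),
     ("AAPL", "2021-03-02", 125.1)],
)

# the query in parentheses is the subquery; its single result feeds the WHERE clause
query = """
SELECT ticker, price
FROM Equity_Prices
WHERE quote_date = (SELECT MAX(quote_date) FROM Equity_Prices)
"""
rows = sorted(conn.execute(query).fetchall())
print(rows)
```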
The next relevant clause in a select statement is an ORDER BY clause, which
can be used to sort the results of a query by one or more columns.
SELECT ticker, shares, mkt_value
FROM positions
WHERE quote_date = 'today'  -- 'today' is a placeholder for the current date
ORDER BY mkt_value DESC
ORDER BY clauses can reference a column by name, or by the position in which it
appears in the SELECT statement. 1 Specifically, this query will show all positions
for a given quote date, and will display them in order from highest market value to
lowest. The default sort order is ascending. The DESC keyword will sort the results
in descending order.
In this query we can see that the JOIN keyword is used to specify how the prices
and positions tables are merged in the query and in particular the content after the
ON clause defines the JOIN columns. In this example we are specifying that rows
with matching tickers and dates in the positions and prices tables will be matched.
Also note that in this query an alias is used for the table names in the FROM
clause. The aliases are defined as px and pos respectively, and can generally be spec-
ified immediately following the table name in the FROM clause. This alias can be
created as it is done in the above query, or can be specified with an optional AS
keyword. When an alias is specified, that alias can then be used in other parts of the
query, such as in the query above where the alias is used in the SELECT clause and
to define the JOIN columns. If we had not prefixed the first column in the
SELECT statement, ticker, with a table alias, then the query would no longer work:
there are multiple columns named ticker in the query, and without that prefix the
interpreter would not know which column to select.
The JOIN used in the above example is referred to as an inner join, meaning that
this query will only return rows where the ticker and date exists in both the prices
and positions tables. This means that if a row is only present in the prices table,
but has no corresponding positions row, then it will not be included in the query.
This is often desired, but in some cases we may want to return all rows in a given
table, such as the prices table, even if there is no row in the second table, in this case
positions. In SQL we can accomplish this by specifying a different JOIN type, as we
explore next.
SQL supports the following main types of JOIN’s 2 :
• INNER: Return only rows that exist in both tables.
• LEFT: Return all rows in the left table regardless of whether they exist in the
right table.
• RIGHT: Return all rows in the right table regardless of whether they exist in
the left table.
• FULL: Return all rows that exist in either the left or right table.
The default JOIN type in SQL is an INNER JOIN, meaning in cases where there
is no keyword preceding the JOIN keyword, such as in the example above, an INNER
JOIN is performed. If, in the previous example we wanted to return all prices rows
whether or not a corresponding positions row existed, we could specify the JOIN as a
LEFT JOIN rather than an INNER JOIN. If we were to use a RIGHT JOIN, then all
rows in the positions table would be returned, regardless of whether a corresponding
prices row exists.
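The difference between an INNER and a LEFT JOIN can be seen in a small sketch. It assumes an in-memory SQLite database accessed through Python's sqlite3 module, not SQL Server, with invented rows; the aliases px and pos follow the text, while the column names quote_date, price and shares are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (ticker TEXT, quote_date TEXT, price REAL)")
conn.execute("CREATE TABLE positions (ticker TEXT, quote_date TEXT, shares REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [("SPY", "2021-03-01", 389.5), ("AAPL", "2021-03-01", 127.8)],
)
# only SPY has a position, so AAPL has a price row with no matching positions row
conn.execute("INSERT INTO positions VALUES ('SPY', '2021-03-01', 100)")

inner_sql = """
SELECT px.ticker, px.price, pos.shares
FROM prices px
JOIN positions pos
  ON px.ticker = pos.ticker AND px.quote_date = pos.quote_date
ORDER BY px.ticker
"""
# same query with prices as the left table of a LEFT JOIN
left_sql = inner_sql.replace("JOIN positions", "LEFT JOIN positions")

inner_rows = conn.execute(inner_sql).fetchall()
left_rows = conn.execute(left_sql).fetchall()
print(inner_rows)
print(left_rows)
```

The inner join drops the AAPL price row, while the left join keeps it and fills the missing shares value with NULL (None in Python).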
2
Note that the four types of JOIN’s listed above are the most commonly used ones. There are
also other types of JOIN’s, such as CROSS JOIN and SELF JOIN.
In this case we are able to aggregate the shares column; however, it aggregates
the shares for all rows in the table. In many cases, we won’t want a single aggregate
value but instead will want an aggregate value for each of a set of groups, for example
by ticker, by date, or by another column. The GROUP BY clause enables us to embed
this logic in our queries. In particular, if we were to add a GROUP BY clause to the
above query, it would then return the sum of the shares column for each group. An
example of this can be found in the next query:
SELECT ticker, SUM(shares)
FROM positions
GROUP BY ticker
In this case we are grouping by ticker, meaning that the query will return a row
for each ticker in the positions table. The query will return two columns, the first
of which will contain the ticker, and the second of which will contain our aggregate,
that is, the sum of the shares for that specific ticker.
If we want to restrict the groups that are returned in a query that uses a GROUP
BY clause, we can use the HAVING clause. A HAVING clause restricts the groups
returned in a query. This contrasts with a WHERE clause, which restricts the rows
that are used in the underlying calculation, rather than the groups that are displayed
after the aggregation is performed.
The following sample SQL statement shows how we can utilize a HAVING
clause to restrict the groups that are returned:
SELECT ticker, SUM(shares) AS shrs
FROM positions
GROUP BY ticker
HAVING SUM(shares) > 0
In particular, in this example we used the HAVING clause to show only net long
positions, that is, cases where the total number of shares is greater than zero. Note
that SQL Server does not allow a column alias defined in the SELECT clause to be
referenced in the HAVING clause, so the aggregate is repeated there.
6.1.5 Coding Example: Filling Missing Data with Python Data Frames
The following example shows how we can use the functionality embedded in Python’s
Data Frames to fill missing data via interpolation or filling forward:
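A minimal sketch of both approaches is shown below. The price values and the names `df_px`, `df_ffill` and `df_interp` are hypothetical; `ffill` and `interpolate` are standard pandas DataFrame methods:

```python
import numpy as np
import pandas as pd

# hypothetical price series with gaps on three business days
dates = pd.date_range("2021-03-01", periods=6, freq="B")
df_px = pd.DataFrame(
    {"px": [100.0, np.nan, np.nan, 103.0, np.nan, 105.0]}, index=dates
)

# filling forward repeats the last observed price until a new one arrives
df_ffill = df_px.ffill()

# linear interpolation draws a straight line between the surrounding observations
df_interp = df_px.interpolate(method="linear")

print(pd.concat([df_px, df_ffill, df_interp], axis=1, keys=["raw", "ffill", "interp"]))
```

Filling forward uses only past information, whereas interpolation also uses the next observed value, so interpolated history should be treated with care in any backtest.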
The reader should notice that for different applications of this technique, different
time periods may be used to calculate the β. When doing backtesting or making
predictions, it is important to use historical data only because we do not want to
include any information in the future. That is, we want to ensure that we are not
looking ahead. For risk management purposes, however, it may be better to use all
available data. This is because risk management calculations are less sensitive to
look-ahead bias, and the additional data can help us recover a more stable estimate
of the coefficient β.
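The distinction between the two uses of β can be sketched as follows. The return series are simulated, with a true β of 1.2 chosen arbitrarily, and the names `beta_full` and `beta_hist` are illustrative. The sketch assumes β is estimated as the covariance with the market divided by the market variance:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 500
mkt = pd.Series(rng.normal(0.0, 0.01, n))                 # hypothetical market returns
asset = 1.2 * mkt + pd.Series(rng.normal(0.0, 0.005, n))  # hypothetical asset returns

# risk management style estimate: use all available data
beta_full = asset.cov(mkt) / mkt.var()

# backtesting style estimate: an expanding window, lagged by one period so that
# the beta available at time t uses data up to t-1 only (no look-ahead)
beta_hist = (
    asset.expanding(min_periods=60).cov(mkt) / mkt.expanding(min_periods=60).var()
).shift(1)

print(beta_full, beta_hist.iloc[-1])
```

The one-period shift is what prevents look-ahead: without it, the estimate at time t would already include the return realized at time t.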
6.1.7 Coding Example: Using Bootstrapping to Create Synthetic History for a set of
Indices
In this coding example, we use the historical data of SPY, AAPL and GOOG to show
how to generate paths of synthetic returns via bootstrapping techniques:
from pandas_datareader import data
import pandas as pd
import random

df_price = data.get_data_yahoo(['SPY', 'AAPL', 'GOOG'], start='2020-03-01', end='2021-02-28')['Adj Close']
df_ret = df_price.pct_change().dropna()
N = df_ret.shape[0]
m = 10  # length of the missing data period
# bootstrapping draws historical dates at random, with replacement
boot_index = random.choices(range(N), k=m)
# the 10 business days from 2021-03-01 to 2021-03-12 index the synthetic returns
df_boot = df_ret.iloc[boot_index].set_index(pd.date_range(start='2021-03-01', end='2021-03-12', freq='B'))
EXERCISES
Ticker   Description
SPY      US Equity
TLT      20+ Year Treasuries
GOVT     All Maturity Treasuries
DBC      Commodities
HYG      High Yield
EEM      Emerging Market Equity
EAFE     Europe and East Asia Equity
AGG      US AGG
IAGG     International AGG
EMB      EM Debt
ACWI     All Country Equities
IWM      US Small Cap Equities
a. Download data from Yahoo Finance for the sector ETFs above.
b. Download data for the Fama-French factors from Ken French’s website.
c. Check the Yahoo Finance data for splits and other anomalies.
d. Analyze both datasets for outliers using the methods discussed in this chapter.
Identify any data points that you believe might be suspicious.
e. Regress the ETF returns against the Fama-French Factors. Which sectors
have the highest exposure to the Fama-French factors?
f. Using the regression above, plot the intercepts in the regression. Which in-
tercept coefficients are biggest (smallest)? What conclusions can you draw based
on the intercept?
6.2 You are given a data set of sector ETF data which is missing multiple year-long
periods for several ETFs. Describe the best approach to handle this missing
data and explain what impact filling the missing data has on the distribution
of the data as well as the assumptions implicit in your method. Write a piece
of code that will find all such gaps in the data and fill them using the method
described.
6.3 Write a function that will check the following options data for arbitrage:
How would you propose any arbitrage opportunities be handled?
6.4 Download data for a set of 20 tickers for 20 years. Delete one year’s worth of
data for each ticker (but not the same year). Use interpolation, regression, and
k-means clustering to fill in the missing data. Calculate the distribution of the
returns before and after the missing data was removed and replaced. Comment
on how each method impacts the empirical distribution.
6.5 Describe the option portfolio that you would buy or sell if you observe that
options prices are concave with respect to strike at strike K. Show why this
portfolio leads to an arbitrage opportunity.
6.6 Download data for the following ETFs using the data source of your choice:
• UVXY
• VXX
• SVXY
Write an algorithm for finding outliers for each of the three ETFs and comment
on whether any outliers you find are legitimate data points.
CHAPTER 7
Model Validation
As a note to the reader, in order for this unittest.main function call to run, the
above code needs to be run as a Python file or from the Python console. Alterna-
tively, if the reader is using Jupyter Notebook then they can simply add the following
arguments to the unittest.main call:
unittest.main(argv=['first-arg-is-ignored'], exit=False)
    for i in range(N):
        # n Brownian increments, each with variance dt
        dwt = np.random.normal(loc=0, scale=np.sqrt(dt), size=n)
        dSt = r * dt + sigma * dwt
        St = S0 + sum(dSt)
        St_list[i] = St
    stock_mean = np.mean(St_list)
    stock_var = np.var(St_list)
    return St_list, stock_mean, stock_var

class TestSimulation(unittest.TestCase):

    # the setUp function prepares the framework for the test
    def setUp(self):
        self.S0 = 20
        self.r = 0.05
        self.sigma = 0.5
        self.t = 1
        self.N = 1000
        self.n = 100
        self.St_list, self.stock_mean, self.stock_var = stock_price_simu(
            self.S0, self.r, self.sigma, self.t, self.N, self.n)

    def test_valid_t(self):
        self.assertTrue(self.t > 0)
        self.assertFalse(self.t <= 0)

    def test_valid_sigma(self):
        self.assertTrue(self.sigma > 0)
        self.assertFalse(self.sigma <= 0)

    def test_valid_N(self):
        self.assertTrue(self.N > 0)
        self.assertFalse(self.N <= 0)

    def test_valid_n(self):
        self.assertTrue(self.n > 0)
        self.assertFalse(self.n <= 0)

unittest.main()
The reader is encouraged to refer back to section 7 and recall the appro-
priate method(s) for calling the unittest.main function.
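The tests above validate inputs. A natural complement, sketched below, is to test the simulation output against known moments. This is an illustrative sketch rather than the text's own code: the function `simulate_terminal_prices`, its seed argument, and the tolerances are assumptions. It uses the same arithmetic dynamics as the listing above, for which the terminal value has mean S0 + r t and variance sigma^2 t:

```python
import unittest
import numpy as np

def simulate_terminal_prices(S0, r, sigma, t, N, n, seed=0):
    """Simulate N terminal values of S_t = S0 + r*t + sigma*W_t using n steps."""
    rng = np.random.default_rng(seed)
    dt = t / n
    dW = rng.normal(0.0, np.sqrt(dt), size=(N, n))
    return S0 + (r * dt + sigma * dW).sum(axis=1)

class TestSimulationMoments(unittest.TestCase):

    def setUp(self):
        self.S0, self.r, self.sigma, self.t = 20.0, 0.05, 0.5, 1.0
        self.N, self.n = 20000, 50
        self.St = simulate_terminal_prices(
            self.S0, self.r, self.sigma, self.t, self.N, self.n)

    def test_mean(self):
        # sample mean should sit within five standard errors of S0 + r*t
        se = self.sigma * np.sqrt(self.t / self.N)
        self.assertAlmostEqual(self.St.mean(), self.S0 + self.r * self.t, delta=5 * se)

    def test_variance(self):
        # sample variance should sit within 5% of sigma^2 * t
        theo = self.sigma ** 2 * self.t
        self.assertAlmostEqual(self.St.var(ddof=1), theo, delta=0.05 * theo)

# run the tests programmatically so the sketch also works inside a notebook
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSimulationMoments)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Fixing the seed makes the test deterministic, and expressing tolerances in standard errors keeps it meaningful if the number of paths changes.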
EXERCISES
7.2 List three unit tests that you would add to a piece of code that downloads,
parses and cleans options data.
7.3 List three ways to simplify a problem to be more tractable so that we would
know the solution.
7.4 Consider the following implemented model. Conduct an independent code re-
view and describe how you would find all bugs, identify all model assumptions
and determine if the model is properly implemented.
7.5 Using the unittest module add unit tests to the building blocks of a model.
Describe your criteria for where to add unit tests.
II
Options Modeling
CHAPTER 8
Stochastic Models

# expiry derivatives
expiry_derivatives = ((price_surface[2:, :] - price_surface[:-2, :]) /
                      (expiries[2:] - expiries[:-2]).reshape(-1, 1))[:, :-2]

# strike second derivative
strike_first_derivatives = (price_surface[:, 1:] - price_surface[:, :-1]) / (strikes[1:] - strikes[:-1])
strike_second_derivatives = ((strike_first_derivatives[:, 1:] - strike_first_derivatives[:, :-1]) /
                             (strikes[1:-1] - strikes[:-2]))[1:-1, :]

variance_surface = expiry_derivatives / (0.5 * (strikes[:-2]**2) * strike_second_derivatives)
vol_surface = np.sqrt(variance_surface)
It should be emphasized that this code assumes unique option prices are available
at all strikes, and that a correspondingly fine grid of prices and strikes are provided
to the function. The reader is encouraged to try this methodology in different market
contexts, as well as with data of different quality in order to analyze how the solution
is impacted.
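One way to analyze the quality of such a calculation is to apply it to prices where the answer is known. The sketch below is a self-contained variant, not the listing above: it assumes zero rates and dividends, a uniform strike grid, and central differences, and the function names `bs_call` and `dupire_local_vol` are hypothetical. With Black-Scholes prices generated from a flat volatility of 20%, the recovered local volatility should be close to 20% everywhere:

```python
import math
import numpy as np

def bs_call(s0, strikes, expiries, sigma):
    """Black-Scholes call prices (zero rates and dividends) on an expiry x strike grid."""
    K = strikes.reshape(1, -1)
    T = expiries.reshape(-1, 1)
    d1 = (np.log(s0 / K) + 0.5 * sigma ** 2 * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    cdf = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))
    return s0 * cdf(d1) - K * cdf(d2)

def dupire_local_vol(price_surface, strikes, expiries):
    """Local vol on the interior grid nodes, assuming r = q = 0 and uniform strikes."""
    # central difference in expiry, evaluated at interior strikes
    dT = (expiries[2:] - expiries[:-2]).reshape(-1, 1)
    dC_dT = (price_surface[2:, 1:-1] - price_surface[:-2, 1:-1]) / dT
    # central second difference in strike, evaluated at interior expiries
    dK = strikes[1] - strikes[0]
    d2C_dK2 = (price_surface[1:-1, 2:] - 2.0 * price_surface[1:-1, 1:-1]
               + price_surface[1:-1, :-2]) / dK ** 2
    local_var = dC_dT / (0.5 * strikes[1:-1] ** 2 * d2C_dK2)
    return np.sqrt(local_var)

strikes = np.linspace(80.0, 120.0, 41)
expiries = np.linspace(0.5, 1.5, 41)
prices = bs_call(100.0, strikes, expiries, sigma=0.2)
local_vol = dupire_local_vol(prices, strikes, expiries)
print(local_vol.min(), local_vol.max())
```

Both derivatives are evaluated at the same interior nodes, so the numerator and denominator of the ratio refer to the same strike and expiry.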
EXERCISES
8.1 Derive the Black-Scholes formula for a European put option
8.3 Use L’Hopital’s rule to estimate the normal volatility in the SABR model for
an at-the-money option for a given set of parameters.
8.4 Find a parameterization of the Heston, SABR and Variance Gamma models
that enables us to recover the Black-Scholes model.
8.5 Download implied volatility data for VIX and calculate the local volatility func-
tion, σ(K, T ). Discuss how you used interpolation to compute the relevant par-
tial derivatives and comment on the challenges of applying the local volatility
model in practice.
• α = 0.25
• β = 0.5
• ρ = 0.75
• σ0 = 0.075
a. Apply the SABR asymptotic formula to obtain a price for a three-month call
option of an asset whose price is 100 and whose strike price is 105. Assume
that the risk-free rate is 1.25% and the asset pays no dividends.
8.9 Consider the CEV model with σ = 0.2 and β = 0.5. Assume that the underlying
asset is currently trading at 150, the risk-free rate is 5% and the asset pays 2.8%
in dividends.
b. Compare these prices to those under the Black-Scholes model with σ = 0.2.
Which one do you expect to be higher?
c. Explain how you would find the equivalent Black-Scholes implied volatility
corresponding to the CEV model defined above. Using this σ̃ compute prices
under the Black-Scholes model. Which model leads to higher prices now?
d. The CEV model above defines the price for any European option for all
strikes and all expiries. Use this function to calculate the local volatility
function, σ(K, T ) under the assumption that these dynamics hold.
CHAPTER 9
Options Pricing Techniques for European Options
9.0.4 Coding Example: Finding the Minimum Variance Portfolio of Two Assets
In this coding example, we show how to use the above equations to find a minimum
variance portfolio for two assets. To verify our result, we also show how this same
problem can be computed using Python’s optimization libraries:
def min_var_portfolio(sigma_1, sigma_2, rho):
    numerator = sigma_2**2 - sigma_1 * sigma_2 * rho
    denominator = sigma_1**2 + sigma_2**2 - 2 * sigma_1 * sigma_2 * rho
    w_1 = numerator / denominator
    return (w_1, 1 - w_1)

min_var_portfolio(0.25, 0.3, -0.4)

# (0.5647058823529412, 0.43529411764705883)
Readers should obtain the same results regardless of which methodology they
choose.
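A numerical cross-check might look like the sketch below. It assumes SciPy's `minimize` function with the SLSQP method is the optimization routine in question, which the text does not confirm, and it reuses the inputs from the closed-form example:

```python
import numpy as np
from scipy.optimize import minimize

sigma_1, sigma_2, rho = 0.25, 0.3, -0.4

# covariance matrix implied by the two volatilities and the correlation
cov = np.array([
    [sigma_1 ** 2, rho * sigma_1 * sigma_2],
    [rho * sigma_1 * sigma_2, sigma_2 ** 2],
])

def portfolio_variance(w):
    return w @ cov @ w

# minimize the variance subject to the weights summing to one
res = minimize(
    portfolio_variance,
    x0=np.array([0.5, 0.5]),
    method="SLSQP",
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    tol=1e-12,
)
w_opt = res.x

# closed-form weight of the first asset, for comparison
w_closed = (sigma_2 ** 2 - rho * sigma_1 * sigma_2) / (
    sigma_1 ** 2 + sigma_2 ** 2 - 2 * rho * sigma_1 * sigma_2
)
print(w_opt, w_closed)
```

The same setup extends to more than two assets by enlarging the covariance matrix, which is where a numerical optimizer becomes more convenient than a closed-form expression.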
The reader should notice that a separate set of parameters is calibrated per
expiry in the code. This is because the parameterization of the CEV model is
unable to fit the term structure of volatility but is able to fit many observed volatility
skews.
EXERCISES
9.1 Option Pricing via FFT Techniques The Heston Model is defined by the
following system of stochastic differential equations:
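\begin{align*}
dS_t &= r S_t \, dt + \sqrt{\nu_t} \, S_t \, dW_t^1 \\
d\nu_t &= \kappa(\theta - \nu_t) \, dt + \sigma \sqrt{\nu_t} \, dW_t^2 \\
\mathrm{Cov}(dW_t^1, dW_t^2) &= \rho \, dt
\end{align*}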
Assume the risk-free rate is 2%, the initial asset price is 250 and that the asset
pays no dividends.
ii. Using the results above, choose a reasonable value of α and calculate
the price of the same European Call with various values of N and ∆k
(or equivalently N and B). Comment on what values seem to lead to
the most accurate prices, and the efficiency of each parameterization.
iii. Calculate the price of a European Call with strike 260 using various
values of N and ∆k (or N and B). Do the same sets of values for N ,
B and ∆k produce the best results? Comment on any differences that
arise.
b. Exploring Heston Parameters Assume the risk-free rate is 2.5%, the
initial asset price is 150 and that the asset pays no dividends.
σ = 0.4
ν0 = 0.09
κ = 0.5
ρ = 0.25
θ = 0.12
9.2 Consider a one-year European call option with strike 125 and starting asset price
of 100. Assume that the risk-free rate is 1.3% and the asset pays dividends at a
rate of 2.75%.
a. Value this option using FFT pricing techniques under Merton’s jump diffu-
sion model with the following parameters:
• σ = 0.25
• λ = 0.05
• α = −0.025
• γ = 0.1
Try many values of α as well as N and B. Comment on what the optimal
values for these parameters, as well as ∆k and ∆ν appear to be.
b. Comment on the impact of the jump component. Do they raise or lower the
price of the option compared to the equivalent process without jumps? Why?
9.3 Recall that the Heston Model is defined by the following system of SDE’s:
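\begin{align*}
dS_t &= r S_t \, dt + \sqrt{\nu_t} \, S_t \, dW_t^1 \\
d\nu_t &= \kappa(\theta - \nu_t) \, dt + \sigma \sqrt{\nu_t} \, dW_t^2 \\
\mathrm{Cov}(dW_t^1, dW_t^2) &= \rho \, dt \tag{P9.1}
\end{align*}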
See the spreadsheet SPY data.csv for options data. r = 1.5%, q = 1.77%,
S0 = 267.15.
Consider the given market prices and the following equal weight least squares
minimization function:
\[ \vec{p}_{\min} = \min_{\vec{p}} \sum_{\tau,K} \left( \tilde{c}(\tau, K, \vec{p}) - c_{\tau,K} \right)^2 \tag{P9.2} \]
where $\tilde{c}(\tau, K, \vec{p})$ is the FFT-based model price of a call option with expiry τ
and strike K.
a. Check the option prices for arbitrage. Are there arbitrage opportunities at
the mid? How about after accounting for the bid-ask spread? Remove any
arbitrage violations from the data.
b. Using the FFT Pricing code from your last homework, find the values of κ,
θ, σ, ρ and ν0 that minimize the equal weight least squared pricing error.
You may choose the starting point and upper and lower bounds of the opti-
mization. You may also choose whether to calibrate to calls, puts, or some
combination of the two.
Note also that you are given data for multiple expiries, each of which should
use the same parameter set, but will require a separate call to the FFT algo-
rithm.
c. Try several starting points and several values for the upper and lower bounds
of your parameters. Does the optimal set of parameters change? If so, what
does this tell you about the stability of your calibration algorithm?
d. Instead of applying an equal weight to each option consider the following
function which makes the weights inversely proportional to the quoted bid-ask
spread:
\[ \omega_{\tau,K} = \frac{1}{c_{\tau,K,\text{ask}} - c_{\tau,K,\text{bid}}} \tag{P9.3} \]
\[ \vec{p}_{\min} = \min_{\vec{p}} \sum_{\tau,K} \omega_{\tau,K} \left( \tilde{c}(\tau, K, \vec{p}) - c_{\tau,K} \right)^2 \tag{P9.4} \]
where as before $\tilde{c}(\tau, K, \vec{p})$ is the FFT-based model price of a call option with
expiry τ and strike K. Repeat the calibration with this objective function and
comment on how this weighting affects the optimal parameters.
9.4 Consider the process of calibrating a Variance Gamma model to market data
using FFT:
a. Download options data for JC Penney (ticker: JCP) at 1m, 3m, 6m and 1y
expiries. Check the data for arbitrage and resolve any arbitrage opportunities
using the technique of your choice.
b. Using the data above, extract and plot an implied volatility smile for each
expiry.
c. For each expiry, calibrate the optimal set of Variance Gamma parameters
{θ, σ, ν}. Comment on how you formulated your objective function and how
you chose the weights to apply to each option.
d. Try several values for the technique parameters as well as variations of the
optimization formulation (i.e. modify the weighting scheme, starting guess
and upper and lower bounds for each parameter). What do you think is the
optimal setup?
e. Using the calibrated parameters, calculate the price of at-the-money digital
put options at 1m, 3m, 6m and 1y expiries. Comment on the difference in
pricing of the digitals as expiry increases. How does it relate to the shape of
the volatility smiles you plotted above?
f. Re-price all calibrated instruments after the following shifts to the Variance
Gamma model parameters:
• σ̂ = σ + 0.05
• θ̂ = θ − 0.25
• ν̂ = ν + 0.025
Plot the updated implied volatility surfaces for 1,3,6 and 12 months. Com-
ment on the impact of this shift to the volatility smile.
9.5 Consider the following risk-neutral pricing formula for a European Digital Call:
\[ c_0 = \tilde{E}\left[ e^{-rT} 1_{\{S_T > K\}} \right] \]
a. Derive the pricing formula that can be used to price European Digital Call
options using the characteristic function of a stochastic process.
b. Describe two approaches for calculating the final integral and the trade-offs
between them.
c. Discuss how you would use this pricing formula to create an approximation
for an American Digital put option. The payoff for an American Digital put
option can be written as:
\[ p_0 = \tilde{E}\left[ e^{-rT} 1_{\{M_T < K\}} \right] \]
where MT is the minimum value for the asset over the observation period.
Comment on what factors would go into the accuracy of your approximation
and whether the estimate would be reliable.
d. Consider the following strike spacing grid used in the FFT method:
\[ k_m = \beta + (m - 1)\Delta k \quad \text{for } m = 1, \ldots, N = 2^n. \]
CHAPTER 10
Options Pricing Techniques for Exotic Options
    plt.ylabel("frequency per " + str(n) + " trials")
    plt.show()

    return price_vec
Note that in this coding example we rely on the reflection approach to handling
potential negative volatilities.
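The reflection approach can be sketched as follows, assuming Heston dynamics and a plain Euler discretization. The function name, its signature and the parameter values are hypothetical. Whenever a discretized variance step lands below zero, it is reflected back to its absolute value before the next step:

```python
import numpy as np

def simulate_heston_reflection(s0, v0, r, kappa, theta, sigma, rho,
                               T, n_steps, n_paths, seed=0):
    """Euler scheme for Heston dynamics with reflection applied to the variance."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    s = np.full(n_paths, float(s0))
    v = np.full(n_paths, float(v0))
    for _ in range(n_steps):
        # correlated standard normal draws for the asset and variance shocks
        z1 = rng.standard_normal(n_paths)
        z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n_paths)
        s = s + r * s * dt + np.sqrt(v * dt) * s * z1
        v_next = v + kappa * (theta - v) * dt + sigma * np.sqrt(v * dt) * z2
        # reflection: a negative variance is replaced by its absolute value
        v = np.abs(v_next)
    return s, v

s_T, v_T = simulate_heston_reflection(
    s0=100.0, v0=0.09, r=0.02, kappa=0.5, theta=0.12, sigma=0.4, rho=0.25,
    T=1.0, n_steps=100, n_paths=20000)
print(s_T.mean(), v_T.min())
```

An alternative is truncation, which sets negative values to zero instead; the two choices introduce different discretization biases, so it is worth comparing both against a known benchmark price.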
10.0.3 Coding Example: American Options Pricing via the Black-Scholes PDE
In this example, we calculate the price of an American call by solving the Black-
Scholes PDE using the explicit discretization scheme:
import numpy as np

def solve_bs_pde(s0, smax, k, T, N, M, sig, r, p):

    ht = T / N      # time step
    hs = smax / M   # asset price step
    t = np.arange(0, T + ht, ht)
    s = np.arange(0, smax + hs, hs)

    # diagonal, lower and upper coefficients of the explicit update matrix
    d = 1 - (sig**2) * (s**2) * ht / (hs**2) - r * ht
    l = 0.5 * (sig**2) * (s**2) * ht / (hs**2) - r * s * ht / (2 * hs)
    u = 0.5 * (sig**2) * (s**2) * ht / (hs**2) + r * s * ht / (2 * hs)

The solve_bs_pde function returns the update matrix A with its eigenvalues, and
the price vector. The price vector contains the option price at time 0 for each
underlying asset price on the grid, so we can plug in the current value of the asset
price to obtain our estimated model price via a PDE approximation.
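For readers who want a complete, runnable version of the explicit scheme, a compact sketch is given below. It is not the solve_bs_pde function itself: the name `american_call_explicit` is hypothetical, it applies the update row by row instead of building the matrix A, it assumes no dividends, and the boundary conditions (zero at s = 0, and smax minus the discounted strike at s = smax) are assumptions. The coefficients l, d and u match those defined in the listing above:

```python
import numpy as np

def american_call_explicit(s0, smax, k, T, N, M, sig, r):
    """American call via an explicit finite difference scheme (no dividends)."""
    ht = T / N
    hs = smax / M
    s = np.linspace(0.0, smax, M + 1)
    payoff = np.maximum(s - k, 0.0)

    # explicit-scheme coefficients on the interior grid nodes
    si = s[1:-1]
    l = 0.5 * sig ** 2 * si ** 2 * ht / hs ** 2 - r * si * ht / (2 * hs)
    d = 1.0 - sig ** 2 * si ** 2 * ht / hs ** 2 - r * ht
    u = 0.5 * sig ** 2 * si ** 2 * ht / hs ** 2 + r * si * ht / (2 * hs)

    c = payoff.copy()  # option values at expiry
    for j in range(1, N + 1):
        tau = j * ht  # time remaining to expiry after this backward step
        new = np.empty_like(c)
        new[1:-1] = l * c[:-2] + d * c[1:-1] + u * c[2:]
        new[0] = 0.0                              # worthless when the asset is at zero
        new[-1] = smax - k * np.exp(-r * tau)     # deep in-the-money boundary
        c = np.maximum(new, payoff)               # early exercise check
    # interpolate the time-0 price vector at the current asset price
    return float(np.interp(s0, s, c))

price = american_call_explicit(s0=100.0, smax=200.0, k=100.0, T=1.0,
                               N=1000, M=100, sig=0.2, r=0.05)
print(price)
```

Without dividends the American call should be worth the same as the European call, so the result can be compared with the Black-Scholes value of roughly 10.45 for these inputs. The explicit scheme is only conditionally stable, so N must be large enough relative to M to keep the diagonal coefficient d positive.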
EXERCISES
10.1 Match the market instrument on the right with the most appropriate pricing
technique on the left.
Quadrature    European Call with Heston dynamics
FFT           American Option with Heston dynamics
PDE           Knock-out options with Variance Gamma dynamics
Simulation    Call option contingent on European trigger with Black-Scholes dynamics
You are asked to use this SDE to write a simulation algorithm to price an
American upside one touch option. Recall that the payoff of an American one
touch is defined as:
$$c_0 = \tilde{E}\left[ e^{-rT} \, 1_{\{M_T > K\}} \right]$$
where $M_T$ is the maximum value of the asset over the period.
a. (2.5 points) List the set of parameters that you will have in your algorithm
and describe their meaning.
b. (7 Points) Write a piece of pseudo-code that you can use to simulate from
the given stochastic process.
c. (7 Points) Write a piece of pseudo-code that will define the payoff function
for your exotic option.
d. (3.5 points) Describe three unit tests that you would create to ensure that
your model prices are correct.
You are asked to use this SDE to write a simulation algorithm to price an
American up-and-out call option. Recall that the price of an American up-and-
out call option is defined as:
$$c_0 = \tilde{E}\left[ e^{-rT} (S_T - K)^+ \, 1_{\{M_T < B\}} \right]$$
where $M_T$ is the maximum value of the asset over the period and B is the
barrier level.
a. (2 points) List the set of model parameters and describe their meaning.
Explain how to simplify this model to the Bachelier model.
b. (2 points) Comment on whether you think this would be a good model for
an equity index? How about a rates or volatility model? Why or why not?
c. (3.5 Points) Write a piece of code in Python or C++ that you can use to
simulate a path from the given stochastic process.
d. (3.5 Points) Write a piece of code in Python or C++ that will define the
price function of your exotic option, given a simulated path.
e. (2 points) Name the simulation parameters that you must choose and de-
scribe how you would set them optimally.
f. (2 points) Describe three unit tests that you would create to ensure that
your model prices are correct.
where r is the risk-free rate and q is the dividend rate. Then the Black-Scholes
PDE for a derivative security c is:
$$\frac{\partial c}{\partial t} + \frac{1}{2} \sigma^2 s^2 \frac{\partial^2 c}{\partial s^2} + (r - q) s \frac{\partial c}{\partial s} - rc = 0 \tag{P10.2}$$
a. Write a discretization of this equation which would lead to an explicit
scheme.
b. Write the boundary conditions for an American Digital Put which pays $1
if the asset touches a downside barrier K, at any time.
c. Discuss how you select the time and space grid.
d. If you know the price of the European Digital Put how could you create a
rough estimate of the American Digital Put price? What assumptions impact
the accuracy of this approximation?
10.5 Consider a generic function of two variables f (x, y). We would like to use finite
differences to approximate the derivatives of f (x, y).
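10.6 For $\lambda_n, \lambda_p > 0$ with $\lambda_n \neq \lambda_p$, consider the following probability density function
for a random variable X:
$$f(x) = \begin{cases} \frac{1}{2} \lambda_n e^{\lambda_n x}, & x < 0 \\ \frac{1}{2} \lambda_p e^{-\lambda_p x}, & x \geq 0 \end{cases}$$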
10.7 Consider an Asian option whose risk-neutral pricing formula can be written as:
$$c_0 = \tilde{E}\left[ e^{-rT} (\bar{S} - K)^+ \right]$$
a. (3 points) List all parameters in the Heston model and describe intuitively
what they represent.
b. (6 points) Using the Heston model and the Euler discretization scheme for
Monte-Carlo, compute an approximation for the Asian option assuming the
following parameters:
σ0 = 0.05
κ=1
θ = 0.1
ρ = −0.5
ξ = 0.25
∆t = 1/252
c. (3 points) Run your approximation with many different numbers of simulated
paths. Estimate the number of draws that are needed for the price of the
option to converge. Comment on the difference between this and the theoretical
convergence rate.
d. (5 points) Consider Z and $-\frac{1}{2} Z$ as antithetic variables, where Z is a standard
normal random variable. Calculate the mean and variance of an estimator
using these pairs of variables. Is this approach unbiased? Does it reduce the
variance? By how much?
e. (3 points) Describe what you think would be an ideal control variate for the
Asian option priced above. Justify your answer with either a theoretical or
empirical argument.
f. (5 points) Reprice the Asian option above using the control variate of your
choice. Does the simulated price converge faster? Why or why not?
10.8 The SDE for the CEV model for β > 0 is:
a. (3 Points) Write the PDE that would result from using the CEV model.
b. (3 Points) What are the parameters in the model and what is their interpre-
tation?
c. (2 Points) What parameters would lead this model to match the dynamics
of the Black-Scholes model?
d. (6 Points) Write a discretization of this equation which would lead to an
implicit scheme.
e. (6 Points) Discuss how you select the time and space grid.
10.9 Suppose that the underlying security SPY evolves according to the Heston
model. That is, we know its dynamics are defined by the following system of
SDEs:
$$dS_t = (r - q) S_t \, dt + \sqrt{\nu_t} \, S_t \, dW_t^1$$
$$d\nu_t = \kappa(\theta - \nu_t) \, dt + \sigma \sqrt{\nu_t} \, dW_t^2$$
$$\mathrm{Cov}(dW_t^1, dW_t^2) = \rho \, dt \tag{P10.3}$$
You know that the last closing price for SPY was 282. You also know that the
dividend yield for SPY is 1.77% and the corresponding risk-free rate is 1.5%.
Using this information, you want to build a simulation algorithm to price a
knock-out option on SPY, where the payoff is a European call option contingent
on the option not being knocked out, and the knock-out is an upside barrier
that is continuously monitored. We will refer to this as an up-and-out call.
This payoff can be written as:
$$c_0 = E\left[ (S_T - K_1)^+ \, 1_{\{M_T < K_2\}} \right] \tag{P10.4}$$
where $M_T$ is the maximum value of S over the observation period, and $K_1 < K_2$
are the strikes of the European call and the knock-out trigger respectively.
a. Find a set of Heston parameters that you believe govern the dynamics of
SPY. You may use results from a previous Homework, do this via a new
calibration, or some other empirical process. Explain how you got these and
why you think they are reasonable.
b. Choose a discretization for the Heston SDE. In particular, choose the time
spacing, ∆T as well as the number of simulated paths, N . Explain why you
think these choices will lead to an accurate result.
c. Write a simulation algorithm to price a European call with strike K = 285
and time to expiry T = 1. Calculate the price of this European call using
FFT and comment on the difference in price.
d. Update your simulation algorithm to price an up-and-out call with T = 1,
K1 = 285 and K2 = 315. Try this for several values of N . How many do you
need to get an accurate price?
e. Re-price the up-and-out call using the European call as a control variate.
Try this for several values of N . Does this converge faster than before?
f. Calculate the delta, vega and gamma of both a European call and our up-
and-out option. What variance reduction techniques can we apply to this
calculation? Comment on the difference in the Greeks between the European
call and the up-and-out option. Do they make sense to you?
10.10 Suppose that the underlying security SPY evolves according to standard
geometric Brownian motion.
Then its derivatives obey the Black-Scholes equation:
$$\frac{\partial c}{\partial t} + \frac{1}{2} \sigma^2 s^2 \frac{\partial^2 c}{\partial s^2} + rs \frac{\partial c}{\partial s} - rc = 0 \tag{P10.5}$$
Use SPY’s closing price of 380.
We are going to find the price of an American call spread with the right to
exercise early. Suppose the two strikes of the call spread are K1 = 385 and
K2 = 390 and that the risk-free rate is 1.5%. Consider a three-month American
call spread.
a. Explain why this instrument is not the same as being long an American call
with strike 385 and short an American call with strike 390, both with expiry
in three-months.
b. Let’s assume that we are not able to find σ by calibrating to the European
call spread price and must find it by other means. Find a way to pick the σ,
explain why you chose this method, and then find the σ.
c. Set up an explicit Euler discretization of (P10.5). You will need to make
decisions about the choice of $s_{max}$, $h_s$, $h_t$, etc. Please explain how you arrived
at each of these choices.
d. Let A be the update matrix that you created in the previous step. Find out
its eigenvalues and check their absolute values.
e. Apply your discretization scheme to find today’s price of the call spread
without the right of early exercise. The scheme will produce a whole vector
of prices at time 0. Explain how you chose the one for today’s price.
f. Modify your code in the previous step to calculate the price of the call spread
with the right of early exercise. What is the price?
g. Calculate the early exercise premium as the difference between the American
and European call spreads. Is it reasonable?
10.11 Consider a one-year fixed strike lookback option which enables the buyer
to choose the point of exercise for the option at its expiry.
Recall that the dynamics of the Black-Scholes model can be written as:
And that each Brownian motion increment dWt is normally distributed with
mean zero and variance ∆t. Assume that r = 0, S0 = 100 and σ = 0.25.
10.12 Consider again a one-year fixed strike lookback option which enables the
buyer to choose the point of exercise for the option at its expiry. The Bachelier
model can be written as:
11.0.3 Coding Example: Estimation of Lookback Option Greeks via Finite Differences
In this coding example we show how finite differences can be used to compute first-order
Greeks for a lookback option, where the lookback option itself is valued using the
simulation techniques discussed in chapter 10:
import numpy as np

# NOTE: s0, sigma, t, r, k, n and N must already be defined at this point,
# because they are used as the default arguments of c() below.

def bs_simulate(s0, t, r, sigma, n, N):
    # Euler scheme for Black-Scholes dynamics: n steps per year, N paths
    dt = 1 / n
    dwt = np.random.normal(0, np.sqrt(dt), (N, int(n * t)))
    s = np.zeros((N, dwt.shape[1] + 1))
    s[:, 0] = s0
    for i in range(1, s.shape[1]):
        s[:, i] = s[:, i-1] + s[:, i-1] * r * dt + s[:, i-1] * sigma * dwt[:, i-1]
    return s

def lookback_price(k, t, is_call, paths):
    # payoff on the running maximum (call) or minimum (put) of each path
    s_m = np.max(paths, axis=1) if is_call else np.min(paths, axis=1)
    return np.fmax(s_m - k, 0) if is_call else np.fmax(k - s_m, 0)

def c(s0=s0, sigma=sigma, t=t, r=r, k=k, n=n, N=N):
    np.random.seed(0)  # to get the same random numbers every time
    paths = bs_simulate(s0, t, r, sigma, n, N)
    look_prc = np.exp(-r * t) * np.mean(lookback_price(k, t, True, paths))
    return look_prc

epsilon = 1e-6  # different epsilon gets different gamma
delta = (c(s0 + epsilon) - c(s0 - epsilon)) / (2 * epsilon)  # 1.1608790693173887
vega = (c(s0, sigma + epsilon) - c(s0, sigma - epsilon)) / (2 * epsilon)  # 106.27836798526857
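A useful check on any bump-and-reprice Greek is to apply the same central difference to an instrument whose Greek is known in closed form. The sketch below is an illustration under stated assumptions and is not part of the listing above: `mc_call` and `bs_delta` are hypothetical helpers, the terminal asset price is drawn from the exact lognormal distribution, and the parameters r = 0, S0 = 100 and σ = 0.25 are reused purely as example inputs. As in the listing, the random draws are fixed between the bumped valuations so that simulation noise cancels in the difference.

```python
import numpy as np
from math import erf, exp, log, sqrt

def mc_call(s0, k, t, r, sigma, n_paths=200_000, seed=0):
    """Monte Carlo European call price; the fixed seed gives common random numbers."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    s_t = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * sqrt(t) * z)
    return exp(-r * t) * np.mean(np.maximum(s_t - k, 0.0))

def bs_delta(s0, k, t, r, sigma):
    """Closed-form Black-Scholes delta of a European call, N(d1)."""
    d1 = (log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    return 0.5 * (1.0 + erf(d1 / sqrt(2.0)))

s0, k, t, r, sigma = 100.0, 100.0, 1.0, 0.0, 0.25
eps = 1e-4
fd_delta = (mc_call(s0 + eps, k, t, r, sigma)
            - mc_call(s0 - eps, k, t, r, sigma)) / (2 * eps)
print(fd_delta, bs_delta(s0, k, t, r, sigma))
```

If the seed were not reset between the two valuations, the difference of two independent Monte Carlo estimates would be divided by a very small bump and the resulting Greek would be dominated by noise.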
EXERCISES
11.1 Derive the formula for the delta of a call option under the Bachelier model using
the closed form pricing formula in (??).
11.2 Derive the formula for theta under the Black-Scholes model as defined in (??).
11.3 Option traders often say that when buying options we get gamma at the expense
of theta. Explain what they mean.
11.4 Describe the options structure that you would trade on the S&P 500 if you had
the following objectives:
Description
Long at-the-money one-year call option
Short 25 delta three-month put option
Long one-month at-the-money straddle
Six-month risk reversal
Long three-month at-the-money to 25 delta call spread
11.6 Hedging Under Heston Model: Consider a 3 month European call with
strike 275 on the same underlying asset.
a. Calculate the price of this 3-month 110 strike call using your calibrated
Heston parameters and your FFT pricing algorithm.
b. Extract the Black-Scholes implied volatility for this option.
c. Calculate this option’s Heston delta using finite differences. That is, calculate
a first order central difference by shifting the asset price, leaving all other
parameters constant and re-calculating the FFT based Heston model price
at each value of S0 .
i. Compare this delta to the delta for this option in the Black-Scholes
model. Are they different, and if so why? If they are different, which do
you think is better and why? Which would you use for hedging?
ii. How many shares of the asset do you need to ensure that a portfolio
that is long one unit of the call and short x units of the underlying is
delta neutral?
d. Calculate the vega of this option numerically via the following steps:
i. Calculate the Heston vega using finite differences. To do this, shift θ
and ν0 by the same amount and calculate a first order central difference
leaving all other parameters constant and re-calculating the FFT based
Heston model price at each value of θ and ν0 .
ii. Compare this vega to the vega for this option in the Black-Scholes model.
Are they different, and if so why?
iii. Calculate the Heston vega of an at-the-money straddle (long 1 unit of
the 267.15 strike call + long 1 unit of the 267.15 strike put) using finite
differences.
iv. How many units of the straddle do you need in order to ensure that a
portfolio that is long one unit of the call and short x units of the straddle
is vega neutral?
a. Download historical data for the S&P using the SPY ETF and for the VIX
index, which we will use as a proxy for volatility.
b. Examine both the S&P and the VIX index data for autocorrelation. You
may do this using the regression approach you used in the first two home-
works, or via stationarity tests and ARMA models. Do you find evidence of
autocorrelation? Which series has more evidence of autocorrelation? Where
would you expect more autocorrelation?
c. Calculate the correlation of the S&P and its implied volatility (using VIX
as a proxy) on a daily and monthly basis. Is the correlation significant?
What implications does this have for an options pricing model, such as the
Black-Scholes model?
d. Calculate rolling 90-day correlations of the S&P and its implied volatility as
well. When does the correlation deviate the most from its long-run average?
e. Calculate rolling 90-day realized volatilities in the S&P and compare them
to the implied volatility (again using VIX as a proxy). Plot the premium of
implied vol. over realized vol. Is the premium generally positive or negative?
When is the premium highest? Lowest?
f. Construct a portfolio that buys a 1M at-the-money straddle (long an at-
the-money call and long an at-the-money put) every day in your historical
period. Use the Black-Scholes model to compute the option prices and use
the level of VIX as the implied vol input into the BS formula.
g. Calculate the payoffs of these 1M straddles at expiry (assuming they were
held to expiry without any rebalances) by looking at the historical 1M
changes in the S&P. Calculate and plot the P&L as well. What is the average
P&L?
h. Make a scatter plot of this P&L against the premium between implied and
realized volatility. Is there a strong relationship? Would you expect there to
be a strong relationship? Why or why not?
ETF Description
SPY S&P 500
HYG High-Yield
TLT Long Maturity Treasuries
USO Oil

dk = (upper_k - lower_k) / n_points

ks = np.linspace(lower_k, upper_k, n_points)
ivs = np.full(n_points, 0.2)

rnd = breeden_litzenberger(ks, ivs, s0, r, exp_t, dk)
print(rnd)
Note that the callPx function used in the above coding sample is taken from
section ??.
Where again the callPx function referenced in the above code leverages the ex-
ample in ??
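Since the listings above are only partially shown, a self-contained sketch of the Breeden-Litzenberger calculation may help. It rests on several assumptions: `bs_call_px` and `rnd_from_calls` are hypothetical stand-ins (not the book's callPx or breeden_litzenberger functions), the volatility is flat at 20%, and the strikes are equally spaced. The density is the undiscounted second derivative of the call price with respect to strike, approximated here by a central second difference.

```python
import numpy as np
from math import erf, exp, log, sqrt

def bs_call_px(s0, k, t, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    cdf = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return s0 * cdf(d1) - k * exp(-r * t) * cdf(d2)

def rnd_from_calls(ks, ivs, s0, r, t):
    """Risk-neutral density at the interior strikes of an equally spaced grid."""
    dk = ks[1] - ks[0]
    px = np.array([bs_call_px(s0, k, t, r, iv) for k, iv in zip(ks, ivs)])
    # e^{rT} * d^2C/dK^2 via a central second difference
    dens = exp(r * t) * (px[2:] - 2 * px[1:-1] + px[:-2]) / dk**2
    return ks[1:-1], dens

ks = np.linspace(20.0, 300.0, 561)      # illustrative strike grid, spacing 0.5
ivs = np.full(ks.size, 0.2)
k_mid, dens = rnd_from_calls(ks, ivs, 100.0, 0.01, 1.0)

dk = ks[1] - ks[0]
total_mass = dens.sum() * dk            # should be close to 1
mean = (k_mid * dens).sum() * dk        # should be close to the forward
print(total_mass, mean)
```

Two quick diagnostics are that the density integrates to approximately one and that its mean is close to the forward price; large deviations usually indicate a strike grid that is too narrow or too coarse.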
EXERCISES
12.1 Show how you would extract the risk-neutral density using the Breeden-
Litzenberger methodology for a set of put options.
12.2 Derive the price of a digital put option and show its relationship to the CDF of
a risk-neutral distribution.
d. Describe a set of arbitrage conditions that a set of call spreads with different
K’s (but the same ) should obey.
You also know that the current stock price is 100, the risk-free rate is 0, and
the asset pays no dividends.
NOTE: The table of strikes is quoted in terms of Deltas, where the DP rows
indicate “X Delta Puts” and the DC rows indicate “X Delta Calls”.
12.5 Consider the following risk neutral pricing formula for a European put option:
$$p_0 = \tilde{E}\left[ (K - S)^+ \right]$$
For purposes of this problem you may assume that interest rates are equal to
zero and thus ignore discounting terms.
a. Download futures and options data for 1M, 3M and 6M Gold futures.
b. Using weighted Monte Carlo with the Black-Scholes model as a prior, calibrate
the risk-neutral density.
c. Plot the 1M, 3M and 6M risk-neutral densities and comment on the differ-
ences between them.
d. Using bootstrapping of historical returns create an estimate of the physical
distribution of Gold. Compare this to the risk-neutral distribution and com-
ment on any features present in the risk-neutral distribution but not in the
physical distribution.
12.7 Compare the risk-neutral density of two models (e.g. SABR vs. Heston) with
given parameters.
12.8 Choose a model and compare the risk-neutral distribution as we change the
parameters.
III
Quant Modelling in Different Markets
CHAPTER 13
Interest Rate Markets
The reader will notice that instruments other than swaps, such as Eurodollar
futures and FRAs, are not included in this bootstrapping algorithm. Extending
the coding example to accommodate these instruments is left as an exercise
for the reader.
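For readers without the full listing at hand, a swaps-only bootstrap can be sketched in a few lines. This is a simplified illustration rather than the chapter's algorithm: it assumes annual fixed payments, a unit notional, a single curve in which the floating leg is worth one minus the final discount factor, and piecewise-constant instantaneous forward rates between benchmark tenors. The function names and the swap rates in the usage lines are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq

def discount_factor(t, knots, fwds):
    """D(t) when fwds[j] is the constant forward on (knots[j-1], knots[j]]."""
    prev, integral = 0.0, 0.0
    for knot, f in zip(knots, fwds):
        if t <= prev:
            break
        integral += f * (min(t, knot) - prev)
        prev = knot
    return np.exp(-integral)

def par_swap_rate(maturity, knots, fwds):
    """Par rate of a swap with annual fixed payments out to `maturity` years."""
    pay_times = range(1, maturity + 1)
    annuity = sum(discount_factor(t, knots, fwds) for t in pay_times)
    return (1.0 - discount_factor(maturity, knots, fwds)) / annuity

def bootstrap_forwards(tenors, swap_rates):
    """Solve for one forward rate per benchmark swap, shortest tenor first."""
    fwds = []
    for j, (tenor, rate) in enumerate(zip(tenors, swap_rates)):
        knots = tenors[: j + 1]
        # earlier forwards stay fixed; only the newest segment is solved for
        obj = lambda f: par_swap_rate(tenor, knots, fwds + [f]) - rate
        fwds.append(brentq(obj, -0.5, 0.5))
    return fwds

tenors = [1, 2, 3, 5]                       # years (illustrative)
swap_rates = [0.020, 0.022, 0.024, 0.026]   # illustrative par rates
fwds = bootstrap_forwards(tenors, swap_rates)
print(fwds)
```

Adding Eurodollar futures or FRAs would amount to supplying a pricing function for each new instrument type and solving for the forward segment it pins down, in the same sequential fashion.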
EXERCISES
13.1 Consider the following table of bond yields by maturity:
Maturity(years) Yield
1 0.025
2 0.026
3 0.027
5 0.03
10 0.035
30 0.04
Note: the yield to maturity for each bond can be used for all of its underlying
cashflows. For example, for a 5-year coupon (or zero-coupon) bond, use a
constant yield of 3% for all the bond’s cashflows.
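The convention in the note can be made concrete with a short sketch. It assumes annual compounding and annual coupon dates, neither of which is specified in the exercise, and the helper names `bond_price` and `fd_duration` are hypothetical. Every cashflow of a given bond is discounted at that bond's single yield, and duration is estimated with a central finite difference in the yield.

```python
def bond_price(face, coupon_rate, maturity, y):
    """Price with annual coupons, discounting every cashflow at the constant yield y."""
    coupons = sum(face * coupon_rate / (1 + y) ** t
                  for t in range(1, maturity + 1))
    return coupons + face / (1 + y) ** maturity

def fd_duration(face, coupon_rate, maturity, y, bump=1e-4):
    """Modified duration, -(1/P) dP/dy, via a central difference in yield."""
    up = bond_price(face, coupon_rate, maturity, y + bump)
    down = bond_price(face, coupon_rate, maturity, y - bump)
    return -(up - down) / (2 * bump) / bond_price(face, coupon_rate, maturity, y)

zero_5y = bond_price(100.0, 0.0, 5, 0.03)     # 5-year zero at a 3% yield
par_5y = bond_price(100.0, 0.03, 5, 0.03)     # 3% coupon at a 3% yield
print(zero_5y, par_5y, fd_duration(100.0, 0.0, 5, 0.03))
```

Under these assumptions a bond whose coupon equals its yield prices at par, and the duration of a zero-coupon bond reduces to T / (1 + y), which gives two simple checks on the implementation.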
a. Calculate prices of a zero coupon bond that pays $100 at maturity for each
maturity & yield combination. Which price is the highest? Is this reasonable?
b. Calculate the duration of each zero coupon bond, or sensitivity of the bond
price to a change in bond yield. What is the relationship between bond prices
and bond yields?
c. Calculate prices of coupon bonds that pay $100 at maturity and a 3% coupon annually
until maturity. Which prices are below $100? Which prices are above? Why?
d. Calculate the duration of each coupon bond using finite differences. Do zero-
coupon bonds or coupon bonds have higher duration? Why?
e. Calculate the second derivative of each bond price with respect to yield (com-
monly known as convexity). Are the second derivatives positive or negative?
f. Consider a portfolio that is long one unit of the 1 year zero-coupon bond,
long one unit of the 3 year zero-coupon bond and short two units of the 2
year zero-coupon bond. Calculate the initial value of the portfolio.
g. Calculate the duration of this portfolio. Calculate the convexity of the port-
folio as well. Which quantity is bigger?
h. Adjust the number of units of the short position in the two year zero-coupon
bond so that the portfolio is duration neutral (leaving the units of the one
and three year zero-coupon bonds unchanged). How many units of the two
year zero-coupon bond are required to do this?
i. Suppose you own this adjusted portfolio and rates sell off by 100 basis points
(each yield rises by 1%). What happens to the value of your portfolio?
j. Now suppose you own this adjusted portfolio and rates rally by 100 basis
points (each yield decreases by 1%). What happens to the value of your
portfolio? Is this a portfolio you would want to own? What are the risks of
owning this portfolio?
k. Print the cashflows of a 5-year amortizing bond that repays 20% of its prin-
cipal annually and pays a 3% coupon (annually).
l. Calculate the price and duration of the amortizing bond using finite differ-
ences. Comment on the difference between this bond and its zero coupon
and coupon equivalents.
1Y 2.8438%
2Y 3.060%
3Y 3.126%
4Y 3.144%
5Y 3.150%
7Y 3.169%
10Y 3.210%
30Y 3.237%
a. Extract the constant forward rate for the first year that enables you to match
the 1Y market swap rate.
b. Holding this first year forward rate fixed, find the forward rate from one
year to two years that enables you to match the two year swap (while also
matching the one year).
c. Continue this process and extract piecewise constant forward rates for the
entire curve. Comment on the forward rates vs. the swap rates.
d. Compute the fair market, breakeven swap rate of a 15Y swap. That is, find
the swap rate that equates the present values of the fixed and floating legs.
e. Compute discount factors. Compute zero rates by finding the constant rate
that leads to the calibrated discount factors. Comment on the differences in
the zero rates and swap rates.
f. Shift all forward rates up 100 basis points and re-calculate the breakeven
swap rates for each benchmark point. Generate a table of new swap rates.
Are these rates equivalent to having shifted the swap rates directly?
g. Consider a bearish steepener to the swap rates, that is perform the following
shifts on each swap rate:
1Y +0 bps
2Y +0 bps
3Y +0 bps
4Y +5 bps
5Y +10 bps
7Y +15 bps
10Y +25 bps
30Y +50 bps
1Y -50 bps
2Y -25 bps
3Y -15 bps
4Y -10 bps
5Y -5 bps
7Y +0 bps
10Y +0 bps
30Y +0 bps
a. Calculate the three-month carry of both payer and receiver swaps at each
benchmark by revaluing the swaps with expiry shortened by three months.
Assume that swap rates roll down the yield curve, meaning the nine-month
swap rate in three months is equal to today’s nine-month swap rate.
b. Do payer or receiver swaptions seem to have positive carry? Why?
c. Construct a duration portfolio that is long the highest carry swap and short
the lowest carry swap. Describe this portfolio. What are its main risks?
13.4 Consider the following table of normal swaption volatilities and corresponding
par swap rates:
a. Calculate the constant instantaneous forward rate for each swap that will
lead to the par swap rates listed. You may use a different instantaneous
forward rate for each swap and are not required to go through an entire
bootstrapping exercise.
b. Using the rates obtained above, calculate the current annuity value for each
swap in the above table.
c. Calculate a table of premiums for each swaption in the table using the Bache-
lier pricing formula and the annuities computed above.
d. For each option expiry, find the set of SABR parameters that best matches
the quoted normal volatilities. Utilize the asymptotic approximation formula
to calculate the normal volatility for a given set of SABR parameters and
look for a solution that minimizes the distance between market and model
volatilities.
e. Comment on the relationship of the calibrated parameters as a function of
expiry.
f. Using these calibrated SABR parameters, calculate the price and normal
volatility of swaptions with strikes equal to ATM - 75 and ATM + 75.
g. Calculate the equivalent Black volatilities for each option in the table above.
13.5 Consider an investor who buys a 3-year no call 30 year Bermudan swaption and
sells a 3y27 swaption at the same, at-the-money strike.
a. Without modeling the Berm, plot an estimated payoff diagram of the
underlying structure and describe the trade’s properties.
b. What happens to the holder of this position if rates rally or sell-off by 100
basis points? Do they make or lose money?
c. Suppose Bermudan swaptions are not available to trade in this rates market.
Propose an analogous structure of European swaptions that would mimic the
payoff you made above.
13.6 Consider an investor who has a strong macroeconomic view that the yield curve
is likely to steepen.
a. Describe the exotic option structure that will most precisely instrument that
view.
b. Would higher or lower correlations lead to better pricing for this investor?
c. What is the correlation bet implicit in their view?
d. Suppose exotic options are illiquid and cannot be traded. Construct a portfo-
lio of swaptions that best replicate this payoff. Highlight any key differences
between the two structures.
13.7 The dynamics of the Hull-White model are presented in (??). Consider this
model with the following parameters:
σ = 0.25
κ=1
r0 = 0.0118
a. Download yield curve data and extract a set of zero-coupon bond prices for
various maturities.
b. Calibrate the Hull-White drift term, θt so that the model matches the current
yield curve.
c. Derive the conditional distribution of the short rate, r(t) at time t. Calculate
its expected value and variance.
d. Using simulation, generate N = 10000 paths for the short rate that ex-
tend to time t = 10. Discuss the most efficient approach to conducting this
simulation.
e. Calculate the value of a 10-year zero-coupon bond along each simulated
path. Compare this to the known formula for Zero-Coupon bonds in the
Hull-White model.
f. Calculate the value of a call option on a 10-year zero-coupon bond with
strike K = 99.
g. Re-price this zero-coupon bond option with σ = 0.4. What happens to the
price? Why?
h. Re-price this zero-coupon bond option with κ = 5. What happens to the
price? Why?
a. Download data from treasury.gov or another reputable source and use boot-
strapping to extract a piecewise constant instantaneous forward rate process.
b. Calculate the price of each bond listed above. Which is the highest? Lowest?
Why?
c. Calculate the duration or sensitivity of each bond to a parallel shift of the in-
stantaneous forward rates by one basis point. Which bonds have the highest
and lowest duration? Explain why.
d. Calculate the convexity of each bond to a parallel shift of the instantaneous
forward rates. Which bonds have the highest and lowest convexity? Explain
why.
e. Which bond is best suited to serve as an inflation hedge (or more simply a
hedge against rising rates)?
f. Construct a duration neutral portfolio that is long one unit of the five-year
zero coupon bonds and short ∆ units of the thirty-year zero coupon bond.
Explain the remaining exposures for this trade. Does it benefit from the
curve steepening or flattening?
13.9 Match each of the following exotic options to the most appropriate model below
(use each model only once):
Bermudan Swaption
Vanilla Swaption
Cap
Eurodollar Future Convexity Correction
a. SABR/LMM
b. Hull-White Short Rate Model
c. Black’s Model for the Underlying Rate
d. SABR Model for the Underlying Rate
a. (15 points) Write a series of functions that take a vector of prices of zero
coupon bonds with distinct maturities and extracts the following rates:
i. Zero Rates from today until time T
ii. Spot Rates from today until time T
iii. Instantaneous, Continuously Compounded Forward Rates from time t
to T
iv. Semi-Annually Compounded Forward Rates from time t to T
v. Par Swap Rates from time t to T
t and T should be inputs to the functions. You may write the functions in
Python, or use pseudo-code.
b. (12 points) Consider a 5y fixed-for-floating swap that you enter paying fixed.
i. What is the present value of the swap entered at the par swap rate?
ii. Given that you are paying fixed on the swap, do you make or lose money
if rates increase?
iii. Write a function or pseudocode that computes the value of a con-
tract struck at fixed coupon C after the instantaneous forwards in
(13.10(a.)iii) increase by δ basis points in parallel. Assume that C was
the par swap rate prior to the shift in the instantaneous forwards.
c. (8 points) Consider the following instruments:
13.11 Suppose you have calibrated a set of caplet volatilities to a SABR model and
obtained the following parameters:
a. Using this table of parameters, calculate the price and constant Black volatil-
ity of spot starting 1y and 2y caps and floors that are at-the-money and 50
basis points above and below the at-the-money-strike.
b. Create a set of shifts to the volatility surface such that:
• The volatility surface increases in parallel
• The volatility surface steepens
• The skew smiles
• The tails become fatter
Evaluate the table of options above under each scenario and comment on
which options have the highest and lowest sensitivity to these movements.
c. Without modeling or calculating the value, discuss how you might use this
table of SABR coefficients to value a swaption. What are the challenges?
What approximations would you rely on?
CHAPTER 14
Credit Markets
    prevT = 0.0
    for tenor_ii, lambda_ii in zip(cdsTenors, cdsLambdas):
        if (T > prevT):
            # accumulate survival over the part of this hazard-rate segment before T
            tau = min(tenor_ii, T) - prevT
            survProb *= math.exp(-tau * lambda_ii)
            prevT = tenor_ii

    return survProb

def pvCDS(cdsLambdas, cdsTenors, cdsSpread, cdsMat, coupon_frequency, rf, recov):
    total_payments = coupon_frequency * cdsMat
    pv_def_leg = 0.0
    pv_no_def_leg = 0.0

    for payment in range(1, total_payments + 1):
        payment_time = payment / coupon_frequency

        # premium leg: spread paid conditional on survival
        pv_no_def_leg += cdsSpread * math.exp(-rf * payment_time) * \
            getSurvProbAtT(payment_time, cdsTenors, cdsLambdas, recov)

        # hazard rate that applies at this payment time
        lambda_ord = np.where(cdsTenors == min(cdsTenors[cdsTenors >= payment_time]))[0][0]
        lambda_curr = cdsLambdas[lambda_ord]

        # default leg: loss given default weighted by the default intensity
        pv_def_leg += (1 - recov) * math.exp(-rf * payment_time) * \
            getSurvProbAtT(payment_time, cdsTenors, cdsLambdas, recov) * lambda_curr

    return pv_def_leg, pv_no_def_leg

def cdsPricingErrorSquared(cdsLambdas, cdsTenors, cdsSpreads, coupon_frequency, rf, recov):
    sse = 0.0

    for tenor_ii, spread_ii in zip(cdsTenors, cdsSpreads):
        pv_def_leg_ii, pv_no_def_leg_ii = pvCDS(cdsLambdas, cdsTenors, spread_ii,
                                                tenor_ii, coupon_frequency, rf, recov)
        pv_cds_ii = (pv_def_leg_ii - pv_no_def_leg_ii)
        sse += pv_cds_ii**2

    return sse

cdsSpreads = np.asarray([120, 135, 150, 160, 175]) / 10000.0
cdsTenors = np.asarray([1, 3, 5, 7, 10])
cdsLambdas = [0.05, 0.05, 0.05, 0.05, 0.05]
coupon_frequency = 2
rf = 0.005
recov = 0.4

pvCDS(cdsLambdas, cdsTenors, 0.05 * (1 - 0.4), 5, 2, 0.0, 0.4)

res = minimize(cdsPricingErrorSquared,
               x0=cdsLambdas,
The reader is encouraged to notice the use of the risky annuity term as the
numeraire when modeling this CDS swaption.
rho = 0.2

def index_loss_distr(N, M, rho, spread, recov, tau):
    implied_lambda = spread / (1 - recov)
    surv_prob = math.exp(-implied_lambda * tau)
    def_prob = 1 - surv_prob
    C = norm.ppf(def_prob)

    fLs = np.zeros(N + 1)

    # integrate over the common factor Z on an equally spaced grid in [-5, 5]
    for ii in range(M + 1):
        Zi = -5.0 + ii * (10 / M)
        phiZ = norm.pdf(Zi)
        dz = 10 / M

        for n in range(N + 1):
            innerNumer = C - rho * Zi
            innerDenom = np.sqrt(1 - rho**2)
            innerTerm = innerNumer / innerDenom

            pLCondZ = (1 - recov) * norm.cdf(innerTerm)
            # binomial probability of n losses conditional on Z
            fLCondZ = (math.factorial(N) / (math.factorial(n) * math.factorial(N - n))) \
                * pLCondZ**n * (1 - pLCondZ)**(N - n)
            fLs[n] += fLCondZ * phiZ * dz

    return fLs

fLs = index_loss_distr(N, M, rho, spread, recov, tau)
print(fLs)
print(fLs.sum())
plt.plot(fLs)

    return (sigmaAssetsError + eError)

A = 100.0
L = 90.0
sigmaEq = 0.5
t = 5.0
r = 0.0

guess = (sigmaEq, A - L)
res = minimize(solveForSigmaAssetsAndValueOfEquity,
               x0=guess,
               bounds=((0.0, None), (0.0, None)),
               args=(A, L, sigmaEq, t, r),
               tol=1e-10,
               method='SLSQP',
               options={'maxiter': 400, 'ftol': 1e-14})['x']

print(res)
print(solveForSigmaAssetsAndValueOfEquity(res, A, L, sigmaEq, t, r))

def merton_model(A, L, sigmaAssets, t, r):

    # the drift terms (r +/- 0.5 * sigma^2) are both scaled by t
    d1 = (math.log(A / L) + (r + 0.5 * sigmaAssets**2) * t) / (sigmaAssets * math.sqrt(t))
    d2 = (math.log(A / L) + (r - 0.5 * sigmaAssets**2) * t) / (sigmaAssets * math.sqrt(t))

    # equity is a call option on the firm's assets struck at the liabilities
    E = A * norm.cdf(d1) - L * math.exp(-r * t) * norm.cdf(d2)

    # distance to default; the risk-neutral default probability is N(-DD)
    DD = (math.log(A / L) + (r - 0.5 * sigmaAssets**2) * t) / (sigmaAssets * math.sqrt(t))
    pdef = norm.cdf(-DD)
    return E, sigmaAssets, DD, pdef

print(merton_model(A, L, res[0], t, r))
EXERCISES
14.1 Consider the following table of par spreads for an IG credit index:
Assume that the recovery rate, R, is 40% and the discount curve is defined by
flat instantaneous forward rates at 2%.
116 A Practical Guide to Investment Management, Trading and Financial Engineering
14.2 Suppose you are creating a bespoke Emerging market sovereign credit index
product that is a function of the following underlying single-names:
a. Find the constant hazard rate that matches each par CDS spread and the
survival probability of each entity five years in the future.
b. Calculate the fair spread of the bespoke index.
c. Suppose the dealer quotes you a spread of 200 basis points to enter into
this contract. Would you enter into this contract? Would you be long or short
protection?
d. Is there a trade available with the single-names and the bespoke index that
you would engage in? Describe carefully the units of each item you would
buy or sell and the properties of the entire package.
a. Download historical data for HYG, JNK and LQD and clean the data for
stock splits and other anomalies.
b. Calculate the rolling 1y realized volatility of each ETF and comment on any
patterns that you observe.
c. Estimate the carry in each ETF and compute the implied Sharpe ratio due to
carry as $\text{Carry} / \text{Realized Vol}$, where realized vol is measured over the entire period.
Rank the ETFs by their relative carry and compare them to the analogous
value for S&P 500 and comment on the difference.
d. Obtain at-the-money implied volatility data for each ETF and compare it
to realized volatility. Is it higher or lower? Why?
14.4 Consider a CDS whose 5 year par spread is 145 basis points.
a. Calculate the value of a knockout payer swaption with strike spread of 150
basis points expiring in three months. Assume σ = 0.2.
b. Calculate the value of the front-end protection on this CDS from today until
the option expiry in three months.
c. Use the results above to value a non-knockout payer swaption with strike
spread of 150 basis points expiring in three months.
d. Calculate delta, gamma, vega and theta of the knockout swaption. Describe
the dominant Greeks for this option.
e. Describe how you would construct a delta-hedging scheme for this option
and describe the major remaining risks once the delta has been neutralized.
f. Plot the value of the non-knockout option as time passes from three months
until expiry and use this to comment on the theta profile of the option.
a. Download an options volatility surface for Xerox Corp (ticker XRX). Clean
the data for arbitrage.
b. Calibrate a Variance Gamma model for each expiry with the additional de-
fault probability parameter calibrated as well. Comment on the default prob-
abilities for each expiry.
c. Try multiple start guesses, upper and lower bounds and weighting schemes
in your calibration. Are the parameters stable with respect to changes in the
optimization structure?
d. Repeat this exercise with a CEV model replacing the Variance Gamma dy-
namics and comment on any differences in the extracted default probabilities.
14.6 Assume that the risk-free interest rate is 2% for all maturities and suppose that
the CDS spreads for contracts that are starting today are given by the table
below. Also, assume that the expected recovery R = 40%.
14.7 Suppose a company HTZ issues a 5 year bond with a 4% coupon, paid
semi-annually. The current market price of the bond is $10 (per $100 face) and the
risk-free interest rate is constant at 1%.
Suppose the expected recovery is zero, i.e., R = 0%, and the spreads are
a. Compute the quarterly survival curve out to 5 years. Be explicit about the
assumptions you are making.
b. Using the survival probabilities from 14.7a., show how you would value a single
payment at time t that is conditional on survival of HTZ until time t.
c. Consider the bond above that pays a 2% coupon twice a year. Show how you
would value this coupon bond on HTZ. Provide the bond’s present value.
d. Is the market price above or below this value? What would you do if it is above or
below?
e. Calculate CDS-bond basis using the following method: find the parallel shift
of the CDS curve that would make the present value of the bond’s cashflow
equal to its market price.
f. Suppose that you own $10M notional (face value) of the bond. How much
of the CDS should you buy to offset as much of the bond’s risk as possible?
Please specify the maturity, notional, and the spread of the CDS contract.
CHAPTER 15
The reader should note that the risk-free rate in this Black-Scholes pricing formula
should correspond to the risk-free rate of the domestic currency.
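As a hedged illustration of where each rate enters, the sketch below implements the standard Garman-Kohlhagen form of the Black-Scholes formula for an FX option. Its notation may differ from the formula in the text, and the function name garman_kohlhagen and the input values are assumptions made for illustration. The domestic rate discounts the strike, while the foreign rate plays the role of a dividend yield on the spot rate.

```python
import math
from scipy.stats import norm


def garman_kohlhagen(S, K, T, r_dom, r_for, sigma, is_call=True):
    """FX option price in domestic currency per unit of foreign notional."""
    d1 = (math.log(S / K) + (r_dom - r_for + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    df_dom = math.exp(-r_dom * T)   # discounting at the domestic risk-free rate
    df_for = math.exp(-r_for * T)   # foreign rate acts like a dividend yield on spot
    if is_call:
        return S * df_for * norm.cdf(d1) - K * df_dom * norm.cdf(d2)
    return K * df_dom * norm.cdf(-d2) - S * df_for * norm.cdf(-d1)


# hypothetical inputs: spot quoted as domestic units per one unit of foreign currency
S, K, T, r_dom, r_for, sigma = 1.25, 1.25, 1.0, 0.02, 0.00, 0.08
call = garman_kohlhagen(S, K, T, r_dom, r_for, sigma, is_call=True)
put = garman_kohlhagen(S, K, T, r_dom, r_for, sigma, is_call=False)
print(call, put)
```

Swapping the two rates by mistake prices the option on the inverted interest rate differential, so a put-call parity check against the forward is a cheap way to catch the error.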
    if isPut > 0:
        delta = -norm.cdf(-d1)
    else:
        delta = norm.cdf(d1)
    return delta

def getStrikeGivenDeltaSABRSmile(F_0, T, sigma_0, alpha, beta, rho, isPut, deltaHat, initial_guess):
    # solve for the strike whose delta under the SABR smile equals deltaHat
    root_fn = lambda x: sabrDeltForStrike(F_0, T, x, sigma_0, alpha, beta, rho, isPut) - deltaHat
    return root(root_fn, initial_guess)['x'][0]

def objectiveFunctionFX(params, beta, F_0, T, K_atm, K_c_st, K_p_st,
                        rDom, rFor, sigma_atm, rr_vol, st_px):
    # params = [sigma_0, alpha, rho]

    # extract the 25D risk reversal strikes
    K_c_rr = getStrikeGivenDeltaSABRSmile(F_0, T, params[0], params[1], beta, params[2], 0, 0.25, K_c_st)
    K_p_rr = getStrikeGivenDeltaSABRSmile(F_0, T, params[0], params[1], beta, params[2], 1, -0.25, K_p_st)

    # check the pricing error
    sigma_atm_hat = bs_sabr_vol(F_0, K_atm, T, params[0], beta, params[1], params[2])
    atm_error = (sigma_atm - sigma_atm_hat)**2

    sigma_rr_c_hat = bs_sabr_vol(F_0, K_c_rr, T, params[0], beta, params[1], params[2])
    sigma_rr_p_hat = bs_sabr_vol(F_0, K_p_rr, T, params[0], beta, params[1], params[2])
    rr_vol_hat = sigma_rr_c_hat - sigma_rr_p_hat
    rr_error = (rr_vol - rr_vol_hat)**2

    # model strangle price: each leg is priced at its SABR-implied volatility,
    # using the strikes and domestic rate passed in as arguments
    sigma_st_c_hat = bs_sabr_vol(F_0, K_c_st, T, params[0], beta, params[1], params[2])
    sigma_st_p_hat = bs_sabr_vol(F_0, K_p_st, T, params[0], beta, params[1], params[2])
    px_st_c_hat = math.exp(-rDom * T) * BSM(kind='call', S0=F_0, K=K_c_st, T=T, r=0.00, sigma=sigma_st_c_hat).value()
    px_st_p_hat = math.exp(-rDom * T) * BSM(kind='put', S0=F_0, K=K_p_st, T=T, r=0.00, sigma=sigma_st_p_hat).value()
    st_px_hat = (px_st_c_hat + px_st_p_hat)
    st_error = (st_px - st_px_hat)**2

    error = atm_error + rr_error + st_error
    return error

# input data
F_0, r_d, r_f, T = 100.0, 0.015, 0.005, 0.5
sigma_atm, sigma_rr, strangle_offset = 0.2, 0.04, 0.025

# extract the ATM strike
K_atm = strikeGivenDelta(F_0, T, r_d, r_f, 0.5, sigma_atm)

Foreign Exchange Markets 123
For more details on how to build the code to generate the underlying simulated
paths, the reader is encouraged to refer to the coding examples in chapter 10.
EXERCISES
a. Download historical data for the USDCAD exchange rate as well as one-
month forwards.
b. Extract and plot a time series of the interest rate differential for the USD-
CAD currency pair. Comment on how stable and / or volatile it is.
c. Compare the expected change in USDCAD over the interval implied by the
forwards to the actual change in the rate. Has the forward realized?
d. Plot subsequent one-month returns of USDCAD as a function of the interest
rate differential. Do you observe a positive or negative relationship? Why?
c. Again using simulation, calculate the value of a one-year volatility swap with
σk = 0.1. Would this contract, struck at zero upfront cost, be over- or undervalued?
d. Construct a histogram of payoffs for the volatility and variance swap. Com-
ment on when they are similar and when they diverge the most.
e. Construct a strategy that is long the volatility swap and short the variance swap.
That is, consider a strategy that pays fixed and receives floating on a volatility
swap and receives fixed and pays floating on a variance swap. How would
you think about sizing the two legs?
f. Make a histogram of payoffs of this strategy. How does it compare to the
outright histograms for the volatility and variance swaps?
a. Download historical exchange rate and three-month forward data for the
G10 major currencies vs. USD. Clean the data for outliers and anomalies.
b. Plot the interest rate differential for each of the 10 currency pairs over time.
For which currency pairs does USD tend to have a higher (lower) yield?
c. For each date, sort the currency pairs by interest rate differential and com-
ment on how this evolves over time.
d. Back-test a strategy that is long the three pairs with the highest carry and
short the three pairs with the lowest carry. The strategy should take positions
in the spot exchange rate.
EXERCISES
Ticker Description
SPY S&P 500 Index
DIA Dow Jones Industrials Index
QQQ NASDAQ
IWM Russell 2000 Small Cap Index
a. Download historical data for the basket components and calculate the entire
period realized volatility and correlation matrix.
b. Assume that each asset is governed by a Black-Scholes model with volatility
equal to historical, realized volatility and that the correlation structure of
the assets matches the realized correlation matrix. Using simulation, price a
six-month basket call option with strike equal to 105% of the current basket
value. Further assume that r = 0, and q = 0 for all assets, and that each
asset is currently trading at 100.
c. Re-calculate the price of the basket option using the current asset prices for
S0 instead of 100. Comment on the impact this scaling has on your model.
d. Comment on whether using realized volatility as a proxy for implied volatility
is a reasonable assumption and what bias it might introduce into your model
price.
e. Build a portfolio of vanilla options on the underlying assets that will best
replicate the basket options. Describe any remaining exposures and why they
exist.
f. Calculate the sensitivity of the structure to an increase and decrease in
correlations of the assets. Explain why this relationship exists for a basket
option.
g. Discuss the strengths and weaknesses of applying the Black-Scholes model for
basket options pricing.
Ticker Description
SPY US Equities (S&P 500 Index)
EFA Europe, East Asia and the Far East
EEM Emerging Markets
a. Download current implied volatility data for the three underlying assets and
calibrate individual Heston models for the assets.
b. Download historical price and return data and calculate a covariance matrix
of asset returns. Be explicit about the assumptions you make in estimating
the covariances, such as the lookback window and sampling frequency, and
why you believe they are appropriate.
c. Build a correlation matrix that can be used to perform a multi-dimensional
simulation from the Heston model by assuming that the cross asset corre-
lations are in line with the realized estimates that you obtained above, and
that the cross asset volatility processes are uncorrelated.
d. Comment on whether you think these correlation assumptions across the
assets and with respect to the volatility processes are appropriate, and what
impact you think it might have on the results.
e. Consider a three-month, at-the-money, worst-of option on these assets. Using
simulation, the individual calibrated Heston parameters and the augmented
covariance matrix, estimate the price of this worst-of option. Assume that
each asset has a starting value of 100, and that r = 0 and the assets pay no
dividends (q = 0).
f. Create a table of sensitivities to each Heston parameter for each asset as
well as the additional correlations. What parameter or parameters are most
important for worst-of options? Why do you think this is?
16.3 Consider a 3 month fixed strike lookback put on the ETF SPY with its strike
set to 90% of its current value.
a. Describe the conditions in which the lookback makes the most money.
b. Gather data on the current level of SPY, as well as the current level of VIX,
the current three month T-Bill rate, and the dividend yield of the S&P.
Equity & Commodity Markets 129
c. Using the current value of VIX as a proxy for the implied volatility, use the
Black-Scholes formula to estimate the price of a three month European put
with its strike equal to 90% of its current value. Use the parameters you
obtained above for S0, r and q, respectively. Perform a simulation from the
Black-Scholes model to verify that your simulation yields the same result as
the closed form solution and comment on any divergence.
d. Using the same implied volatility, estimate the price of the lookback via
simulation assuming it follows a Black-Scholes model. As before, use the
parameters you obtained above for S0 , r and q, respectively.
e. Plot the payoffs of the European and lookback options across each path and
calculate the correlation of their payoffs. Comment on when each payoff is
highest and lowest. What traits of these paths can you identify?
What about the paths on which the values diverge?
16.4 Consider an investor who buys a one-month fixed strike lookback struck at
95% of the value of the financials ETF, XLF, and sells a European put option
with matching expiry and strike.
a. Derive the payoff for the lookback vs. European structure as a function of
the minimum value of XLF, its terminal value, and the strike, K.
b. Describe how you would expect this trade to behave as a function of the
following:
• Volatility
• Upward Drift
• Skew to the Put Side
• Mean-Reversion in XLF
c. Download current options data for XLF and calibrate a Variance Gamma
to the set of one-month options. Comment on what the parameters imply
about the risk-neutral distribution.
d. Using the calibrated Variance Gamma model, use FFT techniques to price
the European option that you have sold.
e. Using simulation and the same set of Variance Gamma parameters, obtain
a comparable simulation-based estimate of the European option price. Does
it match the FFT-based value? Why or why not?
f. Use simulation to value both the lookback option in isolation and the value
of the Lookback vs. European package. Comment on the difference in payoff
profiles. What does the sold European option do to our profile?
16.5 Consider a VIX roll-down strategy that attempts to harvest risk premium in
VIX futures:
a. Download data for all available VIX futures as well as the VIX index.
b. Extract the annualized implied dividend, q, for each future with respect to
spot. Plot this data for each future historically.
c. Also plot the slope of the futures curve with respect to spot as defined by:
$$ s = \log\left(\frac{f_i}{s_0}\right) \qquad (P16.1) $$
where fi is the value of future i, s0 is the spot value of the VIX index and
s is the slope of the curve. Annualize these slopes in your plot. Which parts
of the curve have the most/least slope?
d. Consider a strategy that persistently sells the one-month future and rolls
one-week prior to expiry. What is the payoff profile of this strategy?
e. Next, consider a strategy that persistently sells the three-month future and
rolls every month (when it becomes a two-month future). How does this
compare to selling the front month? How correlated are their payoffs? Do
you see any pros or cons of using the front vs. third month to harvest a roll
premium?
f. Finally, consider a dynamic strategy which creates a long-short portfolio that
is long the flattest part of the curve and short the steepest. Comment on
your approach to sizing the long and short legs.
a. Download the current futures and volatility surface data. Check the options
data for arbitrage correcting any anomalies.
b. Plot the volatility smile for each option expiry and choose the stochastic
model that you think is best suited to calibrate the features of the vol.
surface. Explain why you think this is the case.
c. Using the calibrated volatility surface price the following set of exotic op-
tions:
• An at-the-money basket put option with a one-year expiry.
• A 5% out-of-the-money best-of call option with a one-year expiry.
• A 5% out-of-the-money worst-of call option with a one-year expiry.
d. Calculate the ratio of the exotic options relative to each other. Which is the
biggest? Smallest? Why?
e. Try several different parameter sets and comment on how these ratios evolve
over time. Is the order preserved? Why or why not? Are the ratios contained
within certain intervals?
16.7 Consider an investor looking to sell strangles on the NASDAQ ETF QQQ and
delta-hedge:
a. Download historical price and options data for the QQQ ETF. Compute a
time series of realized and implied volatilities and plot these time series as
well as the spread between implied and realized.
b. Assuming a flat volatility surface with volatility equal to the at-the-money
observed volatility, calculate the price of a 25-delta one-month strangle on
QQQ.
c. Calculate the realized payoff of being short this 25-delta strangle without
any hedge. What is the expected payout? Plot a histogram of payoffs as
well.
d. Back-test a strategy that sells one-month 25-delta strangles on QQQ and
then delta-hedges every week. A single, one-month short strangle position
should be held each month until expiry, or another roll date of your choice.
What is the risk / return profile of this strategy?
e. What are the main risks when running this type of volatility selling algo-
rithm? How could the underlying structure be improved while still allowing
us to harvest the well-documented volatility risk premium?
a. Download historical data for the front two month futures for the above com-
modity asset universe.
b. Calculate the slope of each futures curve as:
$$ s = \log\left(\frac{f_2}{f_1}\right) \qquad (P16.2) $$
where f2 is the second month future’s value and f1 is the value of the front
month future.
Plot this slope, s for each commodity over time and comment on the various
slopes across time and across the cross section of commodities.
c. Build an algorithm that sorts the commodities by curve slope at each his-
torical date. Plot the relative rankings over time. Which commodities have
the steepest / flattest curves consistently?
d. At each date, build an equally weighted portfolio that is long the second fu-
ture for the two flattest curves and short the second future for the two steep-
est curves, re-balancing every day. What are the properties of this strategy?
Does it have appealing risk/reward characteristics?
IV
Portfolio Construction & Risk Management
CHAPTER 17
1
As is shown in (??)
Portfolio Construction & Optimization Techniques 137

    return w, noise_free_w, inv_mat
It should be emphasized that this implementation of a risk parity, or equal risk
contribution, portfolio does not include leverage. We could add leverage to the portfolio
by simply rescaling the weights and adding a corresponding short position in cash,
or the risk-free asset.
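The rescaling described above can be sketched as follows. This is a minimal illustration, not part of the listing: the function name lever_portfolio, the volatility target and the two-asset inputs are hypothetical, and inverse-volatility weights are used as a stand-in for the risk parity weights (in the two-asset case the two coincide).

```python
import numpy as np


def lever_portfolio(w, cov, target_vol):
    """Rescale fully invested weights to hit target_vol; the remainder is held in cash."""
    w = np.asarray(w, dtype=float)
    port_vol = float(np.sqrt(w @ cov @ w))
    leverage = target_vol / port_vol
    w_levered = leverage * w
    cash = 1.0 - w_levered.sum()   # a negative value means borrowing at the risk-free rate
    return w_levered, cash, leverage


# hypothetical two-asset example: annualized vols of 5% and 15%, correlation 0.2
vols = np.array([0.05, 0.15])
corr = np.array([[1.0, 0.2], [0.2, 1.0]])
cov = np.outer(vols, vols) * corr

w_unlevered = np.array([0.75, 0.25])   # inverse-volatility weights
w_levered, cash, leverage = lever_portfolio(w_unlevered, cov, target_vol=0.10)
print(w_levered, cash, leverage)
```

Because volatility is linear in a uniform scaling of the weights, the relative risk contributions of the assets are unchanged by the leverage; only the overall risk level and the cash position move.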
EXERCISES
17.1 Consider the following set of sector ETFs:
Ticker Description
XLB Materials
XLE Energy
XLF Financials
XLI Industrials
XLK Technology
XLP Consumer Staples
XLU Utilities
XLV Healthcare
XLY Consumer Discretionary
a. Download historical price data from January 1st 2010 until today for the
sector ETFs in the table above. Clean the data for splits and any other
anomalies.
b. Calculate the covariance matrix of daily returns for the sector ETFs.
c. Perform an eigenvalue decomposition of the covariance matrix. Plot the
eigenvalues in order from largest to smallest. How many eigenvalues are
positive? How many are negative? How many are zero? If any are negative,
is this a problem? How many do you think are statistically significant?
d. Generate a random matrix of the same size as your covariance matrix, where
each element has a standard normal distribution.
e. Perform an eigenvalue decomposition of this random matrix. Plot the
eigenvalues in order from largest to smallest. Comment on the differences
and similarities between the structure of the eigenvalues in this matrix and
your historical covariance matrix.
17.2 Recall that an unconstrained optimal portfolio can be found using the following
formula:
$$ \max_{w} \; w^{T} R - a\, w^{T} C w \qquad (P17.1) $$
The constant a indicates the amount of our risk aversion. Assume you are an
investor with a risk aversion coefficient of 1.
a. Calculate the historical annualized returns for each sector ETF in 17.1.
b. Calculate the covariance matrix of the returns of the sector ETFs. Comment
on the rank of the matrix and the profile of the eigenvalues.
c. Using the annualized returns as expected returns and the covariance matrix
obtained above, calculate the weights of the unconstrained mean-variance
optimal portfolio.
d. Create a new set of expected returns equal to the historical annualized return
(µ) plus a random component, that is:
$$ E[r] = \mu + \sigma Z \qquad (P17.2) $$
where δ is the weight of the diagonal matrix and (1 − δ) is the weight of the
historical covariance matrix.
f. Set δ = 1 and perform an eigenvalue decomposition of the regularized
covariance matrix. What is the rank of the regularized covariance matrix
now?
g. Try different values of δ between 0 and 1 and perform eigenvalue decom-
position on the regularized covariance matrix. How many eigenvalues are
positive? How many are zero?
h. Repeat the exercise in (17.2d.) with the regularized covariance matrix for a
few values of δ. Compare the stability of your portfolio weights to what you
got with the historical covariance matrix.
a. Download historical data from your favorite source for 5 years and at least
100 companies or ETFs. In this problem we will look at the covariance matrix
for these assets and its properties.
i. Clean the data so that your input pricing matrix is as full as possible.
Fill in any gaps using a reasonable method of your choice. Explain why
you chose that particular method.
e. Calculate the volatility of the risk parity portfolio and compare it to the
volatility of the equally weighted portfolio. Explain how you would lever up
or down the risk parity portfolio so that the volatility in the leveraged risk
parity portfolio matches the equally weighted portfolio. What is the leverage
required?
a. Download data for the country ETFs listed above and compute a correlation
matrix of daily and monthly returns.
b. Construct the fully-invested efficient frontier using these ETFs with no ad-
ditional constraints.
c. Construct the efficient frontier again assuming that the portfolio must be
long-only and the max position size is 10%.
d. Download data for the VIX index. Create a conditioning variable that defines
the market regime as follows:
• High Volatility Regime: VIX ≥ 20
• Normal Volatility Regime: VIX < 20
Calculate the correlation matrix of daily and monthly returns for both the
high volatility and normal volatility regimes. Comment on the difference
between the correlation structure as well as the deviation from the uncondi-
tional correlation matrix.
e. Using a subjective probability of being in a high regime of 25%, calculate a
modified correlation and covariance matrix. How does this compare to the
unconditional?
f. Construct the fully-invested efficient frontier again using this regime
weighted covariance matrix. Compare to the results obtained using the un-
conditional covariance matrix and comment on any notable differences.
a. Download price and market capitalization data for the sector ETFs in the
S&P 500 defined in 17.1. Calculate the market cap weights of the sectors.
b. Using the historical returns and covariance matrix, calculate the uncon-
strained mean-variance optimal portfolio for a given risk aversion parameter,
λ. Explain how you chose a value for λ.
c. Shift the historical returns for each asset up and down by 1% respectively.
Comment on the changes to the optimal portfolio weights.
d. Extract a set of equilibrium returns implied by the market cap weights.
e. Calculate the Q and P matrix respectively such that the view specified for
each sector ETF is its average historical return over the period.
f. Within the Black-Litterman model, describe how you would determine the
additional parameters, such as the risk aversion parameter, the uncertainty
in each view and τ the scaling coefficient.
g. Using the market implied equilibrium returns, the P and Q matrices,
and the additional parameters estimated above, calculate a set of optimal
weights that satisfy this Black-Litterman problem. Compare these
h. Shift all input returns in the Q matrix up and down by 1% respectively.
Comment on the changes in the optimal portfolio weights relative to the
changes in a mean-variance implementation with similar shifts.
CHAPTER 18
        ret_i = df_rets[ii]

        # accumulate the Gaussian log-likelihood (constant term omitted)
        logLikeli -= (0.5 * np.log(sigma_i**2) + 0.5 * (ret_i**2 / sigma_i**2))

    # return the negative log-likelihood so that it can be minimized
    return -logLikeli

res = minimize(garchLogLikelihood,
               x0=[alpha, beta, gamma, sigma_0],
               bounds=((0.0, None), (0.0, None), (0.0, None), (0.0, None)),
               args=(df_rets,),
               tol=1e-12,
               method='SLSQP',
               options={'maxiter': 2500, 'ftol': 1e-14})

print(res.message)
print(res['x'])

# calculate GARCH coefficients using arch package
garch = arch_model(rets, vol='GARCH', p=1, q=1)
garch_fitted = garch.fit()
print(garch_fitted)
As the reader can see in the coding example, the initial volatility, σ0 , is treated
as a model parameter in the GARCH calibration.
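Since only the tail of the listing is shown above, the following self-contained sketch illustrates the same idea: a GARCH(1,1) negative log-likelihood in which the initial volatility is one of the optimized parameters. The parameter naming (omega, alpha, beta, sigma_0), the recursion convention, the simulated data and the choice of L-BFGS-B are assumptions for illustration and need not match the listing in the text.

```python
import numpy as np
from scipy.optimize import minimize


def garch_neg_log_likelihood(params, rets):
    """Negative Gaussian log-likelihood of a GARCH(1,1), up to an additive constant.

    params = (omega, alpha, beta, sigma_0); sigma_0 seeds the variance recursion.
    """
    omega, alpha, beta, sigma_0 = params
    var_t = sigma_0 ** 2
    nll = 0.0
    for r_t in rets:
        nll += 0.5 * np.log(var_t) + 0.5 * r_t ** 2 / var_t
        var_t = omega + alpha * r_t ** 2 + beta * var_t   # next period's variance
    return nll


# simulate returns (in percent) from known parameters so the fit can be sanity-checked
rng = np.random.default_rng(0)
true_omega, true_alpha, true_beta = 0.05, 0.10, 0.85
n_obs = 1000
rets = np.empty(n_obs)
var_t = true_omega / (1 - true_alpha - true_beta)   # start at the long-run variance
for t in range(n_obs):
    rets[t] = np.sqrt(var_t) * rng.standard_normal()
    var_t = true_omega + true_alpha * rets[t] ** 2 + true_beta * var_t

x0 = [0.1, 0.05, 0.80, rets.std()]
res = minimize(garch_neg_log_likelihood, x0, args=(rets,), method='L-BFGS-B',
               bounds=((1e-8, None), (0.0, 1.0), (0.0, 1.0), (1e-4, None)))
print(res.x)
```

Treating sigma_0 as a free parameter avoids having to pick a starting variance by hand, at the cost of one more dimension in the optimization; with a long sample its influence on the other estimates is typically small because the recursion forgets its starting point.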
EXERCISES
Download the appropriate data for these sector ETFs as well as the Fama-
French Factors and clean as needed.
b. For each sector ETF, estimate the exposures to the Fama-French factors by
running a contemporaneous regression of the sector ETF returns against the
factor returns. Make a table of these exposures and comment on anything
noteworthy.
c. Recalculate these exposures using Ridge regression with varying values for
the parameter λ. How does this change the coefficients? Does it make them
more or less consistent across sectors?
d. Now use Lasso regression, again with different values for λ and discuss how
the coefficients change. What factors seem to be removed first for each sector
ETF as λ increases? Is there a pattern across sectors?
Modelling Expected Returns and Covariance Matrices 149
a. Download data for the ETF ACWI and clean the data for splits and other
anomalies. In addition to closing price data, download data for the open,
high and low. Clean these data as well and ensure they are in line with the
closing levels.
b. Using an expanding window, calculate the realized volatility of the ETF over
all periods with at least one-year of data.
c. Calculate the realized volatility using a rolling window of overlapping obser-
vations and a one-year lookback window.
d. Now consider an EWMA volatility estimate that uses an expanding window
and a one-year half-life. Calculate the rolling realized volatility using this
model over all overlapping one-year periods. Discuss the difference between
the rolling window approach and the EWMA approach with a similar half-
life.
e. Finally, use the open, high, low and close data to compute a Garman-Klass
range-based volatility estimate. Create a realized volatility time series using
this range-based volatility estimate.
f. Plot the time series of volatility estimates and comment on the difference in
behavior of the three approaches. Discuss how this relates to the strengths
and weaknesses of each model and which one you would use in practice.
Which is most stable? Which is most dynamic?
a. Download data for the ETF SPY and calculate rolling one-month, three-
month and one-year realized volatility estimates. Plot these estimates and
comment qualitatively on the series. Do they appear mean-reverting? Are
they structurally similar?
b. Using maximum likelihood, fit a GARCH(1,1) model to SPY returns. Comment
on the coefficients that you obtain and what they imply about the
level of mean-reversion of volatility in SPY.
c. Re-estimate the parameters on a rolling basis with a given rolling window
length of τ . Comment on how you choose τ and why you expect it to be
reasonable. Are the estimated GARCH coefficients consistent over time?
d. Download data for the ETF VXX and calculate an analogous set of rolling
one, three and twelve-month realized volatilities. Plot these estimates and
comment on how similar or different they are relative to the SPY charts
above.
e. Describe conceptually what phenomenon we are exploring when modeling
the volatility of an ETF that is already linked to volatility. Does this lead
to conceptual differences that should manifest in a volatility model such as
GARCH?
f. Using MLE, estimate a GARCH(1,1) model on VXX returns. Compare the
coefficients to those obtained for SPY and try to explain any fundamental
differences.
Risk Management
As we can see above, we start with a return data frame and then compute annualized
volatility, downside deviation¹ and daily VaR and CVaR. The reader should
note that the VaR and CVaR quantities are not annualized. To conduct this
analysis, we use the same return dataset as in the efficient frontier example.
The reader should also note that the calculations are based on an equally weighted
portfolio and use a 5% confidence level for the VaR and CVaR calculations, specified via
the variable eps.
1
Based on daily returns
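The calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the listing itself: the function name risk_summary, the zero target used for downside deviation, the 252-day annualization factor and the synthetic return data are all hypothetical choices.

```python
import numpy as np
import pandas as pd


def risk_summary(df_ret, eps=0.05, periods_per_year=252):
    """Risk statistics for an equally weighted portfolio of the columns in df_ret."""
    port = df_ret.mean(axis=1)                         # equal weights across assets
    ann_vol = port.std() * np.sqrt(periods_per_year)
    # downside deviation relative to a zero target, on daily returns (not annualized)
    downside_dev = np.sqrt((port.clip(upper=0.0) ** 2).mean())
    var = port.quantile(eps)                           # daily VaR: eps-quantile of returns
    cvar = port[port <= var].mean()                    # daily CVaR: mean return beyond VaR
    return pd.Series({'ann_vol': ann_vol, 'downside_dev': downside_dev,
                      'VaR': var, 'CVaR': cvar})


# synthetic daily returns standing in for the ETF dataset (hypothetical data)
rng = np.random.default_rng(42)
df_ret = pd.DataFrame(rng.normal(0.0003, 0.01, size=(504, 4)),
                      columns=['A', 'B', 'C', 'D'])
stats = risk_summary(df_ret)
print(stats)
```

By construction the CVaR is never above the VaR, since it averages only the returns at or below the VaR threshold.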

    for i in range(paths):
        # draw pathLength distinct historical dates (sampling without replacement)
        idx_simu = random.sample(range(N), pathLength)
        for j in range(pathLength):
            # accumulate the sampled daily returns along the path
            ret_simu[i, :] += df_ret.iloc[idx_simu[j], :]

    df_simu = pd.DataFrame(ret_simu)
    df_simu_port = df_simu.mean(axis=1)  # equally weighted portfolio
    VaR = df_simu_port.quantile(eps)

    return VaR
Note that again we assume an equally weighted portfolio in our VaR calculation.
It should be emphasized that every time we call this function, a new set of return
paths will be generated, which results in a new VaR estimate. Each such estimate
is statistically equivalent, but will naturally vary because of the randomness in the
simulated paths. By running the algorithm repeatedly, we can get a sense of
the distribution of these estimates. For example, the plot below shows the density of
10,000 simulated VaRs at a 0.05 confidence level, using two years of historical ETF
data:
The reader might notice that this distribution of VaR estimates is not as smooth
as they might expect. The reason for this is two-fold. First, in this example, we
bootstrap a daily VaR. This means that each bootstrapped path consists of a single
historical return. A byproduct of this is that the possible set of simulated returns
is limited to the historically observed set of returns. In traditional applications of
bootstrapping we would stitch together many returns in the underlying dataset on
each bootstrapped path, and this would no longer be the case. Second, this
calculation uses a relatively short historical window, further limiting the set of
potential outcomes. The reader is welcome to relax these assumptions, by trying a
multi-day VaR or a longer historical window, and to observe how the distribution
of VaR estimates becomes smoother.
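One way to experiment with this is sketched below. It is an illustration under stated assumptions, not the chapter's listing: the function bootstrap_var is a hypothetical vectorized variant that samples dates with replacement using NumPy, the returns are synthetic, and the numbers of paths and estimates are kept small so that it runs quickly.

```python
import numpy as np
import pandas as pd

def bootstrap_var(df_ret, paths, path_length, eps, rng):
    """Bootstrapped VaR of an equally weighted portfolio (illustrative)."""
    rets = df_ret.to_numpy()
    n_obs = rets.shape[0]
    # One row of randomly drawn historical dates per simulated path.
    idx = rng.integers(0, n_obs, size=(paths, path_length))
    # Sum the sampled daily returns along each path, asset by asset.
    path_rets = rets[idx].sum(axis=1)      # shape: (paths, n_assets)
    port_rets = path_rets.mean(axis=1)     # equal weights
    return np.quantile(port_rets, eps)

# Synthetic stand-in for two years of daily ETF returns (an assumption).
rng = np.random.default_rng(42)
df_ret = pd.DataFrame(rng.normal(0.0, 0.01, size=(504, 5)))

def var_estimates(path_length, n_estimates=200, paths=500):
    """Repeat the bootstrap to collect a sample of VaR estimates."""
    return np.array([bootstrap_var(df_ret, paths, path_length, 0.05, rng)
                     for _ in range(n_estimates)])

one_day = var_estimates(path_length=1)
ten_day = var_estimates(path_length=10)

# With single-return paths, every estimate is built from the limited set
# of historical portfolio returns, so repeated values are more likely.
print("distinct 1-day estimates: ", np.unique(one_day).size)
print("distinct 10-day estimates:", np.unique(ten_day).size)
```

Plotting a histogram or kernel density of one_day and ten_day side by side is a natural next step for comparing how smooth the two distributions are.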
EXERCISES
19.1 Describe in detail how you would build a risk model for each type of instrument
listed below. Be explicit about the assumptions you are making and why they
are appropriate.
19.2 Describe three ways that you would validate a VaR or CVaR model to ensure
it is robust and accurate.
b. Download daily historical data for the last ten years for the assets. Clean as
appropriate.
c. Using a two-year rolling window and the historical simulation method dis-
cussed in the chapter, calculate a daily VaR and CVaR for the portfolio
using a 5% threshold. Plot both on the same chart and comment on their
relationship to each other, as well as their evolution over time.
d. Plot the VaR against the returns of the portfolio. Comment on how often
the VaR is breached.
e. Calculate the total VaR contribution for each asset by setting each weight
to zero in turn and re-calculating the VaR. Comment on which assets seem to
be contributing the most and least to the total VaR.
a. Download data for 2y, 5y, 10y and 30y treasury yields and swap rates. Clean
the data as you see fit.
b. Build a historical simulation of one-month forward treasury yields of all
maturities. Comment on the fifth percentile of the distribution of yields.
c. Assume that you purchased the 2y, 5y, 10y and 30y treasuries at the latest
available yield and that the coupon is equal to the yield. Estimate the change
in price to each treasury along each path based on the change in yields. Plot
the changes to the treasury prices for each maturity over all paths.
d. Calculate both VaR and CVaR for each maturity treasury, as well as an
equally weighted portfolio that includes all of them.
e. Build a comparable historical simulation of one-month forward swap rates.
How does the magnitude of the movements in treasury yields compare to
that of swap rates?
f. Assume you entered a swap contract at the current par swap rate (for the
latest available date). Calculate the implied mark-to-market changes in the
value of the swap along each path.
g. Using these quantities, calculate both VaR and CVaR for each swap.
a. Download data for all 500 S&P constituents. Generate a random portfolio
of the underlying constituents.
b. Consider the following set of stress tests:
Stress                  Description
Equity Drawdown         Equities sell off by 20%
Yield Curve Steepens    Long-end rates sell off by 100 bps
Dollar Sell-off         A -3 standard deviation move in USD vs. all currencies
Equity Rotation         A +5 standard deviation move of the Fama-French value factor
Carefully describe the correlation assumptions that you would make in order
to translate shifts to these variables into shocks to the S&P constituents.
Calculate the value of the portfolio in each stress test. Which is the worst?
Why?
c. Repeat this exercise many times and comment on how the value of the
portfolio in the stress test changes among random paths.
Download data for this set of ETFs from a source of your choice and validate
the data for anomalies.
b. Calculate an inverse volatility weighted portfolio of the assets using the entire
period in your volatility calculation.
c. Calculate a vector of expected returns and a covariance matrix of the un-
derlying assets.
d. Test the individual return series to see if they are Gaussian, using a QQ plot
or another technique of your choice. Does it appear the returns are normally
distributed?
e. Perform an eigenvalue decomposition of the covariance matrix. Is it positive
definite? Positive semi-definite? Explain how you know.
f. Implement the Monte Carlo based approach to calculating VaR and CVaR
assuming the joint distribution is Gaussian. Comment on any assumptions
that you believe may be violated and how you think that might impact the
results.
19.8 Discuss how you would incorporate undefined risks into a risk management
process and explain the pros and cons of integrating this into your model.
CHAPTER 20
Quantitative Trading Models
It should be emphasized that, for simplicity, this example leaves out many of the
practicalities of a pairs trading algorithm described above. For example, we do not
incorporate a stop-loss criterion, and we calibrate a single optimal hedge ratio
over the entire period, which introduces look-ahead bias. Extending this piece of
code to be suitable for practical use is an exercise that is left to the reader.
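As a starting point for that extension, the sketch below shows one possible way to address the two shortcomings named above. It is not the chapter's algorithm: the helper names, the 60-day window, the entry and stop thresholds and the synthetic price series are all assumptions chosen for illustration. The hedge ratio is estimated on a trailing window and lagged by one period so that it only uses past data, and positions are closed when the spread diverges beyond a stop level.

```python
import numpy as np
import pandas as pd

def rolling_hedge_ratio(y: pd.Series, x: pd.Series, window: int) -> pd.Series:
    """OLS slope of y on x over a trailing window, lagged one period so the
    ratio applied at time t is estimated with data up to t-1 only."""
    beta = y.rolling(window).cov(x) / x.rolling(window).var()
    return beta.shift(1)

def pairs_positions(y, x, window=60, entry=2.0, stop=4.0):
    """Position in the spread y - beta * x: +1 long, -1 short, 0 flat."""
    beta = rolling_hedge_ratio(y, x, window)
    spread = y - beta * x
    # z-score of the spread against its own lagged rolling statistics
    mean = spread.rolling(window).mean().shift(1)
    std = spread.rolling(window).std().shift(1)
    z = (spread - mean) / std

    pos = pd.Series(0.0, index=y.index)
    pos[z > entry] = -1.0        # spread rich: short y, long beta * x
    pos[z < -entry] = 1.0        # spread cheap: long y, short beta * x
    pos[z.abs() > stop] = 0.0    # crude stop-loss: stand aside on large divergence
    return pos, z

# Synthetic pair (an assumption): y tracks 1.5 * x plus stationary noise.
rng = np.random.default_rng(1)
x = pd.Series(np.cumsum(rng.normal(0, 0.01, 1000)) + 4.0)
y = 1.5 * x + pd.Series(rng.normal(0, 0.02, 1000))
pos, z = pairs_positions(y, x)
print(pos.value_counts())
```

A practical implementation would also need exit rules once a position is open, transaction costs and a test that the pair is actually cointegrated over each estimation window.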
In this example, we consider the case where we buy the assets with the top N
prior-period returns and sell the assets with the lowest prior-period returns. We
show how the highest and lowest returns can be identified, and mark the position
that would be taken in each asset. We then leave it as an exercise for the reader
to turn this into a back-tested quantitative strategy.
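A minimal sketch of this identification step is given below. The function name momentum_positions, the lookback, the value of N and the synthetic prices are assumptions made for illustration. Positions are lagged by one period so that the position held on a given date depends only on returns observed before it.

```python
import numpy as np
import pandas as pd

def momentum_positions(prices: pd.DataFrame, lookback: int, top_n: int) -> pd.DataFrame:
    """Mark +1 for the top_n prior-period returns and -1 for the bottom top_n.

    Assumes 2 * top_n does not exceed the number of assets.
    """
    prior_ret = prices.pct_change(lookback)
    # Rank across assets on each date; method="first" breaks ties by column order.
    rank_desc = prior_ret.rank(axis=1, ascending=False, method="first")
    rank_asc = prior_ret.rank(axis=1, ascending=True, method="first")

    pos = pd.DataFrame(0, index=prices.index, columns=prices.columns)
    pos[rank_desc <= top_n] = 1      # highest prior returns: long
    pos[rank_asc <= top_n] = -1      # lowest prior returns: short
    # Lag by one period so the signal is known before the position is taken.
    return pos.shift(1).fillna(0).astype(int)

# Synthetic prices for eight assets (an assumption for the example).
rng = np.random.default_rng(7)
rets = rng.normal(0.0005, 0.01, size=(300, 8))
prices = pd.DataFrame(100 * np.exp(np.cumsum(rets, axis=0)),
                      columns=[f"asset_{k}" for k in range(8)])
pos = momentum_positions(prices, lookback=20, top_n=2)
print(pos.tail())
```

Multiplying pos by the next-period asset returns and averaging across the held assets would be one way to begin turning the marked positions into a back-test.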
An important caveat of this back-testing code is that, for simplicity and brevity,
we consider the value of the structures only when they are initially traded and
then at expiry. This simplifies the code greatly, but in practice a daily
mark-to-market for the structures should be incorporated, as should a more robust
re-balancing / rolling strategy. The reader is encouraged to think about how this
code could be generalized to incorporate these features, as well as different
option payoff structures.
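To make the simplification concrete, the sketch below values a short at-the-money straddle at only two points: the premium collected at inception and the payoff at expiry. It is an illustration under stated assumptions, not the chapter's code: Black-Scholes pricing with no dividends is assumed, and the function names and example inputs are hypothetical.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_price(spot, strike, vol, t, r=0.0, is_call=True):
    """Black-Scholes price of a European option on a non-dividend-paying asset."""
    d1 = (math.log(spot / strike) + (r + 0.5 * vol ** 2) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    if is_call:
        return spot * norm_cdf(d1) - strike * math.exp(-r * t) * norm_cdf(d2)
    return strike * math.exp(-r * t) * norm_cdf(-d2) - spot * norm_cdf(-d1)

def short_straddle_pnl(spot_0, spot_T, vol, t, r=0.0):
    """P&L at expiry of selling an ATM straddle, ignoring any interim marks."""
    strike = spot_0
    premium = (bs_price(spot_0, strike, vol, t, r, is_call=True)
               + bs_price(spot_0, strike, vol, t, r, is_call=False))
    payoff = abs(spot_T - strike)    # call payoff plus put payoff at expiry
    # The premium is received up front, so it is grown at r to the expiry date.
    return premium * math.exp(r * t) - payoff

# One-month straddle sold at 20% implied volatility (hypothetical inputs).
pnl_flat = short_straddle_pnl(100.0, 100.0, 0.20, 1.0 / 12.0)
pnl_move = short_straddle_pnl(100.0, 110.0, 0.20, 1.0 / 12.0)
print(pnl_flat, pnl_move)
```

A daily mark-to-market would instead re-price both options each day with the remaining time to expiry and the prevailing implied volatility.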
EXERCISES
20.1 Describe how you would choose between a single-stock model, a cross asset
auto-correlation model, pairs trading or a PCA based factor model. Explain the
alpha research process you would follow and how you would avoid overfitting.
a. Download data for 100 single-name stocks of your choice. These stocks may
be members of an index or chosen in some other way. Comment on why your
methodology for choosing stocks may or may not lead to bias in a back-test.
b. Perform any necessary data quality checks on the data and handle outliers
or anomalous data as needed.
c. Back-test a strategy that is long the five stocks with the highest returns over
the past twelve months. Try this with and without the most recent month's
return. Is this strategy profitable? Does removing the last month improve
performance?
d. Now consider varying the number of stocks in your portfolio. Try 10, 20 &
30. Do the results deteriorate as we add stocks? Why or why not?
e. Next, let's consider the short side of a momentum strategy. Back-test a
strategy that is short the five lowest momentum stocks using the same twelve
month lookback, with and without the previous month. How do the results of
the short side compare to what you obtained above on the long side? Do most
of the profits in a long-short momentum strategy appear to come from longs,
shorts, or is it mixed?
f. Finally, vary the number of stocks included on the short side. Does
performance suffer as we add stocks to the short side?
a. Download historical data for the Fama-French factors as well as the S&P
sector ETFs and clean the data for anomalies.
b. Explain how you would create a portfolio with high exposure to a single
Fama-French factor, size, with minimal exposures to the other factors.
c. Test the Fama-French factors for autocorrelation. Explain how you would
build a sector ETF strategy that would take advantage of autocorrelation in
the factor returns. Back-test this strategy on the Fama-French factors. Is it
profitable?
d. Using the set of Fama-French factors, calculate the residuals for each sector
ETF. Test these residuals for autocorrelation. Back-test a strategy based on
the z-score of the residuals that takes a long position when ẑ < −x and a
short position when ẑ > x. How does this strategy do?
a. Download data for the constituents of the S&P 500 and identify and handle
any data anomalies.
a. Download data for the SPY ETF as well as the VIX index.
b. Plot the level of VIX against one-month, three-month and six-month realized
volatility. If we assume VIX is a proxy for implied volatility, what does this
tell you about an implied vs. realized premium? What factors might bias
this calculation?
c. Assume the volatility surface is flat, with its implied volatility specified by
the level of VIX. Construct a strategy that sells an at-the-money straddle
on SPY every day and holds the position until expiry, with no intermediate
hedging. Explain the factors that determine the profit or loss of the strategy,
and comment on the distribution of returns for this strategy.
d. Next, consider a strategy that sells a 25-delta put every day and holds it
until expiry. How does this compare to the short straddle structure above?
e. Finally, consider a strategy that enters a new butterfly contract every day
that is centered on the at-the-money strike, with a reasonable width of your
choice. How does this strategy compare to the previous two?
f. Comment on which strategy you would prefer to invest in and why. Also
comment on any biases introduced in this analysis and how you would handle
them.
a. Download data for the ETFs DUST (3x Inverse) and NUGT (3x Long), both
triple leveraged gold ETFs. Check the data for splits and other anomalies.
Incorporating Machine Learning Techniques
¹ The original example is from sklearn's documentation.
In the above code we can see that the labels_ attribute indicates which cluster
each observation belongs to. Similarly, the center of each cluster can be accessed
through the cluster_centers_ attribute. The number of centers coincides with k, or
n_clusters.
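For readers who want something to run, the following minimal sketch shows these attributes on synthetic data. It is not the sklearn documentation example referred to in the footnote: the three blobs of points, the seeds and the variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Three well-separated synthetic blobs in two dimensions (illustrative data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
               for c in ([0, 0], [3, 3], [0, 3])])

# n_clusters is k; n_init and random_state are set explicitly for repeatability.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

print(km.labels_[:10])        # cluster index assigned to each observation
print(km.cluster_centers_)    # one row of coordinates per cluster center
```

The integer assigned to each cluster is arbitrary, so labels should be compared across runs by the location of their centers rather than by their index.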
attribute. For decision trees, the feature_importances_ attribute provides similar
results. A simple visualization provides a straightforward demonstration.
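A minimal sketch of reading feature importances from a fitted tree is shown below. The data are synthetic and constructed so that only the first feature matters; the variable names and tree settings are assumptions for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: the label depends only on the sign of the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Importances are normalized to sum to one across features.
importances = tree.feature_importances_
for name, value in zip(["feature_0", "feature_1", "feature_2"], importances):
    print(f"{name}: {value:.3f}")
```

A bar chart of these values is the simple visualization most often used to compare features at a glance.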
For models that deal with images, and therefore have visualization needs, Grad-CAM
(Gradient-weighted Class Activation Mapping) offers a user-friendly approach
that converts feature vectors into visible graphical patterns. It generates a heatmap
that measures the importance of each pixel in an image, and we can overlay this
heatmap on the original image to understand which part contributes most to
the model prediction. A sample heatmap is shown below. More information about
Grad-CAM can be found on CloudCV.³
² SHAP repository: https://github.com/slundberg/shap
³ Grad-CAM site: http://gradcam.cloudcv.org/
EXERCISES
b. Discuss the main challenges with applying machine learning when building
an alpha signal, and how you would attempt to overcome these challenges,
if possible.
c. Describe how you would implement machine learning techniques in a quan-
tamental context. That is, describe how you could leverage machine learning
algorithms while still retaining interpretability and an underlying economic
rationale.
21.2 Discuss the pros and cons of using classification techniques to forecast asset
returns instead of regression.
21.3 Match the following list of techniques that you would use to solve the problems
below:
Technique
Linear Regression
K-means Clustering
Principal Component Analysis
Support Vector Machines
Use each technique only once and explain why your choice is optimal.
a. Download data for the VIX implied volatility index and clean as needed.
Plot the data and comment on the time periods where VIX is most elevated.
b. Download data for the S&P 500 index and calculate returns. Also download
data for 30 year and 2 year treasury yields and calculate the 30y to 2y slope
of the yield curve.
c. Using k-means, cluster the data contemporaneously, including the level of
VIX, the daily S&P return and the slope of the yield curve. Try k ∈ {1, 2, 3, 4}
and use the method of your choice to determine the optimal number of clusters.
d. Comment on the statistical properties within each cluster. What market
conditions does each represent?
a. Download historical data for the S&P sector ETFs and check the data for
outliers and other anomalies.
b. Using a correlation distance metric and daily returns, cluster the data into
3 clusters and comment on which sectors end up together.
c. Calculate the average pairwise correlation between the sector ETFs within
each cluster and compare this to the average pairwise correlation between
ETFs in different clusters.
d. Build a portfolio for each cluster that relies on an inverse volatility weighting.
e. Use a risk parity approach to combine the three cluster portfolios into a
single overarching portfolio. Comment on the largest / smallest weights and
high-level characteristics of the portfolios.
f. Calculate the same clusters on a rolling basis of all available two-year periods.
How consistent are the clusters over time?
g. Repeat the above exercise using a Euclidean distance metric. Are the clusters
similar to what you found using a correlation based distance?
a. Download data for the following country ETFs and clean as appropriate:
b. Calculate a covariance matrix of daily returns for the country ETFs and
perform PCA on this covariance matrix. Plot the first three principal com-
ponents and describe what they represent.
c. Using k-means clustering, split the country ETFs into three clusters. Com-
pare the three clusters that you obtain to the principal components obtained
above, emphasizing any commonality.
d. Repeat the PCA exercise over all rolling, overlapping, one-year periods. Plot
the rolling weights of the top three principal components. Are they stable
over time?
e. Repeat this exercise with the country clusters as well. Compare the stability
of the clusters to the stability of the PCs.
f. Discuss what implications these differences in stability would have in practice
and what you think are the strengths and weaknesses of each approach.
a. Download data for the US Small Cap ETF IWM. Calculate daily returns,
plot these returns and search for any anomalies.
b. Construct a classification signal that is based on the sign of the daily return
of IWM. Additionally, construct a reversal and momentum feature based on
previous returns. You are free to choose the lookback period of the reversal
and momentum features, but should justify why you believe they are optimal.
c. Build an SVM model that classifies one-day forward returns using these
reversal and momentum features. Try different SVM kernels and values of
the cost parameter, C. Comment on their impact on the model.
d. Separate the dataset into a training and test period and comment on how
you chose these periods.
e. Using a cross-validation technique, such as k-fold cross validation or another
built-in technique, find the optimal hyperparameters for the SVM model.
f. Optimize your SVM model over the training set including tweaking the look-
back windows of the reversal and momentum features if needed. Once you
believe the model has been optimized, try it on the test dataset as well.
g. Print the confusion matrix of the model for both the training and test
datasets. How well do you think this model works? What slippage should we
expect out of sample?
h. Discuss how you would properly back-test a model such as this in a profes-
sionally managed quant strategy.