Introduction To Statistical Methods For Financial Models
Thomas A. Severini
Northwestern University
Evanston, Illinois, USA
Preface
1 Introduction
2 Returns
2.1 Introduction
2.2 Basic Concepts
2.3 Adjusted Prices
2.4 Statistical Properties of Returns
2.5 Analyzing Return Data
2.6 Suggestions for Further Reading
2.7 Exercises
4 Portfolios
4.1 Introduction
4.2 Basic Concepts
4.3 Negative Portfolio Weights: Short Sales
4.4 Optimal Portfolios of Two Assets
4.5 Risk-Free Assets
4.6 Portfolios of Two Risky Assets and a Risk-Free Asset
4.7 Suggestions for Further Reading
4.8 Exercises
6 Estimation
6.1 Introduction
6.2 Basic Sample Statistics
6.3 Estimation of the Mean Vector and Covariance Matrix
6.4 Weighted Estimators
6.5 Shrinkage Estimators
6.6 Estimation of Portfolio Weights
6.7 Using Monte Carlo Simulation to Study the Properties of Estimators
6.8 Suggestions for Further Reading
6.9 Exercises
References
Index
as the market portfolio, and measured by a suitable market index, such as the
Standard & Poor's (S&P) 500 index. Such models are useful for understanding
the nature of the risk associated with an asset, as well as the relationship
between the expected return on an asset and its risk. The single-index model
extends this idea to a model for the correlation structure of the returns on a
set of assets; in this model, the correlation between the returns on two assets
is described in terms of each asset’s correlation with the return on the market
portfolio.
The CAPM, the market model, and the single-index model are all based
on the relationship between asset returns and the return on some form of a
market portfolio. Although the behavior of the market as a whole may be
the most important factor affecting asset returns, in general, asset returns
are related to other economic variables as well. A factor model is a type of
generalization of these models; it describes the returns on a set of assets in
terms of a few underlying “factors” affecting these assets. Such a model is
useful for describing the correlation structure of a set of asset returns as well
as for describing the behavior of the mean returns of the assets. The factors
used are chosen by the analyst; hence, there is considerable flexibility in the
exact form of the model. The parameters of a factor model are estimated
using statistical techniques such as regression analysis and the results provide
useful information for understanding the factors affecting the asset returns;
the results from an analysis based on a factor model are important in analyzing
potential investments and constructing portfolios.
The other is that there are many R packages available that extend its functionality;
several of these provide functions that are useful for analyzing financial
data.
2.1 Introduction
As discussed in Chapter 1, the goal of this book is to provide an introduction
to the statistical methodology used in modeling and analyzing financial data.
This chapter introduces some basic concepts of finance and the types of financial
data used in this context. The analyses focus on the returns on an asset,
which are the proportional changes in the price of the asset over a given time
interval, typically a day or month. The statistical foundations for the analysis
of such data are presented, along with statistical methods that are useful for
investigating the properties of return data.
$$R_t = \frac{P_t - P_{t-1}}{P_{t-1}} = \frac{P_t}{P_{t-1}} - 1, \quad t = 1, 2, \ldots.$$
That is, the return on the asset is simply the proportional change in its price
over a given time period; the return is positive if the price increased and is
negative if the price decreased.
Example 2.1 Suppose that, for a given asset, P0 = 60, P1 = 62.40, P2 = 63.96,
P3 = 61.40, and P4 = 66; assume that all prices are in dollars but, for
Therefore, in Example 2.1, if the initial investment is $100, the revenue over
the period from t = 0 to t = 1 is
100(0.04) = $4.
Log-Returns
It is sometimes convenient to work with log-returns, defined by rt =
log (1 + Rt ), t = 1, 2, . . .; note that throughout the book, “log” will denote
natural logarithms.
Let pt = log Pt , t = 0, 1, . . . denote the log prices. Then the log-returns are
defined as
$$r_t = \log(1 + R_t) = \log\frac{P_t}{P_{t-1}} = p_t - p_{t-1}.$$
That is, log-returns are simply the change in the log-prices.
One advantage of working with log-returns is that it simplifies the analysis
of multi-period returns. Let rt (k) denote the k-period log-return at time t.
Then, by analogy with the single-period case, rt (k) = log(1 + Rt (k)) and
$$\begin{aligned}
r_t(k) &= \log(1 + R_t(k)) \\
&= \log\big((1 + R_t)(1 + R_{t-1}) \cdots (1 + R_{t-k+1})\big) \\
&= \log(1 + R_t) + \log(1 + R_{t-1}) + \cdots + \log(1 + R_{t-k+1}) \\
&= r_t + r_{t-1} + \cdots + r_{t-k+1};
\end{aligned}$$
that is, the k-period log-return at time t is simply the sum of the k
single-period log-returns, rt−k+1 , rt−k+2 , . . . , rt . Alternatively, because rt =
pt − pt−1 , the k-period log-return is the change in the log-price from period
t − k to period t,
rt (k) = pt − pt−k .
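The definitions above can be checked numerically. The following is a minimal sketch in R using the prices from Example 2.1; the variable names (P, R, r, t_idx, k) are chosen here for illustration only and do not come from the text.

```r
# Prices P0, ..., P4 from Example 2.1 (P[1] holds P0)
P <- c(60, 62.40, 63.96, 61.40, 66)
n <- length(P)

# Single-period returns R_t = P_t / P_{t-1} - 1, t = 1, ..., 4
R <- P[-1] / P[-n] - 1

# Log-returns r_t = log(1 + R_t) = p_t - p_{t-1}
r <- diff(log(P))

# k-period log-return at time t, computed two ways (here t = 4, k = 3)
t_idx <- 4
k <- 3
r_k_sum  <- sum(r[(t_idx - k + 1):t_idx])              # r_t + ... + r_{t-k+1}
r_k_diff <- log(P[t_idx + 1]) - log(P[t_idx - k + 1])  # p_t - p_{t-k}

print(round(R, 4))
print(round(r, 4))
print(all.equal(r_k_sum, r_k_diff))
```

Working from the unrounded log-prices, as diff(log(P)) does, avoids the small round-off differences that arise when four-decimal log-prices are subtracted by hand.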
Example 2.3 Using the sequence of prices given in Example 2.1, P0 = 60,
P1 = 62.40, P2 = 63.96, P3 = 61.40, and P4 = 66, the log-prices are given by
p0 = log(60) = 4.0943,
p1 = log(62.40) = 4.1336,
p2 = log(63.96) = 4.1583,
p3 = 4.1174,
and
p4 = 4.1897.
It follows that the log-returns are
r1 = p1 − p0 = 4.1336 − 4.0943 = 0.0393,
r2 = p2 − p1 = 4.1583 − 4.1336 = 0.0247,
r3 = p3 − p2 = 4.1174 − 4.1583 = −0.0409,
and
r4 = p4 − p3 = 4.1897 − 4.1174 = 0.0723.
Alternatively, the log-returns may be calculated from the returns; for example
R1 = 0.04 so that
r1 = log(1 + R1 ) = log(1 + 0.04) = log(1.04) = 0.0392,
with the difference between this and our previous result due to round-off
error. The three-period log-return at time 4 is
Dividends
Now suppose that there are dividends. Let Dt represent the dividend paid
immediately prior to time t, that is, after time t − 1 but before time t; for
convenience, we will refer to such a dividend as being paid “at time t.” Then
the gross return from time t − 1 to time t takes into account the payment of
the dividend, along with the change in price; it is defined as
$$1 + R_t = \frac{P_t + D_t}{P_{t-1}}.$$
The net return is given by
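$$\begin{aligned}
R_t &= \frac{P_t + D_t}{P_{t-1}} - 1 = \left(\frac{P_t}{P_{t-1}} - 1\right) + \frac{D_t}{P_{t-1}} \\
&= (\text{proportional change in price}) + (\text{dividend as a proportion of price at time } t - 1).
\end{aligned}$$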
Note that, when there are dividends, the definition of multiperiod returns
assumes that the dividends are reinvested. To see this, consider the following
example.
Example 2.5 Consider an asset with prices P0 = 8, P1 = 10, and P2 = 12
and with dividends D1 = 2, D2 = 1. Suppose that our initial investment is
$200. The initial price of the asset is P0 = 8; hence, we buy 200/8 = 25 shares.
The price at time t = 1 is P1 = 10; therefore, in time period 1, those shares
are worth (25)(10) = $250, plus we receive a dividend of $2 per share for a
total dividend of 2(25) = $50. The dividends may be used to buy more of the
asset; the price is P1 = 10, so we buy 50/10 = 5 additional shares for a total
of 25 + 5 = 30 shares at the end of time 1.
At time t = 2, the price of the asset is P2 = 12, so those 30 shares are worth
(30)(12) = $360, plus we receive a dividend of $1 per share or 1(30) = $30 for
the 30 shares, leading to a total worth of $390. Thus, our initial investment of
$200 is worth $390, a net return of 390/200 − 1 = 0.95 over the two periods,
which agrees with [(12 + 1)/10][(10 + 2)/8] − 1 = 0.95.
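The reinvestment argument in Example 2.5 can be traced step by step. The sketch below uses the prices and dividends from the example; the loop structure and variable names are illustrative assumptions, not part of the text. Reinvesting the final dividend at the final price does not change the total value, so the loop treats both periods the same way.

```r
# Prices P0, P1, P2 and dividends D1, D2 from Example 2.5
P <- c(8, 10, 12)
D <- c(2, 1)
initial <- 200

# Buy shares at P0, then reinvest each dividend at the current price
shares <- initial / P[1]
for (t in 1:2) {
  dividend <- shares * D[t]               # cash dividend received at time t
  shares <- shares + dividend / P[t + 1]  # buy more shares at price P_t
}
value <- shares * P[3]                    # worth of the position at time 2

# Two-period net return from the reinvestment calculation
ret_reinvest <- value / initial - 1

# Two-period net return from the product of single-period gross returns
ret_product <- prod((P[-1] + D) / P[-3]) - 1

print(c(value = value, reinvest = ret_reinvest, product = ret_product))
```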
Note, however, that the log-return is no longer directly related to the change
in the log-price.
Multiperiod returns for log-returns in the presence of dividends are defined
as the sum of the single-period log-returns:
rt (k) = rt + · · · + rt−k+1 , t = k, k + 1, . . . , T.
To see why this is true, consider one share of a particular stock and suppose
that a dividend Dt is paid at time t. Investors selling the stock at time t − 1
will receive Pt−1 ; investors selling the stock at time t receive Pt + Dt . Under
the assumption that the intrinsic value of the investment is stable from time
t − 1 to time t, we must have
Pt−1 = Pt + Dt ,
that is,
Pt = Pt−1 − Dt .
Thus, when measuring how the value of a share of stock changes from time
t − 1 to time t, we should compare Pt to Pt−1 − Dt rather than to Pt−1 . In a
sense, the “effective price” at time t − 1 is Pt−1 − Dt .
This reasoning is the basis for defining adjusted prices. Let P0 , P1 , . . . , PT
denote a sequence of prices of an asset, let D1 , D2 , . . . , DT denote a sequence
of dividends paid by the asset, and let P̄0 , P̄1 , . . . , P̄T denote the corresponding
sequence of adjusted prices. Define P̄T = PT and
$$\bar{P}_{T-1} = P_{T-1} - D_T = \left(1 - \frac{D_T}{P_{T-1}}\right) P_{T-1}.$$
Note that the ratio of the adjusted prices may be written
$$\frac{\bar{P}_T}{\bar{P}_{T-1}} = \frac{P_T}{P_{T-1} - D_T} \tag{2.1}$$
so that it reflects the ratio of the prices, taking into account the dividend DT .
To define the adjusted price at time T − 2, P̄T −2 , we use the relationship
between P̄T −1 and P̄T −2 implied by (2.1):
$$\frac{\bar{P}_{T-1}}{\bar{P}_{T-2}} = \frac{P_{T-1}}{P_{T-2} - D_{T-1}}.$$
Solving for P̄T −2 ,
$$\bar{P}_{T-2} = \frac{P_{T-2} - D_{T-1}}{P_{T-1}} \, \bar{P}_{T-1} = \left(1 - \frac{D_{T-1}}{P_{T-2}}\right) \frac{\bar{P}_{T-1}}{P_{T-1}} \, P_{T-2}.$$
Using the fact that
$$\frac{\bar{P}_{T-1}}{P_{T-1}} = 1 - \frac{D_T}{P_{T-1}},$$
it follows that
$$\bar{P}_{T-2} = \left(1 - \frac{D_T}{P_{T-1}}\right)\left(1 - \frac{D_{T-1}}{P_{T-2}}\right) P_{T-2}.$$
This relationship may be generalized to
$$\bar{P}_{T-k} = \left(1 - \frac{D_T}{P_{T-1}}\right)\left(1 - \frac{D_{T-1}}{P_{T-2}}\right) \cdots \left(1 - \frac{D_{T-k+1}}{P_{T-k}}\right) P_{T-k}, \quad k = 1, 2, \ldots, T.$$
Thus, the adjusted prices describe the changes in a stock’s value, taking into
account dividends.
and
$$\bar{P}_0 = \left(1 - \frac{1}{62.40}\right)\left(1 - \frac{2}{60}\right) 60 = 57.07.$$
There are some unexpected properties of adjusted prices that are important
to keep in mind. One is that when a dividend occurs in the current period,
the entire series of adjusted prices changes.
To see this, let P̄0, P̄1, . . . , P̄T denote the adjusted prices based on observing
prices and dividends for periods 0, 1, . . . , T. Now suppose that we observe
PT+1 and DT+1; let P̃0, P̃1, . . . , P̃T+1 denote the adjusted prices based on
observing prices and dividends for periods 0, 1, . . . , T, T + 1. Then
$$\tilde{P}_{T+1} = P_{T+1},$$
$$\tilde{P}_T = \left(1 - \frac{D_{T+1}}{P_T}\right) P_T = \left(1 - \frac{D_{T+1}}{P_T}\right) \bar{P}_T,$$
$$\tilde{P}_{T-1} = \left(1 - \frac{D_{T+1}}{P_T}\right)\left(1 - \frac{D_T}{P_{T-1}}\right) P_{T-1} = \left(1 - \frac{D_{T+1}}{P_T}\right) \bar{P}_{T-1},$$
and so on. In general,
$$\tilde{P}_{T-k} = \left(1 - \frac{D_{T+1}}{P_T}\right) \bar{P}_{T-k}, \quad k = 1, 2, \ldots, T.$$
Example 2.7 For the asset described in Example 2.6, suppose that we
observe an additional time period, with P3 = 61.40 and D3 = 3. Then the
updated adjusted prices are P̄3 = 61.40,
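$$\bar{P}_2 = \left(1 - \frac{3}{63.96}\right) 63.96 = 60.96,$$
$$\bar{P}_1 = \left(1 - \frac{3}{63.96}\right)\left(1 - \frac{1}{62.40}\right) 62.40 = 58.52,$$
$$\bar{P}_0 = \left(1 - \frac{3}{63.96}\right)\left(1 - \frac{1}{62.40}\right)\left(1 - \frac{2}{60}\right) 60 = 54.39.$$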
The series of adjusted prices is now 54.39, 58.52, 60.96, and 61.40, corresponding
to periods 0, 1, 2, 3, respectively. These values can be compared with
the series of adjusted prices 57.07, 61.40, and 63.96 for periods 0, 1, and 2,
respectively, that were computed before observing period 3.
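The adjusted-price formula and the rescaling property can be reproduced with a short function. This is a sketch under the definitions given above; the function name adjusted_prices and its argument layout are hypothetical and are not taken from the text or from any package.

```r
# Hypothetical helper: adjusted prices from raw prices and dividends.
# P: prices P_0, ..., P_T (length T + 1); D: dividends D_1, ..., D_T (length T)
adjusted_prices <- function(P, D) {
  n <- length(P)
  stopifnot(length(D) == n - 1)
  factors <- 1 - D / P[-n]                  # 1 - D_t / P_{t-1}, t = 1, ..., T
  # The adjusted price at time j multiplies P_j by the factors of all later dividends
  mult <- c(rev(cumprod(rev(factors))), 1)
  P * mult
}

# Prices and dividends for periods 0, 1, 2, as implied by the values in the text
adj_before <- adjusted_prices(c(60, 62.40, 63.96), c(2, 1))

# After observing period 3 with P3 = 61.40 and D3 = 3 (Example 2.7)
adj_after <- adjusted_prices(c(60, 62.40, 63.96, 61.40), c(2, 1, 3))

print(round(adj_before, 2))
print(round(adj_after, 2))

# Every earlier adjusted price is rescaled by the same factor 1 - D3 / P2
print(adj_after[1:3] / adj_before)
```

Because the rescaling factor is common to all earlier periods, ratios of adjusted prices, and hence returns computed from them, are unchanged for those periods.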
This is close to, but slightly different from, the actual return calculated previously,
with a difference of 0.00003. Note that here D1 /P0 = 0.0066 and
P1 /(P0 − D1 ) = 1.0045.
Now consider monthly returns. Let P0 denote the price of one share of
Target stock at the end of April 2015, and let P1 denote the price at the end
of May 2015. Then P0 = $78.83, P1 = $79.32, and D1 = $0.52. The adjusted
monthly prices are P̄0 = $78.31 and P̄1 = $79.32. Then the monthly return
on Target stock is 0.01281; using the adjusted prices, the monthly return is
0.01290, again, a slight difference.
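The two monthly figures quoted for Target stock can be recomputed directly from the stated prices and dividend. The sketch below uses only the numbers given above; the variable names are illustrative.

```r
# Target stock, end of April 2015 to end of May 2015 (values from the text)
P0 <- 78.83
P1 <- 79.32
D1 <- 0.52

# Actual monthly return, including the dividend
R_actual <- (P1 + D1) / P0 - 1

# Adjusted prices: the last price is unchanged, the earlier one is reduced by D1
P0_adj <- (1 - D1 / P0) * P0   # equals P0 - D1
P1_adj <- P1

# Monthly return computed from adjusted prices
R_adjusted <- P1_adj / P0_adj - 1

print(round(c(P0_adj = P0_adj, actual = R_actual, adjusted = R_adjusted), 5))
```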
Adjusted stock prices are generally adjusted for stock splits as well as
for dividends. A stock split occurs when a company decides to proportionally
increase the number of shares owned by investors. For instance, in a
two-for-one stock split, the owner of each share of stock is given a second
share, in a sense, splitting each share into two. Of course, the price of the
shares is adjusted accordingly.
Adjusted prices better reflect changes in the asset’s value over time, and
they are a useful alternative to the raw prices. In the remainder of this book,
the term “prices” will always refer to adjusted prices and the term “returns”
will always refer to returns calculated from adjusted prices. The notations
used for prices, returns, log-prices, and so on will refer to quantities based
on the adjusted prices; for example, Pt will be used to denote the adjusted
price of an asset at time t and Rt will be used to denote the return based on
adjusted prices.
μt = E(Yt ), t = 1, 2, . . .
denote the mean function of the process so that μ3 = E(Y3 ), for example.
Similarly, the variance function of the process is given by
$$\sigma_t^2 = \mathrm{Var}(Y_t), \quad t = 1, 2, \ldots.$$
γ0 (t, s) = Cov(Yt , Ys ), t, s = 1, 2, . . . .
Weak Stationarity
In financial applications, the assumption of stationarity is generally stronger
than is needed. Furthermore, because it refers to the entire distribution of
each random variable, it is difficult to verify in practice. Hence, a weaker
version of stationarity, based on means, variances, and covariances, is often
used.
Yt = Zt − Zt−1 , t = 1, 2, . . .
and that
Example 2.13 Let Z1 , Z2 , . . . denote i.i.d. random variables each with mean
0 and standard deviation σ. Define
$$X_t = \frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}, \quad t = 1, 2, \ldots$$
and consider the properties of the process {Xt : t = 1, 2, . . .}.
Note that
$$E(X_t) = \frac{1}{\sqrt{t}} E(Z_1) + \cdots + \frac{1}{\sqrt{t}} E(Z_t) = 0$$
and
$$\mathrm{Var}(X_t) = \frac{1}{t} \mathrm{Var}(Z_1) + \cdots + \frac{1}{t} \mathrm{Var}(Z_t) = \frac{1}{t} \, t \sigma^2 = \sigma^2.$$
Now consider Cov(Xt , Xs ) for t ≠ s. The calculation is simpler if we know
which of t and s is smaller; note that, without loss of generality, we may
assume that t < s. Then
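$$\begin{aligned}
\mathrm{Cov}(X_t, X_s) &= \mathrm{Cov}\left(\frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}},\ \frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{s}} + \frac{Z_{t+1} + Z_{t+2} + \cdots + Z_s}{\sqrt{s}}\right) \\
&= \mathrm{Cov}\left(\frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}},\ \frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{s}}\right) \\
&\quad + \mathrm{Cov}\left(\frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}},\ \frac{Z_{t+1} + Z_{t+2} + \cdots + Z_s}{\sqrt{s}}\right).
\end{aligned}$$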
$$\bar{Y} = \frac{1}{T} \sum_{t=1}^{T} Y_t$$
(Y1 , Y2 ), (Y2 , Y3 ), . . ., and so on. Parameter estimation using these ideas will
be considered in detail in the following section.
Cov(Yt , Ys ) = Cov(Z + Zt , Z + Zs )
= Cov(Z, Z) + Cov(Z, Zs ) + Cov(Zt , Z) + Cov(Zt , Zs )
= Var(Z)
= 1.
To find the covariance of Xt and Xs , we may use the same general approach
used in Example 2.13. Note that, without loss of generality, we may assume
that t < s. Then
Cov(Xt , Xs ) = Cov(Z1 + · · · + Zt , Z1 + · · · + Zs )
= Cov(Z1 + · · · + Zt , Z1 + · · · + Zt )
+ Cov(Z1 + · · · + Zt , Zt+1 + · · · + Zs ).
Cov(Xt , Xs ) = t;
thus, Cov(Xt , Xs ) = min{t, s}. Because Var(Xt ) depends on t and the
covariance of Xt and Xs is not a function of |t − s|, the process {Xt : t = 1, 2, . . .}
is not weakly stationary.
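The result Cov(Xt , Xs ) = min{t, s} can be illustrated by simulation. The sketch below assumes, for illustration only, that Z1 , Z2 , . . . are independent standard normal random variables (the argument above requires only independence and unit variance); the number of replications, the seed, and all variable names are arbitrary choices made here.

```r
set.seed(123)
n_rep <- 50000   # number of simulated paths (arbitrary)
t_max <- 10      # length of each path

# Each row of Z holds Z_1, ..., Z_10 for one replication
Z <- matrix(rnorm(n_rep * t_max), nrow = n_rep)

# X_t = Z_1 + ... + Z_t; apply() returns paths in columns, so transpose
X <- t(apply(Z, 1, cumsum))

# Empirical covariance matrix of (X_1, ..., X_10) across replications
emp_cov <- cov(X)

# Theoretical covariance matrix with (t, s) entry min{t, s}
theory <- outer(1:t_max, 1:t_max, pmin)

# The diagonal shows Var(X_t) growing with t; off-diagonal entries follow min{t, s}
print(round(diag(emp_cov), 2))
print(max(abs(emp_cov - theory)))
```

The growing diagonal of emp_cov is the simulated counterpart of the statement that Var(Xt ) depends on t.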
dates of the time period under consideration, quote, which specifies the data
to be downloaded, for which we use AdjClose for the adjusted closing price,
and compression, which specifies the sampling frequency of the data, the
time interval over which data are recorded. We may view this choice in terms
of the return interval, the length of the time period over which each return is
calculated. Typical choices are days or months, but sometimes weeks or years
are used.
Daily data have the advantage that more observations are available in a
given time period and that they may reflect more subtle changes in the price
of the asset. On the other hand, investment decisions are often made on a
monthly basis and, in many cases, monthly returns are more stable than daily
returns. Hence, both daily and monthly data are commonly used. For daily
data, we use compression = "d"; for monthly data, "d" is replaced by "m".
> library(tseries)
> x<-get.hist.quote(instrument="WMT", start="2009-12-31",
+ end="2014-12-31", quote="AdjClose", compression="d")
> wmt<-as.vector(x)
This command assigns five years of Wal-Mart price data to the variable x;
the format of x is known as “zoo,” an R data format for irregularly observed
time series. Here, we analyze the prices as a standard vector; hence, the
command wmt<-as.vector(x) converts x to a standard vector and assigns it
to the variable wmt. To check the contents of wmt, we can use the command
head, which displays the first few elements of the vector.
> head(wmt)
[1] 45.64114 46.30719 45.84608 45.74361 45.76923 45.53868
> length(wmt)
[1] 1259
Thus, the first adjusted price of Wal-Mart stock in the sequence is $45.64114
and there are 1259 prices in the variable wmt.
The number of significant figures displayed can be controlled by the digits
argument of the options function. For instance, options(digits=5) limits
the number of significant figures printed to five. Throughout this book, the
number of digits will be adjusted without comment, based on the context of
the example, the desire to fit the output to the page, and so on.
> options(digits=5)
> head(wmt)
[1] 45.641 46.307 45.846 45.744 45.769 45.539
Note that, when displaying the contents of a vector, the number of digits
shown is chosen so that all elements of the vector have the required number
of significant figures. For example,
> options(digits=2)
> c(11/100000, 1/3)
[1] 0.00011 0.33333
> summary(wmt.ret)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.04660 -0.00452 0.00066 0.00052 0.00558 0.04720
> sd(wmt.ret)
[1] 0.0091715
Therefore, roughly 5% of the sample values are less than or equal to −0.01352;
the help file for the function quantile gives details on the exact method of
calculation of the sample quantiles. Note that the 25% and 75% quantiles
correspond to the sample quartiles calculated using the function summary and
the 50% quantile corresponds to the sample median.
FIGURE 2.1
Time series plot of Wal-Mart daily stock prices (vertical axis: stock price).
FIGURE 2.2
Time series plot of Wal-Mart daily stock returns (vertical axis: return).
The following commands give plots of the Wal-Mart stock prices and
returns, given in Figures 2.1 and 2.2, respectively. Note that, if the plot
command has only one argument, it is understood to be the y-variable, with
the x-variable taken to be the corresponding index (1 to 1259 in the case of
wmt and 1 to 1258 in the case of wmt.ret); type="l" specifies that the plot
be drawn as lines connecting the points, which are not displayed.
> plot(wmt, type="l", ylab="Price", xlab="Time")
> plot(wmt.ret, type="l", ylab="Return", xlab="Time")
Monthly Returns
So far in this section, we have analyzed daily prices and daily returns.
In practice, models for asset returns are often based on monthly returns,
which, in many cases, correspond better to the investment horizon of interest
and are often more stable than daily returns. To obtain monthly returns, we
use the get.hist.quote function with compression="m".
Example 2.16 The commands
> x<-get.hist.quote(instrument="WMT", start="2009-12-31",
+ end="2014-12-31", quote="AdjClose", compression="m")
> wmt.m<-as.vector(x)
return the prices of Wal-Mart stock for the last trading day of the month,
for each month from December 2009 to December 2014, storing them in the
vector wmt.m.
Thus, there are 61 monthly prices in wmt.m, which lead to 60 monthly
returns.
> length(wmt.m)
[1] 61
> wmt.m.ret<-(wmt.m[-1]-wmt.m[-61])/wmt.m[-61]
> length(wmt.m.ret)
[1] 60
> head(wmt.m.ret)
[1] -0.000374 0.011978 0.034093 -0.035252 -0.051944 -0.049248
There is one possible pitfall when downloading monthly data—the last
price returned corresponds to the last trading day that occurred on or before
the day listed, even if that day is not the last day of the month. For example,
if we use the command
> x<-get.hist.quote(instrument="WMT", start="2009-12-31",
+ end="2014-12-15", quote="AdjClose", compression="m")
the last price in x will be the price for December 15, 2014. Thus, the last
monthly return will correspond to only the first half of December 2014. Therefore,
when downloading monthly returns, it is important that the end date be
the last day of the month under consideration.
Figure 2.3 contains a time series plot of the monthly returns on Wal-Mart
stock.
FIGURE 2.3
Time series plot of Wal-Mart stock monthly returns (return versus time).
FIGURE 2.4
Alternative time series plot of Wal-Mart stock monthly returns (return versus time).
plot of daily returns; however, when plotting a large number of values, the
points tend to overwhelm the plot.
> summary(wmt.m.ret)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.07294 -0.02235 0.01192 0.01094 0.03848 0.14780
> sd(wmt.m.ret)
[1] 0.0441572
The mean monthly return is about 21.6 times the mean daily return and
the standard deviation of the monthly returns is about 4.8 times as large as
the standard deviation of the daily returns; both of these ratios are close to
what we would expect.
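The expected ratios can be worked out from the sample sizes reported earlier. The sketch below assumes, as a rough approximation that is not stated in the text, that a monthly return behaves like a sum of uncorrelated daily returns with a common mean and variance; under that assumption the mean scales with the number of trading days per month and the standard deviation with its square root.

```r
# Sample sizes reported in the text: 1258 daily returns and 60 monthly returns
n_daily <- 1258
n_monthly <- 60
days_per_month <- n_daily / n_monthly      # average trading days per month

# Ratios expected under the approximation described above
mean_ratio_expected <- days_per_month
sd_ratio_expected <- sqrt(days_per_month)

# Observed ratio of standard deviations, from the printed sd() values
sd_ratio_observed <- 0.0441572 / 0.0091715

print(round(c(days_per_month = days_per_month,
              mean_ratio_expected = mean_ratio_expected,
              sd_ratio_expected = sd_ratio_expected,
              sd_ratio_observed = sd_ratio_observed), 2))
```

The printed summary values are rounded, so a mean ratio recomputed from them will differ slightly from one based on the unrounded sample means.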
over time. However, the variability inherent in these plots makes assessment
of such properties difficult.
For instance, suppose we are interested in the mean function μt . If μt
is not constant as a function of t, then we have, in general, only a single
observation, Rt , with expected value μt ; hence, it is difficult, if not impossible,
to accurately estimate μt . Thus, it is important to attempt to determine if
the observed returns are consistent with a mean function μt that is constant
over time. One approach to assessing the extent to which μt varies with t is
to calculate running means.
Suppose we observe returns R1 , R2 , . . . , RT and consider running means
based on w observations for some positive integer w. Then the first running
mean is the average of the first w observations:
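$$\bar{R}_{1,w} \equiv \frac{1}{w} \sum_{t=1}^{w} R_t;$$
the second running mean is the average of observations 2 through w + 1:
$$\bar{R}_{2,w} \equiv \frac{1}{w} \sum_{t=2}^{w+1} R_t,$$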
and so on. The result is a sequence R̄1,w , R̄2,w , . . . , R̄T −w+1,w such that each
element in the sequence is an average of w returns.
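The running means just defined can be computed in base R with a short helper. This is a minimal sketch: the function name running_mean is hypothetical, and the input is simulated data standing in for monthly returns, not the Wal-Mart series.

```r
# Hypothetical helper: running means of x based on windows of w observations
running_mean <- function(x, w) {
  n <- length(x)
  stopifnot(w >= 1, w <= n)
  # The i-th running mean averages observations i, i + 1, ..., i + w - 1
  sapply(1:(n - w + 1), function(i) mean(x[i:(i + w - 1)]))
}

# Simulated stand-in for 60 monthly returns (illustrative values only)
set.seed(42)
R_sim <- rnorm(60, mean = 0.01, sd = 0.04)

rm12 <- running_mean(R_sim, w = 12)

# With T = 60 and w = 12 there are T - w + 1 = 49 running means
print(length(rm12))
plot(rm12, type = "l", ylab = "Running mean", xlab = "Time")
```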
Note that
$$E(\bar{R}_{t,w}) = \frac{1}{w} \sum_{j=t}^{w+t-1} \mu_j$$
FIGURE 2.5
Time series plot of running means of monthly returns on Wal-Mart stock (return versus time).
When analyzing a plot such as Figure 2.5, there are a few things to keep
in mind. For instance, the error bars are based on the standard error of a
single sample mean centered at the sample mean based on all observations.
It is important to realize that the plot is based on many sample means (49 in
the case of Figure 2.5); the standard error does not apply to the maximum of
those sample means, for example. Therefore, we would not conclude that μt is
nonconstant simply because the plotted line crosses an error bar at some point.
Also, the random variables corresponding to the plotted running means are
often highly correlated; in the present case, adjacent running means are based
on sets of 12 observations, 11 of which are included in both running means.
Hence, if one running mean is large, it is likely that the neighboring running
means are large as well. Therefore, the error bars are useful for giving a rough
idea of the expected variability in the running means if the underlying process
is weakly stationary; however, they should not be used for any type of formal
inference.
The same approach used here for running means may be used for any other
summary statistic of the returns. Most useful in this regard is the standard
deviation; recall that if {Rt : t = 1, 2, . . .} is a weakly stationary process, then
$\sigma_t^2 = \mathrm{Var}(R_t)$ does not depend on t. Running standard deviations may also be
calculated using the running command.
FIGURE 2.6
Time series plot of running standard deviations of monthly returns on
Wal-Mart stock (horizontal axis: time).
From these data we have T − 1 pairs of observations one time period apart:
Then γ(1), the covariance of two observations one time period apart, may be
estimated by
T −1
1
T −h
1
$$\hat{\rho}(h) = \frac{\hat{\gamma}(h)}{\hat{\gamma}(0)}, \quad h = 1, 2, \ldots.$$
for ρ̂(h) to take values outside the interval [−1, 1]. Similarly, we take γ̂(0) to
be the sample variance of Y1 , . . . , YT . An estimator of ρ(1) is then given by
$$\hat{\rho}(1) = \frac{\hat{\gamma}(1)}{\hat{\gamma}(0)}.$$
To calculate the sample autocorrelation function for a set of observed
returns in R, we can use the function acf.
Example 2.20 Consider the monthly returns on Wal-Mart stock stored in the
variable wmt.m.ret. To calculate the sample autocorrelation of these returns,
we use
> print(acf(wmt.m.ret, lag.max=12))
Autocorrelations of series wmt.m.ret, by lag
0 1 2 3 4 5 6
1.000 -0.06 -0.04 -0.156 -0.043 -0.122 -0.007
7 8 9 10 11 12
0.119 0.043 0.136 -0.075 -0.221 0.000
The argument lag.max sets the maximum value of h for which to calculate
ρ̂(h); for monthly data, 12 is a reasonable maximum value. The acf
command also produces a plot of the sample autocorrelation function; the
plot for Wal-Mart monthly returns is given in Figure 2.7. Note that, without
using the function print, only the plot is produced.
The dashed lines in the plot are at the values ±(2/√T); if the true autocorrelation
is 0, the standard error of the estimated autocorrelation is 1/√T.
FIGURE 2.7
Sample autocorrelation function for monthly returns on Wal-Mart stock (ACF versus lag).
Thus, the dashed lines give some indication of the statistical significance of the
autocorrelation estimates. However, it is important to keep in mind that this
standard error applies to each individual estimate (not the maximum of a
series of estimates, for example); thus, these error bars on the plot are best
used as a rough guide when evaluating the magnitude of the estimates. Also
note that the plot includes the value of the sample autocorrelation function
at h = 0, which is always 1; thus, this value contains no information.
For the Wal-Mart monthly returns, all sample autocorrelations are relatively
small in magnitude and are roughly consistent with a series of
uncorrelated random variables. A formal hypothesis test based on the sample
autocorrelation function will be discussed in Section 3.5.
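To make the ratio ρ̂(h) = γ̂(h)/γ̂(0) concrete, the sketch below computes it by hand and compares the result with acf. It assumes the convention used by acf, in which the autocovariance at every lag, including lag 0, uses the divisor T; the data are simulated stand-ins rather than the Wal-Mart returns, and the names y, gamma_hat, and rho_hat are illustrative.

```r
# Simulated stand-in for 60 monthly returns (illustrative values only)
set.seed(1)
y <- rnorm(60, mean = 0.01, sd = 0.04)
n_obs <- length(y)
ybar <- mean(y)

# Sample autocovariance at lag h with divisor n_obs (the acf() convention)
gamma_hat <- function(h) {
  sum((y[1:(n_obs - h)] - ybar) * (y[(1 + h):n_obs] - ybar)) / n_obs
}

# Sample autocorrelations at lags 1, ..., 12
rho_hat <- sapply(1:12, function(h) gamma_hat(h) / gamma_hat(0))

# The same quantities from acf(), dropping the lag-0 value of 1
acf_vals <- as.vector(acf(y, lag.max = 12, plot = FALSE)$acf)[-1]
print(all.equal(rho_hat, acf_vals))

# Rough guide used in the text: compare each estimate with 2 / sqrt(T)
bound <- 2 / sqrt(n_obs)
print(which(abs(rho_hat) > bound))
```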
FIGURE 2.8
Histogram of daily returns on Wal-Mart stock (frequency versus wmt.ret).
FIGURE 2.9
Histogram of daily returns on Wal-Mart stock using the Freedman–Diaconis
rule (frequency versus wmt.ret).
> x_norm<-rnorm(100)
> x_t<-rt(100, df=3)
> x_chisq<-rchisq(100, df=5)
> x_unif<-runif(100)
To construct the normal probability plot for the data in x_norm, we use
the command
> qqnorm(x_norm)
To add the reference line, we can use the function abline, as described
previously:
> abline(a=mean(x_norm), b=sd(x_norm))
Figure 2.10 gives the normal probability plots for the four randomly generated
sets of data. Note that the plotted points for the normal data generally
follow the line, although there are some minor deviations. The points for the t
data are below the line when the normal quantile is large and negative and are
above the line when the normal quantile is large and positive. This indicates
that the quantiles of the sample corresponding to probabilities close to one are
more extreme than those of the normal distribution, and the quantiles of the
sample corresponding to probabilities close to zero are also more extreme than
those of the normal distribution; that is, the sample distribution is long-tailed.
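The long-tailed pattern can also be seen by comparing theoretical quantiles directly. The sketch below compares the standard normal with a t-distribution with 3 degrees of freedom, rescaled to have standard deviation 1. The rescaling is an assumption made here to mirror the reference line, whose slope is the sample standard deviation, and the probabilities in p are illustrative choices.

```r
# Tail probabilities close to zero and close to one.
p <- c(0.001, 0.01, 0.99, 0.999)

# Quantiles of the standard normal distribution.
q_norm <- qnorm(p)

# Quantiles of a t-distribution with 3 degrees of freedom, divided by its
# standard deviation sqrt(df / (df - 2)) = sqrt(3) so both have variance 1.
q_t_std <- qt(p, df = 3) / sqrt(3)

# In both tails the standardized t quantiles are further from 0.
more_extreme <- abs(q_t_std) > abs(q_norm)
cbind(p, q_norm = round(q_norm, 3), q_t_std = round(q_t_std, 3), more_extreme)
```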
The chi-squared distribution has a long right tail but a short left tail.
Therefore, the quantiles corresponding to probabilities close to one are larger
than those of the normal distribution while the reverse is true for quantiles
[Plot omitted: four panels, each showing Sample quantiles against Theoretical quantiles.]
FIGURE 2.10
Normal probability plots for randomly generated data.
Example 2.23 The normal probability plot for the daily returns on Wal-Mart
stock analyzed in Example 2.21 can be constructed using the commands
> qqnorm(wmt.ret)
> abline(a=mean(wmt.ret), b=sd(wmt.ret))
The plot is given in Figure 2.11. According to this plot, the distribu-
tion of daily returns on Wal-Mart stock is approximately symmetric but
long-tailed.
The results for the returns on Wal-Mart stock presented in the previous
example are typical of stock return data: the distributions tend to be
long-tailed relative to the normal distribution. For instance, a t-distribution
with a small number of degrees of freedom, for example, six, is sometimes used
as a model for return data.
[Plot omitted: Sample quantiles against Theoretical quantiles.]
FIGURE 2.11
Normal probability plot of daily returns on Wal-Mart stock.
2.7 Exercises
1. Consider an asset with prices P0 = $10.00, P1 = $10.40, P2 =
$10.20, P3 = $11.00, and P4 = $11.10. Find the corresponding
returns and log-returns.
2. Suppose that an asset has prices of the form Pt = a exp(bt), t =
0, 1, 2, . . . for some constants a > 0 and b. Find expressions for the
return at time t, Rt , and the log-return at time t, rt .
3. Consider an asset with return Rt and log-return rt at time t. Using
a Taylor’s series approximation, find a quadratic function of Rt that
approximates rt for small values of |Rt |.
Yt = Xt − Xt−1 , t = 1, 2, . . . , .
Xt = ZZt , t = 1, 2, . . . .
r̃1 = r1 + r2 + · · · + r21 ,
$$\bar{X}_{k,w} = \frac{1}{w}\sum_{t=k}^{w+k-1} X_t, \qquad k = 1, 2, \ldots, T - w + 1$$
Yj = X̄k,w , j = 1, 2, . . . , T − w + 1.
3.1 Introduction
Let P0 , P1 , P2 , . . . , denote a sequence of prices of an asset, such as one share
of a particular stock, and let R1 , R2 , . . . denote the corresponding sequence
of returns. Clearly, in making investment decisions, it would be useful to
be able to use historical price and return data to predict future prices and
returns. For instance, “technical analysis” is based on the belief that there
are certain patterns in stock market data that tend to appear frequently and,
by recognizing such patterns, it is possible to make useful predictions about
the future prices and returns. A more statistical approach may look for a
model that relates future prices or returns to past values and may use such a
model for prediction.
An alternative viewpoint is that changes in the price of a stock are essen-
tially unpredictable; this theory is known as the random walk hypothesis. Note
that the random walk hypothesis does not mean that all stock prices are totally
random and that one stock is as good as any other. The random walk hypoth-
esis is a statement about the lack of useful statistical relationships between
past and future prices and returns. However, one stock might still have a
higher average return than another stock. Thus, past returns on a stock may
provide useful information about future returns in the sense that they may be
used to estimate the parameters of the return distribution.
However, we are more concerned with those properties that describe how
E(Y |X) reflects the relationship between X and Y .
The following lemma gives three such properties; see, for example,
Blitzstein and Hwang (2015, Chapter 9) for proofs of these results, along
with a detailed discussion of conditional expectation.
Lemma 3.1. Let Y and X denote random variables and let g(·) be a
real-valued function on the range of X.
Part 1 of the lemma states that the expected value of the random vari-
able E(Y |X) is the same as the expected value of Y . This result may give a
convenient way to calculate E(Y ), if E(Y |X) is easily obtained, using
Part 2 states that, when computing E(Y g(X)|X), we may treat g(X) as
a constant and factor it out of the conditional expectation calculation. Sup-
pose X and Y are independent, then treating X as fixed does not change
the expected value of Y ; it follows that E(Y |X = x) = E(Y ), leading to
Part 3 of the lemma. Note that these three results continue to hold if X is a
vector-valued random variable of the form (X1 , X2 , . . . , Xm )T .
Example 3.1 Let X and Y denote real-valued random variables and let
Ŷ = E(Y |X). Consider the covariance of Ŷ and X, Cov(Ŷ , X); recall that we
may write
Cov(Ŷ , X) = E(Ŷ X) − E(Ŷ )E(X).
it follows that
That is, the covariance of E(Y |X) and X is the same as the covariance of
Y and X.
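The step leading to this conclusion can be reconstructed from Parts 1 and 2 of Lemma 3.1, taking g(X) = X in Part 2. The following is a sketch of that argument and not a reproduction of the original display.

```latex
\begin{align*}
E(\hat{Y}) &= E\{E(Y|X)\} = E(Y), \\
E(\hat{Y}X) &= E\{X\,E(Y|X)\} = E\{E(XY|X)\} = E(XY), \\
\operatorname{Cov}(\hat{Y}, X) &= E(XY) - E(Y)E(X) = \operatorname{Cov}(Y, X).
\end{align*}
```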
The random variable E(Y |X) may be interpreted as the function of X that
“best approximates” Y , and it is sometimes described as the “best predictor”
of Y among functions of X. This idea is made precise by the following lemma.
Lemma 3.2. Let Y denote a real-valued random variable and let X denote a
random variable, possibly vector-valued. For any real-valued function g on the
range of X,
$$E\{(Y - E(Y|X))^2\} \le E\{(Y - g(X))^2\}$$
with equality if and only if g(X) = E(Y |X) with probability one.
Before presenting the proof of this result, consider its relationship to pre-
diction. Suppose that our goal is to choose a function of X to approximate Y .
For instance, suppose that Y is the future price of an asset and X is a vector
containing past prices of the asset; in this case, approximating Y by a function
of X corresponds to predicting the future price using past price data. Suppose
that the quality of the approximation given by a function g(X) is defined as
the expected squared error of the approximation,
$$E\{(Y - g(X))^2\}.$$
Then, according to Lemma 3.2, the best approximation, that is, the best
predictor of Y among functions of X, is given by the conditional expectation
E(Y |X). For many purposes, it is more useful to interpret E(Y |X) as the best
approximation or best predictor of Y among functions of X rather than in
terms of the expected value of the conditional distribution of Y given X = x.
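A small simulation may help to make the best-predictor interpretation concrete. The sketch below assumes a hypothetical model, Y = X^2 + noise with the noise independent of X, so that E(Y|X) = X^2; the competing predictors (a least-squares line and |X|) are arbitrary choices made for illustration.

```r
# Hypothetical model: Y = X^2 + noise, so that E(Y | X) = X^2.
set.seed(123)
n <- 10000
x <- rnorm(n)
y <- x^2 + rnorm(n)

# Predictor 1: the conditional expectation E(Y | X) = X^2.
mse_cond <- mean((y - x^2)^2)

# Predictor 2: the best-fitting linear function of X (least squares).
fit <- lm(y ~ x)
mse_lin <- mean((y - fitted(fit))^2)

# Predictor 3: an arbitrary alternative, g(X) = |X|.
mse_abs <- mean((y - abs(x))^2)

c(conditional = mse_cond, linear = mse_lin, absolute = mse_abs)
```

In this setting, the average squared error of the conditional expectation should be the smallest of the three, in line with Lemma 3.2.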
and
$$E\{(Y - g(X))^2\} \ge E\{(Y - E(Y|X))^2\}$$
with equality if and only if
$$E\{[E(Y|X) - g(X)]^2\} = 0,$$
Example 3.2 Let Y and X denote real-valued random variables such that
Y = α + βX + ε
for some constants α and β, where ε is a mean-0 random variable such that
ε and X are independent. Then
E(Y |X) = E(α + βX + ε|X) = α + βE(X|X) + E(ε|X).
Because ε and X are independent, E(ε|X) = E(ε) = 0 and E(X|X) = X.
Hence,
E(Y |X) = α + βX;
that is, the best predictor of Y among functions of X is α + βX.
Note that this same result holds provided only that E(ε|X) = 0; independence
of ε and X is not required.
Note that the same argument holds if Ωt+1 is replaced by any set of
information that includes Ωt .
The following proposition gives a formal statement of this result. A proof
may be based on formalizing the argument described previously; the details
are omitted.
Proposition 3.1. For any random variable V and for any information sets
Ωt and Ωt+h such that Ωt ⊂ Ωt+h ,
with probability 1.
That is, the best predictor of tomorrow’s price of the stock is today’s price.
Clearly, the same argument works for any price in the future: for any
h = 1, 2, . . .,
The martingale model for asset prices has important implications for the
properties of the corresponding returns. Let Xt+1 = Pt+1 − Pt denote the
change in the price from time t to time t + 1 and consider E(Xt+1 ). Then
E(Xt+1 ) = E{E(Xt+1 |Ωt )}
= E{E(Pt+1 − Pt |Ωt )}
= E{E(Pt+1 |Ωt ) − E(Pt |Ωt )}.
Note that Ωt , the information available at time t, includes Pt ; that is, the
random variable Pt is a function of the information in Ωt . It follows that
E(Pt |Ωt ) = Pt ;
furthermore, the result given in (3.2) shows that E(Pt+1 |Ωt ) = Pt . It follows
that
E(Xt+1 ) = E{E(Pt+1 |Ωt ) − Pt }
= E(Pt − Pt ) = 0.
Thus, price changes have an expected value of 0. Furthermore, the previous
argument shows that
E(Xt+1 |Ωt ) = 0
so that, given the information available at time t, the predicted value of the
change in price from time t to time t + 1 is 0.
Let
$$R_{t+1} = \frac{P_{t+1}}{P_t} - 1 = \frac{P_{t+1} - P_t}{P_t}, \qquad t = 0, 1, 2, \ldots$$
denote the return at time t + 1. Using the fact that Pt is a function of Ωt ,
$$E(R_{t+1}|\Omega_t) = E\left(\frac{P_{t+1} - P_t}{P_t}\,\Big|\,\Omega_t\right) = \frac{1}{P_t}E(P_{t+1} - P_t|\Omega_t) = 0; \tag{3.3}$$
that is, under the martingale model, the best predictor of the return in period
t + 1 using financial information in periods up to and including period t is zero.
Note that this result implies that the (unconditional) expected value of
Rt+1 is also zero:
E(Rt+1 ) = E {E(Rt+1 |Ωt )} = E(0) = 0.
Furthermore, the following result shows that the correlation of any two
returns Rt and Rs , t ≠ s, is zero.
Proposition 3.2. Using the framework of this chapter, define the price of an
asset by
Pt = E(V |Ωt ), t = 0, 1, . . .
and let R1 , R2 , . . ., denote the corresponding returns.
Then, under the martingale model, for any t, s = 1, 2, . . ., t ≠ s,
Cov(Rt , Rs ) = 0.
Cov(Rt , Rs ) = E(Rt Rs ).
Note that
E(Rt Rs ) = E {E(Rt Rs |Ωs−1 )}
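and that, for t ≤ s − 1, Rt and Ps−1 are functions of Ωs−1 . It follows that
$$\begin{aligned}
E(R_t R_s|\Omega_{s-1}) &= R_t E(R_s|\Omega_{s-1}) = R_t E\left(\frac{P_s - P_{s-1}}{P_{s-1}}\,\Big|\,\Omega_{s-1}\right) \\
&= R_t \frac{1}{P_{s-1}} E(P_s - P_{s-1}|\Omega_{s-1}) \\
&= R_t \frac{1}{P_{s-1}} (P_{s-1} - P_{s-1}) = 0,
\end{aligned}$$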
using the fact that E(Ps |Ωs−1 ) = Ps−1 , establishing the result.
but they do not necessarily have zero mean. These models are based on the
concept of a random walk.
Consider a sequence of random variables Y0 , Y1 , Y2 , . . . . A simple model for
such a process is one in which the changes in the process, Yt − Yt−1 , are “random”
in the sense that they have no discernible pattern and no relationships among
them. Such a process is known as a random walk because it can be viewed as
a model for the location on the real line of an “individual” who, at each time
point, moves randomly along the line. The statistical properties of a process of
this type depend on the interpretation of the term “random” used to describe
the movements of the individual. Thus, there are several different technical
definitions of a random walk, corresponding to different interpretations.
For a given stochastic process {Yt : t = 0, 1, 2, . . .}, let Zt = Yt − Yt−1 ,
t = 1, 2, . . . denote the changes in the values of Yt ; the Zt are known as the
increments of the process. Note that when discussing random walks, we will
often include Y0 , the value at time t = 0, in the process; this random variable
is needed to define the first increment Z1 . Thus, Y0 represents the “starting
point” of the process. The increment process {Zt : t = 1, 2, . . .} will generally
start at time t = 1.
Note that, given Y0 ,
Yt = Y0 + Z1 + · · · + Zt , t = 1, 2, . . . (3.4)
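The relationship in (3.4) between a process and its increments can be sketched as follows. The starting value and the normally distributed increments are hypothetical choices made only for illustration; (3.4) itself does not depend on any distributional assumption.

```r
# Hypothetical starting value and increments (any numeric values would do).
set.seed(42)
Y0 <- 100
Z <- rnorm(250, mean = 0.05, sd = 1)   # Z_1, ..., Z_T

# Equation (3.4): Y_t = Y_0 + Z_1 + ... + Z_t, t = 1, ..., T.
Y <- Y0 + cumsum(Z)

# Differencing the path (with Y_0 prepended) recovers the increments.
Z_recovered <- diff(c(Y0, Y))
all.equal(Z, Z_recovered)
```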
Note that
Here, μ and σ are called the drift and volatility, respectively, of the process.
Because Yt = Y0 + Z1 + · · · + Zt , it follows that Yt and Zt+1 are indepen-
dent; using the fact that Yt+1 = Yt + Zt+1 ,
That is, the best predictor of the position of the random walk at time t + 1,
given knowledge of the position at time t, is Yt + μ.
In fact, the same basic argument can be used to show that
that is, the best predictor of Yt+1 , given previous knowledge of all past values
of the process, is Yt + μ.
Weaker forms of the random walk model are based on weaker assumptions
regarding the distribution of the increments. For instance, Random Walk
2 (RW2) assumes that the increments Z1 , Z2 , . . ., together with the initial
value Y0 , are independent random variables, as in RW1; Z1 , Z2 , . . . are each
assumed to have mean μ and standard deviation σ, but they are not necessarily
identically distributed.
Under this model, the result
E(Yt+1 |Y0 , Y1 , . . . , Yt ) = Yt + μ
Cov(Zt , Y0 ) = 0, t = 1, 2, . . . ;
this is known as Random Walk 3 (RW3). That is, in RW3, the increment
process {Zt : t = 1, 2, . . .} is a weak white noise process.
In this case, we can no longer say that the best predictor of Yt+1 based
on Y0 , Y1 , . . . , Yt is Yt + μ. However, the best linear predictor of Yt+1 based on
Y0 , Y1 , . . . , Yt is Yt + μ.
The best linear predictor of Yt+1 based on Y0 , Y1 , . . . , Yt is defined as the
function of the form
a + b0 Y0 + b1 Y1 + · · · + bt Yt
that minimizes
$$E\{(Y_{t+1} - a - b_0 Y_0 - \cdots - b_t Y_t)^2\}. \tag{3.5}$$
a + b0 Y0 + b1 Y1 + · · · + bt Yt = a + b0 Y0 + b1 Y1 + · · · + (bt − 1)Yt + Yt
may be written as
Yt + c + d0 Y0 + d1 Z1 + · · · + dt Zt (3.6)
$$E\{(Z_{t+1} - c - d_0 Y_0 - d_1 Z_1 - \cdots - d_t Z_t)^2\}, \tag{3.7}$$
Recall that, for any random variable X, $E(X^2) = E(X)^2 + \mathrm{Var}(X)$. Note
that Zt+1 − c − d0 Y0 − d1 Z1 − · · · − dt Zt has the expected value
$$\Bigl(1 - \sum_{j=1}^{t} d_j\Bigr)\mu - c - d_0 E(Y_0) \tag{3.8}$$
and variance
$$\Bigl(1 + \sum_{j=1}^{t} d_j^2\Bigr)\sigma^2 + d_0^2\,\mathrm{Var}(Y_0)$$
where here we have used the fact that any pair of Y0 , Z1 , . . . , Zt , Zt+1 is uncor-
related. Clearly, the variance is minimized by d0 = d1 = · · · = dt = 0. Because,
for these choices of d0 , d1 , . . . , dt , the expected value in (3.8) is 0 for c = μ,
it follows that the best linear predictor of Yt+1 is an expression of the form
(3.6) with d0 = d1 = · · · = dt = 0 and c = μ, yielding the expression Yt + μ, as
claimed in the proposition.
Note that the random walk models are related: RW1 implies RW2, which
implies RW3. Thus, if RW3 does not hold, then neither RW2 nor RW1 holds,
and if RW2 does not hold, then RW1 does not hold.
Ut = exp(Y0 ) exp(Z1 + · · · + Zt )
≡ U0 exp(Z1 ) · · · exp(Zt )
where U0 = exp(Y0 ). These ideas may also be applied to RW2 and RW3.
for instance, if {Pt : t = 0, 1, 2, . . .} follows RW3, then the price changes form
a weak white noise process.
However, empirical analyses suggest that changes in prices are often
roughly proportional to the price, in the sense that stocks with higher prices
tend to exhibit larger price changes than stocks with lower prices, generally
speaking.
This behavior may be modeled by assuming that
Pt − Pt−1 = Wt Pt−1 , t = 1, 2, . . . ,
where Wt is a random variable representing the proportional change in price.
Under this assumption, the conditional expectation of Xt = Pt − Pt−1 given
Pt−1 depends on Pt−1 in general. On the other hand,
$$p_t - p_{t-1} = \log\frac{P_t}{P_{t-1}} = \log(W_t + 1)$$
so that, letting Zt = log(Wt + 1), pt − pt−1 = Zt , t = 1, 2, . . . . Thus, if price
changes are proportional to the price, we might expect log-prices to follow a
random walk so that {Pt : t = 0, 1, 2, . . .} follows a geometric random walk.
Note that the increments of the process {pt : t = 0, 1, 2, . . .} are simply
the log-returns. However, a basic argument showing that market efficiency
implies that log-returns are uncorrelated, along the lines of the one we used
in Proposition 3.2 for returns, is not available.
To see why such an argument fails, consider the framework of Section 3.3,
in which Pt = E(V |Ωt ). Then
pt = log E(V |Ωt )
so that
$$r_t = p_t - p_{t-1} = \log\frac{E(V|\Omega_t)}{E(V|\Omega_{t-1})}.$$
It follows that
$$E(r_{t+1} r_t) = E\left\{\log\frac{E(V|\Omega_{t+1})}{E(V|\Omega_t)}\,\log\frac{E(V|\Omega_t)}{E(V|\Omega_{t-1})}\right\}.$$
Because of the presence of the log(·) function, this expression cannot be
usefully simplified.
However, suppose that the returns Rt = (Pt − Pt−1 )/Pt−1 are small. Then
$$r_t = p_t - p_{t-1} = \log\frac{P_t}{P_{t-1}} = \log\left(1 + \frac{P_t - P_{t-1}}{P_{t-1}}\right) \doteq \frac{P_t - P_{t-1}}{P_{t-1}} = R_t.$$
For example, if Rt = 0.02, a fairly large value for a monthly return, then
Pt /Pt−1 = 1.02 and rt = log(1.02) = 0.01980 and approximating rt by Rt
yields an error of 0.0002, or about 1%.
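The arithmetic in this example can be checked directly. The sketch below uses only the value Rt = 0.02 from the text; the variable names are illustrative.

```r
R <- 0.02                 # simple return from the example above
r <- log(1 + R)           # corresponding log-return
abs_err <- R - r          # error from approximating r by R
rel_err <- abs_err / r    # error relative to the log-return

round(c(r = r, abs_err = abs_err, rel_err = rel_err), 5)
```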
$$B = T(T + 2)\sum_{h=1}^{m}\frac{\hat{\rho}(h)^2}{T - h}.$$
Note that B tends to be large when the sample autocorrelations are far from 0.
Under the null hypothesis that RW3 holds for log-prices, so that ρ(1) = ρ(2) =
· · · = ρ(m) = 0, B has, approximately, a chi-squared distribution with m degrees of freedom.
This is known as the Box–Ljung test.
In order to carry out the test, m, the number of lags used to compute B,
must be selected. When the data consist of five years of monthly returns, a
relatively small value of m should be used; for example, m = 12 is a reasonable
choice. For longer series of daily returns, a larger value of m could be used.
Example 3.3 Consider the monthly log-returns for Wal-Mart stock, stored
in the variable wmt.m.logret. To compute the test statistic B and the
corresponding p-value, we may use the Box.test function.
> Box.test(wmt.m.logret, lag=12, type="L")
Box-Ljung test
data: wmt.m.logret
X-squared = 9.9011, df = 12, p-value = 0.6246
The argument lag in the Box.test function specifies m, the number of
lags to be used and the type="L" argument specifies that the Box–Ljung test
be used.
These results indicate that, for m = 12, B = 9.9011 and the p-value is
0.6246. Therefore, based on this test, there is no evidence against the
hypothesis that the log-returns are uncorrelated (at least up to lag 12),
confirming our informal conclusion in Section 2.5.
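To connect the formula for B with the output of Box.test, the statistic can also be computed by hand. The sketch below uses simulated data as a hypothetical stand-in for wmt.m.logret, so its numerical results will differ from those in Example 3.3.

```r
# Hypothetical stand-in for wmt.m.logret: 60 simulated monthly log-returns.
set.seed(2024)
r <- rnorm(60, mean = 0.005, sd = 0.05)
n <- length(r)   # T in the notation of the text
m <- 12          # number of lags

# Sample autocorrelations at lags 1, ..., m (drop lag 0).
rho_hat <- as.numeric(acf(r, lag.max = m, plot = FALSE)$acf)[-1]

# B = T (T + 2) * sum_{h=1}^{m} rho_hat(h)^2 / (T - h)
B <- n * (n + 2) * sum(rho_hat^2 / (n - (1:m)))

# p-value from the chi-squared approximation with m degrees of freedom.
p_value <- pchisq(B, df = m, lower.tail = FALSE)

# The built-in function should agree with the hand computation.
bt <- Box.test(r, lag = m, type = "Ljung-Box")
c(B = B, builtin = unname(bt$statistic), p_value = p_value)
```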
Variance-Ratio Test
The variance-ratio test is based on the following observation. Suppose that
RW3 holds for the log-prices, so that r1 , r2 , . . . , rT each has mean μ and
standard deviation σ and rt , rs are uncorrelated for all t ≠ s. Then
and
$$\mathrm{Var}(r_t + r_{t-1}) = \mathrm{Var}(r_t) + \mathrm{Var}(r_{t-1}) = \sigma^2 + \sigma^2 = 2\sigma^2.$$
More generally, rt + rt−1 + · · · + rt−q+1 has mean qμ and variance qσ².
Note that rt + rt−1 + · · · + rt−q+1 is simply the q-period log-return at time t.
Therefore, if RW3 holds for log-prices, then there is a simple relationship
between the variance of multiperiod log-returns and the variance of single-
period log-returns.
This fact can be used to test RW3 by comparing an estimate of the
variance of
rt + rt−1 + · · · + rt−q+1 , t = q, . . . , T
to an estimate of the variance of r1 , r2 , . . . , rT ; if RW3 holds, the ratio of these
estimates should be roughly q.
For a given value of q, let
$$S_q^2 = \frac{\sum_{t=q}^{T}(r_t + r_{t-1} + \cdots + r_{t-q+1} - q\bar{r})^2}{T - q},$$
which is essentially the sample variance of rt + rt−1 + · · · + rt−q+1 , t = q,
q + 1, . . . , T , with the divisor equal to the sample size minus one, except that
instead of subtracting the sample mean of these values we subtract qr̄, where
$$\bar{r} = \frac{1}{T}\sum_{t=1}^{T} r_t.$$
$$V_q = \frac{T}{T - q + 1}\,\frac{1}{q}\,\frac{S_q^2}{S^2}.$$
If RW3 holds, we expect that
$$\frac{1}{q}\,\frac{S_q^2}{S^2} \doteq 1;$$
the factor T /(T − q + 1) is an adjustment term designed to improve the accu-
racy of the normal approximation to the distribution of Vq in small samples.
Note that, like the Box–Ljung test, the variance-ratio test is a test of the
correlation structure of the log-returns.
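The statistic Vq can be computed for a general q along the following lines. This is a minimal sketch: the function name variance_ratio is hypothetical, the data are simulated, and S² is assumed to be the usual sample variance of the single-period log-returns (divisor T − 1).

```r
# Variance-ratio statistic V_q for a vector of single-period log-returns r.
# Assumption: S^2 is the usual sample variance of r (divisor T - 1).
variance_ratio <- function(r, q) {
  n <- length(r)                     # T in the notation of the text
  stopifnot(q >= 1, q < n)
  x <- r - mean(r)                   # mean-corrected log-returns
  cs <- cumsum(x)
  # q-period mean-corrected sums for t = q, ..., T via differences of cumsums
  xq <- cs[q:n] - c(0, cs[seq_len(n - q)])
  Sq2 <- sum(xq^2) / (n - q)
  (n / (n - q + 1)) * (1 / q) * Sq2 / var(x)
}

# Hypothetical data standing in for 60 monthly log-returns.
set.seed(7)
r_sim <- rnorm(60, mean = 0.005, sd = 0.05)
V3 <- variance_ratio(r_sim, 3)

# Direct computation for q = 3, written out term by term.
x <- r_sim - mean(r_sim)
x3 <- x[3:60] + x[2:59] + x[1:58]
V3_direct <- (60 / 58) * (1 / 3) * (sum(x3^2) / 57) / var(x)
c(V3 = V3, V3_direct = V3_direct)
```

Using cumulative sums avoids writing out the q overlapping terms explicitly, which becomes convenient when several values of q are to be compared.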
and the null hypothesis is rejected for large values of |V̄q |. The p-value of the
test is given by
P(|Z| > |V̄q,0 |)
where Z has a standard normal distribution and V̄q,0 is the observed value of
V̄q ; hence,
$$P(|Z| > |\bar{V}_{q,0}|) = 2\{1 - \Phi(|\bar{V}_{q,0}|)\}$$
where Φ denotes the standard normal distribution function.
Example 3.4 Consider the statistic V3 applied to the log-returns on
Wal-Mart stock. This statistic may be calculated using the following com-
mands:
> x<-wmt.m.logret - mean(wmt.m.logret)
> x3<-x[3:60] + x[2:59] + x[1:58]
> (60/58)*(1/3)*(sum(x3^2)/57)/var(x)
[1] 0.90135
Here, x is the vector of mean-corrected log-returns for Wal-Mart stock and x3
is the vector of three-month mean-corrected log-returns,
Runs Test
Not all types of dependence are reflected in correlation. Another approach
to detecting relationships in a series of returns is to look at patterns of
above-average and below-average returns.
More formally, let r1 , r2 , . . . , rT denote a sequence of log-returns and
let med(r1 , r2 , . . . , rT ) denote the sample median of r1 , r2 , . . . , rT . For t =
1, 2, . . . , T , define
$$G_t = \begin{cases} 1 & \text{if } r_t > \mathrm{med}(r_1, \ldots, r_T) \\ 0 & \text{if } r_t \le \mathrm{med}(r_1, \ldots, r_T) \end{cases}$$
Runs Test
data: wmt.m.logret
statistic = 1.0417, runs = 35, n1 = 30, n2 = 30, n = 60, p-value =
0.2976
alternative hypothesis: nonrandomness
Therefore, the p-value of the test is 0.2976 so that there is no evidence to
reject the null hypothesis that RW1 holds for the log-returns.
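The statistic and p-value in this output can be reproduced from the reported counts. The sketch below assumes the standard normal approximation for the number of runs (the Wald–Wolfowitz form); that the packaged test uses exactly this approximation is an assumption, and the function names runs_z and runs_test_sketch are hypothetical.

```r
# Normal-approximation z statistic for the number of runs, given the
# counts n1 (above the median) and n2 (at or below the median).
runs_z <- function(runs, n1, n2) {
  n <- n1 + n2
  mu <- 2 * n1 * n2 / n + 1
  v <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1))
  (runs - mu) / sqrt(v)
}

# Runs test applied to a numeric vector r, using G_t = 1 if r_t > median.
runs_test_sketch <- function(r) {
  g <- r > median(r)
  runs <- length(rle(g)$lengths)     # number of maximal constant stretches
  z <- runs_z(runs, sum(g), sum(!g))
  list(statistic = z, runs = runs, p.value = 2 * pnorm(-abs(z)))
}

# Plugging in the counts reported in the output above.
z_wmt <- runs_z(runs = 35, n1 = 30, n2 = 30)
p_wmt <- 2 * pnorm(-abs(z_wmt))
round(c(statistic = z_wmt, p.value = p_wmt), 4)
```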
$$\sum_{t=1}^{k}(r_t - \bar{r}), \qquad k = 1, 2, \ldots, T.$$
Large values of H are evidence against the null hypothesis that RW1 holds
for the log-prices.
A large value of H indicates that there are times t0 , t1 such that
$$\sum_{t=1}^{t_1}(r_t - \bar{r})$$
is a large positive value and
$$\sum_{t=1}^{t_0}(r_t - \bar{r})$$
is a large negative value; note that the values rt − r̄, t = 1, 2, . . . , T must sum
to 0. That is, there is a time period over which the log-returns differ greatly
from their sample mean.
To determine if the observed value of H is statistically significant, we
compare it to the critical values in Table 3.1.
TABLE 3.1
Critical Values for the Rescaled Range Test
Significance Level    Critical Value
0.10                  1.620
0.05                  1.747
0.025                 1.862
0.005                 2.098
Let
$$H_1 = \max_{1 \le k \le T}\sum_{j=1}^{k}(r_j - \bar{r})$$
and
$$H_2 = \min_{1 \le l \le T}\sum_{j=1}^{l}(r_j - \bar{r})$$
so that $H = (H_1 - H_2)/(S\sqrt{T})$.
For the Wal-Mart monthly log-returns, H1 and H2 may be calculated by
> H1<-max(cumsum(wmt.m.logret-mean(wmt.m.logret)))
> H1
[1] 0.085685
> H2<-min(cumsum(wmt.m.logret-mean(wmt.m.logret)))
> H2
[1] -0.19496
and H is given by
To compute the p-value for a test of RW2, we compare the observed value of
H to critical values in Table 3.1. It follows that the p-value is greater than 0.10.
Therefore, according to the rescaled range test, there is no evidence to
reject the hypothesis that the RW2 model holds for Wal-Mart stock.
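The full calculation of H and its comparison with Table 3.1 can be sketched as follows. The function name rescaled_range is hypothetical and the data are simulated. The definition of S is not reproduced here, so the sketch assumes it is the sample standard deviation returned by sd().

```r
# Rescaled range statistic H = (H1 - H2) / (S * sqrt(T)).
# Assumption: S is the sample standard deviation of the log-returns,
# computed with sd().
rescaled_range <- function(r) {
  n <- length(r)
  partial_sums <- cumsum(r - mean(r))
  H1 <- max(partial_sums)
  H2 <- min(partial_sums)
  (H1 - H2) / (sd(r) * sqrt(n))
}

# Critical values from Table 3.1, indexed by significance level.
crit <- c("0.10" = 1.620, "0.05" = 1.747, "0.025" = 1.862, "0.005" = 2.098)

# Hypothetical data standing in for 60 monthly log-returns.
set.seed(11)
r_sim <- rnorm(60, mean = 0.005, sd = 0.05)
H <- rescaled_range(r_sim)
H > crit   # TRUE where H exceeds the critical value at that level
```

If H is below the smallest critical value in the table, the p-value exceeds 0.10.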
[Plot omitted: four histogram panels with Frequency on the vertical axis; three have p-value on the horizontal axis and one has Test statistic.]
FIGURE 3.1
Results of tests of the random walk model for stocks in the S&P 100.
3.8 Exercises
1. Let X and Y denote random variables, each with mean 0, such that
E(Y |X) = X + a
Ŷ = E(Y |X).
Show that
Cov(h(X), Ŷ ) = Cov(h(X), Y )
for any function h.
Does it follow that the correlation of h(X) and Ŷ is equal to the
correlation of h(X) and Y ? Why or why not?
3. Let Y and X denote random variables such that
Y = β0 + β1 X + ε
Pt = E(V |Ωt ) + Zt
where Zt is a random variable such that E(Zt |Ωt ) = 0. That is, the
price Pt is approximately, but not exactly, equal to E(V |Ωt ).
Does the martingale property
E(Pt+h |Ωt ) = Pt , h = 1, 2, . . .
Xt = Yt+h − Yh , t = 0, 1, . . . .
14. For the log-prices of Best Buy stock calculated in Exercise 11,
calculate the p-value of the runs test; see Example 3.5.
Based on your results, what do you conclude regarding the
random walk hypothesis as applied to the log-prices of Best Buy
stock?
15. For the log-returns on Best Buy stock calculated in Exercise 11,
calculate the rescaled-range statistic H; see Example 3.6.
Based on this result, what do you conclude regarding the random
walk hypothesis as applied to the log-prices of Best Buy stock?
4.1 Introduction
Suppose we have a given amount of capital to invest and a number of possible
assets in which to invest. How should we place our investment in the various
assets? This is known as the portfolio selection problem and the mathematical
and statistical methods developed to solve it are known as portfolio theory.
If it were possible to accurately forecast the future returns of the assets we
could simply invest in the asset or assets with the largest predicted returns
over the future time period of interest. However, empirical evidence, such
as that presented in the previous chapter, suggests that such forecasting is
difficult at best.
Hence, in portfolio theory, we attempt to choose the combination of assets
that yields a portfolio return with desirable statistical properties; specifically,
we seek a large expected return, while minimizing the “risk” of the portfolio,
defined here as the standard deviation of the return. Thus, in an ideal case, the
return on the portfolio will have a large expected value and a small standard
deviation so that the portfolio realizes a large return with high probability.
However, complicating the situation is the fact that the two goals are typically
in conflict: Riskier assets generally have a higher expected return, as a reward
for assuming the risk.
let Pj,0 and Pj,1 denote the prices of asset j in periods 0 and 1, respectively,
so that
$$R_j = \frac{P_{j,1}}{P_{j,0}} - 1.$$
Suppose we wish to invest capital C in the N assets, according to the weights
w1 , w2 , . . . , wN . Then, in period 0, we buy
$$\frac{w_j C}{P_{j,0}}$$
shares of asset j, j = 1, 2, . . . , N .
In period 1, the shares in asset j are worth
$$\frac{w_j C}{P_{j,0}} P_{j,1}.$$
Thus, the total worth of the portfolio in period 1 is
$$\sum_{j=1}^{N}\frac{w_j C}{P_{j,0}} P_{j,1}$$
$$\sum_{j=1}^{N} w_j (R_j + 1) - 1 = \sum_{j=1}^{N} w_j R_j.$$
That is, the return on the portfolio is a linear function of the individual asset
returns, with the coefficients given by the portfolio weights:
$$R_p = w_1 R_1 + w_2 R_2 + \cdots + w_N R_N = \sum_{j=1}^{N} w_j R_j.$$
Then, using results on the mean and variance of a sum of random variables,
and
The portfolio problem for the case of two assets is concerned with choosing w
so that μp (w) is large and σp (w) is small.
For instance, the portfolio placing half its weight on asset 1 and half its weight
on asset 2 has expected return
Diversification
Note that, although the mean return on the portfolio depends only on the
mean returns on the individual assets, the risk of the portfolio depends on the
relationship between the asset returns, as measured by their correlation. This
is a fundamental idea in portfolio theory; in particular, it plays an important
role in the concept of diversification, which refers to reducing the risk of a
portfolio by investing in many assets. We illustrate this concept by considering
two examples.
Example 4.2 Consider the case in which we have two assets with returns
R1 , R2 with the same mean and variance: E(R1 ) = E(R2 ) = μ and Var(R1 ) =
Var(R2 ) = σ2 . Furthermore, assume that R1 and R2 are uncorrelated.
Investing entirely in either asset 1 or asset 2 yields the same expected
return and the same risk. Now consider a portfolio consisting of both asset
1 and asset 2. Let w be the proportion of our investment in asset 1 so that
the portfolio weights are w1 = w and w2 = 1 − w. Then the expected return
on the portfolio is
Therefore, the expected return on the portfolio does not depend on the value
of w; in particular, investing in the portfolio yields the same expected return
as investing in either asset 1 or asset 2. However, including both assets in the
portfolio reduces risk.
Because R1 and R2 are uncorrelated,
For w = 0 or 1, σp = σ. However, for any other value of w, 0 < w < 1, σp < σ.
Choosing w = 1/2 minimizes σp , yielding a value of $\sigma/\sqrt{2}$.
The scenario in this example is a special one since the assets have the
same mean and variance, and they are uncorrelated; the following example
generalizes this by assuming the asset returns are correlated.
Example 4.3 Consider the framework of Example 4.2 but now assume that
the correlation of R1 and R2 is ρ12 , −1 < ρ12 < 1. Consider the portfolio
placing equal weight on the two assets so that w1 = w2 = 1/2. Then Rp =
(1/2)R1 + (1/2)R2 ; hence, μp = E(Rp ) = μ as in Example 4.2. The variance
of Rp is given by
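$$\begin{aligned}
\sigma_p^2 = \mathrm{Var}(R_p) &= \frac{1}{4}\mathrm{Var}(R_1) + \frac{1}{4}\mathrm{Var}(R_2) + 2\cdot\frac{1}{2}\cdot\frac{1}{2}\,\mathrm{Cov}(R_1, R_2) \\
&= \frac{1}{4}\sigma^2 + \frac{1}{4}\sigma^2 + \frac{1}{2}\rho_{12}\sigma^2 \\
&= \frac{1}{2}(1 + \rho_{12})\sigma^2.
\end{aligned}$$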
Therefore, for any value of ρ12 , −1 < ρ12 < 1, σp < σ.
When ρ12 is close to 1, then σp is close to σ, the standard deviation of
R1 and R2 ; therefore, there is little benefit to including both assets in the
portfolio. This is not surprising because, when $\rho_{12} \doteq 1$, R1 and R2 have an
approximately linear relationship; hence, there is little diversity in including both assets in the
portfolio. On the other hand, the standard deviation of the portfolio return is
the smallest when the asset returns are negatively correlated so that one asset
return increases when the other decreases and vice versa.
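The effect of correlation on portfolio risk in Examples 4.2 and 4.3 can be tabulated numerically. In the sketch below, the common standard deviation sigma and the grid of correlations are hypothetical values chosen for illustration; the formula for the portfolio variance is the standard one for a sum of two correlated random variables.

```r
sigma <- 0.2   # hypothetical common standard deviation of R1 and R2

# Standard deviation of Rp = w R1 + (1 - w) R2 when both assets have
# standard deviation sigma and correlation rho12.
sigma_p <- function(w, rho12) {
  sigma * sqrt(w^2 + (1 - w)^2 + 2 * w * (1 - w) * rho12)
}

# Equal weights: compare with the closed form sqrt((1 + rho12) / 2) * sigma.
rho_grid <- c(-0.9, -0.5, 0, 0.5, 0.9)
equal_wt <- sigma_p(0.5, rho_grid)
closed   <- sqrt((1 + rho_grid) / 2) * sigma
round(cbind(rho12 = rho_grid, equal_wt, closed,
            ratio_to_sigma = equal_wt / sigma), 4)

# Uncorrelated case (Example 4.2): minimize over w in [0, 1].
opt <- optimize(sigma_p, interval = c(0, 1), rho12 = 0)
c(w_min = opt$minimum, sigma_min = opt$objective, target = sigma / sqrt(2))
```

The ratio of the portfolio standard deviation to sigma approaches 1 as the correlation approaches 1 and shrinks as the correlation becomes more negative.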
Example 4.4 Let R1 , R2 denote returns on two assets and suppose that
E(R1 ) = E(R2 ) = μ, Var(R1 ) = 0.3, Var(R2 ) = 0.1, and suppose that the correlation
of R1 , R2 is $\sqrt{3}/2$. If we construct a portfolio by investing a proportion
w of our investment in asset 1 and 1 − w of our investment in asset 2, the return
on our portfolio is Rp = wR1 + (1 − w)R2 . It is straightforward to show that
Rp has mean μp (w) = μ and variance σ2p (w), given by
$$\begin{aligned}
\sigma_p^2(w) &= (0.3)w^2 + (0.1)(1 - w)^2 + 2\,\frac{\sqrt{3}}{2}\sqrt{(0.3)(0.1)}\,w(1 - w) \\
&= (0.1)(w^2 + w + 1).
\end{aligned}$$
Because all portfolios have the same expected return, we should choose w
to minimize risk. Note that σ2p (w) is a quadratic function of w with a positive
coefficient of w2 ; hence, it can be minimized by taking the derivative, setting
it equal to 0, and solving for w, yielding w = −1/2. That is, the minimum
variance portfolio places weight −1/2 on asset 1 and weight 3/2 on asset 2.
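The minimization in Example 4.4 can be checked numerically. The sketch below uses the variances and correlation given in the example; the closed-form expression for the minimizing weight follows from setting the derivative of the quadratic to zero, and the variable names are illustrative.

```r
# Inputs from Example 4.4.
var1 <- 0.3
var2 <- 0.1
rho  <- sqrt(3) / 2
cov12 <- rho * sqrt(var1 * var2)

# Portfolio variance as a function of the weight w on asset 1.
var_p <- function(w) var1 * w^2 + var2 * (1 - w)^2 + 2 * cov12 * w * (1 - w)

# Setting the derivative of the quadratic to 0 gives the minimizer.
w_star <- (var2 - cov12) / (var1 + var2 - 2 * cov12)

c(w_star = w_star, min_variance = var_p(w_star))
```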
to buy 5 shares of asset 1. Therefore, our net worth at the end of period 1 is
$210 − $60 = $150. This corresponds to a return of
$$\frac{150 - 100}{100} = 0.5.$$
In terms of w1 , w2 , R1 , and R2 the return on our portfolio is
w1 R1 + w2 R2 = −0.5R1 + 1.5R2 = −0.5(0.2) + 1.5(0.4) = 0.5
as determined previously.
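The agreement between the two ways of computing the portfolio return can be verified as follows. The capital, weights, and returns are those of the example; the individual prices in P0 and P1 are illustrative values chosen to be consistent with those returns and with the purchase of 5 shares of asset 1, and are otherwise hypothetical.

```r
C  <- 100                # capital, as in the example above
w  <- c(-0.5, 1.5)       # portfolio weights; asset 1 is sold short
P0 <- c(10, 30)          # period-0 prices (illustrative values)
P1 <- c(12, 42)          # period-1 prices, giving returns 0.2 and 0.4

R <- P1 / P0 - 1                   # individual asset returns R_j
shares <- w * C / P0               # negative entry = shares sold short
worth1 <- sum(shares * P1)         # net worth at the end of period 1

Rp_from_worth   <- worth1 / C - 1  # return computed from net worth
Rp_from_weights <- sum(w * R)      # w_1 R_1 + w_2 R_2

c(worth1 = worth1, Rp_from_worth = Rp_from_worth,
  Rp_from_weights = Rp_from_weights)
```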
A short sale of an asset may be viewed as a loan, with the interest rate of
the loan equal to the return on that asset. Therefore, if the price of the asset
decreases over the loan period, so that the return is negative, then the interest
rate on the loan is effectively negative and the investor makes money on the
loan as well. Because of this, short sales are often described as an appropriate
investment for an asset expected to have a negative return. Although this may
be true, a short sale can be useful for controlling risk even in cases in which
the asset price is expected to increase. This is illustrated in Example 4.4.
Although portfolios with negative weights are convenient from a mathe-
matical point of view, and we will use them here, there are some important
practical considerations. Note that there is a fundamental difference between
a short sale and a more typical “long” position. If we purchase $100 of an
asset, the most we can lose is $100, which occurs if the share price decreases
to 0. However, if we borrow $100 worth of an asset, our potential losses
are unbounded because the price of the asset can, in principle, increase
indefinitely. Therefore, short sales are tightly regulated.
For instance, when shares are borrowed, a certain amount of collateral
must be placed in a “margin account,” effectively increasing the cost of the
loan. If the price of the stock increases, then more collateral may be needed
since the cost of buying back the stock is greater; in this case, the investor
receives a “margin call” demanding that additional funds be added to the
margin account. Also, short positions may be subject to a “forced buy-in,”
in which the lender of the asset requires that the loan be repaid immediately,
even if such a repayment requires the investor who borrowed the shares to
lose money.
Because of such considerations, analysts sometimes only consider portfolios
in which the weights are restricted to lie in the interval [0, 1], although such
constraints complicate the details of the analysis. Constraints of this type will
be considered in Section 5.8.
and
$$\sigma_p^2(w) = w^2\sigma_1^2 + (1 - w)^2\sigma_2^2 + 2w(1 - w)\rho_{12}\sigma_1\sigma_2.$$
In this section, we consider the problem of choosing the value of w.
As w varies, μp (w) and σp (w) vary; we may view these values as points
(σp (w), μp (w)) in the “risk-return space.” A plot of (σp (w), μp (w)) as w
varies is a useful way to understand the relationship between the expected
return and risk of a portfolio.
Example 4.6 Suppose that μ1 = 0.2, σ1 = 0.1, μ2 = 0.1, σ2 = 0.05, and
ρ12 = 0.25. Then μp (w) = w(0.2) + (1 − w)(0.1) = 0.1 + 0.1w and
The curve in Figure 4.1 represents the set of all (σp (w), μp (w)) pairs that
are available to the investor; it is known as the opportunity set. This term
will also be used to describe the corresponding portfolios; for instance, in the
previous example, a portfolio in the opportunity set has a value of (σp , μp ) on
the curve in Figure 4.1.
FIGURE 4.1
Expected return and risk for different portfolios in Example 4.6. [Plot of μp (vertical axis) against σp (horizontal axis).]
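The opportunity set can be traced numerically by evaluating μp (w) and σp (w) over a grid of weights. The following is a minimal sketch using the parameters of Example 4.6; the function names, the data frame `opp_set`, and the grid range from −1 to 2 are illustrative assumptions, not part of the example.

```r
# Parameters of Example 4.6
mu1 <- 0.2; sigma1 <- 0.1
mu2 <- 0.1; sigma2 <- 0.05
rho12 <- 0.25

# Expected return and risk of the portfolio placing weight w on asset 1
mu_p <- function(w) w * mu1 + (1 - w) * mu2
sigma_p <- function(w) {
  sqrt(w^2 * sigma1^2 + (1 - w)^2 * sigma2^2 +
         2 * w * (1 - w) * rho12 * sigma1 * sigma2)
}

# Grid of weights; the range is an arbitrary illustrative choice
w_grid <- seq(-1, 2, by = 0.125)
opp_set <- data.frame(w = w_grid, risk = sigma_p(w_grid), mean = mu_p(w_grid))

# Uncomment to draw the curve of (risk, mean) pairs
# plot(opp_set$risk, opp_set$mean, type = "l", xlab = "sigma_p", ylab = "mu_p")
```

Each row of `opp_set` is one point of the opportunity set; weights outside [0, 1] correspond to short positions in one of the assets.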
Note that, unless μ1 = μ2 , for a given value m there is exactly one value of
w such that μp (w) = m. On the other hand, for a given value s > 0, there
may be zero, one, or two values of w such that σp (w) = s, depending on the
number of solutions to the quadratic equation $\sigma_p^2(w) - s^2 = 0$.
Example 4.7 Consider the assets described in Example 4.6. Note that, for
a given value of μp , say m, the portfolio with that expected return may be
found by solving
μp (w) = 0.1 + 0.1w = m
for w, yielding w = 10m − 1. On the other hand, for a given value of σp , say
s, there may be zero, one, or two portfolios with that standard deviation,
depending on the solutions to the quadratic equation
$$0.01w^2 - 0.0025w + 0.0025 = s^2.$$
For instance, there are no portfolios with σp (w) = 0.0375 because there are
no (real) solutions to the equation
$$0.01w^2 - 0.0025w + 0.0025 = (0.0375)^2;$$
there are two portfolios with σp (w) = 0.0625, corresponding to the two
solutions to the equation
$$0.01w^2 - 0.0025w + 0.0025 = (0.0625)^2.$$
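The number of portfolios with a given risk s can be checked directly from the discriminant of the quadratic. The sketch below assumes the variance function of Example 4.6, 0.01w² − 0.0025w + 0.0025; the helper name `weights_for_risk` is hypothetical.

```r
# Real weights w with sigma_p(w) = s for the assets of Example 4.6, that is,
# the real roots of 0.01 w^2 - 0.0025 w + (0.0025 - s^2) = 0
weights_for_risk <- function(s) {
  a <- 0.01
  b <- -0.0025
  c0 <- 0.0025 - s^2
  disc <- b^2 - 4 * a * c0
  if (disc < 0) return(numeric(0))   # no portfolio has this risk
  sort(unique((-b + c(-1, 1) * sqrt(disc)) / (2 * a)))
}

weights_for_risk(0.0375)   # empty: no real solutions
weights_for_risk(0.0625)   # two weights
weights_for_risk(0.05)     # two weights, approximately 0 and 0.25
```

Because of floating-point rounding, the borderline case of a single root (s equal to the minimum risk) is not reliably detected by this simple discriminant test.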
Efficient Portfolios
For the case in which two portfolios have the same return standard deviation,
only the one with the larger return mean is of interest. For example, in Fig-
ure 4.1, each portfolio with a (σp (w), μp (w)) pair on the lower half of the
curve is dominated by a portfolio with a (σp (w), μp (w)) pair on the upper
half of the curve. Thus, the portfolios corresponding to the upper half of the
curve are known as efficient portfolios.
FIGURE 4.2
Three possibilities for the solutions to $\sigma_p^2(w) = s^2$ in Example 4.7. [Plot of μp against σp.]
FIGURE 4.3
Efficient frontier in Example 4.7. [Plot of μp against σp.]
That is, for an efficient portfolio, it is not possible to have a larger expected
return without increasing the risk or, conversely, it is not possible to have lower
risk without decreasing the expected return. This upper half of the opportunity
set is known as the efficient frontier ; the efficient frontier for the assets
described in Example 4.7 is given in Figure 4.3. The term “efficient frontier”
will also be used to describe the portfolios with a (σp (w), μp (w)) pair on
the efficient frontier.
Each portfolio on the efficient frontier has the largest possible expected
return for a given level of risk and the lowest possible risk for a given expected
return. Therefore, there is no objective way to choose from among the effi-
cient portfolios; such a choice depends on an investor’s view of the relative
importance of a portfolio’s expected return and risk.
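$$\frac{d\sigma_p^2(w)}{dw} = 2w\sigma_1^2 - 2(1-w)\sigma_2^2 + 2(1-2w)\rho_{12}\sigma_1\sigma_2$$
and
$$\frac{d^2\sigma_p^2(w)}{dw^2} = 2\sigma_1^2 + 2\sigma_2^2 - 4\rho_{12}\sigma_1\sigma_2.$$
Clearly,
$$\frac{d^2\sigma_p^2(w)}{dw^2} \ge 2\sigma_1^2 + 2\sigma_2^2 - 4\sigma_1\sigma_2 = 2(\sigma_1-\sigma_2)^2 \ge 0$$
and, if ρ12 < 1 (with σ1 , σ2 > 0), the first inequality is strict, so that
$$\frac{d^2\sigma_p^2(w)}{dw^2} > 0.$$
Hence, $\sigma_p^2(w)$ can be minimized by solving $d\sigma_p^2(w)/dw = 0$ for w, which gives
$$w = w_{mv} \equiv \frac{\sigma_2^2 - \rho_{12}\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}$$
and
$$1 - w_{mv} = \frac{\sigma_1^2 - \rho_{12}\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}.$$
Note that $\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 = \mathrm{Var}(R_1 - R_2)$. Using the fact that
$$\mathrm{Var}(R_1 - R_2) = \sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 = (\sigma_1-\sigma_2)^2 + 2(1-\rho_{12})\sigma_1\sigma_2,$$
it follows that if ρ12 < 1, then Var(R1 − R2 ) > 0. Therefore, provided that
ρ12 < 1, the denominator in the expression for wmv is nonzero.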
Example 4.8 Consider Example 4.6 in which σ1 = 0.1, σ2 = 0.05, and ρ12 =
0.25. Then
$$w_{mv} = \frac{(0.05)^2 - (0.25)(0.1)(0.05)}{(0.1)^2 + (0.05)^2 - 2(0.25)(0.1)(0.05)} = \frac{1}{8}.$$
Recall that in Example 4.6 we saw that the quadratic equation
$\sigma_p^2(w) - 15/6400 = 0$ has a single root, at w = 1/8. Such a single root always occurs
at the point of minimum risk; see Figure 4.2.
Therefore, risk is minimized by placing 1/8th of our investment in asset 1.
Here, μ1 = 0.2 and μ2 = 0.1, so that the minimum-variance portfolio has
expected return
(1/8)μ1 + (7/8)μ2 = 0.1125
and, using the result in Example 4.6, the standard deviation of the return is
$$\left.(0.01w^2 - 0.0025w + 0.0025)^{1/2}\right|_{w=1/8} \doteq 0.0484.$$
This gives the portfolio with minimum risk. However, it is the optimal
choice only if our goal is to minimize risk. For example, suppose that we are
willing to increase the standard deviation of the portfolio to 0.05; the
solutions to $0.01w^2 - 0.0025w + 0.0025 = (0.05)^2$ are w = 0.25 and w = 0. Only
the first of these corresponds to a portfolio on the efficient frontier (why?),
and its expected return is 0.1 + 0.1(0.25) = 0.125. Hence, a 3.3% increase in
risk (i.e., 0.0484 to 0.05) yields an 11% increase in expected return (i.e., 0.1125
to 0.125).
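The calculations in Example 4.8 can be reproduced in R. The following is a minimal sketch: the function name `w_minvar` is hypothetical, and the final step, a numerical minimization with `optimize` over the arbitrary interval [−1, 2], is only a cross-check on the closed-form weight.

```r
# Weight on asset 1 in the minimum-variance portfolio of two assets
w_minvar <- function(sigma1, sigma2, rho12) {
  (sigma2^2 - rho12 * sigma1 * sigma2) /
    (sigma1^2 + sigma2^2 - 2 * rho12 * sigma1 * sigma2)
}

# Parameters of Examples 4.6 and 4.8
sigma1 <- 0.1; sigma2 <- 0.05; rho12 <- 0.25
mu1 <- 0.2; mu2 <- 0.1

# Portfolio variance as a function of the weight on asset 1
var_p <- function(w) {
  w^2 * sigma1^2 + (1 - w)^2 * sigma2^2 +
    2 * w * (1 - w) * rho12 * sigma1 * sigma2
}

w_mv <- w_minvar(sigma1, sigma2, rho12)
mu_mv <- w_mv * mu1 + (1 - w_mv) * mu2
sd_mv <- sqrt(var_p(w_mv))
c(weight = w_mv, mean = mu_mv, sd = sd_mv)

# Cross-check: minimize the variance numerically
w_num <- optimize(var_p, interval = c(-1, 2))$minimum
```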
and
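provided that ρ12 < 1. Hence, we can maximize fλ (w) by solving $f_\lambda'(w) = 0$
for w, yielding the solution
$$w_\lambda = \frac{\sigma_2^2 - \rho_{12}\sigma_1\sigma_2 + (\mu_1-\mu_2)/\lambda}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2};$$
it follows that
$$1 - w_\lambda = \frac{\sigma_1^2 - \rho_{12}\sigma_1\sigma_2 - (\mu_1-\mu_2)/\lambda}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}.$$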
Example 4.9 Consider two assets such that μ1 = 0.04, μ2 = 0.02, σ1 = 0.2,
σ2 = 0.1, and ρ12 = 0.25. Then, using (4.1),
$$w = w_\lambda \equiv \frac{1}{8} + \frac{1}{2\lambda}, \quad \lambda > 0.$$
Figure 4.4 contains plots of μp (wλ ) and σp (wλ ) as λ varies. Note that, for large
values of λ, there is a large penalty on the variance of the return; hence, the
optimal portfolio has small risk. To achieve this, the portfolio must also have
small expected return. Conversely, for a small λ, the variance of the return is
largely irrelevant; hence, the optimal portfolio has large risk. As a reward for
the large risk incurred, the portfolio also has a large expected return.
When λ is small, the weight on asset 1 is large; hence, the weight on asset
2 is negative. That is, in order to achieve a large expected return, we must
borrow asset 2 (which has a low expected return) in order to buy asset 1
(which has a large expected return). For instance, if λ = 0.25, the optimal
portfolio places weight 2.125 on asset 1 and weight −1.125 on asset 2.
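The dependence of the optimal portfolio on λ can be tabulated in R. This sketch obtains wλ as one minus the expression for 1 − wλ given above; the function name `w_lambda` and the particular values of λ are illustrative assumptions.

```r
# Weight on asset 1 in the risk-averse portfolio with parameter lambda,
# computed as 1 minus the expression for 1 - w_lambda
w_lambda <- function(lambda, mu1, mu2, sigma1, sigma2, rho12) {
  denom <- sigma1^2 + sigma2^2 - 2 * rho12 * sigma1 * sigma2
  1 - (sigma1^2 - rho12 * sigma1 * sigma2 - (mu1 - mu2) / lambda) / denom
}

# Parameters of Example 4.9
mu1 <- 0.04; mu2 <- 0.02; sigma1 <- 0.2; sigma2 <- 0.1; rho12 <- 0.25

lambda <- c(0.25, 1, 2, 5)   # illustrative values
w <- w_lambda(lambda, mu1, mu2, sigma1, sigma2, rho12)
mean_ret <- w * mu1 + (1 - w) * mu2
risk <- sqrt(w^2 * sigma1^2 + (1 - w)^2 * sigma2^2 +
               2 * w * (1 - w) * rho12 * sigma1 * sigma2)
data.frame(lambda, w, mean_ret, risk)
```

For these values of λ, both the expected return and the risk decrease as λ increases, in line with the discussion of Figure 4.4.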
FIGURE 4.4
Properties of the optimal portfolio as a function of λ in Example 4.9. [Two panels: expected return against λ, and risk against λ.]
risk-free asset. Then E(Rf ) = μf , the risk-free rate of return, and Var(Rf ) = 0,
that is, Pr(Rf = μf ) = 1.
Investment in a risk-free asset might contribute only a small return to
the portfolio but it reduces the risk of our investment, giving the investor a
convenient way to control risk; note that, since Rf has zero variance, it also
has zero covariance with any other asset return. Although we might consider
a simple “savings account” as a risk-free asset, the usual risk-free asset used
in portfolio analysis is a three-month U.S. Treasury Bill.
For instance, consider a portfolio consisting of a risk-free asset, with
return Rf , and a standard, “risky” asset, with return R. Let μ = E(R) and
σ2 = Var(R); assume that μ > μf . Suppose we invest a proportion w of our
investment in the risky asset, with the remainder invested in the risk-free
asset; assume that w > 0 so that we do not borrow the risky asset to buy the
risk-free asset. Then the return on our portfolio is wR + (1 − w)Rf .
This portfolio has expected return
μ0 (w) = wμ + (1 − w)μf (4.2)
and return standard deviation
σ0 (w) = wσ. (4.3)
Note that, although Rf and μf are equal with probability one, we will use
Rf when referring to returns, for example, the return on the portfolio is
wR + (1 − w)Rf , and use μf when referring to properties of the distribution of
returns, for example, the expected return on the portfolio is wμ + (1 − w)μf .
Hence, the risk of the portfolio can be made as small as we like, by choosing
a value of w close to 0. Of course, reducing the risk of the portfolio also reduces
the expected return. Note that solving (4.3) for w yields
$$w = \frac{\sigma_0(w)}{\sigma}$$
and using this result in (4.2) yields the following relationship between the
expected return on such a portfolio and its risk:
$$\mu_0(w) = \mu_f + \frac{\mu - \mu_f}{\sigma}\,\sigma_0(w).$$
Example 4.10 Consider a risky asset with expected return μ = 0.02 and
return standard deviation σ = 0.1; suppose that the risk-free return is μf =
0.001. Then
$$\frac{\mu - \mu_f}{\sigma} = \frac{0.02 - 0.001}{0.1} = 0.19$$
so that the mean return μ0 (w) and return standard deviation σ0 (w) of the
portfolio placing weight w on the risky asset and weight 1 − w on the risk-free
asset are related by
μ0 (w) = 0.001 + 0.19σ0(w).
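The linear relationship in Example 4.10 can be verified numerically from (4.2) and (4.3). In this sketch the set of weights is an arbitrary illustrative choice.

```r
# Parameters of Example 4.10
mu <- 0.02; sigma <- 0.1; mu_f <- 0.001

# Illustrative weights on the risky asset (w > 1 means borrowing at the
# risk-free rate to invest in the risky asset)
w <- c(0, 0.25, 0.5, 1, 1.5)

mu0 <- w * mu + (1 - w) * mu_f   # equation (4.2)
sigma0 <- w * sigma              # equation (4.3)

# Slope of the line relating expected return to risk
slope <- (mu - mu_f) / sigma
on_line <- all.equal(mu0, mu_f + slope * sigma0)
data.frame(w, sigma0, mu0)
```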
Because of the simple relationship between the expected return and return
standard deviation for portfolios formed from a risky asset and a risk-free asset,
the set of risk, expected-return pairs available to the investor takes a particu-
larly simple form.
Example 4.11 Consider the asset and risk-free asset described in Example
4.10. Let μ0 (w) and σ0 (w) denote the mean and standard deviation, respec-
tively, of the portfolio placing weight w on the risky asset. Figure 4.5 contains
a plot of the efficient frontier (the solid line) together with the opportunity
set (the solid line together with the dotted line). The minimum risk portfolio
in this context is the one with w = 0 corresponding to the vertex in the plot,
which occurs at (0, μf ); note that the portfolio with w = 0 is the one in which
the investor neither invests in nor borrows the risky asset.
The efficient frontier in this case is that part of the opportunity set cor-
responding to w ≥ 0, that is, the line segment with positive slope. The slope
of this line segment is (μ − μf )/σ, which may be viewed as a measure of the
mean excess return on the asset per unit of risk. Note that if we have two
possible risky assets, the one with the larger slope leads to portfolios with a
higher expected return for a given level of risk. This property plays an impor-
tant role in constructing a portfolio from two risky assets together with the
risk-free asset, and it will be discussed in detail in Section 4.6.
The risk-free return provides a convenient baseline for measuring the return
on an asset; for a given asset with return R, the quantity R − Rf is known as
the excess return of the asset. Therefore, for the portfolio discussed earlier,
the expected excess return on the portfolio is proportional to the portfolio
risk:
$$\mu_0(w) - \mu_f = \frac{\mu - \mu_f}{\sigma}\,\sigma_0(w).$$
FIGURE 4.5
Efficient frontier and opportunity set in Example 4.11. [Plot with μ0 on the vertical axis; μf is marked on that axis.]
Let Rp denote the return on the portfolio of risky assets; once the portfolio
of risky assets has been selected, we are effectively back to the case of one risky
asset together with the risk-free asset.
The return on the entire portfolio may be written
(1 − wf )Rp + wf Rf .
FIGURE 4.6
Efficient frontier for portfolios of the two risky assets in Example 4.13. [Plot of expected return against risk.]
FIGURE 4.7
Risk, mean–return pairs corresponding to a particular portfolio of risky assets
in Example 4.13. [Plot of expected return against risk; the point (0, μf) is marked.]
Let (σp , μp ) denote the risk, mean–return pair for a portfolio of risky
assets; for example, in Figure 4.7, (σp , μp ) is a point on the curve. Because all
half-lines starting at (0, μf ) and passing through (σp , μp ) have the same
starting point, the different possible half-lines may be described by their
slopes,
$$\frac{\mu_p - \mu_f}{\sigma_p}.$$
FIGURE 4.8
Comparison of two portfolios of risky assets in Example 4.13. [Plot of expected return against risk.]
Sharpe Ratio
Consider a portfolio of risky assets with expected return μp and risk σp . The
slope of the line connecting (0, μf ) and (σp , μp ) is known as the Sharpe ratio
of the portfolio; hence, the Sharpe ratio is given by
$$SR = \frac{\mu_p - \mu_f}{\sigma_p}.$$
The Sharpe ratio has a useful interpretation as the expected excess return on
the portfolio per unit of risk.
Note that the portfolio giving weight wf to the risk-free asset and weight
1 − wf to the portfolio of risky assets has expected return
(1 − wf )μp + wf μf
Tangency Portfolio
Thus, to construct the optimal portfolio of risky assets with returns R1 and
R2 , we find w̄1 , w̄2 , w̄1 + w̄2 = 1, so that the portfolio with return
w̄1 R1 + w̄2 R2
Write w̄1 = w and w̄2 = 1 − w. Our goal is to find the value of w that
maximizes
f (w) = (μp (w) − μf )/σp (w)
where
μp (w) = wμ1 + (1 − w)μ2
and
$$\sigma_p^2(w) = w^2\sigma_1^2 + (1-w)^2\sigma_2^2 + 2w(1-w)\rho_{12}\sigma_1\sigma_2.$$
We may maximize f (w) using standard results from calculus. Using the
rule for taking the derivative of a ratio, it follows that $f'(w) = 0$ when
$$\frac{\mu_p'(w)}{\sigma_p'(w)} = f(w), \qquad (4.5)$$
where primes denote derivatives with respect to w.
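The maximization of f (w) can also be carried out numerically. The following sketch uses the two risky assets of Example 4.6 together with a hypothetical risk-free rate of 0.05, which is an assumption made only for illustration; the search interval passed to `optimize` is likewise arbitrary. The last lines check that the ratio of derivatives in (4.5) agrees with the Sharpe ratio at the maximizer.

```r
# Risky assets of Example 4.6
mu1 <- 0.2; sigma1 <- 0.1; mu2 <- 0.1; sigma2 <- 0.05; rho12 <- 0.25
mu_f <- 0.05   # hypothetical risk-free rate, not taken from the text

mu_p <- function(w) w * mu1 + (1 - w) * mu2
sigma_p <- function(w) {
  sqrt(w^2 * sigma1^2 + (1 - w)^2 * sigma2^2 +
         2 * w * (1 - w) * rho12 * sigma1 * sigma2)
}
sharpe <- function(w) (mu_p(w) - mu_f) / sigma_p(w)

# Numerical maximization of the Sharpe ratio over an arbitrary interval
w_T <- optimize(sharpe, interval = c(-1, 2), maximum = TRUE)$maximum

# First-order condition (4.5): mu_p'(w) / sigma_p'(w) = f(w) at w = w_T
dmu <- mu1 - mu2
dsigma <- (2 * w_T * sigma1^2 - 2 * (1 - w_T) * sigma2^2 +
             2 * (1 - 2 * w_T) * rho12 * sigma1 * sigma2) / (2 * sigma_p(w_T))
c(w_T = w_T, sharpe = sharpe(w_T), foc_ratio = dmu / dsigma)
```

With these inputs the maximizer is approximately w = 0.5, up to the tolerance of `optimize`.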
FIGURE 4.9
Tangency portfolio in Example 4.14. [Plot of expected return against risk.]
Figure 4.9 shows the efficient frontier in this example with the location of
the tangency portfolio (the point shown on the efficient frontier) and a dashed
line representing the risk-expected return pairs for portfolios constructed from
the tangency portfolio and the risk-free asset.
Note that, since the efficient frontier of the two risky assets lies below the
dashed line, for any desired level of risk, a portfolio based on the tangency
portfolio and the risk-free asset will have an expected return at least as large
as (and usually larger than) that of any portfolio on the efficient frontier with
that level of risk.
Consider the problem of finding the best combination of risky assets, with
returns R1 , and R2 , and risk-free asset, with return Rf ; that is, consider the
problem of finding the optimal portfolio return of the form
wf Rf + (1 − wf )[wR1 + (1 − w)R2 ].
The previous results show that the optimal solution is to first take w = wT ,
corresponding to the tangency portfolio; this gives the optimal combination
of risky assets. Let μT and σT denote the mean and standard deviation of the
return on the tangency portfolio.
Given a desired level of risk s, we then choose wf so that wf Rf + (1 − wf )RT
has a standard deviation equal to s; because this standard deviation is
(1 − wf )σT , take wf = 1 − s/σT . Alternatively,
given a desired value for the expected return, m, we find wf so that
wf μf + (1 − wf )μT = m.
Note that, according to this theory, all investors should use the same com-
bination of risky assets; only the proportion of the tangency portfolio versus
the risk-free asset depends on the investor’s goals.
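Once the tangency portfolio is known, the weight on the risk-free asset follows from the investor's target. The sketch below assumes hypothetical values for μf , μT , and σT ; the function names are also hypothetical. It uses the fact that wf Rf + (1 − wf )RT has standard deviation (1 − wf )σT when wf ≤ 1.

```r
# Hypothetical inputs: risk-free rate and tangency-portfolio mean and risk
mu_f <- 0.05
mu_T <- 0.15
sigma_T <- 0.0612

# Weight on the risk-free asset giving portfolio standard deviation s
wf_for_risk <- function(s) 1 - s / sigma_T

# Weight on the risk-free asset giving portfolio expected return m,
# from wf * mu_f + (1 - wf) * mu_T = m
wf_for_mean <- function(m) (mu_T - m) / (mu_T - mu_f)

wf_s <- wf_for_risk(0.03)
wf_m <- wf_for_mean(0.10)
c(wf_for_risk = wf_s, wf_for_mean = wf_m)
```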
4.8 Exercises
1. Suppose that there are N assets, with log-returns r1 , r2 , . . . , rN .
Consider a portfolio placing weight wj on asset j, j = 1, 2, . . . , N
and let rp denote the log-return on the portfolio.
Is it true that
rp = w1 r1 + w2 r2 + · · · + wN rN ?
E(R1 ) = E(R2 ).
Describe the opportunity set and the efficient frontier for these
assets.
9. For the assets described in Exercise 2 find the minimum-variance
portfolio.
10. Let wmv denote the weight given to asset 1 in the minimum-variance
portfolio. Find conditions on ρ12 , in terms of σ1 , σ2 , so that 0 <
wmv < 1.
11. For the assets described in Exercise 5, find the weight of asset 1 in
the risk-averse portfolio with parameter λ.
19. Let R1 and R2 denote the returns on two assets. Suppose that
R2 = R1 + ε where E(ε) = 0 and R1 and ε are uncorrelated. Assume
that Var(ε) > 0. Thus, the return on asset 2 is equal to the return
on asset 1 plus “noise.”
a. Find the minimum variance portfolio of R1 , R2 .
b. Find the tangency portfolio of R1 , R2 .
5.1 Introduction
In Chapter 4, we considered the problem of constructing a portfolio of two
risky assets, possibly together with a risk-free asset. The focus of this chapter
is the extension of those results to the general case of N risky assets, for an
integer N ≥ 2.
Although the basic approaches we will consider are the same as the ones
used in the N = 2 case, there are a number of important details in which
the general N case is different. The most obvious difference is in the scale
of the problem. When N = 2, the portfolio is described by a single weight,
w, representing the investment in asset 1, with 1 − w invested in asset 2; the
mean and variance of the portfolio return are linear and quadratic functions,
respectively, of w. The statistical properties of the portfolio returns depend
on five parameters: two mean returns, two return standard deviations, and
the correlation of the returns.
In the general N case, the mean and variance of the portfolio return
are functions of the weights w1 , w2 , . . . , wN ; because the weights must sum
to 1, there are effectively N − 1 weights that must be selected. The mean
and standard deviation of the portfolio return depend on N asset expected
returns, N return standard deviations, and N (N − 1)/2 correlations between
the returns on different assets, representing the ways in which the asset returns
are interrelated.
Perhaps the most important mathematical difference between the two cases
is that, in the two-asset case, there is only one portfolio with a given mean
return. That is, if we require a portfolio mean return of 0.02, for example, this
specifies the value of w that must be used. Furthermore, this value of w deter-
mines the value of the return standard deviation σp , so that there is only a single
possible value of σp for a given value of μp . That is, σp is a function of μp . When
N > 2, typically there are infinitely many portfolios with a given mean return.
$$E(R_p) = \sum_{j=1}^{N} w_j E(R_j) = \sum_{j=1}^{N} w_j \mu_j$$
where μj = E(Rj ), j = 1, 2, . . . , N .
The standard deviation of Rp depends on the standard deviations of R1 ,
R2 , . . . , RN , but it also depends on the relationships among R1 , R2 , . . . , RN ,
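as measured by their covariances or correlations. Specifically,
$$\mathrm{Var}(R_p) = \sum_{j=1}^{N} w_j^2\,\mathrm{Var}(R_j) + 2\sum_{i<j} w_i w_j\,\mathrm{Cov}(R_i, R_j), \qquad (5.1)$$
where the second summation in this expression is the sum over all i, j from 1
to N for which i is less than j.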
Let σj denote the standard deviation of Rj , j = 1, 2, . . . , N , let σij denote
the covariance of Ri , Rj for i, j = 1, 2, . . . , N , i ≠ j, and let σp denote the
standard deviation of the portfolio with weights w1 , w2 , . . . , wN . Then (5.1)
may be written
$$\sigma_p^2 = \sum_{j=1}^{N} w_j^2\sigma_j^2 + 2\sum_{i<j} w_i w_j \sigma_{ij}.$$
Matrix Notation
In the case of N assets, the expressions for the expected return and risk of a
portfolio may be conveniently expressed using matrix notation. Let
$$R = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_N \end{pmatrix}$$
$$\mu_p = \sum_{j=1}^{N} w_j\mu_j = w^T\mu,$$
Thus, Σij , the (i, j)th element of Σ, is Cov(Ri , Rj ) for i ≠ j and is Var(Ri ) for
i = j. Because Cov(Ri , Rj ) = Cov(Rj , Ri ), so that σij = σji , Σ is a symmetric
matrix.
A covariance matrix gives a particularly simple way of expressing the variance
of a linear function of a random vector. Let a denote an N -dimensional
vector, that is, an element of $\mathbb{R}^N$; then $a^TR = \sum_{j=1}^{N} a_jR_j$ and
$$\mathrm{Var}(a^TR) = a^T\Sigma a.$$
and
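$$\begin{aligned}
a^T\Sigma a &= \begin{pmatrix} a_1 & a_2 & \cdots & a_N \end{pmatrix}
\begin{pmatrix}
a_1\sigma_1^2 + a_2\sigma_{12} + \cdots + a_N\sigma_{1N} \\
a_1\sigma_{21} + a_2\sigma_2^2 + a_3\sigma_{23} + \cdots + a_N\sigma_{2N} \\
\vdots \\
a_1\sigma_{N1} + a_2\sigma_{N2} + \cdots + a_{N-1}\sigma_{N,N-1} + a_N\sigma_N^2
\end{pmatrix} \\
&= (a_1^2\sigma_1^2 + a_1a_2\sigma_{12} + \cdots + a_1a_N\sigma_{1N}) \\
&\quad + (a_2a_1\sigma_{21} + a_2^2\sigma_2^2 + a_2a_3\sigma_{23} + \cdots + a_2a_N\sigma_{2N}) \\
&\quad + \cdots + (a_Na_1\sigma_{N1} + \cdots + a_Na_{N-1}\sigma_{N,N-1} + a_N^2\sigma_N^2).
\end{aligned}$$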
Note that, in this sum, each term of the form $a_j^2\sigma_j^2$ occurs once and each term
of the form $a_ja_k\sigma_{jk}$, j < k, occurs twice; hence, the sum is equal to
$$\sum_{j=1}^{N} a_j^2\sigma_j^2 + 2\sum_{j<k} a_ja_k\sigma_{jk}.$$
In particular, (5.1) for the variance of the return on the portfolio based on
the weight vector w may be written using matrix notation as
Var(Rp ) = wT Σw.
$$\mathrm{Cov}(a^TR, b^TR) = a^T\Sigma b$$
Consider the portfolio with the weight vector (0.20, 0.30, 0.10, 0.40)T .
To calculate the mean and variance of the portfolio return in R, we may
use the following commands.
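One way to carry out this calculation is sketched below. It assumes that the mean vector and covariance matrix of the four assets are those listed for the same assets in Example 5.5; the variable names are illustrative.

```r
# Mean vector and covariance matrix of the four assets (as in Example 5.5)
mu <- c(0.10, 0.20, 0.05, 0.10)
Sigma <- matrix(c(0.05, 0.01, 0.02, 0.00,
                  0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10,
                  0.00, 0.02, 0.10, 0.20), nrow = 4, byrow = TRUE)

# Portfolio weight vector
w <- c(0.20, 0.30, 0.10, 0.40)

# Mean w'mu and variance w'Sigma w of the portfolio return
port_mean <- sum(w * mu)
port_var <- drop(t(w) %*% Sigma %*% w)
c(mean = port_mean, variance = port_var, sd = sqrt(port_var))
```

Here `drop` converts the 1 × 1 matrix returned by the matrix product into a plain number.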
aT Σa = 0 if and only if a = 0N ,
where $V^{1/2}$, the matrix square root of V , is the diagonal matrix with the
asset standard deviations on the diagonal.
The requirement that Σ be nonnegative definite can be expressed in terms
of C: using the fact that all σj are nonnegative, Σ is nonnegative definite
if and only if C is nonnegative definite. This condition on C is the N -asset
analogue of the requirement that −1 ≤ ρ12 ≤ 1 in the two-asset case.
Similarly, Σ is positive definite if and only if all σj are positive and C
is positive definite, the analogue of the condition that −1 < ρ12 < 1 in the
two-asset case. In particular, the condition that −1 < ρij < 1 for all i ≠ j is
not sufficient for the return covariance matrix Σ to be positive definite.
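A small numerical illustration of this last point is sketched below. The correlation values and standard deviations are hypothetical, chosen only so that every pairwise correlation lies strictly between −1 and 1 while the matrix still fails to be nonnegative definite.

```r
# Hypothetical "correlation matrix": every off-diagonal entry is in (-1, 1)
C <- matrix(c( 1.0,  0.9,  0.9,
               0.9,  1.0, -0.9,
               0.9, -0.9,  1.0), nrow = 3, byrow = TRUE)

# Eigenvalues in decreasing order; a negative value means C is not
# nonnegative definite, so it cannot be a valid correlation matrix
ev <- eigen(C, symmetric = TRUE)$values
ev

# Corresponding "covariance matrix" with hypothetical standard deviations
sds <- c(0.5, 0.4, 0.6)
Sigma <- diag(sds) %*% C %*% diag(sds)

# The eigenvector for the smallest eigenvalue gives a vector a for which
# a' Sigma a, a supposed variance, is negative
a <- eigen(Sigma, symmetric = TRUE)$vectors[, 3]
neg_var <- drop(t(a) %*% Sigma %*% a)
neg_var
```

Intuitively, assets 2 and 3 cannot both be strongly positively correlated with asset 1 and strongly negatively correlated with each other.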
Diversification
In Section 4.2, we saw the benefits of a diversified portfolio of two assets.
When there are N assets available, the analysis is more complicated but the
benefits of diversification are potentially even greater.
Example 5.3 Consider a set of N assets and suppose that the returns on the
assets all have mean μ, standard deviation σ, and that they are uncorrelated.
Therefore, μ, the mean vector of the returns, may be written μ1, and Σ,
the covariance matrix of the returns, is σ2 IN where IN denotes the N × N
identity matrix; when the dimension of the identity matrix is clear from the
context, we will write it simply as I.
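Consider an equally-weighted portfolio with weights w1 = w2 = · · · = wN =
1/N , which we may write as w = (1/N )1. Then the expected return on the
portfolio is
$$w^T\boldsymbol{\mu} = \frac{1}{N}\mathbf{1}^T(\mu\mathbf{1}) = \frac{\mu}{N}\mathbf{1}^T\mathbf{1} = \frac{\mu}{N}N = \mu$$
and the variance of the portfolio return is
$$w^T\Sigma w = \left(\frac{1}{N}\mathbf{1}\right)^T(\sigma^2 I)\left(\frac{1}{N}\mathbf{1}\right) = \frac{\sigma^2}{N^2}\mathbf{1}^TI\mathbf{1} = \frac{\sigma^2}{N};$$
note that, for any matrix A, 1T A1 is the sum of all elements of A. Thus, the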
larger the number of assets under consideration, the smaller is the variance
of the equally-weighted portfolio; that is, the larger the number of assets,
the greater is the benefit of diversification, at least in this simple setting.
Furthermore, when the asset returns are uncorrelated, as we have assumed,
the portfolio variance approaches zero as N increases.
The previous example shows that when there are a large number of assets
with uncorrelated returns, then it is possible to construct a portfolio with
a small standard deviation. The following example shows that having the
returns be uncorrelated is important for this result to hold.
Example 5.4 Consider the same scenario as in Example 5.3, except that now
assume that the correlation of any two asset returns is ρ, where 0 < ρ < 1.
Then,
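$$\Sigma = \sigma^2\begin{pmatrix}
1 & \rho & \cdots & \cdots & \rho \\
\rho & 1 & \rho & \cdots & \rho \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
\rho & \cdots & \rho & 1 & \rho \\
\rho & \cdots & \cdots & \rho & 1
\end{pmatrix}. \qquad (5.4)$$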
Therefore, for any 0 < ρ < 1, σp < σ. However, unlike the case of uncorrelated
assets, if ρ > 0, the standard deviation of the portfolio return cannot be made
arbitrarily small by including more assets.
If ρ is negative, it is possible for σp to be close to zero. However, a negative
value of ρ is generally unrealistic—try to imagine a large number of assets such
that, for any pair, a greater than average return on one corresponds to less
than average returns on the others.
It may be shown that, for Σ of the form (5.4), the minimum possible
variance of a portfolio return is
$$\frac{\sigma^2}{N} + \left(1 - \frac{1}{N}\right)\rho\sigma^2.$$
Although the specific results given here depend on the form of Σ, the basic
conclusion holds more generally. That is, when asset returns are positively
correlated, as is usually the case, diversification generally reduces the risk of
a portfolio, but there are limits to its benefits.
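The limits of diversification under a covariance matrix of the form (5.4) can be seen numerically. The sketch below computes the variance of the equally-weighted portfolio directly from wT Σw and compares it with σ²/N + (1 − 1/N )ρσ²; the values σ = 0.2 and ρ = 0.3, the function name, and the choices of N are illustrative assumptions.

```r
# Variance of the equally-weighted portfolio when all assets have standard
# deviation sigma and every pair of returns has correlation rho
equal_weight_var <- function(N, sigma, rho) {
  Sigma <- sigma^2 * ((1 - rho) * diag(N) + rho * matrix(1, N, N))
  w <- rep(1 / N, N)
  drop(t(w) %*% Sigma %*% w)
}

sigma <- 0.2; rho <- 0.3   # hypothetical values
N <- c(2, 10, 100, 500)

v_unc <- sapply(N, equal_weight_var, sigma = sigma, rho = 0)
v_corr <- sapply(N, equal_weight_var, sigma = sigma, rho = rho)
v_formula <- sigma^2 / N + (1 - 1 / N) * rho * sigma^2

cbind(N, uncorrelated = v_unc, correlated = v_corr, formula = v_formula)
```

With uncorrelated returns the variance shrinks toward zero as N grows, while with ρ > 0 it stays above ρσ².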
FIGURE 5.1
Opportunity set for an N > 2 case. [Plot of μp against σp.]
FIGURE 5.2
Minimum risk corresponding to an expected return of m. [Plot of μp against σp; the value m is marked on the μp axis.]
are infinitely many portfolios with a given mean return, we are only interested
in the one with the smallest return standard deviation.
Graphically, the portfolio with a given mean return m that has the small-
est standard deviation corresponds to the leftmost value on the line segment
of points (σp , m) in the opportunity set; see Figure 5.2, where the value of
(σp , μp ) for a minimum risk portfolio with mean m is indicated with a dot.
The solid horizontal line represents the possible values of σp corresponding to
μp = m.
The weight vector for this portfolio may be obtained by minimizing
Var(Rp ) subject to the restriction that E(Rp ) = m. That is, we solve a
constrained minimization problem:
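$$\begin{aligned}
\underset{w \in \mathbb{R}^N}{\text{minimize}} \quad & w^T\Sigma w \\
\text{subject to} \quad & w^T\mu = m \\
& w^T\mathbf{1} = 1.
\end{aligned} \qquad (5.6)$$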
The weight vector for the minimum-risk portfolio for any given mean return
m is the solution to a minimization problem of this type. In terms of the
opportunity set of possible pairs (σp , μp ), as shown in Figure 5.1, solving (5.6)
for each value of m finds the left boundary of the set; see Figure 5.3. This
boundary is known as the minimum-risk frontier.
For convenience, we will use the term “minimum-risk frontier” to refer
to risk-expected return pairs of the form (σp , μp ) as well as to refer to the
corresponding portfolios and their weight functions. For instance, the state-
ment that a weight vector is on the minimum-risk frontier means that the
FIGURE 5.3
An example of a minimum-risk frontier. [Plot of μp against σp.]
with equality if and only if either one of x and y is the zero vector or x = cy
for some scalar c.
Zero-Investment Portfolios
Consider a vector $v \in \mathbb{R}^N$ such that v T 1 = 0. Then the portfolio based on the
vector v, that is, with return v T R, has no net investment: the coordinates
of v that are greater than zero are offset by one or more coordinates that take
negative values. A portfolio based on weights given by such a vector is said to
be a zero-investment portfolio; the vector v will be called a zero-investment
weight vector, to distinguish such vectors from standard weight vectors that
sum to 1.
Let w1 , w2 be two portfolio weight vectors. Then v = w1 − w2 satisfies
v T 1 = 0 so that w1 − w2 is a zero-investment weight vector. Conversely, if
w is a portfolio weight vector and v is a zero-investment weight vector, then
w + v is also a portfolio weight vector.
Define a set of N -dimensional vectors by
$$V_0 = \{v \in \mathbb{R}^N : v^T\mathbf{1} = 0,\ v^T\mu = 0\}. \qquad (5.7)$$
If δ ∈ V0 , then δT R is the return on a zero-investment portfolio that has zero
expected return; in this chapter, δ generally will be used to denote an element
of V0 .
Elements of V0 play a central role in describing the minimum-risk frontier
because if w satisfies the constraints wT 1 = 1 and wT μ = m and v ∈ V0 ,
then w + v also satisfies these constraints:
(w + v)T 1 = wT 1 + v T 1 = 1 + 0 = 1
and
(w + v)T μ = wT μ + v T μ = m + 0 = m.
Conversely, if w1 and w2 are weight vectors satisfying the constraint that
$w_j^T\mu = m$ for j = 1, 2, then w1 − w2 ∈ V0 . Note that V0 is a linear subspace
of $\mathbb{R}^N$, with dimension N − 2 (provided that μ is not proportional to 1).
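A basis for V0 can be computed numerically as the orthogonal complement of the columns 1 and μ. The sketch below uses a QR decomposition for this purpose, which is one convenient choice rather than the only one; it assumes the mean vector of Example 5.5, and the weight vector and basis coefficients are arbitrary illustrative values.

```r
# Mean vector of the four assets in Example 5.5
mu <- c(0.10, 0.20, 0.05, 0.10)
N <- length(mu)

# Constraint matrix with columns 1 and mu (assumed to have rank 2)
A <- cbind(rep(1, N), mu)

# Columns 3, ..., N of the full Q factor are orthogonal to both 1 and mu,
# so they form an orthonormal basis for V0
Q <- qr.Q(qr(A), complete = TRUE)
V0_basis <- Q[, 3:N, drop = FALSE]
max(abs(t(V0_basis) %*% A))   # numerically zero

# Adding an element of V0 to a weight vector preserves both constraints
w <- c(0.20, 0.30, 0.10, 0.40)
delta <- drop(V0_basis %*% c(0.5, -1))
c(sum_before = sum(w), sum_after = sum(w + delta),
  mean_before = sum(w * mu), mean_after = sum((w + delta) * mu))
```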
and
(ŵ + zδ)T μ = ŵT μ + zδT μ = m̂ + z(0) = m̂.
That is, weight vectors of the form ŵ + zδ satisfy the constraints in the
minimization problem (5.6).
Define
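$$f(z) = (\hat{w} + z\delta)^T\Sigma(\hat{w} + z\delta), \quad z \in \mathbb{R},$$
so that
$$f'(z) = 2\hat{w}^T\Sigma\delta + 2z\,\delta^T\Sigma\delta$$
and $f'(0) = 2\hat{w}^T\Sigma\delta$. Because ŵ minimizes the variance subject to the
constraints, f is minimized at z = 0, so that $f'(0) = 0$. It follows that
$\hat{w}^T\Sigma\delta = 0$. Because the value of δ in V0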
is arbitrary, we have shown that if the portfolio with weight vector ŵ is on
the minimum-risk frontier, then
Let m̂ = ŵT μ. We need to show that, for any weight vector w satisfying
$w^T\mu = \hat{m}$,
is a scalar,
ŵT Σ(w − ŵ) = (w − ŵ)T Σŵ.
By (5.8), the cross-product term in (5.9) is zero. Therefore,
It follows that wT Σw ≥ ŵT Σŵ. Because this holds for any w satisfying
wT 1 = 1 and wT μ = m̂, it follows that ŵ solves the constrained minimization
problem (5.6) with m = m̂; that is, ŵ is on the minimum-risk frontier.
Corollary 5.1. For j = 1, 2, let R̂pj denote the return on the portfolio with
weight vector ŵj . Suppose that both portfolios are on the minimum-risk
frontier and that E(R̂p1 ) = E(R̂p2 ). Then ŵ1 = ŵ2 .
Cov(Rp , R0 ) = 0.
Cov(R̂p , R1 ) = Cov(R̂p , R2 ).
are on the minimum-risk frontier. Note that a weight vector of this form
may be viewed as the weight vector of a portfolio constructed from the two
portfolios having weight vectors ŵ1 and ŵ2 , respectively.
To establish this result, first note that
using the fact that ŵ1T 1 and ŵ2T 1 are both 1. Furthermore, if ŵjT Σδ = 0 for
all δ ∈ V0 , for j = 1, 2 then for all δ ∈ V0
It now follows from Proposition 5.1 that the portfolio with weight vector
z ŵ1 + (1 − z)ŵ2 is on the minimum-risk frontier.
A formal statement of this result is given in the following corollary. Although
the result in Corollary 5.3 applies to two portfolios, clearly it can be extended
to a portfolio formed from a finite number of portfolios on the minimum-risk
frontier.
Corollary 5.3. Suppose that ŵ1 and ŵ2 are the weight vectors of two portfolios
on the minimum-risk frontier. Then, for any $z \in \mathbb{R}$, z ŵ1 + (1 − z)ŵ2 is
the weight vector of a portfolio on the minimum-risk frontier.
Let ŵ1 and ŵ2 be the two weight vectors in Corollary 5.3 and let
$m_j = \hat{w}_j^T\mu$, j = 1, 2; that is, the portfolio with weight vector ŵj has mean return
mj , j = 1, 2. Note that the portfolio with weight vector z ŵ1 + (1 − z)ŵ2 has
mean return zm1 + (1 − z)m2 ; if m1 ≠ m2 , then any real number can be written
as zm1 + (1 − z)m2 for some z. Therefore, according to Corollary 5.3, the
weight vector of any portfolio on the minimum-risk frontier can be written in
terms of the weight vectors ŵ1 and ŵ2 ; the details are given in the following
result.
Lemma 5.2. Let m1 and m2 denote distinct real numbers; for j = 1, 2, let
ŵj denote the weight vector of the minimum-risk portfolio with mean return
mj . Then, for any given $m \in \mathbb{R}$,
$$w_m\hat{w}_1 + (1 - w_m)\hat{w}_2$$
is the weight vector of the minimum-risk portfolio with mean return m, where
$$w_m = \frac{m - m_2}{m_1 - m_2}.$$
Lemma 5.2 shows that the entire minimum-risk frontier may be generated
from two portfolios; thus, it is sometimes called the two-fund theorem. This
result shows that, in some respects, portfolio theory for an arbitrary number
of assets is essentially the same as portfolio theory for the case of two assets,
as discussed in the previous chapter.
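The two-fund property can be checked numerically. The sketch below solves problem (5.6) through its first-order (Lagrange multiplier) conditions, written as a single linear system; this solution method is an assumption made for illustration and uses only base R. The function name `min_risk_weights` is hypothetical, and the data are those of Example 5.5.

```r
# Assets of Example 5.5
mu <- c(0.10, 0.20, 0.05, 0.10)
Sigma <- matrix(c(0.05, 0.01, 0.02, 0.00,
                  0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10,
                  0.00, 0.02, 0.10, 0.20), nrow = 4, byrow = TRUE)

# Minimum-risk weights for mean return m, from the first-order conditions
#   2 Sigma w + A l = 0,  t(A) w = (1, m),  with A = [1, mu]
min_risk_weights <- function(m, mu, Sigma) {
  N <- length(mu)
  A <- cbind(rep(1, N), mu)
  K <- rbind(cbind(2 * Sigma, A),
             cbind(t(A), matrix(0, 2, 2)))
  rhs <- c(rep(0, N), 1, m)
  unname(solve(K, rhs)[1:N])
}

# Two portfolios on the minimum-risk frontier
m1 <- 0.10; m2 <- 0.20
w1 <- min_risk_weights(m1, mu, Sigma)
w2 <- min_risk_weights(m2, mu, Sigma)

# Lemma 5.2: combine them to obtain the frontier portfolio with mean m
m <- 0.15
w_m <- (m - m2) / (m1 - m2)
w_two_fund <- w_m * w1 + (1 - w_m) * w2
w_direct <- min_risk_weights(m, mu, Sigma)
rbind(w_two_fund, w_direct)
```

The two rows agree up to rounding error, as Lemma 5.2 implies.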
For instance, the set of possible variances of portfolios on the minimum-
risk frontier is a parabola. It follows that there is a minimum possible variance
of portfolios on the minimum-risk frontier and this minimum is achieved by a
single portfolio, called the minimum-variance portfolio. The properties of the
minimum-variance portfolio will be discussed in the following section.
Furthermore, if one portfolio on the minimum-risk frontier has a given variance,
then, unless that variance is the minimum variance, there is a second portfolio
with the same variance, as was illustrated in Figure 4.1.
Example 5.5 Consider the set of four assets described in Example 5.1, with
mean return vector (0.10, 0.20, 0.05, 0.10)T and return covariance matrix
$$\Sigma = \begin{pmatrix}
0.05 & 0.01 & 0.02 & 0 \\
0.01 & 0.10 & 0.05 & 0.02 \\
0.02 & 0.05 & 0.20 & 0.10 \\
0 & 0.02 & 0.10 & 0.20
\end{pmatrix}.$$
> mu
[1] 0.10 0.20 0.05 0.10
> Sigma
[,1] [,2] [,3] [,4]
[1,] 0.05 0.01 0.02 0.00
[2,] 0.01 0.10 0.05 0.02
[3,] 0.02 0.05 0.20 0.10
[4,] 0.00 0.02 0.10 0.20
wT Σw
and the constraints are wT 1 = 1 and wT μ = 0.2, which may also be written
as 1T w = 1 and μT w = 0.2.
Therefore, in the notation of solve.QP, d is the zero vector of length 4,
D is 2Σ, A is a 4 × 2 matrix with the first column given by a vector of all
ones and the second column given by the mean vector μ, and b is the vector
(1, 0.2). Thus, the constraint AT w = b specifies that the portfolio weights
sum to 1 and that the mean return on the portfolio is 0.2.
Therefore, the R commands needed to solve this constrained optimization
problem are as follows:
> library(quadprog)
> A<-cbind(c(1,1,1,1), mu)
> t(A)
   [,1] [,2] [,3] [,4]
    1.0  1.0 1.00  1.0
mu  0.1  0.2 0.05  0.1
> mrf1<-solve.QP(Dmat=2*Sigma, dvec=rep(0,4), Amat=A, bvec=c(1, 0.2),
+ meq=2)
Note that cbind combines two vectors or matrices by the columns; when the
vector c(1,1,1,1) is used in this context, it is interpreted as a column vector.
The weight vector that minimizes the objective function is the component
$solution of the result of the function solve.QP; therefore, the solution to
the constrained minimization problem is
> mrf1$solution
[1] 0.362 0.813 -0.374 0.199
Thus, the mean and standard deviation of the return on the portfolio
corresponding to the weight vector mrf1$solution are given by
> sum(mrf1$solution*mu)
[1] 0.2
> (mrf1$solution%*%Sigma%*%mrf1$solution)^.5
[,1]
[1,] 0.265
The calculations are easily repeated for other values of m. For example,
for m = 0.25,
It follows that
$$f'(0) = 2\,\mathrm{Cov}(R_{mv}, R_p - R_{mv})$$
so that
Cov(Rmv , Rp − Rmv ) = 0;
using properties of covariance,
so that
Cov(Rmv , Rp ) = Var(Rmv ),
as stated in the proposition.
Now suppose that Cov(R̂, Rp ) = Var(R̂) holds for any portfolio return Rp .
Because
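$$\begin{aligned}
\mathrm{Var}(R_p) &= \mathrm{Var}\{\hat{R} + (R_p - \hat{R})\} \\
&= \mathrm{Var}(\hat{R}) + 2\,\mathrm{Cov}(\hat{R}, R_p - \hat{R}) + \mathrm{Var}(R_p - \hat{R}) \\
&= \mathrm{Var}(\hat{R}) + 2\{\mathrm{Cov}(\hat{R}, R_p) - \mathrm{Var}(\hat{R})\} + \mathrm{Var}(R_p - \hat{R}) \\
&= \mathrm{Var}(\hat{R}) + \mathrm{Var}(R_p - \hat{R}),
\end{aligned}$$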
it follows that
Var(Rp ) ≥ Var(R̂).
Because this holds for any portfolio return Rp , R̂ must be the return on
the minimum-variance portfolio.
One consequence of Proposition 5.2 is that the return on the minimum-
variance portfolio is uncorrelated with the return on any zero-investment
portfolio. Note that, if there were a zero-investment portfolio with weight
vector v such that Rmv and R0 = v T R are correlated, then we could find a
constant c such that the portfolio with return Rmv + cR0 has a smaller return
variance than does Rmv .
The following corollary gives a formal statement of this result; the proof
is left as an exercise.
Corollary 5.4. Let Rmv denote the return on the minimum-variance portfo-
lio and let R0 denote the return on a zero-investment portfolio. Then
Cov(Rmv , R0 ) = 0.
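The corollary is easy to check numerically. The following sketch assumes the four-asset covariance matrix used in the solve.QP examples and an arbitrary, hypothetical zero-investment weight vector v; any vector whose elements sum to zero could be used instead.

```r
# Assumed four-asset covariance matrix.
Sigma <- matrix(c(0.05, 0.01, 0.02, 0.00,
                  0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10,
                  0.00, 0.02, 0.10, 0.20), nrow = 4, byrow = TRUE)
ones <- rep(1, 4)

# Minimum-variance weights: proportional to Sigma^{-1} 1, rescaled to sum to 1.
w_mv <- solve(Sigma, ones) / sum(solve(Sigma, ones))

# Hypothetical zero-investment portfolio: the weights sum to zero.
v <- c(0.5, -0.2, -0.4, 0.1)

# Cov(R_mv, R_0) = w_mv' Sigma v, which should be zero up to rounding error.
cov_mv_v <- drop(t(w_mv) %*% Sigma %*% v)
cov_mv_v
```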
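and because $w^T \mathbf{1} = 1$,
$$w^T \Sigma w \ge \frac{1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}$$
with equality if $w = c\,\Sigma^{-1}\mathbf{1}$; that is, the weight vector of the
minimum-variance portfolio must be of the form $c\,\Sigma^{-1}\mathbf{1}$ for some constant c. Since
the weights must sum to 1, we must have
$$c = \frac{1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}},$$
proving the result.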
Example 5.6 Consider a set of three assets, with mean returns 0.25, 0.125,
and 0.3, respectively, and suppose that the returns have covariance matrix
$$\Sigma = \begin{pmatrix} 0.25 & 0.1 & 0.24 \\ 0.1 & 0.16 & 0.096 \\ 0.24 & 0.096 & 0.36 \end{pmatrix}. \tag{5.14}$$
Hence, the asset returns have standard deviations 0.5, 0.4, and 0.6, respectively,
and their correlation matrix is
$$\begin{pmatrix} 1 & 0.5 & 0.8 \\ 0.5 & 1 & 0.4 \\ 0.8 & 0.4 & 1 \end{pmatrix}. \tag{5.15}$$
The mean vector and covariance matrix may be entered into R using the
commands
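The original commands are not reproduced at this point; the following is a minimal sketch of one way to enter the data. The variable names mu and Sig follow those used in later examples, while the use of byrow = TRUE and of cov2cor to recover (5.15) are choices made here for illustration.

```r
# Mean vector of the three assets.
mu <- c(0.25, 0.125, 0.3)

# Covariance matrix (5.14), entered row by row.
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)

# Standard deviations are the square roots of the diagonal elements.
sqrt(diag(Sig))

# cov2cor rescales a covariance matrix to the corresponding correlation matrix.
cov2cor(Sig)

# Weight vector of the minimum-variance portfolio, Sigma^{-1} 1 rescaled to sum to 1.
w_mv <- solve(Sig, rep(1, 3)) / sum(solve(Sig, rep(1, 3)))
round(w_mv, 3)
```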
Example 5.7 Consider a set of N assets with covariance matrix of the form
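$$\Sigma = \sigma^2 \begin{pmatrix}
1 & \rho & \cdots & \cdots & \rho \\
\rho & 1 & \rho & \cdots & \rho \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
\rho & \cdots & \rho & 1 & \rho \\
\rho & \cdots & \cdots & \rho & 1
\end{pmatrix} \tag{5.16}$$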
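$$\mathbf{1} = \Sigma^{-1}\Sigma\mathbf{1} = \sigma^2\{1 + (N-1)\rho\}\,\Sigma^{-1}\mathbf{1}$$
$$\frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma^{-1}\mathbf{1}} = \frac{1}{N}\mathbf{1}$$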
so that the equally weighted portfolio is the minimum-variance portfolio.
To find the variance of the minimum-variance portfolio, we use the fact
that
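$$\left(\frac{1}{N}\mathbf{1}\right)^T \Sigma \left(\frac{1}{N}\mathbf{1}\right) = \frac{\sigma^2}{N^2}\{N + N(N-1)\rho\} = \frac{\sigma^2}{N} + \left(1 - \frac{1}{N}\right)\rho\sigma^2.$$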
This is the minimum possible variance for a portfolio based on a return vector
with a covariance matrix of the form (5.16).
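The conclusions of this example can be confirmed numerically. The sketch below uses hypothetical values N = 5, σ² = 0.04, and ρ = 0.3, chosen only for illustration, and compares the computed minimum-variance weights and variance with the expressions derived above.

```r
# Hypothetical parameter values for a covariance matrix of the form (5.16).
N <- 5
sigma2 <- 0.04
rho <- 0.3

# sigma^2 times a matrix with ones on the diagonal and rho elsewhere.
Sigma <- sigma2 * ((1 - rho) * diag(N) + rho * matrix(1, N, N))

# Minimum-variance weights: Sigma^{-1} 1 rescaled to sum to 1.
w_mv <- solve(Sigma, rep(1, N))
w_mv <- w_mv / sum(w_mv)
w_mv                                   # each weight should equal 1/N

# Variance of the minimum-variance portfolio, computed directly and by formula.
var_mv <- drop(t(w_mv) %*% Sigma %*% w_mv)
var_formula <- sigma2 / N + (1 - 1 / N) * rho * sigma2
c(var_mv, var_formula)
```

Note that as N grows the first term vanishes but the second approaches ρσ², so with positively correlated returns diversification cannot reduce the variance below ρσ².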
FIGURE 5.4
An example of an efficient frontier. (Plot not reproduced; one axis is labeled σp.)
of the minimum-variance portfolio. The following result shows that this is, in
fact, the case.
Lemma 5.3. Suppose there are two distinct portfolios on the minimum-risk
frontier, with returns Rp1 and Rp2 , respectively. Let μj = E(Rpj ), j = 1, 2
and suppose that Var(Rp1 ) = Var(Rp2 ). Then
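$$\mu_{mv} = \frac{1}{2}(\mu_1 + \mu_2)$$
$$w_z = z w_1 + (1 - z) w_2$$
$$\frac{d}{dz}\, w_z^T \Sigma w_z = (4z - 2)\left(w_1^T \Sigma w_1 - w_2^T \Sigma w_1\right) = 0.$$
The result in (5.17) shows that $w_1^T \Sigma w_1 - w_2^T \Sigma w_1 \ne 0$; therefore, 4z − 2 = 0
or z = 1/2.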
Thus, the portfolio on the minimum-risk frontier with the smallest variance
has weight vector
$$\frac{1}{2} w_1 + \frac{1}{2} w_2;$$
because the minimum variance portfolio is on the minimum-risk frontier, we
must have
$$\frac{1}{2} w_1 + \frac{1}{2} w_2 = w_{mv}.$$
Furthermore, we must have
$$\frac{1}{2} \mu_1 + \frac{1}{2} \mu_2 = \mu_{mv}.$$
Clearly, this cannot hold if either μ1 and μ2 are both greater than μmv or if
both μ1 and μ2 are less than μmv . The result follows.
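$$\bar{v} = \Sigma^{-1}(\mu - \mu_{mv}\mathbf{1})$$
where
$$\bar{v} = \Sigma^{-1}(\mu - \mu_{mv}\mathbf{1}) \tag{5.22}$$
as stated in the proposition.
$$\bar{v}^T \mathbf{1} = (\mu - \mu_{mv}\mathbf{1})^T \Sigma^{-1} \mathbf{1}$$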
> w_mv
[1] 0.243 0.713 0.044
> m<-sum(w_mv*mu)
> m
[1] 0.163
> vbar<-solve(Sig, mu - m*c(1,1,1))
> vbar
[1] 0.194 -0.607 0.413
and for λ = 4,
FIGURE 5.5
Weights of the risk-averse portfolio in Example 5.8 as λ varies. (Plot not reproduced: the weight is plotted against λ, with one curve each for Asset 1, Asset 2, and Asset 3.)
Figure 5.5 contains a plot of the weights for the three assets as λ varies.
Note that the weight for asset 1 is relatively constant, while the weight for
asset 2 is small, or even negative, for λ < 1 and increases rapidly as λ increases
over the range (1, 3). For λ > 5, the weights are stable, being approximately
equal to the weights of the minimum variance portfolio.
For λ = 1, the mean return of the portfolio with weight vector wλ is
> sum((w_mv + vbar)*mu)
[1] 0.260
and the standard deviation of the return is
> (t(w_mv + vbar)%*%Sig%*%(w_mv + vbar))^.5
[,1]
[1,] 0.489
For λ = 4, the mean and standard deviation of the return are 0.187 and 0.386,
respectively. Thus, for small λ, there is less of a penalty on the variance
of return; it follows that the optimal portfolio has a larger return standard
deviation and, hence, a larger mean return. Figure 5.6 contains a plot of the
mean and standard deviation of the return on the portfolio with weight vector
wλ as λ varies.
FIGURE 5.6
Mean and standard deviation of the risk-averse portfolio in Example 5.8 as λ
varies. (Plot not reproduced: the mean and the standard deviation are each plotted against λ.)
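The quantities displayed in Figures 5.5 and 5.6 can be tabulated directly. The following sketch assumes that the risk-averse weights have the form w_mv + vbar/λ, which is consistent with the commands w_mv + vbar and w_mv + 2*vbar used for λ = 1 and λ = 0.5; the grid of λ values and the names W, port_mean, and port_sd are choices made here.

```r
mu <- c(0.25, 0.125, 0.3)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
ones <- rep(1, 3)

w_mv <- solve(Sig, ones) / sum(solve(Sig, ones))   # minimum-variance weights
m_mv <- sum(w_mv * mu)                             # mean of the minimum-variance portfolio
var_mv <- drop(t(w_mv) %*% Sig %*% w_mv)           # its variance
vbar <- solve(Sig, mu - m_mv * ones)               # zero-investment component

# One row of W per value of lambda: w_lambda = w_mv + vbar / lambda.
lambda <- seq(0.5, 5, by = 0.5)
W <- t(sapply(lambda, function(l) w_mv + vbar / l))

port_mean <- drop(W %*% mu)                        # mean return for each lambda
port_sd <- sqrt(rowSums((W %*% Sig) * W))          # standard deviation for each lambda

round(data.frame(lambda, W, mean = port_mean, sd = port_sd), 3)
```

The rows for λ = 1 and λ = 4 should agree with the means and standard deviations reported in the example, and the ratio (port_mean - m_mv)/(port_sd^2 - var_mv) recovers λ in every row.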
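$$\begin{aligned}
\mathrm{Cov}(R_{mv}, \bar{v}^T R) &= \mathrm{Cov}(w_{mv}^T R, \bar{v}^T R) \\
&= w_{mv}^T \Sigma \bar{v} \\
&= \frac{1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}\, \mathbf{1}^T \Sigma^{-1} \Sigma \Sigma^{-1} (\mu - \mu_{mv}\mathbf{1}) \\
&= \frac{1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} \left(\mathbf{1}^T \Sigma^{-1} \mu - \mu_{mv}\, \mathbf{1}^T \Sigma^{-1} \mathbf{1}\right) \\
&= 0.
\end{aligned}$$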
This fact is useful in computing the mean and variance of the return
corresponding to wλ .
Corollary 5.7. Let R denote the return vector for a set of assets, let μ denote
the mean vector of R, and let Σ denote the covariance matrix of R. For a given
value of λ > 0, the portfolio with weight vector wλ , as given in Proposition
5.5, has mean return
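$$\mu_\lambda = \mu_{mv} + \frac{1}{\lambda}(\mu - \mu_{mv}\mathbf{1})^T \Sigma^{-1} (\mu - \mu_{mv}\mathbf{1})$$
$$\lambda = \frac{\mu_\lambda - \mu_{mv}}{\sigma_\lambda^2 - \sigma_{mv}^2}.$$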
Therefore, the value of λ may be chosen by specifying the amount by which the
mean return of the portfolio exceeds that of the minimum variance portfolio
as a proportion of the amount by which its return variance exceeds that of
the minimum variance portfolio.
It is not surprising that the risk-averse portfolios are on the efficient fron-
tier. However, the converse is also true—every portfolio on the efficient frontier
is a risk-averse portfolio with parameter λ, for some λ > 0. This result is given
in the following proposition.
Proposition 5.6. The portfolio with weight vector wp is on the efficient fron-
tier if and only if either wp = wλ for some λ > 0 or wp = wmv . Here wλ
denotes the weight vector of the risk-averse portfolio with parameter λ, as
defined in Proposition 5.5.
Proof. First suppose that wp = wλ for some λ > 0. Suppose the portfolio with
weight vector wp is not on the minimum risk frontier; then there is a portfolio
with the same mean return but a smaller return variance. However, such a
portfolio would have a larger value of the risk-aversion criterion (for any
value of λ), which contradicts the fact that the portfolio with weight vector
wp is the risk-averse portfolio with parameter λ. It follows that the portfolio
with weight vector wp is on the minimum risk frontier. By Corollary 5.7,
together with the fact that Σ is positive definite, it follows that
μλ − μmv > 0;
it follows that the portfolio with weight vector wp is on the efficient frontier.
Note that, because the minimum-variance portfolio is on the efficient frontier,
this result also holds if wp = wmv . Therefore, if wp = wλ for some λ > 0 or
wp = wmv , then the portfolio with weight vector wp is on the efficient frontier.
Now suppose the portfolio with weight vector wp is on the efficient fron-
tier, but it is not the minimum-variance portfolio; let μp = E(Rp ) and note
that μp > μmv . According to Corollary 5.7, the risk-averse portfolio with
parameter λ has expected return
$$\mu_\lambda = \mu_{mv} + \frac{1}{\lambda}(\mu - \mu_{mv}\mathbf{1})^T \Sigma^{-1} (\mu - \mu_{mv}\mathbf{1}).$$
Because
(μ − μmv 1)T Σ−1 (μ − μmv 1) > 0,
there exists a λp > 0 such that the expected return on the portfolio with
weight vector wλp is μp . Furthermore, the portfolio with weight vector wλp
is on the efficient frontier. By the uniqueness of portfolios on the efficient
frontier, it follows that wp = wλp . Using the fact that the minimum-variance
portfolio is on the efficient frontier, it follows that if the portfolio with weight
vector wp is on the efficient frontier, then either wp = wλ for some λ > 0 or
wp = wmv , proving the result.
minimum-risk frontier with a given mean return may also be used to compute
the weight vector maximizing the risk-aversion criterion function.
Recall that in solve.QP the objective function is of the form
$$\frac{1}{2} x^T D x - d^T x,$$
which is minimized with respect to x. This is equivalent to maximizing the
objective function
$$d^T x - \frac{1}{2} x^T D x.$$
The constraint on the weight vector w, wT 1 = 1, is easily included using the
argument Amat to solve.QP. Equality constraints on x are given by AT x = b
for a given matrix A and a given vector b.
The following example illustrates how solve.QP can be used to find the
weight vector wλ .
Example 5.9 Consider the assets described in Example 5.6 and analyzed in
Example 5.8. The assets have mean return vector (0.25, 0.125, 0.3) and return
covariance matrix given by (5.14); these are stored in variables mu and Sig,
respectively.
The same basic approach used in Example 5.5 can be used here, except
that now the objective function is of the form
$$w^T \mu - \frac{\lambda}{2} w^T \Sigma w$$
and in the present context the only constraint is that the portfolio weights
sum to 1.
Thus, the arguments of solve.QP that define the objective function are
Dmat=lambda*Sig and dvec=mu, where lambda denotes the value of the
risk-aversion parameter λ. To specify the constraint that the weights sum to
1, we take Amat to be matrix(rep(1,3), 3, 1); in this command, rep(1,3)
is a vector consisting of 1 repeated three times and matrix forms a matrix
from that vector. The remaining arguments are bvec=1, which specifies that
the weights sum to 1, and meq=1, which indicates that the constraint is an
equality constraint.
Consider calculation of the weights of the risk-averse portfolio correspond-
ing to λ = 1. These may be obtained using the R command
> library(quadprog)
> ra<-solve.QP(Dmat=Sig, dvec=mu, Amat=matrix(rep(1,3),3,1),
+ bvec=1, meq=1)
> ra$solution
[1] 0.437 0.106 0.457
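The solve.QP result can be compared with the closed-form weights w_mv + vbar/λ used in Example 5.8. The sketch below wraps both calculations in small helper functions, ra_qp and ra_closed, which are hypothetical names introduced here; it assumes the quadprog package is installed.

```r
library(quadprog)

mu <- c(0.25, 0.125, 0.3)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
ones <- rep(1, 3)

# Numerical solution: maximize mu'w - (lambda/2) w' Sig w subject to 1'w = 1.
ra_qp <- function(lambda) {
  solve.QP(Dmat = lambda * Sig, dvec = mu,
           Amat = matrix(ones, 3, 1), bvec = 1, meq = 1)$solution
}

# Closed form: minimum-variance weights plus a multiple of the vector vbar.
w_mv <- solve(Sig, ones) / sum(solve(Sig, ones))
vbar <- solve(Sig, mu - sum(w_mv * mu) * ones)
ra_closed <- function(lambda) w_mv + vbar / lambda

# The two approaches should agree for any lambda > 0.
round(rbind(qp = ra_qp(1), closed = ra_closed(1)), 3)
round(rbind(qp = ra_qp(4), closed = ra_closed(4)), 3)
```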
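$$w_T = c\,\Sigma^{-1}(\mu - \mu_f\mathbf{1})$$
$$c = \frac{1}{\mathbf{1}^T \Sigma^{-1}(\mu - \mu_f\mathbf{1})}.$$
Note that
$$\mathbf{1}^T \Sigma^{-1}(\mu - \mu_f\mathbf{1}) = (\mu_{mv} - \mu_f)\,\mathbf{1}^T \Sigma^{-1}\mathbf{1} > 0$$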
using the fact that Σ is positive definite, along with the assumption that
μf < μmv .
The role of the tangency portfolio here is the same as in the N = 2 case:
When constructing a portfolio consisting of risky assets plus the risk-free asset,
all investors should use the tangency portfolio as their portfolio of risky assets.
Example 5.10 Consider the assets described in Example 5.6, with mean
return vector stored in the variable mu and covariance matrix of the returns
stored in Sig:
> mu
[1] 0.250 0.125 0.300
> Sig
[,1] [,2] [,3]
[1,] 0.25 0.100 0.240
[2,] 0.10 0.160 0.096
[3,] 0.24 0.096 0.360
Suppose that the risk-free asset has return μf = 0.01. Then the weight
vector of the tangency portfolio is given by
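The original command is not reproduced at this point. A minimal sketch of the calculation, based on the expression for the tangency weights as Σ⁻¹(μ − μf 1) rescaled to sum to 1, is as follows; the intermediate names mu_f and z are introduced here, while w_T is the name used in the commands that follow.

```r
mu <- c(0.25, 0.125, 0.3)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
mu_f <- 0.01                   # return on the risk-free asset

# Solve Sig z = mu - mu_f 1, then rescale so that the weights sum to 1.
z <- solve(Sig, mu - mu_f)
w_T <- z / sum(z)
round(w_T, 3)
```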
> sum(w_T*mu)/(w_T%*%Sig%*%w_T)^.5
[,1]
[1,] 0.532
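$$\frac{w^T(\mu - \mu_f\mathbf{1})}{(w^T \Sigma w)^{1/2}}$$
$$\frac{(cw)^T(\mu - \mu_f\mathbf{1})}{((cw)^T \Sigma (cw))^{1/2}} = \frac{w^T(\mu - \mu_f\mathbf{1})}{(w^T \Sigma w)^{1/2}}. \tag{5.25}$$
$$w_j^T(\mu - \mu_f\mathbf{1}) > 0, \quad j = 1, 2.$$
Using (5.25),
$$\frac{w_1^T(\mu - \mu_f\mathbf{1})}{(w_1^T \Sigma w_1)^{1/2}} \ge \frac{w_2^T(\mu - \mu_f\mathbf{1})}{(w_2^T \Sigma w_2)^{1/2}}$$
if and only if
$$\frac{(cw_1)^T(\mu - \mu_f\mathbf{1})}{((cw_1)^T \Sigma (cw_1))^{1/2}} \ge \frac{(dw_2)^T(\mu - \mu_f\mathbf{1})}{((dw_2)^T \Sigma (dw_2))^{1/2}} \tag{5.26}$$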
for any c > 0 and d > 0. That is, in maximizing the Sharpe ratio, it is not
necessary for the weights to sum to 1; recall that this fact was used in the
proof of Proposition 5.7, when we found the vector in ℝ^N that maximizes the
Sharpe ratio and then rescaled it to sum to 1.
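The scale invariance in (5.25) is easily illustrated. In the following sketch, sharpe is a hypothetical helper function written for this purpose, and the weight vector w and the constants multiplying it are arbitrary; the data are those of Example 5.6 with risk-free rate 0.01.

```r
mu <- c(0.25, 0.125, 0.3)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
mu_f <- 0.01

# Ratio of w'(mu - mu_f 1) to (w' Sig w)^(1/2); w need not sum to 1.
sharpe <- function(w, mu, Sig, mu_f) {
  sum(w * (mu - mu_f)) / sqrt(drop(t(w) %*% Sig %*% w))
}

w <- c(0.2, 0.3, 0.5)            # arbitrary weight vector
sharpe(w, mu, Sig, mu_f)
sharpe(3.7 * w, mu, Sig, mu_f)   # same value: c cancels for any c > 0
sharpe(0.1 * w, mu, Sig, mu_f)
```

The invariance holds only for c > 0; multiplying w by a negative constant changes the sign of the ratio.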
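Let
$$\bar{c} = \frac{1}{w_1^T(\mu - \mu_f\mathbf{1})}$$
and
$$\bar{d} = \frac{1}{w_2^T(\mu - \mu_f\mathbf{1})}.$$
Then taking $c = \bar{c}$ and $d = \bar{d}$ in (5.26), it follows that
$$\frac{w_1^T(\mu - \mu_f\mathbf{1})}{(w_1^T \Sigma w_1)^{1/2}} \ge \frac{w_2^T(\mu - \mu_f\mathbf{1})}{(w_2^T \Sigma w_2)^{1/2}}$$
if and only if
$$\frac{(\bar{c}w_1)^T(\mu - \mu_f\mathbf{1})}{((\bar{c}w_1)^T \Sigma (\bar{c}w_1))^{1/2}} \ge \frac{(\bar{d}w_2)^T(\mu - \mu_f\mathbf{1})}{((\bar{d}w_2)^T \Sigma (\bar{d}w_2))^{1/2}}. \tag{5.27}$$
$$(\bar{c}w_1)^T(\mu - \mu_f\mathbf{1}) = 1$$
and
$$(\bar{d}w_2)^T(\mu - \mu_f\mathbf{1}) = 1.$$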
If u1 and u2 satisfy
$$u_j^T(\mu - \mu_f\mathbf{1}) = 1, \quad j = 1, 2$$
then
$$\frac{u_1^T(\mu - \mu_f\mathbf{1})}{(u_1^T \Sigma u_1)^{1/2}} \ge \frac{u_2^T(\mu - \mu_f\mathbf{1})}{(u_2^T \Sigma u_2)^{1/2}}$$
if and only if
$$u_1^T \Sigma u_1 \le u_2^T \Sigma u_2.$$
That is, we can describe the weight vector of the tangency portfolio as
being proportional to the vector u ∈ ℝ^N that minimizes
$$u^T \Sigma u \tag{5.28}$$
$$d^T x - \frac{1}{2} x^T D x \tag{5.30}$$
with respect to the vector x, which is subject to equality and inequality
constraints on x. The constraints are of the form
AT x(=, ≥)b
for a matrix A and vector b, where (=, ≥) denotes either equality or inequality
of the form ≥ on a component-wise basis.
Inequality constraints may be included in solve.QP by specifying appro-
priately the arguments Amat, corresponding to the matrix A described earlier,
and bvec, corresponding to the vector b. The argument meq of solve.QP indi-
cates the number of equality constraints; these must correspond to the first
columns of Amat.
The following example illustrates in detail the maximization of the
risk-aversion criterion function subject to the constraint that the weight on
asset j, wj , is nonnegative for each j. That is, the portfolio cannot contain a
short position on any asset. We will write this constraint as w ≥ 0.
Example 5.12 Consider the assets described in Example 5.6, with mean
return vector (0.25, 0.125, 0.3) and covariance matrix given by
$$\Sigma = \begin{pmatrix} 0.25 & 0.1 & 0.24 \\ 0.1 & 0.16 & 0.096 \\ 0.24 & 0.096 & 0.36 \end{pmatrix}.$$
Consider maximization of the risk-aversion criterion function (5.29) as dis-
cussed in Example 5.8.
For λ = 0.5, the weight vector wλ is given by
> w_mv + 2*vbar
[1] 0.632 -0.501 0.869,
which includes a substantial short position on asset 2. Hence, we might con-
sider maximizing (5.29) when λ = 0.5, subject to the constraint that all asset
weights are nonnegative.
The arguments to solve.QP are Dmat and dvec, which specify the objective
function as described in (5.30), and Amat, bvec, and meq, which specify the
equality and inequality constraints.
Thus, to maximize the risk-aversion criterion function, dvec is taken to
be μ, the vector of asset means, and Dmat is taken to be λΣ, where Σ is the
covariance matrix of the asset returns. The constraints in our problem are
1T w = 1, w1 ≥ 0, w2 ≥ 0, w3 ≥ 0;
thus,
$$A^T = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
and
$$b = (1, 0, 0, 0)^T.$$
The argument meq indicates the number of equality constraints; thus, in
this example, meq = 1, indicating that the constraints are given by
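$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} x \;\begin{matrix} = \\ \ge \\ \ge \\ \ge \end{matrix}\; \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$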
> library(quadprog)
> A<-cbind(c(1,1,1), diag(3))
> t(A)
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 0 0
[3,] 0 1 0
[4,] 0 0 1
> b<-c(1, 0, 0, 0)
> qpsol<-solve.QP(Dmat=(.5)*Sig, dvec=mu, Amat=A, bvec=b, meq=1)
Note that diag(3) returns a 3 × 3 identity matrix and cbind combines two
vectors or matrices by the columns; when the vector c(1,1,1) is used in
this context, it is interpreted as a column vector. The weight vector that
maximizes the objective function is the component $solution of the result of
the function solve.QP; therefore, for this problem, the weight vector is
> qpsol$solution
[1] 0.154 0.000 0.846
> sum(qpsol$solution*mu)
[1] 0.292
> ((qpsol$solution)%*%Sig%*%qpsol$solution)^.5
[,1]
[1,] 0.571
These may be compared to the mean and standard deviation of the return
of the optimizing portfolio that is not subject to the constraint
Holding Constraints
Another commonly used type of constraint is the holding constraint, of
the form Lj ≤ wj ≤ Uj , j = 1, . . . , N , where Lj and Uj are lower and upper
bounds, respectively, on the proportion of the investment in asset j. These
may also be handled using the function solve.QP.
Example 5.13 Consider the assets analyzed in Example 5.12. The weight
vector that maximizes the risk-aversion criterion function (5.29) for λ = 1 is
given by
> w_mv + vbar
[1] 0.437 0.106 0.457
Suppose we would like our portfolio to allocate between 25% and 75% of
our investment to each asset; that is, we would like to enforce the constraints
The corresponding portfolio has a mean return 0.241 and return standard
deviation of 0.455. These can be compared to the mean return and return
standard deviation of the unconstrained optimal portfolio, given by 0.260 and
0.489, respectively.
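One way to set up this calculation is sketched below; the original commands are not reproduced here, so the construction of Amat and bvec is an assumption based on the description of solve.QP given earlier. Because solve.QP accepts inequality constraints only in the form AT x ≥ b, each upper bound wj ≤ 0.75 is rewritten as −wj ≥ −0.75. The names A.hold, b.hold, and w_hold are introduced here.

```r
library(quadprog)

mu <- c(0.25, 0.125, 0.3)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)

# Columns of Amat: sum-to-one (equality), lower bounds, then negated upper bounds.
A.hold <- cbind(rep(1, 3), diag(3), -diag(3))
b.hold <- c(1, rep(0.25, 3), rep(-0.75, 3))

# Risk-aversion parameter lambda = 1, so Dmat is Sig itself.
w_hold <- solve.QP(Dmat = Sig, dvec = mu, Amat = A.hold,
                   bvec = b.hold, meq = 1)$solution
round(w_hold, 3)

sum(w_hold * mu)                               # mean return
sqrt(drop(t(w_hold) %*% Sig %*% w_hold))       # return standard deviation
```

The resulting mean and standard deviation should agree with the values 0.241 and 0.455 reported above.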
w1 + w2 + w3 + w4 + w5 ≥ 0.50.
w1 − w2 ≥ 0.
By properly choosing the argument Amat, solve.QP can solve many con-
strained optimization problems of this type.
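$$\frac{w_1^T(\mu - \mu_f\mathbf{1})}{(w_1^T \Sigma w_1)^{1/2}} \ge \frac{w_2^T(\mu - \mu_f\mathbf{1})}{(w_2^T \Sigma w_2)^{1/2}}$$
if and only if
$$\frac{(cw_1)^T(\mu - \mu_f\mathbf{1})}{((cw_1)^T \Sigma (cw_1))^{1/2}} \ge \frac{(dw_2)^T(\mu - \mu_f\mathbf{1})}{((dw_2)^T \Sigma (dw_2))^{1/2}} \tag{5.31}$$
for any c > 0 and d > 0.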
Therefore, the same approach can be used to find the portfolio that max-
imizes the Sharpe ratio under constraints, provided that a weight vector w
satisfies the constraints if and only if cw satisfies the constraints for any c > 0.
Note that this condition is satisfied for nonnegativity constraints of the form
w ≥ 0; however, it is not satisfied for other types of constraints, such as the
holding constraints considered in Example 5.13.
The details are described in the following example.
Example 5.14 Consider the set of four assets used in Example 5.1 and let mu
and Sigma denote the R variables containing the mean vector and covariance
matrix, respectively, of the returns,
> mu
[1] 0.10 0.20 0.05 0.10
> Sigma
[,1] [,2] [,3] [,4]
[1,] 0.05 0.01 0.02 0.00
[2,] 0.01 0.10 0.05 0.02
[3,] 0.02 0.05 0.20 0.10
[4,] 0.00 0.02 0.10 0.20
and take the risk-free rate to be 0.01.
Then the weight vector of the tangency portfolio is given by
> solve(Sigma, mu-0.01)/sum(solve(Sigma, mu-0.01))
[1] 0.482 0.559 -0.223 0.182
Alternatively, we can calculate this weight vector using solve.QP, as described
in Example 5.11.
> wT<-solve.QP(Dmat=2*Sigma, dvec=rep(0,4), Amat=cbind(mu-0.01),
+ bvec=1, meq=1)$solution
> wT/sum(wT)
[1] 0.482 0.559 -0.223 0.182
We can include nonnegativity constraints in the calculation by taking the
argument Amat to be the matrix
> A.shrp<-cbind(mu-0.01, diag(4))
> t(A.shrp)
[,1] [,2] [,3] [,4]
[1,] 0.09 0.19 0.04 0.09
[2,] 1.00 0.00 0.00 0.00
[3,] 0.00 1.00 0.00 0.00
[4,] 0.00 0.00 1.00 0.00
[5,] 0.00 0.00 0.00 1.00
taking bvec to be the vector
> b.shrp<-c(1, rep(0,4))
> b.shrp
[1] 1 0 0 0 0
and taking meq=1. Thus, the portfolio with nonnegative weights that
maximizes the Sharpe ratio is given by
> wT.nn<-solve.QP(Dmat=2*Sigma, dvec=rep(0,4), Amat=A.shrp,
+ bvec=b.shrp, meq=1)$solution
> wT.nn/sum(wT.nn)
[1] 0.425 0.494 0.000 0.081
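It is instructive to compare the Sharpe ratios of the two portfolios. The sketch below uses the rounded weight vectors printed above, so the results are approximate; sharpe is a hypothetical helper function introduced here.

```r
mu <- c(0.10, 0.20, 0.05, 0.10)
Sigma <- matrix(c(0.05, 0.01, 0.02, 0.00,
                  0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10,
                  0.00, 0.02, 0.10, 0.20), nrow = 4, byrow = TRUE)
mu_f <- 0.01

# Sharpe ratio of the portfolio with weight vector w.
sharpe <- function(w) {
  sum(w * (mu - mu_f)) / sqrt(drop(t(w) %*% Sigma %*% w))
}

w_tan <- c(0.482, 0.559, -0.223, 0.182)   # tangency weights, as printed above
w_nn  <- c(0.425, 0.494,  0.000, 0.081)   # nonnegative weights, as printed above

c(unconstrained = sharpe(w_tan), nonnegative = sharpe(w_nn))
```

Since the nonnegativity constraint restricts the set of allowable portfolios, the constrained Sharpe ratio cannot exceed the unconstrained one.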
5.10 Exercises
1. Consider a three-dimensional return vector R with mean vector
given by (0.04, 0.03, 0.05) and covariance matrix given by
$$\begin{pmatrix} 0.05 & 0.05 & 0.025 \\ 0.05 & 0.10 & 0.08 \\ 0.025 & 0.08 & 0.075 \end{pmatrix}.$$
Let Rp1 denote the return on the portfolio with weight vector
(1/3, 1/3, 1/3) and let Rp2 denote the return on the portfolio with
weight vector (0.4, 0.4, 0.2).
a. Find the mean and standard deviation of Rp1 ; see Example 5.1.
b. Find the mean and standard deviation of Rp2 .
Cov(Rmv , R0 ) = 0.
Does the converse hold? That is, suppose that a portfolio return
Rp satisfies
Cov(Rp , R0 ) = 0
for the return R0 on any zero-investment portfolio. Does it follow
that Rp is the return on the minimum-variance portfolio? Why or
why not?
7. Let Rmv denote the return on the minimum-variance portfolio and
let Rp denote the return on another portfolio. Find an expression
for
$$\frac{\mathrm{Var}(R_p)}{\mathrm{Var}(R_{mv})}$$
in terms of the correlation of Rp and Rmv .
8. Consider a market consisting of N assets and let R denote the
vector of asset returns; let μ denote the vector of mean returns and
let Σ denote the covariance matrix of R.
Let ŵ denote the weight vector of a portfolio on the efficient
frontier and let w̃ denote the weight vector of another portfolio that
is not on the efficient frontier. Suppose that the two portfolios have
the same mean return and let γ = Var(ŵT R)/Var(w̃T R) denote the
ratio of the variances of the portfolio returns; note that, since the
portfolio with weight vector ŵ is on the efficient frontier, 0 < γ < 1.
Find the correlation of returns on the two portfolios as a function
of γ.
9. Let λ1 , λ2 be positive real numbers and let Rj denote the return
on the risk-averse portfolio with the risk-aversion parameter λj ,
j = 1, 2.
Consider the portfolio with a return of the form
Rp = wR1 + (1 − w)R2
11. Consider a set of three assets with the mean return vector and
return covariance matrix as given in Exercise 1.
Using the approach described in Example 5.8, find wmv , the
weight vector of the minimum-variance portfolio, and v̄, the weight
vector of the zero-investment portfolio given in the statement of
Proposition 5.5. Use those results to give the weight vectors of the
risk-averse portfolio with parameters λ = 1 and λ = 5.
12. Consider a set of three assets with the mean return vector and
return covariance matrix as given in Exercise 1.
Find the mean and variance of the return on the risk-averse
portfolio based on a risk-aversion parameter λ.
13. Consider the set of five assets with the mean return vector and
return covariance matrix as given in Exercise 5.
Using the R function, solve.QP, as in Example 5.9, find the
weight vector of the risk-averse portfolio based on the risk-aversion
parameter λ = 1. Find the mean return and return standard
deviation of the portfolio.
14. Consider a vector of asset returns with mean vector μ and covari-
ance matrix Σ. Find an expression for the Sharpe ratio of the
tangency portfolio in terms of μ, Σ, and μf .
15. Consider a market consisting of N assets and let R denote the
vector of asset returns. Let μ denote the vector of mean returns
and let Σ denote the covariance matrix of R. Let μf denote the
return on the risk-free asset.
Find conditions on μ under which the minimum-variance port-
folio is the same as the tangency portfolio.
16. Use the general expression for the weight vector of the tangency
portfolio given in Section 5.7 to derive the expression for the
weight vector of the tangency portfolio for the N = 2 case given
in Section 4.6.
17. Let wT denote the weight vector of the tangency portfolio and let
wλ denote the weight vector of the risk-averse portfolio based on
the risk-aversion parameter λ.
Show that there exists λT > 0 such that wT = wλT and give an
expression for λT .
18. Consider a set of five assets with the mean return vector and return
covariance matrix as given in Exercise 5. Assume that the risk-free
rate of return is μf = 0.01.
Find wT , the weight vector of the tangency portfolio; see
Example 5.10.
19. Consider a set of six assets with return vector R. Suppose that the
mean vector of R − Rf 1 is given by (0.04, 0.08, 0.02, 0.10, 0.03, 0.06)
and that the covariance matrix of R is given by
$$\begin{pmatrix}
0.20 & 0.02 & 0.03 & 0.04 & 0.05 & 0.06 \\
0.02 & 0.50 & 0.06 & 0.08 & 0.10 & 0.12 \\
0.03 & 0.06 & 0.20 & 0.12 & 0.15 & 0.18 \\
0.04 & 0.08 & 0.12 & 0.80 & 0.20 & 0.24 \\
0.05 & 0.10 & 0.15 & 0.20 & 1.20 & 0.30 \\
0.06 & 0.12 & 0.18 & 0.24 & 0.30 & 0.80
\end{pmatrix}.$$
Find wT , the weight vector of the tangency portfolio; see Exam-
ple 5.10.
20. Consider the set of five assets with the mean return vector and
return covariance matrix as given in Exercise 5. Assume that the
risk-free rate of return is μf = 0.01.
Find the Sharpe ratio of the tangency portfolio and compare
it to the Sharpe ratios of the equally-weighted portfolio and the
minimum-variance portfolio.
21. Let wλ denote the weight vector of the risk-averse portfolio with
parameter λ, as given by Proposition 5.5. Write wλ in terms of
wmv , the weight vector of the minimum-variance portfolio, and
wT , the weight vector of the tangency portfolio.
22. For the three assets with the mean return vector and return covari-
ance matrix given in Exercise 1, determine the weight vector
that maximizes the risk-aversion criterion function with parameter
λ = 5, subject to the constraint that all weights are nonnegative;
see Example 5.12.
Find the mean and variance of the return on the resulting port-
folio and compare these to the mean and variance of the return on
the risk-averse portfolio based on λ = 5.
23. Consider the set of five assets with the mean return vector and
return covariance matrix as given in Exercise 5.
Suppose we want to find the portfolio weights that maximize
the risk-aversion criterion function with parameter λ = 1 subject
to the constraints that the portfolio weights are all nonnegative
and that the sum of the weights given to assets 1 and 2 is equal to
the sum of the weights given to assets 4 and 5. That is, in terms
of the weight vector w = (w1 , w2 , w3 , w4 , w5 )T , we want to enforce
the constraints that wj ≥ 0 for j = 1, 2, . . . , 5 and that
w1 + w2 = w4 + w5 .
Find the optimal weight vector and calculate the mean return
and return standard deviation of the corresponding portfolio.
6.1 Introduction
The portfolio theory developed in the previous chapters is based on prop-
erties of the distribution of asset returns, specifically their means, standard
deviations, and correlations. Of course, in practice, these parameters are all
unknown and must be estimated.
The simplest approach to estimating such parameters is to use the cor-
responding sample versions based on historical data; for instance, we can
estimate a mean return by the sample mean of a series of observed returns.
Such methods often work fairly well, particularly when analyzing just a few
assets. However, in many cases, better estimators are available.
In this chapter, several methods of estimating the parameters needed
for portfolio analysis are presented. Other methods, which build on those
discussed here, are covered in the following chapters.
$$\bar{R}_j = \frac{1}{T}\sum_{t=1}^{T} R_{j,t}.$$
$$\frac{1}{T}\sum_{t=1}^{T} (R_{j,t} - R_{f,t}) = \bar{R}_j - \bar{R}_f$$
where
$$\bar{R}_f = \frac{1}{T}\sum_{t=1}^{T} R_{f,t}.$$
Now consider estimation of the return standard deviations or the return
variances. Note that here we have a choice—we have defined σ2j to be the vari-
ance of Rj,t ; however, because the return on the risk-free asset has zero
variance, σ2j is also the variance of Rj,t − Rf,t . Therefore, to estimate σ2j ,
we may use either the sample variance of the returns
Rj,1 , Rj,2 , . . . , Rj,T
$$S_j^2 = \frac{1}{T-1}\sum_{t=1}^{T} \{R_{j,t} - R_{f,t} - (\bar{R}_j - \bar{R}_f)\}^2;$$
http://www.federalreserve.gov/releases/h15/data.htm
$$S_{jk} = \frac{1}{T-1}\sum_{t=1}^{T} \{R_{j,t} - R_{f,t} - (\bar{R}_j - \bar{R}_f)\}\{R_{k,t} - R_{f,t} - (\bar{R}_k - \bar{R}_f)\}.$$
The correlation of the returns on asset j and k can be estimated by the sample
correlation, sometimes called the sample correlation coefficient,
$$\hat{\rho}_{jk} \equiv \frac{S_{jk}}{S_j S_k}.$$
Note that the sample correlation has many of the same properties as the
correlation between two random variables; for example, it takes values in the
interval [−1, 1] and it is not affected by linear transformations of the data.
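The definitions of the sample covariance and sample correlation can be checked against the R functions cov and cor. The sketch below uses simulated data, not actual stock returns; the vectors x and y and all parameter values are hypothetical.

```r
set.seed(1)
T_obs <- 60                                   # number of observations (hypothetical)
x <- rnorm(T_obs, mean = 0.01, sd = 0.05)     # simulated excess returns, asset j
y <- 0.5 * x + rnorm(T_obs, 0.005, 0.04)      # simulated excess returns, asset k

# Sample covariance with divisor T - 1, as in the definition of S_jk.
S_xy <- sum((x - mean(x)) * (y - mean(y))) / (T_obs - 1)

# Sample correlation: covariance divided by the product of standard deviations.
rho_hat <- S_xy / (sd(x) * sd(y))

c(S_xy, cov(x, y))        # the two values agree
c(rho_hat, cor(x, y))     # the two values agree

# An increasing linear transformation of the data leaves the correlation unchanged.
cor(2 * x + 3, y)
```

Note that a linear transformation with negative slope reverses the sign of the sample correlation while leaving its magnitude unchanged.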
Example 6.2 Let sbux.ex denote five years of excess monthly returns on
Starbucks stock (symbol SBUX), calculated using the same procedure used
for wmt.ex in Example 6.1. The sample covariance between the excess returns
on Wal-Mart and Starbucks stock may be calculated using the cov function:
Example 6.3 Suppose that the returns on an asset have mean 0.01 and
standard deviation 0.05. Then the sample mean return R̄j has expected value
0.01 and standard deviation 0.05/√T, where T is the number of observations.
For example, for T = 60, R̄j is approximately normally distributed with mean
0.01 and standard deviation
$$\frac{0.05}{\sqrt{60}} = 0.00645.$$
$$0.01 \pm \frac{2}{3}(0.00645) = 0.00570 \text{ and } 0.0143.$$
That is, based on a sample of size 60, there is about a 50% chance that the
sample mean return will be within 0.0043 of the true mean return.
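The numbers in this example can be reproduced with a few lines of R. This sketch relies on the normal approximation to the distribution of the sample mean stated above; the variable names are introduced here.

```r
sigma <- 0.05      # standard deviation of the returns
T_obs <- 60        # number of observations

se <- sigma / sqrt(T_obs)      # standard deviation of the sample mean
half_width <- (2 / 3) * se     # two-thirds of a standard deviation

# Probability that a normal variable lies within 2/3 of a standard deviation of its mean.
prob <- 2 * pnorm(2 / 3) - 1

c(se = se, half_width = half_width, prob = prob)
```

The computed probability is slightly below one half, which is why the statement is phrased as "about a 50% chance".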
$$\bar{R} = \frac{1}{T}\sum_{t=1}^{T} R_t.$$
The vector of mean excess returns, μ − μf 1, may be estimated by the
sample mean vector of the excess returns
$$\bar{R}_E = \bar{R} - \bar{R}_f\mathbf{1} = \begin{pmatrix} \bar{R}_1 - \bar{R}_f \\ \bar{R}_2 - \bar{R}_f \\ \vdots \\ \bar{R}_N - \bar{R}_f \end{pmatrix}.$$
To estimate Σ, we can use the sample covariance matrix S, calculated
from either the standard returns or the excess returns; here we use the excess
returns. The sample covariance matrix is an N × N matrix with the (j, k)th
element given by Sj² if k = j and ρ̂jk Sj Sk if k ≠ j. The same information,
in a form that is easier to interpret, is provided by the asset excess-return
sample standard deviations, S1 , S2 , . . . , SN together with the corresponding
sample correlation matrix, Ĉ, the N × N matrix with ones on the diagonal,
and the (j, k)th element given by ρ̂jk for j ≠ k.
Data Matrix
When describing the sample mean vector and the sample covariance matrix, it
is often convenient to express them in terms of a data matrix. The data matrix
for the excess returns, which we denote here by X, is the T × N matrix with
row t given by the vector of excess returns at time t, (Rt − Rf,t 1)T :
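$$X = \begin{pmatrix}
R_{1,1} - R_{f,1} & R_{2,1} - R_{f,1} & \cdots & R_{N,1} - R_{f,1} \\
R_{1,2} - R_{f,2} & R_{2,2} - R_{f,2} & \cdots & R_{N,2} - R_{f,2} \\
\vdots & \vdots & \cdots & \vdots \\
R_{1,T} - R_{f,T} & R_{2,T} - R_{f,T} & \cdots & R_{N,T} - R_{f,T}
\end{pmatrix}.$$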
Thus, the jth column of X is the time series of excess returns on asset j and
the row t of X is the vector of N asset excess returns at time t.
The sample mean vector and the sample covariance matrix have simple
expressions in terms of X. The (column) vector of sample mean excess returns
may be written
$$\bar{R}_E = \bar{R} - \bar{R}_f\mathbf{1} = \frac{1}{T} X^T \mathbf{1}_T. \tag{6.1}$$
T
The sample covariance matrix has a particularly simple expression in terms
of X:
$$S = \frac{1}{T-1}\left(X - \mathbf{1}_T \bar{R}_E^T\right)^T \left(X - \mathbf{1}_T \bar{R}_E^T\right). \tag{6.2}$$
Note that
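$$\mathbf{1}_T \bar{R}_E^T = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
\begin{pmatrix} \bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \end{pmatrix}
= \begin{pmatrix}
\bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \\
\bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \\
\vdots & \vdots & \cdots & \vdots \\
\bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f
\end{pmatrix}$$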
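Expressions (6.1) and (6.2) translate directly into matrix operations in R. The following sketch applies them to a simulated data matrix, with hypothetical dimensions T = 60 and N = 3, and compares the results with the functions colMeans and cov.

```r
set.seed(2)
T_obs <- 60
N <- 3

# Simulated T x N data matrix of excess returns (hypothetical values).
X <- matrix(rnorm(T_obs * N, mean = 0.01, sd = 0.05), T_obs, N)
one_T <- rep(1, T_obs)                   # vector of T ones

# (6.1): sample mean vector of the excess returns.
Rbar_E <- drop(t(X) %*% one_T) / T_obs

# (6.2): center each column of X, then form the cross-product.
Xc <- X - one_T %*% t(Rbar_E)
S <- t(Xc) %*% Xc / (T_obs - 1)

all.equal(Rbar_E, colMeans(X))
all.equal(S, cov(X))
```

In practice, colMeans and cov are more convenient; the matrix expressions are useful mainly for deriving properties of the estimators.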
Example 6.6 Consider the returns on the stocks of eight large companies,
Apple (symbol AAPL), Baxter International (BAX), Coca-Cola (KO), CVS
Health Corporation (CVS), Exxon Mobil (XOM), IBM (IBM), Johnson &
Johnson (JNJ), and Walt Disney (DIS). These companies were chosen to
represent large companies from a variety of industries.
For each stock, five years of monthly excess returns were calculated for the
period ending December 31, 2014. In R, each vector of 60 excess returns was
stored as a variable with the name of the stock symbol (e.g., aapl for Apple).
To calculate the parameters of the distribution of the return vector Rt ,
it is convenient to have all of the data stored in a single matrix, with each
column corresponding to a particular stock; this can be done using the cbind
command:
Therefore, the excess returns on Apple stock, for example, have sample mean
0.0254 and sample standard deviation 0.0739.
Of course, one or both of the results of the aforementioned apply function
may be assigned to a variable. For example,
To calculate the sample correlation matrix of the data in the matrix big8,
we use the cor function with the data matrix as the argument.
> cor(big8)
AAPL BAX KO CVS XOM IBM JNJ DIS
AAPL 1.000 0.193 0.260 0.329 0.303 0.319 0.145 0.346
BAX 0.193 1.000 0.310 0.330 0.327 0.381 0.473 0.196
KO 0.260 0.310 1.000 0.310 0.338 0.197 0.493 0.348
CVS 0.329 0.330 0.310 1.000 0.442 0.244 0.421 0.537
XOM 0.303 0.327 0.338 0.442 1.000 0.520 0.408 0.650
IBM 0.319 0.381 0.197 0.244 0.520 1.000 0.206 0.348
JNJ 0.145 0.473 0.493 0.421 0.408 0.206 1.000 0.323
DIS 0.346 0.196 0.348 0.537 0.650 0.348 0.323 1.000
Thus, the excess returns on Apple and Disney stocks have correlation 0.346, for
example. Note that the sample correlation matrix, like all correlation matrices,
is symmetric.
The sample covariance matrix may be calculated using the cov function:
> Smat<-cov(big8)
> Smat
AAPL BAX KO CVS XOM IBM
AAPL 0.005460 0.000794 0.000790 0.001405 0.001028 0.001080
BAX 0.000794 0.003088 0.000709 0.001061 0.000835 0.000969
KO 0.000790 0.000709 0.001694 0.000737 0.000638 0.000371
CVS 0.001405 0.001061 0.000737 0.003339 0.001174 0.000646
XOM 0.001028 0.000835 0.000638 0.001174 0.002109 0.001094
IBM 0.001080 0.000969 0.000371 0.000646 0.001094 0.002099
JNJ 0.000413 0.001016 0.000783 0.000939 0.000723 0.000365
DIS 0.001479 0.000631 0.000830 0.001797 0.001729 0.000923
JNJ DIS
AAPL 0.000413 0.001479
BAX 0.001016 0.000631
KO 0.000783 0.000830
CVS 0.000939 0.001797
XOM 0.000723 0.001729
IBM 0.000365 0.000923
JNJ 0.001491 0.000722
DIS 0.000722 0.003352
Thus, a second way to compute the standard deviations of these eight stocks
is to use the square root of the diagonal elements of Smat. The diagonal
of a square matrix may be extracted using the diag command. Hence, the
estimated standard deviations are given by
> diag(Smat)^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0739 0.0556 0.0412 0.0578 0.0459 0.0458 0.0386 0.0579
matching the results obtained previously.
where tr(A) denotes the trace of a matrix A. Recall that the trace of a
matrix is the sum of its diagonal elements; it is also equal to the sum of its
eigenvalues.
for a constant γ, 0 < γ < 1, known as the decay parameter. Thus, the estimator
is a weighted average of the returns Rj,1 , Rj,2 , . . . , Rj,T , with the weights
proportional to γT −t ; the value of γ used controls how quickly the weights
decrease as t decreases.
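Note that, under the assumption that $R_{j,1}, R_{j,2}, \ldots, R_{j,T}$ each has
mean $\mu_j$,
\[
E(\hat{\mu}_{wj}) = \frac{\sum_{t=1}^{T} \gamma^{T-t} E(R_{j,t})}{\sum_{t=1}^{T} \gamma^{T-t}}
= \frac{\sum_{t=1}^{T} \gamma^{T-t} \mu_j}{\sum_{t=1}^{T} \gamma^{T-t}} = \mu_j
\]
so that $\hat{\mu}_{wj}$ is an unbiased estimator of $\mu_j$.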
If $E(R_{j,t})$ changes with $t$, then the weighted estimator will often have
smaller bias than does the sample mean return. For instance, suppose that
$E(R_{j,t}) = a + b(T - t)$, $t = 1, 2, \ldots, T$, so that $R_{j,1}$ has an expected value
$a + b(T - 1)$ and $R_{j,T}$ has an expected value $a$. Then, it is straightforward
to show that $\bar{R}_j$ has expected value
\[
a + b\,\frac{T-1}{2}
\]
and, using properties of geometric series, that $\hat{\mu}_{wj}$ has an expected value
\[
a + b\left(\frac{\gamma}{1-\gamma} - \frac{T\gamma^{T}}{1-\gamma^{T}}\right).
\]
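The two expected values can be verified numerically. The following is a minimal sketch; the values of a, b, the decay parameter, and the series length are hypothetical choices made only for illustration, and the comparison of biases treats the most recent expected return a as the target, which is an interpretation assumed here.

```r
# Hypothetical values chosen only for illustration
a <- 0.01; b <- 0.0005; gam <- 0.9; T_len <- 60
t_idx <- 1:T_len

w <- gam^(T_len - t_idx) / sum(gam^(T_len - t_idx))  # normalized weights
ER <- a + b * (T_len - t_idx)                        # E(R_{j,t}) under the linear trend

E_mean <- mean(ER)        # expected value of the sample mean, computed directly
E_wtd <- sum(w * ER)      # expected value of the weighted estimator, computed directly

# Closed-form expressions from the text
E_mean_formula <- a + b * (T_len - 1) / 2
E_wtd_formula <- a + b * (gam / (1 - gam) - T_len * gam^T_len / (1 - gam^T_len))
```

With these values the weighted estimator's expected value lies much closer to a than does that of the sample mean, since the weights concentrate on the most recent periods.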
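\[
\sum_{t=1}^{T} \gamma^{T-t} = \sum_{t=0}^{T-1} \gamma^{t} \doteq \frac{1}{1-\gamma}; \tag{6.3}
\]
it follows that
\[
\frac{\sum_{t=1}^{T} (\gamma^{T-t})^2}{\left(\sum_{t=1}^{T} \gamma^{T-t}\right)^2}
= \frac{\sum_{t=1}^{T} (\gamma^2)^{T-t}}{\left(\sum_{t=1}^{T} \gamma^{T-t}\right)^2}
\doteq \frac{1/(1-\gamma^2)}{\left(1/(1-\gamma)\right)^2}
= \frac{1-\gamma}{1+\gamma};
\]
and, hence,
\[
\mathrm{Var}(\hat{\mu}_{wj}) \doteq \frac{1-\gamma}{1+\gamma}\,\sigma_j^2 .
\]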
The value of the weighted estimator changes with the value of the decay
parameter γ and, as γ approaches one, the weighted estimator approaches the
sample mean.
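The quality of the variance approximation, and the limiting behavior as γ approaches one, can be examined numerically. The following is a minimal sketch assuming uncorrelated returns with a common variance, so that the variance of a weighted average is the sum of the squared weights times that variance; the values 0.9, 0.9999, and 60 are illustrative.

```r
gam <- 0.9; T_len <- 60
w <- gam^((T_len - 1):0) / sum(gam^((T_len - 1):0))  # normalized weights

var_factor_exact <- sum(w^2)                 # Var(weighted mean) / sigma_j^2
var_factor_approx <- (1 - gam) / (1 + gam)   # approximation from the text
var_factor_mean <- 1 / T_len                 # corresponding factor for the sample mean

# With a decay parameter very close to 1, the weights are nearly equal to 1/T
w_near1 <- 0.9999^((T_len - 1):0) / sum(0.9999^((T_len - 1):0))
max_dev_from_equal <- max(abs(w_near1 - 1 / T_len))
```

For these values the exact and approximate factors differ only slightly, and both exceed 1/T: the weighted estimator gives up some precision relative to the sample mean in exchange for emphasizing recent observations.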
The EWMA approach can also be applied to variances. For a given value
of γ, the weighted estimator of the return variance is given by
\[
\hat{\sigma}^2_{wj} = \frac{\sum_{t=1}^{T} \gamma^{T-t} (R_{j,t} - \hat{\mu}_{wj})^2}{\sum_{t=1}^{T} \gamma^{T-t}} \tag{6.4}
\]
> wgt<-(0.9^(59:0))/sum(0.9^(59:0))
This weight vector can be used to calculate the weighted estimate of the
mean return on Wal-Mart stock using the command
and the weighted estimate of the return variance for Wal-Mart stock is given by
Example 6.9 Consider Example 6.6, which considered the returns on the
stocks of eight large companies. Recall that the data for this example are
stored in the variable big8.
The calculations needed to estimate the mean vector and covariance matrix
of the excess returns may all be performed using the cov.wt function, which
calculates weighted estimates of the mean vector and covariance matrix based
on a data matrix.
Suppose we take the decay parameter to be γ = 0.97; the corresponding
weight vector may be calculated using the command
> wgt<-0.97^(59:0)
Note that, for use in the function cov.wt, the weight vector does not need to
be standardized to sum to 1.
To calculate the weighted estimates of the mean return vector and
covariance matrix, we use the command
> big8.wt<-cov.wt(big8, wt=wgt, method="ML")
The default value for the method argument is unbiased, which applies a
multiplicative adjustment to the result to yield an unbiased estimator of the
true covariance matrix, analogous to dividing by T − 1 instead of by T when
calculating the unweighted sample covariance matrix. Specifying method="ML"
returns an estimate that corresponds to (6.4) and does not include such an
adjustment.
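The correspondence between cov.wt with method="ML" and (6.4) can be checked directly. The following is a minimal sketch on simulated data; the matrix x and the seed are illustrative assumptions.

```r
set.seed(2)
x <- matrix(rnorm(60 * 2, mean = 0.01, sd = 0.05), 60, 2)  # hypothetical returns
wgt <- 0.97^(59:0)                  # unstandardized weights, as in the text

fit <- cov.wt(x, wt = wgt, method = "ML")

# Direct calculation of the weighted mean and of (6.4) for the first column
w <- wgt / sum(wgt)
mu_w <- sum(w * x[, 1])
var_w <- sum(w * (x[, 1] - mu_w)^2)

# The default method rescales the ML estimate by 1 / (1 - sum(w^2))
fit_unb <- cov.wt(x, wt = wgt)
adj <- 1 / (1 - sum(w^2))
```

Here fit$center[1] and fit$cov[1, 1] agree with mu_w and var_w, and fit_unb$cov equals adj times fit$cov.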
The result of cov.wt is a list containing several components. For example,
the estimated mean returns are given in the component $center of big8.wt:
> big8.wt$center
AAPL BAX KO CVS XOM IBM JNJ
0.024478 0.008672 0.009056 0.025728 0.006367 0.000341 0.013607
DIS
0.023101
These may be compared to the unweighted sample means, as calculated earlier.
> Rbar
AAPL BAX KO CVS XOM IBM JNJ DIS
0.02540 0.00740 0.00978 0.02119 0.00825 0.00598 0.01153 0.02075
In most cases, the weighted and unweighted estimates are similar; however,
for IBM, the weighted estimate is much smaller than the unweighted estimate,
suggesting that the returns on IBM stock may be decreasing over time.
The estimated covariance matrix is available in the component $cov.
Hence, the estimated weighted sample standard deviations may be obtained
by using the diag function, which returns the diagonal elements of a matrix.
> diag(big8.wt$cov)^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0720 0.0462 0.0427 0.0511 0.0434 0.0473 0.0374 0.0505
FIGURE 6.1
Plot of excess returns on Baxter stock (horizontal axis: Time; vertical axis: Return).
> cov2cor(Sighat)
AAPL BAX KO CVS XOM IBM JNJ DIS
AAPL 1.000 0.193 0.260 0.329 0.303 0.319 0.145 0.346
BAX 0.193 1.000 0.310 0.330 0.327 0.381 0.473 0.196
KO 0.260 0.310 1.000 0.310 0.338 0.197 0.493 0.348
CVS 0.329 0.330 0.310 1.000 0.442 0.244 0.421 0.537
XOM 0.303 0.327 0.338 0.442 1.000 0.520 0.408 0.650
IBM 0.319 0.381 0.197 0.244 0.520 1.000 0.206 0.348
JNJ 0.145 0.473 0.493 0.421 0.408 0.206 1.000 0.323
DIS 0.346 0.196 0.348 0.537 0.650 0.348 0.323 1.000
Now consider the estimator of $\boldsymbol{\mu}$ based on the assumption that all asset
mean returns are equal: $\mu_1 = \mu_2 = \cdots = \mu_N \equiv \mu$ so that the mean vector $\boldsymbol{\mu}$
is of the form
\[
\boldsymbol{\mu} = (\mu, \mu, \ldots, \mu)^T .
\]
Note that, since the same risk-free rate applies to each asset, the model in which
the μj are equal is equivalent to one in which the excess mean returns are equal.
Under this model, the common mean $\mu$ can be estimated by the sample
mean of $\bar{R}_1, \bar{R}_2, \ldots, \bar{R}_N$:
\[
\hat{\mu} = \frac{1}{N} \sum_{j=1}^{N} \bar{R}_j .
\]
\[
\hat{\mu} = \frac{1}{T} \sum_{t=1}^{T} \bar{R}_{\cdot t}
\]
where R̄·t is the sample mean of all the asset returns in time period t:
\[
\bar{R}_{\cdot t} = \frac{1}{N} \sum_{j=1}^{N} R_{j,t} .
\]
\[
E(\hat{\mu}) = \frac{1}{N} \sum_{i=1}^{N} \mu_i
\]
\[
E(\hat{\mu}) - \mu_j = \frac{1}{N} \sum_{i=1}^{N} \mu_i - \mu_j .
\]
ψμ̂ + (1 − ψ)R̄
\[
\sum_{j=1}^{N} (\bar{R}_j - \hat{\mu})^2 ;
\]
\[
\tau^2 = \frac{1}{N} \sum_{j=1}^{N} (\bar{R}_j - \hat{\mu})^2 .
\]
$S_j^2$ is the sample return variance for asset $j$. Hence, the average estimated
variance of a sample mean return is given by $S^2/T$ where
\[
S^2 = \frac{1}{N} \sum_{j=1}^{N} S_j^2 .
\]
These results may then be used to calculate the weight ψ and the vector
of estimated asset mean returns.
> psi<-(mean(S2)/60)/(tau2 + (mean(S2)/60))
> psi
[1] 0.491
> muhat<-psi*mean(Rbar) + (1-psi)*Rbar
> muhat
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0197 0.0105 0.0117 0.0176 0.0110 0.0098 0.0126 0.0173
Thus, the estimated mean returns for the eight stocks are weighted averages
of the individual sample mean returns and the average mean return for all
stocks.
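The full sequence of steps can be sketched on simulated data. The following is a minimal illustration, not the calculation for big8: the data matrix X, the hypothetical true means, and the seed are assumptions, and tau2 is computed as the average squared deviation of the sample means from their overall mean, which is the form assumed here.

```r
set.seed(3)
T_len <- 60; N <- 8
true_mu <- seq(0.005, 0.025, length.out = N)     # hypothetical asset means
X <- sapply(true_mu, function(m) rnorm(T_len, mean = m, sd = 0.05))

Rbar <- apply(X, 2, mean)               # sample mean return of each asset
S2 <- apply(X, 2, var)                  # sample return variance of each asset
muhat_common <- mean(Rbar)              # estimate under the equal-means model
tau2 <- mean((Rbar - muhat_common)^2)   # assumed form of the spread of the sample means

psi <- (mean(S2) / T_len) / (tau2 + mean(S2) / T_len)
muhat <- psi * muhat_common + (1 - psi) * Rbar   # shrinkage estimates
```

Each element of muhat lies between the corresponding sample mean and the common estimate, and the average of muhat equals the average of Rbar.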
Of course, we do not know which estimator, R̄ or the shrinkage estimator μ̂S,
is the more accurate; this issue will be considered in Section 6.7.
where S is the sample covariance matrix. The same basic reasoning used in
estimating a mean return applies here as well: By taking a weighted aver-
age of Σ̂ and S, we hope to form an estimator with the best properties of
each.
The remaining issue is selection of ψ. A value of ψ close to one leads to
an estimator that is close to the model-based estimator Σ̂, while for a value
close to zero the resulting estimator is close to the sample covariance matrix.
Although the details are fairly complicated, the basic idea in choosing the
value of ψ used in the shrinkage estimate of the covariance matrix is the same
as that used in estimating the asset return means. Let $a^2$ denote an estimate
of the squared distance between $\Sigma$ and $\sigma^2 I$ and let $b^2$ denote a measure of
the variability in $S$ as an estimator of $\Sigma$. Then the optimal value of $\psi$ is of
the form $b^2/(a^2 + b^2)$.
To calculate a shrinkage estimate of a covariance matrix in R, using
this model, we may use the function shrinkcovmat.equal, available in the
package ShrinkCovMat (Touloumis 2015).
> library(ShrinkCovMat)
> cov.shrnk<-shrinkcovmat.equal(t(big8))
> cov.shrnk$Sigmahat
AAPL BAX KO CVS XOM
AAPL 0.00503 0.00067 0.00066 0.00118 0.00086
BAX 0.00067 0.00305 0.00059 0.00089 0.00070
KO 0.00066 0.00059 0.00188 0.00062 0.00053
CVS 0.00118 0.00089 0.00062 0.00326 0.00098
XOM 0.00086 0.00070 0.00053 0.00098 0.00223
IBM 0.00091 0.00081 0.00031 0.00054 0.00092
JNJ 0.00035 0.00085 0.00066 0.00079 0.00061
DIS 0.00124 0.00053 0.00070 0.00151 0.00145
> cov.shrnk$Target
[,1] [,2] [,3] [,4] [,5]
[1,] 0.00283 0.00000 0.00000 0.00000 0.00000
[2,] 0.00000 0.00283 0.00000 0.00000 0.00000
[3,] 0.00000 0.00000 0.00283 0.00000 0.00000
[4,] 0.00000 0.00000 0.00000 0.00283 0.00000
[5,] 0.00000 0.00000 0.00000 0.00000 0.00283
[6,] 0.00000 0.00000 0.00000 0.00000 0.00000
[7,] 0.00000 0.00000 0.00000 0.00000 0.00000
[8,] 0.00000 0.00000 0.00000 0.00000 0.00000
[,6] [,7] [,8]
[1,] 0.00000 0.00000 0.00000
[2,] 0.00000 0.00000 0.00000
[3,] 0.00000 0.00000 0.00000
[4,] 0.00000 0.00000 0.00000
[5,] 0.00000 0.00000 0.00000
[6,] 0.00283 0.00000 0.00000
[7,] 0.00000 0.00283 0.00000
[8,] 0.00000 0.00000 0.00283
> cov.shrnk$lambdahat
[1] 0.162
These results may be compared to the sample variances and the sample
correlation matrix.
Note that the shrinkage estimates of standard deviation are all closer to
0.0532, the square root of the average sample variance (0.00283), and the
shrinkage correlation estimates are all closer to zero than are the estimates
based on the sample covariance matrix.
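The printed output appears consistent with the shrinkage estimate being a weighted average of the sample covariance matrix and the target, with weight lambdahat on the target. The following is a minimal check of that reading using only the values printed above for AAPL and BAX; the variable names are illustrative.

```r
# Values copied from the output above for AAPL and BAX
S_sub <- matrix(c(0.005460, 0.000794,
                  0.000794, 0.003088), 2, 2)   # block of Smat
target_sub <- diag(0.00283, 2)                  # block of cov.shrnk$Target
lambda <- 0.162                                 # cov.shrnk$lambdahat

Sig_shrunk <- (1 - lambda) * S_sub + lambda * target_sub
Sig_shrunk_rounded <- round(Sig_shrunk, 5)      # compare with cov.shrnk$Sigmahat

# Implied AAPL-BAX correlation before and after shrinkage
cor_sample <- cov2cor(S_sub)[1, 2]
cor_shrunk <- cov2cor(Sig_shrunk)[1, 2]
```

The rounded entries reproduce the values 0.00503, 0.00067, and 0.00305 shown in cov.shrnk$Sigmahat, and the implied correlation moves from about 0.193 toward zero.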
\[
\frac{S^{-1}(\bar{R} - \bar{R}_f \mathbf{1})}{\mathbf{1}^T S^{-1}(\bar{R} - \bar{R}_f \mathbf{1})} .
\]
Thus, w_mv may be viewed as an estimate of the weight vector of the minimum-
variance portfolio constructed from the stocks represented in the data matrix
big8.
Alternatively, we could use a shrinkage estimator of the covariance matrix
to estimate the weights of the minimum-variance portfolio. Recall that the
shrinkage estimate based on a target matrix of the form σ2 I is stored
in the matrix cov.shrnk$Sigmahat; see Example 6.12. The corresponding
estimate of the minimum-variance portfolio weight vector is given by
The two estimates are generally similar but there are some differences; for
instance, there are no negative weights in the shrinkage estimate.
For the weights of the tangency portfolio, the estimates based on the
sample mean returns and the sample covariance matrix are given by
and the corresponding estimates based on the shrinkage estimates are given by
Here the differences in the weights are greater than we saw for the
weights of the minimum-variance portfolio. In general, the weights are less
extreme, with the largest differences occurring for the largest positive and
largest negative weights. This is not surprising given the nature of shrinkage
estimates.
\[
w^T \mu - \frac{\lambda}{2}\, w^T \Sigma w,
\]
we can maximize the estimator of the criterion function given by
\[
w^T \bar{R} - \frac{\lambda}{2}\, w^T S w .
\]
Example 6.14 Consider estimating the weight vector of a risk-averse port-
folio of the stocks represented in the data matrix big8. Recall that, given
the mean vector and covariance matrix of the returns on the assets under
consideration, the weight vector of the risk-averse portfolio may be obtained
as the solution to a quadratic programming problem. In R, we can calculate
such a solution using the function solve.QP in the package quadprog; see
Example 5.9.
The following commands can be used to estimate the portfolio weights of
the risk-averse portfolio with parameter λ = 5 for the stocks represented in
the data matrix big8.
> library(quadprog)
> mu8<-apply(big8, 2, mean) + mean(rfree)
> A1<-cbind(rep(1, 8))
> ra8.5<-solve.QP(Dmat=5*Smat, dvec=mu8, Amat=A1, bvec=1,
+ meq=1)$solution
> ra8.5
[1] 0.606 -0.213 -0.243 0.532 -1.007 -0.281 0.702 0.905
Here mu8 is the vector of sample mean returns; the matrix big8 contains
excess returns so that the sample mean of the risk-free rate must be added
back in.
Note that the estimated weight vector contains large short positions on
four stocks. Hence, we might consider enforcing the requirement that all asset
weights be nonnegative. The function solve.QP can also be used to calculate
the weight vector that maximizes the risk-aversion criterion subject to such a
restriction; see Example 5.12.
> A2<-cbind(rep(1, 8), diag(8))
> b2<-c(1, rep(0, 8))
> ra8.5.nn<-solve.QP(Dmat=5*Smat, dvec=mu8, Amat=A2, bvec=b2,
+ meq=1)$solution
> round(ra8.5.nn, 5)
[1] 0.392 0.000 0.000 0.343 0.000 0.000 0.000 0.265
The function round is used here so that values very close to 0 are written as 0.
Thus, with the nonnegativity constraint, only three stocks are represented
in the risk-averse portfolio with λ = 5: AAPL, CVS, and DIS. It is inter-
esting to note that the weight of JNJ, which is 0.702 in the unconstrained
risk-averse portfolio, is zero in the constrained portfolio.
according to a distribution with mean 0.01 and standard deviation 0.05. In this
example, we perform a similar analysis, using Monte Carlo simulation in place
of the theoretical properties discussed in Section 6.2.
To simulate a sequence of 60 such returns, we may use the command rnorm.
Specifically,
> ret<-rnorm(60, mean=0.01, sd=0.05)
draws 60 random variables independently from a normal distribution with
mean 0.01 and standard deviation 0.05 and places the result in the variable
ret, a vector of length 60. The sample mean of this vector,
> mean(ret)
[1] 0.0178
represents a sample of size one from the distribution of the random variable
$\bar{R}_j = \sum_{t=1}^{T} R_{j,t}/60$ (with $T = 60$), where $R_{j,1}, R_{j,2}, \ldots, R_{j,60}$ have the
distribution described previously.
Of course, a sample of size one does not provide much information. Hence,
we are generally interested in a large sample of such simulated sample means.
To simulate many sample means at one time, we may use the following proce-
dure. We begin by simulating a matrix of returns, where each row corresponds
to a vector of 60 returns.
> ret_mat<-matrix(rnorm(60*10000, mean=0.01, sd=0.05), 10000, 60)
Here the function rnorm simulates 600,000 independent returns, each normally
distributed with mean 0.01 and standard deviation 0.05, and the function
matrix arranges these in a 10,000 × 60 matrix.
Using apply, we may now calculate the sample mean of each row:
> ret_mean<-apply(ret_mat, 1, mean)
The variable ret_mean now contains a sample of size 10,000 from the
distribution of R̄j . We may now analyze ret_mean as we would any sample of
observations. For instance,
> mean(ret_mean)
[1] 0.01002
> sd(ret_mean)
[1] 0.00642
are estimates of the mean and standard deviation, respectively, of the sampling
distribution of R̄j .
Recall that, in this scenario, the mean and standard deviation of this sam-
pling distribution may be calculated exactly and, in Example 6.3, they were
shown to be 0.01 and 0.00645, respectively. Thus, the Monte Carlo estimates
closely match the true values; generally speaking, even closer agreement could
be obtained by increasing the number of Monte Carlo replications, that is,
by increasing the number of rows in the matrix ret_mat.
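The comparison with the exact value can be made reproducible by fixing the random seed. The following is a minimal sketch; the seed is an arbitrary choice, rowMeans is used as a faster equivalent of apply(ret_mat, 1, mean), and the Monte Carlo standard error is an added diagnostic not computed in the text.

```r
set.seed(4)                           # arbitrary seed, so that reruns give identical results
sd_exact <- 0.05 / sqrt(60)           # exact sd of the mean of 60 independent returns

n_rep <- 10000
ret_mat <- matrix(rnorm(60 * n_rep, mean = 0.01, sd = 0.05), n_rep, 60)
ret_mean <- rowMeans(ret_mat)         # one simulated sample mean per row

mc_mean <- mean(ret_mean)
mc_sd <- sd(ret_mean)
mc_se_of_mean <- mc_sd / sqrt(n_rep)  # Monte Carlo standard error of mc_mean
```

Because mc_se_of_mean decreases in proportion to one over the square root of n_rep, quadrupling the number of rows roughly halves the Monte Carlo error in mc_mean.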
Other properties of the sampling distribution of R̄j may be found in the
same way. For instance,
> summary(ret_mean)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.01260 0.00571 0.01003 0.01002 0.01440 0.03110
Thus, estimates of the lower and upper quartiles of the sampling distribution of
R̄j are given by 0.00571 and 0.0144, respectively. Recall that, in Example 6.3,
we approximated these by 0.00570 and 0.0143, respectively. The advantage of
the Monte Carlo method is that these values were obtained using only numer-
ical methods, without using any analytical approximations. One drawback of
the Monte Carlo method is that, if the analysis is repeated, different results
will be obtained, although if a large Monte Carlo sample size is used, such
as the 10,000 used here, the differences are generally slight. For instance, if
the calculations described previously are repeated, the sample mean and stan-
dard deviation of the simulated return sample means are 0.01009 and 0.00643,
respectively.
A histogram of the simulated mean returns gives some information about
the shape of the sampling distribution; such a plot can be produced by the
command hist(ret_mean). The result is given in Figure 6.2, and it supports
the conclusion of the CLT that the sampling distribution of R̄j is approxi-
mately normal. A normal probability plot, as discussed in Section 2.5, could
also be considered.
FIGURE 6.2
Histogram of simulated sample mean returns (horizontal axis: ret_mean; vertical axis: Frequency).
Given the form of the estimator—a sample mean—along with the distri-
bution of the returns, which are assumed to be independent and normally
distributed, it is not surprising that the distribution of R̄j is approximately
normal; of course, in this case, it is well-known that the distribution of R̄j
is exactly normal. A more extensive study of this type would likely change
the assumed distribution of the returns in order to study the effect of distri-
butional assumptions on the properties of the estimator. With Monte Carlo
simulation, such changes are generally easy to implement.
Example 6.16 Consider the Monte Carlo simulation described in Example
6.15, which analyzed the properties of the sample mean return based on the
observation of 60 returns, independently distributed according to a distri-
bution with mean 0.01 and standard deviation 0.05, but now suppose that
the observations are not normally distributed. Empirical studies have shown
that the t-distribution is often useful for modeling return data; thus, here we
assume that the returns follow a t-distribution with six degrees of freedom,
with location and scale parameters chosen to achieve the desired mean and
standard deviation.
Changing the distribution used in the Monte Carlo simulation from the
normal distribution to the t-distribution amounts to replacing the function
rnorm used in Example 6.15 with the function rt, which simulates observations
from a t-distribution. However, the properties of the t-distribution are such
that some care is needed in order to achieve the required results.
The degrees of freedom in rt is specified by the argument df; for example,
rt(1, df=6) returns one observation from a t-distribution with six degrees
of freedom. The complications arise when specifying the mean and standard
deviation of the distribution. A random variable with a t-distribution with ν
degrees of freedom has mean 0 and variance ν/(ν − 2), provided that ν > 2;
thus, a random variable with a t-distribution with six degrees of freedom has
mean 0 and standard deviation √1.5. It follows that to draw a random sample
of size n from a t-distribution with mean 0.01, standard deviation 0.05, and
six degrees of freedom, we use the command
rt(n, df=6)*(0.05/(1.5^.5))+0.01
Therefore, to repeat the Monte Carlo study described in Example 6.15,
but with a t-distribution with six degrees of freedom replacing the
normal distribution, we use the commands
> ret_mat_t<-matrix(rt(60*10000, df=6)*(0.05/(1.5^.5))+0.01,
+ 10000, 60)
> ret_mean_t<-apply(ret_mat_t, 1, mean)
> mean(ret_mean_t)
[1] 0.00999
> sd(ret_mean_t)
[1] 0.00641
> summary(ret_mean_t)
Note that the results are generally similar to those obtained in Example
6.15, suggesting that the sampling distribution of the sample mean is close to
normal even if the returns follow a t-distribution. This is not surprising given
that each sample mean is based on 60 observations; the approximation given
by the CLT tends to be accurate for that sample size.
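The location and scale adjustment applied to rt can be checked by simulation. The following is a minimal sketch; the seed and the sample size of 200,000 are arbitrary choices made so that the sample moments are close to their targets.

```r
set.seed(5)                                  # arbitrary seed
nu <- 6
scale_fac <- 0.05 / sqrt(nu / (nu - 2))      # rescales the sd from sqrt(1.5) to 0.05

x <- rt(200000, df = nu) * scale_fac + 0.01  # shifted and rescaled t variates
sample_mean <- mean(x)                       # should be near 0.01
sample_sd <- sd(x)                           # should be near 0.05
```

Writing the scale factor in terms of nu makes it straightforward to repeat the study with other degrees of freedom, provided that nu exceeds 2 so that the variance exists.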
In the examples considered thus far in this section, Monte Carlo simulation
was applied to an estimator whose properties may be determined using ana-
lytic methods. However, the Monte Carlo method is most useful when such
analytic results are not readily available. Many such cases occur when the
estimator under consideration is a function of the returns on several assets,
so that the correlation structure of the returns plays a role in the sampling
distribution.
Thus, each of the three component random variables in the random vector
has variance 0.2 and the correlation between any two such random variables
is 0.5.
The following commands generate one sample from this distribution.
> library(MASS)
> mu0<-c(0.1,0.2,0.3)
> Sig0<-matrix(c(0.2,0.1,0.1,0.1,0.2,0.1,0.1,0.1,0.2), 3, 3)
> Sig0
[,1] [,2] [,3]
[1,] 0.2 0.1 0.1
[2,] 0.1 0.2 0.1
[3,] 0.1 0.1 0.2
> mvrnorm(n=1, mu=mu0, Sig=Sig0)
[1] -0.3335 0.0529 0.0982
For n = 1, the result of the function is a vector; for n > 1, the result is a
matrix, with each row corresponding to a simulated random vector. For exam-
ple, to draw four random vectors from the multivariate normal distribution
with mean vector mu0 and covariance matrix Sig0, we use the command
Therefore, if the returns on a set of three assets are modeled as random vectors
with mean mu0 and covariance matrix Sig0, and the return vectors in different
time periods are independent, then the matrix ret.sim represents the returns
on the three assets over four time periods, with each row representing a time
period. Hence, ret.sim is a simulated data matrix of the type described in
Section 6.3 and is similar to the observed data matrix big8.
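A minimal sketch of drawing such a simulated data matrix follows. The seed is an arbitrary choice, and the full argument name Sigma is written out here; the abbreviated Sig used above relies on R's partial matching of argument names.

```r
library(MASS)                                 # provides mvrnorm
set.seed(6)                                   # arbitrary seed
mu0 <- c(0.1, 0.2, 0.3)
Sig0 <- matrix(c(0.2, 0.1, 0.1,
                 0.1, 0.2, 0.1,
                 0.1, 0.1, 0.2), 3, 3)

# Four independent draws: rows are time periods, columns are assets
ret.sim <- mvrnorm(n = 4, mu = mu0, Sigma = Sig0)
sim_dim <- dim(ret.sim)
sim_means <- apply(ret.sim, 2, mean)          # simulated sample mean for each asset
```

With only four time periods, the values in sim_means can be far from mu0; the replications discussed next are what allow the sampling distribution to be studied.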
Simulated values of functions of these returns may be calculated in the
usual way. For instance, simulated return sample means for the three assets
are given by
Now consider replications of this procedure. For the case of a single asset, in
which the returns are given in a vector, independent replications of the returns
may be stored in a matrix, with each row corresponding to a series of asset
returns. When simulating a vector of returns, one replication of the simulation
is already a matrix. Although it is possible to handle the replications using a
three-dimensional array, a generalization of a matrix (an interesting exercise,
for those so inclined), the simplest approach is to use a loop.
For instance, to obtain 10,000 simulated values of the return sample means
for the three assets in the example, we can use the following commands:
> ret_means<-matrix(0, 10000, 3)
> for (j in 1:10000){
+ ret_sim<-mvrnorm(4, mu=mu0, Sig=Sig0)
+ ret_means[j, ]<-apply(ret_sim, 2, mean)
+ }
The command ret_means<-matrix(0, 10000, 3) creates a matrix that will
store the results. Each iteration of the loop simulates a vector of sample means
of the asset returns and stores them in a row of ret_means.
The results in ret_means may now be used to estimate properties of the
sampling distribution. For instance, the mean of the sampling distribution of
the vector of sample means R̄ is estimated by
> apply(ret_means, 2, mean)
[1] 0.0967 0.2007 0.2991
which is close to the known mean vector $(0.1, 0.2, 0.3)^T$.
Many statistical methods tend to work particularly well for normally dis-
tributed data. Thus, in conducting a Monte Carlo study, it is often of interest
to consider other distributions in addition to the multivariate normal.
The multivariate t-distribution is a multivariate generalization of the t-
distribution; its relationship to the univariate t-distribution is similar to the
relationship the multivariate normal distribution has to the univariate normal
distribution.
To simulate random variates with a multivariate t-distribution in R, we
use the function rmvt, which is available in the package mvtnorm (Genz et al.
2016). In the function rmvt, we may specify the number of samples (the
argument n), the degrees of freedom (the argument df), and the “scale” matrix
(the argument sigma). Let A denote the scale matrix; then the covariance
matrix of the distribution is given by Σ = A(ν/(ν − 2)), where ν denotes the
degrees of freedom of the distribution. The following example illustrates the
use of rmvt.
Example 6.18 Consider a three-dimensional random vector with the mean
vector and covariance matrix given in Example 6.17 and stored in the R
variables mu0 and Sig0, respectively. Furthermore, assume that the random
vector has a multivariate t-distribution.
> library(mvtnorm)
> rmvt(n=1, df=6, sigma=Sig0)/(1.5^.5) + mu0
[,1] [,2] [,3]
[1,] 0.172 0.677 0.0995
Note that rmvt draws a random vector from a multivariate t-distribution with
mean vector 0; adding the vector mu0 to the result modifies the mean vector
to mu0.
To simulate several random vectors, we can increase the value of n. How-
ever, in order to add the mean vector to the simulated random vectors, we
must construct a matrix of mean vectors. For instance, suppose that we wish
to simulate four random vectors. Then
returns a data matrix with the correct covariance matrix but each value is
simulated from a distribution with mean zero.
To add the mean vector to the result of rmvt, note that
matrix(mu0, 4, 3, byrow=T) returns a matrix with each row equal to mu0:
and verify that the sample mean vector and sample covariance matrix are
close to mu0 and Sig0, respectively.
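The mean and covariance adjustments can be checked on a large simulated sample. The following is a minimal sketch that does not call rmvt; instead it uses a standard construction of the multivariate t-distribution (a multivariate normal vector divided by the square root of an independent chi-squared variable over its degrees of freedom), so that only the MASS package is needed. The seed and the sample size are arbitrary choices.

```r
library(MASS)                                  # provides mvrnorm
set.seed(7)                                    # arbitrary seed
mu0 <- c(0.1, 0.2, 0.3)
Sig0 <- matrix(c(0.2, 0.1, 0.1,
                 0.1, 0.2, 0.1,
                 0.1, 0.1, 0.2), 3, 3)
nu <- 6; n <- 100000

A <- Sig0 * (nu - 2) / nu                      # scale matrix whose covariance is Sig0
Z <- mvrnorm(n, mu = rep(0, 3), Sigma = A)     # normal part, one row per draw
W <- rchisq(n, df = nu)                        # one chi-squared variable per row
tvec <- Z / sqrt(W / nu)                       # rows are multivariate t with mean 0

# Add mu0 to every row by building a matrix whose rows all equal mu0
ret.t <- tvec + matrix(mu0, n, 3, byrow = TRUE)

mean_check <- colMeans(ret.t)                  # should be close to mu0
cov_check <- cov(ret.t)                        # should be close to Sig0
```

Using matrix(mu0, n, 3, byrow = TRUE) rather than adding mu0 directly matters for n > 1, because R recycles a vector down the columns of a matrix and would otherwise add the wrong element of mu0 to most entries.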
\[
\frac{\Sigma^{-1}(\mu - \mu_f \mathbf{1})}{\mathbf{1}^T \Sigma^{-1}(\mu - \mu_f \mathbf{1})} ,
\]
Example 6.19 Consider two assets such that the excess return on the first
asset has mean 0.025 and variance 0.0055 and the excess return on the second
asset has mean 0.010 and variance 0.0017; take the covariance of the returns
to be 0.0008. These are the observed values (slightly rounded) for Apple stock
and Coca-Cola stock based on the big8 data. Define variables mu1 and Sig1
to represent the mean vector and covariance matrix of the excess returns:
> mu1<-c(0.025, 0.010)
> Sig1<-matrix(c(0.0055, 0.0008, 0.0008, 0.0017), 2, 2)
Thus, the tangency portfolio has weights
> solve(Sig1, mu1)/sum(solve(Sig1, mu1))
[1] 0.496 0.504
so that this portfolio places roughly half its investment on each asset.
Simulating the data according to a multivariate normal distribution, as
described previously in this section, we may draw a sample from the sampling
distribution of the weight on Apple stock (note that, since the weights sum
to 1, we only need to consider one of the weights).
> wgts<-rep(0, 10000)
> sharpe<-rep(0, 10000)
> for (j in 1:10000){
+ ret_sim<-mvrnorm(60, mu=mu1, Sig=Sig1)
+ mean_sim<-apply(ret_sim, 2, mean)
+ sig_sim<-cov(ret_sim)
+ wgt_sim<-solve(sig_sim, mean_sim)/sum(solve(sig_sim,
+ mean_sim))
+ wgts[j]<-wgt_sim[1]
+ sharpe[j]<-sum(mean_sim*wgt_sim)/((wgt_sim%*%sig_sim%*%
+ wgt_sim)^.5)
+ }
The vector wgts contains a sample of size 10,000 from the sampling distribution
of $w_T$, the weight on Apple stock in the tangency portfolio of Apple and
Coca-Cola stocks, each based on 60 observations; the vector sharpe contains
the estimated Sharpe ratio corresponding to the estimated tangency portfolio.
The sampling distribution of $w_T$ can be summarized using the usual functions;
for example,
> mean(wgts)
[1] 0.64
> median(wgts)
[1] 0.487
Hence, although the median weight of 0.487 is close to the true weight of
0.496, the mean weight is considerably larger than the true weight, suggesting
a skewed distribution.
The analysis in Example 6.19 is based on the assumption that the returns
in a given period have a multivariate normal distribution; we may repeat
the analysis using the assumption that the returns have a multivariate t-
distribution with six degrees of freedom.
Example 6.20 Consider the two assets described in Example 6.19, with mean
excess return vector given in mu1 and return covariance matrix given in Sig1.
Simulating the data according to a multivariate t-distribution with six
degrees of freedom, as described previously in this section, we may draw a
sample from the sampling distribution of the weight on Apple stock using the
following commands.
> wgts.t<-rep(0, 10000)
> sharpe.t<-rep(0, 10000)
> for (j in 1:10000){
+ ret_sim<-rmvt(60, df=6, sigma=Sig1)/(1.5^.5) + matrix(mu1,
+ 60, 2, byrow=T)
+ mean_sim<-apply(ret_sim, 2, mean)
+ sig_sim<-cov(ret_sim)
+ wgt_sim<-solve(sig_sim, mean_sim)/sum(solve(sig_sim,
+ mean_sim))
+ wgts.t[j]<-wgt_sim[1]
+ sharpe.t[j]<-sum(mean_sim*wgt_sim)/((wgt_sim%*%sig_sim%*%
+ wgt_sim)^.5)
+ }
The vector wgts.t contains a sample of size 10,000 from the sampling dis-
tribution of the weight on Apple stock in the tangency portfolio of Apple and
Coca-Cola stocks under the assumption that the returns have a t-distribution.
Thus, the sampling distribution of this weight may be summarized by its
quantiles:
> quantile(wgts.t, prob=probvec)
1% 5% 10% 25% 50% 75% 90% 95% 99%
-0.638 0.085 0.180 0.320 0.486 0.710 1.054 1.418 3.487
Note that the results are very similar to those in Example 6.19.
The quantiles of the sampling distribution of the maximum Sharpe ratio
are given by
> quantile(sharpe.t, prob=probvec)
1% 5% 10% 25% 50% 75% 90% 95% 99%
-0.060 0.187 0.236 0.315 0.404 0.500 0.588 0.641 0.746
Again, these results are similar to those in Example 6.19.
The results of the previous two examples suggest that tangency weights
based on sample data are not well-determined. There are two general reasons for
this. One is the variability in the estimates of the mean vector and the covari-
ance matrix, as we have discussed in this section. The other is that, although the
tangency portfolio maximizes the Sharpe ratio, many portfolios have a Sharpe
ratio close to the maximum value and, hence, the tangency portfolio itself is not
very well-defined. This fact may be illustrated in the following example.
Example 6.21 Consider the framework in Example 6.19. Recall that the tan-
gency portfolio, which maximizes the Sharpe ratio, places weight 0.496 on
Apple stock. Figure 6.3 contains a plot of the Sharpe ratio of a portfolio of
FIGURE 6.3
Plot of the Sharpe ratio versus w in Example 6.21 (horizontal axis: w; vertical axis: Sharpe ratio).
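The flatness of the Sharpe ratio near its maximum can be examined directly for the two assets of Example 6.19. The following is a minimal sketch; the function name sharpe_w, the grid spacing, and the 5% cutoff are illustrative assumptions.

```r
mu1 <- c(0.025, 0.010)                        # mean excess returns (Apple, Coca-Cola)
Sig1 <- matrix(c(0.0055, 0.0008, 0.0008, 0.0017), 2, 2)

# Sharpe ratio of the portfolio with weight w on Apple and 1 - w on Coca-Cola
sharpe_w <- function(w) {
  wv <- c(w, 1 - w)
  sum(wv * mu1) / sqrt(drop(t(wv) %*% Sig1 %*% wv))
}

w_grid <- seq(0, 1, by = 0.01)
sr <- sapply(w_grid, sharpe_w)
w_best <- w_grid[which.max(sr)]               # grid point with the largest Sharpe ratio

# Weights whose Sharpe ratio is within 5% of the maximum (an arbitrary cutoff)
w_near_max <- range(w_grid[sr >= 0.95 * max(sr)])
```

A plot similar in form to Figure 6.3 could be drawn with plot(w_grid, sr, type="l"). The width of the interval in w_near_max indicates how many portfolios have a Sharpe ratio close to the maximum.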
+ sample_means[j, ]<-Rb
+ }
The matrix shrnk_means contains a sample of 10,000 from the sampling
distribution of the vector of shrinkage estimates of the mean asset returns, with
one set of estimates in each row of the matrix. The matrix sample_means
contains the sample means from each iteration in a similar format.
The results in shrnk_means and sample_means may be summarized using
the usual functions. For instance, the mean vector of the sampling distribution
of the shrinkage estimator may be obtained by
Therefore, for each asset, the estimated bias is much greater than its stan-
dard error; for instance, the ratios of the estimated biases to their standard
errors, that is, the t-statistics for testing the hypothesis of no estimator bias,
are given by
Hence, based on these results, we conclude that the shrinkage estimators are
biased; this is to be expected because the shrinkage estimate of an asset mean
return is a weighted average of the sample mean return for that asset and the
sample mean return for all assets.
A similar analysis can be done on the sample mean returns. The estimated
biases are given by
and the ratios of the estimated biases to their standard errors are
where bias(θ̂) = E(θ̂) − θ is the bias of the estimator. Hence, the MSE com-
bines the bias of an estimator with the variability of its sampling distribution,
as measured by the variance. Often the square root of the MSE is reported (the
“root mean squared error” or RMSE), for the same reason that the standard
deviation is often preferred to the variance as a measure of variability.
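The way in which bias and variance combine in the MSE can be illustrated with a small Monte Carlo calculation. The following is a minimal sketch, not the analysis of shrnk_means: the helper mc_summary, the seed, and the deliberately simple biased estimator 0.8 times the sample mean are all illustrative assumptions.

```r
# Hypothetical helper: summarize Monte Carlo draws of an estimator of a known value
mc_summary <- function(draws, truth) {
  bias <- mean(draws) - truth
  variance <- mean((draws - mean(draws))^2)   # divisor n, so mse = bias^2 + variance exactly
  mse <- mean((draws - truth)^2)
  c(bias = bias, variance = variance, mse = mse, rmse = sqrt(mse))
}

set.seed(8)                                   # arbitrary seed
truth <- 0.01
ret_mat <- matrix(rnorm(60 * 10000, mean = truth, sd = 0.05), 10000, 60)
xbar <- rowMeans(ret_mat)                     # unbiased estimator of truth
shrunk <- 0.8 * xbar                          # biased estimator, shrunk toward 0

res_mean <- mc_summary(xbar, truth)
res_shrunk <- mc_summary(shrunk, truth)
```

In this setting the shrunken estimator has a nonzero bias but a smaller variance, and its estimated MSE is smaller than that of the sample mean; whether such a trade-off is favorable in general depends on the true parameter values.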
For the shrinkage estimators, the MSE may be estimated using the results
in shrnk_means.
> shrnk.mse^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.00904 0.00613 0.00463 0.00699 0.00546 0.00575 0.00442 0.00689
The ratios of the RMSE values for the shrinkage estimators to the RMSE
values for the sample mean returns are given by
> sort((shrnk.mse/mean.mse)^.5)
BAX JNJ KO XOM DIS CVS AAPL IBM
0.858 0.876 0.887 0.917 0.921 0.929 0.948 0.980
Note that the sort function puts the vector in increasing order.
Thus, for all of the stocks, the shrinkage estimator has a smaller esti-
mated RMSE than does the sample mean estimator. That is, the biases of the
shrinkage estimators are more than offset by their smaller standard deviations,
leading to more accurate estimates, on average.
It is important to keep in mind that the results of the analysis in this example,
like those of all analyses based on Monte Carlo simulation, are simply estimates
of the true properties of the estimators under consideration. In particular, if the
Monte Carlo analysis is repeated, the results will change; however, with a large
Monte Carlo sample size, such as the 10,000 used here, the changes tend to be small.
For instance, if the analysis in this example is repeated, the new estimates of
the ratios of the RMSE values are given by
BAX JNJ KO DIS XOM CVS AAPL IBM
0.865 0.880 0.888 0.915 0.919 0.920 0.938 0.979
Note that, although the ratios have all changed from the original estimates,
the changes are minor and the general conclusions regarding the estimators
do not change.
6.9 Exercises
1. Consider the returns on two stocks, Papa John’s International, Inc.
(symbol PZZA), and Bed Bath & Beyond, Inc. (BBBY). For each
stock, calculate five years of monthly returns for the period ending
December 31, 2015.
a. Calculate the sample mean of the returns on each stock.
b. Calculate the sample standard deviation of the returns on each
stock.
c. Calculate the sample correlation of the returns on the two stocks.
2. Repeat Exercise 1 using five years of daily returns for the period
ending December 31, 2015.
Compare the sample means and sample standard deviations
for the daily returns to the corresponding values for the monthly
returns. Do the relationships between monthly and daily values dis-
cussed in Section 2.5 appear to hold at least approximately? For
the comparisons, round the results to three significant figures.
Compare the sample correlation of the daily returns to the
sample correlation of the monthly returns.
3. Using data on three-month Treasury Bills obtained from the Fed-
eral Reserve website, calculate five years of monthly risk-free rates
for the period ending December 31, 2015. Calculate five years of
monthly excess returns for this period for Papa John’s and Bed
Bath & Beyond stock.
a. Calculate the sample mean of the excess returns on each stock.
b. Calculate the sample standard deviation of the excess returns on
each stock.
c. Calculate the sample correlation of the returns on the two stocks
using the excess returns.
d. Compare the results obtained in Parts (b) and (c) to those
obtained in Exercise 1. For the comparison, round the results
to three significant figures.
4. Calculate approximate 95% confidence intervals for the mean
monthly excess return on Papa John’s stock and the mean monthly
excess return on Bed Bath & Beyond stock. Using the procedure
discussed in Example 6.5, calculate an approximate 95% confidence
interval for the difference in mean monthly excess returns on these
two stocks.
5. Using a decay parameter of 0.93, calculate weighted estimates of
the mean and standard deviation of the monthly excess return on
Papa John’s stock; see Examples 6.7 and 6.8. Compare the results
to the (unweighted) sample mean and sample standard deviation.
6. Construct a data matrix consisting of five years of monthly excess
returns on five stocks, Papa John’s International, Inc. (symbol
PZZA), Bed Bath & Beyond, Inc. (BBBY), Netflix, Inc. (NFLX),
Time Warner, Inc. (TWX), and Verizon Communications, Inc.
(VZ); use returns for the time period ending December 31, 2015.
Add column names corresponding to the stock symbols to the data
matrix.
a. Calculate the sample mean of the excess returns of each stock.
b. Calculate the sample mean of the risk-free rate R̄f and use that
to calculate the sample mean of the standard returns of each
stock.
c. Calculate the sample standard deviation of the excess returns of
each stock.
d. Calculate the sample covariance matrix of the excess returns of
the five stocks.
e. Calculate the sample correlation matrix of the excess returns of
the five stocks.
7. Consider the data matrix constructed in Exercise 6, consisting of
five years of monthly excess returns on five stocks. Let μ1 , μ2 , . . . , μ5
denote the respective mean returns on the stocks. Using the pro-
cedure described in Example 6.11, construct shrinkage estimates of
σ∗
σ̄ = σ.
S
period 60; the return standard deviation is 0.02 in period 1 and 0.04
in period 60.
A matrix of return values corresponding to this scenario may be
simulated using the command
Note that the rnorm function recycles the values specified in the
argument mean. Because of the way in which they are recycled, we
must populate the matrix of returns by row, instead of by column,
which is the default; this is achieved by the argument byrow=T.
a. Construct a simulated return matrix using the command given
previously.
b. Consider a decay parameter of γ = 0.93. By constructing a vector
of the form $\gamma^{60-t}$ for t = 1, 2, . . . , 60, using a loop, calculate
the 10,000 EWMA estimates of the return standard deviation.
c. Using the return matrix from Part (a) together with the apply
function, calculate a vector of 10,000 sample mean returns.
d. Calculate the sample mean and standard deviation of the
EWMA estimates calculated in Part (b) and the sample means
calculated in Part (c).
e. Consider the estimates obtained by the two methods as estimates
of the return standard deviation in period 60, 0.04. Using the
results from the Monte Carlo simulation, estimate the bias and
RMSE of each estimator. Based on these results, which estimator
is preferable?
f. Estimate the RMSE for EWMA estimators based on different
values of the decay parameter, γ = 0.90 and γ = 0.95. Which
EWMA estimator has the smallest estimated RMSE?
18. The goal of this exercise is to use Monte Carlo simulation to study
the behavior of estimates of the weights for tangency and minimum-
variance portfolios. Consider a three-dimensional vector of asset
returns with excess mean vector of the form $c(1, 1, 1)^T$ for some c
and covariance matrix of the form
$$\sigma^2 \begin{pmatrix} 1 & \rho & \rho \\ \rho & 1 & \rho \\ \rho & \rho & 1 \end{pmatrix}$$
7.1 Introduction
In Chapters 4 and 5, we considered portfolio theory in which information
about the means, variances, and correlations of asset returns is used to con-
struct portfolios that are optimal according to certain criteria. In this chapter,
we turn this around—we analyze an optimal portfolio and see what this opti-
mality implies about the distribution of the asset returns. This analysis leads
to important properties of the relationship between the returns on a given
asset and the returns on the optimal portfolio.
According to the theory described in Sections 4.6 and 5.7, an investor
choosing a portfolio of risky assets to combine with the risk-free asset should
always choose the tangency portfolio. This is true for any level of risk preferred
by the investor as follows: to achieve low levels of risk, more of the investment
is placed in the risk-free asset, while investors able to tolerate higher levels of
risk place more of their investment in the tangency portfolio, even borrowing
to do so, if desired. Because, according to this theory, all investors use the
same combination of risky assets, that is, the tangency portfolio, the market
as a whole gives us useful information about the tangency portfolio.
The market portfolio is a portfolio of assets in which the weight placed
on asset j is equal to the investment in asset j, as a proportion of the total
investment in the market. The market portfolio may be viewed as a type of
“consensus portfolio” for all investors.
According to portfolio theory, all investors should use the tangency port-
folio so that this consensus portfolio should be identical to the tangency
portfolio. Therefore, we do not need to calculate the weights of the tangency
portfolio, we can observe them by calculating the investment in each asset in
the market. Furthermore, the equivalence of the market and tangency port-
folios has important implications for the relationship between the returns on
an asset and the returns on the market portfolio, which is summarized in
the capital asset pricing model (CAPM ). This model is a starting point for a
number of models describing the behavior of asset returns.
Of course, such an analysis must be based on a number of assumptions.
Specifically, we assume the following:
• Asset prices are in equilibrium, with supply equaling demand for
each asset.
The proof also follows from noting that any such portfolio has (σp , μp )
falling on the line connecting (0, μf ) and (σT , μT ).
Note that the condition that the portfolio in Lemma 7.1 places positive
weight on the tangency portfolio is the condition that the investor does not
take a short position in the tangency portfolio in order to buy the risk-free
asset.
According to the argument given in the introduction to this chapter, the
return on the tangency portfolio may be viewed as the return on the market
portfolio. Therefore, we may write the result (7.1) as
$$\frac{\mu_p - \mu_f}{\sigma_p} = \frac{\mu_m - \mu_f}{\sigma_m} \tag{7.2}$$
where μm , σm are the mean and standard deviation, respectively, of the return
on the market portfolio. This equation may also be written as
$$\mu_p = \mu_f + \frac{\mu_m - \mu_f}{\sigma_m}\,\sigma_p; \tag{7.3}$$
this form emphasizes the relationship between the expected return on the
portfolio and the portfolio risk. The relationship given in (7.3) is known as
the capital market line.
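As a small numerical sketch of the capital market line (7.3), the following R code evaluates the expected return of an efficient portfolio at several levels of risk. The parameter values are those used later in Example 7.1; the function name cml is an assumption made for illustration.

```r
# Parameter values as in Example 7.1 (illustrative only)
mu.m <- 0.025; sigma.m <- 0.04; mu.f <- 0.005
# capital market line (7.3): expected return as a function of portfolio risk
cml <- function(sigma.p) mu.f + (mu.m - mu.f) / sigma.m * sigma.p
cml.values <- cml(c(0, 0.02, 0.04, 0.06))
cml.values
```

With these values, the slope of the line is the market Sharpe ratio, 0.5, so each additional 0.02 of standard deviation adds 0.01 to the expected return.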
In this section, we show that a similar relationship holds for the return on
any asset, such as a single stock or a portfolio. Based on the assumptions dis-
cussed in Section 7.1, the market portfolio is equivalent to the tangency port-
folio, and hence, it maximizes the Sharpe ratio among all portfolios. Therefore,
modifying the weight given to asset i in the market portfolio must decrease
the Sharpe ratio. This fact may be used to derive a relationship between the
expected return on asset i and the expected return on the market portfolio.
Proposition 7.1. Let Ri denote the return on a given asset, let μi = E(Ri )
and let $\sigma_i^2 = \mathrm{Var}(R_i)$. Let Rm denote the return on the market portfolio, let
μm = E(Rm), let $\sigma_m^2 = \mathrm{Var}(R_m)$, and let ρi denote the correlation of Ri and
Rm. Then
$$\mu_i - \mu_f = \rho_i \frac{\sigma_i}{\sigma_m} (\mu_m - \mu_f) \tag{7.4}$$
where μf is the return on the risk-free asset.
Proof. Consider a new portfolio, formed by combining asset i with the market
portfolio. Let wi denote the weight given to asset i so that the new portfolio
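has return Rp = wi Ri + (1 − wi)Rm. It follows that
$$\mu_p(w_i) = w_i\mu_i + (1 - w_i)\mu_m$$
and
$$\sigma_p^2(w_i) = w_i^2\sigma_i^2 + (1 - w_i)^2\sigma_m^2 + 2w_i(1 - w_i)\rho_i\sigma_i\sigma_m.$$
Here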
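$$\frac{d\mu_p(w_i)}{dw_i} = \mu_i - \mu_m$$
and
$$\frac{d\sigma_p^2(w_i)}{dw_i} = 2w_i\sigma_i^2 - 2(1 - w_i)\sigma_m^2 + 2(1 - 2w_i)\rho_i\sigma_i\sigma_m$$
so that
$$\left.\frac{d\sigma_p^2(w_i)}{dw_i}\right|_{w_i=0} = 2\rho_i\sigma_i\sigma_m - 2\sigma_m^2.$$
Note that
$$\frac{d\sigma_p^2(w_i)}{dw_i} = 2\sigma_p(w_i)\frac{d\sigma_p(w_i)}{dw_i};$$
therefore,
$$\left.\frac{d\sigma_p(w_i)}{dw_i}\right|_{w_i=0} = \frac{\left.\dfrac{d\sigma_p^2(w_i)}{dw_i}\right|_{w_i=0}}{2\sigma_p(0)} = \frac{\rho_i\sigma_i\sigma_m - \sigma_m^2}{\sigma_p(0)}.$$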
Because μp(0) = μm and σp(0) = σm,
$$\left.\frac{d\sigma_p(w_i)}{dw_i}\right|_{w_i=0} = \rho_i\sigma_i - \sigma_m.$$
The result given in Proposition 7.1 is known as the capital asset pricing
model, often abbreviated as CAPM. It may also be written as
$$\frac{\mu_i - \mu_f}{\sigma_i} = \rho_i \frac{\mu_m - \mu_f}{\sigma_m} \tag{7.7}$$
so that the Sharpe ratio of a given asset is equal to the Sharpe ratio of the
market portfolio times the correlation of the asset’s return with the return on
the market portfolio.
That is, according to Proposition 7.1, the only way for an asset to
have a large Sharpe ratio is for its returns to be highly correlated with the
market returns. On the other hand, an asset with returns that are approxi-
mately uncorrelated with the market returns necessarily has a small Sharpe
ratio.
Note that, because ρi ≤ 1, the relationship in (7.7) is consistent with the
assumption that the market portfolio has the largest possible Sharpe ratio.
Example 7.1 Suppose that the monthly return on the market portfolio has
an expected value μm = 0.025 and a standard deviation σm = 0.04, and sup-
pose that the risk-free rate of return is μf = 0.005. Consider an asset with a
return with expected value and standard deviation of μi and σi , respectively,
and let ρi denote the correlation of the asset’s return with the market return.
Then
$$\frac{\mu_i - 0.005}{\sigma_i} = \rho_i \frac{0.025 - 0.005}{0.04} = 0.5\rho_i$$
so that the Sharpe ratio of the asset is ρi /2.
Define
$$\beta_i = \rho_i \frac{\sigma_i}{\sigma_m} = \frac{\mathrm{Cov}(R_i, R_m)}{\mathrm{Var}(R_m)}. \tag{7.8}$$
Then the relationship given in Proposition 7.1 may also be written
μi − μf = βi (μm − μf );
this equation is known as the security market line (SML). It shows that the
expected excess return on an asset is proportional to its value of βi . Thus, the
parameter βi describes an important property of an asset and analysts often
refer to the “beta” of an asset; the interpretation of beta will be discussed in
detail in the following section.
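The definition (7.8) and the security market line can be checked numerically with a short sketch. The market parameters are those of Example 7.1; the asset's correlation and standard deviation (0.5 and 0.08) are hypothetical values chosen for illustration.

```r
mu.m <- 0.025; sigma.m <- 0.04; mu.f <- 0.005   # market values from Example 7.1
rho.i <- 0.5; sigma.i <- 0.08                   # hypothetical asset
beta.i <- rho.i * sigma.i / sigma.m             # beta as defined in (7.8)
mu.i <- mu.f + beta.i * (mu.m - mu.f)           # security market line
sharpe.i <- (mu.i - mu.f) / sigma.i             # Sharpe ratio of the asset
c(beta = beta.i, mu = mu.i, sharpe = sharpe.i)
```

For this asset, beta is 1, the expected return equals that of the market, and the Sharpe ratio is 0.25, which agrees with the value ρi/2 found in Example 7.1.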
Recall that, for any random variable X, $E(X^2) = E(X)^2 + \mathrm{Var}(X)$ and,
for any constant c, Var(c + X) = Var(X). It follows that
$$E\left[(R_i - R_f - a - b(R_m - R_f))^2\right] = \left(E\left[R_i - R_f - a - b(R_m - R_f)\right]\right)^2 + \mathrm{Var}(R_i - bR_m). \tag{7.10}$$
Let â, b̂ denote the values of a, b, respectively, that minimize (7.9), or equivalently,
the expression in (7.10). Then, given b̂, â minimizes
$\left(E[R_i - R_f - a - \hat{b}(R_m - R_f)]\right)^2$ with respect to a. It follows that
$$\hat{a} = \mu_i - \mu_f - \hat{b}(\mu_m - \mu_f)$$
so that
$$\left(E\left[R_i - R_f - \hat{a} - \hat{b}(R_m - R_f)\right]\right)^2 = 0.$$
for any a and b. That is, the linear function of Rm − Rf that best approximates
Ri − μf in the sense of MSE is βi (Rm − Rf ).
portfolio times the correlation of the asset’s return with the market return.
However, there are a number of different implications of this result and, in
this section, we consider several of these.
The CAPM, as given in Proposition 7.1, describes a relationship between
the expected return on a portfolio and the expected return on a market port-
folio in terms of the standard deviations of the returns and their correlation.
However, that result also implies a relationship for the returns themselves.
Corollary 7.1. Let Ri denote the return on an asset, let Rm denote the
return on the market portfolio, let Rf denote the return on the risk-free asset,
and let βi = Cov(Ri , Rm )/Var(Rm ). Then we may write
Ri − Rf = βi (Rm − Rf ) + Zi
for a random variable Zi that has mean 0 and that is uncorrelated with Rm .
Proof. Note that Zi may be written
Zi = Ri − Rf − βi (Rm − Rf ), (7.12)
where βi is as given in the statement of the corollary.
Then, according to Proposition 7.1, Zi has expected value 0:
E(Zi ) = μi − μf − βi (μm − μf ) = 0.
Furthermore, using properties of covariance,
Cov(Zi , Rm ) = Cov(Ri − μf − βi (Rm − μf ), Rm )
= Cov(Ri , Rm ) − βi Var(Rm ) = 0
so that Zi is uncorrelated with the market return.
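The decomposition in Corollary 7.1 has a sample analogue that is easy to verify. The sketch below simulates returns from a hypothetical model in which the CAPM holds with beta 0.8 and a constant risk-free rate (both assumptions), computes the sample version of βi, and confirms that the resulting Zi has zero sample covariance with the market return. The variable names are illustrative.

```r
set.seed(1)
n <- 600
rf <- 0.005                                        # constant risk-free rate (assumption)
r.m <- rnorm(n, mean = 0.025, sd = 0.04)           # simulated market returns
r.i <- rf + 0.8 * (r.m - rf) + rnorm(n, 0, 0.03)   # asset returns; true beta 0.8
beta.hat <- cov(r.i, r.m) / var(r.m)               # sample analogue of beta
z <- r.i - rf - beta.hat * (r.m - rf)              # sample analogue of Z in (7.12)
cov.z <- cov(z, r.m)                               # zero up to rounding error
mean.z <- mean(z)                                  # close to, but not exactly, zero
c(beta.hat, cov.z, mean.z)
```

The zero covariance holds by construction for any data set, whereas the mean of Zi is near zero here only because the simulated returns satisfy the CAPM.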
hence, it follows that βj > βi. Note that, because μi, μj, and μm are all
greater than μf, both βi and βj must be positive. Therefore,
$$\mu_j - \mu_f = \beta_j(\mu_m - \mu_f) > \beta_i(\mu_m - \mu_f) = \mu_i - \mu_f$$
so that μj > μi. That is, an investor who assumes additional market risk by
investing in asset j is rewarded with a higher expected return.
On the other hand, suppose that the additional risk of asset j is
attributable entirely to the difference in the assets’ nonmarket components
of variance. If the market components of the variances of assets i and j are
equal, then $\beta_i^2\sigma_m^2 = \beta_j^2\sigma_m^2$ so that βi = βj. It follows that μi = μj; that is,
there is no “reward” for the additional nonmarket risk.
Now suppose that the difference between Var(Rj ) and Var(Ri ) is because of
differences in both the market and the nonmarket components of the variances.
Then the same argument holds, except that μj − μi depends only on the
difference between the market components of variance.
Specifically,
$$\begin{aligned} \mu_j - \mu_i &= (\beta_j - \beta_i)(\mu_m - \mu_f) \\ &= (\beta_j\sigma_m - \beta_i\sigma_m)\,\frac{\mu_m - \mu_f}{\sigma_m}. \end{aligned}$$
Note that βi σm and βj σm are the square roots of the market components of
variance for assets i and j, respectively. We will refer to βi σm as the market
component of risk for the asset; this market component may also be written
as ρi σi .
Thus, the difference (μj − μi ) is proportional to the difference in the mar-
ket components of risk for the two assets. This consequence of the CAPM is
often summarized by saying that there is a reward for assuming risk but only
for the market component of risk; there is no benefit in investing in an asset
that has a large nonmarket component of risk.
Example 7.4 Suppose that the return on the market portfolio has μm =
0.025 and σm = 0.04; let μf = 0.005. Consider an asset with a return that has
mean μi , standard deviation σi , and correlation with the market return of ρi .
Then
$$\beta_i = \rho_i\frac{\sigma_i}{\sigma_m} = 25\rho_i\sigma_i$$
and, hence,
$$\mu_i = \mu_f + \beta_i(\mu_m - \mu_f) = 0.005 + 25\rho_i\sigma_i(0.025 - 0.005) = 0.005 + \frac{1}{2}\rho_i\sigma_i.$$
Assume that ρi > 0. Let $\gamma_i^2$ denote the component of the variance of the
return on asset i that is due to the market, so that $\gamma_i^2 = \rho_i^2\sigma_i^2$. Then
$$\mu_i = 0.005 + \frac{1}{2}\gamma_i.$$
That is, the expected return on the asset is a linear function of its market
component of risk, γi .
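The conclusion of Example 7.4 can be illustrated with two hypothetical assets that have very different total risk but the same market component of risk, ρi σi = 0.04. The function name capm.mean and the asset values are assumptions made for this sketch.

```r
mu.m <- 0.025; sigma.m <- 0.04; mu.f <- 0.005    # values from Example 7.4
# expected return implied by the CAPM for an asset with correlation rho and sd sigma
capm.mean <- function(rho, sigma) mu.f + (rho * sigma / sigma.m) * (mu.m - mu.f)
mu.a <- capm.mean(rho = 0.50, sigma = 0.08)      # lower total risk
mu.b <- capm.mean(rho = 0.25, sigma = 0.16)      # higher total risk, same rho * sigma
c(mu.a, mu.b)
```

Both assets have expected return 0.005 + 0.04/2 = 0.025, even though the second has twice the standard deviation of the first; the additional nonmarket risk earns no additional expected return.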
$$R_p = \sum_{i=1}^{N} w_i R_i$$
denote its return. Then βp , the value of beta for the portfolio, may be written
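$$\begin{aligned} \beta_p &= \frac{\mathrm{Cov}(R_p, R_m)}{\mathrm{Var}(R_m)} = \frac{\mathrm{Cov}\left(\sum_{i=1}^{N} w_i R_i, R_m\right)}{\mathrm{Var}(R_m)} \\ &= \frac{\sum_{i=1}^{N} \mathrm{Cov}(w_i R_i, R_m)}{\mathrm{Var}(R_m)} = \frac{\sum_{i=1}^{N} w_i\,\mathrm{Cov}(R_i, R_m)}{\mathrm{Var}(R_m)} \\ &= \frac{\sum_{i=1}^{N} w_i\beta_i\,\mathrm{Var}(R_m)}{\mathrm{Var}(R_m)} = \sum_{i=1}^{N} w_i\beta_i. \end{aligned}$$
Because
$$\mu_p = E(R_p) = \sum_{i=1}^{N} w_i\mu_i,$$
$$\begin{aligned} \mu_p - \mu_f &= \sum_{i=1}^{N} w_i(\mu_i - \mu_f) = \sum_{i=1}^{N} w_i\beta_i(\mu_m - \mu_f) \\ &= \beta_p(\mu_m - \mu_f). \end{aligned}$$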
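Because covariance is linear, the identity for the portfolio beta also holds exactly for sample covariances. The following sketch checks this on simulated data; the three assets, their betas, and the portfolio weights are hypothetical.

```r
set.seed(2)
n <- 120
r.m <- rnorm(n, 0.01, 0.04)                        # simulated market returns
# returns on three hypothetical assets with different market sensitivities
r <- cbind(s1 = 0.5 * r.m + rnorm(n, 0, 0.02),
           s2 = 1.0 * r.m + rnorm(n, 0, 0.03),
           s3 = 1.5 * r.m + rnorm(n, 0, 0.05))
w <- c(0.2, 0.3, 0.5)                              # portfolio weights summing to 1
beta.hat <- apply(r, 2, function(x) cov(x, r.m) / var(r.m))   # asset betas
r.p <- as.vector(r %*% w)                          # portfolio returns
beta.p <- cov(r.p, r.m) / var(r.m)                 # portfolio beta computed directly
c(beta.p, sum(w * beta.hat))
```

The two printed values agree up to rounding error, so a portfolio's beta can be obtained from the asset betas without recomputing any covariances.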
and
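$$f'(0) = \frac{\mu_i - \mu_m}{\sigma_m} - \frac{\mu_m - \mu_f}{\sigma_m}\,\frac{\left.d\sigma_p(w_i)/dw_i\right|_{w_i=0}}{\sigma_p(0)}.$$
We have seen that
$$\left.\frac{d\sigma_p(w_i)}{dw_i}\right|_{w_i=0} = \frac{\rho_i\sigma_i\sigma_m - \sigma_m^2}{\sigma_m}$$
so that, using the fact that ρi = βi σm/σi and σp(0) = σm, we may write
$$\begin{aligned} f'(0) &= \frac{\mu_i - \mu_m}{\sigma_m} - \frac{\mu_m - \mu_f}{\sigma_m}(\beta_i - 1) \\ &= \frac{\mu_i - \mu_f - \beta_i(\mu_m - \mu_f)}{\sigma_m} \\ &= \frac{\alpha_i}{\sigma_m}. \end{aligned}$$
Therefore, if αi > 0, then f′(0) > 0 so that adding a small investment in asset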
i to the market portfolio increases the Sharpe ratio. Stated another way, the
market portfolio does not contain enough of asset i to maximize the Sharpe
ratio.
Let Qi denote the number of shares of asset i in the market and let Pi
denote the price of one share of asset i. Then the weight given to asset i in
the market portfolio is
$$\frac{Q_i P_i}{C}$$
where C denotes the total investment in the market, known as the
market capitalization.
When αi > 0, the weight given to asset i in the market portfolio is too
small; that is, the ratio Qi Pi /C is too small. Therefore, Pi , the price of asset
i, should be higher on average. It follows that, according to the CAPM, an
asset with αi > 0 is mispriced and its price is too low. Conversely, the price
of an asset with αi < 0 is too high; according to the CAPM, its price should
be lower on average.
Example 7.6 Suppose that Rm , the return on the market portfolio, has mean
0.025 and standard deviation 0.04 and that the risk-free rate is μf = 0.005.
Consider an asset with return Ri that has mean 0.02 and standard deviation
0.08, and suppose that the correlation of Ri and Rm is ρi = 0.30.
Then
βi = ρi σi /σm = (0.30)(0.08)/(0.04) = 0.60
and, hence, according to the CAPM,
μi = μf + βi (μm − μf ) = 0.005 + 0.60(0.025 − 0.005) = 0.017.
However, μi = 0.02, so that
αi = μi − μf − βi (μm − μf ) = 0.02 − 0.017 = 0.003.
Therefore, the price of asset i is too low.
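The calculations in Example 7.6 can be reproduced with a few lines of R; the variable names are chosen here for illustration.

```r
mu.m <- 0.025; sigma.m <- 0.04; mu.f <- 0.005   # market and risk-free values
mu.i <- 0.02; sigma.i <- 0.08; rho.i <- 0.30    # asset values from Example 7.6
beta.i <- rho.i * sigma.i / sigma.m             # beta of the asset
capm.mu <- mu.f + beta.i * (mu.m - mu.f)        # expected return implied by the CAPM
alpha.i <- mu.i - capm.mu                       # alpha: actual minus implied mean
round(c(beta = beta.i, capm.mu = capm.mu, alpha = alpha.i), 4)
```

The positive value of alpha, 0.003, corresponds to the conclusion that the price of the asset is too low.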
The CAPM given in Proposition 7.1 follows from the assumptions presented
in Section 7.1: asset prices are in equilibrium, investment
decisions are based on the means and standard deviations of the returns, and
all investors hold a combination of the tangency portfolio and the risk-free
asset. Therefore, if the conclusion of Proposition 7.1 does not hold, then one
or more of the assumptions must be incorrect.
For instance, it may be that market prices are not in equilibrium. This
suggests that if αi > 0, then the price of asset i needs to increase in order
to reach equilibrium, at which point αi will be 0. This leads to an expected
return for asset i that is larger than the expected return given by the CAPM.
The case of αi < 0 is similar except that we expect the return on asset i to be
lower than what is implied by the CAPM.
Alternatively, it may be that prices are in equilibrium but that investors
hold inefficient portfolios so that the market portfolio is inefficient in the sense
that it does not maximize the Sharpe ratio. Thus, if αi > 0, the demand for
asset i is lower than it would be if the market portfolio were efficient, leading
to a price for asset i that is too low.
and let μp and σp denote the mean and standard deviation, respectively, of
Rp . Then
$$\frac{\mu_p - \mu_f}{\sigma_p} \le \frac{\mu_m - \mu_f}{\sigma_m} \tag{7.17}$$
with equality if and only if Rp and Rm have correlation one. That is, the
market portfolio has the maximum possible Sharpe ratio.
Proof. Using the form of the SML given by (7.7), together with the properties
of portfolios discussed in Section 7.4, it follows that
$$\frac{\mu_p - \mu_f}{\sigma_p} = \rho_p \frac{\mu_m - \mu_f}{\sigma_m} \tag{7.18}$$
The CAPM shows that if a given portfolio is efficient in the sense that it
maximizes the Sharpe ratio, then the SML holds for all assets with respect
to that efficient portfolio. Proposition 7.2 shows that if the SML holds for all
assets in the market with respect to a given market portfolio, then that market
portfolio must maximize the Sharpe ratio. Therefore, there is a sense in which
the CAPM, as stated in Proposition 7.1, is actually a statement about the
efficiency of the market portfolio.
The result in Proposition 7.2 may be stated in an alternative form, which
is given in the following corollary; the proof of Proposition 7.2 may be easily
adapted to prove this result.
Corollary 7.3. Let Rp∗ denote a given portfolio and for any arbitrary asset
with return R define
where μ∗p = E(Rp∗ ) and β(R) = Cov(R, Rp∗ )/Var(Rp∗ ). Consider the set of
assets for which α(R) = 0. Then the asset with return Rp∗ is in this set and
it has the maximum Sharpe ratio among all portfolios formed from assets in
this set.
and, hence, we may no longer assume that the market portfolio is equivalent
to the tangency portfolio.
Instead, we assume that each investor holds a portfolio of risky assets
that lies on the efficient frontier, but these portfolios may vary by investor.
According to Proposition 5.2, portfolios constructed from assets lying on
the efficient frontier are also on the efficient frontier provided that the mean
return of the portfolio is greater than the mean return on the minimum vari-
ance portfolio. Hence, we may still assume that the market portfolio lies on
the efficient frontier. However, it is not necessarily equal to the tangency
portfolio.
Let Rm denote the return on the market portfolio. We assume that
if there is another portfolio, with return Rp , such that E(Rp ) = E(Rm ),
then Var(Rp ) ≥ Var(Rm ); alternatively, if Var(Rp ) = Var(Rm ), then E(Rp ) ≤
E(Rm ). Note that these assumptions state simply that the market portfolio
lies on the efficient frontier.
The proof of the CAPM given in Proposition 7.1 is based on the fact that
the market portfolio has the maximum Sharpe ratio among all assets and,
hence, modifying the weight given to any asset cannot increase the Sharpe
ratio. For the version of the CAPM considered in this section, we use a similar
argument based on the efficiency of the market portfolio.
Let Ri denote the return on asset i. Suppose that we can construct a
portfolio consisting of asset i together with the market portfolio that has
the same expected return as the market portfolio; then the variance of that
portfolio must be at least as large as that of the market portfolio. We may try
to use this fact to establish a relationship similar to that in the SML.
However, it is clear that such an approach cannot work—unless asset i
has the same expected return as the market portfolio, a portfolio including
both asset i and the market portfolio cannot have the same expected return as
the market portfolio. Hence, we need to include a third asset in the portfolio.
Because we would like the eventual result to focus on the relationship between
the return on asset i and the return on the market portfolio, we might consider
an asset with a return that is uncorrelated with the market return.
Let Rz denote the return on an asset satisfying Cov(Rz , Rm ) = 0 and
E(Rz ) ≠ E(Rm ). At the end of this section it will be shown that such a
portfolio exists. Note that Cov(Rz , Rm ) = 0 implies that the value of beta
corresponding to Rz is zero; therefore, the asset with return Rz is known as
the zero-beta portfolio.
Proposition 7.3. Let Rm denote the return on the market portfolio and let
Rz denote the return on the corresponding zero-beta portfolio; let μm = E(Rm )
and μz = E(Rz ). Consider an asset with return Ri ; let μi = E(Ri ) and let
βi = Cov(Ri , Rm )/Var(Rm ). Then
μi − μz = βi (μm − μz ). (7.19)
Proof. For a real number θ, consider the zero-investment portfolio with return
Therefore,
βi = 1 − θ0
and, using the expression for θ0 ,
$$\beta_i = 1 - \frac{\mu_m - \mu_i}{\mu_m - \mu_z} = \frac{\mu_i - \mu_z}{\mu_m - \mu_z},$$
proving the result.
That is, a form of the SML holds with μz replacing μf . The pricing model
given by (7.19) is known as the zero-beta CAPM.
Note that Proposition 7.3 requires only that the market portfolio is on
the efficient frontier, which is weaker than the condition that the market
portfolio is the tangency portfolio required in Proposition 7.1. Hence, one
might consider the possibility of weakening the conditions of Proposition 7.1
to require only that the market portfolio is on the efficient frontier, using the
method of proof used in Proposition 7.3 with the risk-free asset playing the role
of the zero-beta portfolio. However, in Proposition 7.3, it is important to keep
in mind that the efficiency of the market portfolio is with respect to all assets
under consideration; if a risk-free asset is available, then the market portfolio
must be efficient with respect to portfolios that include the risk-free asset.
Thus, such efficiency requires that the market portfolio is equivalent to the
tangency portfolio; that is, it is not possible to use the approach of Proposition
7.3 to weaken the conditions used to establish the SML in Proposition 7.1.
FIGURE 7.1
Hypothetical Plot of μk versus βk. [Plot omitted; the horizontal axis is βk and the value μf is marked on the plot.]
7.9 Exercises
1. Consider an asset with expected return 0.04 and suppose that the
expected return on the market portfolio is 0.06. Assuming that the SML holds
for the asset and that the risk-free return is 0.004, find the value of
beta for the asset.
2. Use the relationship given by the CAPM, as stated in (7.7), along
with the assumption that the market portfolio is equivalent to the
tangency portfolio, to establish the result in Lemma 7.1.
3. Consider an asset with return R and let Rm denote the return on
the market portfolio. Let μ = E(R), μm = E(Rm), $\sigma^2 = \mathrm{Var}(R)$,
and $\sigma_m^2 = \mathrm{Var}(R_m)$, and let μf denote the return on the risk-free
asset. Suppose that σ = σm/2 and that
$$\frac{\mu - \mu_f}{\sigma} = \frac{1}{2}\,\frac{\mu_m - \mu_f}{\sigma_m}.$$
Assuming that the SML holds for this asset, find its value of β.
4. Consider an asset with return R and let Rm denote the return on
the market portfolio. Suppose that R and Rm are related by
R = 0.002 + 0.9Rm + ε
Cov(Rm , ε) = 0.
Assuming that the SML holds for this asset, find the value of
μf , the return on the risk-free asset.
5. Consider two assets, with returns R1 and R2 , respectively, and let
Rm denote the return on the market portfolio. For j = 1, 2, let
μj = E(Rj), let $\sigma_j^2 = \mathrm{Var}(R_j)$, and let ρj denote the correlation of
Rj and Rm . Let μf denote the expected return on the risk-free asset
and assume that μj > μf for j = 1, 2. Assume that the SML holds
for both assets.
For each of the sets of parameter values given as follows, state
that asset 1 has the greater mean return, that asset 2 has the greater
mean return, or that it is not possible to determine the greater mean
return based on the information given.
a. Suppose that ρ1 = 0.6, σ1 = 0.2, ρ2 = 0.5, and σ2 = 0.3.
b. For j = 1, 2, let SRj denote the Sharpe ratio of asset j. Suppose
that
SR1 = 0.9 and SR2 = 1.2.
c. For j = 1, 2, let $\gamma_j^2 = \rho_j^2\sigma_j^2$; note that $\gamma_j^2$ is the market component
of variance for the return on asset j. Suppose that $\gamma_1^2 = 0.5$ and
$\gamma_2^2 = 0.8$.
6. Consider two assets with returns R1 and R2 , respectively, and let
Rm denote the return on the market portfolio. Suppose that the
SML holds for both assets, with β = 0.80 for asset 1 and β = 0.90
for asset 2. Does it follow that the correlation of R2 and Rm is
greater than the correlation of R1 and Rm ? Why or why not?
7. Let Rmv denote the return on the minimum-variance portfolio and
let Rm denote the return on the market portfolio. Suppose that the
correlation of Rmv and Rm is 0.4. Find the value of beta in the
SML for the minimum-variance portfolio.
8. Consider an asset with return Ri . Suppose that the variance of Ri is
0.04 and that the market component of the variance of Ri is 0.03. Let
μf denote the risk-free return and assume that μi ≡ E(Ri ) > μf .
a. Find the correlation of Ri and Rm , the return on the market
portfolio.
b. If the Sharpe ratio of the market portfolio is 0.12, find μi − μf .
9. Consider a set of N assets and let Rλ∗ denote the return on the risk-
averse portfolio based on risk-aversion parameter λ. Let Ri denote
the return on a given asset in this set. Find E(Ri ) − E(Rλ∗ ), in terms
of Cov(Ri , Rλ∗ ), Var(Rλ∗ ), and λ.
R0 − Rf = β0 (Rm − Rf ) + Z
Rz = φz Rm + (1 − φz )Rmv
$$\phi_z = \frac{\sigma_{mv}^2}{\sigma_{mv}^2 - \sigma_m^2}.$$
E(Rz1 ) = E(Rz2 ).
8.1 Introduction
The capital asset pricing model (CAPM) describes a relationship between the
expected return on an asset and the expected return on a “market portfolio,”
which is assumed to be equivalent to the tangency portfolio. The value of
the parameter β for an asset gives important information regarding both the
expected return and the risk of an asset.
However, the CAPM describes a theoretical relationship that is based on
a number of assumptions that are difficult, or impossible, to verify. In this
chapter, we consider the market model, a statistical model for the relationship
between observed asset returns and the observed returns on a type of market
portfolio. This model is a form of linear regression model that can be estimated
using standard techniques. It is consistent with the CAPM in many respects
and, hence, the estimates from the market model give useful information
regarding the statistical properties of asset returns.
Cap-Weighted Indices
Suppose that a market index is to be based on N assets, with respective prices
P1,t , P2,t , . . . , PN,t at time t; here Pj,t is the price of one share of asset j at
time t. Let Qj denote the number of shares of asset j available in the market,
j = 1, 2, . . . , N . Then the amount invested in asset j at time t is Qj Pj,t ; this is
known as the market capitalization of asset j at time t. The total capitalization
of the entire market, or the market cap, at time t is given by
$$\sum_{j=1}^{N} Q_j P_{j,t}.$$
Note that
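$$\frac{\sum_{j=1}^{N} Q_j P_{j,t+1}}{\sum_{j=1}^{N} Q_j P_{j,t}} - 1 = \frac{\sum_{j=1}^{N} Q_j (P_{j,t+1} - P_{j,t})}{\sum_{j=1}^{N} Q_j P_{j,t}} = \frac{\sum_{j=1}^{N} Q_j P_{j,t} R_{j,t+1}}{\sum_{j=1}^{N} Q_j P_{j,t}}$$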
where
$$R_{j,t+1} = \frac{P_{j,t+1}}{P_{j,t}} - 1$$
denotes the return on asset j at time t + 1.
Let
$$w^c_{j,t} = \frac{Q_j P_{j,t}}{\sum_{j=1}^{N} Q_j P_{j,t}}$$
denote the proportion of the market cap corresponding to asset j at time t.
Note that
$$\sum_{j=1}^{N} w^c_{j,t} = 1.$$
The weights $w^c_{j,t}$, j = 1, 2, . . . , N, are known as capitalization weights, or simply
cap weights, at time t.
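The relationship between the change in total market capitalization and the cap-weighted average of the asset returns can be checked numerically. The share counts and prices below are hypothetical values for N = 3 assets, and the variable names are illustrative.

```r
# hypothetical share counts and prices at times t and t + 1
Q  <- c(100, 250, 50)
P0 <- c(20, 8, 110)
P1 <- c(21, 7.6, 115.5)
cap0 <- Q * P0                                # market capitalization of each asset at time t
w.c  <- cap0 / sum(cap0)                      # cap weights at time t
R1   <- P1 / P0 - 1                           # asset returns at time t + 1
mktcap.ret <- sum(Q * P1) / sum(Q * P0) - 1   # proportional change in total market cap
c(sum(w.c), mktcap.ret, sum(w.c * R1))
```

The weights sum to one, and the proportional change in the market cap equals the cap-weighted average of the individual returns, as in the identity given above.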
The return on the index at time t + 1 may be written
$$\frac{I_{t+1}}{I_t} - 1 = \sum_{j=1}^{N} w^c_{j,t} R_{j,t+1},$$
and the matrix indices contains all four of the indices considered; the first
few rows of this matrix are given as follows:
> head(indices)
SP500 R1000 R3000 W5000
[1,] -0.0370 -0.0370 -0.0370 -0.0344
[2,] 0.0284 0.0305 0.0316 0.0323
[3,] 0.0587 0.0597 0.0613 0.0615
[4,] 0.0146 0.0174 0.0205 0.0207
[5,] -0.0821 -0.0815 -0.0811 -0.0812
[6,] -0.0540 -0.0573 -0.0591 -0.0562
It is clear from these few values that the returns on these four indices
are generally, but not always, similar. Therefore, it is not surprising that the
means and standard deviations of the returns on the four indices are generally
close.
> cor(indices)
SP500 R1000 R3000 W5000
SP500 1.000 0.999 0.997 0.996
R1000 0.999 1.000 0.999 0.999
R3000 0.997 0.999 1.000 1.000
W5000 0.996 0.999 1.000 1.000
Thus, the smallest correlation among the four indices is 0.996, between the
S&P 500 index and the Wilshire 5000 index; this is not surprising given that,
of the four indices, the Wilshire 5000 represents the most stocks while the
S&P 500 represents the fewest.
It is worth noting that, even though the return means and standard devia-
tions of the different indices are in close agreement and the indices are highly
correlated, often there is considerable variation in the returns on the different
indices in a given time period. For instance, in period 4, the returns on the four
indices are 0.0146, 0.0174, 0.0205, and 0.0207, respectively. The high correla-
tions tell us that this variation among indices is small relative to the variation
within each index, a consequence of the fact that even the returns on a broad
stock market index such as the Wilshire 5000 are quite variable.
Price-Weighted Indices
Another approach to computing a market index is to simply sum the prices
P1,t , P2,t , . . . , PN,t of the stocks represented in the index. Let
$$J_t = \frac{\sum_{j=1}^{N} P_{j,t}}{D_t}$$
$$\frac{J_{t+1}}{J_t} - 1 = \sum_{j=1}^{N} \frac{P_{j,t}}{\sum_{k=1}^{N} P_{k,t}}\, R_{j,t+1}$$
so that the return on Jt is equal to the return on a portfolio with asset weights
$$w^p_{j,t} = \frac{P_{j,t}}{\sum_{k=1}^{N} P_{k,t}}, \quad j = 1, 2, \ldots, N.$$
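A short numerical sketch confirms that the return on a price-weighted index equals the return on a portfolio with weights proportional to prices. The prices and the divisor are hypothetical, and the divisor is assumed to be the same at times t and t + 1.

```r
# hypothetical prices for N = 3 stocks at times t and t + 1
P0 <- c(20, 8, 110)
P1 <- c(21, 7.6, 115.5)
D  <- 0.5                          # hypothetical divisor, assumed unchanged over the period
J0 <- sum(P0) / D                  # index value at time t
J1 <- sum(P1) / D                  # index value at time t + 1
w.p <- P0 / sum(P0)                # price weights at time t
R1  <- P1 / P0 - 1                 # stock returns at time t + 1
index.ret <- J1 / J0 - 1
c(index.ret, sum(w.p * R1))
```

Because the divisor cancels in the ratio of index values, its particular value does not affect the return as long as it does not change during the period.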
Example 8.2 Suppose that the monthly excess returns on the Dow Jones
Industrial Average for the period January 2010 to December 2014 have been
calculated and are stored in the R variable djia.
> mean(djia)
[1] 0.00950
> sd(djia)
[1] 0.0347
Thus, the mean and standard deviation of the returns on the Dow Jones
Industrial Average are close to, but slightly different than, those based on the
returns on the cap-weighted indices considered earlier. Similarly, its returns
are highly correlated with those of the cap-weighted indices, but the correla-
tions are not as large as those among the four cap-weighted indices; of course,
the Dow Jones is based on only 30 stocks, so we should not expect the same
level of agreement seen earlier.
> cor(djia, indices)
SP500 R1000 R3000 W5000
[1,] 0.977 0.971 0.968 0.967
and that
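\[ \mathrm{Cov}(\epsilon_{i,t}, R_{m,s}) = 0 \quad \text{for all } t, s = 1, 2, \ldots, T. \]
Therefore, the market model is a regression model with response variable
\[ Y_t = R_{i,t} - R_{f,t}, \quad t = 1, 2, \ldots, T, \]
predictor variable
\[ X_t = R_{m,t} - R_{f,t}, \quad t = 1, 2, \ldots, T, \]
and regression equation
\[ Y_t = \alpha + \beta X_t + \epsilon_t, \quad t = 1, 2, \ldots, T. \]
Taking expected values in the market model gives
\[ \mu_i - \mu_f = \alpha_i + \beta_i (\mu_m - \mu_f). \]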
Interpretation of βi
As in the CAPM, the parameter βi in the market model is a measure of the
relationship between the excess returns on an asset and the excess returns
on the market index and, in many respects, the interpretation of βi follows
from the interpretation of beta in the CAPM, as discussed in Chapter 7.
For instance, it may be used to decompose the variance of an asset’s returns
into market and nonmarket components; this will be discussed in detail in
Section 8.5.
An alternative interpretation of βi is as a measure of the sensitivity of an
asset’s excess returns to the excess return on the market index. However, such
an interpretation does not follow directly from the assumptions of the market
model as given in this chapter. In particular, the interpretation of βi as a
measure of sensitivity is valid only if the relationship between Ri,t − Rf,t and
Rm,t − Rf,t is a linear one.
That is, suppose that the condition that εi,t and Rm,t are uncorrelated is
strengthened to E(εi,t | Rm,t − Rf,t ) = 0; then
\[ E(R_{i,t} - R_{f,t} \mid R_{m,t} - R_{f,t} = r) = \alpha_i + \beta_i r. \]
It follows that
\[ \beta_i = \frac{d}{dr} E(R_{i,t} - R_{f,t} \mid R_{m,t} - R_{f,t} = r) \]
and βi may be interpreted as the measure of sensitivity described previously.
However, if E(εi,t | Rm,t − Rf,t ) is a nonzero function of Rm,t − Rf,t , then
\[ \frac{d}{dr} E(R_{i,t} - R_{f,t} \mid R_{m,t} - R_{f,t} = r) \]
might not be equal to βi . Fortunately, it is generally reasonable to assume
that E(εi,t | Rm,t − Rf,t ) = 0 does hold and, hence, the interpretation of βi as
a measure of sensitivity is typically appropriate.
Estimation
We now consider estimation of the parameters of the market model. As
discussed previously, the market model may be viewed as a simple linear
regression model with response variable Yt = Ri,t − Rf,t , where Ri,t is the
return on a specific asset in period t, and Rf,t is the risk-free rate in period t,
and predictor variable Xt = Rm,t − Rf,t , where Rm,t is the return of a market
index in period t.
Therefore, the parameters αi and βi may be estimated using ordinary least
squares. The formulas for the estimators are
\[ \hat{\beta}_i = \frac{\sum_{t=1}^{T} (Y_t - \bar{Y})(X_t - \bar{X})}{\sum_{t=1}^{T} (X_t - \bar{X})^2} \]
and
\[ \hat{\alpha}_i = \bar{Y} - \hat{\beta}_i \bar{X} \]
where Ȳ and X̄ are the sample means of the Yt and Xt , respectively; these
expressions are sometimes useful for studying the properties of the estimators,
but they are not needed for numerical work.
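To see that the explicit least-squares formulas and the regression function in R give the same answer, the following sketch applies both to simulated data. The data, the parameter values used to generate them, and the variable names are assumptions made only for this illustration; they are not the return series used elsewhere in this chapter.

```r
set.seed(1)                                    # simulated data, for illustration only
T.obs <- 60
X <- rnorm(T.obs, mean = 0.01, sd = 0.04)      # stand-in for market excess returns
Y <- 0.002 + 0.9*X + rnorm(T.obs, sd = 0.05)   # stand-in for asset excess returns

# Estimates computed directly from the formulas
beta.hat  <- sum((Y - mean(Y))*(X - mean(X)))/sum((X - mean(X))^2)
alpha.hat <- mean(Y) - beta.hat*mean(X)

# Estimates computed by lm
fit <- lm(Y ~ X)

c(alpha.hat, beta.hat)
coef(fit)
```

The two sets of estimates agree up to floating-point error, which is why the formulas are useful mainly for theoretical work.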
Thus, the remaining issue is selection of the data to be used in the analysis:
the market index, the risk-free asset, the return interval, and the observation
period.
As discussed in Section 8.2, the “market portfolio” is a hypothetical con-
cept; hence, in estimating the parameters of the market model, we use a
market index chosen to measure the general behavior of the equity market.
The most commonly used index in this context is the S&P 500 index. Although
it includes only 500 stocks, the return on the S&P 500 is generally believed
to reflect the return on the entire market. There are a number of broader
indices that can be used such as the Russell 3000 index and the Wilshire 5000
index. As shown in Example 8.1, the S&P 500 index, the Russell 1000 index,
the Russell 3000 index, and the Wilshire 5000 index are generally highly cor-
related with each other; hence, the choice from among these indices has a
relatively small impact on the estimates. Here we will use the return on the
S&P 500 index as the return on the market portfolio.
For the risk-free rate to use in the analysis, we will use the return on
a 3-month Treasury Bill, as discussed in Example 6.1. These are generally
reported as annual percentage rates, which must be converted to proportional
monthly rates. Let Rfa be an annual percentage rate; recall that this may be
converted to a monthly rate by
\[ R_f = (1 + R_{fa}/100)^{1/12} - 1. \]
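The conversion can be written as a one-line R function. This is a minimal sketch; the function name f.monthly and the example rates are hypothetical and are not taken from the Treasury Bill data.

```r
# Convert an annual percentage rate to a proportional monthly rate
# (f.monthly is a hypothetical helper name)
f.monthly <- function(rfa) {(1 + rfa/100)^(1/12) - 1}

# Example: annual rates of 0.05%, 1%, and 3% (illustrative values)
f.monthly(c(0.05, 1, 3))

# Compounding the monthly rate over 12 months recovers the annual rate
(1 + f.monthly(3))^12 - 1
```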
For the return interval, we could use daily, weekly, monthly, quarterly, or
yearly returns. The return interval should reflect the investment horizon of
interest. For instance, if investment decisions are made on a monthly basis,
it generally makes sense to use monthly returns. Here we will use monthly
returns.
The observation period refers to the number of return intervals to use in
the analysis; for example, for monthly data, we need to choose how many
Example 8.3 Consider the monthly excess returns on IBM stock, which we
assume have been calculated and stored in the variable ibm; as discussed in
Example 8.1, the excess returns on the S&P 500 index are stored in the vari-
able sp500. As with any statistical analysis, before estimating the parameters
of the linear regression model relating ibm to sp500, it is a good idea to plot
the data; such a plot is given in Figure 8.1. The plot indicates an approximately
linear relationship between the variables that would be accurately described by
the market model.
To estimate the parameters of the market model in R, we use the function
lm. The syntax of the command to estimate the market model relating returns
on IBM stock to returns on the S&P 500 index, as contained in the variables
ibm and sp500, respectively, is
> lm(ibm~sp500)
FIGURE 8.1
Plot of IBM monthly returns versus the returns on the S&P 500 index.
(Vertical axis: return on IBM stock.)
Example 8.4 Recall that, for IBM stock, the results of the lm function
include the following:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.000707 0.005358 -0.13 0.9
sp500 0.618789 0.138073 4.48 3.5e-05 ***
The p-value for testing α = 0 is 0.9; therefore, we do not reject the hypothesis
that the IBM stock is priced correctly.
To calculate the p-values for testing that α = 0 for each of the stocks with
returns included in the big8 variable, we may use the apply function. Note
that the [1, 4] element of the $coefficients component of the output from
the lm function is the p-value for testing α = 0:
> summary(lm(ibm~sp500))$coefficients[1, 4]
[1] 0.8955
> f.alphapval<-function(y)
+ {summary(lm(y~sp500))$coefficients[1, 4]}
> f.alphapval(ibm)
[1] 0.8955
The p-values for the eight stocks may then be calculated using apply
Therefore, for these eight stocks, the hypothesis that the stock is priced cor-
rectly is never rejected at the 0.05 level. Two stocks have a p-value less than
0.10—Apple, which has α̂ = 0.0154 and a p-value of 0.088, and CVS, which
has α̂ = 0.00956 and a p-value of 0.095.
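The pattern of wrapping a regression in a small function and applying it to each column of a return matrix can be sketched as follows. The data here are simulated and the names mkt, stocks, and f.alphapval.sim are hypothetical; the sketch is meant to show the mechanics of the apply call, not to reproduce the results for the eight stocks.

```r
set.seed(2)                                  # simulated data, for illustration only
n <- 60
mkt <- rnorm(n, 0.01, 0.04)                  # stand-in for the market excess returns
betas <- c(0.6, 0.9, 1.2)
# Each column is one simulated asset generated with alpha = 0
stocks <- sapply(betas, function(b) b*mkt + rnorm(n, sd = 0.05))
colnames(stocks) <- c("A", "B", "C")         # hypothetical ticker symbols

# p-value for testing alpha = 0: the [1, 4] element of the coefficient table
f.alphapval.sim <- function(y) {summary(lm(y ~ mkt))$coefficients[1, 4]}

# MARGIN = 2 applies the function to each column of the matrix
alpha.pv <- apply(stocks, 2, f.alphapval.sim)
alpha.pv
```

The result is a named vector with one p-value per column, which can then be compared to the chosen significance level.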
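\[ P(A_1 \cup A_2 \cup \cdots \cup A_m) \le \sum_{j=1}^{m} P(A_j), \]
so that, taking Aj to be the event that qj ≤ c∗ ,
\[ P(\{q_1 \le c^*\} \cup \{q_2 \le c^*\} \cup \cdots \cup \{q_m \le c^*\}) \le \sum_{j=1}^{m} P(q_j \le c^*). \qquad (8.5) \]
Using the fact that a p-value has a uniform distribution under the null
hypothesis, P(qj ≤ c∗ ) = c∗ . It follows that
\[ P(\{q_1 \le c^*\} \cup \{q_2 \le c^*\} \cup \cdots \cup \{q_m \le c^*\}) \le m c^*. \]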
Therefore, to guarantee that our test has a level less than or equal to 0.05,
we can choose c∗ = 0.05/m. Then the probability of concluding that any of
the assets is mispriced when all are priced correctly is less than or equal to
0.05. Clearly, the same approach may be used for any desired level.
Hence, to address the multiple-testing problem, we modify the criterion for
a significant p-value from 0.05 to 0.05/m, where m is the number of hypotheses
being tested; this is known as the Bonferroni method. An equivalent approach
is to calculate “adjusted p-values,” given by mqj , j = 1, 2, . . . , m; if mqj > 1,
we set the adjusted p-value to 1. The adjusted p-values can then be evaluated
using the usual criteria; for instance, we can compare the adjusted p-values
to 0.05 for a test with level 0.05.
Example 8.5 Consider stocks for firms represented in the S&P 100 index;
stocks in the S&P 100 index are a subset of those in the S&P 500 index,
representing a cross section of large U.S. companies. For each stock, five years
of monthly returns were analyzed for the period ending December 31, 2014;
only 96 of the 100 stocks had five years of monthly returns available.
For each of these 96 stocks, the p-value of the test of αj = 0 described
earlier was calculated; the results are stored in the variable sp96.pv
> head(sp96.pv)
[1] 0.0883 0.2450 0.5338 0.9436 0.1488 0.0397
Thirteen of the p-values are less than 0.05, with the smallest at 0.0043.
> sort(sp96.pv)[1:15]
[1] 0.00426 0.00548 0.00930 0.01299 0.01801 0.01960 0.02715 0.02891 0.03139
[10] 0.03458 0.03966 0.04254 0.04811 0.05394 0.05474
For a test with level 0.05, the Bonferroni-corrected criterion is 0.05/96 =
0.00052; all of the p-values exceed this threshold. Thus, although the p-values
suggest that some of the stocks might be mispriced, after adjusting for multiple
testing, we do not reject the hypothesis that all stocks are priced correctly.
Alternatively, if we compute the adjusted p-values, by multiplying the
p-values by 96, we see that the smallest adjusted p-value is 0.41 (96 times
0.0043), leading to the same conclusion.
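The Bonferroni calculation in this example can be reproduced from the 15 smallest p-values listed above, since the remaining 81 p-values are all larger. The sketch below is an illustration under that restriction; the vector name pv15 is hypothetical. It uses the n argument of p.adjust to specify that 96 hypotheses were tested even though only 15 p-values are supplied.

```r
# The 15 smallest p-values reported above (the other 81 are all larger)
pv15 <- c(0.00426, 0.00548, 0.00930, 0.01299, 0.01801, 0.01960, 0.02715,
          0.02891, 0.03139, 0.03458, 0.03966, 0.04254, 0.04811, 0.05394,
          0.05474)
m <- 96                                   # total number of hypotheses tested

# Number of p-values below the Bonferroni-corrected criterion 0.05/m
sum(pv15 < 0.05/m)

# Adjusted p-values: by hand, and using p.adjust with n = m
adj.manual  <- pmin(1, m*pv15)
adj.builtin <- p.adjust(pv15, method = "bonferroni", n = m)
all.equal(adj.manual, adj.builtin)

# Smallest adjusted p-value
min(adj.manual)
```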
return data. In the present context, this property means that there is a ten-
dency for the procedure to conclude that all stocks are priced correctly even
when one or more is mispriced.
An alternative approach to designing tests of many hypotheses is to control
the false discovery rate (FDR) rather than to control the probability of a Type
I error. Suppose we conduct a series of tests of the hypotheses that a stock is
priced correctly, that is, of hypotheses of the form αj = 0, and that, based on
the procedure used, we conclude that m0 of the stocks are mispriced; that is,
m0 of the hypotheses that αj = 0 are rejected. Let m1 denote the number of
those rejected hypotheses for which αj is actually 0.
We refer to a rejected hypothesis as a “discovery” and an incorrectly
rejected hypothesis as a “false discovery.” In the present context, a false dis-
covery occurs if we conclude that a stock is mispriced when it is not. The false
discovery proportion is defined as m1 /m0 provided that m0 > 0; if m0 = 0, it
is taken to be 0.
Note that the false discovery proportion is a random variable; the FDR is
the expected value of this random variable. Therefore, the FDR is the expected
proportion of rejected null hypotheses that were rejected incorrectly.
It is important to note that although the level of a test and its FDR
are related, they are fundamentally different measures. The level of a test of
α1 = α2 = · · · = αm = 0 is the probability of rejecting this hypothesis, that is,
of concluding that at least one αj is nonzero when all are actually 0. The FDR
measures the expected proportion of those cases in which αj = 0 is rejected for
which αj is actually 0. Hence, procedures that control the FDR do not control
the level of the test. However, the FDR is an intuitively appealing concept
in many applications, such as stock screening; furthermore, the procedures
that control the FDR have higher power than those based on the Bonferroni
correction, so that we are more likely to discover mispriced stocks.
Let qj denote the p-value of the usual test of αj = 0, j = 1, 2, . . . , m.
To control the FDR at F , instead of comparing each qj to a given threshold
value, as in the Bonferroni method, we use the following procedure. First, order
the p-values and let q(1) , q(2) , . . . , q(m) denote the ordered values, so that q(1) is
the smallest p-value, q(2) is the second smallest, and so on. Then, starting with
j = 1 and moving through the list of p-values, we compare q(j) to (j/m)F .
If q(j) > (j/m)F for all j = 1, 2, . . . , m, then we do not reject any of the
hypotheses. Otherwise, find the largest j for which q(j) ≤ (j/m)F ; denote this
value by j ∗ . Then we reject the hypotheses corresponding to q(1) , q(2) , . . . , q(j ∗ ) .
Although this procedure is a bit complicated, fortunately, there is an R func-
tion that computes the corresponding adjusted p-values that can be compared
to a given threshold in the usual way.
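The step-up procedure just described can be written out directly and compared with the adjusted p-values computed by p.adjust. The following is a minimal sketch: the function name f.fdr.reject and the vector of p-values are hypothetical, chosen only so that the two approaches can be compared on a small example.

```r
# Reject/do-not-reject decisions for the FDR procedure described above
# (f.fdr.reject is a hypothetical helper; fdr.level plays the role of F)
f.fdr.reject <- function(q, fdr.level) {
  m   <- length(q)
  ord <- order(q)                             # positions of the ordered p-values
  ok  <- which(q[ord] <= (1:m)/m*fdr.level)   # all j with q_(j) <= (j/m)F
  reject <- rep(FALSE, m)
  if (length(ok) > 0) reject[ord[1:max(ok)]] <- TRUE   # reject q_(1), ..., q_(j*)
  reject
}

# Hypothetical p-values, for illustration only
q <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216)

f.fdr.reject(q, 0.05)

# The same decisions using the adjusted p-values from p.adjust
p.adjust(q, method = "fdr") <= 0.05
```

In this example both approaches reject the hypotheses corresponding to the two smallest p-values; working with adjusted p-values has the convenience that the same vector can be compared to several different FDR thresholds.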
Although the conventional choice for the level of a test is 0.05, that is
not necessarily the best choice for the FDR. For instance, an FDR of 0.10
or larger may be reasonable. In particular, if the tests of αj = 0 are used to
screen stocks for further investigation, a threshold as large as 0.20 may be
appropriate.
Example 8.6 Consider stocks for firms represented in the S&P 100 index
analyzed in Example 8.5; consider testing αj = 0 for these stocks, controlling
the FDR at 0.10.
The p-values for testing αj = 0 for each of the 96 stocks are stored in the
variable sp96.pv. To compute the p-values adjusted for controlling the FDR,
we use the following command:
> sp96.pv.fdr<-p.adjust(sp96.pv, method="fdr")
> head(sp96.pv)
[1] 0.0883 0.2450 0.5338 0.9436 0.1488 0.0397
> head(sp96.pv.fdr)
[1] 0.403 0.523 0.733 0.971 0.468 0.340
The function p.adjust can perform a number of different adjustments; using
the argument method="fdr" specifies the adjustment to control the FDR, as
described earlier.
The minimum adjusted p-value is given by
> min(sp96.pv.fdr)
[1] 0.26
Because this value exceeds 0.10, we do not reject the hypothesis that αj = 0
for any of the 96 stocks. If the minimum adjusted p-value had not exceeded 0.10,
we would reject the hypothesis that αj = 0 for those assets with an adjusted
p-value less than or equal to 0.10.
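denote the sample variance of Rm,t − Rf,t , t = 1, 2, . . . , T , and let β̂i and σ̂ε,i²
denote the least-squares estimate of βi and the estimate of the residual
variance from the market model regression. Then
\[ S_i^2 = \hat{\beta}_i^2 S_m^2 + \frac{T-2}{T-1} \hat{\sigma}_{\epsilon,i}^2; \]
hence, except when T is very small, the relationship described in (8.6) holds
to a high degree of approximation.
The proportion of the variance of Ri,t explained by the return on the
market index can be estimated by
\[ \frac{\hat{\beta}_i^2 S_m^2}{S_i^2}, \]
which is the R-squared value of the market model regression.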
Example 8.7 Consider the market model for the returns on IBM stock; recall
that the output from the linear regression analysis corresponding to the market
model for IBM stock is stored in the variable ibm.mm. The R-squared value
for the regression is available using the function summary.
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.000707 0.005358 -0.13 0.9
sp500 0.618789 0.138073 4.48 3.5e-05 ***
> summary(ibm.mm)$r.squared
[1] 0.257
The apply function can be used to calculate the R-squared for all eight of the
stocks represented in big8. Define a function f.rsq by
> f.rsq<-function(y){summary(lm(y~sp500))$r.squared}
Then the R-squared values may be calculated by
> apply(big8, 2, f.rsq)
AAPL BAX KO CVS XOM IBM JNJ DIS
0.219 0.234 0.196 0.485 0.516 0.257 0.276 0.599
Thus, for the eight stocks with return data in the big8 variable, the R-squared
values range from 0.196 for Coca-Cola to 0.599 for Disney. Therefore, nearly
60% of the variation in the returns on Disney stock, as measured by the
return variance, can be explained by variation in the market return; on the
other hand, for Coca-Cola stock, less than 20% of the variation in the returns
can be explained by the market.
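The decomposition of the sample variance of an asset's excess returns into market and nonmarket components, and its connection to R-squared, can be verified numerically. The sketch below uses simulated data; the variable names and the parameter values used in the simulation are assumptions made for illustration only.

```r
set.seed(3)                                   # simulated data, for illustration only
n <- 60
x <- rnorm(n, 0.01, 0.04)                     # stand-in for market excess returns
y <- 0.001 + 0.8*x + rnorm(n, sd = 0.04)      # stand-in for asset excess returns

fit <- lm(y ~ x)
b   <- unname(coef(fit)[2])                   # estimate of beta
s   <- summary(fit)$sigma                     # residual standard error

# Sample variance of y versus market component plus scaled residual variance
lhs <- var(y)
rhs <- b^2*var(x) + (n - 2)/(n - 1)*s^2
all.equal(lhs, rhs)

# Estimated proportion of variance explained by the market equals R-squared
rsq <- b^2*var(x)/var(y)
all.equal(rsq, summary(fit)$r.squared)
```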
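\[ \bar{\beta} = \frac{1}{N} \sum_{i=1}^{N} \hat{\beta}_i. \]
\[ \mathrm{SE}^2 = \frac{1}{N} \sum_{i=1}^{N} \mathrm{SE}(\hat{\beta}_i)^2. \]
\[ \psi \bar{\beta} + (1 - \psi) \hat{\beta}_i \]
where
\[ \psi = \frac{\mathrm{SE}^2}{\mathrm{SE}^2 + \tau_\beta^2} \]
and
\[ \tau_\beta^2 = \frac{1}{N} \sum_{i=1}^{N} (\hat{\beta}_i - \bar{\beta})^2. \]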
Example 8.8 One use for shrinkage estimation is in estimating the values of β
for a number of assets for which it is reasonable to expect similar relationships
with the market index. For example, here we consider the four airline stocks,
American Airlines Group, Inc. (symbol AAL), Delta Air Lines, Inc. (DAL),
Southwest Airlines Co. (LUV), and United Continental Holdings, Inc.
(UAL).
Five years of monthly returns for the period ending December 31, 2014,
were computed for these stocks. The results are stored in variables with
the name matching the stock symbol; for example, the returns on American
Airlines stock are stored in the variable aal. The returns for all four
stocks are stored as a matrix in the variable air, which is similar to the
variable big8. Estimates of beta for each of the four stocks may be computed as
follows:
> air.mm<-lm(air~sp500)
> air.beta<-air.mm$coefficients[2,]
> air.beta
AAL DAL LUV UAL
0.610 0.825 1.016 0.679
> beta.bar<-mean(air.beta)
> beta.bar
[1] 0.782
> tausq.beta<-mean((air.beta-beta.bar)^2)
> tausq.beta
[1] 0.0242
To compute SE(β̂i ) for each stock, we may use the apply function. Note
that the [2, 2] element of the component $coefficients of summary applied
to the output from the lm function yields the standard error of β̂.
Define a function f.betase by
\[ \psi_i = \frac{\mathrm{SE}(\hat{\beta}_i)^2}{\mathrm{SE}(\hat{\beta}_i)^2 + \tau_\beta^2}. \]
\[ \psi_i \bar{\beta} + (1 - \psi_i) \hat{\beta}_i. \]
Example 8.9 Consider the four airline stocks analyzed in Example 8.8.
Recall that the excess return data for the four stocks are stored in the vari-
ables aal, dal, luv, and ual; the data matrix for these four assets is stored in
the variable air. The variable sp500 contains the excess returns on the S&P
500 index.
Consider estimation of β for American Airlines stock. Using the results of
the lm function applied to the market model for American Airlines,
> summary(lm(aal~sp500))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0459 0.0218 2.10 0.04 *
sp500 0.6095 0.5628 1.08 0.28
To calculate the shrinkage estimates of β for all four stocks, recall that the
variable air.betase contains the standard errors of β̂i for the four stocks
> air.betase
AAL DAL LUV UAL
0.5628 0.3405 0.2400 0.3879
The vector of weights used in this procedure along with the vector of
shrinkage estimates of β for all four stocks may now be computed as follows:
> psi.air<-(air.betase^2)/(tausq.beta + air.betase^2)
> psi.air
AAL DAL LUV UAL
0.929 0.827 0.704 0.861
> psi.air*beta.bar + (1-psi.air)*air.beta
AAL DAL LUV UAL
0.770 0.790 0.851 0.768
Recall that the shrinkage estimates based on a global value for ψ are given by
AAL DAL LUV UAL
0.760 0.788 0.813 0.769
The two sets of estimates are very similar. The greatest difference occurs
for LUV; note that LUV has the largest value of β̂i and the smallest value of
the standard error of β̂i .
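Both versions of the shrinkage estimator can be recomputed from the estimates and standard errors printed in Examples 8.8 and 8.9. The sketch below types in those rounded values by hand, so the results are expected to agree with the reported ones only up to rounding in the third decimal place; the names psi.global, shrink.global, psi.i, and shrink.i are hypothetical.

```r
# Rounded values copied from the output shown in Examples 8.8 and 8.9
air.beta   <- c(AAL = 0.610, DAL = 0.825, LUV = 1.016, UAL = 0.679)
air.betase <- c(AAL = 0.5628, DAL = 0.3405, LUV = 0.2400, UAL = 0.3879)

beta.bar   <- mean(air.beta)
tausq.beta <- mean((air.beta - beta.bar)^2)

# Global weight: based on the average squared standard error
psi.global    <- mean(air.betase^2)/(mean(air.betase^2) + tausq.beta)
shrink.global <- psi.global*beta.bar + (1 - psi.global)*air.beta

# Asset-specific weights: based on each asset's own standard error
psi.i    <- air.betase^2/(air.betase^2 + tausq.beta)
shrink.i <- psi.i*beta.bar + (1 - psi.i)*air.beta

round(rbind(shrink.global, shrink.i), 3)
```

With asset-specific weights, an estimate with a small standard error, such as the one for LUV, is pulled less strongly toward the overall mean than under the global weight.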
Adjusted Beta
When estimating β for a large number of assets, a simpler type of shrinkage
estimator is sometimes used. Since the value of beta for the entire market is,
by definition, equal to 1, it is often reasonable to use 1 in place of β̄. Using
global values for the weights given to one and β̂i leads to an estimator of the
form
\[ (1 - k) + k \hat{\beta}_i \qquad (8.7) \]
for some constant k. Often k = 2/3 is used, yielding the estimator of βi
given by
\[ \hat{\beta}_{i,\mathrm{adj}} = \frac{1}{3} + \frac{2}{3} \hat{\beta}_i, \qquad (8.8) \]
which is known as adjusted beta. It is sometimes attributed to analysts at
the brokerage firm Merrill Lynch, Pierce, Fenner & Smith, Inc. (Vasicek
1973); this type of adjusted beta is used most notably by the financial data
firm Bloomberg L. P. (www.bloomberg.com) so it is sometimes referred to as
“Bloomberg adjusted beta.”
This estimator has the advantage of requiring only β̂i in order to estimate
βi . However, it has the drawbacks of always shrinking the estimates toward
1 and of always using the weights 1/3 and 2/3; these choices may not be
appropriate in all cases.
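Because the adjusted beta is a fixed linear function of the least-squares estimate, it is simple to compute. The sketch below applies it to the airline estimates printed in Example 8.8; the function name f.adjbeta is hypothetical, and the argument k is included only to show how other choices of the constant in (8.7) could be used.

```r
# Estimator of the form (1 - k) + k*beta.hat; k = 2/3 gives adjusted beta
# (f.adjbeta is a hypothetical helper name)
f.adjbeta <- function(beta.hat, k = 2/3) {(1 - k) + k*beta.hat}

# Rounded least-squares estimates copied from Example 8.8
air.beta <- c(AAL = 0.610, DAL = 0.825, LUV = 1.016, UAL = 0.679)

round(f.adjbeta(air.beta), 3)
```

Each adjusted value lies between the least-squares estimate and 1, so estimates below 1 are increased and estimates above 1 are decreased.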
Example 8.10 Stocks for firms represented in the S&P 100 index were con-
sidered; these data were also analyzed in Example 8.5. For each of the 96
stocks, five years of monthly returns were analyzed for the period ending
December 31, 2014.
For each stock, the least-squares estimate β̂i was calculated along with the
adjusted beta for stock i, β̂i,adj . To measure the accuracy in these estimates as
predictions of future beta values, they were compared with the least-squares
estimates of β based on the 12 monthly returns in 2015, which we denote by
β̂∗i , i = 1, 2, . . . , 96.
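The average absolute error in the least-squares estimates is given by
\[ \frac{1}{96} \sum_{i=1}^{96} |\hat{\beta}_i - \hat{\beta}^*_i| = 0.369; \]
the corresponding average absolute error for adjusted beta is
\[ \frac{1}{96} \sum_{i=1}^{96} |\hat{\beta}_{i,\mathrm{adj}} - \hat{\beta}^*_i| = 0.321. \]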
Thus, use of adjusted beta reduces the error in predicting the estimates of
β for 2015 by about 13%.
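\[ R_{p,t} = \sum_{i=1}^{N} w_i R_{i,t}, \quad t = 1, 2, \ldots, T. \]
Suppose that the market model holds for each asset so that, for each
i = 1, 2, . . . , N ,
\[ R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T \qquad (8.9) \]
where εi,t has mean 0 and is uncorrelated with the market return Rm,t . Then
\[ \begin{aligned}
R_{p,t} - R_{f,t} &= \sum_{i=1}^{N} w_i (R_{i,t} - R_{f,t}) \\
&= \sum_{i=1}^{N} w_i \left( \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t} \right) \\
&= \sum_{i=1}^{N} w_i \alpha_i + \left( \sum_{i=1}^{N} w_i \beta_i \right) (R_{m,t} - R_{f,t}) + \sum_{i=1}^{N} w_i \epsilon_{i,t}, \quad t = 1, 2, \ldots, T.
\end{aligned} \]
Note that
\[ E\left( \sum_{i=1}^{N} w_i \epsilon_{i,t} \right) = \sum_{i=1}^{N} w_i E(\epsilon_{i,t}) = 0 \]
and, because Cov(εi,t , Rm,t ) = 0 for all i = 1, 2, . . . , N ,
\[ \mathrm{Cov}\left( \sum_{i=1}^{N} w_i \epsilon_{i,t}, R_{m,t} \right) = \sum_{i=1}^{N} w_i \mathrm{Cov}(\epsilon_{i,t}, R_{m,t}) = 0, \quad t = 1, 2, \ldots, T. \]
It follows that the market model holds for the portfolio with parameters
\[ \alpha_p = \sum_{i=1}^{N} w_i \alpha_i \quad \text{and} \quad \beta_p = \sum_{i=1}^{N} w_i \beta_i. \]
\[ \hat{\alpha}_p = \sum_{i=1}^{N} w_i \hat{\alpha}_i \quad \text{and} \quad \hat{\beta}_p = \sum_{i=1}^{N} w_i \hat{\beta}_i. \]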
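To see why this holds, consider the N = 2 case. For j = 1, 2, let Yj,t =
Rj,t − Rf,t , and let Xt = Rm,t − Rf,t . Then, as discussed in Section 8.3, for
j = 1, 2,
\[ \hat{\beta}_j = \frac{\sum_{t=1}^{T} (Y_{j,t} - \bar{Y}_j)(X_t - \bar{X})}{\sum_{t=1}^{T} (X_t - \bar{X})^2}. \]
Similarly,
\[ \hat{\beta}_p = \frac{\sum_{t=1}^{T} (Y_{p,t} - \bar{Y}_p)(X_t - \bar{X})}{\sum_{t=1}^{T} (X_t - \bar{X})^2} \]
where Yp,t = Rp,t − Rf,t and Ȳp is the sample mean of the Yp,t . Because
\[ Y_{p,t} = w Y_{1,t} + (1 - w) Y_{2,t}, \]
it follows that
\[ \begin{aligned}
\hat{\beta}_p &= \frac{\sum_{t=1}^{T} \left( w Y_{1,t} + (1 - w) Y_{2,t} - w \bar{Y}_1 - (1 - w) \bar{Y}_2 \right)(X_t - \bar{X})}{\sum_{t=1}^{T} (X_t - \bar{X})^2} \\
&= w \frac{\sum_{t=1}^{T} (Y_{1,t} - \bar{Y}_1)(X_t - \bar{X})}{\sum_{t=1}^{T} (X_t - \bar{X})^2} + (1 - w) \frac{\sum_{t=1}^{T} (Y_{2,t} - \bar{Y}_2)(X_t - \bar{X})}{\sum_{t=1}^{T} (X_t - \bar{X})^2} \\
&= w \hat{\beta}_1 + (1 - w) \hat{\beta}_2.
\end{aligned} \]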
Example 8.11 Consider the eight stocks with returns stored in the vari-
able big8. Recall that the results from the market model regression on all
eight stocks are stored in the variable big8.mm. The estimated regression
coefficients from the eight regression analyses are available in the component
coefficients of big8.mm:
> big8.mm$coefficients
AAPL BAX KO CVS XOM IBM JNJ
(Intercept) 0.015 -0.00035 0.0045 0.0096 -0.0013 -0.00071 0.0056
sp500 0.920 0.71644 0.4857 1.0719 0.8787 0.61879 0.5405
DIS
(Intercept) 0.0078
sp500 1.1932
These estimates form a 2 × 8 matrix; therefore, the second row of this
matrix contains the eight estimates of β:
> big8.mm$coefficients[2,]
AAPL BAX KO CVS XOM IBM JNJ DIS
0.9203 0.7164 0.4857 1.0719 0.8787 0.6188 0.5405 1.1932
Consider the equally weighted portfolio of the eight stocks; the returns on
this portfolio may be calculated as
> big8.port<-apply(big8, 1, mean)
yielding a vector consisting of the average return in each time period.
Alternatively, we can perform the calculation using matrix multiplication
> big8.port<-big8%*%rep(1/8, 8)
Here, rep(1/8, 8) is a vector of length 8 of the form (1/8, 1/8, . . . , 1/8).
The estimates of αp and βp may be calculated directly from the returns on
the portfolio.
> lm(big8.port~sp500)
Coefficients:
(Intercept) sp500
0.00506 0.80320
These estimates can also be obtained as the averages of the coefficient
estimates from the analyses on the eight individual stocks.
> apply(big8.mm$coefficients, 1, mean)
(Intercept) sp500
0.00506 0.80320
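The same relationship holds for weights that are not equal. The following sketch checks this on simulated data with an arbitrary weight vector; the return matrix, the weights, and all variable names are assumptions made for illustration and are unrelated to the stocks in big8.

```r
set.seed(4)                                  # simulated data, for illustration only
n <- 60
mkt <- rnorm(n, 0.01, 0.04)                  # stand-in for market excess returns
assets <- cbind(0.5*mkt + rnorm(n, sd = 0.03),
                1.0*mkt + rnorm(n, sd = 0.05),
                1.4*mkt + rnorm(n, sd = 0.06))
w <- c(0.5, 0.3, 0.2)                        # hypothetical weights summing to 1

# Estimates computed directly from the portfolio returns
port      <- drop(assets %*% w)
coef.port <- coef(lm(port ~ mkt))

# Weighted averages of the estimates for the individual assets
coef.assets <- coef(lm(assets ~ mkt))        # 2 x 3 matrix of estimates
coef.wavg   <- drop(coef.assets %*% w)

all.equal(unname(coef.port), unname(coef.wavg))
```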
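\[ \sum_{i=1}^{N} w_{i,t} = 1 \quad \text{for } t = 1, 2, \ldots, T. \]
If the market model holds for each asset, that is, if (8.9) holds, then, for each
t = 1, 2, . . . , T ,
\[ \begin{aligned}
R_{p,t} - R_{f,t} &= \sum_{i=1}^{N} w_{i,t} (R_{i,t} - R_{f,t}) \\
&= \sum_{i=1}^{N} w_{i,t} \alpha_i + \left( \sum_{i=1}^{N} w_{i,t} \beta_i \right) (R_{m,t} - R_{f,t}) + \sum_{i=1}^{N} w_{i,t} \epsilon_{i,t} \\
&= \alpha_{p,t} + \beta_{p,t} (R_{m,t} - R_{f,t}) + \epsilon_{p,t}
\end{aligned} \]
where
\[ \alpha_{p,t} = \sum_{i=1}^{N} w_{i,t} \alpha_i, \quad \beta_{p,t} = \sum_{i=1}^{N} w_{i,t} \beta_i, \quad \text{and} \quad \epsilon_{p,t} = \sum_{i=1}^{N} w_{i,t} \epsilon_{i,t}. \]
Note that the conditions that E(εp,t ) = 0 and Cov(εp,t , Rm,t ) = 0 for each
t = 1, 2, . . . , T continue to hold so that the market model holds for the portfolio
in each specific time period; however, the values of α and β for the portfolio
now depend on t.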
If the weights are approximately constant over time, for example, if the
portfolio corresponds to a specific investment strategy with periodic minor
adjustments, then it may be reasonable to assume that αp,t and βp,t are
approximately constant over time and, hence, the market model is appropriate
for the portfolio. On the other hand, if major changes are made regularly to
the portfolio so that, in effect, there is a different portfolio in each time period,
then the market model assumption of constant αp and βp is inappropriate.
and
\[ R_{j,t} - R_{f,t} = \alpha_j + \beta_j (R_{m,t} - R_{f,t}) + \epsilon_{j,t}, \quad t = 1, 2, \ldots, T. \]
Now consider a portfolio of assets i and j, with return of the form
Because w(1 − w) > 0 for 0 < w < 1 and ρij < 1, it follows that σp² < σj² for
any 0 < w < 1; see Section 4.2 for further details. That is, for any 0 < w < 1,
the risk of the portfolio is less than that of either of the two assets used to
form it. However, diversification has very different effects on the market and
nonmarket components of risk.
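As discussed in the previous section, the market model holds for the
portfolio:
\[ R_{p,t} - R_{f,t} = \alpha_p + \beta_p (R_{m,t} - R_{f,t}) + \epsilon_{p,t} \]
where
\[ \alpha_p = w \alpha_i + (1 - w) \alpha_j \quad \text{and} \quad \beta_p = w \beta_i + (1 - w) \beta_j. \]
Then the market component of the variance of the portfolio return is given
by βp² σm².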
Suppose that, without loss of generality, βi ≤ βj . Then
βi ≤ βp ≤ βj ,
so that the market component of variance for the portfolio return lies between
the market components of return variance for the two assets. Thus, the market
component of return variance for the portfolio cannot be reduced below the
smaller of the market components of return variance for the two assets.
In particular, if βi = βj then βp = βi and, hence, the market component of
the variance of Rp,t is βi² σm², the same as the market components of variance
for returns on each of assets i and j. That is, in this case, diversification
does not reduce the market component of risk. Therefore, the reduction in
the variance of the portfolio return, as compared to the return variances of
and for each asset, the nonmarket component of the return variance is
0.25 − 0.1024 = 0.1476.
The equally weighted portfolio has βp = 0.8; hence, its market component
of return variance is also 0.1024. It follows that the nonmarket component of
the return variance for the portfolio is
0.15 − 0.1024 = 0.0476.
Thus, each asset has a return variance of 0.25, with a market component of
0.1024. The return variance of the portfolio is 0.15, but the market component
of that variance is the same as the market component of the return variance
for the two assets, 0.1024. The nonmarket component of return variance for
each of the two assets is 0.1476, while the nonmarket component of return
variance for the portfolio is only 0.0476.
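The arithmetic in this example can be retraced in a few lines. The sketch below starts from the figures quoted in the text (return variance 0.25 for each asset, beta 0.8, market component 0.1024, and portfolio return variance 0.15) and backs out two quantities that are implied by them for an equally weighted portfolio of two assets with equal variances: the market return variance and the correlation between the two asset returns. Those implied values are derived here for illustration and are not quoted from the example; all variable names are hypothetical.

```r
# Figures quoted in the example
total.var   <- 0.25      # return variance of each asset
beta.i      <- 0.8       # beta of each asset (and of the portfolio)
market.comp <- 0.1024    # market component of return variance
port.var    <- 0.15      # return variance of the equally weighted portfolio

# Nonmarket components: total variance minus market component
nonmarket.asset <- total.var - market.comp
nonmarket.port  <- port.var - market.comp
c(nonmarket.asset, nonmarket.port)

# Implied market return variance, from market.comp = beta^2 * sigma_m^2
sigmasq.m <- market.comp/beta.i^2

# Implied correlation between the two asset returns, from
# port.var = 0.5 * total.var * (1 + rho) for equal weights and equal variances
rho <- port.var/(0.5*total.var) - 1
c(sigmasq.m, rho)
```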
Let
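\[ \alpha = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_N \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{pmatrix}, \quad \text{and} \quad
\epsilon_t = \begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \\ \vdots \\ \epsilon_{N,t} \end{pmatrix} \]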
Example 8.13 Consider the equally weighted portfolio of eight stocks ana-
lyzed in Example 8.11, with returns in big8. For the eight individual stocks,
the standard deviations are given by
> summary(lm(ibm~sp500))$sigma
[1] 0.0398
> f.sighat<-function(y){summary(lm(y~sp500))$sigma}
Note that
> f.sighat(ibm)
[1] 0.0398
> sd(big8.port)
[1] 0.0340
> f.sighat(big8.port)
[1] 0.0158
Note that the standard deviation of the portfolio return is about two-thirds as
large as the average return standard deviation for the eight stocks, given by
This value is similar to the market components of risk for the eight individual
stocks:
> big8.mm$coefficients[2,]*sd(sp500)
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0346 0.0269 0.0182 0.0403 0.0330 0.0232 0.0203 0.0448
Therefore, the total risk of the portfolio, 0.0340, is smaller than the
total risks of the individual stocks, which range from 0.0386 to 0.0739; this
difference is attributable primarily to a decrease in the nonmarket risk.
Because the total variance of the portfolio return consists of the market
component, which is similar to the market components of return variance for
the individual assets in the portfolio, and the nonmarket component, which
tends to be much less than the individual nonmarket components of return
variance, the proportion of return variance explained by the market return
tends to be higher for the portfolio than for the individual assets. That is,
R-squared for the portfolio tends to be larger than R-squared for the individual
stocks.
Example 8.14 Consider the returns for the eight stocks stored in the variable
big8 and analyzed in the previous example. Recall that the R-squared values
for these stocks are given by
> apply(big8, 2, f.rsq)
AAPL BAX KO CVS XOM IBM JNJ DIS
0.219 0.234 0.196 0.485 0.516 0.257 0.276 0.599
The R-squared value for the equally weighted portfolio of the eight stocks
may be calculated using f.rsq as well:
> f.rsq(big8.port)
[1] 0.787
Therefore, R-squared for the portfolio is considerably larger than R-squared
for the individual stocks.
Note that a relatively large value of R-squared for a portfolio indicates
that it is well diversified, in the sense that most of its risk is due to its
relationship with the market portfolio which, by definition, is diversified.
R1,t , R2,t , . . . , RN,t . The single-index model and its implications for portfolio
theory are covered in detail in Chapter 9.
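Consider the equally weighted portfolio with weight vector w = (1/N )1.
Then
\[ \sigma_{\epsilon,p}^2 = \frac{1}{N^2} \mathbf{1}^T \Sigma_\epsilon \mathbf{1} = \frac{1}{N} \left( \frac{1}{N} \sum_{j=1}^{N} \sigma_{\epsilon,j}^2 \right) = \frac{1}{N} \bar{\sigma}_\epsilon^2 \]
where
\[ \bar{\sigma}_\epsilon^2 = \frac{1}{N} \sum_{j=1}^{N} \sigma_{\epsilon,j}^2 \]
is the average nonmarket component of the return variance of the assets.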
Therefore, if N is large and the residual returns for different assets are uncor-
related, then the nonmarket component of return variance for an equally
weighted portfolio tends to be small.
It is important to note that such a conclusion depends on the assumption
that Σε is a diagonal matrix, that is, on the assumption that εi,t and εj,t
are uncorrelated for i ≠ j. For a general covariance matrix Σε the minimum
possible nonmarket return variance can be found by choosing w to minimize
wT Σε w subject to wT 1 = 1; note that this is the same as finding the weight
vector for the minimum variance portfolio, but with Σε replacing Σ, the
covariance matrix of the returns.
Therefore, according to Proposition 5.3, wT Σε w is minimized by
\[ \tilde{w} = \frac{\Sigma_\epsilon^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma_\epsilon^{-1} \mathbf{1}} \]
and the minimum nonmarket component of return variance is given by
\[ \tilde{w}^T \Sigma_\epsilon \tilde{w} = \frac{1}{\mathbf{1}^T \Sigma_\epsilon^{-1} \mathbf{1}}. \]
For instance, suppose that Σε is of the form σε² Mρ where
\[ M_\rho = \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \rho \\ \rho & \cdots & \rho & 1 \end{pmatrix} \qquad (8.11) \]
and σε² > 0; see Example 5.7. Then all residual returns have standard deviation
σε and any pair of residual returns for different assets in the same time period
has correlation ρ. In Example 5.7, it was shown that the equally weighted
portfolio is the minimum variance portfolio in this case and it has variance
\[ \frac{\sigma_\epsilon^2}{N^2} \mathbf{1}^T M_\rho \mathbf{1} = \frac{N + N(N-1)\rho}{N^2} \sigma_\epsilon^2 = \left( \rho + \frac{1 - \rho}{N} \right) \sigma_\epsilon^2. \]
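The variance expression for the equally correlated case, and the fact that the equally weighted portfolio attains the minimum, can be checked numerically. The sketch below uses hypothetical values of N, ρ, and the residual variance; the variable names are likewise illustrative.

```r
# Hypothetical values, for illustration only
N <- 20
rho <- 0.3
sigmasq.eps <- 0.04

# Residual covariance matrix with equal variances and equal correlations
M <- matrix(rho, N, N)
diag(M) <- 1
Sigma.eps <- sigmasq.eps*M

# Nonmarket variance of the equally weighted portfolio, computed two ways
one  <- rep(1, N)
w.eq <- one/N
var.eq      <- drop(t(w.eq) %*% Sigma.eps %*% w.eq)
formula.val <- (rho + (1 - rho)/N)*sigmasq.eps
all.equal(var.eq, formula.val)

# Minimum variance weights: proportional to the solution of Sigma.eps w = 1
w.min <- solve(Sigma.eps, one)
w.min <- w.min/sum(w.min)
all.equal(w.min, w.eq)
```

Increasing N in this sketch moves the variance toward rho*sigmasq.eps but never below it.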
Therefore, the nonmarket component of return variance when Σε = σε² Mρ
is never less than ρσε². Thus, even for a large number of assets and a diversified
Treynor Ratio
The Sharpe ratio considers the mean excess return of a portfolio relative to the
portfolio risk, as measured by standard deviation of the returns. An important
aspect of this measure is that it uses total risk, including both the market and
the nonmarket components.
According to the CAPM, the expected excess return on an asset depends
on the market component of its risk, as measured by βp σm , where βp denotes
the value of beta for the portfolio, assumed to be positive, and σm is the
standard deviation of the return on the market portfolio. Therefore, portfolios
with large values of βp are expected to have larger returns than portfolios with
values of βp close to 0.
A version of the Sharpe ratio with total risk replaced by the market com-
ponent of risk is given by (μp − μf )/(βp σm ). Note that the market risk, as
measured by σm , is the same for each portfolio considered.
The Treynor ratio measures the excess return of a portfolio relative to its
value of beta:
\[ \mathrm{TR} = \frac{\mu_p - \mu_f}{\beta_p}. \]
Here we can interpret βp as the portfolio’s value of beta in the market model.
Therefore, the Treynor ratio rewards portfolios that have a large Sharpe ratio
while having returns that have low correlation with the returns on the market
index.
To estimate the Treynor ratio, we may use the least-squares estimator β̂p
of the parameter βp in the market model applied to the portfolio returns. Then
an estimator of the Treynor ratio is given by
\[ \widehat{\mathrm{TR}} = \frac{\bar{R}_p - \bar{R}_f}{\hat{\beta}_p}. \]
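When the data are stored as excess returns, the numerator of the estimator is simply the sample mean of the portfolio's excess returns. The following is a minimal sketch using simulated data; the function name f.treynor and the simulated series are assumptions made for illustration.

```r
set.seed(5)                                   # simulated data, for illustration only
n <- 60
mkt.ex  <- rnorm(n, 0.01, 0.04)               # stand-in for market excess returns
port.ex <- 0.002 + 1.1*mkt.ex + rnorm(n, sd = 0.02)   # stand-in for portfolio excess returns

# Estimated Treynor ratio: mean excess return divided by the estimate of beta
# (f.treynor is a hypothetical helper; y and x are excess returns)
f.treynor <- function(y, x) {mean(y)/coef(lm(y ~ x))[[2]]}

f.treynor(port.ex, mkt.ex)

# For the market index itself, beta is 1, so the ratio is its mean excess return
f.treynor(mkt.ex, mkt.ex)
mean(mkt.ex)
```

The second calculation gives a natural benchmark: a portfolio's estimated Treynor ratio can be compared with the mean excess return on the market index.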
Example 8.15 Although the methods described here may be applied to any
portfolio of assets, investors often choose to invest in mutual funds, which
are essentially professionally managed, regulated portfolios. Therefore, the
performance measures described in this chapter will be illustrated based on
mutual funds. A mutual fund combines the capital of a number of investors
for the purpose of investing it on their behalf. The investments made by a
fund form a type of portfolio and may include investments in stocks, bonds,
and cash. A given mutual fund has a specific investment strategy, as described
in its prospectus, but the exact investments generally vary over time; in that
respect, a mutual fund differs from the portfolios we have been considering.
In spite of this variation over time, we will analyze the returns on a mutual
fund as if they are the returns on a given portfolio.
We will consider the returns of four mutual funds: Vanguard U.S.
Growth Portfolio Fund (symbol VWUSX), T. Rowe Price New Horizons
Fund (PRNHX), Fidelity Select Utilities Portfolio (FSUTX), and BlackRock
Natural Resources Trust (MDGRX). These funds differ in their investment
objectives. The Vanguard U.S. Growth Portfolio Fund focuses on stocks
exhibiting long-term capital appreciation; T. Rowe Price New Horizons Fund
also looks for long-term growth but focuses on the stocks of small companies,
before they are widely recognized. The other two funds are more specialized.
Fidelity Select Utilities Portfolio invests in companies in the utilities industry
and BlackRock Natural Resources Trust invests in companies with substantial
assets in natural resources.
The value of a share in a mutual fund changes over time, like the price
of a stock, and these share values may be downloaded from Yahoo Finance,
using the same procedure we used to download stock price information. Fur-
thermore, the returns on the mutual funds are calculated in the same way.
The data matrix funds contains five years of monthly excess returns on these
four funds for the period ending December 31, 2014; thus, funds plays the
same role that big8 played in the analysis of the stock returns of eight large
companies. Note that, although these are excess returns, we will generally
refer to them simply as “returns.”
> head(funds)
VWUSX PRNHX FSUTX MDGRX
[1,] -0.06323 -0.0325 -0.0439 -0.0574
[2,] 0.03687 0.0452 -0.0120 0.0323
[3,] 0.06054 0.0830 0.0375 0.0238
[4,] 0.00576 0.0398 0.0342 0.0299
[5,] -0.08571 -0.0646 -0.0559 -0.0929
[6,] -0.05843 -0.0643 -0.0135 -0.0513
The mean and standard deviation of the fund returns may be calculated
using the apply function.
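The call is not reproduced here, so the following is only a sketch of how it might look. It assumes the names funds.mean and funds.sd, which are used later in this example, and it builds a toy version of funds from the six rows printed by head(funds), so the numbers will not match the full five-year results.

```r
# Toy stand-in for the data matrix `funds`: only the six rows shown by head(funds).
funds <- cbind(
  VWUSX = c(-0.06323, 0.03687, 0.06054, 0.00576, -0.08571, -0.05843),
  PRNHX = c(-0.0325, 0.0452, 0.0830, 0.0398, -0.0646, -0.0643),
  FSUTX = c(-0.0439, -0.0120, 0.0375, 0.0342, -0.0559, -0.0135),
  MDGRX = c(-0.0574, 0.0323, 0.0238, 0.0299, -0.0929, -0.0513)
)

# Apply mean and sd to each column (MARGIN = 2) of the return matrix.
funds.mean <- apply(funds, 2, mean)
funds.sd <- apply(funds, 2, sd)

funds.mean
funds.sd
```

With the full 60-row matrix, the same two calls would give the values used in the Treynor and Sharpe calculations in this example.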
The parameters αp and βp for the four funds can be estimated using the same
approach used to estimate the parameters of the market model for the “big8”
stocks in Example 8.3.
> funds.mm<-lm(funds~sp500)
> funds.alpha<-funds.mm$coefficients[1,]
> funds.beta<-funds.mm$coefficients[2,]
> funds.alpha
VWUSX PRNHX FSUTX MDGRX
0.000377 0.005150 0.006691 -0.010968
> funds.beta
VWUSX PRNHX FSUTX MDGRX
1.123 1.120 0.473 1.319
> funds.treynor<-funds.mean/funds.beta
> funds.treynor
VWUSX PRNHX FSUTX MDGRX
0.01124 0.01550 0.02505 0.00259
Therefore, the Utilities Portfolio has the largest estimated Treynor ratio.
Note that the estimate of beta for this fund is much lower than those of the
other funds and its estimated correlation with the return on the S&P 500
index is much lower than those of the other funds.
For comparison, we can estimate the Sharpe ratio for each fund.
> funds.sharpe<-funds.mean/funds.sd
> funds.sharpe
VWUSX PRNHX FSUTX MDGRX
0.2870 0.3665 0.3578 0.0545
Hence, the New Horizons Fund has the largest estimated Sharpe ratio and the
Natural Resources Trust has the smallest.
In interpreting these results, it is important to keep in mind that the
Treynor and Sharpe ratios calculated here are estimates of the underlying
“true” ratios corresponding to the funds considered. In particular, in evalu-
ating and comparing funds, it is important to take into account the sampling
variability of the estimates, as measured by their standard errors. Such issues
will be considered in detail in the following section.
Jensen’s Alpha
According to the CAPM, the expected return on a portfolio depends on its
value of beta:
μp = μf + βp (μm − μf ).
Thus, a portfolio with μp > μf + βp (μm − μf ) has a larger expected return
than predicted by the CAPM for its value of beta. A similar interpretation
holds if μp < μf + βp (μm − μf ).
Therefore, one way to evaluate the performance of a portfolio is to compare
its mean return to what is predicted by the CAPM. Let
αp = μp − μf − βp (μm − μf ).
The parameter αp is known as Jensen’s alpha.
When αp > 0, the mean portfolio return is greater than expected for the
portfolio’s value of β; when αp < 0, the mean return is less than expected.
Jensen’s alpha has the advantage of being easy to interpret because it is mea-
sured on the same scale as returns. Recall that αp may also be interpreted in
terms of a portfolio that is over- or underpriced, as discussed in Section 8.4.
Jensen’s alpha may be estimated by α̂p , the least-squares estimator of the
intercept in the market model regression for the portfolio.
Example 8.16 For the four funds represented in the data matrix funds, the
output from estimation of the market model is stored in the variable funds.mm
and the estimates of alpha are stored in the variable funds.alpha.
> funds.alpha
VWUSX PRNHX FSUTX MDGRX
0.000377 0.005150 0.006691 -0.010968
Hence, the U.S. Growth Portfolio, the New Horizons Fund, and the Utili-
ties Portfolio all have an estimated mean return greater than that predicted by
the CAPM, with the largest difference for the Utilities Portfolio; the estimated
mean return of the Natural Resources Trust is less than what is predicted by
the CAPM.
As discussed in the previous example, it is important to keep in mind that
the values reported here are only estimates of the true parameter values.
Appraisal Ratio
One drawback of Jensen’s alpha is that it does not explicitly incorporate
portfolio risk. A portfolio with a large value of αp , but with large risk, may
be less desirable than a less-risky portfolio that has a smaller value of αp .
The market component of return variance for a portfolio, $\beta_p^2 \sigma_m^2$, is directly
tied to its predicted mean return according to the CAPM through the param-
eter βp . Thus, in evaluating the value of αp for a portfolio, we compare it to
the portfolio’s nonmarket component of risk.
The appraisal ratio is a risk-adjusted form of Jensen’s alpha given by
$$\frac{\alpha_p}{\sigma_{\epsilon,p}},$$
where $\sigma_{\epsilon,p}$ is the error standard deviation in the market model for the portfolio.
Thus, the appraisal ratio is the difference between the expected return on
the portfolio and its predicted expected return according to the CAPM based
on the value of βp , relative to the nonmarket component of risk for the portfo-
lio. Recall that, according to the CAPM, nonmarket risk is not compensated
by a larger expected return. Thus, a portfolio with a large appraisal ratio is
apparently realizing some reward for its nonmarket risk.
There is an alternative interpretation of the appraisal ratio as the increase
in the Sharpe ratio that may be achieved by combining the portfolio under
consideration with the market portfolio; this result will be discussed in detail
in Section 9.6.
Example 8.17 Consider the four mutual funds with return data stored in
the variable funds. The estimates $\hat{\sigma}_{\epsilon,p}$ for these assets may be calculated using
the apply function with the function f.sighat defined in Example 8.13:
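The call and its output are not reproduced here. As a hedged sketch of the calculation, the code below uses simulated returns and a hypothetical stand-in for f.sighat; the actual definition from Example 8.13 is not shown in this passage, so the version here (the residual standard error from the market model regression) is an assumption.

```r
# Simulated data for illustration only; not the actual fund returns.
set.seed(42)
sp500 <- rnorm(60, mean = 0.01, sd = 0.04)
funds <- cbind(
  A = 0.002 + 1.1 * sp500 + rnorm(60, sd = 0.02),
  B = -0.001 + 0.5 * sp500 + rnorm(60, sd = 0.03)
)

# Hypothetical stand-in for f.sighat: residual standard deviation
# from the least-squares fit of the market model.
f.sighat <- function(y, x) summary(lm(y ~ x))$sigma

# Apply to each column of the return matrix, passing the market returns as x.
funds.sighat <- apply(funds, 2, f.sighat, x = sp500)

# Estimated alphas from the multi-response regression, then the appraisal ratios.
funds.alpha <- lm(funds ~ sp500)$coefficients[1, ]
funds.appraisal <- funds.alpha / funds.sighat
funds.appraisal
```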
Example 8.18 Consider the returns on the Vanguard U.S. Growth Portfolio,
which are stored in the variable vwusx. Output from estimating the market
model using the lm function includes the table
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000377 0.001687 0.22 0.82
sp500 1.122536 0.043467 25.83 <2e-16 ***
Therefore, the estimate of Jensen’s alpha for this fund is 0.000377 and
the standard error is 0.001687, leading to an approximate 95% confidence
interval of
0.000377 ± 1.96(0.001687) = (−0.00293, 0.00368).
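The same interval can be computed in R. This is a minimal sketch using the two numbers from the coefficient table; the variable names alpha.hat and se.alpha are illustrative.

```r
# Estimate and standard error of Jensen's alpha, taken from the lm output above.
alpha.hat <- 0.000377
se.alpha <- 0.001687

# Approximate 95% confidence interval based on the normal quantile (about 1.96).
ci <- alpha.hat + c(-1, 1) * qnorm(0.975) * se.alpha
round(ci, 5)
```

Because the interval contains zero, the data do not provide evidence that Jensen’s alpha for this fund differs from zero.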
Using the sampled integers as the indices of the vector prnhx5 yields a random
sample with replacement from the set of returns values in prnhx5:
> prnhx5[samp]
[1] 0.0452 -0.0325 -0.0325 -0.0646 0.0830
The sample mean of prnhx5[samp] yields a simulated value of the sample
mean return R̄j for this asset:
> mean(prnhx5[samp])
[1] -0.0003
This procedure may be repeated multiple times; for example,
> mean(prnhx5[sample(5, replace=T)])
[1] 0.0286
> mean(prnhx5[sample(5, replace=T)])
[1] 0.0142
and so on.
Note that each time mean(prnhx5[sample(5, replace=T)]) is calcu-
lated, a new set of random numbers is drawn. Suppose we perform this
procedure 1000 times, storing the sample means in the variable prnhx5.boot;
these values represent a type of random sample drawn from the distribution
of the sample mean of five returns on the New Horizons Fund. Here are the
first eight values.
> prnhx5.boot[1:8]
[1] -0.0003 0.02860 0.01420 0.05270 -0.15400 0.03400
[7] 0.01523 0.00447
The sample standard deviation of prnhx5.boot yields an estimate of the
standard error of R̄j for this fund.
> sd(prnhx5.boot)
[1] 0.0247
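The whole procedure can be sketched in a few lines of base R. The vector below uses the first five New Horizons Fund returns as printed (to four decimals) by head(funds); the seed and the names se.boot, se.usual, and se.divT are illustrative, and the simulated values depend on the random numbers drawn.

```r
# First five monthly excess returns on the New Horizons Fund (rounded values).
prnhx5 <- c(-0.0325, 0.0452, 0.0830, 0.0398, -0.0646)

set.seed(1)  # arbitrary seed, for reproducibility of this sketch only

# 1000 bootstrap replications of the sample mean of five resampled returns.
prnhx5.boot <- replicate(1000, mean(prnhx5[sample(5, replace = TRUE)]))

se.boot <- sd(prnhx5.boot)            # bootstrap standard error
se.usual <- sd(prnhx5) / sqrt(5)      # usual formula, divisor T - 1
se.divT <- se.usual * sqrt(4 / 5)     # same formula with divisor T

c(boot = se.boot, usual = se.usual, divisor.T = se.divT)
```

The bootstrap value should be closer to se.divT than to se.usual.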
Note that the bootstrap standard error is close to, but not exactly
the same as, the value given by the usual formula for the standard error
of the sample mean, 0.0272. There are two reasons for the difference. One is
that the sample standard deviation uses a divisor of T − 1; it may be shown
that the estimate of the standard deviation implicitly used by the bootstrap
method is equivalent to the one with a divisor of T . The effect of these different
divisors is highlighted in the example because in that case T = 5. In more
realistic settings, such as the analysis of five years of monthly data, T = 60
and √(60/59) = 1.0084 so that the difference is unlikely to be important.
The other reason for the difference between the bootstrap standard error
and the usual value is that the bootstrap method is based on a random sample.
If the method is repeated, a different standard error will be obtained. For
instance, the bootstrap method was repeated three times, with results
> sd(prnhx5.boot1)
[1] 0.0237
> sd(prnhx5.boot2)
[1] 0.0245
> sd(prnhx5.boot3)
[1] 0.0246
If a very large bootstrap sample size is used, we expect that the result will
be closer to that obtained by the usual method. For example, prnhx5.boot10k
contains a random sample of size 10,000 from the distribution of R̄j for the
New Horizons Fund.
> sd(prnhx5.boot10k)
[1] 0.0241
Note that √(5/4) = 1.118 so that, after accounting for the difference in
divisors (0.0241 × 1.118 = 0.0269), the result is close to the 0.0272
obtained from the usual formula.
This function takes the values in x corresponding to the indices in the vector
ind and uses those values to compute the Sharpe ratio. For example, consider
the excess return data in the vector prnhx. To use all the data, we set ind to
1:60, the vector of integers from 1 to 60.
This yields the same result as computing the Sharpe ratio directly for the data
in prnhx:
> mean(prnhx)/sd(prnhx)
[1] 0.367
If 1:60 is replaced by 1:5, the result is the Sharpe ratio based on the first
five values; recall that these are stored in the variable prnhx5.
> Sharpe(prnhx, 1:5)
[1] 0.233
> mean(prnhx5)/sd(prnhx5)
[1] 0.233
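A function consistent with the description above could be sketched as follows. The body is an assumption based on that description, and the data are the first five New Horizons Fund returns as printed by head(funds).

```r
# Sharpe ratio of the excess returns in x, restricted to the indices in ind.
# The (data, indices) argument order is the form the boot function expects.
Sharpe <- function(x, ind) {
  mean(x[ind]) / sd(x[ind])
}

# First five monthly excess returns on the New Horizons Fund (rounded values).
prnhx5 <- c(-0.0325, 0.0452, 0.0830, 0.0398, -0.0646)

Sharpe(prnhx5, 1:5)            # all five observations
mean(prnhx5) / sd(prnhx5)      # direct calculation, same value
Sharpe(prnhx5, c(1, 1, 3, 5, 5))  # a resample with replacement
```

Passing the indices as a separate argument is what allows boot to evaluate the statistic on each resampled data set.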
The other arguments to boot are data, the data used in calculating the
statistic of interest, and R, the number of bootstrap replications to be used.
Example 8.21 Consider estimation of the Sharpe ratio for the New Horizons
Fund. To calculate the standard error of the estimated Sharpe ratio for the
data in prnhx based on a bootstrap sample size of 1000, we use the command
> library(boot)
> boot(prnhx, Sharpe, 1000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
original bias std. error
t1* 0.367 0.00582 0.138
The output gives the value of the estimate, under the heading “origi-
nal”; hence, the estimated Sharpe ratio for these data is 0.367, as calculated
previously. The standard error, given under the “std. error” heading, is 0.138.
The output of the boot function includes an estimate of the bias of the
estimator; recall that the bias is the expected value of the estimator minus
the true value of the parameter.
A bias-corrected estimate may be formed by subtracting the bias from
the estimate; for example, a bias-corrected estimate of the Sharpe ratio for
the New Horizons Fund is 0.367 − 0.006 = 0.361. Whenever the bias is small
relative to the standard error, the impact of the bias correction is small and,
hence, it may be ignored. A simple rule of thumb is that the estimated bias
may be ignored when it is less than one-fourth of the standard error; of course,
such a guideline will not be appropriate in all cases.
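The quantities printed by boot can also be extracted from the returned object, which makes the bias correction and the rule of thumb easy to apply. The sketch below uses simulated returns rather than the fund data; the components t0 (the original estimate) and t (the bootstrap replicates) are part of the object returned by boot, while the other names are illustrative.

```r
library(boot)

Sharpe <- function(x, ind) mean(x[ind]) / sd(x[ind])

# Simulated monthly excess returns, for illustration only.
set.seed(123)
x <- rnorm(60, mean = 0.01, sd = 0.05)

b <- boot(x, Sharpe, R = 1000)

est <- b$t0                  # estimate from the original data
bias <- mean(b$t) - b$t0     # bootstrap estimate of bias
se <- sd(b$t)                # bootstrap standard error
corrected <- est - bias      # bias-corrected estimate

# Rule of thumb from the text: ignore the bias if it is under se / 4.
negligible <- abs(bias) < se / 4
c(estimate = est, bias = bias, se = se, corrected = corrected)
negligible
```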
The usefulness of the bootstrap method arises from the fact that it can be
applied to a wide range of statistics, by modifying the function used as the
argument to boot. For instance, it may be applied when the statistic under
consideration depends on the returns of more than one asset; this is illustrated
in the following example.
In this function, the return data are input in the matrix rmat, which is
assumed to have two columns, the first with the return data for the asset
and the second with the return data for the market index. The variable ind
contains the indices of the returns to be used in the estimation of the Treynor
ratio. The first two lines of the function extract the relevant return data using
ind and place them in two variables ret and mkt, which contain the return
data for the asset and for the market index, respectively, corresponding to
ind. The third line obtains the estimate of beta for the data in ret and mkt,
and the final line returns the estimate of the Treynor ratio.
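A function matching this four-line description might look like the sketch below. The function name Treynor is an assumption, as is the use of lm to obtain beta; the data are simulated for illustration.

```r
# Treynor ratio from a two-column matrix: asset excess returns in column 1,
# market index excess returns in column 2, using only the rows in ind.
Treynor <- function(rmat, ind) {
  ret <- rmat[ind, 1]
  mkt <- rmat[ind, 2]
  beta <- lm(ret ~ mkt)$coefficients[2]
  mean(ret) / beta
}

# Simulated excess returns, for illustration only.
set.seed(2)
mkt.sim <- rnorm(60, mean = 0.01, sd = 0.04)
ret.sim <- 0.003 + 1.1 * mkt.sim + rnorm(60, sd = 0.03)

# Using all 60 observations gives the ordinary estimate of the Treynor ratio.
Treynor(cbind(ret.sim, mkt.sim), 1:60)
```

Because the function takes the data and the indices as its two arguments, it can be passed to boot in the same way as Sharpe.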
For example, using the function with the first argument taken to be
cbind(prnhx, sp500), the matrix formed by combining prnhx and sp500
as column vectors, and taking the second argument to be the sequence of
integers 1:60, yields the estimated Treynor ratio for the New Horizons Fund.
Therefore, the standard error for the estimated Treynor ratio for the New
Horizons Fund is 0.00534; the estimated bias is very small relative to the
standard error and, hence, it may be ignored.
Comparison of Portfolios
A common goal in calculating measures of portfolio performance is to compare
portfolios. Hence, we may be interested in estimating the difference between
measures of performance for two portfolios. Estimation of such a difference is
straightforward. We may estimate the difference in performance measures by
the difference of corresponding estimates. To calculate the standard error of
such a difference, we may again use the bootstrap procedure, as implemented
in the boot function, by defining the inputs to boot appropriately. This is
illustrated in the following example.
Example 8.23 Suppose that we are interested in comparing the Sharpe
ratios of the U.S. Growth Portfolio and the New Horizons Fund. The estimated
Sharpe ratios are 0.287 for the U.S. Growth Portfolio and 0.367 for
the New Horizons Fund, suggesting that the Sharpe ratio for the New Horizons
Fund is larger. However, these are only estimates and it is of interest to
take into account the sampling variability in evaluating the difference in the
estimates.
First, consider calculation of the standard error for each of these individ-
ual estimates. The return data for the U.S. Growth Portfolio are stored in
the variable vwusx and the returns for the New Horizons Fund are stored
in the variable prnhx. Recall that Sharpe is a function that calculates the
Sharpe ratio of an asset, which can be used in the function boot.
The standard errors for the individual Sharpe ratios are given by
> boot(vwusx, Sharpe, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
original bias std. error
t1* 0.287 0.00786 0.141
> boot(prnhx, Sharpe, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
original bias std. error
t1* 0.367 0.00751 0.137
Thus, the standard error of the estimated Sharpe ratio for the U.S. Growth
Portfolio is 0.141 and the standard error of the estimated Sharpe ratio for the
New Horizons Fund is 0.137.
Now consider calculation of the standard error for the difference in two
estimated Sharpe ratios. Note that we cannot use a standard error based on the
individual standard errors because the two estimates are likely to be correlated.
Hence, we use an approach similar to that used when calculating the
standard error for the difference of means for matched-pair data.
Define a function Sharpe_diff by
> Sharpe_diff<-function(rets, ind){
+ Sharpe(rets[,1], ind)-Sharpe(rets[,2], ind)
+ }
This function takes the return data in the matrix rets, with the returns for
the first asset in column 1 and the returns for the second asset in column 2,
and computes the difference in the Sharpe ratios corresponding to the indices
in ind.
To form a matrix with columns given by the variables vwusx and prnhx, we
use the cbind function. Therefore, the standard error of the difference in Sharpe
ratios is given by
> boot(cbind(vwusx, prnhx), Sharpe_diff, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
original bias std. error
t1* -0.0795 0.00208 0.0589
Note that the estimated difference, −0.0795, agrees with the difference in
the estimated Sharpe ratios calculated in Example 8.15, 0.2870 − 0.3665. The
standard error of the difference is 0.0589 and, hence, the difference is not
statistically significant at the 5% level; that is, a 95% confidence interval
for the difference includes zero. The estimated bias is small relative to the
standard error and may be ignored.
It is worth noting that the standard error of the difference of the esti-
mates based on the standard errors of the individual estimates along with the
assumption that the estimates are uncorrelated,
$$\left[(0.141)^2 + (0.137)^2\right]^{1/2} = 0.197,$$
is much larger than the estimate given previously. This is because the method
used to obtain the value 0.197 ignores the fact that the returns on the two
funds are correlated.
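The size of this effect can be gauged by solving the variance formula for a difference, Var(A − B) = Var(A) + Var(B) − 2 Cov(A, B), for the implied correlation between the two estimated Sharpe ratios. The sketch below uses the three bootstrap standard errors reported in this example; since each is itself subject to simulation error, the result is only a rough indication.

```r
# Bootstrap standard errors reported in Example 8.23.
se.vwusx <- 0.141
se.prnhx <- 0.137
se.diff <- 0.0589

# Standard error of the difference if the two estimates were uncorrelated.
se.uncorr <- sqrt(se.vwusx^2 + se.prnhx^2)

# Correlation implied by the bootstrap standard error of the difference.
rho.implied <- (se.vwusx^2 + se.prnhx^2 - se.diff^2) / (2 * se.vwusx * se.prnhx)

round(c(se.uncorr = se.uncorr, rho.implied = rho.implied), 3)
```

An implied correlation of roughly 0.9 is consistent with both funds holding growth stocks that respond to the same market movements.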
It is important to keep in mind that the results based on the bootstrap
method are based on the random numbers generated by the boot function.
Hence, if the procedure is repeated, the results will vary. It is generally a good
idea to repeat the standard error calculation in order to assess this variation; if
it is large enough to affect the conclusions of the analysis, then the bootstrap
sample size should be increased.
Example 8.24 Consider calculation of the standard error of the difference of
the Sharpe ratios for the U.S. Growth Portfolio and the New Horizons Fund.
Recall that in Example 8.23, the standard error was found to be 0.0589.
http://www.spindices.com/documents/methodologies/methodology-index-math.pdf
Many authors define the market model in terms of regular returns instead
of excess returns; that is the case, for example, with Campbell et al. (1997)
and Fama (1976). The interpretation of β is the same in both cases, although
the value of the estimate generally changes slightly. The interpretation of the
intercept, on the other hand, depends on the type of return used; however, it
is a simple matter to translate the parameter for the model based on regular
returns to the parameter based on excess returns and vice versa.
As discussed in Section 8.3, the market model may be viewed as a sim-
ple linear regression model and, hence, standard least-squares-based methods
may be used for inference for the parameters of the model; Newbold et al.
(2013, Chapter 11) offer a good introduction to these methods and include
a discussion of their application to the market model in their Section 11.8.
Like least-squares estimators in general, the least-squares estimator of beta
is sensitive to outliers; see Martin and Simin (2003) for discussion of an
outlier-resistant estimator.
Further details on shrinkage estimation of β are given by Elton et al. (2007,
Chapter 7), Francis and Kim (2013, Section 17.4), and Vasicek (1973). Mul-
tiple testing and the Bonferroni correction are discussed by Tamhane and
Dunlop (2000, Section 6.3.9), who also present other useful information on
hypothesis tests and their properties. The method of controlling the expected
FDR is known as the Benjamini-Hochberg method. Foulkes (2009, Chapter 4)
presents a good introduction to this and other methods of multiple testing;
although the subject of this book is statistical genetics, a field in which multi-
ple testing is routinely used, it is not difficult to see how the methods described
there may also be applied to problems in financial statistics.
Modigliani and Pogue (1974) offer a detailed discussion of the implications
of the market model for understanding risk. An extension of the market model
allows beta to change over time; see Ruppert (2004, Section 7.10) for an
introduction to such models.
Good general discussions of measures of portfolio performance are available
from Elton et al. (2007, Chapter 25) and Francis and Kim (2013, Chapter 18).
The bootstrap is an important method in statistics that can be used to calcu-
late standard errors and confidence intervals for a wide range of statistics; see,
for example, Efron and Tibshirani (1993) and Davison and Hinkley (1997).
In particular, the boot package used here is attributed to Davison and Hinkley
(1997).
8.12 Exercises
1. Calculate five years of monthly excess returns on Bed Bath &
Beyond, Inc. stock (symbol BBBY) and five years of monthly excess
returns on the S&P 500 index (symbol ^GSPC) for the period end-
ing December 31, 2015. For the risk-free rate, use the return on the
three-month Treasury Bill, available on the Federal Reserve website.
Using these data, estimate the parameters α and β of the market
model for Bed Bath & Beyond stock, using the S&P 500 as the
market index; see Example 8.3. Give the standard error of each
estimate. Based on these results, does it appear that Bed Bath &
Beyond stock is priced correctly?
2. Repeat Exercise 1 using the Wilshire 5000 index (symbol ^W5000)
as the market index. Compare the results to those obtained in
Exercise 1.
3. Calculate five years of monthly excess returns for the period end-
ing December 31, 2015, for five stocks, Papa John’s International,
Inc. (symbol PZZA), Bed Bath & Beyond, Inc. (BBBY), Netflix,
Inc. (NFLX), Time Warner, Inc. (TWX), and Verizon Commu-
nications, Inc. (VZ); for the risk-free rate, use the return on the
three-month Treasury Bill, available on the Federal Reserve web-
site. Using the S&P 500 index (symbol ^GSPC) as the market index,
calculate the estimate of beta for the market model for each stock;
see Example 8.3.
Which stock is most sensitive to the market? Which stock is
least sensitive?
4. Consider the return data for the five stocks given in Exercise 3 and
take the market index to be the S&P 500 index.
a. Using the Bonferroni method, test the hypothesis that all five
stocks are priced correctly; see Example 8.5. Using a significance
level of 0.05, what do you conclude?
b. Repeat the analysis controlling the expected FDR at the level
0.10; see Example 8.6. Do your conclusions change?
5. For each of the five stocks listed in Exercise 3, estimate $\sigma_{\epsilon,i}$, the
standard deviation corresponding to the nonmarket component of
the return variance, and give the value of $R^2$ for the market model
regression; see Example 8.7. Use the S&P 500 as the market index.
For which stock is the proportion of the return variance
explained by the market the greatest? For which stock is it the
smallest?
6. For the return data on the five stocks given in Exercise 3, use the
shrinkage method to estimate the values of beta in the market model
with the S&P 500 index as the market index. Compute the esti-
mates using a global value of the weight ψ, as in Example 8.8, and
then repeat the analysis using an asset-specific value of ψ, as in
Example 8.9.
Which shrinkage method appears to be more appropriate here?
Why?
7. For the return data on the five stocks given in Exercise 3, calculate
the value of adjusted beta for each stock. Use the S&P 500 index
as the market index.
9.1 Introduction
In the previous chapter, the market model, which relates the return on an
asset to the return on a market index, was presented. According to the market
model, the excess return on an asset may be written in terms of two random
variables, the excess return on a market index and the residual return, which
is uncorrelated with the return on the market index.
The decomposition given by the market model is useful for understanding
the properties of the expected return and risk of an asset. However, the market
model applies only to a single asset and, hence, it is not useful in explaining
the relationships among the returns on several assets, as required in portfolio
theory.
In this chapter, we consider the single-index model, which is a type of
extension of the market model to a set of N assets. In particular, the single-
index model leads to a simple model for the covariance matrix of an asset
return vector.
where Rm,t denotes the return on a market index at time t, Rf,t denotes
the risk-free rate at time t, $\epsilon_{i,t}$ is an error term that has mean zero and
is uncorrelated with Rm,t , as discussed in Section 8.3, and αi and βi are
parameters.
As we have seen, the market model is useful for understanding the rela-
tionship between the return on an asset and the return on the market, as
reflected in a market index; in particular, it gives a decomposition of the
variance of an asset’s returns into market and nonmarket components. The
single-index model uses the same approach to describe the covariance struc-
ture of a set of assets; thus, the single-index model is a model for the returns
on a set of assets.
At this point, all we have done is rewrite the market models for the N assets
using matrix notation. The single-index model goes further, by assuming that
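the covariance matrix of $\epsilon_t$ is a diagonal matrix,
$$\Sigma_\epsilon = \begin{pmatrix}
\sigma_{\epsilon,1}^2 & 0 & \cdots & 0 \\
0 & \sigma_{\epsilon,2}^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \sigma_{\epsilon,N}^2
\end{pmatrix}.$$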
That is, the single-index model is an extension of the market model to a set of
asset returns in which the market model holds for each asset and the residual
returns for different assets are uncorrelated.
We will say that the single-index model holds for Rt whenever Rt follows
the model given in (9.2) and the preceding assumptions are satisfied.
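Note that $a^T \beta R_{m,t} = (a^T \beta) R_{m,t}$ where $(a^T \beta)$ is a scalar; it follows that
$$\mathrm{Cov}(a^T \beta R_{m,t},\, b^T \beta R_{m,t}) = (a^T \beta)(b^T \beta)\sigma_m^2 = (a^T \beta)(\beta^T b)\sigma_m^2 = a^T \beta\beta^T b\,\sigma_m^2.$$
$$\mathrm{Cov}(a^T \epsilon_t,\, b^T \epsilon_t) = a^T \Sigma_\epsilon b$$
and
$$\rho_j = \frac{\beta_j \sigma_m}{\sqrt{\beta_j^2 \sigma_m^2 + \sigma_{\epsilon,j}^2}}.$$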
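Now consider $\rho_{ij}$, the correlation of $R_{i,t}$ and $R_{j,t}$, for $i \neq j$. Under the single-index
model, the covariance of $R_{i,t}$ and $R_{j,t}$ is the $(i, j)$th element of
$$\sigma_m^2 \beta\beta^T + \Sigma_\epsilon;$$
hence, because $\Sigma_\epsilon$ is diagonal,
$$\mathrm{Cov}(R_{i,t}, R_{j,t}) = \beta_i \beta_j \sigma_m^2.$$
It follows that
$$\rho_{ij} = \frac{\beta_i \beta_j \sigma_m^2}{\sqrt{\beta_i^2 \sigma_m^2 + \sigma_{\epsilon,i}^2}\,\sqrt{\beta_j^2 \sigma_m^2 + \sigma_{\epsilon,j}^2}} = \rho_i \rho_j.$$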
Thus, under the single-index model, the correlation of the returns on any
two assets is equal to the product of the correlations of each asset’s returns
with the returns on the market index.
Example 9.1 Suppose that for a set of three assets, the single-index
model holds with $\beta = (0.8, 0.5, 1.1)^T$, $\sigma_{\epsilon,1} = 0.20$, $\sigma_{\epsilon,2} = 0.25$, $\sigma_{\epsilon,3} = 0.10$, and
$\sigma_m = 0.05$. Then the covariance matrix of the assets is given by
> cov2cor(Sig1)
[,1] [,2] [,3]
[1,] 1.0000 0.0195 0.0945
[2,] 0.0195 1.0000 0.0480
[3,] 0.0945 0.0480 1.0000
Recall that, for a given asset, $\beta_i = \rho_i (\sigma_i/\sigma_m)$ where $\sigma_i^2 = \mathrm{Var}(R_{i,t})$.
Therefore, the vector of correlations (ρ1 , ρ2 , ρ3 )T of each asset’s returns with
the returns on the market index is given by
> cor_vec<-beta*(0.05/(diag(Sig1)^.5))
> cor_vec
[1] 0.1961 0.0995 0.4819
Products of the form ρi ρj for i ≠ j may be easily obtained from the off-
diagonal elements of cor_vec%*%t(cor_vec):
> cor_vec%*%t(cor_vec)
[,1] [,2] [,3]
[1,] 0.0385 0.0195 0.0945
[2,] 0.0195 0.0099 0.0480
[3,] 0.0945 0.0480 0.2322
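The construction of Sig1 is not shown in this example. Under the single-index model it can be formed directly from the stated parameters, as in the following sketch; the names sig.eps and sig.m are illustrative, while beta, Sig1, and cor_vec follow the names used above.

```r
# Parameters of the single-index model in Example 9.1.
beta <- c(0.8, 0.5, 1.1)
sig.eps <- c(0.20, 0.25, 0.10)   # residual standard deviations
sig.m <- 0.05                    # standard deviation of the market return

# Covariance matrix: sigma_m^2 * beta beta^T + diag(residual variances).
Sig1 <- sig.m^2 * beta %*% t(beta) + diag(sig.eps^2)

# Correlation matrix of the asset returns.
cor_mat <- cov2cor(Sig1)

# Correlation of each asset's returns with the market return.
cor_vec <- beta * sig.m / sqrt(diag(Sig1))

# Off-diagonal entries of the outer product match those of cor_mat.
round(cor_mat, 4)
round(cor_vec %*% t(cor_vec), 4)
```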
Partial Correlation
This property of the correlation between the returns of any two assets in a
given time period may be described in terms of their partial correlation. If, as
in the single-index model, the correlation between Ri,t and Rj,t is attributable
entirely to the fact that both assets’ returns are linearly related to the market
return, then ρij = ρi ρj . It follows that
ρij − ρi ρj
represents the correlation of Ri,t and Rj,t relative to the value of the correlation
that would be obtained if the correlation between Ri,t and Rj,t were due entirely
to the relationships of the assets’ returns to the market return.
The partial correlation coefficient of Ri,t and Rj,t given Rm,t is a scaled
version of this difference:
$$\rho_{ij \cdot m} = \frac{\rho_{ij} - \rho_i \rho_j}{\sqrt{(1 - \rho_i^2)(1 - \rho_j^2)}}. \tag{9.5}$$
This partial correlation coefficient describes the extent to which Ri,t and
Rj,t are linearly related, after removing the effect of the market return on this
relationship; this is often expressed by saying that ρij·m gives the correlation
between Ri,t and Rj,t “controlling for” the market return. Like the usual
correlation coefficient, it is based on the assumption that the relationships
among Ri,t , Rj,t , and Rm,t are all linear ones. If the single-index model holds,
then ρij·m = 0 for all i, j = 1, 2, . . . , N , i ≠ j.
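As an illustration of (9.5), the sketch below computes the sample version of the partial correlation from the three pairwise sample correlations. It then checks the result against the correlation of the residuals from regressing each asset's returns on the market return, which is the sense in which the market effect is "removed". The function name pcor.formula and the simulated data are assumptions made for this illustration.

```r
# Sample partial correlation of x and y given m, following (9.5).
pcor.formula <- function(x, y, m) {
  r.xy <- cor(x, y)
  r.xm <- cor(x, m)
  r.ym <- cor(y, m)
  (r.xy - r.xm * r.ym) / sqrt((1 - r.xm^2) * (1 - r.ym^2))
}

# Simulated returns in which the two assets are related only through the market.
set.seed(9)
m <- rnorm(60, mean = 0.01, sd = 0.04)
x <- 0.8 * m + rnorm(60, sd = 0.05)
y <- 0.5 * m + rnorm(60, sd = 0.05)

pc <- pcor.formula(x, y, m)

# Equivalent calculation: correlate the residuals from the two regressions on m.
pc.resid <- cor(resid(lm(x ~ m)), resid(lm(y ~ m)))

c(formula = pc, residuals = pc.resid)
```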
Example 9.2 In this chapter, we apply the single-index model to the returns
on the stocks of five companies, Cablevision Systems Corp. (symbol CVC),
Edison International (EIX), Expedia, Inc. (EXPE), Humana, Inc. (HUM),
and Wal-Mart Stores, Inc. (WMT). Five years of monthly returns for the
period ending December 31, 2014, were calculated for each stock and stored
in variables with the same name as the stock symbol; for example, cvc contains
the excess returns for Cablevision.
The matrix of excess returns for all five stocks is stored in the variable
stks, which is analogous to the matrix stored in the variable big8 used in
Example 6.6, as well as other examples in Chapters 6 through 8. The variable
sp500 contains similar excess returns on the Standard & Poor’s (S&P)
500 index.
An estimate of the partial correlation coefficient is given by replacing the
correlation coefficients in (9.5) by the sample correlation coefficients. Although
such an estimate is easily calculated using results from the cor function, it
is convenient to use the pcor.test function in package ppcor, which also
includes a test of ρij·m = 0.
For instance, consider the estimated partial correlation of the returns on
Edison stock and Wal-Mart stock given the returns on the S&P 500 index.
> library(ppcor)
> cor(eix, wmt)
V1
V1 0.2769
> pcor.test(eix, wmt, sp500)
estimate p.value statistic n gp Method
1 0.1574 0.229 1.203 60 1 pearson
Therefore, the sample correlation coefficient for the returns on Edison and
Wal-Mart stock is 0.277 and the estimated partial correlation coefficient is
0.157. The test of the null hypothesis that the partial correlation coefficient
for Edison and Wal-Mart returns, controlling for the returns on the S&P 500
index, is zero has p-value 0.229. Hence, there is no evidence to reject the null
hypothesis, and it appears that the correlation between Edison and Wal-Mart
stock returns is attributable entirely to their relationships with the market
return, as measured by the return on the S&P 500 index. That is, there is no
evidence to reject the hypothesis that the single-index model holds for Edison
and Wal-Mart stock.
To calculate the partial correlation coefficients for all pairs of assets, we can
use nested loops that calculate the partial correlation and the corresponding
p-value for the returns on each pair of stocks.
> pcor<-pvalue<-matrix(0, 5, 5)
> for (i in 1:4){
+ for (j in (i+1):5){
+ res<-pcor.test(stks[,i], stks[, j], sp500)
+ pcor[i, j]<-res[1, 1]
+ pvalue[i, j]<-res[1, 2]
+ }
+ }
The matrix pcor will contain the estimated partial correlation coefficients and
the matrix pvalue will contain the associated p-values. Note, although there
are five columns in the return matrix stks, i runs from 1 to 4 and j runs
from i + 1 to 5 to avoid computing each estimate and p-value twice and to
avoid computing the partial correlation coefficient of a vector of returns with
itself.
It is easier to read the results if we add column names and row names to
the result matrices:
> rownames(pcor)<-colnames(stks)
> colnames(pcor)<-colnames(stks)
> rownames(pvalue)<-colnames(stks)
> colnames(pvalue)<-colnames(stks)
> pcor
CVC EIX EXPE HUM WMT
CVC 0 -0.0856 -0.0354 -0.0318 -0.1018
EIX 0 0.0000 -0.0396 -0.0285 0.1574
EXPE 0 0.0000 0.0000 -0.2486 0.1526
HUM 0 0.0000 0.0000 0.0000 -0.0667
WMT 0 0.0000 0.0000 0.0000 0.0000
> pvalue
CVC EIX EXPE HUM WMT
CVC 0 0.517 0.789 0.8101 0.440
EIX 0 0.000 0.765 0.8297 0.229
EXPE 0 0.000 0.000 0.0527 0.244
HUM 0 0.000 0.000 0.0000 0.614
WMT 0 0.000 0.000 0.0000 0.000
Because all p-values are relatively large, there is no evidence that the single-
index model is inappropriate.
When there are many zeros in a table, it is often convenient to replace
them by a different symbol; this is particularly true when, as in the present
case, the zeros do not provide any information—they are simply placeholders.
This can be achieved by converting the matrix to a “table” and then using
the function print, which includes an argument for the symbol used for zeros.
For example,
> print(as.table(pvalue), zero.print=".")
CVC EIX EXPE HUM WMT
CVC . 0.5167 0.7892 0.8101 0.4398
EIX . . 0.7649 0.8297 0.2290
EXPE . . . 0.0527 0.2436
HUM . . . . 0.6138
WMT . . . . .
Using nested loops in this way generally works well when analyzing a small
or moderate number of assets. However, when analyzing a large number of
assets, it may be preferable to use one of the vector-based functions available
in R, such as outer, which is often more efficient in such cases.
Note that when testing a large number of hypotheses, as in the previous
example, it is important to be aware of the multiple testing issue, as discussed
in Section 8.4. That is, when interpreting the p-values for tests of ρij·m = 0 for
a large number of stocks, we expect a few small p-values even if the single-index
model holds for all stocks. As discussed in Section 8.4, the Bonferroni method
may be used in such cases to calculate an adjusted p-value that is valid for test-
ing the hypothesis that the single-index model holds for all stocks considered.
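As a rough illustration of the Bonferroni adjustment mentioned above, the following sketch applies the base R function p.adjust to the ten p-values printed in the table earlier in this section. The vector p and its element names are introduced here only for illustration.

```r
# Upper-triangle p-values copied from the printed pvalue table (10 pairs of stocks)
p <- c(CVC.EIX = 0.5167, CVC.EXPE = 0.7892, CVC.HUM = 0.8101, CVC.WMT = 0.4398,
       EIX.EXPE = 0.7649, EIX.HUM = 0.8297, EIX.WMT = 0.2290,
       EXPE.HUM = 0.0527, EXPE.WMT = 0.2436, HUM.WMT = 0.6138)

# Bonferroni: multiply each p-value by the number of tests, capping at 1
p.bonf <- p.adjust(p, method = "bonferroni")
round(p.bonf, 3)

# The smallest adjusted p-value serves as a p-value for the joint hypothesis
min(p.bonf)
```

The smallest unadjusted p-value (Expedia and Humana) is close to 0.05, but after adjustment it is far from any conventional significance level.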
9.4 Estimation
As in the previous section, consider a set of N assets, with returns
R1,t , R2,t , . . . , RN,t at time t, t = 1, 2, . . . , T , and suppose the single-index
model (9.2) holds. In this section, we consider estimation of the parameters of
the model: the vector \(\beta = (\beta_1, \beta_2, \ldots, \beta_N)^T\), the vector \(\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_N)^T\),
and the standard deviations of the residual returns, \(\sigma_{\epsilon,1}, \sigma_{\epsilon,2}, \ldots, \sigma_{\epsilon,N}\).
Note that, under the assumptions of the single-index model, the parameter
estimates for each asset may be obtained by estimating the parameters of the
market model for that asset. Hence, parameter estimation for the single-index
model uses the methods discussed in Section 8.3.
Example 9.3 Consider the stock returns for the five companies listed in
Example 9.2. Using the same procedure used in Chapter 8 for the data in big8,
the results for the market model applied to the returns in stks are stored in
the variable stks.mm. The estimated regression coefficients are therefore in the
$coefficients component of stks.mm.
> stks.mm<-lm(stks~sp500)
> stks.mm$coefficients
CVC EIX EXPE HUM WMT
(Intercept) -9.75e-05 0.00914 0.0108 0.0162 0.0059
sp500 1.07e+00 0.47407 0.8902 0.6516 0.4568
> stks.alpha<-stks.mm$coefficients[1,]
> stks.alpha
CVC EIX EXPE HUM WMT
-9.75e-05 9.14e-03 1.08e-02 1.62e-02 5.90e-03
> stks.beta<-stks.mm$coefficients[2,]
> stks.beta
CVC EIX EXPE HUM WMT
1.073 0.474 0.890 0.652 0.457
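where
\[
\Sigma_\epsilon = \begin{pmatrix}
\sigma_{\epsilon,1}^2 & 0 & \cdots & 0 \\
0 & \sigma_{\epsilon,2}^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \sigma_{\epsilon,N}^2
\end{pmatrix}.
\]
Let
\[
\hat\Sigma_\epsilon = \begin{pmatrix}
\hat\sigma_{\epsilon,1}^2 & 0 & \cdots & 0 \\
0 & \hat\sigma_{\epsilon,2}^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \hat\sigma_{\epsilon,N}^2
\end{pmatrix}.
\]
Then an estimate of \(\Sigma\) based on the single-index model is given by
\[
\hat\Sigma = S_m^2\,\hat\beta\hat\beta^T + \hat\Sigma_\epsilon \tag{9.7}
\]
where \(S_m^2\) is the sample variance of \(R_{m,1} - R_{f,1}, R_{m,2} - R_{f,2}, \ldots, R_{m,T} - R_{f,T}\).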
Example 9.4 Consider the data analyzed in Example 9.3. The matrix \(\hat\Sigma_\epsilon\)
is the diagonal matrix with diagonal elements \(\hat\sigma_{\epsilon,1}^2, \ldots, \hat\sigma_{\epsilon,5}^2\). Hence, for this
example, it may be formed from the estimates in stks.s using the diag
function, which forms a diagonal matrix from a vector of diagonal elements.
> stks.Sigeps<-diag(stks.s^2)
> rownames(stks.Sigeps)<-colnames(stks.Sigeps)<-c(labels(stks.s))
> print(as.table(stks.Sigeps), zero.print=".")
The command
> rownames(stks.Sigeps)<-colnames(stks.Sigeps)<-c(labels(stks.s))
assigns the labels from the vector stks.s to the row and column names of the
matrix stks.Sigeps. The print command is used with as.table in order to
print the zeros in the matrix as dots.
The matrix β̂β̂T may be obtained using matrix multiplication with the
vector stks.beta. Recall that the matrix multiplication operator is %*% and
t is the transpose function.
> stks.beta%*%t(stks.beta)
CVC EIX EXPE HUM WMT
[1,] 1.152 0.509 0.955 0.699 0.490
[2,] 0.509 0.225 0.422 0.309 0.217
[3,] 0.955 0.422 0.792 0.580 0.407
[4,] 0.699 0.309 0.580 0.425 0.298
[5,] 0.490 0.217 0.407 0.298 0.209
The estimate \(\hat\Sigma\) defined in (9.7) may now be obtained by combining
diag(stks.s^2) with stks.beta%*%t(stks.beta) and the sample variance
of the returns on the S&P 500 index.
> stks.Sig<-var(c(sp500))*(stks.beta%*%t(stks.beta)) +
+ diag(stks.s^2)
> stks.Sig
CVC EIX EXPE HUM WMT
[1,] 0.008388 0.000718 0.001348 0.000987 0.000692
[2,] 0.000718 0.002410 0.000595 0.000436 0.000305
[3,] 0.001348 0.000595 0.012255 0.000818 0.000574
[4,] 0.000987 0.000436 0.000818 0.005428 0.000420
[5,] 0.000692 0.000305 0.000574 0.000420 0.001979
Note that the variance of the returns on the S&P 500 index is calculated by
var(c(sp500)) rather than by var(sp500) since sp500 is a 60 × 1 matrix
rather than a vector and, hence, var(sp500) returns a 1 × 1 matrix, rather
than a scalar; c(sp500) converts sp500 to a vector.
The estimate stks.Sig may be compared to the sample covariance matrix
of the returns in stks.
> cov(stks)
CVC EIX EXPE HUM WMT
CVC 0.008273 0.000401 0.001046 0.000808 0.000354
EIX 0.000401 0.002374 0.000407 0.000347 0.000596
EXPE 0.001046 0.000407 0.012067 -0.000974 0.001223
HUM 0.000808 0.000347 -0.000974 0.005346 0.000233
WMT 0.000354 0.000596 0.001223 0.000233 0.001950
The two estimates of the standard deviation of each asset's returns should be
very close, with the difference due to the slightly different divisors used in
the estimates.
> diag(stks.Sig)^.5
[1] 0.0916 0.0491 0.1107 0.0737 0.0445
> diag(cov(stks))^.5
CVC EIX EXPE HUM WMT
0.0910 0.0487 0.1098 0.0731 0.0442
That is, the estimates of the asset return standard deviations based on the
single-index model are essentially the same as the return sample standard
deviations.
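The construction of the single-index covariance estimate, and the remark about divisors, can be sketched on simulated data. Everything below (the simulated returns, the names rets, mkt, and Sig.si, and the parameter values) is hypothetical and is not the stks data used in the text; the residual variances are computed with divisor T - 2, which is assumed to match the estimates stored in stks.s.

```r
set.seed(1)
T.obs <- 60; N <- 4
mkt <- rnorm(T.obs, 0.01, 0.04)                  # simulated market excess returns
beta.true <- c(1.1, 0.5, 0.9, 0.7)               # hypothetical betas
rets <- outer(mkt, beta.true) +                  # market component
  matrix(rnorm(T.obs * N, 0, 0.05), T.obs, N)    # independent residual returns
colnames(rets) <- paste0("A", 1:N)

fit <- lm(rets ~ mkt)                            # market model for every asset
beta.hat <- coef(fit)[2, ]
rss <- colSums(resid(fit)^2)                     # residual sum of squares per asset
s2.hat <- rss / (T.obs - 2)                      # residual variances, divisor T - 2

# Single-index estimate: S_m^2 * beta beta' + diagonal residual variances
Sig.si <- var(mkt) * (beta.hat %*% t(beta.hat)) + diag(s2.hat)
dimnames(Sig.si) <- list(colnames(rets), colnames(rets))

# The diagonal differs from the sample variances only through the divisor
diag(Sig.si) - diag(cov(rets))
rss / (T.obs - 2) - rss / (T.obs - 1)
```

The last two lines print the same numbers: for a least-squares fit with an intercept, the sample variance of each asset's returns equals the fitted market component plus the residual sum of squares divided by T - 1.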
The correlation matrix corresponding to stks.Sig is given by
> cov2cor(stks.Sig)
CVC EIX EXPE HUM WMT
[1,] 1.000 0.160 0.133 0.146 0.170
[2,] 0.160 1.000 0.110 0.120 0.140
[3,] 0.133 0.110 1.000 0.100 0.116
[4,] 0.146 0.120 0.100 1.000 0.128
[5,] 0.170 0.140 0.116 0.128 1.000
this may be compared to the sample correlation matrix of the excess returns,
> cor(stks)
CVC EIX EXPE HUM WMT
CVC 1.0000 0.0905 0.1047 0.1215 0.0881
EIX 0.0905 1.0000 0.0761 0.0973 0.2769
EXPE 0.1047 0.0761 1.0000 -0.1212 0.2522
HUM 0.1215 0.0973 -0.1212 1.0000 0.0721
WMT 0.0881 0.2769 0.2522 0.0721 1.0000
> cov2cor(stks.Sig)-cor(stks)
CVC EIX EXPE HUM WMT
[1,] 0.0000 0.0691 0.0283 0.0247 0.0817
[2,] 0.0691 0.0000 0.0334 0.0232 -0.1370
[3,] 0.0283 0.0334 0.0000 0.2216 -0.1357
[4,] 0.0247 0.0232 0.2216 0.0000 0.0560
[5,] 0.0817 -0.1370 -0.1357 0.0560 0.0000
> stks.shrink<-shrinkcovmat.equal(t(stks))$Sigmahat
> diag(stks.shrink)^.5
CVC EIX EXPE HUM WMT
0.0878 0.0571 0.1029 0.0742 0.0542
These estimates are similar to those based on the single-index model, with
some differences, particularly for EIX and WMT. The shrinkage estimate of
the correlation matrix is given by
> cov2cor(stks.shrink)
CVC EIX EXPE HUM WMT
CVC 1.0000 0.0604 0.0874 0.0936 0.0561
EIX 0.0604 1.0000 0.0524 0.0618 0.1452
EXPE 0.0874 0.0524 1.0000 -0.0963 0.1655
HUM 0.0936 0.0618 -0.0963 1.0000 0.0437
WMT 0.0561 0.1452 0.1655 0.0437 1.0000
Although the three estimates of the return correlation matrix are similar,
there are some differences. The most important is the correlation between
Humana and Expedia stocks. Note that, according to the sample covariance
matrix and the shrinkage estimate of the covariance matrix, the returns on
these stocks are negatively correlated. However, both stocks have positive esti-
mates of beta, 0.890 for Expedia and 0.652 for Humana. Therefore, according
to the single-index model, both stocks are positively correlated with the
market index and, hence, they are positively correlated with each other.
Specifically, the estimate of the correlation based on the single-index model is 0.100,
while the sample correlation is −0.121 and the shrinkage estimate is −0.096.
Because the shrinkage estimate is a weighted average of the sample covari-
ance matrix and a scaled identity matrix, it is not surprising that the shrinkage
correlation estimates are closer to zero than are the sample correlations. The
shrinkage estimates also tend to be closer to zero than are the correlation
estimates based on the single-index model.
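\[
\mu - \mu_f 1 = \alpha + \beta(\mu_m - \mu_f)
\]
and
\[
\Sigma = \sigma_m^2\,\beta\beta^T + \Sigma_\epsilon \tag{9.10}
\]
where \(\sigma_m^2 = \mathrm{Var}(R_m)\) and \(\Sigma_\epsilon\) is a diagonal matrix.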
To estimate the weight vector of a given portfolio under the single-index
model, we can simply use the single-index-model estimates of Σ and μ − μf 1
in the expression for the portfolio weights. The following example illustrates
this for the case of the tangency portfolio.
Example 9.5 Consider the example of the returns on the stocks of the five
companies discussed in the examples in this chapter. Recall that stks.Sig is
an estimate of the return covariance matrix under the assumption that the
single-index model holds and stks.alpha and stks.beta are estimates of α
and β, respectively. Therefore, an estimate of wT is given by
> wgt<-solve(stks.Sig, stks.alpha + stks.beta*mean(sp500))
> w_T_si<-wgt/sum(wgt)
> w_T_si
CVC EIX EXPE HUM WMT
0.009 0.353 0.080 0.269 0.289
Note that sp500 contains the excess returns on the S&P 500 index.
The weights calculated here may be compared to the estimated weights
calculated without assuming that the single-index model holds.
> w_1T<-solve(cov(stks), apply(stks, 2, mean))
> w_T<-w_1T/sum(w_1T)
> w_T
CVC EIX EXPE HUM WMT
0.035 0.331 0.119 0.314 0.201
The two sets of weights are similar but there are differences; this is not sur-
prising because the single-index model is an important assumption regarding
the relationship among the returns of the different stocks.
Note that the differences in the estimates are a consequence of the dif-
ferences in the estimates of Σ. Because of the properties of least-squares
estimates, the estimate of μ − μf 1 under the single-index model agrees with
the estimate based on the sample means of the excess returns:
> apply(stks, 2, mean)
CVC EIX EXPE HUM WMT
0.0116 0.0143 0.0205 0.0233 0.0109
> stks.alpha + stks.beta*mean(sp500)
CVC EIX EXPE HUM WMT
0.0116 0.0143 0.0205 0.0233 0.0109
The estimates of the tangency portfolio weights calculated earlier can also
be compared to the estimates based on a shrinkage estimate of the covari-
ance matrix. Using the shrinkage estimate based on a target matrix of the
form \(\sigma^2 I\), which is stored in the variable stks.shrink (see Example 9.4), the
estimate of tangency weight vector is given by
> w.sh1<-solve(stks.shrink, apply(stks, 2, mean))
> w.sh1/sum(w.sh1)
CVC EIX EXPE HUM WMT
0.061 0.278 0.149 0.331 0.180
The three sets of estimates are generally similar, with some differences; as
noted previously, such differences are not unexpected.
Note that many of the portfolio weight vectors we have considered depend
on the return covariance matrix \(\Sigma\) through its inverse \(\Sigma^{-1}\). Under the form
of \(\Sigma\) implied by the single-index model, it is possible to derive a relatively simple
expression for this inverse. Although such an expression is not needed for
numerical work, it is useful for understanding how such portfolio weights are
related to the parameters of the single-index model.
Hence, we now consider such an expression for \(\Sigma^{-1}\); we begin with some
useful results on matrix inverses.
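and, hence,
\[
\begin{aligned}
c\,dd^T - \frac{c}{1 + c\,d^Td}\,dd^T - \frac{c^2}{1 + c\,d^Td}\,dd^T dd^T
&= c\left(1 - \frac{1}{1 + c\,d^Td} - \frac{c\,d^Td}{1 + c\,d^Td}\right) dd^T \\
&= c\left(1 - \frac{1 + c\,d^Td}{1 + c\,d^Td}\right) dd^T = 0.
\end{aligned}
\]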
It follows that the right-hand side of (9.13) is equal to I. A similar argument
can be used to show (9.12).
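The result in Lemma 9.1 may now be used to determine the inverse of a
matrix of the form \(\sigma_m^2\beta\beta^T + \Sigma_\epsilon\).
Lemma 9.2. Let \(\Sigma_\epsilon\) be an \(N \times N\) positive-definite matrix and let \(\beta\) be an
element of \(\mathbb{R}^N\). Then, for all \(\sigma_m^2 \ge 0\),
\[
\left(\sigma_m^2\beta\beta^T + \Sigma_\epsilon\right)^{-1}
= \Sigma_\epsilon^{-1} - \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,
\Sigma_\epsilon^{-1}\beta\beta^T\Sigma_\epsilon^{-1}. \tag{9.14}
\]
Hence,
\[
(\Sigma_\epsilon + \sigma_m^2\beta\beta^T)^{-1}
= \Sigma_\epsilon^{-1/2}\left(I_N + \sigma_m^2(\Sigma_\epsilon^{-1/2}\beta)(\Sigma_\epsilon^{-1/2}\beta)^T\right)^{-1}\Sigma_\epsilon^{-1/2}. \tag{9.15}
\]
Applying Lemma 9.1 with \(c = \sigma_m^2\) and \(d = \Sigma_\epsilon^{-1/2}\beta\), it follows that
\[
\left(I_N + \sigma_m^2(\Sigma_\epsilon^{-1/2}\beta)(\Sigma_\epsilon^{-1/2}\beta)^T\right)^{-1}
= I_N - \frac{c}{1 + c\,(\Sigma_\epsilon^{-1/2}\beta)^T(\Sigma_\epsilon^{-1/2}\beta)}\,
(\Sigma_\epsilon^{-1/2}\beta)(\Sigma_\epsilon^{-1/2}\beta)^T.
\]
Using this result in (9.15) yields
\[
(\sigma_m^2\beta\beta^T + \Sigma_\epsilon)^{-1} = (\Sigma_\epsilon + \sigma_m^2\beta\beta^T)^{-1}
= \Sigma_\epsilon^{-1} - \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,
\Sigma_\epsilon^{-1}\beta\beta^T\Sigma_\epsilon^{-1} \tag{9.16}
\]
as given in the statement of the lemma.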
Note that Lemma 9.2 does not require \(\Sigma_\epsilon\) to be a diagonal matrix; however,
when it is a diagonal matrix, a simple expression for \(\Sigma_\epsilon^{-1}\) is available. Lemma
9.2 also shows that the covariance matrix \(\sigma_m^2\beta\beta^T + \Sigma_\epsilon\) is invertible provided
that \(\Sigma_\epsilon\) is invertible. The same argument shows that the estimated covariance
matrix \(S_m^2\hat\beta\hat\beta^T + \hat\Sigma_\epsilon\) is invertible under the weak condition that \(\hat\sigma_{\epsilon,j}^2 > 0\) for
\(j = 1, 2, \ldots, N\).
The explicit expression for the inverse of \(\Sigma\) in this setting can be used
to derive explicit expressions for the weight vectors of some of the optimal
portfolios we have considered in terms of \(\beta\) and \(\Sigma_\epsilon\).
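The inverse formula (9.14) is easy to check numerically. The sketch below uses arbitrary, hypothetical values for the betas, the residual variances, and the market variance, and compares the explicit formula with the inverse computed by solve.

```r
set.seed(2)
N <- 5
beta <- runif(N, 0.4, 1.2)            # hypothetical betas
s2.eps <- runif(N, 0.002, 0.010)      # hypothetical residual variances
s2.m <- 0.04^2                        # hypothetical market variance

Sig.eps.inv <- diag(1 / s2.eps)       # inverse of a diagonal matrix
q <- sum(beta^2 / s2.eps)             # beta' Sigma_eps^{-1} beta
b <- beta / s2.eps                    # Sigma_eps^{-1} beta

# Right-hand side of (9.14)
Sig.inv.formula <- Sig.eps.inv - (s2.m / (1 + s2.m * q)) * (b %*% t(b))

# Direct numerical inverse of sigma_m^2 beta beta' + Sigma_eps
Sig <- s2.m * (beta %*% t(beta)) + diag(s2.eps)
max(abs(Sig.inv.formula - solve(Sig)))
```

The largest absolute difference is at the level of rounding error, as expected.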
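\[
\mu = \mu_f 1 + \alpha + (\mu_m - \mu_f)\beta
\]
\[
\Sigma = \sigma_m^2\beta\beta^T + \Sigma_\epsilon
\]
and
\[
\begin{aligned}
\gamma &= \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f 1) \\
&= \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,\beta^T\Sigma_\epsilon^{-1}\left(\alpha + (\mu_m - \mu_f)\beta\right).
\end{aligned}
\]
Proof. For a given value of the mean vector \(\mu\), consider calculation of the
weight vector of the tangency portfolio when \(\Sigma\) is given by \(\sigma_m^2\beta\beta^T + \Sigma_\epsilon\). Using
(9.14), this weight vector is proportional to
\[
\left(\Sigma_\epsilon^{-1} - \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,
\Sigma_\epsilon^{-1}\beta\beta^T\Sigma_\epsilon^{-1}\right)(\mu - \mu_f 1). \tag{9.17}
\]
Because \(\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f 1)\) is a scalar,
\[
\frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,
\Sigma_\epsilon^{-1}\beta\left(\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f 1)\right) = \gamma\,\Sigma_\epsilon^{-1}\beta
\]
where
\[
\gamma = \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f 1)
= \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,\beta^T\Sigma_\epsilon^{-1}\left(\alpha + (\mu_m - \mu_f)\beta\right).
\]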
This result is useful for gaining some insight into the relationship
between weights of the tangency portfolio and the parameters \((\alpha_j, \beta_j, \sigma_{\epsilon,j}^2)\),
\(j = 1, 2, \ldots, N\). Note that \(v_j\), the jth element of \(v\), may be written as
\[
v_j = \alpha_j/\sigma_{\epsilon,j}^2 + \delta\,\beta_j/\sigma_{\epsilon,j}^2, \quad j = 1, 2, \ldots, N
\]
where \(\delta = \mu_m - \mu_f - \gamma\). That is, the tangency weight for asset j is
proportional to a linear function of \(\alpha_j/\sigma_{\epsilon,j}^2\) and \(\beta_j/\sigma_{\epsilon,j}^2\).
If \(\alpha = 0\), as would be the case if the capital asset pricing model (CAPM)
holds, then
\[
v = (\mu_m - \mu_f - \gamma_0)\,\Sigma_\epsilon^{-1}\beta
\]
where
\[
\gamma_0 = \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,\beta^T\Sigma_\epsilon^{-1}\beta\,(\mu_m - \mu_f).
\]
That is, \(v\) is proportional to \(\Sigma_\epsilon^{-1}\beta\) so that the weight given to asset j in
the tangency portfolio is proportional to \(\beta_j/\sigma_{\epsilon,j}^2\). Such weights may be viewed
as a simple approximation to the tangency weights that holds when the \(|\alpha_j|\)
are small. According to this approximation, asset j receives a large weight in
the tangency portfolio when \(\beta_j\) is large relative to the residual variance for
asset j.
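The relationship between the tangency weights and the elements \(v_j\) can be illustrated numerically. The parameter values below are hypothetical (loosely patterned on the estimates in this chapter, but not taken from them), and the variable names are introduced only for this sketch.

```r
# Hypothetical single-index parameters for five assets
beta   <- c(1.07, 0.47, 0.89, 0.65, 0.46)
alpha  <- c(0.000, 0.009, 0.011, 0.016, 0.006)
s2.eps <- c(0.0066, 0.0021, 0.0111, 0.0048, 0.0017)   # residual variances
s2.m   <- 0.0014                                      # market variance
mu.m.ex <- 0.011                                      # mu_m - mu_f

mu.ex <- alpha + mu.m.ex * beta                       # mu - mu_f 1
q   <- sum(beta^2 / s2.eps)                           # beta' Sigma_eps^{-1} beta
gam <- s2.m / (1 + s2.m * q) * sum(beta * mu.ex / s2.eps)
delta <- mu.m.ex - gam

# Elementwise expression: v_j = alpha_j / s2_j + delta * beta_j / s2_j
v <- alpha / s2.eps + delta * beta / s2.eps

# Direct calculation using the full covariance matrix
Sig <- s2.m * (beta %*% t(beta)) + diag(s2.eps)
w.direct <- solve(Sig, mu.ex)

cbind(formula = v / sum(v), direct = w.direct / sum(w.direct))
```

The two columns agree, so in this setting the tangency weights can be read off from the ratios \(\alpha_j/\sigma_{\epsilon,j}^2\) and \(\beta_j/\sigma_{\epsilon,j}^2\) without inverting an \(N \times N\) matrix.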
Example 9.6 For the stocks analyzed in Example 9.5, consider the
approximation to the tangency portfolio weights based on \(\beta_j/\sigma_{\epsilon,j}^2\), \(j = 1, \ldots, 5\).
Recall that the estimates of \(\beta_j\) for these stocks are stored in the variable
stks.beta and the estimates of \(\sigma_{\epsilon,j}\) are stored in the variable stks.s. Then
the approximations to the tangency portfolio weights are given by
> stks.beta/(stks.s^2)/sum(stks.beta/(stks.s^2))
CVC EIX EXPE HUM WMT
0.182 0.260 0.092 0.155 0.311
These weights may be compared to the estimated tangency portfolio
weights based on the single-index model
> w_T_si
CVC EIX EXPE HUM WMT
0.009 0.353 0.080 0.269 0.289
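\[
E(R_t) = \alpha + (\mu_m - \mu_f)\beta
\]
\[
\Sigma = \sigma_m^2\beta\beta^T + \Sigma_\epsilon,
\]
where \(\alpha_p = w_p^T\alpha\) and \(\beta_p = w_p^T\beta\) are the values of alpha and beta, respectively,
for the portfolio. The variance of the portfolio return is given by
\[
\begin{aligned}
\mathrm{Var}(R_{p,t}) &= w_p^T\left(\sigma_m^2\beta\beta^T + \Sigma_\epsilon\right)w_p \\
&= \sigma_m^2(w_p^T\beta)(\beta^Tw_p) + w_p^T\Sigma_\epsilon w_p \\
&= \sigma_m^2\beta_p^2 + w_p^T\Sigma_\epsilon w_p.
\end{aligned}
\]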
The goal in active portfolio management is to choose wp so that the resulting
portfolio outperforms the market portfolio.
The approach used here to construct such a portfolio is based on the
following idea. We may view the market portfolio as a single asset in which we
may place an investment; hence, we have, effectively, N + 1 assets from which
to form a portfolio. That is, we consider the market portfolio to be a tradeable
asset. This is, in fact, essentially the case as there are a number of mutual
funds constructed to track various market indices. We then choose the optimal
weights for these N + 1 assets. The result is a combination of the market
portfolio and the N assets under consideration. This is known as the Treynor–
Black method.
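where \(\mu_m = E(R_{m,t})\),
\[
\sigma_1^2 = \mathrm{Var}(R_{i,t}) = \beta_i^2\sigma_m^2 + \sigma_{\epsilon,i}^2
\]
where \(\sigma_m^2 = \mathrm{Var}(R_{m,t})\) and \(\sigma_{\epsilon,i}^2 = \mathrm{Var}(\epsilon_{i,t})\),
\[
\rho_{12} = \mathrm{Cov}(R_{i,t}, R_{m,t}) = \mathrm{Cov}(\beta_i R_{m,t}, R_{m,t}) = \beta_i\sigma_m^2,
\]
\(\mu_2 = \mu_m\) and \(\sigma_2 = \sigma_m\).
Using these values in (9.18), and simplifying, leads to the expression
\[
w_i^* = \frac{\alpha_i/\sigma_{\epsilon,i}^2}{(\mu_m - \mu_f)/\sigma_m^2 + (1 - \beta_i)\,\alpha_i/\sigma_{\epsilon,i}^2} \tag{9.19}
\]
for the weight given to asset 1, which in this case is asset i. The weight given
to the market portfolio is therefore
\[
1 - w_i^* = \frac{(\mu_m - \mu_f)/\sigma_m^2 - \beta_i\alpha_i/\sigma_{\epsilon,i}^2}{(\mu_m - \mu_f)/\sigma_m^2 + (1 - \beta_i)\,\alpha_i/\sigma_{\epsilon,i}^2}.
\]
We will refer to the portfolio placing weight \(w_i^*\) on asset i and weight \(1 - w_i^*\) on
the market portfolio as the Treynor–Black portfolio of asset i and the market
portfolio.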
If asset i is priced correctly relative to the market portfolio, in the sense
that \(\alpha_i = 0\), then the optimal weight to give asset i is 0; that is, we cannot
improve the market portfolio by including more or less of asset i. However,
if \(\alpha_i \ne 0\), then the market portfolio may be improved by combining it with
asset i.
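The Treynor–Black portfolio of asset i and the market portfolio has mean
excess return
\[
\begin{aligned}
w_i^*(\mu_i - \mu_f) + (1 - w_i^*)(\mu_m - \mu_f)
&= w_i^*\left(\alpha_i + \beta_i(\mu_m - \mu_f)\right) + (1 - w_i^*)(\mu_m - \mu_f) \\
&= w_i^*\alpha_i + (w_i^*\beta_i + 1 - w_i^*)(\mu_m - \mu_f) \\
&= \frac{1}{c}\left(\alpha_i^2/\sigma_{\epsilon,i}^2 + (\mu_m - \mu_f)^2/\sigma_m^2\right)
\end{aligned}
\]
where
\[
c = (\mu_m - \mu_f)/\sigma_m^2 + (1 - \beta_i)\,\alpha_i/\sigma_{\epsilon,i}^2
\]
is the denominator in the expressions for \(w_i^*\) and \(1 - w_i^*\).
The variance of the return on this portfolio is given by
\[
\begin{aligned}
&(w_i^*)^2\sigma_i^2 + (1 - w_i^*)^2\sigma_m^2 + 2w_i^*(1 - w_i^*)\,\mathrm{Cov}(R_i, R_m) \\
&\quad= (w_i^*)^2(\beta_i^2\sigma_m^2 + \sigma_{\epsilon,i}^2) + (1 - w_i^*)^2\sigma_m^2 + 2w_i^*(1 - w_i^*)\beta_i\sigma_m^2 \\
&\quad= (w_i^*)^2\sigma_{\epsilon,i}^2 + (w_i^*\beta_i + 1 - w_i^*)^2\sigma_m^2 \\
&\quad= \frac{1}{c^2}\left(\alpha_i^2/\sigma_{\epsilon,i}^2 + \left(\frac{\mu_m - \mu_f}{\sigma_m^2}\right)^2\sigma_m^2\right) \\
&\quad= \frac{1}{c^2}\left(\alpha_i^2/\sigma_{\epsilon,i}^2 + \frac{(\mu_m - \mu_f)^2}{\sigma_m^2}\right).
\end{aligned}
\]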
Note that \((\mu_m - \mu_f)^2/\sigma_m^2\) is the squared Sharpe ratio of the market portfolio.
Therefore, if \(\alpha_i = 0\), the Sharpe ratio of the Treynor–Black portfolio is
equal to the Sharpe ratio of the market portfolio. However, if \(\alpha_i \ne 0\), the
Sharpe ratio of the Treynor–Black portfolio is greater than that of the market
portfolio. The quantity \(\alpha_i/\sigma_{\epsilon,i}\) is known as the appraisal ratio of the asset;
recall that we considered the appraisal ratio as a measure of portfolio
performance in Section 8.9. In the present context, the magnitude of the appraisal
ratio indicates the possible improvement in the Sharpe ratio of the market
portfolio that may be achieved by including more or less of the asset.
Example 9.7 Suppose that \(R_m\), the return on the market portfolio, has mean
0.025 and standard deviation 0.04 and that the risk-free rate is \(\mu_f = 0.005\).
Consider an asset with return \(R_i\) with mean 0.02, standard deviation 0.08,
and suppose that the correlation of \(R_i\) and \(R_m\) is \(\rho_i = 0.30\) so that \(\beta_i = 0.60\)
and
\[
\sigma_{\epsilon,i}^2 = (0.08)^2 - (0.60)^2(0.04)^2 = 0.00582.
\]
In Example 7.6, it was shown that \(\alpha_i = 0.008\) so that asset i is
mispriced, with a price that is too low.
Therefore, we can improve on the market portfolio by constructing a
portfolio based on the market portfolio and asset i. The weight given to asset i in
such a portfolio is
\[
\begin{aligned}
\frac{\alpha_i/\sigma_{\epsilon,i}^2}{(\mu_m - \mu_f)/\sigma_m^2 + (1 - \beta_i)\,\alpha_i/\sigma_{\epsilon,i}^2}
&= \frac{(0.008)/(0.00582)}{(0.025 - 0.005)/(0.04)^2 + (1 - 0.60)(0.008)/(0.00582)} \\
&= 0.105;
\end{aligned}
\]
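The calculation in Example 9.7 can be reproduced with a few lines of R. The sketch below takes the value \(\alpha_i = 0.008\) as given in the text; the variable names are introduced here only for illustration. The last line uses the expressions for the mean and variance of the Treynor–Black portfolio derived above, whose ratio is the square root of the squared appraisal ratio plus the squared market Sharpe ratio.

```r
mu.m.ex <- 0.025 - 0.005              # market mean excess return
s2.m    <- 0.04^2                     # market return variance
beta.i  <- 0.30 * 0.08 / 0.04         # correlation * sd_i / sd_m
s2.eps  <- 0.08^2 - beta.i^2 * s2.m   # residual variance of asset i
alpha.i <- 0.008                      # value quoted in the text from Example 7.6

# Treynor-Black weight on asset i, following (9.19)
a   <- alpha.i / s2.eps
w.i <- a / (mu.m.ex / s2.m + (1 - beta.i) * a)
round(w.i, 3)

# Sharpe ratios of the market and of the Treynor-Black portfolio
sr.m  <- mu.m.ex / sqrt(s2.m)
sr.tb <- sqrt(alpha.i^2 / s2.eps + sr.m^2)
c(market = sr.m, treynor.black = sr.tb)
```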
Example 9.8 Consider the stock in Apple Inc. (symbol AAPL); here we
analyze five years of monthly returns for the period ending December 31,
2014. The variables aapl.alpha and aapl.s contain the estimates of \(\alpha_i\) and
\(\sigma_{\epsilon,i}\), respectively, for this asset. Then the estimated appraisal ratio for Apple is
> aapl.alpha/aapl.s
(Intercept)
0.233
The estimated Sharpe ratio of the S&P 500 index is 0.290. Thus, according
to (9.20), the estimated Sharpe ratio of the Treynor–Black portfolio based on
Apple stock together with the S&P 500 index is
\[
\left((0.233)^2 + (0.290)^2\right)^{1/2} = 0.372.
\]
Then the standard error of the estimated appraisal ratio for Apple may be
calculated using
> library(boot)
> boot(cbind(aapl, sp500), appraisal, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
original bias std. error
t1* 0.233 0.00571 0.141
Thus, an approximate 95% confidence interval for the true appraisal ratio
of Apple stock is
Hence, although there is some evidence to suggest that increasing the invest-
ment in Apple stock leads to a portfolio with a larger Sharpe ratio, a formal
test of the hypothesis that the true appraisal ratio is zero does not reject the
null hypothesis at the 5% level.
A similar approach may be used to calculate a standard error for the
estimated weight given to Apple stock in the Treynor–Black portfolio. The
standard error based on a bootstrap sample size of 10,000 was calculated
to be 6.84. An extremely large value such as this should be interpreted as
an indication that there is large variability in bootstrap replications of the
Treynor–Black weight.
In order to investigate this variability, the vector of bootstrap estimates of
the Treynor–Black weight may be saved to a variable using
> aapl.boot<-boot(cbind(aapl, sp500), tb.wgt, 10000)
Here tb.wgt is a user-defined function, similar to appraisal defined earlier,
that calculates the Treynor–Black weight. The bootstrap replications of the
statistic specified in the boot function are stored in the component $t of the
result, in this case aapl.boot$t.
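The definitions of appraisal and tb.wgt are not reproduced in this section. The sketch below shows one way such statistic functions might be written; the names appraisal.sketch and tb.wgt.sketch, the simulated data, and the use of the sample mean and variance of the market excess returns are all assumptions made for illustration, not the author's definitions. A simple resampling loop in base R stands in for the call to boot; each function takes the data and an index vector, which is the form boot requires.

```r
# x: two-column matrix (asset excess returns, market excess returns)
# ind: row indices selected for one bootstrap replication
appraisal.sketch <- function(x, ind) {
  fit <- lm(x[ind, 1] ~ x[ind, 2])            # market model on the resampled rows
  unname(coef(fit)[1]) / summary(fit)$sigma   # alpha-hat / residual sd
}

tb.wgt.sketch <- function(x, ind) {
  fit <- lm(x[ind, 1] ~ x[ind, 2])
  alpha <- unname(coef(fit)[1])
  beta  <- unname(coef(fit)[2])
  a <- alpha / summary(fit)$sigma^2           # alpha-hat / residual variance
  m <- mean(x[ind, 2]); v <- var(x[ind, 2])   # market mean excess return, variance
  a / (m / v + (1 - beta) * a)                # weight as in (9.19)
}

# Simulated (hypothetical) data standing in for cbind(aapl, sp500)
set.seed(1)
n <- 60
mkt <- rnorm(n, 0.01, 0.04)
asset <- 0.005 + 1.1 * mkt + rnorm(n, 0, 0.06)
x <- cbind(asset, mkt)

appraisal.sketch(x, 1:n)                      # estimate from the original sample
reps <- replicate(500, tb.wgt.sketch(x, sample(n, replace = TRUE)))
quantile(reps, c(0.05, 0.50, 0.95))           # spread of the resampled weights
```

Because the weight is a ratio whose denominator can be close to zero in some resamples, occasional extreme replications are to be expected, which is consistent with the large standard error reported in the text.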
The sample quantiles of the 10,000 values in aapl.boot$t may be
calculated using
> quantile(aapl.boot$t, prob=c(0.01, 0.05, 0.10, 0.50, 0.90,
+ 0.95, 0.99))
1% 5% 10% 50% 90% 95% 99%
-0.54860 0.00923 0.09006 0.42768 1.23355 1.77756 5.23613
These results show that there is considerable variability in these values, sug-
gesting that the Treynor–Black weight is not accurately estimated, at least
with the amount of data considered here.
It is worth noting that this same high degree of variability is observed
if other assets are analyzed in place of Apple stock. Also, recall that in
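\[
\Sigma_+^{-1}(\mu_+ - \mu_f 1_{N+1}). \tag{9.22}
\]
\[
\Sigma_+^{-1}\Sigma_+ = \Sigma_+\Sigma_+^{-1} = I_{N+1}.
\]
\[
A = \Sigma_\epsilon^{-1}(\Sigma_\epsilon + \sigma_m^2\beta\beta^T) - \sigma_m^2\Sigma_\epsilon^{-1}\beta\beta^T = I_N,
\]
\[
B = \sigma_m^2\Sigma_\epsilon^{-1}\beta - \sigma_m^2\Sigma_\epsilon^{-1}\beta = 0,
\]
and
\[
C = -\sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta + \left(1 - \sigma_m^2\beta^T(\Sigma_\epsilon + \sigma_m^2\beta\beta^T)^{-1}\beta\right)^{-1} = 1,
\]
verifying that \(\Sigma_+^{-1}\Sigma_+ = I_{N+1}\).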
We can now use the result in Lemma 9.3, along with the expression for
the weights of the tangency portfolio given in (9.22), to derive the optimal
modification to the market portfolio.
and
\[
w_m = \frac{1}{c}\left(\frac{\mu_m - \mu_f}{\sigma_m^2} - \sum_{j=1}^N \alpha_j\beta_j/\sigma_{\epsilon,j}^2\right)
\]
\[
\sum_{j=1}^N w_{0,j} + w_m = 1.
\]
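Proof. The expression for the weights of the tangency portfolio of the N + 1
assets is given in (9.22). Write
\[
\Sigma_+^{-1}(\mu_+ - \mu_f 1_{N+1}) = c\begin{pmatrix} w_0 \\ w_m \end{pmatrix}
\]
where c is a constant, chosen so that the weights sum to 1.
Using the result in Lemma 9.3, along with the fact that
\[
\mu_+ - \mu_f 1_{N+1} = \begin{pmatrix} \mu - \mu_f 1 \\ \mu_m - \mu_f \end{pmatrix},
\]
it follows that
\[
\begin{aligned}
c\,w_0 &= \Sigma_\epsilon^{-1}(\mu - \mu_f 1) - \Sigma_\epsilon^{-1}\beta(\mu_m - \mu_f) \\
&= \Sigma_\epsilon^{-1}\left(\mu - \mu_f 1 - \beta(\mu_m - \mu_f)\right).
\end{aligned}
\]
Note that
\[
\alpha = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_N \end{pmatrix} = \mu - \mu_f 1 - \beta(\mu_m - \mu_f)
\]
so that
\[
w_0 = \frac{1}{c}\,\Sigma_\epsilon^{-1}\alpha = \frac{1}{c}\begin{pmatrix} \alpha_1/\sigma_{\epsilon,1}^2 \\ \alpha_2/\sigma_{\epsilon,2}^2 \\ \vdots \\ \alpha_N/\sigma_{\epsilon,N}^2 \end{pmatrix}.
\]
The weight for the market index is given by
\[
\begin{aligned}
c\,w_m &= -\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f 1) + \frac{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}{\sigma_m^2}\,(\mu_m - \mu_f) \\
&= \frac{\mu_m - \mu_f}{\sigma_m^2} - \beta^T\Sigma_\epsilon^{-1}\left(\mu - \mu_f 1 - \beta(\mu_m - \mu_f)\right) \\
&= \frac{\mu_m - \mu_f}{\sigma_m^2} - \beta^T\Sigma_\epsilon^{-1}\alpha
= \frac{\mu_m - \mu_f}{\sigma_m^2} - \sum_{j=1}^N \alpha_j\beta_j/\sigma_{\epsilon,j}^2.
\end{aligned}
\]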
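The expressions for \(c\,w_0\) and \(c\,w_m\) obtained in the proof can be checked numerically by treating the market index as an (N + 1)st asset and computing the tangency weights directly. The parameter values below are hypothetical, and the variable names are introduced only for this sketch; the covariance between each asset and the market is taken to be \(\beta_j\sigma_m^2\), as under the single-index model.

```r
# Hypothetical single-index parameters for five assets
beta   <- c(1.07, 0.47, 0.89, 0.65, 0.46)
alpha  <- c(0.000, 0.009, 0.011, 0.016, 0.006)
s2.eps <- c(0.0066, 0.0021, 0.0111, 0.0048, 0.0017)   # residual variances
s2.m   <- 0.0014                                      # market variance
mu.m.ex <- 0.011                                      # mu_m - mu_f

# Covariance matrix and mean excess returns of the N assets plus the market
Sig <- s2.m * (beta %*% t(beta)) + diag(s2.eps)
Sig.plus <- rbind(cbind(Sig, s2.m * beta), c(s2.m * beta, s2.m))
mu.plus.ex <- c(alpha + mu.m.ex * beta, mu.m.ex)

# Direct tangency weights for the N + 1 assets
w.direct <- solve(Sig.plus, mu.plus.ex)
w.direct <- w.direct / sum(w.direct)

# Weights from the closed-form expressions in the proof
cw0 <- alpha / s2.eps
cwm <- mu.m.ex / s2.m - sum(alpha * beta / s2.eps)
w.formula <- c(cw0, cwm) / (sum(cw0) + cwm)

round(cbind(direct = w.direct, formula = w.formula), 4)
```

The last row of the output is the weight on the market index; an asset with \(\alpha_j = 0\) (the first asset here) receives weight zero.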
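\[
\bar w_{0,j} = \frac{\alpha_j/\sigma_{\epsilon,j}^2}{\sum_{k=1}^N \alpha_k/\sigma_{\epsilon,k}^2}, \quad j = 1, 2, \ldots, N,
\]
assuming that \(\sum_{j=1}^N \alpha_j/\sigma_{\epsilon,j}^2 \ne 0\). Note that \(\sum_{j=1}^N \bar w_{0,j} = 1\).
To form the Treynor–Black portfolio, the market index is given weight
\[
w_m^* = \frac{(\mu_m - \mu_f)/\sigma_m^2 - \sum_{j=1}^N \alpha_j\beta_j/\sigma_{\epsilon,j}^2}
{\sum_{j=1}^N \alpha_j/\sigma_{\epsilon,j}^2 + (\mu_m - \mu_f)/\sigma_m^2 - \sum_{j=1}^N \alpha_j\beta_j/\sigma_{\epsilon,j}^2} \tag{9.23}
\]
and the portfolio with weight vector \(\bar w_0\) is given weight \(1 - w_m^*\).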
Example 9.10 Consider the stocks of the five companies listed in Example
9.2 and analyzed in several examples in this chapter. Recall that the estimates
of \(\alpha_j\) for these stocks are stored in the variable stks.alpha and estimates of
\(\sigma_{\epsilon,j}\) are stored in the variable stks.s. The weights \(\bar w_{0,j}\), \(j = 1, 2, 3, 4, 5\), are
calculated as follows.
> wbar0<-(stks.alpha/stks.s^2)/sum(stks.alpha/stks.s^2)
> wbar0
CVC EIX EXPE HUM WMT
-0.0012 0.3587 0.0796 0.2752 0.2877
These results suggest that the best combination of the five stocks to
combine with the market index consists primarily of Edison, Humana, and
Wal-Mart stocks, in roughly equal weights. The other stocks have weights
that are relatively small in magnitude.
The weight given to the market index using this approach is given by
with the remainder, 1 − 0.0781 = 0.922, invested in the portfolio of the five
stocks, with the weights given in the variable wbar0 calculated earlier.
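\[
\sigma_{TB}^2 = \frac{1}{c^2}\left(\frac{(\mu_m - \mu_f)^2}{\sigma_m^2} + \sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}\right),
\]
and
\[
\beta_{TB} = \frac{1}{c}\,\frac{\mu_m - \mu_f}{\sigma_m^2}.
\]
Here c is as defined in the statement of Proposition 9.3:
\[
c = \sum_{j=1}^N \alpha_j/\sigma_{\epsilon,j}^2 + \frac{\mu_m - \mu_f}{\sigma_m^2} - \sum_{j=1}^N \alpha_j\beta_j/\sigma_{\epsilon,j}^2.
\]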
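\[
\begin{aligned}
\mu_{TB} - \mu_f &= \frac{1}{c}\sum_{j=1}^N \frac{\alpha_j}{\sigma_{\epsilon,j}^2}(\mu_j - \mu_f)
+ \frac{1}{c}\,\frac{\mu_m - \mu_f}{\sigma_m^2}(\mu_m - \mu_f)
- \frac{1}{c}\sum_{j=1}^N \frac{\alpha_j\beta_j}{\sigma_{\epsilon,j}^2}(\mu_m - \mu_f) \\
&= \frac{1}{c}\,\frac{(\mu_m - \mu_f)^2}{\sigma_m^2}
+ \frac{1}{c}\sum_{j=1}^N \frac{\alpha_j}{\sigma_{\epsilon,j}^2}\left(\mu_j - \mu_f - \beta_j(\mu_m - \mu_f)\right) \\
&= \frac{1}{c}\left(\frac{(\mu_m - \mu_f)^2}{\sigma_m^2} + \sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}\right),
\end{aligned}
\]
and the market component of \(\sigma_{TB}^2\) is \(\beta_{TB}^2\sigma_m^2\), where \(\beta_{TB}\) is the value of beta
for the Treynor–Black portfolio.
Using the expression for \(\beta_{TB}\) derived previously, the market component of
\(\sigma_{TB}^2\) is given by
\[
\frac{1}{c^2}\,\frac{(\mu_m - \mu_f)^2}{\sigma_m^2}. \tag{9.25}
\]
Adding (9.24) and (9.25) shows that
\[
\sigma_{TB}^2 = \frac{1}{c^2}\left(\frac{(\mu_m - \mu_f)^2}{\sigma_m^2} + \sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}\right).
\]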
Let \(SR_{TB} = (\mu_{TB} - \mu_f)/\sigma_{TB}\) denote the Sharpe ratio of the Treynor–Black
portfolio. By construction, it is at least as large as the Sharpe ratio of the
market index; an expression for the difference in the squared Sharpe ratios is
given in the following corollary to Proposition 9.4.
Corollary 9.2.
\[
(SR_{TB})^2 - \frac{(\mu_m - \mu_f)^2}{\sigma_m^2} = \sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}. \tag{9.26}
\]
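The identity (9.26) can be verified numerically by forming the Treynor–Black portfolio as the tangency portfolio of the N assets together with the market index and computing its Sharpe ratio directly. The parameter values and variable names below are hypothetical and serve only as an illustration.

```r
# Hypothetical single-index parameters for five assets
beta   <- c(1.07, 0.47, 0.89, 0.65, 0.46)
alpha  <- c(0.000, 0.009, 0.011, 0.016, 0.006)
s2.eps <- c(0.0066, 0.0021, 0.0111, 0.0048, 0.0017)   # residual variances
s2.m   <- 0.0014                                      # market variance
mu.m.ex <- 0.011                                      # mu_m - mu_f

# N assets plus the market index, with Cov(R_j, R_m) = beta_j * sigma_m^2
Sig.plus <- rbind(cbind(s2.m * (beta %*% t(beta)) + diag(s2.eps), s2.m * beta),
                  c(s2.m * beta, s2.m))
mu.plus.ex <- c(alpha + mu.m.ex * beta, mu.m.ex)

# Tangency (Treynor-Black) weights and the resulting Sharpe ratio
w <- solve(Sig.plus, mu.plus.ex)
w <- w / sum(w)
sr.tb <- sum(w * mu.plus.ex) / sqrt(drop(t(w) %*% Sig.plus %*% w))
sr.m  <- mu.m.ex / sqrt(s2.m)

# Left- and right-hand sides of (9.26)
c(lhs = sr.tb^2 - sr.m^2, rhs = sum(alpha^2 / s2.eps))
```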
Example 9.11 The Sharpe ratio for the market index corresponding to the
S&P 500 index is given by
> mean(sp500)/sd(sp500)
[1] 0.290
For the stocks represented in the data matrix stks, the estimated difference
between the squared Sharpe ratio of the Treynor–Black portfolio described
in Example 9.10 and the squared Sharpe ratio of the market portfolio is
given by
> sum((stks.alpha^2)/stks.s^2)
[1] 0.125
Thus, the estimated Sharpe ratio of the Treynor–Black portfolio is
\[
\left((0.290)^2 + 0.125\right)^{1/2} = 0.457.
\]
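Bias in the Estimator of \(\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2\)
The quantity \(\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2\) measures the difference between the squared Sharpe
ratio of the Treynor–Black portfolio and the squared Sharpe ratio of the market
portfolio; hence, it gives a measure of the possible improvement in the
market portfolio by combining it with a portfolio of the assets under consideration.
Of course, in practice, \(\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2\) must be estimated using parameter
estimators from the market models for the assets under consideration.
However, such an estimator tends to overestimate the true value of
\(\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2\), giving an overly optimistic assessment of the benefit from
modifying the market portfolio. The reason for this is that the estimator
\(\sum_{j=1}^N \hat\alpha_j^2/\hat\sigma_{\epsilon,j}^2\) is a sum of squared random variables \(\hat\alpha_j/\hat\sigma_{\epsilon,j}\), \(j = 1, 2, \ldots, N\).
\[
\begin{aligned}
E\left(\sum_{j=1}^N \hat\alpha_j^2/\hat\sigma_{\epsilon,j}^2\right)
&= \sum_{j=1}^N E\left(\hat\alpha_j^2/\hat\sigma_{\epsilon,j}^2\right) \\
&= \sum_{j=1}^N \left\{\left[E\left(\hat\alpha_j/\hat\sigma_{\epsilon,j}\right)\right]^2 + \mathrm{Var}\left(\hat\alpha_j/\hat\sigma_{\epsilon,j}\right)\right\} \\
&= \sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2 + \sum_{j=1}^N \mathrm{Var}\left(\hat\alpha_j/\hat\sigma_{\epsilon,j}\right) \\
&> \sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2.
\end{aligned}
\]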
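The upward bias can be seen in a small simulation. In the sketch below, all settings (sample size, betas, standard deviations, number of replications) are hypothetical; returns are generated from a single-index model with \(\alpha_j = 0\) for every asset, so the true value of the sum is exactly zero and any positive average of the estimates is bias.

```r
set.seed(3)
T.obs <- 60; N <- 5; n.sim <- 1000
beta <- c(1.0, 0.5, 0.9, 0.7, 0.5)             # hypothetical betas

est <- replicate(n.sim, {
  mkt <- rnorm(T.obs, 0.01, 0.04)              # market excess returns
  eps <- matrix(rnorm(T.obs * N, 0, 0.05), T.obs, N)
  rets <- outer(mkt, beta) + eps               # true alpha_j = 0 for all assets
  fit <- lm(rets ~ mkt)
  a.hat <- coef(fit)[1, ]                      # estimated alphas
  s2.hat <- colSums(resid(fit)^2) / (T.obs - 2)
  sum(a.hat^2 / s2.hat)                        # estimated sum of alpha^2 / sigma^2
})

mean(est)      # average estimate; the true value is 0
range(est)     # every estimate is nonnegative
```

Each estimate is a sum of squares, so it can never fall below the true value of zero; with 60 observations and five assets the average estimate is far from negligible relative to the kinds of values reported in the examples.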
One way to correct for this bias is to use the bootstrap method, as we
did in Section 8.9 when estimating measures of portfolio performance. This is
illustrated in the following example.
Example 9.12 Consider estimating the bias in \(\sum_{j=1}^N \hat\alpha_j^2/\hat\sigma_{\epsilon,j}^2\) as an estimator
of \(\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2\) using the function boot in the package boot. Recall that a
function to be used in boot must take two arguments: the data, in the form
of a vector or matrix, and the indices of the data values to be used in the
computation. Define the function shrp.sq.diff
This function assumes that the argument x is a matrix of returns, with the
last column corresponding to the market portfolio and the remaining columns
corresponding to the assets to be used in forming the Treynor–Black
portfolio. Calculation of s.hat, the vector of estimates of \(\sigma_{\epsilon,j}\), is carried out using
the component $residuals of the output from the function lm; this
component consists of a matrix of residuals corresponding to the different response
variables used in the lm function, in this case, the returns on the different
assets. Adding the squares of these residuals for each asset and dividing by
the degrees of freedom yields estimates of \(\sigma_{\epsilon,j}^2\).
Therefore, shrp.sq.diff takes the values in x corresponding to the indices
in the vector ind and uses those values to compute the difference in the
estimated squared Sharpe ratios of the Treynor–Black and market portfolios. For example,
taking x = cbind(stks, sp500) and ind = 1:60 returns the difference of
the estimated squared Sharpe ratios calculated in Example 9.11
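The definition of shrp.sq.diff is not reproduced here. The following sketch is a hypothetical reconstruction written from the description above (it is named shrp.sq.diff.sketch to make clear that it is not the author's code), applied to simulated data, with a base R resampling loop standing in for the call to boot.

```r
# x: matrix of excess returns; last column is the market, others are assets
# ind: row indices selected for one bootstrap replication
shrp.sq.diff.sketch <- function(x, ind) {
  n.col <- ncol(x)
  mkt  <- x[ind, n.col]                        # market excess returns
  rets <- x[ind, -n.col, drop = FALSE]         # asset excess returns
  fit  <- lm(rets ~ mkt)                       # market model for every asset
  a.hat  <- coef(fit)[1, ]                     # estimated alphas
  s2.hat <- colSums(fit$residuals^2) / fit$df.residual
  sum(a.hat^2 / s2.hat)                        # estimated sum of alpha^2 / sigma^2
}

# Simulated (hypothetical) data standing in for cbind(stks, sp500)
set.seed(4)
n <- 60
mkt <- rnorm(n, 0.01, 0.04)
rets <- outer(mkt, c(1.0, 0.6, 0.8)) + matrix(rnorm(n * 3, 0, 0.05), n, 3)
x <- cbind(rets, mkt)

orig <- shrp.sq.diff.sketch(x, 1:n)            # estimate from the original sample
reps <- replicate(1000, shrp.sq.diff.sketch(x, sample(n, replace = TRUE)))
bias.hat <- mean(reps) - orig                  # bootstrap estimate of the bias
c(original = orig, bias = bias.hat, corrected = orig - bias.hat)
```

A function with this (data, indices) signature can be passed to boot in the same way as the statistic functions used earlier in the chapter.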
Bootstrap Statistics :
original bias std. error
t1* 0.125 0.111 0.129
Therefore, the estimated bias is 0.111 and the bias-corrected estimate is only
0.125 − 0.111 = 0.014, considerably smaller than the original estimate.
The output from boot also gives the standard error of the estimate.
Although this value is useful for getting a rough idea of the variability in
the estimates, in this case, it is not useful for constructing an approximate
confidence interval for the true value of \(\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2\).
Note that the parameter \(\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2\) is nonnegative and the
corresponding estimator is a nonnegative random variable. Hence, if the true value of
\(\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2\) is close to zero, then the distribution of the estimator is not well
and Finlay (2009, Section 11.7). A number of useful results on matrix inverses,
including the one used in Lemma 9.1, are available from Henderson and Searle
(1981); the inverses of partitioned matrices are discussed by Lu and Shiou
(2002).
Active portfolio management uses a variety of methods in an attempt
to outperform the benchmark portfolio; see Grinold and Kahn (2000) and
Chincarini and Kim (2006) for book-length treatments of this area. The
Treynor–Black method is attributed to Treynor and Black (1973); see also
Francis and Kim (2013, Section 17.2) and Kane et al. (2012). Optimal active
portfolios based on the properties of their residual returns are considered by
Grinold and Kahn (2000, Chapter 5); see Qian et al. (2007, Section 2.2.4) for
an alternative approach.
Given that active portfolio management relies on the inefficiency of the
market portfolio, it is not surprising that some analysts are skeptical of its
benefits; see, for example, Samuelson (1974) and Sharpe (1991).
9.8 Exercises
1. Let \(\beta\) denote a vector in \(\mathbb{R}^N\), let Σ denote an N × N matrix, and
suppose \(\sigma_m^2 > 0\). Show that
10.1 Introduction
Although there are many assets available to an investor, the returns on these
assets are often correlated—in some cases, highly correlated. One reason for
this correlation is that the returns on a set of assets may all be affected by cer-
tain changes in the economy; alternatively, the assets may correspond to firms
with similar properties. A factor model describes the returns on a set of assets
in terms of a few underlying “factors” potentially affecting all of the assets.
We have already seen one example of a factor model, the single-index
model discussed in Chapter 9, which describes the returns on a set of assets
in terms of the returns on a market index. An important implication of this
model is that the covariance between the returns on two assets arises from the
fact that both assets’ returns are related to the return on the market index.
Although it may be reasonable to assume that the behavior of the market as a
whole is the most important factor affecting asset returns, empirical research
has shown that there are other factors, in addition to the return on the market
index, that have important effects on asset returns. These factors are useful
for describing the correlation structure of a set of asset returns as well as for
describing the behavior of the mean returns of the assets, extending the type
of relationship described by the single-index model in Chapter 9.
The goal of this chapter is to present the statistical methodology underlying
these factor models along with the implications of these models for
understanding the behavior of asset returns and for constructing and analyzing
portfolios.
Example 10.1 Consider the returns on two stocks, JetBlue Airways Corp.
(symbol JBLU) and EV Energy Partners, L.P. (EVEP), an oil and natural
gas company. Suppose that the variables jblu and evep contain 5 years of
monthly excess returns on JBLU and EVEP stock, respectively, for the period
ending December 31, 2014, and that sp500 contains the corresponding
excess returns on the Standard & Poor's (S&P) 500 index. Then the estimated
correlation of the returns on these stocks is given by
> cor(jblu, evep)
[1] -0.150
The estimated correlations of each return with the return on the S&P 500
index are given by
> cor(jblu, sp500)
[1] 0.311
> cor(evep, sp500)
[1] 0.268
Therefore, each stock’s returns are positively correlated with the return on
the market index, but the returns are negatively correlated with each other.
Note that relationships of this type are not possible under the single-index
model. The estimates of beta for the two stocks are given by
> lm(jblu~sp500)$coef
(Intercept) sp500
0.0137 0.8770
> lm(evep~sp500)$coef
(Intercept) sp500
-0.00437 0.74013
The estimated return variance for the S&P 500 index is
> var(sp500)
[1] 0.00141
According to the single-index model, the estimated covariance of the returns
on JBLU and EVEP stock is
(0.877)(0.740)(0.00141) = 0.000915,
and the corresponding estimated correlation is
> 0.000915/(sd(jblu)*sd(evep))
[1] 0.0832
Hence, although the sample correlation of returns on JBLU and EVEP stock is
negative (−0.150), the estimated correlation based on the single-index model
is positive (0.0832).
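The sign of the single-index estimate can be read off directly from the model. As a brief worked step, assuming (as in the single-index model of Chapter 9) that the residual returns of different assets are uncorrelated, the covariance of two distinct assets depends only on their betas and the market variance:

```latex
\operatorname{Cov}(R_i, R_j) = \beta_i \beta_j \sigma_m^2 , \qquad i \neq j ,
\qquad \text{so that} \qquad
\widehat{\operatorname{Cov}}(R_{\mathrm{JBLU}}, R_{\mathrm{EVEP}})
  = (0.877)(0.740)(0.00141) \approx 0.000915 > 0 .
```

Because $\sigma_m^2 > 0$, the implied covariance has the sign of $\beta_i \beta_j$; two stocks with positive estimated betas therefore cannot have a negative implied correlation under this model.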
One reason for this behavior may be the presence of other economic variables
that are affecting the returns on JBLU and EVEP stock. For example,
JBLU, as an airline stock, is likely to be negatively affected by increasing
oil prices; EVEP, on the other hand, as a gas and oil stock, is likely to be
positively affected by increasing oil prices. Hence, oil prices might have an
important effect on the relationship between the returns on these two stocks.
One commonly used benchmark for crude oil prices is the price of West
Texas Intermediate (WTI) oil, which is generally refined in the United States.
Historical prices for WTI oil are available on the Federal Reserve Economic
Data (FRED) website at
https://research.stlouisfed.org/fred2/series/DCOILWTICO/downloaddata;
like stock prices, these data are available for different sampling frequencies,
such as daily or monthly prices. Let the variable oil denote the proportional
change in monthly prices of WTI oil for the
5-year period ending December 31, 2014; thus, oil is calculated the same
way that asset returns are calculated, except that oil prices, rather than stock
prices, are used.
Note that, as expected, the returns on JBLU stock are negatively correlated
with the change in oil prices, while the returns on EVEP stock are
positively correlated with the change in oil prices:
> cor(jblu, oil)
[1] -0.265
> cor(evep, oil)
[1] 0.528
Therefore, in modeling the relationship between JBLU and EVEP stock,
it may be important to take into account changes in oil prices. This is likely
to be true when analyzing the returns on other stocks thought to be related
to oil prices.
A second use of the single-index model is in understanding the role of an
asset’s relationship with the market index in the expected return on the asset.
According to the single-index model or, equivalently, the market model, the
expected excess return on asset i, μi − μf , is related to the expected excess
return on the market index, μm − μf , by
μi − μf = αi + βi (μm − μf ),
where αi and βi are the parameters in the market model for asset i.
If the asset is priced correctly, in the sense described in Section 8.4, then
αi = 0 and the expected excess return on an asset is proportional to its value
of beta; see Section 7.7 for further details. This fact suggests that assets with
greater values of beta will tend to have higher expected excess returns. The
following example shows that this interpretation of beta is not always useful
in practice.
Example 10.2 Consider stocks for firms represented in the S&P 500 index.
Five years of monthly returns for the period ending December 31, 2014, were
analyzed; 474 of the stocks had returns for that entire period.
For each stock, the parameters of the market model were estimated,
along with the sample mean excess return. These results suggest that all
474 stocks are priced correctly; for instance, the minimum p-value for testing
αi = 0 is 0.00236. Because 474 × 0.00236 ≈ 1.12 exceeds 1, the
Bonferroni-adjusted p-value is 1 and we fail to reject the hypothesis that
all αi are equal to zero at any level.
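The Bonferroni calculation can be checked directly in R. The following is a minimal sketch that uses only the two numbers quoted in the example; the variable names p.min and n.tests are introduced here for illustration and are not part of the original analysis.

```r
# Minimum p-value among the tests of alpha_i = 0, as quoted in the example
p.min <- 0.00236
# Number of stocks, and hence the number of tests
n.tests <- 474

# Bonferroni adjustment: multiply by the number of tests and cap at 1
p.bonf <- min(1, n.tests * p.min)

# The same adjustment using the base R function p.adjust
p.bonf2 <- p.adjust(p.min, method = "bonferroni", n = n.tests)

print(c(by.hand = p.bonf, p.adjust = p.bonf2))
```

An adjusted p-value equal to 1 means that the hypothesis is not rejected at any significance level below 1.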
The estimates of beta for the 474 stocks are stored in the variable
sp474.mmbeta. Note that there is considerable variation in the estimates of
beta:
> summary(sp474.mmbeta)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.004 0.732 1.060 1.080 1.390 2.870
Hence, according to the CAPM, we expect that stocks with large estimates of
beta will tend to have higher sample mean excess returns.
Figure 10.1 contains a plot of the sample mean excess returns versus the
estimated value of beta for the 474 stocks. Note that there is, at most, a
very weak relationship between a stock’s sample mean excess return and its
estimate of beta. Furthermore, this plot does not support the idea that stocks
with larger values of beta tend to have large mean excess returns. For instance,
the sample correlation of the estimates of beta and the sample mean excess
returns based on these data is only 0.0359.
The standard error of this estimate
when the true correlation is zero is $1/\sqrt{N}$, where N is the number of
observations; here N = 474. Hence, the standard error of the estimate is 0.0459 and
the correlation is not significantly different than zero. The squared sample
correlation is about 0.0013, so only about 0.13% of the variation in the sample
mean excess returns on the stocks may be explained by their estimates of
beta. Thus, the theoretical relationship expressed in Figure 7.1 does not hold
for these data.
FIGURE 10.1
Plot of sample mean excess returns versus estimates of beta for stocks in the
S&P 500 index. (Vertical axis: sample mean excess returns.)
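The arithmetic behind the significance assessment in Example 10.2 is short enough to reproduce. This sketch uses only the reported values N = 474 and a sample correlation of 0.0359; the names se0, ratio, and r2 are ours.

```r
N <- 474            # number of stocks
r <- 0.0359         # sample correlation reported in the example

se0 <- 1 / sqrt(N)  # approximate standard error when the true correlation is zero
ratio <- r / se0    # estimate divided by its standard error
r2 <- r^2           # squared sample correlation

print(round(c(std.error = se0, ratio = ratio, r.squared = r2), 4))
```

The ratio of the estimate to its standard error is well below 2, which is the basis for the statement that the correlation is not significantly different from zero.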
The results in the previous example suggest that, at least in some cases, the
relationship between the returns on an asset and the returns on a market index
is not sufficient to effectively describe the mean excess returns on an asset.
That is, these results are consistent with the idea that it may be important to
include factors other than the return on a market index in a model for asset
returns.
Here Rf,t is the return on the risk-free asset at time t. The idea behind this
model is that all assets are related to “the market” and volatility in the market
induces volatility in the returns of individual assets. Furthermore, under this
model, the correlation between the returns of two assets is a result of the fact
that both assets are related to the market.
A general factor model extends the single-index model by including other
risk factors, in addition to the market return, in the model. These factors
may represent economic conditions that, like the return on a market index,
affect all assets. Or the factors might reflect properties of the assets under
consideration, such as the size of the company, in the case of a stock. This
flexibility in the factors, which may be chosen to represent the analyst’s beliefs
and goals, is one of the strengths of factor models. In this section, we consider
the form and properties of a factor model, along with parameter estimation;
selection of the factors is considered in the following section.
Let F1,t , F2,t , . . . , FK,t denote the values of K factors at time t, t =
1, 2, . . . , T . For i = 1, 2, . . . , N , let Ri,t denote the return on asset i at time t.
Then a factor model that describes Ri,t in terms of F1,t , F2,t , . . . , FK,t has the
form
$$R_{i,t} - R_{f,t} = \alpha_i + \beta_{i,1} F_{1,t} + \beta_{i,2} F_{2,t} + \cdots + \beta_{i,K} F_{K,t} + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T,$$
where $\epsilon_{i,1}, \epsilon_{i,2}, \ldots, \epsilon_{i,T}$ are unobserved mean-zero random variables that are
uncorrelated with the factors. These terms represent the component of the
asset’s excess return not explained by the factors.
Note that the values of the factors F1,t , F2,t , . . . , FK,t are the same for
each asset and, hence, do not depend on i; in this sense, they may be viewed
as common factors. The parameters βi,1 , βi,2 , . . . , βi,K , known as the factor
sensitivities for asset i, measure how the factors affect a particular asset’s
returns. Hence, these parameters depend on i; however, they are assumed to
be constant over the observation period, so that they do not depend on t.
In the factor model, the factor sensitivities, like β in the single-index model,
are unknown parameters that must be estimated.
It is assumed that the same factor model applies to all assets under
consideration so that, for t = 1, 2, . . . , T ,
$$\mathrm{Cov}(\epsilon_t, F_t) = 0, \quad t = 1, 2, \ldots, T,$$
$$\Sigma = \beta \Sigma_F \beta^T + \Sigma_\epsilon, \quad (10.2)$$
Portfolios
Consider a portfolio of the N assets under consideration based on a weight
vector w = (w1 , w2 , . . . , wN )T . Then the return on the portfolio at time t,
Rp,t , may be written as
$$R_{p,t} = w^T R_t, \quad t = 1, 2, \ldots, T.$$
$$\mathrm{Var}(R_{p,t}) = \beta_p^T \Sigma_F \beta_p + w^T \Sigma_\epsilon w.$$
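The portfolio variance expression follows from (10.2) in a single line. The derivation below is a sketch that assumes $\beta_p$ denotes the $K \times 1$ vector of portfolio factor sensitivities, $\beta_p = \beta^T w$, where $\beta$ is the $N \times K$ matrix of factor sensitivities; this reading of the notation is an assumption on our part.

```latex
\operatorname{Var}(R_{p,t}) = w^T \Sigma w
  = w^T \left( \beta \Sigma_F \beta^T + \Sigma_\epsilon \right) w
  = (\beta^T w)^T \Sigma_F (\beta^T w) + w^T \Sigma_\epsilon w
  = \beta_p^T \Sigma_F \beta_p + w^T \Sigma_\epsilon w .
```

The first term is the part of the portfolio variance attributable to the factors and the second is the part attributable to the residual returns.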
Estimation
Given a set of factors thought to be relevant to the asset returns under
consideration, the factor sensitivities are estimated from the available data. Let
F1,t , F2,t , . . . , FK,t denote the values of K factors at time t, t = 1, 2, . . . , T , and
let Ri,t , t = 1, 2, . . . , T denote the returns on a given asset. Then, according
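to the factor model,
$$R_{i,t} - R_{f,t} = \alpha_i + \beta_{i,1} F_{1,t} + \beta_{i,2} F_{2,t} + \cdots + \beta_{i,K} F_{K,t} + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T,$$
where $\epsilon_{i,t}$ is uncorrelated with F1,t , F2,t , . . . , FK,t ; here, Rf,t is the return on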
the risk-free asset at time t.
Hence, as in the case of the single-index model, the parameter estimates
for asset i may be obtained using least-squares regression based on the returns
on asset i; that is, all parameter estimates may be obtained from N regression
analyses, one for each asset. Specifically, the parameters αi , βi,1 , βi,2 , . . . , βi,K
may be estimated using least-squares regression with response variable
Ri,t − Rf,t at time t and predictor variables F1,t , F2,t , . . . , FK,t at time t. The
error variance for asset i, $\sigma_{\epsilon,i}^2 = \mathrm{Var}(\epsilon_{i,t})$, may be estimated using the usual
estimator from the regression analysis.
The excess returns on JetBlue stock are stored in the variable jblu.
To estimate the parameters of the factor model for JetBlue stock, we use
the lm function to fit a regression model with predictor variables sp500
and oil:
> summary(lm(jblu~sp500+oil))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.00965 0.01282 0.75 0.4547
sp500 1.15544 0.34078 3.39 0.0013 **
oil -0.60598 0.19626 -3.09 0.0031 **
> summary(lm(evep~sp500+oil))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000828 0.011979 0.07 0.95
sp500 0.380479 0.318424 1.19 0.24
oil 0.782726 0.183384 4.27 7.5e-05 ***
> jblu.fm<-lm(jblu~sp500+oil)
> evep.fm<-lm(evep~sp500+oil)
> betamat<-rbind(jblu.fm$coef[2:3], evep.fm$coef[2:3])
> Sig.eps<-diag(c(summary(jblu.fm)$sigma,
+ summary(evep.fm)$sigma)^2)
> cov.fm<-betamat%*%cov(cbind(sp500, oil))%*%t(betamat) +
+ Sig.eps
Here jblu.fm and evep.fm contain the results from the factor model regres-
sions. The matrix betamat is the matrix of coefficient estimates; note that
$coef extracts the coefficient estimates from the result of lm. Thus, here
betamat contains the estimated factor sensitivities,
> betamat
sp500 oil
[1,] 1.16 -0.606
[2,] 0.38 0.783
> Sig.eps
[,1] [,2]
[1,] 0.00899 0.00000
[2,] 0.00000 0.00785
> cov.fm
[,1] [,2]
[1,] 0.01153 -0.00096
[2,] -0.00096 0.01104
> cov2cor(cov.fm)
[,1] [,2]
[1,] 1.0000 -0.0851
[2,] -0.0851 1.0000
Thus, the factor model with two factors, the return on the S&P 500 index and
the change in the price of WTI oil, captures the negative correlation between
the returns on JetBlue and EV Energy Partners stock.
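The same comparison can be reproduced on simulated data. The sketch below is not based on the JBLU and EVEP returns; it generates two hypothetical assets (air and gas) with made-up sensitivities to a simulated market factor (mkt) and a simulated oil factor (oil.chg), and then computes the correlation implied by the single-index model, the correlation implied by the two-factor model using (10.2), and the sample correlation.

```r
set.seed(1)
n.obs <- 2000                       # number of simulated periods (hypothetical)

# Two simulated factors: a market excess return and an oil-price change
mkt <- rnorm(n.obs, mean = 0.005, sd = 0.04)
oil.chg <- rnorm(n.obs, mean = 0, sd = 0.08)

# Two simulated assets with opposite oil sensitivities (made-up coefficients)
air <- 0.9 * mkt - 0.6 * oil.chg + rnorm(n.obs, sd = 0.05)
gas <- 0.7 * mkt + 0.8 * oil.chg + rnorm(n.obs, sd = 0.05)

# Correlation implied by the single-index model: beta_i * beta_j * var(market)
b.air <- lm(air ~ mkt)$coef[2]
b.gas <- lm(gas ~ mkt)$coef[2]
cor.si <- unname(b.air * b.gas * var(mkt) / (sd(air) * sd(gas)))

# Correlation implied by the two-factor model, using (10.2)
air.fm <- lm(air ~ mkt + oil.chg)
gas.fm <- lm(gas ~ mkt + oil.chg)
betamat <- rbind(air.fm$coef[2:3], gas.fm$coef[2:3])
Sig.eps <- diag(c(summary(air.fm)$sigma, summary(gas.fm)$sigma)^2)
cov.fm <- betamat %*% cov(cbind(mkt, oil.chg)) %*% t(betamat) + Sig.eps
cor.fm <- cov2cor(cov.fm)[1, 2]

# Sample correlation for comparison
cor.sample <- cor(air, gas)
print(c(single.index = cor.si, factor.model = cor.fm, sample = cor.sample))
```

In this simulated setting, the single-index estimate is positive because both market betas are positive, while the two-factor estimate is negative and close to the sample correlation, since the oil factor accounts for the opposing movements of the two assets.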
10.4 Factors
As noted in the previous section, there is considerable freedom in the selection
of the factors to include in a factor model. Factors are often divided into two
categories. Economic factors are variables measuring the general state of the
market or of the economy; for instance, the return on a market index and
the unemployment rate are two examples of economic factors. Fundamental
factors are based on the characteristics of a particular firm, such as the size
of the firm, as measured by its market capitalization; however, because the
factors in our model must apply to all assets, such characteristics must first
be converted to a common factor.
An economic factor may be based on any macroeconomic variable thought
to have an important influence on the asset returns under consideration.
Response RS :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0782 0.0416 -1.88 0.065 .
sp500 1.7668 0.1700 10.39 1.1e-14 ***
indpro 0.0304 0.0149 2.03 0.047 *
unemp 0.7603 0.5137 1.48 0.144
Response XOM :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.03568 0.02590 -1.38 0.1738
sp500 0.84442 0.10591 7.97 8.6e-11 ***
indpro -0.02685 0.00931 -2.88 0.0056 **
unemp 0.52112 0.31996 1.63 0.1090
Note that the statistical significance of the different factors varies considerably
by stock. This is not surprising; although, in general, the factors are related to
the stocks’ returns, the nature of the relationship depends on the particular
features of the stock under consideration. It is interesting to note that, for
each stock considered, one of indpro and unemp is statistically significant, or
close to significant, at the 0.05 level.
The estimate of the matrix of factor coefficients, β, may be extracted from
the result of the lm function and is stored in the R matrix betamat.
The index [-1,] is used to drop the estimate of the intercept when
constructing betamat.
The estimates of $\Sigma_F$, the covariance matrix of the factors, and $\Sigma_\epsilon$, the error
covariance matrix, are stored in the matrices Sig.F and Sig.eps, respectively.
> Sig.F<-cov(cbind(sp500, indpro, unemp))
> Sig.F
sp500 indpro unemp
sp500 0.00141 -0.00235 -0.00283
indpro -0.00235 0.18510 0.06878
unemp -0.00283 0.06878 1.53847
> f.sig<-function(y){summary(lm(y~sp500+indpro+unemp))$sigma}
> Sig.eps<-diag(apply(stks4, 2, f.sig)^2)
> Sig.eps
CAT CTAS RS XOM
CAT 0.00302 0.00000 0.00000 0.000000
CTAS 0.00000 0.00115 0.00000 0.000000
RS 0.00000 0.00000 0.00235 0.000000
XOM 0.00000 0.00000 0.00000 0.000912
The estimate of the covariance matrix of the assets based on the factor
model, that is, using (10.2), is given by
> Sig<-betamat%*%Sig.F%*%t(betamat) + Sig.eps
and the corresponding correlation matrix is given by
> cov2cor(Sig)
CAT CTAS RS XOM
CAT 1.000 0.494 0.618 0.506
CTAS 0.494 1.000 0.535 0.420
RS 0.618 0.535 1.000 0.530
XOM 0.506 0.420 0.530 1.000
This can be compared to the estimate based on the single-index model
CAT CTAS RS XOM
CAT 1.000 0.499 0.578 0.528
CTAS 0.499 1.000 0.531 0.485
RS 0.578 0.531 1.000 0.561
XOM 0.528 0.485 0.561 1.000
and to the sample correlation matrix
> cor(stks4)
CAT CTAS RS XOM
CAT 1.000 0.432 0.729 0.484
CTAS 0.432 1.000 0.569 0.451
RS 0.729 0.569 1.000 0.579
XOM 0.484 0.451 0.579 1.000
The estimates based on the factor model and the single-index model are gen-
erally similar, but there are some differences. For instance, the correlation of
the returns on XOM and the returns on the other stocks is smaller for the
factor model than for the single-index model. This is likely because of the fact
that the estimate of the coefficient of indpro for XOM is negative, while the
estimates of this coefficient for the other stocks are positive, reducing the fac-
tor model estimate of correlation as compared to the estimate based on the
single-index model.
low-X portfolio and those of the high-X portfolio. This procedure yields a
common factor that applies to all assets; it may also be used in a factor model
for a portfolio, for which it may not be possible to measure X, such as in the
firm size example.
> stks.mm$coef
CVC EIX EXPE HUM WMT
(Intercept) -9.75e-05 0.00914 0.0108 0.0162 0.0059
sp500 1.07e+00 0.47407 0.8902 0.6516 0.4568
In this example, the returns on these stocks are analyzed using a factor
model based on three factors. The first factor is the return on the S&P 500
index and the second factor is the SMB factor described earlier.
For the third factor, we use a factor based on the book-to-market ratio
of the stock. The “book value” of a stock is the value of the stock based on
accounting information for the firm, while the “market value” of the stock
is based on the price of the stock. Thus, a large book-to-market ratio suggests
that the stock is undervalued by the market; such a stock is often referred to as
a value stock. Fama and French (1993) construct a factor based on the book-to-
market ratio of the stocks, using the same general procedure used to construct
the factor SMB. The factor based on the book-to-market ratio is known as
HML, for “high minus low”; thus, this factor is based on a zero-investment
portfolio contrasting stocks with high book-to-market ratios with stocks with
low book-to-market ratios. The model with factors SMB and HML, together
with the return on a market index, is known as the Fama–French three-factor
model.
Data on the factors SMB and HML are available from the Kenneth R.
French Data Library, on the website
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.
This site contains extensive data useful
in analyzing financial data. The values for the factors SMB and HML are given
in the file “Fama/French 3 Factors” found in the section on U.S. Research
Returns Data.
Five years of monthly data on SMB and HML for the period ending
December 31, 2014, are stored in the variables smb and hml, respectively.
Note that the data in the French Data Library are generally in the form of
percentage returns, while here we use proportional returns; hence, the vari-
ables smb and hml contain the values given in the French Data Library, divided
by 100.
To fit the factor model with these three factors to the stocks represented
in the data matrix stks, we may use the function lm:
> stks.fact<-lm(stks~sp500+smb+hml)
The coefficient estimates for these factors may be extracted by
> stks.fact$coef
CVC EIX EXPE HUM WMT
(Intercept) 0.0026 0.0094 0.0072 0.016 0.0044
sp500 0.8379 0.4461 1.1570 0.535 0.6333
smb 0.5268 0.0637 -0.4224 0.625 -0.5659
hml 0.9994 0.1158 -1.5607 -0.389 -0.3341
Wal-Mart (WMT) has the largest market capitalization of the five stocks con-
sidered; hence, it is not surprising that its sensitivity to SMB is negative. CVC
has the smallest market capitalization and its sensitivity to SMB is fairly large
and positive. However, the sensitivities are not a direct measure of the firm’s
size; they measure a property of the relationship between the stock’s returns
and those of the zero-investment portfolio on which SMB is based. Hence, it
is possible for the stock returns of a large firm to have a positive sensitivity
to SMB or the returns of a small company to have a negative sensitivity to
SMB; that is, in fact, the case for EXPE, which has a fairly small market
capitalization but a negative sensitivity to SMB.
The summary function may be used to obtain the standard errors of the esti-
mates along with other useful information such as the R-squared and adjusted
R-squared values for the regressions; see Example 10.4.
Estimates of the residual standard deviations may be obtained by defining
a function f.sighat by
> f.sighat<-function(y){summary(lm(y~sp500+smb+hml))$sigma}
and then using the apply function
> apply(stks, 2, f.sighat)
CVC EIX EXPE HUM WMT
0.0811 0.0465 0.1033 0.0692 0.0398
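The use of lm with a matrix response, together with apply for the residual standard deviations, can be checked on simulated data. In the sketch below, the factors f1 and f2, the return matrix Y, and the asset labels A, B, and C are all hypothetical; the point is only that a single matrix-response fit gives the same estimates as separate regressions, one asset at a time.

```r
set.seed(2)
n.obs <- 60                                  # e.g., 5 years of monthly data
f1 <- rnorm(n.obs, sd = 0.04)                # simulated factor values
f2 <- rnorm(n.obs, sd = 0.03)

# Simulated excess returns for three hypothetical assets, one per column
Y <- cbind(A = 1.0 * f1 + 0.5 * f2 + rnorm(n.obs, sd = 0.05),
           B = 0.6 * f1 - 0.4 * f2 + rnorm(n.obs, sd = 0.03),
           C = 1.3 * f1 + 0.1 * f2 + rnorm(n.obs, sd = 0.07))

# One call to lm with a matrix response fits all three regressions
fit.all <- lm(Y ~ f1 + f2)
coef.all <- fit.all$coef                     # rows = terms, columns = assets

# The same estimates from separate regressions, one asset at a time
coef.sep <- apply(Y, 2, function(y) lm(y ~ f1 + f2)$coef)

# Residual standard deviations, computed in the same way as f.sighat
f.sd <- function(y) summary(lm(y ~ f1 + f2))$sigma
sig.hat <- apply(Y, 2, f.sd)

# Equivalent calculation from the residuals of the matrix-response fit
sig.hat2 <- sqrt(colSums(resid(fit.all)^2) / fit.all$df.residual)

print(round(coef.all, 3))
print(round(sig.hat, 4))
```

Fitting all assets in one call is convenient for extracting the coefficient matrix, while the apply-based approach is a simple way to obtain quantities, such as the residual standard deviation, that summary reports one regression at a time.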
Note that the factors in a factor model are, in general, correlated. For
instance, for the three factors analyzed in the previous example, the estimated
correlation matrix of the factors is
> cor(cbind(sp500, smb, hml))
sp500 smb hml
sp500 1.000 0.4380 0.2122
smb 0.438 1.0000 0.0614
hml 0.212 0.0614 1.0000
> lm(wmt~sp500)$coef
(Intercept) sp500
0.0059 0.4568
$$R_i - R_f = \alpha_i + \beta_i (R_m - R_f) + \epsilon_i,$$
$$\alpha_i = E(R_i - R_f) - \beta_i E(R_m - R_f).$$
$$E(R_i) - \mu_f = \beta_i E(R_m - R_f).$$
That is, under the efficiency of the portfolio with return Rm , the expected
excess return on an asset is proportional to its value of beta, βi . Thus, the
See Section 7.3 for further discussion of the implications of the CAPM.
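Now consider a factor model of the form
$$R_i - R_f = \alpha_i + \beta_{i,1} F_1 + \beta_{i,2} F_2 + \cdots + \beta_{i,K} F_K + \epsilon_i,$$
where F1 , F2 , . . . , FK are the values of the factors. Then, under the usual
assumptions on $\epsilon_i$, the expected excess return on asset i may be written as
$$E(R_i) - \mu_f = \alpha_i + \beta_{i,1} E(F_1) + \beta_{i,2} E(F_2) + \cdots + \beta_{i,K} E(F_K).$$
We will see that, under suitable conditions, the expected returns on two
assets i and j satisfy
$$E(R_i) - E(R_j) = (\beta_{i,1} - \beta_{j,1}) \Gamma_1 + (\beta_{i,2} - \beta_{j,2}) \Gamma_2 + \cdots + (\beta_{i,K} - \beta_{j,K}) \Gamma_K,$$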
for some constants Γ1 , Γ2 , . . . , ΓK . Note that such a result holds under the
condition that all αi are equal,
α1 = α2 = · · · = αN ≡ α,
for some constant α but, as we will see, it also holds under weaker conditions
on α1 , α2 , . . . , αN .
The key assumption of the CAPM is the efficiency of the market portfolio.
Given the generality of a factor model—there is considerable flexibility in the
specific factors used in the model—a more general approach is needed for
factor models. Hence, we rely on a concept more fundamental than efficiency,
arbitrage.
Roughly speaking, an arbitrage opportunity is one in which an investor
makes no net investment, has no chance of losing money, and has at least
some chance of making money.
Let R denote an N × 1 vector of asset returns. Consider a zero-investment
portfolio, that is, one based on a weight vector v satisfying $v^T \mathbf{1} = 0$, with
return $R_p = v^T R$. The portfolio corresponding to v is said to be an arbitrage
portfolio if there is zero probability of a negative return, $P(R_p < 0) = 0$, and
there is positive probability of a positive return, $P(R_p > 0) > 0$. Thus, an
arbitrage portfolio requires no investment, has zero probability of a negative
return, and has positive probability of a positive return.
$$R - R_f \mathbf{1} = \boldsymbol{\alpha} + \beta F + \epsilon.$$
Taking expectations,
E(R) − μf 1 = α + βE(F ).
Our goal is to show that, under the no-arbitrage assumption, the vector of
expected excess returns E(R) − μf 1 is of the form
α1 + βΓ
that u and v are orthogonal, then u must be orthogonal to all vectors that
are orthogonal to M; that is,
u ∈ (M⊥ )⊥ ,
so that
u ∈ M,
proving the result.
Lemma 10.1 may now be used to prove the following simple form of APT.
Proposition 10.1. Suppose that the following factor model holds:
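$$R - R_f \mathbf{1} = \boldsymbol{\alpha} + \beta F \quad (10.3)$$
$$E(R) - \mu_f \mathbf{1} = \alpha \mathbf{1} + \beta \Gamma \quad (10.4)$$
$$\mathbf{1}^T v = 0 \quad \text{and} \quad \beta^T v = 0_K. \quad (10.5)$$
$$v^T R = v^T \boldsymbol{\alpha} + v^T \beta F = v^T \boldsymbol{\alpha}.$$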
It follows that the portfolio return has zero variance and an expected value
v T α. Under the no-arbitrage assumption, the expected return must be 0 so
that v T α = 0.
That is, v is orthogonal to M implies that v is orthogonal to α. It now
follows from Lemma 10.1 that α lies in M, the space spanned by 1 and the
columns of β; hence, α must be of the form
$$\boldsymbol{\alpha} = \alpha \mathbf{1} + \beta \tilde{\Gamma};$$
the result now follows by writing this equation in the form (10.4) by defining
$\Gamma = \tilde{\Gamma} + E(F)$.
E(Ri ) − E(Rj ) = (βi,1 − βj,1 )Γ1 + (βi,2 − βj,2 )Γ2 + · · · + (βi,K − βj,K )ΓK .
That is, the difference in the expected returns of two assets can be described
in terms of the differences in the factor sensitivities for the two assets.
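The argument in the proof of Proposition 10.1 can be illustrated numerically. The sketch below is a hypothetical example, not taken from the text: it builds a weight vector v satisfying (10.5) as the residual from a regression on an intercept and the columns of a made-up sensitivity matrix beta, generates returns from an exact factor model of the form (10.3) with the intercept vector in the space spanned by 1 and the columns of beta, and confirms that the resulting zero-investment portfolio has a return of zero in every period. It then shows that an intercept vector outside that space gives the same portfolio a riskless positive return.

```r
set.seed(3)
N <- 8; K <- 2                               # hypothetical numbers of assets and factors
beta <- matrix(rnorm(N * K), N, K)           # made-up factor sensitivities

# Weights orthogonal to 1 and to the columns of beta: regression residuals
z <- rnorm(N)
v <- unname(resid(lm(z ~ beta)))

# Intercept vector of the form a * 1 + beta %*% Gamma.tilde
a <- 0.002
Gamma.tilde <- c(0.004, -0.001)
alpha <- a + drop(beta %*% Gamma.tilde)

# Returns from an exact factor model with a constant risk-free return
n.obs <- 500
Fmat <- matrix(rnorm(n.obs * K, sd = 0.05), n.obs, K)   # one row per period
Rf <- 0.001
R <- Rf + matrix(alpha, n.obs, N, byrow = TRUE) + Fmat %*% t(beta)

# Return on the zero-investment portfolio in each period
Rp <- drop(R %*% v)

# An intercept vector outside the span of 1 and beta: riskless gain for v
alpha.bad <- alpha + 0.01 * v
gain.bad <- sum(v * alpha.bad)

print(c(sum.v = sum(v), max.abs.return = max(abs(Rp)), gain.bad = gain.bad))
```

Up to rounding error, the portfolio return is zero in every period in the first case; in the second case the portfolio earns the constant amount gain.bad with no investment and no risk, which is exactly what the no-arbitrage assumption rules out.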
Asymptotic Arbitrage
Of course, the assumption that the asset returns are completely determined
by the factors is an unrealistic one. More general versions of APT are based
on the idea that if a zero-investment portfolio has a “small” return variance,
then it must have a “small” expected return. Note that this is a type of
continuity assumption. Recall that a function f (·) is continuous at a point x0 if
|f (x) − f (x0 )| is “small” whenever |x − x0 | is “small.” Definitions of continuity
are generally based on the concept of convergence; for instance, in the case of
a function, f (·) is continuous at x0 if, for a sequence x1 , x2 , . . . , the condition
that xn → x0 implies that f (xn ) → f (x0 ).
Therefore, a more general treatment of APT is based on assumptions
regarding sequences of portfolios. For N = 1, 2, . . . , consider a set of N assets
and let $R^{(N)}$ denote the corresponding N × 1 vector of asset returns. Let
$v_N$, N = 1, 2, . . . denote a sequence of vectors such that $v_N$ is N × 1 and
$v_N^T 1_N = 0$, N = 1, 2, . . . . Thus, each $v_N$ defines a zero-investment portfolio
based on the set of N assets.
Let $R_p^{(N)} = v_N^T R^{(N)}$, N = 1, 2, . . . so that, for each N, $R_p^{(N)}$ is the return
on a zero-investment portfolio of N assets. We say that the
no-asymptotic-arbitrage assumption holds if
$$\mathrm{Var}(R_p^{(N)}) \to 0 \quad \text{as } N \to \infty$$
implies that
$$E(R_p^{(N)}) \to 0 \quad \text{as } N \to \infty.$$
Thus, under the no-asymptotic-arbitrage assumption, a zero-investment
portfolio with a small return variance must also have a small mean return.
Suppose that for each N = 1, 2, . . . the factor model
$\alpha 1_N + \beta^{(N)} \Gamma$
where Ft is a vector of factor values and $\epsilon_t$ is the residual return vector, satisfying
$E(\epsilon_t) = 0$ in addition to the other conditions described in Section 10.3.
According to APT, we may write
Let
$$\bar{R}_i - \bar{R}_f = \frac{1}{T} \sum_{t=1}^{T} (R_{i,t} - R_{f,t})$$
denote the sample mean excess return on the asset. It follows from (10.8) that
where βi,1 , βi,2 , . . . , βi,K are the factor sensitivities of asset i and
α, Γ1 , Γ2 , . . . , ΓK are unknown parameters; Γk is known as the risk premium
or, simply, the premium of factor k. The premium of a factor represents
the reward, in terms of a greater expected return, for assuming the risk
associated with a factor. Thus, our goal is to estimate the factor premiums
Γ1 , Γ2 , . . . , ΓK or, equivalently, the vector Γ = (Γ1 , Γ2 , . . . , ΓK )T along with
the scalar parameter α.
This equation gives an expression for the excess mean return vector
E(Rt − Rf 1) in terms of the factor sensitivities for the assets, as given by the
matrix β, the vector α, which is a property of the assets under consideration,
and the vector of factor means E(Ft ).
APT tells us that the vector α is approximately of the form
α1 + βΓ̃ (10.11)
for a scalar α and a vector Γ̃. Using this fact in (10.10), and defining Γ to
be Γ̃ + E(Ft ), leads to (10.8). Thus, it is important to keep in mind that, in
the expression (10.8), the parameter Γ is different than the vector of factor
means, E(Ft ). In particular, we cannot estimate Γ by the sample means of
the factors.
A second important fact to keep in mind is that the result of APT is valid
only if the factor model correctly describes the asset returns under consideration.
There are at least two aspects to this. First, we assume that the error
term $\epsilon_t$ in (10.7) is uncorrelated with the factor vector Ft . In particular, if we
inadvertently omit an important factor from the factor model, and that factor
is correlated with the factors in the model, then the least-squares estimators
of the factor sensitivities will be biased, a case of omitted-variable bias.
A second aspect of model misspecification is its effect on the correlation
structure of $\epsilon_t$. A key part of the proof of (10.6) is that we can form a portfolio
in which the variance of residual returns of the portfolio is negligible. If the
covariance matrix of $\epsilon_t$ is diagonal, then this is generally possible when N
is large. However, if, after accounting for the factors, the asset returns are
still correlated, it might not be possible. Thus, it is important that the factor
model includes those factors that play an important role in describing the
correlation between the returns of different assets. That is, there is a second
potential problem related to an omitted factor, in addition to possible omitted
variable bias—the omitted variable may lead to correlation in the residual
returns for different assets, so that the expression in (10.11) is not an accurate
approximation to α.
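The role of the diagonal error covariance matrix can be seen from a simple bound. The following is a sketch under two assumptions that we introduce for illustration: the residual variances are bounded, $\sigma_{\epsilon,i}^2 \le c$ for all $i$, and the weights of the zero-investment portfolio are spread across the assets, $|v_{N,i}| \le b/N$ for some constant $b$.

```latex
v_N^T \Sigma_\epsilon v_N
  = \sum_{i=1}^{N} v_{N,i}^2 \, \sigma_{\epsilon,i}^2
  \le N \cdot \frac{b^2}{N^2} \cdot c
  = \frac{b^2 c}{N} \longrightarrow 0
  \quad \text{as } N \to \infty .
```

The first equality uses the fact that $\Sigma_\epsilon$ is diagonal; when the residual returns of different assets are correlated, the off-diagonal terms of $\Sigma_\epsilon$ also contribute to the residual variance of the portfolio and this bound no longer applies.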
for i = 1, 2, . . . , N but with the estimates β̂i,k replacing the parameters βi,k .
Specifically, we use least-squares regression with the sample mean excess
returns, R̄i − R̄f , i = 1, 2, . . . , N , as the observed response variable and the
first-stage estimates of the factor sensitivities, β̂i,1 , β̂i,2 , . . . , β̂i,K , as the pre-
dictor variables corresponding to R̄i − R̄f . The methodology is illustrated in
the following example.
Example 10.6 Recall that in Example 10.2, stock return data for firms rep-
resented in the S&P 500 index were analyzed; 474 of the stocks had returns
for the period under consideration. The data matrix for these 474 stocks is
stored in the variable sp474.data.
The first step is to estimate the parameters of a factor model for each
stock. Here, we use four factors: the return on the S&P 500 index (sp500),
the factors SMB and HML described in Example 10.5 (variables smb and hml,
respectively), and a “momentum” factor, denoted by MOM. Like SMB and
HML, the momentum factor is based on constructing portfolios from two sets
of stocks, those that have performed well in recent months and those that have
performed poorly; the factor MOM is the return on a zero-investment portfolio
based on the difference in the returns on these two portfolios. Thus, MOM is
different than SMB and HML in the sense that it is based on previous returns
of the assets rather than on properties of the firms issuing the stock. The
momentum factor is originally attributed to Carhart (1997); values of MOM
are available in the Kenneth R. French Data Library, in the section “Sorts
involving Prior Returns.” The variable mom contains the values of MOM from
the French Data Library, divided by 100.
The factor model estimates for the 474 stocks may be calculated using
> sp474.fm<-lm(sp474.data~sp500+smb+hml+mom)
The estimates of β1 , the coefficient of the S&P 500 index; β2 , the coefficient
of SMB; β3 , the coefficient of HML; and β4 , the coefficient of MOM, for each
of the 474 stocks are stored in the matrix sp474.beta, which has 474 rows
and 4 columns and that may be obtained by the command
Note that the index [, -1] is used in defining sp474.beta in order to drop
the column of intercept estimates; the transpose function t is used so that
sp474.beta has the same form as the parameter β.
To estimate the factor premiums, we fit a regression model with the sample
mean excess returns for each of the 474 stocks as the response variable and
beta estimates as the predictor variables. Note that, in the function lm, we
may specify the model in terms of a matrix of predictor variables; then the
columns of the matrix are taken as the predictors.
Therefore, the estimated mean excess return on a stock with beta estimates
given by β̂1 , β̂2 , β̂3 , and β̂4 is
According to the R-squared value for the regression, about 25.4% of the vari-
ation in the sample mean excess returns on the 474 stocks can be explained
by their estimates of the factor sensitivities; a better estimate of this quantity
is the adjusted R-squared value, which, in this case, is essentially the same
at 24.8%.
For instance, for the stock of the 3M Company (symbol MMM), β̂1 = 1.008,
β̂2 = 0.00366, β̂3 = −0.000988, and β̂4 = −0.000107. Therefore, the estimate
of the mean excess return on 3M stock is
for comparison, the observed sample mean excess return for MMM is 0.0148.
For the 474 stocks, the estimated mean excess returns are given by fitted
values from the regression
> sp474.fit<-lm(sp474.mean~sp474.beta)$fitted.values
and the average error in the fitted values as estimates of sample mean excess
return is
> mean(abs(sp474.fit-sp474.mean))
[1] 0.00535
The correlation of the fitted values and the sample mean excess returns is
> cor(sp474.fit, sp474.mean)
[1] 0.504
which is simply the square root of the R-squared value from the regression.
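The two-stage procedure used in this example can be sketched on simulated data, which also makes it possible to compare the estimated premiums with known values. Everything in the sketch below is hypothetical: the sample sizes, the factor matrix Fmat, the sensitivities beta.true, and the parameters a and Gamma.tilde are made up, and the intercept vector is constructed to have the form given in (10.11), so that the premiums being estimated are Gamma.tilde plus the factor means rather than the factor means themselves.

```r
set.seed(4)
n.obs <- 120; N <- 50                       # hypothetical sample sizes

# Simulated factors and true sensitivities (all values made up)
Fmat <- cbind(f1 = rnorm(n.obs, 0.005, 0.04),
              f2 = rnorm(n.obs, 0.002, 0.03))
beta.true <- cbind(runif(N, 0.5, 1.5), runif(N, -1, 1))

# Intercepts of the form a * 1 + beta %*% Gamma.tilde, as in (10.11)
a <- 0.001
Gamma.tilde <- c(0.004, -0.002)
alpha <- a + drop(beta.true %*% Gamma.tilde)

# Simulated excess returns: rows are periods, columns are assets
ex.ret <- matrix(alpha, n.obs, N, byrow = TRUE) + Fmat %*% t(beta.true) +
  matrix(rnorm(n.obs * N, sd = 0.02), n.obs, N)

# Stage 1: time-series regressions, one per asset, give the sensitivities
stage1 <- lm(ex.ret ~ Fmat)
beta.hat <- t(stage1$coef)[, -1]            # N x 2 matrix; intercept column dropped

# Stage 2: cross-sectional regression of mean excess returns on beta.hat
mean.ret <- colMeans(ex.ret)
stage2 <- lm(mean.ret ~ beta.hat)
est <- unname(stage2$coef[-1])              # estimated factor premiums

# In-sample target: Gamma.tilde plus the sample means of the factors
target <- unname(Gamma.tilde + colMeans(Fmat))
r2 <- summary(stage2)$r.squared

print(round(rbind(estimate = est, target = target), 4))
print(c(cor.fitted = cor(fitted(stage2), mean.ret), sqrt.r2 = sqrt(r2)))
```

In this simulated setting, the second-stage estimates are close to the target premiums, and the correlation between the fitted values and the sample mean excess returns equals the square root of the R-squared value, as noted in the example.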
as the dependent variable in the analysis for period t. Thus, if our data con-
sist of 5 years of monthly returns, we would obtain 60 estimates of each factor
premium—one estimate for each time period. These estimates are obtained
using least-squares regression with the estimated factor sensitivities as the
predictor variables; note that the same predictor variables are used in each
regression. The final estimates of the factor premiums are given by the sample
means of these 60 estimates, and the standard errors of the premium
estimates are given by the usual expression for the standard error of a sample
mean, that is, the sample standard deviation of the 60 estimates divided by
the square root of 60.
Example 10.7 Consider estimation of the factor premiums for the factor
model analyzed in Examples 10.2 and 10.6. Recall that the data matrix for
the 474 stocks under consideration is stored in the variable sp474.data and
the corresponding estimates of the factor sensitivities for the assets are stored
in the matrix sp474.beta.
To estimate the factor premiums for each time period, we may use the
command
> spcoef<-lm(t(sp474.data)~sp474.beta)$coef
Note that the data matrix must be transposed in order to use it as the response
variable in lm. The result of this command, spcoef, is a matrix with five rows,
one for the intercept of the model and one for each factor coefficient in the
model, and 60 columns, one for each time period. For instance, the first
column of the matrix
> spcoef[,1]
(Intercept) sp500 smb hml mom
0.0334 -0.0681 0.0118 0.0117 -0.0593
gives the estimated factor premiums based on the data in period 1.
To obtain the overall estimates of the factor premiums, we average the
estimates in each row of spcoef
> apply(spcoef, 1, mean)
(Intercept) sp500 smb hml mom
0.014464 0.000372 0.006519 -0.004380 0.006333
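Because the same predictors are used in every period and least-squares estimates are linear in the response, averaging the per-period coefficients gives the same values as a single regression of the sample mean returns on the sensitivities. The following sketch checks this on simulated data; `beta.hat`, `ret`, and `per.period` are hypothetical names introduced here, not objects from the text.

```r
set.seed(3)
n.months <- 60; n.stocks <- 30
# Hypothetical factor sensitivities: one row per stock, one column per factor
beta.hat <- cbind(f1 = runif(n.stocks, 0.5, 1.5), f2 = rnorm(n.stocks))
# Hypothetical excess returns: one row per month, one column per stock
ret <- matrix(rnorm(n.months * n.stocks, mean = 0.01, sd = 0.05),
              n.months, n.stocks)

# One cross-sectional regression per month; the response must be stocks x months
per.period <- coef(lm(t(ret) ~ beta.hat))     # 3 rows, 60 columns
prem.avg <- rowMeans(per.period)              # overall premium estimates
prem.se <- apply(per.period, 1, sd) / sqrt(n.months)   # their standard errors

# Single regression with the sample mean returns as the response
prem.mean <- coef(lm(colMeans(ret) ~ beta.hat))
all.equal(unname(prem.avg), unname(prem.mean))
```

The advantage of the per-period approach is that it also yields the standard errors in `prem.se`, which the single regression on sample means does not provide in the same form.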
Rolling Regressions
In the models used in this section to estimate factor premiums, the excess
returns on the stocks under consideration are related to those stocks’ estimated
Example 10.8 Consider estimation of the factor premiums for the factor
model analyzed in Examples 10.2 and 10.6. In Example 10.6, estimates of
the factor premiums were obtained using least-squares regression with the
response variable taken to be the sample mean excess returns for the 474
assets and the predictor variables given by the estimated factor sensitivities.
The response and predictor variables are all based on data in the matrix
sp474.data, which contains the monthly excess returns on the 474 stocks for
the 5-year period ending December 31, 2014.
Now suppose we are interested in using the estimated factor sensitivities
to describe future asset returns. The variable sp474.data.115 contains the
excess returns for the 474 stocks for January 2015. Therefore, we can estimate
the factor premiums for the four factors in the model using least-squares
regression with response variable sp474.data.115 and the predictor variables
given in the matrix sp474.beta.
> summary(lm(sp474.data.115~sp474.beta))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.04205 0.00821 5.12 4.5e-07 ***
sp474.betasp500 -0.06412 0.00752 -8.52 < 2e-16 ***
sp474.betasmb 0.00330 0.00607 0.54 0.59
sp474.betahml -0.02486 0.00561 -4.43 1.2e-05 ***
sp474.betamom 0.00248 0.00808 0.31 0.76
The estimates obtained in the previous example have the drawback that
the response variable in the regression is based on the stock returns from
only a single month. One way to include additional months of returns in
the analysis is to use rolling regressions. For instance, suppose that we have
m months of asset returns, along with the corresponding factor values, and
suppose that our goal is to estimate the factor premiums using estimates of
the factor sensitivities based on 60 months of data.
To do this, we first estimate the factor sensitivities using data from months
1 through 60 and then use those results as the predictor variables in a regres-
sion model, with the response variable taken to be the excess returns for
all stocks from month 61. This procedure may then be repeated, estimating
the factor sensitivities using data from months 2 through 61 and estimating
the factor premiums using a regression model with response variable based
on data from month 62. We may continue in this way until the final regres-
sion with the returns from month m as the response variable and the factor
sensitivities estimated using data from months m − 60 through m − 1 as the
predictors. The result is m − 60 estimates of the factor premiums, which can
be averaged using an approach similar to the one used in Example 10.7 to
obtain the final estimates.
The matrix pmat stores the 12 sets of factor premiums, one in each column.
For a given value of the index j, dat contains the return data and x.fact
contains the factor data for months j to j + 59 that will be used to obtain the
estimates of the factor sensitivities; these estimates are calculated using the
command lm(dat~x.fact)$coef and are stored in the variable bet. Note that
the transpose function t is used to put the estimates in the correct format
to use in a subsequent function, and the index [, -1] is used to drop the
estimate of the intercept term from the results.
The command lm(sp474.data6[(j+60), ]~bet) runs the regression with
the returns in month j + 60 as the response variable and the estimated factor
sensitivities in bet as the predictor variables. The estimated coefficients from
this regression are stored in the jth column of the matrix pmat. After the loop
is completed, pmat contains 12 estimates of each factor premium.
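The loop described above can be sketched as follows. This is a reconstruction from the description on simulated data, not the original listing: the names `pmat`, `dat`, `x.fact`, and `bet` follow the text, while `ret.all`, `fact`, `win`, `n.est`, and the two simulated factors are assumptions made for illustration.

```r
set.seed(4)
m <- 72; n.stocks <- 25; win <- 60       # 72 months of data, 60-month estimation window
# Hypothetical factor values and returns (stand-ins for the book's data)
fact <- matrix(rnorm(m * 2, sd = 0.04), m, 2,
               dimnames = list(NULL, c("f1", "f2")))
b <- cbind(runif(n.stocks, 0.5, 1.5), rnorm(n.stocks, 0, 0.5))
ret.all <- fact %*% t(b) + matrix(rnorm(m * n.stocks, sd = 0.03), m, n.stocks)

n.est <- m - win                         # number of rolling estimates (12 here)
pmat <- matrix(0, 3, n.est)              # intercept and 2 premiums, one column per month
for (j in 1:n.est) {
  dat <- ret.all[j:(j + win - 1), ]      # returns for months j to j + 59
  x.fact <- fact[j:(j + win - 1), ]      # factor values for the same months
  # Sensitivities: transpose the coefficients and drop the intercept column
  bet <- t(coef(lm(dat ~ x.fact)))[, -1]
  # Premiums: regress the returns in month j + 60 on the estimated sensitivities
  pmat[, j] <- coef(lm(ret.all[j + win, ] ~ bet))
}
apply(pmat, 1, mean)                     # averaged premium estimates
apply(pmat, 1, sd) / sqrt(n.est)         # standard errors of the averages
```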
Using the approach from Example 10.7, the factor premium estimates
may then be obtained by averaging the 12 values in each row of the matrix
pmat
> apply(pmat, 1, mean)
[1] 0.00762 -0.00851 0.00235 -0.00878 0.00895
and the standard errors may be estimated using
> apply(pmat, 1, sd)/(12^.5)
[1] 0.00846 0.00889 0.00290 0.00391 0.00875
Of course, the first value in these vectors refers to the intercept, which is not
a factor.
These results may be compared to those obtained in Example 10.7, in
which the estimated factor premiums are given by
> apply(spcoef, 1, mean)
(Intercept) sp500 smb hml mom
0.014464 0.000372 0.006519 -0.004380 0.006333
with standard errors given by
> apply(spcoef, 1, sd)/(60^.5)
(Intercept) sp500 smb hml mom
0.00276 0.00553 0.00324 0.00272 0.00366
The two sets of premium estimates have several differences that are likely
to be important in practice. For instance, the estimate of the premium for the
return on the S&P 500 index was found to be 0.000372 in Example 10.7; using
future returns as the response variable, the estimate is −0.00851. However,
the standard errors of both estimates are relatively large so that the difference
is unlikely to be statistically significant.
The average R-squared value for the 12 regressions used to calculate
the estimates in pmat is roughly 0.11, which is considerably less than the
R-squared value in Example 10.6. This is not surprising; we expect that
estimates of the factor sensitivities will have a weaker relationship with
future returns than they have with returns for the time period used for their
estimation.
Note that the standard errors of the estimates based on the rolling
regressions are generally larger than those of the estimates calculated in
Example 10.7, which is not surprising since they are based on the averages
of only 12 rolling estimates as compared to the averages of 60 estimates used
in Example 10.7. Of course, by going back further in time, we could obtain
more premium estimates, thus reducing the standard errors; as always, such
an approach is only useful if the relationships among the variables do not
change in important ways over the time period considered. Thus, one draw-
back of the rolling-regression method is that the estimates must be based
on a relatively long series of data in order to achieve the standard errors
Example 10.10 Consider stocks for firms represented in the S&P 100 index;
see Example 8.5. The data matrix sp96.data contains 5 years of monthly
returns for the period ending December 31, 2014, for each stock; only 96 of
the 100 stocks had 5 years of monthly returns available, so sp96.data has 60
rows and 96 columns.
Consider the Fama–French three-factor model used in Example 10.5; using
this model, the covariance matrix of the returns may be estimated using the
same approach used in Example 10.4.
> sp96.ff<-lm(sp96.data~sp500+smb+hml)
> sp96.ff.beta<-sp96.ff$coef[-1,]
> Sig.FF<-cov(cbind(sp500, smb, hml))
> f.sighat.ff<-function(y){summary(lm(y~sp500+smb+hml))$sigma}
> sp96.Sig.fact<-t(sp96.ff.beta)%*%Sig.FF%*%sp96.ff.beta +
+ diag(apply(sp96.data, 2, f.sighat.ff)^2)
> library(quadprog)
> sp96.mv<-solve.QP(Dmat=2*sp96.Sig.fact, dvec=rep(0, 96),
+ Amat=cbind(rep(1,96), diag(96)), bvec=c(1, rep(0, 96)),
+ meq=1)$solution
Note that many of the estimated weights are close to zero, but not exactly
zero
> head(sp96.mv)
[1] -4.2e-18 -2.5e-17 -5.3e-17 -3.0e-17 -2.8e-17 1.0e-02
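Values such as -4.2e-18 are numerical noise from the solver rather than genuine short positions. One possible way to tidy the result, sketched here on a hypothetical weight vector `w` with an assumed tolerance `tol`, is to set negligible weights to zero and rescale so that the weights still sum to one.

```r
# Hypothetical solver output: tiny values of either sign where a weight should be zero
w <- c(-4.2e-18, 0.25, -2.5e-17, 0.35, 3.0e-17, 0.40)
tol <- 1e-8                              # assumed threshold; adjust to the problem at hand
w.clean <- ifelse(abs(w) < tol, 0, w)    # zero out negligible weights
w.clean <- w.clean / sum(w.clean)        # restore the sum-to-one constraint
w.clean
```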
The mean excess return on an asset may be estimated using the same gen-
eral approach. Such an estimate is based on estimates of the factor premiums,
as discussed in Section 10.6, along with estimates of the factor sensitivities
for a given asset. The methodology is illustrated in the following example.
Example 10.11 Consider the factor model with factors SMB, HML, MOM,
along with the excess return on the S&P 500 index. Estimates of the factor
premiums were obtained in Example 10.6; they are stored in the variable
fact.prem
> fact.prem
sp500 smb hml mom
0.000372 0.006519 -0.004380 0.006333
The constant in the equation for an asset’s mean excess return in terms of its
factor sensitivities is 0.01446.
For example, for 3M Company stock (symbol MMM), the factor sensitiv-
ities are
> sp474.beta["MMM",]
sp500 smb hml mom
1.0080 0.3661 -0.0988 -0.0107
It follows that the estimated mean excess return for 3M stock is
> 0.01446 + sum(fact.prem*sp474.beta["MMM", ])
[1] 0.0176
This may be compared to the sample mean excess return of 0.0148.
> funds3.fact<-lm(funds3.data~sp500+smb+hml+mom)
> t(funds3.fact$coef)[,-1]
sp500 smb hml mom
tisex 1.040 0.9498 -0.0427 -0.0278
prsvx 0.893 0.9412 0.2335 -0.0494
vivax 0.950 -0.0161 0.2578 -0.0483
The adjusted R-squared values for the four-factor model regressions are given
by
> f.rsq<-function(y)
+ {summary(lm(y~sp500+smb+hml+mom))$adj.r.squared}
> apply(funds3.data, 2, f.rsq)
tisex prsvx vivax
0.985 0.974 0.983
indicating that the returns of each of the funds are closely related to the four
factors.
The estimated coefficients of the return on the market index are simi-
lar for the three funds; however, their sensitivities to the other factors are
often quite different. For instance, TISEX and PRSVX both have a relatively
large sensitivity to SMB of about 0.95, indicating that the returns on both
funds have a positive relationship to the returns on small cap stocks, and
their sensitivities to MOM are similar; however, PRSVX has a relatively large
positive sensitivity to HML while TISEX has a negative sensitivity to this
factor. This suggests that PRSVX tends to invest in more value stocks than
does TISEX. The funds PRSVX and VIVAX have similar coefficients of the
market index, HML, and MOM; however, the coefficient of SMB for VIVAX
is small and negative, while for PRSVX it is large and positive. This sug-
gests that the returns on PRSVX are taking advantage of a “small-cap stock
effect” while the returns on VIVAX are approximately unrelated to the size
factor.
$$\beta_{p,j} = \sum_{i=1}^{N} w_i \beta_{i,j}, \quad j = 1, 2, \ldots, K,$$
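The weighted sum in this expression is a matrix product of the transposed sensitivity matrix with the weight vector. A small worked example with made-up numbers (the matrix `beta` and weights `w` are hypothetical):

```r
# Hypothetical sensitivities for N = 3 assets (rows) and K = 2 factors (columns)
beta <- matrix(c(1.0, 0.8, 1.2,
                 0.3, -0.1, 0.5), nrow = 3,
               dimnames = list(c("A", "B", "C"), c("f1", "f2")))
w <- c(0.5, 0.3, 0.2)                 # portfolio weights, summing to one

# beta_{p,j} = sum over i of w_i * beta_{i,j}, computed for all factors at once
beta.p <- drop(t(beta) %*% w)
beta.p                                # f1: 0.98, f2: 0.22
```

For the first factor, the calculation is 0.5(1.0) + 0.3(0.8) + 0.2(1.2) = 0.98.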
Example 10.13 Consider the stocks represented in the data matrix big8
and consider a factor model based on five economic factors: the return on the
S&P 500 index, stored in the variable sp500; the unemployment rate, stored
in the variable unemp; the change in the Industrial Production Index stored in
the variable indpro; the change in the Consumer Sentiment Index, stored
in the variable consum; and the change in the Consumer Price Index, stored
in the variable cpi.
The Industrial Production Index and the unemployment rate are described
in Example 10.4. The Consumer Sentiment Index is based on the Univer-
sity of Michigan’s Surveys of Consumers; these data are available on the
FRED website at https://research.stlouisfed.org/fred2/series/UMCSENT.
The Consumer Price Index is the Consumer Price Index for All Urban
Consumers prepared by the Bureau of Labor Statistics; it is also available
on the FRED website at https://research.stlouisfed.org/fred2/series/CPIAUCSL.
Some summary statistics of the factors for the time period considered are
given by
The estimated factor sensitivities for the eight stocks are stored in the
variable big8.fact.beta:
> big8.fact<-lm(big8~sp500+unemp+indpro+consum+cpi)
> big8.fact.beta<-t(big8.fact$coefficients)[, -1]
> big8.fact.beta
sp500 unemp indpro consum cpi
AAPL 0.940 0.282 0.011 0.000 0.988
BAX 0.678 -0.304 -0.013 0.017 2.888
KO 0.507 0.092 0.013 -0.119 -0.275
CVS 1.096 -0.446 0.015 0.036 -1.989
XOM 0.846 0.496 -0.022 0.142 2.298
IBM 0.617 0.950 -0.008 0.030 2.030
JNJ 0.504 -0.138 -0.022 -0.023 -1.107
DIS 1.196 0.083 -0.014 0.079 -5.806
The estimate of the return covariance matrix based on the factor model is
stored in the variable big8.Sig.fact
> f.sighat.fact<-function(y){summary(lm(y~sp500+unemp+indpro+
+ consum+cpi))$sigma^2}
> Sig.FF1<-cov(cbind(sp500, unemp, indpro, consum, cpi))
> big8.Sig.fact<-big8.fact.beta%*%Sig.FF1%*%t(big8.fact.beta) +
+ diag(apply(big8, 2, f.sighat.fact))
Using this estimate, along with the sample mean excess returns, the
estimated weight vector of the risk-averse portfolio with the risk-aversion
parameter taken to be λ = 10 is given by
If we enforce the restriction that all asset weights must be nonnegative, the
estimated weights are given by
> big8.ra10<-solve.QP(Dmat=10*big8.Sig.fact, dvec=big8.mean,
+ Amat=cbind(rep(1,8),diag(8)), bvec=c(1, rep(0, 8)),
+ meq=1)$solution
> big8.ra10
[1] 0.244 0.000 0.084 0.270 0.000 0.000 0.195 0.207
The estimated factor sensitivities for the portfolio with weight vector
big8.ra10 are given by
> t(big8.fact.beta)%*%big8.ra10
[,1]
sp500 0.9144
unemp -0.0538
indpro 0.0004
consum 0.0119
cpi -1.7397
Suppose we wish to construct a portfolio that is insensitive to the factors
unemp and cpi. This may be done using the function solve.QP, including the
constraint that the portfolio sensitivities to those factors are both zero. Define
a matrix A by
> A<-cbind(rep(1, 8), big8.fact.beta[, c(2, 5)], diag(8))
Then the second and third rows of the transpose of A contain the factor
sensitivities for unemp and cpi, respectively. Thus, the command
> big8.ra10.con<-solve.QP(Dmat=10*big8.Sig.fact, dvec=big8.mean,
+ Amat=A, bvec=c(1, 0, 0, rep(0, 8)), meq=3)$solution
computes the weights of the risk-averse portfolio based on the risk-aversion
parameter λ = 10 subject to the constraints that all portfolio weights are
nonnegative and that the factor sensitivities for unemp and cpi are zero. The
weights are given by
> big8.ra10.con
[1] 0.317 0.080 0.139 0.226 0.028 0.035 0.175 0.000
and the estimated factor sensitivities are
> t(big8.fact.beta)%*%big8.ra10.con
[,1]
sp500 0.8039
unemp 0.0000
indpro 0.0026
consum -0.0059
cpi 0.0000
Thus, we expect that the returns on the portfolio with weight vector
big8.ra10.con will be less sensitive to the economic conditions reflected in
unemp and cpi than is the portfolio with weight vector big8.ra10.
The estimated mean excess returns and return standard deviations for the
two portfolios are given by
> sum(big8.ra10*big8.mean)
[1] 0.0193
> sum(big8.ra10.con*big8.mean)
[1] 0.0173
> (big8.ra10%*%big8.Sig.fact%*%big8.ra10)^.5
[,1]
[1,] 0.0411
> (big8.ra10.con%*%big8.Sig.fact%*%big8.ra10.con)^.5
[,1]
[1,] 0.0392
Thus, the constrained portfolio has a smaller estimated mean excess return
but, because its return does not depend on the factors unemp and cpi, its
estimated return standard deviation is smaller as well. In terms of the risk-
aversion criterion function, the value for the portfolio without the factor
constraints is 0.0109, while the value for the constrained portfolio is slightly
less, at 0.0096.
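The two criterion values can be reproduced from the reported means and standard deviations, assuming the criterion has the form of the mean minus λ/2 times the variance. That form is an assumption here, although it is consistent with the solve.QP calls above and with the reported figures; the helper `crit` is a hypothetical name.

```r
lambda <- 10
# Assumed criterion: mean excess return minus (lambda / 2) times the return variance
crit <- function(mu, sigma) mu - (lambda / 2) * sigma^2

round(crit(0.0193, 0.0411), 4)   # portfolio with weights big8.ra10: 0.0109
round(crit(0.0173, 0.0392), 4)   # portfolio with weights big8.ra10.con: 0.0096
```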
10.9 Exercises
1. Let β denote an N × K matrix, let ΣF denote a K × K symmetric
matrix, and let Σ denote an N × N symmetric matrix. Show that
$$\beta \Sigma_F \beta^T + \Sigma \tag{10.12}$$
and that the residual standard deviations for these assets are given
by 0.4, 0.5, 0.2, 0.6, respectively. Let
$$\Sigma_F = \begin{pmatrix} 0.15 & 0.035 & 0.015 \\ 0.035 & 0.050 & 0.002 \\ 0.015 & 0.002 & 0.030 \end{pmatrix}.$$
Find the covariance matrix of the return vector for these assets
under the assumption that the factor model holds and give the
corresponding correlation matrix.
3. Consider a factor model with three factors applied to a set of four
assets. For the parameter values given in Exercise 2, find the fac-
tor sensitivities and residual standard deviation for the equally
weighted portfolio of the four assets.
4. Calculate 5 years of monthly excess returns for the period ending
December 31, 2015, for five stocks, Papa John’s International, Inc.
(symbol PZZA), Bed Bath & Beyond, Inc. (BBBY), Netflix, Inc.
(NFLX), Time Warner, Inc. (TWX), and Verizon Communications,
Inc. (VZ); for the risk-free rate, use the return on the 3-month
Treasury Bill, available on the Federal Reserve website.