Introduction to Statistical Methods for Financial Models

T&F Cat #K31368 — K31368 C000— page i — 6/14/2017 — 22:05

CHAPMAN & HALL/CRC
Texts in Statistical Science Series
Series Editors
Joseph K. Blitzstein, Harvard University, USA
Julian J. Faraway, University of Bath, UK
Martin Tanner, Northwestern University, USA
Jim Zidek, University of British Columbia, Canada

Introduction to
Statistical Methods for
Financial Models

Thomas A. Severini
Northwestern University
Evanston, Illinois, USA


CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2018 by Taylor & Francis Group, LLC

CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper

International Standard Book Number-13: 978-1-138-19837-1 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known
or hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization
that provides licenses and registration for a variety of users. For organizations that have been granted
a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Names: Severini, Thomas A. (Thomas Alan), 1959- author.

Title: Introduction to statistical methods for financial models / Thomas A.
Severini.
Description: Boca Raton, FL : CRC Press, [2018] | Includes bibliographical
references and index.
Identifiers: LCCN 2017003073| ISBN 9781138198371 (hardback) | ISBN
9781315270388 (e-book master) | ISBN 9781351981910 (adobe reader) | ISBN
9781351981903 (e-pub) | ISBN 9781351981897 (mobipocket)
Subjects: LCSH: Finance--Statistical methods. | Finance--Mathematical models.
Classification: LCC HG176.5 .S49 2017 | DDC 332.072/7--dc23
LC record available at https://lccn.loc.gov/2017003073

Visit the Taylor & Francis Web site at

http://www.taylorandfrancis.com

and the CRC Press Web site at

http://www.crcpress.com


To Karla

Contents

Preface

1 Introduction

2 Returns
2.1 Introduction
2.2 Basic Concepts
2.3 Adjusted Prices
2.4 Statistical Properties of Returns
2.5 Analyzing Return Data
2.6 Suggestions for Further Reading
2.7 Exercises

3 Random Walk Hypothesis
3.1 Introduction
3.2 Conditional Expectation
3.3 Efficient Markets and the Martingale Model
3.4 Random Walk Models for Asset Prices
3.5 Tests of the Random Walk Hypothesis
3.6 Do Stock Returns Follow the Random Walk Model?
3.7 Suggestions for Further Reading
3.8 Exercises

4 Portfolios
4.1 Introduction
4.2 Basic Concepts
4.3 Negative Portfolio Weights: Short Sales
4.4 Optimal Portfolios of Two Assets
4.5 Risk-Free Assets
4.6 Portfolios of Two Risky Assets and a Risk-Free Asset
4.7 Suggestions for Further Reading
4.8 Exercises

5 Efficient Portfolio Theory
5.1 Introduction
5.2 Portfolios of N Assets
5.3 Minimum-Risk Frontier
5.4 The Minimum-Variance Portfolio
5.5 The Efficient Frontier
5.6 Risk-Aversion Criterion
5.7 The Tangency Portfolio
5.8 Portfolio Constraints
5.9 Suggestions for Further Reading
5.10 Exercises

6 Estimation
6.1 Introduction
6.2 Basic Sample Statistics
6.3 Estimation of the Mean Vector and Covariance Matrix
6.4 Weighted Estimators
6.5 Shrinkage Estimators
6.6 Estimation of Portfolio Weights
6.7 Using Monte Carlo Simulation to Study the Properties of Estimators
6.8 Suggestions for Further Reading
6.9 Exercises

7 Capital Asset Pricing Model
7.1 Introduction
7.2 Security Market Line
7.3 Implications of the CAPM
7.4 Applying the CAPM to a Portfolio
7.5 Mispriced Assets
7.6 The CAPM without a Risk-Free Asset
7.7 Using the CAPM to Describe the Expected Returns on a Set of Assets
7.8 Suggestions for Further Reading
7.9 Exercises

8 The Market Model
8.1 Introduction
8.2 Market Indices
8.3 The Model and Its Estimation
8.4 Testing the Hypothesis that an Asset Is Priced Correctly
8.5 Decomposition of Risk
8.6 Shrinkage Estimation and Adjusted Beta
8.7 Applying the Market Model to Portfolios
8.8 Diversification and the Market Model
8.9 Measuring Portfolio Performance
8.10 Standard Errors of Estimated Performance Measures
8.11 Suggestions for Further Reading
8.12 Exercises

9 The Single-Index Model
9.1 Introduction
9.2 The Model
9.3 Covariance Structure of Returns under the Single-Index Model
9.4 Estimation
9.5 Applications to Portfolio Analysis
9.6 Active Portfolio Management and the Treynor–Black Method
9.7 Suggestions for Further Reading
9.8 Exercises

10 Factor Models
10.1 Introduction
10.2 Limitations of the Single-Index Model
10.3 The Model and Its Estimation
10.4 Factors
10.5 Arbitrage Pricing Theory
10.6 Factor Premiums
10.7 Applications of Factor Models
10.8 Suggestions for Further Reading
10.9 Exercises

References

Index

Preface

This book provides an introduction to the use of statistical concepts and
methods to model and analyze financial data; it is an expanded version of
notes used for an advanced undergraduate course at Northwestern University,
“Introduction to Financial Statistics.” A central theme of the book is that by
modeling the returns on assets as random variables, and using some basic
concepts of probability and statistics, we may build a methodology for analyzing
and interpreting financial data.
The audience for the book is students majoring in statistics and economics
as well as in quantitative fields such as mathematics and engineering; the book
can also be used for a master’s level course on statistical methods for finance.
Readers are assumed to have taken at least two courses in statistical methods
covering basic concepts such as elementary probability theory, expected values,
correlation, and conditional expectation as well as introductory statistical
methodology such as estimation of means and standard deviations and basic
linear regression. They are also assumed to have taken courses in multivariate
calculus and linear algebra; however, no prior experience with finance or
financial concepts is required or expected.
The 10 chapters of the book fall naturally into three sections. After a
brief introduction to the book in Chapter 1, Chapters 2 and 3 cover some
basic concepts of finance, focusing on the properties of returns on an asset.
Chapters 4 through 6 cover aspects of portfolio theory, with Chapter 4 containing
the basic ideas and Chapter 5 presenting a more mathematical treatment
of efficient portfolios; the estimation of the parameters needed to implement
portfolio theory is the subject of Chapter 6. The remainder of the book,
Chapters 7 through 10, discusses several models for financial data, along with
the implications of those models for portfolio theory and for understanding the
properties of return data. These models begin with the capital asset pricing
model in Chapter 7; its more empirical version, the market model, is covered
in Chapter 8. Chapter 9 covers the single-index model, which extends the
market model to the returns on several assets; more general factor models are
the topic of Chapter 10.
In addition to building on the basic concepts covered in math and statistics
courses, the book introduces some more advanced topics in an applied
setting. Such topics include covariance matrices and their properties, shrinkage
estimation, the use of simulation to study the properties of estimators,
multiple testing, estimation of standard errors using resampling, and
optimization methods. The discussion of such methods focuses on their use and
the interpretation of the results, rather than on the underlying theory.
Data analysis and computation play a central role in the book. There are
detailed examples illustrating how the methods presented may be implemented
in the statistical software R; the methods described are applied to genuine
financial data, which may be conveniently downloaded directly into R. These
examples include both the use of R packages when available and the writing
of small R programs when necessary. I have tried to provide sufficient details
so that readers with even minimal experience in R can successfully implement
the methodology; however, those with no R experience will likely benefit from
one of the many introductory books or online tutorials available.
Each chapter ends with exercises and suggestions for further reading. The
exercises include both questions requiring analytic solutions and those requiring
data analysis or other numerical work; in nearly all cases, any R functions
needed have been discussed in the examples in the text. Finance and financial
statistics are well-studied fields about which much has been written. The
books and papers given as suggestions for further reading were chosen based
on the expected background of the reader, rather than to reference the most
definitive treatments of a topic.
I would like to thank Karla Engel who was instrumental in preparing the
manuscript and who provided many useful comments and corrections; it is
safe to say that this book would not have been completed without her help.
I would like to thank Matt Davison (University of Western Ontario) for a
number of valuable comments and suggestions. Several anonymous reviewers
made helpful comments at various stages of the project and their contributions
are gratefully acknowledged. I would also like to thank Rob Calver and the
staff at CRC Press/Taylor & Francis for suggestions and other help throughout
the publishing process.


1 Introduction

The goal of this book is to present an introduction to the statistical methodology
used in investment analysis and financial econometrics, which are
concerned with analyzing the properties of financial markets and with evaluating
potential investments. Here, an “investment” refers to the purchase of
an asset, such as a stock, that is expected to generate income, appreciate in
value, or ideally both. The evaluation of such an investment takes into account
its potential financial benefits, along with the “risk” of the investment based
on the fact that the asset may decrease in value or even become worthless.
A major advance in the science of investment analysis took place beginning
in the 1950s when probability theory began to be used to model the
uncertainty inherent in any investment. The “return” on an investment, that
is, the proportional change in its value over a given period of time, is modeled
as a random variable and the investment is evaluated by the properties of
the probability distribution of its return. The methods used in this statistical
approach to investment analysis form an important component of the field
known as quantitative finance or, more recently, financial engineering. The
methodology used in quantitative finance may be contrasted with that based
on fundamental analysis, which attempts to measure the “true worth” of an
asset; for example, in the case of a stock, fundamental analysis uses financial
information regarding the company issuing the stock, along with more
qualitative measures of the firm’s profitability.
For instance, in the statistical models used in analyzing investments, the
expected value of the return on an asset gives a measure of the expected
financial benefit from owning the asset and the standard deviation of the
return is a measure of its variability, representing the risk of the investment.
It follows that, based on this approach, an ideal investment has a return with
a large expected value and a small standard deviation or, equivalently, a large
expected value and a small variance. Thus, the analysis of investments using
these ideas is often referred to as mean-variance analysis.
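
As a small illustration of these two summaries, the following sketch computes returns as proportional changes in price and then reports their sample mean and standard deviation. It is written in Python for illustration only (the book's own examples use R); the function name `simple_returns` and both price series are hypothetical, made-up values rather than real data.

```python
from statistics import mean, stdev


def simple_returns(prices):
    """Return the proportional change in value from each period to the next."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


# Hypothetical month-end prices for two assets (illustrative numbers only).
prices_a = [100.0, 102.0, 101.0, 104.0, 106.0, 105.0]
prices_b = [50.0, 56.0, 49.0, 58.0, 52.0, 60.0]

for name, prices in [("A", prices_a), ("B", prices_b)]:
    r = simple_returns(prices)
    # The sample mean estimates the expected return; the sample standard
    # deviation estimates the variability (risk) of the return.
    print(name, "mean return:", round(mean(r), 4), "std dev:", round(stdev(r), 4))
```

In a mean-variance comparison, an asset is more attractive the larger the first number and the smaller the second; with only five returns per asset, these estimates are of course very rough.
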
Concepts from probability and statistics have been used to develop a formal
mathematical framework for investment analysis. In particular, the properties
of the returns on a portfolio, a set of assets owned by a particular investor,
may be derived using properties of sums of the random variables representing
the returns on the individual assets. This approach leads to a methodology for
selecting assets and constructing portfolios known as modern portfolio theory
or Markowitz portfolio theory, after Harry Markowitz, one of the pioneers in
this field.
A central concept in this theory is the risk aversion of investors, which
assumes that, when choosing between two investments with the same expected
return, investors will prefer the one with the smaller risk, that is, the one with
the smaller standard deviation; thus, the optimal portfolios are the ones that
maximize the expected return for a given level of risk or, conversely, minimize
the risk for a given expected return. It follows that numerical optimization
methods, which may be used to minimize measures of risk or to maximize an
expected return, play a central role in this theory.
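
To make the role of numerical optimization concrete, here is a minimal sketch for the simplest case: a portfolio of two risky assets, with the risk minimized over the weight placed on the first asset and no constraint on the expected return. It assumes the standard formula for the variance of a weighted sum of two correlated random variables; the inputs (standard deviations of 5% and 8%, correlation 0.25) and the function names are hypothetical. The code is Python for illustration only, since the book's own examples use R, and a crude grid search stands in for a proper optimizer.

```python
import math


def portfolio_sd(w, sd1, sd2, rho):
    """Std dev of the return with weight w on asset 1 and 1 - w on asset 2."""
    var = (w * sd1) ** 2 + ((1 - w) * sd2) ** 2 + 2 * w * (1 - w) * rho * sd1 * sd2
    return math.sqrt(max(var, 0.0))  # guard against tiny negative rounding error


def min_risk_weight(sd1, sd2, rho, steps=10_000):
    """Grid search over w in [0, 1] for the weight giving the smallest std dev."""
    grid = (i / steps for i in range(steps + 1))
    return min(grid, key=lambda w: portfolio_sd(w, sd1, sd2, rho))


# Hypothetical inputs: return std devs of 5% and 8%, correlation 0.25.
sd1, sd2, rho = 0.05, 0.08, 0.25
w_star = min_risk_weight(sd1, sd2, rho)
print("weight on asset 1:", w_star)
print("portfolio std dev:", portfolio_sd(w_star, sd1, sd2, rho))
```

With these inputs the minimizing portfolio has a smaller standard deviation than either asset held alone, which is the kind of result an optimizer is asked to find when many assets and constraints are involved.
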
An important feature of these methods is that they do not rely on accurate
predictions of the future asset returns, which are generally difficult to obtain.
The idea that asset returns are difficult to predict accurately is a consequence
of the statistical model for asset prices known as a random walk and the
assumption that asset prices follow a random walk is known as the random
walk hypothesis. The random walk model for prices asserts that changes in the
price of an asset over time are unpredictable, in a certain sense. The random
walk hypothesis is closely related to the efficient market hypothesis, which
states that asset prices reflect all currently available information. Although
there is some evidence that the random walk hypothesis is not literally true,
empirical results support the general conclusion that accurate predictions of
future returns are not easily obtained.
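
The sense in which a random walk is unpredictable can be seen in a small simulation. The sketch below assumes one particular version of the model, in which the log price changes by independent normal increments; this is an assumption made for illustration, not necessarily the formulation used later in the book, and the function names, seed, and parameter values are hypothetical. It is written in Python for illustration only (the book's own examples use R).

```python
import random
from statistics import mean


def simulate_log_prices(n, start=0.0, sd=0.01, seed=0):
    """Simulate a random walk: each log price is the previous one plus noise."""
    rng = random.Random(seed)
    log_prices = [start]
    for _ in range(n):
        log_prices.append(log_prices[-1] + rng.gauss(0.0, sd))
    return log_prices


def lag1_autocorrelation(x):
    """Sample correlation between consecutive values of the series x."""
    m = mean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


log_prices = simulate_log_prices(5000)
changes = [b - a for a, b in zip(log_prices, log_prices[1:])]
print("lag-1 autocorrelation of changes:", round(lag1_autocorrelation(changes), 3))
```

Under this model the printed autocorrelation should be close to zero, meaning that the most recent change carries essentially no information about the next one.
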
Instead, the methods of modern portfolio theory are based on the properties
of the probability distribution of the returns on the set of assets under
consideration. In particular, the mean return on a portfolio depends on the
mean returns on the individual assets and the standard deviation of a portfolio
return depends on the variability of the individual asset returns, as measured
by their standard deviations, along with the relationship between the returns,
as measured by their correlations. Thus, the extent to which the returns on
different assets are related plays a crucial role in the properties of portfolio
returns and in concepts such as diversification.
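
The dependence just described can be written out explicitly. As a sketch, suppose a portfolio places weights $w_1, \dots, w_N$, summing to one, on $N$ assets whose returns have means $\mu_i$, standard deviations $\sigma_i$, and correlations $\rho_{ij}$; these symbols are notation assumed here for illustration and need not match the notation used later in the book. Writing $R_p$ for the portfolio return, the standard results for the mean and variance of a weighted sum give:

```latex
\begin{align*}
\mathrm{E}(R_p) &= \sum_{i=1}^{N} w_i \mu_i, \\
\mathrm{Var}(R_p) &= \sum_{i=1}^{N} w_i^2 \sigma_i^2
  + 2 \sum_{i<j} w_i w_j \rho_{ij} \sigma_i \sigma_j .
\end{align*}
```

For example, with two assets, equal weights, and a common standard deviation $\sigma$, the variance is $\sigma^2 (1 + \rho_{12})/2$, so the portfolio standard deviation $\sigma \sqrt{(1 + \rho_{12})/2}$ is below $\sigma$ whenever $\rho_{12} < 1$. This is the simplest instance of diversification.
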
Of course, in practice, parameters such as means, standard deviations, and
correlations are unknown and must be estimated from historical data. Thus,
statistical methodology plays a central role in the mean-variance approach to
investment analysis. Although, in principle, the estimation of these parameters
is straightforward, the scale of the problem leads to important challenges. For
instance, if a portfolio is based on 100 assets, we must estimate 100 return
means, 100 return standard deviations, and 4950 return correlations.
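
The count of correlations follows from the number of distinct pairs of assets: with n assets there are n(n - 1)/2 pairs, so 100 assets give 100 × 99 / 2 = 4950. The following sketch, in Python for illustration only (the book's own examples use R) and with a hypothetical function name, shows how quickly the total grows with the number of assets.

```python
from math import comb


def parameter_counts(n_assets):
    """Count the means, standard deviations, and pairwise correlations for n assets."""
    return {
        "means": n_assets,
        "std_devs": n_assets,
        "correlations": comb(n_assets, 2),  # number of distinct pairs, n(n - 1)/2
    }


for n in (10, 100, 500):
    counts = parameter_counts(n)
    print(n, "assets:", counts, "total:", sum(counts.values()))
```

The number of means and standard deviations grows linearly in the number of assets, while the number of correlations grows quadratically, which is why the correlations dominate the estimation problem for large portfolios.
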
The properties of the returns on different assets are often affected by various
economic conditions relevant to the assets under consideration. Hence,
statistical models relating asset returns to available economic variables are
important for understanding the properties of potential investments. For
instance, the theoretical capital asset pricing model (CAPM) and its empirical
version, known as the market model, describe the returns on an asset in terms
of their relationship with the returns on the equity market as a whole, known
as the market portfolio, and measured by a suitable market index, such as the
Standard & Poor's (S&P) 500 index. Such models are useful for understanding
the nature of the risk associated with an asset, as well as the relationship
between the expected return on an asset and its risk. The single-index model
extends this idea to a model for the correlation structure of the returns on a
set of assets; in this model, the correlation between the returns on two assets
is described in terms of each asset’s correlation with the return on the market
portfolio.
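
One common way to formalize this statement, sketched here under assumptions that are not spelled out in this introduction, is to write the return on asset $i$ as $R_i = \alpha_i + \beta_i R_m + \epsilon_i$, where $R_m$ is the return on the market portfolio with standard deviation $\sigma_m$, and the residuals $\epsilon_i$ are uncorrelated with $R_m$ and with one another. The symbols are illustrative notation. Since $\mathrm{Cov}(R_i, R_m) = \beta_i \sigma_m^2$, the correlation of asset $i$ with the market is $\rho_{im} = \beta_i \sigma_m / \sigma_i$, and for two different assets:

```latex
\begin{align*}
\mathrm{Cov}(R_i, R_j) &= \beta_i \beta_j \sigma_m^2 \qquad (i \neq j), \\
\rho_{ij} &= \frac{\beta_i \beta_j \sigma_m^2}{\sigma_i \sigma_j}
  = \left( \frac{\beta_i \sigma_m}{\sigma_i} \right)
    \left( \frac{\beta_j \sigma_m}{\sigma_j} \right)
  = \rho_{im} \, \rho_{jm} .
\end{align*}
```

For instance, under these assumptions two assets whose correlations with the market are 0.6 and 0.5 have a correlation of 0.3 with each other.
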
The CAPM, the market model, and the single-index model are all based
on the relationship between asset returns and the return on some form of a
market portfolio. Although the behavior of the market as a whole may be
the most important factor affecting asset returns, in general, asset returns
are related to other economic variables as well. A factor model is a type of
generalization of these models; it describes the returns on a set of assets in
terms of a few underlying “factors” affecting these assets. Such a model is
useful for describing the correlation structure of a set of asset returns as well
as for describing the behavior of the mean returns of the assets. The factors
used are chosen by the analyst; hence, there is considerable flexibility in the
exact form of the model. The parameters of a factor model are estimated
using statistical techniques such as regression analysis and the results provide
useful information for understanding the factors affecting the asset returns;
the results from an analysis based on a factor model are important in analyzing
potential investments and constructing portfolios.

Data Analysis and Computing

Data analysis is an important component of the methodology covered in this
book and all of the methods presented are illustrated on genuine financial data.
Fortunately, financial data are readily available from a number of Internet
sources such as finance.yahoo.com and the Federal Reserve Economic Data
(FRED) website, fred.stlouisfed.org. Experience with such data is invaluable
for gaining a better understanding of the features and challenges of financial
modeling.
The analyses in the book use the statistical software R which can be down-
loaded, free of charge, at www.r-project.org. Analysts often find it convenient
to use a more user-friendly interface to R such as RStudio, which is available
at www.rstudio.com; however, the examples presented here use only the stan-
dard R software. R includes many functions that are useful for statistical data
analysis; in addition, it is a programming language and users may define their
own functions when convenient. Such user-defined functions will be described
in detail and implemented as needed; no previous programming experience is
necessary.
There are two features of R that make it particularly useful for analyzing
financial data. One is that stock price data may be downloaded directly into R.
The other is that there are many R packages available that extend its
functionality; several of these provide functions that are useful for analyzing
financial data.

Suggestions for Further Reading

A detailed nontechnical introduction to financial analysis based on statistical
concepts is given in Bernstein (2001). Chapter 1 of Fabozzi et al. (2006) gives a
concise account of the history of financial modeling. Malkiel (1973) contains
a nontechnical discussion of the random walk hypothesis and its implications,
as well as many of the criticisms of the random walk hypothesis that have
been raised.
For readers with limited experience using R, the document “Introduction
to R,” available on the R Project website at
https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf, is a good
starting point. Dalgaard (2008) provides a book-length treatment of basic
statistical methods using R with many examples. The “Quick-R” website, at
http://www.statmethods.net/index.html, contains much useful information for
both the beginner and experienced user.


2
Returns

2.1 Introduction
As discussed in Chapter 1, the goal of this book is to provide an introduction
to the statistical methodology used in modeling and analyzing financial data.
This chapter introduces some basic concepts of finance and the types of finan-
cial data used in this context. The analyses focus on the returns on an asset,
which are the proportional changes in the price of the asset over a given time
interval, typically a day or month. The statistical foundations for the analysis
of such data are presented, along with statistical methods that are useful for
investigating the properties of return data.

2.2 Basic Concepts

Consider an asset, such as one share of a particular stock, and let Pt denote
the price of the asset at time t, t = 0, 1, 2, . . . so that P0 is the initial price,
P1 is the price at time 1, P2 is the price at time 2, and so on. Some assets
pay dividends, a specified amount at a given time. For example, one share of
IBM stock may pay a dividend of $1.20 each quarter. These dividends make
the asset worth more than simply the price. For now, assume that there are
no dividends.
The net return or, simply, the return, on the asset over the period from
time t − 1 to time t is defined as

$$R_t = \frac{P_t - P_{t-1}}{P_{t-1}} = \frac{P_t}{P_{t-1}} - 1, \qquad t = 1, 2, \ldots.$$

That is, the return on the asset is simply the proportional change in its price
over a given time period; the return is positive if the price increased and is
negative if the price decreased.

Example 2.1 Suppose that, for a given asset, P0 = 60, P1 = 62.40, P2 = 63.96,
P3 = 61.40, and P4 = 66; assume that all prices are in dollars but, for
simplicity, the dollar sign is omitted. Then the returns are

$$R_1 = \frac{62.40 - 60}{60} = 0.040, \qquad R_2 = \frac{63.96 - 62.40}{62.40} = 0.025,$$

$$R_3 = \frac{61.40 - 63.96}{63.96} = -0.040, \qquad R_4 = \frac{66 - 61.40}{61.40} = 0.075.$$

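
The net-return definition can be checked numerically. The following is a minimal Python sketch (the book itself works in R); the function name `net_returns` is an illustrative choice and does not come from the text.

```python
def net_returns(prices):
    """One-period net returns R_t = P_t / P_{t-1} - 1 for t = 1, ..., T."""
    return [p_now / p_prev - 1 for p_prev, p_now in zip(prices, prices[1:])]

prices = [60, 62.40, 63.96, 61.40, 66]  # P_0, ..., P_4 from Example 2.1
returns = net_returns(prices)
print([round(r, 3) for r in returns])   # R_1, ..., R_4 rounded to 3 decimals
```
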
The revenue from holding the asset is given by

revenue = (investment) × (return).

Therefore, in Example 2.1, if the initial investment is $100, the revenue over
the period from t = 0 to t = 1 is

100(0.04) = $4.

Normally, we focus on the return rather than on the revenue, which
depends on the amount invested.
The gross return on the asset over the period from time t − 1 to time t is

$$\frac{P_t}{P_{t-1}} = 1 + R_t$$

so that, for example, the gross return corresponding to R1 = 0.04 is simply
1.04.
We may be interested in returns over a length of time longer than one
period. The return over the time period from time t − k to time t, known as
the k-period return at time t, is defined as the proportional change in price
over that time period. Let Rt(k) denote the k-period return at time t. Then

$$R_t(k) = \frac{P_t - P_{t-k}}{P_{t-k}} = \frac{P_t}{P_{t-k}} - 1, \qquad t = k, k + 1, \ldots.$$

Multiperiod returns are related to one-period returns by

$$1 + R_t(k) = \frac{P_t}{P_{t-k}} = \frac{P_t}{P_{t-1}} \frac{P_{t-1}}{P_{t-2}} \cdots \frac{P_{t-k+1}}{P_{t-k}} = (1 + R_t)(1 + R_{t-1}) \cdots (1 + R_{t-k+1}).$$

Note that 1 + Rt(k) is the gross return from t − k to t and 1 + Rt, 1 + Rt−1, . . .,
are the single-period gross returns.
Example 2.2 Using the sequence of prices given in Example 2.1, the two-period
return at time 4 is

$$R_4(2) = \frac{P_4 - P_2}{P_2} = \frac{66 - 63.96}{63.96} = 0.032.$$

Alternatively, recall that R3 = −0.040 and R4 = 0.075. Then

$$R_4(2) = (1 + R_4)(1 + R_3) - 1 = (1 + 0.075)(1 - 0.040) - 1 = 0.032.$$
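
The two routes to a multiperiod return (directly from prices, or by compounding single-period gross returns) can be compared in a short Python sketch. The helper `k_period_return` and its indexing convention are assumptions made for illustration only.

```python
from math import prod

def k_period_return(returns, t, k):
    """R_t(k) by compounding; returns[i] is assumed to hold R_{i+1}."""
    gross = [1 + r for r in returns[t - k:t]]  # 1 + R_{t-k+1}, ..., 1 + R_t
    return prod(gross) - 1

prices = [60, 62.40, 63.96, 61.40, 66]  # Example 2.1
returns = [b / a - 1 for a, b in zip(prices, prices[1:])]

compounded = k_period_return(returns, t=4, k=2)  # (1 + R_4)(1 + R_3) - 1
direct = prices[4] / prices[2] - 1               # P_4 / P_2 - 1
print(round(compounded, 3), round(direct, 3))
```

Up to floating-point error the two values coincide, which is the content of the product formula for 1 + Rt(k).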


Log-Returns
It is sometimes convenient to work with log-returns, defined by rt =
log(1 + Rt), t = 1, 2, . . .; note that throughout the book, “log” will denote
natural logarithms.
Let pt = log Pt, t = 0, 1, . . . denote the log-prices. Then the log-returns are
defined as

$$r_t = \log(1 + R_t) = \log \frac{P_t}{P_{t-1}} = p_t - p_{t-1}.$$

That is, log-returns are simply the change in the log-prices.
One advantage of working with log-returns is that it simplifies the analysis
of multiperiod returns. Let rt(k) denote the k-period log-return at time t.
Then, by analogy with the single-period case, rt(k) = log(1 + Rt(k)) and

$$\begin{aligned}
r_t(k) &= \log(1 + R_t(k)) \\
&= \log\big((1 + R_t)(1 + R_{t-1}) \cdots (1 + R_{t-k+1})\big) \\
&= \log(1 + R_t) + \log(1 + R_{t-1}) + \cdots + \log(1 + R_{t-k+1}) \\
&= r_t + r_{t-1} + \cdots + r_{t-k+1};
\end{aligned}$$

that is, the k-period log-return at time t is simply the sum of the k
single-period log-returns, rt−k+1, rt−k+2, . . . , rt. Alternatively, because rt =
pt − pt−1, the k-period log-return is the change in the log-price from period
t − k to period t,

$$r_t(k) = p_t - p_{t-k}.$$
Example 2.3 Using the sequence of prices given in Example 2.1, P0 = 60,
P1 = 62.40, P2 = 63.96, P3 = 61.40, and P4 = 66, the log-prices are given by

p0 = log(60) = 4.0943,
p1 = log(62.40) = 4.1336,
p2 = log(63.96) = 4.1583,
p3 = log(61.40) = 4.1174,

and

p4 = log(66) = 4.1897.

It follows that the log-returns are

r1 = p1 − p0 = 4.1336 − 4.0943 = 0.0393,
r2 = p2 − p1 = 4.1583 − 4.1336 = 0.0247,
r3 = p3 − p2 = 4.1174 − 4.1583 = −0.0409,

and

r4 = p4 − p3 = 4.1897 − 4.1174 = 0.0723.

Alternatively, the log-returns may be calculated from the returns; for example,
R1 = 0.04 so that

r1 = log(1 + R1) = log(1 + 0.04) = log(1.04) = 0.0392,

with the difference between this and our previous result due to round-off
error. The two-period log-return at time 4 is

r4(2) = r3 + r4 = −0.0409 + 0.0723 = 0.0314;

alternatively, using the result from Example 2.2,

r4(2) = log(1 + R4(2)) = log(1 + 0.032) = 0.0315.
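
The additivity of log-returns can be verified at full floating-point precision, which also sidesteps the round-off seen in hand calculation. This is a minimal Python sketch using only the standard `math` module; the variable names are illustrative.

```python
import math

prices = [60, 62.40, 63.96, 61.40, 66]  # Example 2.1
log_prices = [math.log(p) for p in prices]

# Single-period log-returns r_t = p_t - p_{t-1}, t = 1, ..., 4.
log_returns = [b - a for a, b in zip(log_prices, log_prices[1:])]

# Two-period log-return at time 4, computed two ways.
sum_of_singles = log_returns[2] + log_returns[3]  # r_3 + r_4
change_in_log_price = log_prices[4] - log_prices[2]  # p_4 - p_2

print([round(r, 4) for r in log_returns])
print(round(sum_of_singles, 4), round(change_in_log_price, 4))
```
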

Dividends
Now suppose that there are dividends. Let Dt represent the dividend paid
immediately prior to time t, that is, after time t − 1 but before time t; for
convenience, we will refer to such a dividend as being paid “at time t.” Then
the gross return from time t − 1 to time t takes into account the payment of
the dividend, along with the change in price; it is defined as

$$1 + R_t = \frac{P_t + D_t}{P_{t-1}}.$$

The net return is given by

$$\begin{aligned}
R_t = \frac{P_t + D_t}{P_{t-1}} - 1 &= \left(\frac{P_t}{P_{t-1}} - 1\right) + \frac{D_t}{P_{t-1}} \\
&= (\text{proportional change in price}) + (\text{dividend as a proportion of price at time } t - 1).
\end{aligned}$$

Thus, it is possible to make money from an investment in an asset even if the
asset’s price declines over time.
The multiperiod return from period t − k to period t is defined by
analogy with the no-dividend case:

$$1 + R_t(k) = (1 + R_t)(1 + R_{t-1}) \cdots (1 + R_{t-k+1}) = \frac{P_t + D_t}{P_{t-1}} \frac{P_{t-1} + D_{t-1}}{P_{t-2}} \cdots \frac{P_{t-k+1} + D_{t-k+1}}{P_{t-k}}.$$
Example 2.4 Suppose that, as in Example 2.1, P0 = 60, P1 = 62.40, P2 =
63.96, and suppose that there are dividends D1 = 2 and D2 = 1. Then

$$R_1 = \frac{P_1 + D_1}{P_0} - 1 = \frac{62.40 + 2}{60} - 1 = 0.073$$

and

$$R_2 = \frac{P_2 + D_2}{P_1} - 1 = \frac{63.96 + 1}{62.40} - 1 = 0.041.$$

The two-period return at time 2 is

$$R_2(2) = \frac{P_2 + D_2}{P_1} \frac{P_1 + D_1}{P_0} - 1 = (1.041)(1.073) - 1 = 0.117.$$


Note that, when there are dividends, the definition of multiperiod returns
assumes that the dividends are reinvested. To see this, consider the following
example.
Example 2.5 Consider an asset with prices P0 = 8, P1 = 10, and P2 = 12
and with dividends D1 = 2, D2 = 1. Suppose that our initial investment is
$200. The initial price of the asset is P0 = 8; hence, we buy 200/8 = 25 shares.
The price at time t = 1 is P1 = 10; therefore, in time period 1, those shares
are worth (25)(10) = $250, plus we receive a dividend of $2 per share for a
total dividend of 2(25) = $50. The dividends may be used to buy more of the
asset; the price is P1 = 10, so we buy 50/10 = 5 additional shares for a total
of 25 + 5 = 30 shares at the end of time 1.
At time t = 2, the price of the asset is P2 = 12, so those 30 shares are worth
(30)(12) = $360, plus we receive a dividend of $1 per share or 1(30) = $30 for
the 30 shares, leading to a total worth of $390. Thus, our initial investment of
$200 is worth $390, a net return of 390/200 − 1 = 0.95 over the two periods,
which agrees with [(12 + 1)/10][(10 + 2)/8] − 1 = 0.95.
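
The share-counting argument of Example 2.5 can be written as a small simulation. This Python sketch assumes, as the example does, that each dividend before the final period is used immediately to buy more shares at that period's price; the function name `reinvested_value` is hypothetical.

```python
def reinvested_value(investment, prices, dividends):
    """Final worth when each dividend D_t buys more shares at price P_t.

    prices = [P_0, ..., P_T]; dividends = [D_1, ..., D_T].
    """
    shares = investment / prices[0]
    for t in range(1, len(prices) - 1):
        # The dividend on current holdings buys additional shares at P_t.
        shares += shares * dividends[t - 1] / prices[t]
    # At the final time T, holdings are worth P_T plus the last dividend D_T.
    return shares * (prices[-1] + dividends[-1])

value = reinvested_value(200, [8, 10, 12], [2, 1])
simulated_return = value / 200 - 1
formula_return = ((12 + 1) / 10) * ((10 + 2) / 8) - 1
print(value, round(simulated_return, 4), round(formula_return, 4))
```

The simulated return matches the product of single-period gross returns, which is why the multiperiod definition implicitly assumes reinvestment.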

Therefore, the multiperiod return when there are dividends is based on
several sources:
• The price increase of the original investment
• The dividends
• The price increase of the shares purchased by the dividends
When there are dividends, the definition of the log-return is analogous to
the definition in the no-dividend case:

$$r_t = \log(1 + R_t) = \log(P_t + D_t) - \log(P_{t-1}).$$

Note, however, that the log-return is no longer directly related to the change
in the log-price.
Multiperiod returns for log-returns in the presence of dividends are defined
as the sum of the single-period log-returns:

$$r_t(k) = r_t + \cdots + r_{t-k+1}, \qquad t = k, k + 1, \ldots, T.$$

2.3 Adjusted Prices

An alternative to including dividends explicitly in the calculation of returns
is to work with dividend-adjusted prices, which we will refer to more simply
as adjusted prices. Note that we expect the price of a stock to decrease after
payment of a dividend.


To see why this is true, consider one share of a particular stock and suppose
that a dividend Dt is paid at time t. Investors selling the stock at time t − 1
will receive Pt−1; investors selling the stock at time t receive Pt + Dt. Under
the assumption that the intrinsic value of the investment is stable from time
t − 1 to time t, we must have

$$P_{t-1} = P_t + D_t,$$

that is,

$$P_t = P_{t-1} - D_t.$$

Thus, when measuring how the value of a share of stock changes from time
t − 1 to time t, we should compare Pt to Pt−1 − Dt rather than to Pt−1. In a
sense, the “effective price” at time t − 1 is Pt−1 − Dt.
This reasoning is the basis for defining adjusted prices. Let P0, P1, . . . , PT
denote a sequence of prices of an asset, let D1, D2, . . . , DT denote a sequence
of dividends paid by the asset, and let P̄0, P̄1, . . . , P̄T denote the corresponding
sequence of adjusted prices. Define P̄T = PT and

$$\bar{P}_{T-1} = P_{T-1} - D_T = \left(1 - \frac{D_T}{P_{T-1}}\right) P_{T-1}.$$

Note that the ratio of the adjusted prices may be written

$$\frac{\bar{P}_T}{\bar{P}_{T-1}} = \frac{P_T}{P_{T-1} - D_T} \tag{2.1}$$

so that it reflects the ratio of the prices, taking into account the dividend DT.
To define the adjusted price at time T − 2, P̄T−2, we use the relationship
between P̄T−1 and P̄T−2 implied by (2.1):

$$\frac{\bar{P}_{T-1}}{\bar{P}_{T-2}} = \frac{P_{T-1}}{P_{T-2} - D_{T-1}}.$$

Solving for P̄T−2,

$$\bar{P}_{T-2} = \frac{P_{T-2} - D_{T-1}}{P_{T-1}} \bar{P}_{T-1} = \left(1 - \frac{D_{T-1}}{P_{T-2}}\right) \frac{\bar{P}_{T-1}}{P_{T-1}} P_{T-2}.$$

Using the fact that

$$\frac{\bar{P}_{T-1}}{P_{T-1}} = 1 - \frac{D_T}{P_{T-1}},$$

it follows that

$$\bar{P}_{T-2} = \left(1 - \frac{D_T}{P_{T-1}}\right)\left(1 - \frac{D_{T-1}}{P_{T-2}}\right) P_{T-2}.$$

This relationship may be generalized to

$$\bar{P}_{T-k} = \left(1 - \frac{D_T}{P_{T-1}}\right)\left(1 - \frac{D_{T-1}}{P_{T-2}}\right) \cdots \left(1 - \frac{D_{T-k+1}}{P_{T-k}}\right) P_{T-k}, \qquad k = 1, 2, \ldots, T.$$

Thus, the adjusted prices describe the changes in a stock’s value, taking into
account dividends.


Example 2.6 Consider an asset with prices P0 = 60, P1 = 62.40, and P2 =
63.96 and dividends D1 = 2, D2 = 1. Then P̄2 = 63.96,

$$\bar{P}_1 = \left(1 - \frac{1}{62.40}\right) 62.40 = 61.40,$$

and

$$\bar{P}_0 = \left(1 - \frac{1}{62.40}\right)\left(1 - \frac{2}{60}\right) 60 = 57.07.$$

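
The backward recursion that defines adjusted prices lends itself to a short loop. The following Python sketch handles dividends only (no stock splits); the function name `adjusted_prices` and the list layout are assumptions for illustration, not the procedure used by any particular data provider.

```python
def adjusted_prices(prices, dividends):
    """Dividend-adjusted prices, built backward from the last period.

    prices = [P_0, ..., P_T]; dividends = [D_1, ..., D_T], so that
    dividends[t] is D_{t+1}, the dividend paid "at time t + 1".
    """
    adjusted = [0.0] * len(prices)
    adjusted[-1] = prices[-1]  # the last price is left unadjusted
    factor = 1.0
    for t in range(len(prices) - 2, -1, -1):
        factor *= 1 - dividends[t] / prices[t]  # multiply in 1 - D_{t+1}/P_t
        adjusted[t] = factor * prices[t]
    return adjusted

example = adjusted_prices([60, 62.40, 63.96], [2, 1])  # Example 2.6
print([round(p, 2) for p in example])
```

Accumulating the factor while moving backward means each earlier price picks up every later dividend adjustment, as in the general product formula.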
There are some unexpected properties of adjusted prices that are impor-
tant to keep in mind. One is that when a dividend occurs in the current period,
the entire series of adjusted prices changes.
To see this, let P̄0, P̄1, . . . , P̄T denote the adjusted prices based on observing
prices and dividends for periods 0, 1, . . . , T. Now suppose that we observe
PT+1 and DT+1; let P̃0, P̃1, . . . , P̃T+1 denote the adjusted prices based on
observing prices and dividends for periods 0, 1, . . . , T, T + 1. Then P̃T+1 =
PT+1,

$$\tilde{P}_T = \left(1 - \frac{D_{T+1}}{P_T}\right) P_T = \left(1 - \frac{D_{T+1}}{P_T}\right) \bar{P}_T,$$

$$\tilde{P}_{T-1} = \left(1 - \frac{D_{T+1}}{P_T}\right)\left(1 - \frac{D_T}{P_{T-1}}\right) P_{T-1} = \left(1 - \frac{D_{T+1}}{P_T}\right) \bar{P}_{T-1},$$

and so on. In general,

$$\tilde{P}_{T-k} = \left(1 - \frac{D_{T+1}}{P_T}\right) \bar{P}_{T-k}, \qquad k = 1, 2, \ldots, T.$$

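Example 2.7 For the asset described in Example 2.6, suppose that we
observe an additional time period, with P3 = 61.40 and D3 = 3. Then the
updated adjusted prices are P̄3 = 61.40,

$$\bar{P}_2 = \left(1 - \frac{3}{63.96}\right) 63.96 = 60.96,$$

$$\bar{P}_1 = \left(1 - \frac{3}{63.96}\right)\left(1 - \frac{1}{62.40}\right) 62.40 = 58.52,$$

$$\bar{P}_0 = \left(1 - \frac{3}{63.96}\right)\left(1 - \frac{1}{62.40}\right)\left(1 - \frac{2}{60}\right) 60 = 54.39.$$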

The series of adjusted prices is now 54.39, 58.52, 60.96, and 61.40, corre-
sponding to periods 0, 1, 2, 3, respectively. These values can be compared with
the series of adjusted prices 57.07, 61.40, and 63.96 for periods 0, 1, and 2,
respectively, that were computed before observing period 3.

This property of adjusted prices can be confusing when recording adjusted
price data at different points in time.


Example 2.8 Consider the price of a share of stock in Exxon Mobil
Corporation (symbol XOM); such data are available on the Yahoo Finance
website, http://biz.yahoo.com/r/, by following the “Historical Quotes” link
under “Research Tools.”
On March 4, 2008, the adjusted price for November 30, 2005, was reported
as $57.38, on March 30, 2015, the adjusted price for November 30, 2005, was
reported as $46.77, and on January 7, 2016, the adjusted price for November
30, 2005, was reported as $45.56.
Note that the unadjusted price for November 30, 2005, was reported as
$58.03 on each of these three dates.

Although the sequence of adjusted prices changes when a new time period
with a nonzero dividend is added, all adjusted prices change by the same
factor; hence, the ratios of the adjusted prices are unchanged, so that the
returns calculated from the adjusted prices do not change.
Example 2.9 For the asset described in Example 2.6, the adjusted prices
based on data from time periods 0, 1, and 2 are 57.07, 61.40, and 63.96,
respectively, while the adjusted prices based on data from time periods 0, 1, 2,
and 3 are 54.39, 58.52, 60.96, and 61.40, respectively. Using either set of
adjusted prices, the return in period 1 is

$$\frac{61.40 - 57.07}{57.07} = \frac{58.52 - 54.39}{54.39} = 0.0759$$

and the return in period 2 is

$$\frac{63.96 - 61.40}{61.40} = \frac{60.96 - 58.52}{58.52} = 0.0417.$$

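
The invariance of returns under an update of the adjusted series follows from the common rescaling factor, and can be confirmed numerically. This Python sketch rebuilds the two adjusted series for the asset of Example 2.6 without rounding; the helper name `returns_from` is illustrative.

```python
# Adjusted prices for periods 0-2 (Example 2.6), kept unrounded.
old = [(1 - 1 / 62.40) * (1 - 2 / 60) * 60, (1 - 1 / 62.40) * 62.40, 63.96]

# Observing P_3 = 61.40 and D_3 = 3 rescales every earlier adjusted price
# by the common factor 1 - D_3 / P_2 and appends the new, unadjusted price.
factor = 1 - 3 / 63.96
new = [factor * p for p in old] + [61.40]

def returns_from(prices):
    """Net returns computed from consecutive prices."""
    return [b / a - 1 for a, b in zip(prices, prices[1:])]

old_returns = returns_from(old)
new_returns = returns_from(new)
print([round(r, 4) for r in old_returns])      # periods 1 and 2
print([round(r, 4) for r in new_returns[:2]])  # same periods after the update
```
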
Another important property is that although adjusted prices incorporate
information about dividends, returns calculated from adjusted prices are not
exactly equal to returns calculated using the formula for returns in the pres-
ence of dividends. That is, although the returns based on adjusted prices do
not depend on the sequence of adjusted prices used, they are not the same
as the returns calculated using the formula for returns based on unadjusted
prices in the presence of dividends. This is illustrated in the following example.
Example 2.10 Recall that in Example 2.6 the asset prices in periods 0 and 1
are given by P0 = 60 and P1 = 62.40 and the dividend in period 1 is D1 = 2.
Then the return in period 1 is

$$\frac{62.40 + 2}{60} - 1 = 0.0733.$$

In Example 2.9, it is shown that the return in period 1 based on adjusted
prices is 0.0759, which is close to, but not exactly the same as, the value
obtained here.


In general, the return in period 1 based on prices P0, P1 and dividend D1 is

$$\frac{P_1 + D_1}{P_0} - 1 = \frac{P_1}{P_0} - 1 + \frac{D_1}{P_0}. \tag{2.2}$$

The adjusted prices are P̄1 = P1 and

$$\bar{P}_0 = \left(1 - \frac{D_1}{P_0}\right) P_0.$$

Therefore, the return based on adjusted prices is

$$\frac{\bar{P}_1 - \bar{P}_0}{\bar{P}_0} = \frac{1}{1 - D_1/P_0} \frac{P_1}{P_0} - 1. \tag{2.3}$$

The difference between the return based on the unadjusted prices, given
in (2.2), and the return based on adjusted prices, given in (2.3), is

$$\begin{aligned}
\frac{P_1}{P_0} - 1 + \frac{D_1}{P_0} - \left(\frac{1}{1 - D_1/P_0} \frac{P_1}{P_0} - 1\right)
&= \left(1 - \frac{1}{1 - D_1/P_0}\right) \frac{P_1}{P_0} + \frac{D_1}{P_0} \\
&= -\frac{D_1/P_0}{1 - D_1/P_0} \frac{P_1}{P_0} + \frac{D_1}{P_0} \\
&= \left(1 - \frac{P_1}{P_0 - D_1}\right) \frac{D_1}{P_0}.
\end{aligned}$$

Therefore, if either the dividend is a relatively small proportion of the price
or the ratio

$$\frac{P_1}{P_0 - D_1}$$

is close to 1, we can expect the difference between the two calculated returns
to be minor. Fortunately, in many cases, both D1/P0 ≈ 0 and P1 ≈ P0 − D1
hold.
Example 2.11 Consider the price of a share of Target Corporation stock
(symbol TGT). Let P0 denote the price on May 15, 2015, and let P1 denote
the price on May 18, 2015, with corresponding dividend D1. Note that May
15, 2015, was a Friday, so that May 15 and May 18 are consecutive trading
days. Then P0 = $78.53, P1 = $78.36, and D1 = $0.52. The return for period
1 is

$$R_1 = \frac{78.36}{78.53} - 1 + \frac{0.52}{78.53} = 0.004457.$$

The adjusted prices are P̄1 = P1 = $78.36 and

$$\bar{P}_0 = \left(1 - \frac{0.52}{78.53}\right)(78.53) = \$78.01;$$

therefore, the return based on the adjusted prices is

$$\frac{78.36 - 78.01}{78.01} = 0.004487.$$


This is close to, but slightly different from, the actual return calculated
previously, with a difference of 0.00003. Note that here D1/P0 = 0.0066 and
P1/(P0 − D1) = 1.0045.
Now consider monthly returns. Let P0 denote the price of one share of
Target stock at the end of April 2015, and let P1 denote the price at the end
of May 2015. Then P0 = $78.83, P1 = $79.32, and D1 = $0.52. The adjusted
monthly prices are P̄0 = $78.31 and P̄1 = $79.32. Then the monthly return
on Target stock is 0.01281; using the adjusted prices, the monthly return is
0.01290, again, a slight difference.
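
The closed-form expression for the gap between the two return calculations can be checked against the daily Target figures quoted in Example 2.11. This is a Python sketch; the three function names are illustrative and not taken from the text.

```python
def return_with_dividend(p0, p1, d1):
    """Return from unadjusted prices and the dividend: (P1 + D1)/P0 - 1."""
    return (p1 + d1) / p0 - 1

def return_from_adjusted(p0, p1, d1):
    """Return from adjusted prices, where adjusted P0 is P0 - D1."""
    return p1 / (p0 - d1) - 1

def predicted_difference(p0, p1, d1):
    """Closed form for the gap: (1 - P1/(P0 - D1)) * D1/P0."""
    return (1 - p1 / (p0 - d1)) * d1 / p0

p0, p1, d1 = 78.53, 78.36, 0.52  # daily Target data from Example 2.11
actual = return_with_dividend(p0, p1, d1)
adjusted = return_from_adjusted(p0, p1, d1)
gap = actual - adjusted
print(round(actual, 6), round(adjusted, 6), round(gap, 6))
```

The gap is small here because D1/P0 is well under 1% and P1/(P0 − D1) is close to 1, so the product of the two small terms is negligible.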

Adjusted stock prices are generally adjusted for stock splits as well as
for dividends. A stock split occurs when a company decides to proportion-
ally increase the number of shares owned by investors. For instance, in a
two-for-one stock split, the owner of each share of stock is given a second
share, in a sense, splitting each share into two. Of course, the price of the
shares is adjusted accordingly.
Adjusted prices better reflect changes in the asset’s value over time, and
they are a useful alternative to the raw prices. In the remainder of this book,
the term “prices” will always refer to adjusted prices and the term “returns”
will always refer to returns calculated from adjusted prices. The notations
used for prices, returns, log-prices, and so on will refer to quantities based
on the adjusted prices; for example, Pt will be used to denote the adjusted
price of an asset at time t and Rt will be used to denote the return based on
adjusted prices.

2.4 Statistical Properties of Returns

Consider the returns R1 , R2 , . . . on an asset. An important feature of such
returns is that they are ordered in time. Hence, we consider the properties of
a sequence of random variables Y1 , Y2 , . . . that are ordered in time.
The set of random variables {Yt : t = 1, 2, . . .} is called a stochastic pro-
cess and the sequence of observations corresponding to Y1 , Y2 , . . . is called a
time series. When analyzing the properties of a stochastic process {Yt : t =
1, 2, . . .}, we consider properties of the random variable Yt as a function of t.
Although any property of a random variable can be viewed as a function
of t by computing it for each Yt , t = 1, 2, . . ., in practice, we are primarily
interested in simple properties such as means and variances. For instance, let

$$\mu_t = E(Y_t), \qquad t = 1, 2, \ldots$$

denote the mean function of the process so that μ3 = E(Y3), for example.
Similarly, the variance function of the process is given by

$$\sigma_t^2 = \mathrm{Var}(Y_t), \qquad t = 1, 2, \ldots.$$


The covariance function gives the covariance of two elements of {Yt : t =
1, 2, . . .} as a function of their indices; it is defined as

$$\gamma_0(t, s) = \mathrm{Cov}(Y_t, Y_s), \qquad t, s = 1, 2, \ldots.$$

Hence, $\gamma_0(t, t) = \sigma_t^2$ for any t.

Note that, without further assumptions on the random variables Y1 , Y2 , . . .,
it is difficult, if not impossible, to obtain any information about the features
of their probability distributions. For instance, if the probability distribution
of Yt is completely different for each t, and we have only one set of observa-
tions corresponding to the process, then we only have one observation from
each distribution. In such a case, accurate estimation of the properties of the
distribution of Yt is not possible. Fortunately, in many cases, it is reasonable
to assume that the properties of random variables Yt and Ys for t ≠ s
are similar.
The strongest condition of this type is the condition that Y1 , Y2 , . . . are
independent and identically distributed, often abbreviated to i.i.d.; i.i.d. ran-
dom variables are independent and each has the same marginal distribution.
Although this condition is appropriate in some areas of application, it is often
too strong for the type of random variables used in modeling financial data.
A similar, but weaker, condition is stationarity. The process {Yt : t =
1, 2, . . .} is said to be stationary if the statistical properties of the random
variables in the process do not change over time. More formally, the process
is stationary if for any integer m and any times t1 , t2 , . . . , tm the joint distri-
bution of the vector (Yt1 , Yt2 , . . . , Ytm ) is the same as the joint distribution of
the vector (Yt1+h , Yt2 +h , . . . , Ytm +h ) for any h = 0, 1, 2, . . . . Thus, stationarity
is a type of time invariance.
For instance, taking m = 1, stationarity requires that Yt has the same dis-
tribution as Yt+h for any integer h; that is, under stationarity, the marginal
distribution of Yt is the same for each t, so that Y1 , Y2 , . . . are identically dis-
tributed. Taking m = 2, the joint distribution of (Yt1 , Yt2 ) is the same as
the joint distribution of (Yt1 +h , Yt2 +h ) for any time points t1 , t2 and any
h = 0, 1, 2, . . . . For example, (Y1 , Y4 ) must have the same distribution as
(Y2 , Y5 ), (Y3 , Y6 ), (Y4 , Y7 ), and so on. This same type of property must hold
for any m-tuple of random variables. This condition holds if, in addition to
being identically distributed, Y1 , Y2 , . . . are independent, but independence is
not required. Although stationarity is weaker than the i.i.d. property, it is still
a strong condition.

Weak Stationarity
In financial applications, the assumption of stationarity is generally stronger
than is needed. Furthermore, because it refers to the entire distribution of
each random variable, it is difficult to verify in practice. Hence, a weaker
version of stationarity, based on means, variances, and covariances, is often
used.


The process {Yt : t = 1, 2, . . .} is said to be weakly stationary if

1. E(Yt) = μ for all t = 1, 2, . . ., for some constant μ.
2. Var(Yt) = σ² for all t = 1, 2, . . ., for some constant σ² > 0.
3. Cov(Yt, Ys) = γ(|t − s|) for all t, s = 1, 2, . . ., for some function γ(·).

That is, the mean and variance of Yt do not depend on t and the covariance of
Yt+h, Ys+h does not depend on h: under Condition (3) of weak stationarity,

$$\mathrm{Cov}(Y_{t+h}, Y_{s+h}) = \gamma(|t + h - (s + h)|) = \gamma(|t - s|).$$

Thus, weak stationarity is essentially the same as stationarity, except that
it applies only to the second-order properties of the process: the means,
variances, and covariances of the random variables.
The function γ(·) is called the autocovariance function of the process. Note
that γ(0) = Cov(Yt, Yt) = σ² and γ(h) = Cov(Yt+h, Yt), h = 0, 1, . . ., for any
t = 1, 2, . . . . The correlation of Yt and Ys is given by

$$\rho(|t - s|) \equiv \frac{\mathrm{Cov}(Y_t, Y_s)}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_s)}} = \frac{\gamma(|t - s|)}{\sigma^2}.$$

The function ρ(·) is called the autocorrelation function of the process.

Example 2.12 Let Z0, Z1, Z2, . . . denote i.i.d. random variables each with
mean μ and standard deviation σ. Define

$$Y_t = Z_t - Z_{t-1}, \qquad t = 1, 2, \ldots$$

so that Y1 = Z1 − Z0, Y2 = Z2 − Z1, and so on, and consider the properties of
the process {Yt : t = 1, 2, . . .}.
Note that

$$E(Y_t) = E(Z_t - Z_{t-1}) = E(Z_t) - E(Z_{t-1}) = \mu - \mu = 0$$

and that

$$\mathrm{Var}(Y_t) = \mathrm{Var}(Z_t - Z_{t-1}) = \mathrm{Var}(Z_t) + \mathrm{Var}(Z_{t-1}) = \sigma^2 + \sigma^2 = 2\sigma^2.$$

Hence, conditions (1) and (2) of weak stationarity are satisfied.
Consider Cov(Yt, Ys). If t = s − 1, then

$$\begin{aligned}
\mathrm{Cov}(Y_t, Y_s) &= \mathrm{Cov}(Z_{s-1} - Z_{s-2}, Z_s - Z_{s-1}) \\
&= \mathrm{Cov}(Z_{s-1}, Z_s) - \mathrm{Cov}(Z_{s-1}, Z_{s-1}) - \mathrm{Cov}(Z_{s-2}, Z_s) + \mathrm{Cov}(Z_{s-2}, Z_{s-1}) \\
&= -\mathrm{Var}(Z_{s-1}) = -\sigma^2;
\end{aligned}$$

note that, because Cov(Yt, Ys) = Cov(Ys, Yt), the same result holds if s =
t − 1. Thus, Cov(Yt, Ys) = −σ² if |t − s| = 1.
If |t − s| > 1, then Yt = Zt − Zt−1 and Ys = Zs − Zs−1 do not have any
terms in common; hence, Cov(Yt, Ys) = 0. It follows that

$$\mathrm{Cov}(Y_t, Y_s) = \begin{cases} 2\sigma^2 & \text{if } |t - s| = 0 \\ -\sigma^2 & \text{if } |t - s| = 1 \\ 0 & \text{if } |t - s| = 2, 3, \ldots \end{cases}$$

Clearly, Cov(Yt, Ys) is a function of |t − s| and, since E(Yt) and Var(Yt) do
not depend on t, {Yt : t = 1, 2, . . .} is weakly stationary.
The autocovariance function of the process is given by

$$\gamma(h) = \begin{cases} 2\sigma^2 & \text{if } h = 0 \\ -\sigma^2 & \text{if } h = 1 \\ 0 & \text{if } h = 2, 3, \ldots \end{cases}$$

and the autocorrelation function is given by

$$\rho(h) = \begin{cases} 1 & \text{if } h = 0 \\ -\frac{1}{2} & \text{if } h = 1 \\ 0 & \text{if } h = 2, 3, \ldots \end{cases}$$

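
The autocovariances in Example 2.12 can be reproduced exactly, without simulation, by using the fact that for uncorrelated variables with common variance σ² the covariance of two linear combinations is σ² times the sum of the products of their coefficients. The Python sketch below takes σ² = 1; the helper names `cov_linear` and `diff_coeffs` are hypothetical.

```python
def cov_linear(a, b, sigma2=1.0):
    """Cov(sum a_i Z_i, sum b_i Z_i) for uncorrelated Z_i with variance sigma2."""
    return sigma2 * sum(x * y for x, y in zip(a, b))

def diff_coeffs(t, n):
    """Coefficients of Y_t = Z_t - Z_{t-1} on (Z_0, ..., Z_{n-1})."""
    c = [0.0] * n
    c[t], c[t - 1] = 1.0, -1.0
    return c

n = 12
# Autocovariance at lags 0..3, computed from two different starting times.
gamma_at_3 = {h: cov_linear(diff_coeffs(3, n), diff_coeffs(3 + h, n)) for h in range(4)}
gamma_at_7 = {h: cov_linear(diff_coeffs(7, n), diff_coeffs(7 + h, n)) for h in range(4)}
rho_1 = gamma_at_3[1] / gamma_at_3[0]
print(gamma_at_3, gamma_at_7, rho_1)
```

The two dictionaries agree, illustrating that the covariance depends only on the lag and not on the starting time.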

Example 2.13 Let Z1, Z2, . . . denote i.i.d. random variables each with mean
0 and standard deviation σ. Define

$$X_t = \frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}, \qquad t = 1, 2, \ldots$$

and consider the properties of the process {Xt : t = 1, 2, . . .}.
Note that

$$E(X_t) = \frac{1}{\sqrt{t}} E(Z_1) + \cdots + \frac{1}{\sqrt{t}} E(Z_t) = 0$$

and

$$\mathrm{Var}(X_t) = \frac{1}{t} \mathrm{Var}(Z_1) + \cdots + \frac{1}{t} \mathrm{Var}(Z_t) = \frac{1}{t} t \sigma^2 = \sigma^2.$$

Now consider Cov(Xt, Xs) for t ≠ s. The calculation is simpler if we know
which of t and s is smaller; note that, without loss of generality, we may
assume that t < s. Then

$$\begin{aligned}
\mathrm{Cov}(X_t, X_s) &= \mathrm{Cov}\left(\frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}, \frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{s}} + \frac{Z_{t+1} + Z_{t+2} + \cdots + Z_s}{\sqrt{s}}\right) \\
&= \mathrm{Cov}\left(\frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}, \frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{s}}\right) \\
&\quad + \mathrm{Cov}\left(\frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}, \frac{Z_{t+1} + Z_{t+2} + \cdots + Z_s}{\sqrt{s}}\right).
\end{aligned}$$

Because, for any random variable Y, Cov(Y, Y) = Var(Y),

$$\begin{aligned}
\mathrm{Cov}\left(\frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}, \frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{s}}\right)
&= \mathrm{Cov}\left(\frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}, \frac{\sqrt{t}}{\sqrt{s}} \frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}\right) \\
&= \frac{\sqrt{t}}{\sqrt{s}} \mathrm{Cov}\left(\frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}, \frac{Z_1 + Z_2 + \cdots + Z_t}{\sqrt{t}}\right) \\
&= \frac{\sqrt{t}}{\sqrt{s}} \mathrm{Cov}(X_t, X_t) = \frac{\sqrt{t}}{\sqrt{s}} \mathrm{Var}(X_t) \\
&= \frac{\sqrt{t}}{\sqrt{s}} \sigma^2.
\end{aligned}$$

The sums Z1 + Z2 + · · · + Zt and Zt+1 + Zt+2 + · · · + Zs have no terms in
common; hence, using the fact that Z1, Z2, . . . are independent, these sums
have covariance equal to 0. It follows that

$$\mathrm{Cov}(X_t, X_s) = \frac{\sqrt{t}}{\sqrt{s}} \sigma^2.$$

This result holds when t < s; if s < t, the same basic result holds, switching
the roles of t and s:

$$\mathrm{Cov}(X_t, X_s) = \frac{\sqrt{s}}{\sqrt{t}} \sigma^2.$$

These results may be combined by stating that, for any t, s,

$$\mathrm{Cov}(X_t, X_s) = \sqrt{\frac{\min\{t, s\}}{\max\{t, s\}}}\, \sigma^2.$$

Therefore, although the mean and variance of Xt do not depend on t, the
covariance of Xt and Xs is not a function of |t − s|. For instance,

$$\mathrm{Cov}(X_1, X_5) = \frac{1}{\sqrt{5}} \sigma^2$$

while

$$\mathrm{Cov}(X_{11}, X_{15}) = \sqrt{\frac{11}{15}}\, \sigma^2.$$

Hence, {Xt : t = 1, 2, . . .} is not weakly stationary.
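
The failure of weak stationarity in Example 2.13 shows up clearly when the covariance is evaluated for two pairs of times that are the same distance apart. This Python sketch computes the covariance exactly from the coefficients of the Z's, with σ² = 1; the function name `cov_scaled_sums` is an illustrative choice.

```python
import math

def cov_scaled_sums(t, s, sigma2=1.0):
    """Cov(X_t, X_s) for X_t = (Z_1 + ... + Z_t)/sqrt(t), Z_i uncorrelated."""
    n = max(t, s)
    a = [1 / math.sqrt(t) if i < t else 0.0 for i in range(n)]  # weights in X_t
    b = [1 / math.sqrt(s) if i < s else 0.0 for i in range(n)]  # weights in X_s
    return sigma2 * sum(x * y for x, y in zip(a, b))

# Both pairs are four periods apart, yet the covariances differ.
c_1_5 = cov_scaled_sums(1, 5)
c_11_15 = cov_scaled_sums(11, 15)
print(round(c_1_5, 4), round(c_11_15, 4))
```
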
The property of weak stationarity greatly simplifies the statistical analysis
of the process. For instance, because E(Yt) = μ for all t, we expect that

$$\bar{Y} = \frac{1}{T} \sum_{t=1}^{T} Y_t$$

will be a reasonable estimator of μ. To estimate ρ(1), the correlation of two
observations one time period apart, we can use the sample correlation of pairs
(Y1, Y2), (Y2, Y3), . . ., and so on. Parameter estimation using these ideas will
be considered in detail in the following section.
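
As a rough illustration of these estimators, the following Python sketch computes the sample mean and a lag-h sample autocorrelation. The particular form used here (deviations from the overall mean, normalized by the total sum of squares) is one common convention and is an assumption; it is not necessarily the estimator developed later in the book. The data are artificial.

```python
def sample_mean(y):
    """Sample mean of the observations."""
    return sum(y) / len(y)

def sample_autocorrelation(y, h):
    """Lag-h sample autocorrelation based on pairs (Y_t, Y_{t+h})."""
    ybar = sample_mean(y)
    dev = [v - ybar for v in y]
    num = sum(dev[t] * dev[t + h] for t in range(len(y) - h))
    den = sum(d * d for d in dev)
    return num / den

y = [1, -1, 1, -1, 1, -1]  # artificial alternating series, not from the text
mean_y = sample_mean(y)
rho_hat_1 = sample_autocorrelation(y, 1)
print(mean_y, rho_hat_1)
```

An alternating series gives a strongly negative lag-1 estimate, as expected when consecutive observations move in opposite directions.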

Weak White Noise

A particularly simple example of a weakly stationary process is a sequence
of random variables Z1, Z2, . . . such that, for each t = 1, 2, . . ., E(Zt) = μ and
Var(Zt) = σ², for some constants μ and σ² > 0, and such that, for each t, s =
1, 2, . . ., t ≠ s, Cov(Zt, Zs) = 0. A process with these properties is called weak
white noise.
The autocovariance function of a weak white noise process {Zt : t =
1, 2, . . .} is given by

$$\gamma(h) = \begin{cases} \sigma^2 & \text{if } h = 0 \\ 0 & \text{if } h = 1, 2, \ldots \end{cases}$$

and the autocorrelation function of the process is given by

$$\rho(h) = \begin{cases} 1 & \text{if } h = 0 \\ 0 & \text{if } h = 1, 2, \ldots \end{cases}$$

Example 2.14 Let {Zt : t = 1, 2, . . .} be a weak white noise process such
that, for each t = 1, 2, . . ., E(Zt) = 0 and Var(Zt) = 1. Let Z be a random
variable with mean 0 and variance 1 such that Cov(Z, Zt) = 0 for t = 1, 2, . . .
and let

$$Y_t = Z + Z_t, \qquad t = 1, 2, \ldots.$$

Then, for all t,

$$E(Y_t) = E(Z) + E(Z_t) = 0,$$

$$\mathrm{Var}(Y_t) = \mathrm{Var}(Z) + \mathrm{Var}(Z_t) = 1 + 1 = 2,$$

and for all t, s, t ≠ s,

$$\begin{aligned}
\mathrm{Cov}(Y_t, Y_s) &= \mathrm{Cov}(Z + Z_t, Z + Z_s) \\
&= \mathrm{Cov}(Z, Z) + \mathrm{Cov}(Z, Z_s) + \mathrm{Cov}(Z_t, Z) + \mathrm{Cov}(Z_t, Z_s) \\
&= \mathrm{Var}(Z) \\
&= 1.
\end{aligned}$$

Thus, {Yt : t = 1, 2, . . .} is a weakly stationary process with autocovariance
function

$$\gamma(h) = \begin{cases} 2 & \text{if } h = 0 \\ 1 & \text{if } h = 1, 2, \ldots \end{cases}$$

and autocorrelation function ρ(h) = 1/2, h = 1, 2, . . . .
Now define Xt = Z1 + · · · + Zt, t = 1, 2, . . . . Then E(Xt) = E(Z1) + · · · +
E(Zt) = 0 and Var(Xt) = Var(Z1) + · · · + Var(Zt) = t.


To find the covariance of Xt and Xs, we may use the same general approach
used in Example 2.13. Note that, without loss of generality, we may assume
that t < s. Then

$$\begin{aligned}
\mathrm{Cov}(X_t, X_s) &= \mathrm{Cov}(Z_1 + \cdots + Z_t, Z_1 + \cdots + Z_s) \\
&= \mathrm{Cov}(Z_1 + \cdots + Z_t, Z_1 + \cdots + Z_t) + \mathrm{Cov}(Z_1 + \cdots + Z_t, Z_{t+1} + \cdots + Z_s).
\end{aligned}$$

Because Zt and Zs are uncorrelated for t ≠ s and each Zt has variance 1, the
first term is Var(Z1 + · · · + Zt) = t and the second term is 0, so that

$$\mathrm{Cov}(X_t, X_s) = t;$$

thus, in general, Cov(Xt, Xs) = min{t, s}. Because Var(Xt) depends on t and the
covariance of Xt, Xs is not a function of |t − s|, the process {Xt : t = 1, 2, . . .}
is not weakly stationary.

Application to Asset Returns

These ideas may be applied to the stochastic process {Rt : t = 1, 2, . . .} cor-
responding to the returns on an asset. Weak stationarity implies that the
second-order properties of the returns do not change over time: E(Rt ) and
Var(Rt ) do not depend on t and Cov(Rt , Rs ) is a function of |t − s|. Hence,
under weak stationarity, we may refer to the mean return on an asset and the
asset’s return standard deviation, also known as the volatility of the asset,
with the understanding that such parameters refer to all time periods under
consideration.
The autocorrelation function of {Rt : t = 1, 2, . . .} describes the correlation
structure of the returns and the relationships between returns in diﬀerent time
periods. In many cases, it is appropriate to model returns as a weak white
noise process, simplifying the analysis; assumptions of this type are discussed
in detail in Chapter 3.

2.5 Analyzing Return Data

In order to develop models for return data, it is important to understand its
properties. Hence, in this section, we consider several statistical methods that
are useful in describing the properties of return data as well as for investigating
the appropriateness of assumptions such as weak stationarity.
Asset price data is widely available on the Internet. Here we use data taken
from the ﬁnance.yahoo.com website, using the R function get.hist.quote,
in the tseries package (Trapletti and Hornik 2016) that directly downloads
the data into R.
The arguments of get.hist.quote are instrument, which refers to the
stock symbol of interest, start and end, which specify the starting and ending
dates of the time period under consideration, quote, which specifies the data
to be downloaded, for which we use AdjClose for the adjusted closing price,
and compression, which specifies the sampling frequency of the data, the
time interval over which data are recorded. We may view this choice in terms
of the return interval, the length of the time period over which each return is
calculated. Typical choices are days or months, but sometimes weeks or years
are used.
Daily data have the advantage that more observations are available in a
given time period and that they may reflect more subtle changes in the price
of the asset. On the other hand, investment decisions are often made on a
monthly basis and, in many cases, monthly returns are more stable than daily
returns. Hence, both daily and monthly data are commonly used. For daily
data, we use compression="d"; for monthly data, "d" is replaced by "m".

Example 2.15 Suppose we would like to analyze data on Wal-Mart Stores,
Inc., stock (symbol WMT) for the time period 2010–2014. The relevant R
commands are

> library(tseries)
> x<-get.hist.quote(instrument="WMT", start="2009-12-31",
+ end="2014-12-31", quote="AdjClose", compression="d")
> wmt<-as.vector(x)

The first command assigns five years of Wal-Mart price data to the variable x;
the format of x is known as “zoo,” an R data format for irregularly observed
time series. Here, we analyze the prices as a standard vector; hence, the
command wmt<-as.vector(x) converts x to a standard vector and assigns it
to the variable wmt. To check the contents of wmt, we can use the command
head, which displays the first few elements of the vector.

> head(wmt)
[1] 45.64114 46.30719 45.84608 45.74361 45.76923 45.53868
> length(wmt)
[1] 1259

Thus, the first adjusted price of Wal-Mart stock in the sequence is $45.64114
and there are 1259 prices in the variable wmt.
The number of significant figures displayed can be controlled by the digits
argument of the options function. For instance, options(digits=5) limits
the number of significant figures printed to five. Throughout this book, the
number of digits will be adjusted without comment, based on the context of
the example, the desire to fit the output to the page, and so on.

> options(digits=5)
> head(wmt)
[1] 45.641 46.307 45.846 45.744 45.769 45.539

Note that, when displaying the contents of a vector, the number of digits
shown is chosen so that all elements of the vector have the required number
of signiﬁcant ﬁgures. For example,

> options(digits=2)
> c(11/100000, 1/3)
[1] 0.00011 0.33333

The corresponding returns and log-returns for Wal-Mart stock may be
calculated using the commands

> wmt.ret<-(wmt[-1] - wmt[-1259])/wmt[-1259]
> head(wmt.ret)
[1] 0.0145931 -0.0099576 -0.0022350 0.0005600 -0.0050372 0.0165010
> wmt.logret<-log(wmt[-1]) - log(wmt[-1259])
> head(wmt.logret)
[1] 0.0144876 -0.0100075 -0.0022375 0.0005598 -0.0050500 0.0163663

In these expressions, wmt[-1] returns a vector that is identical to wmt,
except that the first element has been dropped; similarly, wmt[-1259] returns
a vector that is identical to wmt, except that the 1259th or, in this case, the
last element has been dropped. It follows that wmt[-1] - wmt[-1259] is a
vector of price differences of the form Pt − Pt−1 .
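
The indexing can be tried on a short vector without downloading any data. The sketch below uses the six adjusted prices printed by head(wmt) above; the names p, ret_idx, ret_diff, and logret are illustrative. It also shows an equivalent form based on the base R function diff, and uses length(p) rather than a hard-coded index so that the same lines work for a price vector of any length.

```r
# First six adjusted prices shown by head(wmt) in the text
p <- c(45.64114, 46.30719, 45.84608, 45.74361, 45.76923, 45.53868)
n <- length(p)

# Negative indexing, as in the text: drop the first / the last element
ret_idx <- (p[-1] - p[-n]) / p[-n]

# Equivalent forms using diff(), which returns p[t] - p[t-1]
ret_diff <- diff(p) / p[-n]
logret   <- diff(log(p))

print(round(ret_idx, 7))
print(round(logret, 7))
```

Six prices give five returns, and the log-returns agree with log(1 + ret_idx).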
The function summary gives several summary statistics for a variable—the
minimum and maximum values, the sample mean, the sample median, and
the upper and lower sample quartiles; sd gives the sample standard deviation.

> summary(wmt.ret)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.04660 -0.00452 0.00066 0.00052 0.00558 0.04720
> sd(wmt.ret)
[1] 0.0091715

The function quantile can be used to calculate additional sample quantiles
of a variable; for example,

> quantile(wmt.ret, probs=c(0.05, 0.10, 0.25, 0.5, 0.75, 0.90,
+ 0.95))
      5%      10%      25%      50%      75%      90%      95%
-0.01352 -0.00940 -0.00452  0.00066  0.00558  0.01093  0.01445

Therefore, roughly 5% of the sample values are less than or equal to −0.01352;
the help ﬁle for the function quantile gives details on the exact method of
calculation of the sample quantiles. Note that the 25% and 75% quantiles
correspond to the sample quartiles calculated using the function summary and
the 50% quantile corresponds to the sample median.
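
The correspondence between quantile and the other summary functions can be verified on any numeric vector. The following sketch uses simulated values as a stand-in for return data (the normal distribution and its parameters are assumptions made only for illustration) and relies on the default calculation method of quantile.

```r
set.seed(2)
# Simulated stand-in for a vector of daily returns
x <- rnorm(500, mean = 0.0005, sd = 0.009)

q <- quantile(x, probs = c(0.25, 0.5, 0.75))
print(q)
print(summary(x))

# The 50% quantile is the sample median, and the distance between the
# 25% and 75% quantiles is the interquartile range
med_check <- all.equal(unname(q[2]), median(x))
iqr_check <- all.equal(unname(q[3] - q[1]), IQR(x))
```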
The following commands give plots of the Wal-Mart stock prices and
returns, given in Figures 2.1 and 2.2, respectively. Note that, if the plot
command has only one argument, it is understood to be the y-variable, with
the x-variable taken to be the corresponding index (1 to 1259 in the case of
wmt and 1 to 1258 in the case of wmt.ret); type="l" specifies that the plot
be drawn as lines connecting the points, which are not displayed.
> plot(wmt, type="l", ylab="Price", xlab="Time")
> plot(wmt.ret, type="l", ylab="Return", xlab="Time")

FIGURE 2.1
Time series plot of Wal-Mart daily stock prices.

FIGURE 2.2
Time series plot of Wal-Mart daily stock returns.

Monthly Returns
So far in this section, we have analyzed daily prices and daily returns.
In practice, models for asset returns are often based on monthly returns,
which, in many cases, correspond better to the investment horizon of interest
and are often more stable than daily returns. To obtain monthly returns, we
use the get.hist.quote function with compression="m".
Example 2.16 The commands
> x<-get.hist.quote(instrument="WMT", start="2009-12-31",
+ end="2014-12-31", quote="AdjClose", compression="m")
> wmt.m<-as.vector(x)
return the prices of Wal-Mart stock for the last trading day of the month,
for each month from December 2009 to December 2014, storing them in the
vector wmt.m.
Thus, there are 61 monthly prices in wmt.m, which lead to 60 monthly
returns.
> length(wmt.m)
[1] 61
> wmt.m.ret<-(wmt.m[-1]-wmt.m[-61])/wmt.m[-61]
> length(wmt.m.ret)
[1] 60
> head(wmt.m.ret)
[1] -0.000374 0.011978 0.034093 -0.035252 -0.051944 -0.049248
There is one possible pitfall when downloading monthly data—the last
price returned corresponds to the last trading day that occurred on or before
the day listed, even if that day is not the last day of the month. For example,
if we use the command
> x<-get.hist.quote(instrument="WMT", start="2009-12-31",
+ end="2014-12-15", quote="AdjClose", compression="m")
the last price in x will be the price for December 15, 2014. Thus, the last
monthly return will correspond to only the ﬁrst half of December 2014. There-
fore, when downloading monthly returns, it is important that the end date be
the last day of the month under consideration.
Figure 2.3 contains a time series plot of the monthly returns on Wal-Mart
stock.

FIGURE 2.3
Time series plot of Wal-Mart stock monthly returns.

When analyzing a plot of monthly returns, sometimes it is easier to notice
certain features of the sequence of returns if the plot includes the points
corresponding to the return values along with the lines connecting the values. This
can be achieved using the R plot function by using the argument type="b",
where "b" indicates both points and lines. An example of this is given in
Figure 2.4. In principle, the same approach can be used when constructing a
plot of daily returns; however, when plotting a large number of values, the
points tend to overwhelm the plot.

FIGURE 2.4
Alternative time series plot of Wal-Mart stock monthly returns.

Monthly returns are, of course, closely related to daily returns.
If r1 , . . . , r21 denote the daily log-returns for a given month, then the monthly
log-return for that month is simply r1 + · · · + r21 . Therefore, we expect
the mean monthly log-return to be about 21 times as large as the mean
daily log-return; note that there are roughly 252 trading days in a year and
252/12 = 21. If the daily log-returns are uncorrelated with each other, we expect
the standard deviation of the monthly log-returns to be about 4.6 times as
large as the standard deviation of the daily log-returns (√21 ≐ 4.6). Also, a time
series plot of monthly returns will be “smoother” than the corresponding plot
of daily returns.
These relationships do not hold exactly for standard returns, but they hold
approximately. Recall that the return at time t, Rt , and the corresponding
log-return, rt , are related by rt = log(1 + Rt ) and, using a Taylor’s series
approximation,
\[ \log(1 + R_t) \doteq R_t. \]
Therefore, we expect that the mean monthly return on an asset will be about
21 times the mean daily return and the monthly return standard deviation
will be about 4.6 times the daily return standard deviation.
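
These scaling relationships can be illustrated with simulated data. The sketch below assumes independent, normally distributed daily log-returns with mean 0.0005 and standard deviation 0.009 (values chosen only to resemble daily stock returns) and exactly 21 trading days in every month; all variable names are illustrative.

```r
set.seed(3)
n_months <- 5000
n_days   <- 21

# Assumed daily log-return model: i.i.d. normal, mean 0.0005, sd 0.009
r_daily <- matrix(rnorm(n_months * n_days, mean = 0.0005, sd = 0.009),
                  nrow = n_months, ncol = n_days)

# A monthly log-return is the sum of that month's daily log-returns
r_monthly <- rowSums(r_daily)

mean_ratio <- mean(r_monthly) / mean(r_daily)        # 21 by construction
sd_ratio   <- sd(r_monthly) / sd(as.vector(r_daily)) # close to sqrt(21)

# Standard returns R = exp(r) - 1; the same ratios hold only approximately
R_daily   <- exp(r_daily) - 1
R_monthly <- exp(r_monthly) - 1
sd_ratio_std <- sd(R_monthly) / sd(as.vector(R_daily))

print(round(c(mean_ratio = mean_ratio, sd_ratio = sd_ratio,
              sd_ratio_std = sd_ratio_std, sqrt21 = sqrt(21)), 3))
```

If the simulated daily log-returns were positively or negatively correlated within a month, the standard deviation ratio would move away from √21, which is why the lack of correlation matters for the second relationship.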

Example 2.17 Let wmt.m.ret denote five years of monthly returns on
Wal-Mart stock for the period ending December 2014. Recall that the daily
returns for this stock were analyzed in Example 2.15.
The summary statistics for wmt.m.ret are

> summary(wmt.m.ret)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.07294 -0.02235 0.01192 0.01094 0.03848 0.14780
> sd(wmt.m.ret)
[1] 0.0441572

The mean monthly return is about 21.6 times the mean daily return and
the standard deviation of the monthly returns is about 4.8 times as large as
the standard deviation of the daily returns; both of these ratios are close to
what we would expect.

Running Means and Standard Deviations

Let Rt denote the return on an asset at time t. If {Rt : t = 1, 2, . . .} is a
weakly stationary process, then μt = E(Rt ) and σt² = Var(Rt ) are constant
as functions of t. A time series plot of the observed returns, like the ones in
Figures 2.2 and 2.3, provides some evidence regarding the assumption of weak
stationarity: If the underlying process is weakly stationary, we expect the level
of the observed returns as well as their variation to be approximately constant
over time. However, the variability inherent in these plots makes assessment
of such properties difficult.
For instance, suppose we are interested in the mean function μt . If μt
is not constant as a function of t, then we have, in general, only a single
observation, Rt , with expected value μt ; hence, it is difficult, if not impossible,
to accurately estimate μt . Thus, it is important to attempt to determine if
the observed returns are consistent with a mean function μt that is constant
over time. One approach to assessing the extent to which μt varies with t is
to calculate running means.
Suppose we observe returns R1 , R2 , . . . , RT and consider running means
based on w observations for some positive integer w. Then the first running
mean is the average of the first w observations:

\[ \bar{R}_{1,w} \equiv \frac{1}{w} \sum_{t=1}^{w} R_t; \]
the second running mean is the average of the w observations
R2 , R3 , . . . , Rw+1 ,
\[ \bar{R}_{2,w} \equiv \frac{1}{w} \sum_{t=2}^{w+1} R_t, \]
and so on. The result is a sequence R̄1,w , R̄2,w , . . . , R̄T −w+1,w such that each
element in the sequence is an average of w returns.
Note that
\[ E(\bar{R}_{t,w}) = \frac{1}{w} \sum_{j=t}^{w+t-1} \mu_j \]
so that if μt is constant as a function of t, μt = μ for all t, then E(R̄t,w ) = μ
and we expect R̄t,w to be approximately constant as a function of t; in this
sense, the information provided by the R̄t,w is similar to the information
provided by the Rt . However, because each R̄t,w is an average of w returns, it
has smaller variance than does Rt , particularly if w is fairly large. Therefore,
if μt varies with t, we expect that variation to be reﬂected in the R̄t,w .
Under the assumption that μt and σt are constant as functions of
t, the standard error of R̄t,w is given by S/√w, where S denotes the sample
standard deviation of R1 , R2 , . . . , RT ; this standard error provides some basis
for evaluating the observed variation in R̄1,w , R̄2,w , . . . , R̄T −w+1,w .
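
The definition of the running means can be written out directly in base R. The function running_mean below is a hypothetical helper, not part of any package, and the input is simulated data standing in for 60 monthly returns; it is a sketch of the definition rather than a replacement for a packaged routine.

```r
# Running means of width w, computed from cumulative sums (base R only)
running_mean <- function(x, w) {
  stopifnot(w >= 1, w <= length(x))
  cs <- c(0, cumsum(x))
  # cs[t + w] - cs[t] equals x[t] + ... + x[t + w - 1]
  (cs[(w + 1):length(cs)] - cs[1:(length(x) - w + 1)]) / w
}

set.seed(4)
# Simulated stand-in for 60 monthly returns (parameters are assumptions)
x <- rnorm(60, mean = 0.01, sd = 0.044)
w <- 12

rm12 <- running_mean(x, w)   # T - w + 1 = 49 running means
se   <- sd(x) / sqrt(w)      # standard error S / sqrt(w)

print(length(rm12))
print(round(c(first = rm12[1], last = rm12[length(rm12)], se = se), 4))
```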
In R, running means may be calculated using the function running in the
package gtools (Warnes et al. 2015).
Example 2.18 For the monthly returns on Wal-Mart stock, stored in the
variable wmt.m.ret, the running means based on a width of w = 12 are
calculated by
> library(gtools)
> wmt.rmean<-running(wmt.m.ret, fun=mean, width=12)

FIGURE 2.5
Time series plot of running means of monthly returns on Wal-Mart stock.

The variable wmt.rmean is a vector of running means, each based on
12 months of monthly returns; Figure 2.5 contains a time series plot of the
running means of Wal-Mart monthly returns, together with “error bars” of
the form R̄ ± 2S/√w. The plot was constructed using the commands given here.
> plot(wmt.rmean, type="l", ylim=c(-0.02, 0.040), xlab="Time",
+ ylab="Return")
> lines(1:49, rep(mean(wmt.m.ret) + 2*sd(wmt.m.ret)/(12^.5), 49),
+ lty=2)
> lines(1:49, rep(mean(wmt.m.ret) - 2*sd(wmt.m.ret)/(12^.5), 49),
+ lty=2)
The function lines adds lines to an existing plot. In this case, each line
is horizontal at y = R̄ ± 2S/√w; the command rep constructs a vector of
length 49 (based on the second argument) by repeating the first argument.
The argument lty in lines sets the type of line to be used; in this case, lty=2
specifies a dashed line. An alternative to the function lines is the function
abline, which adds a line to an existing plot by specifying the slope, the
argument b, and the y-intercept, the argument a, to the plot. Therefore, to
add the upper error bar to the plot, we can use the command
> abline(a=mean(wmt.m.ret) + 2*sd(wmt.m.ret)/(12^.5), b=0, lty=2)
Note that the argument ylim is included in plot to set the y-limits in the
plot to be large enough so that the error bars appear on the plot.

When analyzing a plot such as Figure 2.5, there are a few things to keep
in mind. For instance, the error bars are based on the standard error of a
single sample mean centered at the sample mean based on all observations.
It is important to realize that the plot is based on many sample means (49 in
the case of Figure 2.5); the standard error does not apply to the maximum of
those sample means, for example. Therefore, we would not conclude that μt is
nonconstant simply because the plotted line crosses an error bar at some point.
Also, the random variables corresponding to the plotted running means are
often highly correlated; in the present case, adjacent running means are based
on sets of 12 observations, 11 of which are included in both running means.
Hence, if one running mean is large, it is likely that the neighboring running
means are large as well. Therefore, the error bars are useful for giving a rough
idea of the expected variability in the running means if the underlying process
is weakly stationary; however, they should not be used for any type of formal
inference.
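
The strength of the dependence between neighboring running means can be quantified in a simple case. If the returns are uncorrelated with constant variance, two adjacent running means of width w share w − 1 observations, and their correlation is (w − 1)/w. The following sketch checks this by simulation, assuming independent standard normal returns; the names are illustrative.

```r
set.seed(5)
n_rep <- 20000
w     <- 12

# Each row holds w + 1 i.i.d. returns; the two windows share w - 1 of them
R  <- matrix(rnorm(n_rep * (w + 1)), nrow = n_rep)
m1 <- rowMeans(R[, 1:w])         # running mean over observations 1..12
m2 <- rowMeans(R[, 2:(w + 1)])   # running mean over observations 2..13

cor_sim    <- cor(m1, m2)
cor_theory <- (w - 1) / w        # 11/12 for w = 12

print(round(c(simulated = cor_sim, theory = cor_theory), 3))
```

With a correlation this high, a running mean that drifts toward an error bar tends to stay near it for several periods even when the underlying process is stationary.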
The same approach used here for running means may be used for any other
summary statistic of the returns. Most useful in this regard is the standard
deviation; recall that if {Rt : t = 1, 2, . . .} is a weakly stationary process, then
σt² = Var(Rt ) does not depend on t. Running standard deviations may also be
calculated using the running command.

Example 2.19 To calculate the running standard deviations for the
Wal-Mart monthly returns in the variable wmt.m.ret, using a width of w = 12,
we use the command

> wmt.rsd<-running(wmt.m.ret, fun=sd, width=12)

Since the distribution of the sample standard deviation tends to be
skewed (e.g., the sample variance has a scaled chi-squared distribution if the
returns are normally distributed), it is generally preferable to plot the log of the standard
deviation rather than the standard deviation itself. It may be shown that the
error bars for the logs of the running sample standard deviations based on
w observations are of the form log(S) ± √(2/(w − 1)), where S is the sample
standard deviation of all the values. A plot of this type for monthly returns
of Wal-Mart stock is given in Figure 2.6.
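
The width of such error bars can be motivated by a large-sample approximation: for normally distributed returns, the standard error of the log of a sample standard deviation based on w observations is roughly 1/√(2(w − 1)), so two standard errors equal √(2/(w − 1)). The sketch below compares this approximation with a simulation, assuming independent standard normal data; for w as small as 12 the agreement is only rough.

```r
set.seed(6)
w     <- 12
n_rep <- 5000

# Sample standard deviations of n_rep independent normal samples of size w
# (each column of the matrix is one sample)
S <- apply(matrix(rnorm(n_rep * w), nrow = w), 2, sd)

se_log_sim    <- sd(log(S))              # simulated standard error of log(S)
se_log_approx <- 1 / sqrt(2 * (w - 1))   # large-sample approximation
half_width    <- 2 * se_log_approx       # equals sqrt(2 / (w - 1))

print(round(c(simulated = se_log_sim, approx = se_log_approx,
              half_width = half_width), 3))
```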

Sample Autocorrelation Function

Recall that the autocorrelation function ρ(·) of a weakly stationary stochas-
tic process {Yt : t = 1, 2, . . .} describes its correlation structure. For instance,
ρ(2) is the correlation of Yt and Yt+2 ; provided that the process is weakly
stationary, this correlation does not depend on the value of t. The auto-
correlation function may be written as ρ(h) = γ(h)/γ(0), where γ(·) is the
autocovariance function of the process; that is, γ(h) is the covariance of
Yt , Yt+h , h = 0, 1, 2, . . . . In practice, the autocorrelation function is unknown
and must be estimated from the data.
For example, suppose that we observe returns R1 , R2 , . . . , RT and consider
estimating ρ(1), the correlation of two observations one time period apart.

FIGURE 2.6
Time series plot of running standard deviations of monthly returns on
Wal-Mart stock.

From these data we have T − 1 pairs of observations one time period apart:

(R1 , R2 ), (R2 , R3 ), . . . , (RT −1 , RT ).

Then γ(1), the covariance of two observations one time period apart, may be
estimated by
\[ \hat{\gamma}(1) = \frac{1}{T} \sum_{t=1}^{T-1} (R_t - \bar{R})(R_{t+1} - \bar{R}). \]
In general, we estimate γ(h) by
\[ \hat{\gamma}(h) = \frac{1}{T} \sum_{t=1}^{T-h} (R_t - \bar{R})(R_{t+h} - \bar{R}). \]
The sample autocorrelation function is given by
\[ \hat{\rho}(h) = \frac{\hat{\gamma}(h)}{\hat{\gamma}(0)}, \quad h = 1, 2, \ldots. \]

Note that in γ̂(h), we use R̄ in place of the sample means of
(R1 , R2 , . . . , RT −h ) and (Rh+1 , . . . , RT ). Also, we take the divisor to be T ,
even though there are only T − h terms in the sum. It may be shown that these
choices lead to estimates ρ̂(h) with the usual properties expected of correlations.
For instance, if the divisor T − h is used in place of T , then it is possible
for ρ̂(h) to take values outside the interval [−1, 1]. Similarly, we take γ̂(0) to
be the sample variance of R1 , . . . , RT , computed with divisor T . An estimator
of ρ(1) is then given by
\[ \hat{\rho}(1) = \frac{\hat{\gamma}(1)}{\hat{\gamma}(0)}. \]
To calculate the sample autocorrelation function for a set of observed
returns in R, we can use the function acf.
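
To see how the definition maps onto the function acf, the estimator can be coded by hand and compared with the built-in result. In the sketch below, sample_acf is a hypothetical helper written for this comparison, and the data are simulated stand-ins for monthly returns; the argument plot=FALSE suppresses the plot.

```r
# Sample autocorrelations with divisor T and the overall mean, as defined above
sample_acf <- function(x, max_lag) {
  n    <- length(x)
  xbar <- mean(x)
  gamma_hat <- sapply(0:max_lag, function(h) {
    sum((x[1:(n - h)] - xbar) * (x[(1 + h):n] - xbar)) / n
  })
  gamma_hat / gamma_hat[1]   # rho_hat(h) = gamma_hat(h) / gamma_hat(0)
}

set.seed(7)
# Simulated stand-in for 60 monthly returns (parameters are assumptions)
x <- rnorm(60, mean = 0.01, sd = 0.044)

rho_manual <- sample_acf(x, 12)
rho_acf    <- as.numeric(acf(x, lag.max = 12, plot = FALSE)$acf)

print(round(rbind(manual = rho_manual, acf = rho_acf), 3))
```

The first element of each vector corresponds to lag 0 and is always 1.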
Example 2.20 Consider the monthly returns on Wal-Mart stock stored in the
variable wmt.m.ret. To calculate the sample autocorrelation of these returns,
we use
> print(acf(wmt.m.ret, lag.max=12))
Autocorrelations of series wmt.m.ret, by lag
0 1 2 3 4 5 6
1.000 -0.06 -0.04 -0.156 -0.043 -0.122 -0.007
7 8 9 10 11 12
0.119 0.043 0.136 -0.075 -0.221 0.000
The argument lag.max sets the maximum value of h for which to cal-
culate ρ̂(h); for monthly data, 12 is a reasonable maximum value. The acf
command also produces a plot of the sample autocorrelation function; the
plot for Wal-Mart monthly returns is given in Figure 2.7. Note that, without
using the function print, only the plot is produced.
The dashed lines in the plot are at approximately ±2/√T ; if the true
autocorrelation is 0, the standard error of the estimated autocorrelation is
approximately 1/√T .
Thus, the dashed lines give some indication of the statistical significance of the
autocorrelation estimates. However, it is important to keep in mind that this
standard error applies to each individual estimate (not the maximum of a
series of estimates, for example); thus, these error bars on the plot are best
used as a rough guide when evaluating the magnitude of the estimates. Also
note that the plot includes the value of the sample autocorrelation function
at h = 0, which is always 1; thus, this value contains no information.

FIGURE 2.7
Sample autocorrelation function for monthly returns on Wal-Mart stock.

For the Wal-Mart monthly returns, all sample autocorrelations are rel-
atively small in magnitude and are roughly consistent with a series of
uncorrelated random variables. A formal hypothesis test based on the sample
autocorrelation function will be discussed in Section 3.5.
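
The caution about reading several sample autocorrelations against bounds designed for a single estimate can be illustrated by simulation. The sketch below generates independent normal series of length 60 (an assumed stand-in for uncorrelated monthly returns) and records how often at least one of the first 12 sample autocorrelations falls outside ±2/√T; the names are illustrative.

```r
set.seed(8)
n_obs <- 60                  # same length as the monthly return series
n_rep <- 2000
bound <- 2 / sqrt(n_obs)

# For each simulated white noise series: does any of lags 1..12 exceed the bound?
any_exceed <- replicate(n_rep, {
  z   <- rnorm(n_obs)
  rho <- as.numeric(acf(z, lag.max = 12, plot = FALSE)$acf)[-1]  # drop lag 0
  any(abs(rho) > bound)
})

prop_any <- mean(any_exceed)
print(round(c(bound = bound, prop_any = prop_any), 3))
```

The proportion is typically far above the roughly 5% that applies to a single lag, so one estimate outside the bounds is weak evidence of correlation.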

Shape of the Return Distribution

Although it is not directly related to the second-order properties of {Rt : t =
1, 2, . . .}, the shape of the return distribution is also of interest. For instance,
is the return distribution symmetric or skewed? Does it follow the familiar
“bell-shaped” curve of the normal distribution or is it “long-tailed,” with
more very large and very small values than would be expected from normally
distributed data? In order to investigate these questions, we need to assume
that the returns R1 , R2 , . . . are identically distributed, an assumption we
maintain through the remainder of this section.
The simplest method of assessing the shape of the distribution is to
compute a histogram of the data.
Example 2.21 Consider the daily returns on Wal-Mart stock stored in the
R variable wmt.ret. The function hist can be used to plot a histogram of the
data
> hist(wmt.ret)
The plot of the histogram is given in Figure 2.8.
One drawback of histograms is that they are sensitive to the number of
intervals used in their construction. The default value in R uses “Sturges’ rule,”
in which the number of intervals is based on log2 (T ) + 1, where log2 denotes the
base-2 logarithm and T is the sample size. An alternative method that often
yields more informative histograms is the “Freedman–Diaconis rule,” which is
based on an interval width of 2(IQR)/T^{1/3}, where IQR denotes the interquartile
range of the data,
the distance between the lower and the upper quartiles. Thus, Sturges’ rule
uses only the number of data points, while the Freedman–Diaconis rule also
takes into account the variability in the data.
To use the Freedman–Diaconis rule, the argument breaks="FD" is added
to the hist function.
> hist(wmt.ret, breaks="FD")
The plot of this histogram is given in Figure 2.9.
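
The two rules can be compared numerically using the functions nclass.Sturges and nclass.FD, which return the number of intervals each rule suggests. The sketch below uses simulated long-tailed data as a stand-in for daily returns (a scaled t-distribution with six degrees of freedom is an assumption made only for illustration). Note that hist treats these numbers as suggestions and may adjust them to obtain round breakpoints.

```r
set.seed(9)
# Simulated stand-in for 1258 daily returns with standard deviation near 0.009
x <- 0.009 * rt(1258, df = 6) / sqrt(6 / 4)

n_sturges <- nclass.Sturges(x)               # ceiling(log2(T) + 1)
fd_width  <- 2 * IQR(x) / length(x)^(1/3)    # Freedman-Diaconis interval width
n_fd      <- nclass.FD(x)                    # roughly diff(range(x)) / fd_width

h_default <- hist(x, plot = FALSE)                # Sturges' rule
h_fd      <- hist(x, breaks = "FD", plot = FALSE) # Freedman-Diaconis rule

print(c(sturges = n_sturges, fd = n_fd))
print(c(bins_default = length(h_default$counts), bins_fd = length(h_fd$counts)))
```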

FIGURE 2.8
Histogram of daily returns on Wal-Mart stock.

It is also possible to choose the number of intervals in the histogram
explicitly, by specifying the argument breaks as a positive integer, or to
choose breakpoints between the intervals by specifying breaks as a vector of
breakpoints.
A histogram is useful for assessing general features of the return distri-
bution; for instance, in the example, the distribution of daily returns on
Wal-Mart stock appears to be roughly symmetric. However, some features of
the distribution are difficult to evaluate using a histogram. For instance, based
on Figures 2.8 and 2.9, it is difficult to determine if the distribution of daily
returns on Wal-Mart stock is long-tailed or not. This is particularly true when
analyzing monthly returns for which the sample size is generally much smaller.
An alternative approach to assessing the shape of a distribution relative to
that of a normal distribution is to use a normal probability plot. In a normal
probability plot, the sample quantiles of a set of data are plotted versus the
quantiles of the standard normal distribution; therefore, such a plot is also
called a quantile–quantile plot or a Q–Q plot.
Because the quantiles of a normal distribution with mean μ and standard
deviation σ are of the form
μ + σzα
where zα is a quantile of the standard normal distribution, an approximately
linear normal probability plot indicates that the shape of the sample distribu-
tion is approximately normal. Different types of nonlinearity indicate different
deviations from normality.
In R, a normal probability plot can be obtained using the function qqnorm.

FIGURE 2.9
Histogram of daily returns on Wal-Mart stock using the Freedman–Diaconis
rule.

Example 2.22 Before constructing a normal probability plot for observed
return data, it is helpful to look at such plots for samples from known
distributions.
Samples of 100 data values were randomly generated from four distribu-
tions: the standard normal distribution, the t-distribution with three degrees
of freedom, the chi-squared distribution with two degrees of freedom, and a
uniform distribution on the interval [0, 1]. The t-distribution is symmetric but
long-tailed, the chi-squared distribution is skewed with a long right tail, and
the uniform distribution is symmetric but short-tailed.
In assessing the linearity of a normal probability plot, it is helpful to
include a reference line in the plot; one choice for such a line is the one with
the slope given by the sample standard deviation of the data and y-intercept
given by the sample mean of the data. Based on the earlier discussion, if
the data are approximately normally distributed, the normal probability plot
should follow such a line, at least approximately.
The R functions rnorm, rt, rchisq, and runif can be used to generate
data from the four distributions described previously. The speciﬁc commands
for our case are

> x_norm<-rnorm(100)
> x_t<-rt(100, df=3)
> x_chisq<-rchisq(100, df=2)
> x_unif<-runif(100)

To construct the normal probability plot for the data in x_norm, we use
the command

> qqnorm(x_norm)

To add the reference line, we can use the function abline, as described
previously:

> abline(a=mean(x_norm), b=sd(x_norm))

Figure 2.10 gives the normal probability plots for the four randomly gen-
erated sets of data. Note that the plotted points for the normal data generally
follow the line, although there are some minor deviations. The points for the t
data are below the line when the normal quantile is large and negative and are
above the line when the normal quantile is large and positive. This indicates
that the quantiles of the sample corresponding to probabilities close to one are
more extreme than those of the normal distribution, and the quantiles of the
sample corresponding to probabilities close to zero are also more extreme than
those of the normal distribution; that is, the sample distribution is long-tailed.
The chi-squared distribution has a long right tail but a short left tail.
Therefore, the quantiles corresponding to probabilities close to one are more
extreme than those of the normal distribution, while the quantiles
corresponding to probabilities close to zero are less extreme. Therefore, in the plot for the
chi-squared data, the right-most points are above the reference line, indicating
that those sample quantiles are larger than the corresponding normal
quantiles; the left-most points are also above the reference line, indicating
that those sample quantiles do not extend as far into the left tail as the
normal quantiles. This behavior is typical of a skewed
distribution.
For the uniform data, the points corresponding to probabilities close to
one are below the reference line and the points corresponding to probabilities
close to zero are above the reference line, indicating that both the right and
the left tails of the distribution are relatively short.

FIGURE 2.10
Normal probability plots for randomly generated data (panels: normal data,
t data, chi-squared data, uniform data; sample quantiles versus theoretical
quantiles).
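
The visual impression of linearity can be backed by a simple numerical summary. The function qqnorm returns the plotted coordinates when called with plot.it=FALSE, so the correlation between sample and normal quantiles can be computed directly. The sketch below is illustrative only: it uses samples of size 1000 rather than 100 to reduce sampling noise, and the correlation is an informal summary, not a formal test.

```r
set.seed(10)
n      <- 1000
x_norm <- rnorm(n)
x_t    <- rt(n, df = 3)

# qqnorm returns a list with x (normal quantiles) and y (the data)
qq_norm <- qqnorm(x_norm, plot.it = FALSE)
qq_t    <- qqnorm(x_t, plot.it = FALSE)

# Correlation between sample and normal quantiles: near 1 for a linear plot
r_norm <- cor(qq_norm$x, qq_norm$y)
r_t    <- cor(qq_t$x, qq_t$y)

# Vertical distance of each t point from the reference line mean + sd * z
resid_t <- qq_t$y - (mean(x_t) + sd(x_t) * qq_t$x)

print(round(c(r_norm = r_norm, r_t = r_t), 4))
print(round(range(resid_t), 2))
```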

Example 2.23 The normal probability plot for the daily returns on Wal-Mart
stock analyzed in Example 2.21 can be constructed using the commands
> qqnorm(wmt.ret)
> abline(a=mean(wmt.ret), b=sd(wmt.ret))
The plot is given in Figure 2.11. According to this plot, the distribu-
tion of daily returns on Wal-Mart stock is approximately symmetric but
long-tailed.

The results for the returns on Wal-Mart stock presented in the previous
example are typical of stock return data: the distributions tend to be
long-tailed relative to the normal distribution. For this reason, a t-distribution with
a small number of degrees of freedom, for example, six, is sometimes used as a
model for return data.
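
To give a sense of what long-tailed means for such a model, the tail probabilities of a t-distribution with six degrees of freedom can be compared with those of a normal distribution, measuring distance from the mean in standard-deviation units. The sketch below uses the fact that a t-distribution with df > 2 degrees of freedom has standard deviation √(df/(df − 2)); the variable names are illustrative.

```r
df   <- 6
sd_t <- sqrt(df / (df - 2))          # standard deviation of a t-distribution

k      <- 2:4                        # thresholds in standard-deviation units
p_norm <- 2 * pnorm(-k)              # P(|X - mean| > k sd) under normality
p_t    <- 2 * pt(-k * sd_t, df = df) # same probability for a t with 6 df

print(data.frame(k = k,
                 normal = signif(p_norm, 3),
                 t6     = signif(p_t, 3),
                 ratio  = round(p_t / p_norm, 1)))
```

The ratio grows with k, so the difference between the two models matters most for the rare, very large returns.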

FIGURE 2.11
Normal probability plot of daily returns on Wal-Mart stock.

For the methods described in this book, an assumption of a specific distribution
for returns is not needed; in particular, it is not assumed that returns
are normally distributed. We do assume that the mean and standard deviation
of the returns are useful summaries of the return distribution; this is generally
true provided that the distribution of the data is not severely skewed and does
not have tails so long that the standard deviation fails to provide useful
information.

2.6 Suggestions for Further Reading

Topics such as prices, returns, and so on are covered in many books on financial
modeling; hence, the following references are a sampling of those available.
Benninga (2008) presents a comprehensive introduction to a wide range of
topics in finance at a level suitable for beginners and with many detailed
examples; see also Reilly and Brown (2009). A more rigorous treatment of
these topics is available from Campbell et al. (1997).
The statistical properties of a series of observations and the analysis of
such data are discussed in books on time series analysis, such as those by
Montgomery et al. (2008) and Cowpertwait and Metcalfe (2009). A good introduction
to time series analysis is provided by Newbold et al. (2013, Chapter 16);
Ruppert (2004) and Sclove (2013) present more detailed introductions
specifically geared toward financial statistics.
Stationary processes and autocovariance and autocorrelation functions are
discussed by Wei (2006, Chapter 2). Chapters 1 and 2 of Montgomery et al.
(2008) contain a good introduction to the analysis of time series data, including
graphical methods and the numerical summaries discussed in Section 2.5. Run-
ning means are a type of “smoothing” of the data; see Chambers et al. (1983,
Chapter 4) for an introduction to other types of smoothing methods. Nor-
mal probability plots and quantile–quantile plots in general are discussed by
Chambers et al. (1983, Chapter 6).

2.7 Exercises
1. Consider an asset with prices P0 = $10.00, P1 = $10.40, P2 =
$10.20, P3 = $11.00, and P4 = $11.10. Find the corresponding
returns and log-returns.
2. Suppose that an asset has prices of the form Pt = a exp(bt), t =
0, 1, 2, . . . for some constants a > 0 and b. Find expressions for the
return at time t, Rt , and the log-return at time t, rt .
3. Consider an asset with return Rt and log-return rt at time t. Using
a Taylor’s series approximation, ﬁnd a quadratic function of Rt that
approximates rt for small values of |Rt |.


4. Suppose that an asset has prices P0 = $4.00, P1 = $4.80, P2 = $5.00,
and P3 = $5.40 and dividends D1 = 0, D2 = 0.40, and D3 = 0. Find
the corresponding returns.
5. For the asset prices and dividends given in Exercise 4, compute the
adjusted prices.
6. Let P0, P1, . . . , PT denote a sequence of prices of an asset and
let D1, D2, . . . , DT denote the corresponding sequence of dividends.
Suppose that Dt = kPt−1, t = 1, 2, . . . , T for some constant
k, so that the dividends are proportional to the price of the
asset.
a. Find an expression for the return Rt as a function of the prices
Pt, Pt−1, and k.
b. Find an expression for the corresponding sequence of adjusted
prices.
7. Let {Yt : t = 1, 2, . . .} and {Xt : t = 1, 2, . . .} be weakly stationary
processes. Does it follow that the process {Yt + Xt : t = 1, 2, . . .} is
weakly stationary? Why or why not?
8. Let {Xt : t = 0, 1, 2, . . .} denote a weakly stationary process; let
μ = E(Xt), σ^2 = Var(Xt), and let γ(·) denote the autocovariance
function of the process. Define

Yt = Xt − Xt−1, t = 1, 2, . . . .

a. Find the mean and variance functions of the process {Yt : t =
1, 2, . . .}.
b. Find the covariance function of {Yt : t = 1, 2, . . .}.
c. Is the process {Yt : t = 1, 2, . . .} weakly stationary? Why or why
not?
9. Let {Zt : t = 1, 2, . . .} denote a weak white noise process and let Z
denote a random variable with mean 0 and variance 1 such that,
for any t = 1, 2, . . ., Z and Zt are independent. Define

Xt = ZZt, t = 1, 2, . . . .

a. Find the mean and variance functions of the process {Xt : t =
1, 2, . . .}.
b. Find the covariance function of {Xt : t = 1, 2, . . .}.
c. Is the process {Xt : t = 1, 2, . . .} weakly stationary? Is it a white
noise process? Why or why not?


10. Let r1, r2, . . . denote the log-returns on an asset and suppose that
the stochastic process {rt : t = 1, 2, . . .} is weakly stationary. Let

r̃1 = r1 + r2 + · · · + r21,

r̃2 = r22 + r23 + · · · + r42,

and so on. For instance, the rt might represent daily log-returns
and the r̃t represent the corresponding monthly log-returns. Is {r̃t :
t = 1, 2, . . .} weakly stationary? Why or why not?
11. Let X1, X2, . . . , XT denote independent, identically distributed
random variables with mean μ = E(X1) and variance σ^2 = Var(X1).
For a given value of w, let

\[ \bar{X}_{k,w} = \frac{1}{w} \sum_{t=k}^{w+k-1} X_t, \qquad k = 1, 2, \ldots, T - w + 1 \]

denote the running means with span w. For convenience, write

Yj = X̄j,w, j = 1, 2, . . . , T − w + 1.

Is Y1, Y2, . . . , YT−w+1 a weakly stationary process? If so, find the
mean, variance, and the correlation function of the process.
12. Let {Xt : t = 1, 2, . . .} and {Yt : t = 1, 2, . . .} denote weak white
noise processes such that Xt and Ys are independent random
variables for any t, s.
a. Is {Xt + Yt : t = 1, 2, . . .} a weak white noise process? Why or
why not?
b. Is {Xt Yt : t = 1, 2, . . .} a weak white noise process? Why or why
not?
13. Consider stock in Papa John’s International, Inc. (symbol PZZA).
a. Using the R function get.hist.quote, download the adjusted
prices needed to calculate three years of daily returns for the
period ending December 31, 2015.
b. Compute three years of daily returns corresponding to the prices
obtained in Part (a).
c. Use the function summary to compute the summary statistics for
the returns computed in Part (b).
d. Construct a time series plot of the returns calculated in Part (b).
14. Consider stock in Papa John’s International, Inc. (symbol PZZA).
a. Using the R function get.hist.quote, download the adjusted
prices needed to calculate five years of monthly returns for the
period ending December 31, 2015.
b. Compute five years of monthly returns corresponding to the
prices obtained in Part (a).
c. Use the function summary to compute the summary statistics for
the returns computed in Part (b).
d. Construct a time series plot of the returns calculated in Part (b).
15. For the five years of monthly return data for Papa John’s stock
calculated in Exercise 14, calculate the running means using a width
of 12 months. Plot the running means versus time; include error
bars in the plot; see Example 2.18. Based on this plot, what do you
conclude regarding the weak stationarity of the stochastic process
corresponding to monthly returns on Papa John’s stock?
16. For the five years of monthly returns on Papa John’s stock calcu-
lated in Exercise 14, calculate the running standard deviations using
a width of 12 months. Plot the log of the running standard devia-
tions versus time; include error bars in the plot; see Example 2.19.
Based on this plot, what do you conclude regarding the weak sta-
tionarity of the stochastic process corresponding to monthly returns
on Papa John’s stock?
17. For the three years of daily returns on Papa John’s stock calculated
in Exercise 13, calculate the sample autocorrelation function and
plot the results; use a maximum lag of 20. See Example 2.20. Are
these results consistent with the assumption that the returns are
uncorrelated random variables? Why or why not?
Repeat these calculations using the five years of monthly return
data calculated in Exercise 14 and a maximum lag of 12. Are
these results consistent with the assumption that the returns are
uncorrelated random variables? Why or why not?
18. For the three years of daily returns on Papa John’s stock calculated
in Exercise 13, construct a normal probability plot; see Exam-
ple 2.23. Based on this plot, what do you conclude regarding the
distribution of daily returns of Papa John’s stock?
Repeat the analysis using the five years of monthly returns on
Papa John’s stock calculated in Exercise 14. Do you reach the same
conclusions as you did when analyzing the daily returns?


3
Random Walk Hypothesis

3.1 Introduction
Let P0 , P1 , P2 , . . . , denote a sequence of prices of an asset, such as one share
of a particular stock, and let R1 , R2 , . . . denote the corresponding sequence
of returns. Clearly, in making investment decisions, it would be useful to
be able to use historical price and return data to predict future prices and
returns. For instance, “technical analysis” is based on the belief that there
are certain patterns in stock market data that tend to appear frequently and,
by recognizing such patterns, it is possible to make useful predictions about
the future prices and returns. A more statistical approach may look for a
model that relates future prices or returns to past values and may use such a
model for prediction.
An alternative viewpoint is that changes in the price of a stock are essen-
tially unpredictable; this theory is known as the random walk hypothesis. Note
that the random walk hypothesis does not mean that all stock prices are totally
random and that one stock is as good as any other. The random walk hypoth-
esis is a statement about the lack of useful statistical relationships between
past and future prices and returns. However, one stock might still have a
higher average return than another stock. Thus, past returns on a stock may
provide useful information about future returns in the sense that they may be
used to estimate the parameters of the return distribution.

3.2 Conditional Expectation

In discussing random walk models for asset prices, it is helpful to use the
concept of conditional expectation; hence, in this section, we review some of
its basic properties.
Let X and Y denote random variables. Recall that E(Y |X = x) is
the expectation of Y computed under the assumption that X is held fixed
at the value x; it is known as the conditional expectation of Y given X = x.
Note that E(Y |X = x) is a function of x; that is, depending on the value at
which X is held fixed, the conditional expectation of Y changes.
Define h(x) = E(Y |X = x); note that h is a function defined on the range of
the random variable X. Thus, we may consider the random variable h(X), known
as the conditional expectation of Y given X and denoted by E(Y |X). It is
important to keep in mind that E(Y |X) is a random variable. Conditional
expectations such as E(Y |X) are useful for describing certain aspects of the
relationship between X and Y.
Just as the (unconditional) expected value of Y may be calculated based
on knowledge of the distribution of Y , E(Y |X) may be calculated based on
knowledge of the joint distribution of Y and X. However, in this chapter, we
are not concerned with calculating conditional expectations for speciﬁc distri-
butions, but rather we are interested in the general properties of conditional
expectation.
Some properties are straightforward extensions of the properties of
unconditional expectation; for example, for random variables Y1 , Y2 , X and
constants a0 , a1 , a2 ,

E(a0 + a1 Y1 + a2 Y2 |X) = a0 + a1 E(Y1 |X) + a2 E(Y2 |X).

However, we are more concerned with those properties that describe how
E(Y |X) reﬂects the relationship between X and Y .
The following lemma gives three such properties; see, for example,
Blitzstein and Hwang (2015, Chapter 9) for proofs of these results, along
with a detailed discussion of conditional expectation.

Lemma 3.1. Let Y and X denote random variables and let g(·) be a
real-valued function on the range of X.

1. E {E(Y |X)} = E(Y ).

2. E {Y g(X)|X} = g(X)E(Y |X).
3. If X and Y are independent random variables, then E(Y |X) = E(Y ).

Part 1 of the lemma states that the expected value of the random vari-
able E(Y |X) is the same as the expected value of Y . This result may give a
convenient way to calculate E(Y ), if E(Y |X) is easily obtained, using

E(Y ) = E {E(Y |X)} .

Part 2 states that, when computing E(Y g(X)|X), we may treat g(X) as
a constant and factor it out of the conditional expectation calculation. If
X and Y are independent, then treating X as fixed does not change
the expected value of Y ; it follows that E(Y |X = x) = E(Y ), leading to
Part 3 of the lemma. Note that these three results continue to hold if X is a
vector-valued random variable of the form (X1 , X2 , . . . , Xm )^T .
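
The three parts of Lemma 3.1 can be checked exactly for a small discrete distribution. The following R sketch is only an illustration: the joint probabilities in p, the support points, and the function g are hypothetical values chosen so that the arithmetic is easy to follow.

```r
# Hypothetical joint distribution of (X, Y): rows index x, columns index y.
x.vals <- c(1, 2, 3)
y.vals <- c(-1, 0, 2)
p <- matrix(c(0.10, 0.05, 0.15,
              0.20, 0.10, 0.10,
              0.05, 0.15, 0.10),
            nrow = 3, byrow = TRUE)

p.x <- rowSums(p)                       # marginal distribution of X
p.y <- colSums(p)                       # marginal distribution of Y
h <- as.vector(p %*% y.vals) / p.x      # h(x) = E(Y | X = x)

# Part 1: E{E(Y|X)} = E(Y)
lhs1 <- sum(h * p.x)
rhs1 <- sum(y.vals * p.y)

# Part 2 with g(x) = x^2: E{Y g(X) | X = x} = g(x) E(Y | X = x)
g <- function(x) x^2
yg <- outer(g(x.vals), y.vals)          # values of g(x) * y on the grid
lhs2 <- rowSums(yg * p) / p.x
rhs2 <- g(x.vals) * h

# Part 3: with the same marginals but independent X and Y, E(Y | X = x) = E(Y)
p.ind <- outer(p.x, p.y)
h.ind <- as.vector(p.ind %*% y.vals) / rowSums(p.ind)

print(c(lhs1, rhs1))
print(rbind(lhs2, rhs2))
print(h.ind)
```

In the dependent case h(x) varies with x, while under independence every entry of h.ind equals E(Y).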
Example 3.1 Let X and Y denote real-valued random variables and let
Ŷ = E(Y |X). Consider the covariance of Ŷ and X, Cov(Ŷ , X); recall that we
may write
Cov(Ŷ , X) = E(Ŷ X) − E(Ŷ )E(X).


Using Part 2 of Lemma 3.1,

E(Ŷ X) = E (E(Y |X)X) = E (E(Y X|X))

and, using Part 1 of that lemma,

E (E(Y X|X)) = E(Y X).

Since, by Part 1 of Lemma 3.1,

E(Ŷ ) = E (E(Y |X)) = E(Y ),

it follows that

Cov(Ŷ , X) = E(Ŷ X) − E(Ŷ )E(X) = E(Y X) − E(Y )E(X) = Cov(Y, X).

That is, the covariance of E(Y |X) and X is the same as the covariance of
Y and X.
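
The identity in Example 3.1 can also be verified numerically. The sketch below assumes a small, hypothetical joint distribution for (X, Y) and computes both covariances exactly from the probabilities; the object names are illustrative.

```r
# Hypothetical joint distribution of (X, Y): rows index x, columns index y.
x.vals <- c(1, 2, 3)
y.vals <- c(-1, 0, 2)
p <- matrix(c(0.10, 0.05, 0.15,
              0.20, 0.10, 0.10,
              0.05, 0.15, 0.10),
            nrow = 3, byrow = TRUE)

p.x <- rowSums(p)
y.hat <- as.vector(p %*% y.vals) / p.x      # E(Y | X = x) for each x

EX  <- sum(x.vals * p.x)
EY  <- sum(y.vals * colSums(p))
EXY <- sum(outer(x.vals, y.vals) * p)       # E(XY) from the joint distribution

cov.Y.X    <- EXY - EX * EY
cov.Yhat.X <- sum(y.hat * x.vals * p.x) - sum(y.hat * p.x) * EX

print(c(cov.Y.X = cov.Y.X, cov.Yhat.X = cov.Yhat.X))
```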

The random variable E(Y |X) may be interpreted as the function of X that
“best approximates” Y , and it is sometimes described as the “best predictor”
of Y among functions of X. This idea is made precise by the following lemma.

Lemma 3.2. Let Y denote a real-valued random variable and let X denote a
random variable, possibly vector-valued. For any real-valued function g on the
range of X,

\[ E\{(Y - E(Y \mid X))^2\} \le E\{(Y - g(X))^2\} \]

with equality if and only if g(X) = E(Y |X) with probability one.

Before presenting the proof of this result, consider its relationship to pre-
diction. Suppose that our goal is to choose a function of X to approximate Y .
For instance, suppose that Y is the future price of an asset and X is a vector
containing past prices of the asset; in this case, approximating Y by a function
of X corresponds to predicting the future price using past price data. Suppose
that the quality of the approximation given by a function g(X) is deﬁned as
the expected squared error of the approximation,

\[ E\{(Y - g(X))^2\}. \]

Then, according to Lemma 3.2, the best approximation, that is, the best
predictor of Y among functions of X, is given by the conditional expectation
E(Y |X). For many purposes, it is more useful to interpret E(Y |X) as the best
approximation or best predictor of Y among functions of X rather than in
terms of the expected value of the conditional distribution of Y given X = x.
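
A small simulation illustrates the best-predictor interpretation. In the following R sketch, the model Y = X^2 + noise, the sample size, and the seed are all assumptions made for illustration; under this model E(Y |X) = X^2, and its mean squared error is compared with that of the best-fitting linear function of X and of a constant.

```r
set.seed(1)
n <- 100000
x <- runif(n, -2, 2)
y <- x^2 + rnorm(n)               # constructed so that E(Y | X) = X^2

mse <- function(pred) mean((y - pred)^2)   # sample version of E{(Y - g(X))^2}

fit.lin <- lm(y ~ x)              # least-squares linear function of X

mse.cond  <- mse(x^2)             # predictor g(X) = E(Y | X)
mse.lin   <- mse(fitted(fit.lin)) # predictor g(X) = a + b X
mse.const <- mse(mean(y))         # constant predictor

print(c(conditional = mse.cond, linear = mse.lin, constant = mse.const))
```

Lemma 3.2 is a statement about expected values; the sample averages here only approximate them, but with this many observations the ordering of the three errors is clear.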


Note that the lemma is a conditional version of the well-known result that
the unconditional expectation E(Y ) minimizes E{(Y − c)^2} over all constants
c. That result may be established by writing

\[
\begin{aligned}
E\{(Y - c)^2\} &= E\{(Y - E(Y) + E(Y) - c)^2\} \\
&= E\{(Y - E(Y))^2\} + \{E(Y) - c\}^2 + 2E\{(Y - E(Y))(E(Y) - c)\}
\end{aligned}
\]

and noting that

\[ E\{(Y - E(Y))(E(Y) - c)\} = \{E(Y) - c\}\, E\{Y - E(Y)\} = 0. \]

It follows that, for any real number c,

\[ E\{(Y - c)^2\} = E\{(Y - E(Y))^2\} + \{E(Y) - c\}^2 \]

and, hence,

\[ E\{(Y - c)^2\} \ge E\{(Y - E(Y))^2\} \]

with equality if and only if c = E(Y ).
We can use a version of this idea to prove Lemma 3.2.
Proof of Lemma 3.2. The proof follows the proof of the aforementioned
unconditional result. Note that

\[
\begin{aligned}
E\{(Y - g(X))^2\} &= E\{[Y - E(Y \mid X) + E(Y \mid X) - g(X)]^2\} \\
&= E\{[Y - E(Y \mid X)]^2\} + E\{[E(Y \mid X) - g(X)]^2\} \\
&\quad + 2E\{[Y - E(Y \mid X)][E(Y \mid X) - g(X)]\}.
\end{aligned}
\]

Using Part 1 of Lemma 3.1,

\[ E\{[Y - E(Y \mid X)][E(Y \mid X) - g(X)]\} = E\{E([Y - E(Y \mid X)][E(Y \mid X) - g(X)] \mid X)\} \]

and, because E(Y |X) − g(X) is a function of X, using Part 2 of Lemma 3.1,

\[ E\{[Y - E(Y \mid X)][E(Y \mid X) - g(X)] \mid X\} = \{E(Y \mid X) - g(X)\}\, E\{Y - E(Y \mid X) \mid X\} = 0. \]

It follows that, for any function g,

\[ E\{(Y - g(X))^2\} = E\{(Y - E(Y \mid X))^2\} + E\{[E(Y \mid X) - g(X)]^2\}; \]

note that {E(Y |X) − g(X)}^2 ≥ 0 so that

\[ E\{[E(Y \mid X) - g(X)]^2\} \ge 0 \]

and

\[ E\{(Y - g(X))^2\} \ge E\{(Y - E(Y \mid X))^2\} \]

with equality if and only if

\[ E\{[E(Y \mid X) - g(X)]^2\} = 0, \]

which holds if and only if g(X) = E(Y |X) with probability 1.


Example 3.2 Let Y and X denote real-valued random variables such that
Y = α + βX + ε
for some constants α and β, where ε is a mean-0 random variable such that
ε and X are independent. Then
E(Y |X) = E(α + βX + ε|X) = α + βE(X|X) + E(ε|X).
Because ε and X are independent, E(ε|X) = E(ε) = 0; furthermore, E(X|X) = X.
Hence,
E(Y |X) = α + βX;
that is, the best predictor of Y among functions of X is α + βX.
Note that this same result holds provided only that E(ε|X) = 0;
independence of ε and X is not required.
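
The closing remark of Example 3.2 can be illustrated by simulation. In the R sketch below, the error term is constructed as ε = |X| W with W independent of X, so that E(ε|X) = 0 even though ε and X are dependent; this construction, the parameter values, and the seed are assumptions made for illustration only.

```r
set.seed(10)
n <- 100000
alpha <- 1
beta <- 2

x <- rnorm(n)
eps <- abs(x) * rnorm(n)          # E(eps | X) = 0, but the spread of eps depends on X
y <- alpha + beta * x + eps

fit <- lm(y ~ x)                  # least-squares fit of a linear function of X
print(coef(fit))                  # close to (alpha, beta)
print(cor(x^2, eps^2))            # clearly positive: eps is not independent of X
```

The fitted coefficients are close to α and β, consistent with E(Y |X) = α + βX holding under the weaker condition E(ε|X) = 0.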

3.3 Eﬃcient Markets and the Martingale Model

Why might it be difficult to use past asset returns to predict future returns?
One possible reason is the efficiency of the markets that set asset prices, in the
sense that those prices reflect all currently available information; hence, past
return data does not include any additional useful information about future
returns.
To see how efficient markets affect the properties of asset prices, we use
the following simple model. Consider the price of one share of stock in a
given company. Let Ωt denote all information available at time t; that is, Ωt
consists of the values of all financial variables that have been observed up to
and including time t. Because information accumulates over time,
Ω0 ⊂ Ω1 ⊂ . . . ⊂ Ωt ⊂ Ωt+1 ⊂ . . . ;
that is, the information available at time s is also available at time s + h for
any h = 0, 1, 2, . . . .
Let V denote a random variable representing the intrinsic value of
one share of stock in this company; for instance, V may include properties of
the stock such as future dividends and so on. Let Pt denote the price of this
share at time t, t = 1, 2, . . . . Assume that
Pt = E(V |Ωt );
that is, we assume that the price of the stock at time t is the best predictor
of the value of the stock based on the information available at that time.
Suppose we are interested in predicting the price of the stock at time t + 1,
Pt+1 , using the information available at time t. Using the interpretation of
a conditional expectation as the best predictor of a random variable,
the best predictor of Pt+1 is E(Pt+1 |Ωt ). Because Pt+1 = E(V |Ωt+1 ), the best
predictor of Pt+1 based on the information available at time t may be written
E(Pt+1 |Ωt ) = E(E(V |Ωt+1 )|Ωt ). (3.1)


Iterated Conditional Expectations

The term on the right-hand side of (3.1) is known as an iterated conditional
expectation, that is, the conditional expectation of a conditional expectation.
Note that E(V |Ωt+1 ) is the best predictor of V using the information in
Ωt+1 and E(E(V |Ωt+1 )|Ωt ) is the best predictor of that predictor using the
information in Ωt .
Therefore, the end result, E(E(V |Ωt+1 )|Ωt ), is a predictor of V based
on the information in Ωt calculated using a two-step process. Because the
information in Ωt is also in Ωt+1 , this ﬁnal predictor must be at least as good
as E(V |Ωt ). But we know that E(V |Ωt ) is the best predictor of V using the
information in Ωt . Therefore, we expect that

E(E(V |Ωt+1 )|Ωt ) = E(V |Ωt ).

Note that the same argument holds if Ωt+1 is replaced by any set of
information that includes Ωt .
The following proposition gives a formal statement of this result. A proof
may be based on formalizing the argument described previously; the details
are omitted.
Proposition 3.1. For any random variable V and for any information sets
Ωt and Ωt+h such that Ωt ⊂ Ωt+h ,

E(E(V |Ωt+h )|Ωt ) = E(V |Ωt )

with probability 1.
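
Proposition 3.1 can be checked exactly in a toy setting. The following R sketch assumes that the information arrives as three fair coin flips coded as ±1, with Ω1 consisting of the first flip and Ω2 of the first two flips; the form of V is hypothetical. Because the eight outcomes are equally likely, conditional expectations are simply group means, computed here with ave().

```r
# All 8 equally likely outcomes of three fair coin flips, coded as -1 / +1.
flips <- expand.grid(z1 = c(-1, 1), z2 = c(-1, 1), z3 = c(-1, 1))

# Hypothetical "intrinsic value" determined by all three flips.
V <- 10 * exp(0.1 * (flips$z1 + flips$z2 + flips$z3))

cond2    <- ave(V, flips$z1, flips$z2)   # E(V | Omega_2): average over z3
two.step <- ave(cond2, flips$z1)         # E{E(V | Omega_2) | Omega_1}
direct   <- ave(V, flips$z1)             # E(V | Omega_1)

print(cbind(flips, V, cond2, two.step, direct))
```

The columns two.step and direct agree for every outcome, as the proposition states.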

The Martingale Model

We now apply this result to asset prices. Recall that, according to our model,
the price of a stock at time t is Pt = E(V |Ωt ), where V represents the intrinsic
value of the stock and Ωt is the information available at time t. The best
predictor of Pt+1 , the price of the stock at time t + 1, using the information
available at time t is E(Pt+1 |Ωt ); using Proposition 3.1, this may be written

E(Pt+1 |Ωt ) = E(E(V |Ωt+1 )|Ωt ) = E(V |Ωt ) = Pt . (3.2)

That is, the best predictor of tomorrow’s price of the stock is today’s price.
Clearly, the same argument works for any price in the future: for any
h = 1, 2, . . .,

E(Pt+h |Ωt ) = E(E(V |Ωt+h )|Ωt )

= E(V |Ωt ) = Pt .

A sequence of random variables P1 , P2 , . . . with this property is said to
be a martingale with respect to Ω1 , Ω2 , . . . . Therefore, this is known as the
martingale model for asset prices.


The martingale model for asset prices has important implications for the
properties of the corresponding returns. Let Xt+1 = Pt+1 − Pt denote the
change in the price from time t to time t + 1 and consider E(Xt+1 ). Then
E(Xt+1 ) = E{E(Xt+1 |Ωt )}
= E{E(Pt+1 − Pt |Ωt )}
= E{E(Pt+1 |Ωt ) − E(Pt |Ωt )}.
Note that Ωt , the information available at time t, includes Pt ; that is, the
random variable Pt is a function of the information in Ωt . It follows that
E(Pt |Ωt ) = Pt ;
furthermore, the result given in (3.2) shows that E(Pt+1 |Ωt ) = Pt . It follows
that
E(Xt+1 ) = E{E(Pt+1 |Ωt ) − Pt }
= E(Pt − Pt ) = 0.
Thus, price changes have an expected value of 0. Furthermore, the previous
argument shows that
E(Xt+1 |Ωt ) = 0
so that, given the information available at time t, the predicted value of the
change in price from time t to time t + 1 is 0.
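Let

\[ R_{t+1} = \frac{P_{t+1}}{P_t} - 1 = \frac{P_{t+1} - P_t}{P_t}, \qquad t = 0, 1, 2, \ldots \]

denote the return at time t + 1. Using the fact that Pt is a function of Ωt ,

\[ E(R_{t+1} \mid \Omega_t) = E\left( \frac{P_{t+1} - P_t}{P_t} \,\Big|\, \Omega_t \right) = \frac{1}{P_t} E(P_{t+1} - P_t \mid \Omega_t) = 0; \tag{3.3} \]
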
that is, under the martingale model, the best predictor of the return in period
t + 1 using ﬁnancial information in periods up to and including period t is zero.
Note that this result also implies that the (unconditional) expected value of
Rt+1 is also zero:
E(Rt+1 ) = E {E(Rt+1 |Ωt )} = E(0) = 0.
Furthermore, the following result shows that the correlation of any two
returns Rt and Rs , t ≠ s, is zero.
Proposition 3.2. Using the framework of this chapter, define the price of an
asset by
Pt = E(V |Ωt ), t = 0, 1, . . .
and let R1 , R2 , . . ., denote the corresponding returns.
Then, under the martingale model, for any t, s = 1, 2, . . ., t ≠ s,
Cov(Rt , Rs ) = 0.


Proof. Without loss of generality, we may assume that t < s. According to
(3.3), E(Rt ) = E(Rs ) = 0 so that

Cov(Rt , Rs ) = E(Rt Rs ).

Note that
E(Rt Rs ) = E {E(Rt Rs |Ωs−1 )}
and that, because t ≤ s − 1, Rt and Ps−1 are functions of Ωs−1 . It follows that

\[
\begin{aligned}
E(R_t R_s \mid \Omega_{s-1}) &= R_t E(R_s \mid \Omega_{s-1}) = R_t E\left( \frac{P_s - P_{s-1}}{P_{s-1}} \,\Big|\, \Omega_{s-1} \right) \\
&= R_t \frac{1}{P_{s-1}} E(P_s - P_{s-1} \mid \Omega_{s-1}) \\
&= R_t \frac{1}{P_{s-1}} (P_{s-1} - P_{s-1}) = 0,
\end{aligned}
\]

using the fact that E(Ps |Ωs−1 ) = Ps−1 , establishing the result.
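
The conclusions of this section can be verified exactly in a toy version of the model. The R sketch below assumes that the intrinsic value is V = 10 plus the sum of three fair ±1 coin flips and that Ωt consists of the first t flips; these choices, and the helper pcov, are hypothetical. Prices are computed as Pt = E(V |Ωt ) using group means over the eight equally likely outcomes.

```r
# All 8 equally likely outcomes of three fair coin flips, coded as -1 / +1.
flips <- expand.grid(z1 = c(-1, 1), z2 = c(-1, 1), z3 = c(-1, 1))
V <- 10 + flips$z1 + flips$z2 + flips$z3      # hypothetical intrinsic value

# Prices P_t = E(V | Omega_t), where Omega_t contains the first t flips.
P0 <- rep(mean(V), length(V))
P1 <- ave(V, flips$z1)
P2 <- ave(V, flips$z1, flips$z2)
P3 <- V

# Martingale property: E(P_2 | Omega_1) = P_1.
mart.check <- ave(P2, flips$z1)

# Returns and their exact means and covariances over the 8 outcomes.
R1 <- P1 / P0 - 1
R2 <- P2 / P1 - 1
R3 <- P3 / P2 - 1
pcov <- function(a, b) mean(a * b) - mean(a) * mean(b)

print(c(mean(R1), mean(R2), mean(R3)))
print(c(pcov(R1, R2), pcov(R1, R3), pcov(R2, R3)))
print(pcov(R1, abs(R2)))          # nonzero: uncorrelated returns need not be independent
```

The last line shows that the size of R2 depends on R1 in this example, so zero correlation here does not amount to independence.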

3.4 Random Walk Models for Asset Prices

Under the martingale model, asset returns have zero mean and are uncorre-
lated. However, the framework we used to derive this model is a simple one and
its assumptions are unrealistic in many respects. In particular, we assumed
that the price of an asset is based only on the expectation of the asset’s value.
An alternative, and more realistic, approach is based on the assumption
that the current price of a stock is based on current beliefs regarding the
statistical properties of future earnings of the ﬁrm; in particular, the price
may be based on the expected value of the future earnings as well as their
variability. Including variability in the analysis is important because investors
are generally “risk averse.” Under risk aversion, more risky investments are
worth less than less risky investments with the same expected return.
For example, consider two investments. Stock 1 has a guaranteed return of
a while Stock 2 has a return of zero with probability 1/2 and a return of b with
probability 1/2. Stock 1 may be preferable, that is, it may be worth more to
an investor than Stock 2, even if a < b/2. It follows that the argument given
in the previous sections, which uses only expected values, is, at best, only a
rough approximation of the process used to determine asset prices. Under a
model that recognizes risk aversion, the martingale property no longer holds.
Therefore, although the assumption of an eﬃcient market suggests certain
properties of asset prices, it does not necessarily follow that the martingale
model holds in practice.
Hence, in this section, we consider models for asset prices that have many,
but not all, of the general features of the martingale model. For instance, based
on the models considered in this section, returns are statistically unrelated,
but they do not necessarily have zero mean. These models are based on the
concept of a random walk.
Consider a sequence of random variables Y0 , Y1 , Y2 , . . . . A simple model for
such a process is one in which the changes in process, Yt − Yt−1 , are “random”
in the sense that they have no discernible pattern and no relationships among
them. Such a process is known as a random walk because it can be viewed as
a model for the location on the real line of an “individual” who, at each time
point, moves randomly along the line. The statistical properties of a process of
this type depend on the interpretation of the term “random” used to describe
the movements of the individual. Thus, there are several different technical
definitions of a random walk, corresponding to different interpretations.
For a given stochastic process {Yt : t = 0, 1, 2, . . .}, let Zt = Yt − Yt−1 ,
t = 1, 2, . . . denote the changes in the values of Yt ; the Zt are known as the
increments of the process. Note that when discussing random walks, we will
often include Y0 , the value at time t = 0, in the process; this random variable
is needed to define the first increment Z1 . Thus, Y0 represents the “starting
point” of the process. The increment process {Zt : t = 1, 2, . . .} will generally
start at time t = 1.
Note that, given Y0 ,

Yt = Y0 + Z1 + · · · + Zt , t = 1, 2, . . . (3.4)

so that the random variables Y1 , Y2 , . . . are equivalent to the increments
Z1 , Z2 , . . . together with the starting point Y0 .
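
The equivalence expressed by (3.4) is easy to see in R, where a path is built from its increments with cumsum and the increments are recovered from the path with diff. The starting value, the increment distribution, and the seed in this sketch are arbitrary choices made for illustration.

```r
set.seed(42)
n <- 10
y0 <- 5                                # starting point Y_0 (arbitrary)
z <- rnorm(n, mean = 0.1, sd = 1)      # increments Z_1, ..., Z_n

y <- y0 + cumsum(z)                    # Y_t = Y_0 + Z_1 + ... + Z_t, t = 1, ..., n
z.back <- diff(c(y0, y))               # increments recovered from Y_0, Y_1, ..., Y_n

print(cbind(t = 1:n, z = z, y = y, z.back = z.back))
```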
In a random walk process, the increments are “noise” in the sense that
they are statistically unrelated. We have seen one example of such a noise
process, weak white noise; this definition, as well as some others, is used in
defining random walks.

Deﬁnitions of a Random Walk

We now consider three speciﬁc deﬁnitions of a random walk. Consider a
stochastic process {Yt : t = 0, 1, 2, . . .}, and let Zt = Yt − Yt−1 , t = 1, 2, . . .
denote the increments of the process.
Suppose that Z1 , Z2 , . . . are independent and identically distributed (i.i.d.)
random variables each with mean μ and standard deviation σ and suppose that
Y0 is independent of Z1 , Z2 , . . . . This is the strongest version of the random
walk model, known as Random Walk 1 (RW1). Thus, in RW1, the changes in
the position of the process, Yt − Yt−1 , are i.i.d. random variables.
Note that, using (3.4), under RW1,

E(Yt |Y0 ) = E(Y0 + Z1 + · · · + Zt |Y0 )

= E(Y0 |Y0 ) + E(Z1 |Y0 ) + · · · + E(Zt |Y0 )
= Y0 + E(Z1 ) + · · · + E(Zt )
= Y0 + μt, t = 1, 2, . . . .


A similar property holds for the conditional variance of Yt given Y0 ,
defined as

\[ \mathrm{Var}(Y_t \mid Y_0) = E\{[Y_t - E(Y_t \mid Y_0)]^2 \mid Y_0\} = E(Y_t^2 \mid Y_0) - E(Y_t \mid Y_0)^2. \]

Note that

\[
\begin{aligned}
Y_t - E(Y_t \mid Y_0) &= (Y_0 + Z_1 + \cdots + Z_t) - (Y_0 + \mu + \cdots + \mu) \\
&= (Z_1 - \mu) + (Z_2 - \mu) + \cdots + (Z_t - \mu).
\end{aligned}
\]

Using this expression, together with the assumption that Y0 and
(Z1 , Z2 , . . .) are independent, it follows that

\[
\begin{aligned}
E\{[Y_t - E(Y_t \mid Y_0)]^2 \mid Y_0\} &= E\{[(Z_1 - \mu) + (Z_2 - \mu) + \cdots + (Z_t - \mu)]^2 \mid Y_0\} \\
&= E\{[(Z_1 - \mu) + (Z_2 - \mu) + \cdots + (Z_t - \mu)]^2\} \\
&= \mathrm{Var}(Z_1) + \mathrm{Var}(Z_2) + \cdots + \mathrm{Var}(Z_t) \\
&= \sigma^2 t.
\end{aligned}
\]

Here, μ and σ are called the drift and volatility, respectively, of the process.
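
The formulas E(Yt |Y0 ) = Y0 + μt and Var(Yt |Y0 ) = σ^2 t can be checked by simulating many RW1 paths from a common starting point. In the following R sketch, the normal distribution for the increments, the parameter values, and the number of paths are assumptions made for illustration; RW1 itself requires only that the increments be i.i.d.

```r
set.seed(123)
n.paths <- 20000
t.max <- 50
mu <- 0.1                         # drift
sigma <- 1                        # volatility
y0 <- 2                           # common starting point Y_0

# One row of increments Z_1, ..., Z_t.max per simulated path.
z <- matrix(rnorm(n.paths * t.max, mean = mu, sd = sigma), nrow = n.paths)
y.end <- y0 + rowSums(z)          # Y_t at t = t.max for each path

print(c(simulated = mean(y.end), theory = y0 + mu * t.max))
print(c(simulated = var(y.end), theory = sigma^2 * t.max))
```

The simulated mean and variance differ from the theoretical values only by Monte Carlo error, which shrinks as n.paths grows.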
Because Yt = Y0 + Z1 + · · · + Zt , it follows that Yt and Zt+1 are indepen-
dent; using the fact that Yt+1 = Yt + Zt+1 ,

E(Yt+1 |Yt ) = E(Yt |Yt ) + E(Zt+1 |Yt ) = Yt + E(Zt+1 ) = Yt + μ.

That is, the best predictor of the position of the random walk at time t + 1,
given knowledge of the position at time t, is Yt + μ.
In fact, the same basic argument can be used to show that

E(Yt+1 |Y0 , Y1 , . . . , Yt ) = E(Yt + Zt+1 |Y0 , Y1 , . . . , Yt ) = Yt + μ;

that is, the best predictor of Yt+1 , given previous knowledge of all past values
of the process, is Yt + μ.
Weaker forms of the random walk model are based on weaker assumptions
regarding the distribution of the increments. For instance, Random Walk
2 (RW2) assumes that the increments Z1 , Z2 , . . ., together with the initial
value Y0 , are independent random variables, as in RW1; Z1 , Z2 , . . . are each
assumed to have mean μ and standard deviation σ, but they are not necessarily
identically distributed.
Under this model, the result

E(Yt+1 |Y0 , Y1 , . . . , Yt ) = Yt + μ

still holds. This generalization is useful because the assumption of identical
distributions of the increments is difficult to verify in practice.
The weakest form of random walk that is commonly used is a weaker
version of RW2 in which the independence assumption for Z1 , Z2 , . . . is replaced
by the assumption that the increments are uncorrelated,

Cov(Zt , Zs ) = 0 for any s ≠ t,

and they are uncorrelated with Y0 ,

Cov(Zt , Y0 ) = 0, t = 1, 2, . . . ;

this is known as Random Walk 3 (RW3). That is, in RW3, the increment
process {Zt : t = 1, 2, . . .} is a weak white noise process.
In this case, we can no longer say that the best predictor of Yt+1 based
on Y0 , Y1 , . . . , Yt is Yt + μ. However, the best linear predictor of Yt+1 based on
Y0 , Y1 , . . . , Yt is Yt + μ.
The best linear predictor of Yt+1 based on Y0 , Y1 , . . . , Yt is deﬁned as the
function of the form
a + b0 Y0 + b1 Y1 + · · · + bt Yt

that minimizes
E{(Yt+1 − a − b0 Y0 − · · · − bt Yt )2 }. (3.5)

Proposition 3.3. Consider a stochastic process {Yt : t = 0, 1, 2, . . .} and let
Zt = Yt − Yt−1 , t = 1, 2, . . . denote the increments of the process. If {Zt : t =
1, 2, . . .} is weak white noise and Cov(Y0 , Zt ) = 0 for t = 1, 2, . . ., then the best
linear predictor of Yt+1 based on Y0 , Y1 , . . . , Yt is Yt + μ.

Proof. Consider a linear function of Y0 , Y1 , . . . , Yt of the form a + b0 Y0 +
b1 Y1 + · · · + bt Yt for some constants a, b0 , b1 , . . . , bt .
According to (3.4), each of Y1 , Y2 , . . . , Yt is a linear function of
Y0 , Z1 , . . . , Zt ; hence, any function of the form

a + b0 Y0 + b1 Y1 + · · · + bt Yt = a + b0 Y0 + b1 Y1 + · · · + (bt − 1)Yt + Yt

may be written as

Yt + c + d0 Y0 + d1 Z1 + · · · + dt Zt (3.6)

for some constants c, d0 , d1 , . . . , dt .
Therefore, using the fact that Yt+1 − Yt = Zt+1 , the best linear predictor
of Yt+1 is given by the constants c, d0 , d1 , . . . , dt that minimize

E{(Zt+1 − c − d0 Y0 − d1 Z1 − · · · − dt Zt )^2 }, (3.7)

which is simply (3.5) written in terms of Y0 , Z1 , . . . , Zt .


Recall that, for any random variable X, E(X^2 ) = E(X)^2 + Var(X). Note
that Zt+1 − c − d0 Y0 − d1 Z1 − · · · − dt Zt has the expected value

\[ \Bigl(1 - \sum_{j=1}^{t} d_j\Bigr)\mu - c - d_0 E(Y_0) \tag{3.8} \]

and variance

\[ \Bigl(1 + \sum_{j=1}^{t} d_j^2\Bigr)\sigma^2 + d_0^2\, \mathrm{Var}(Y_0), \]

where we have used the fact that any pair of Y0 , Z1 , . . . , Zt , Zt+1 is
uncorrelated. Clearly, the variance is minimized by d0 = d1 = · · · = dt = 0. Because,
for these choices of d0 , d1 , . . . , dt , the expected value in (3.8) is 0 for c = μ,
it follows that the best linear predictor of Yt+1 is an expression of the form
(3.6) with d0 = d1 = · · · = dt = 0 and c = μ, yielding the expression Yt + μ, as
claimed in the proposition.
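
Proposition 3.3 can be illustrated by simulation. The R sketch below constructs increments of the form Zt = μ + σ Wt Wt−1 , where W0 , W1 , . . . are independent standard normal random variables; such increments are uncorrelated with common mean and variance but are not independent. This construction, the parameter values, and the choice of regressing Y6 on Y4 and Y5 are assumptions made for illustration. According to the proposition, the least-squares coefficients should be close to μ, 0, and 1.

```r
set.seed(2024)
n.paths <- 50000
mu <- 0.1
sigma <- 1

# W_0, ..., W_6 in columns 1, ..., 7; Z_t = mu + sigma * W_t * W_{t-1}, t = 1, ..., 6.
w <- matrix(rnorm(n.paths * 7), nrow = n.paths)
z <- mu + sigma * w[, 2:7] * w[, 1:6]

y0 <- rnorm(n.paths)                   # starting values, independent of the increments
y4 <- y0 + rowSums(z[, 1:4])
y5 <- y4 + z[, 5]
y6 <- y5 + z[, 6]

print(cor(z[, 5], z[, 6]))             # near 0: adjacent increments are uncorrelated
print(cor(z[, 5]^2, z[, 6]^2))         # positive: the increments are not independent

fit <- lm(y6 ~ y4 + y5)                # linear predictor of Y_6 based on Y_4 and Y_5
print(coef(fit))                       # approximately (mu, 0, 1)
```

The coefficient on Y4 is close to zero and the coefficient on Y5 is close to one, so the fitted linear predictor is approximately Y5 + μ.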

Note that the random walk models are related: RW1 implies RW2, which
implies RW3. Thus, if RW3 does not hold, then neither RW2 nor RW1 holds,
and if RW2 does not hold, then RW1 does not hold.

Geometric Random Walk

The same ideas can be applied after a log-transformation of the random vari-
ables; such a transformation is useful if we believe that the “random changes”
in the process are multiplicative rather than additive. In this case, the original
untransformed process is said to follow a geometric random walk model.
Let {Ut : t = 0, 1, 2, . . .} denote a stochastic process such that Yt = log Ut ,
t = 0, 1, 2, . . . follows a random walk model. For instance, if {Yt : t =
0, 1, 2, . . .} follows RW1, then log Ut = Y0 + Z1 + · · · + Zt , where Z1 , Z2 , . . . are
i.i.d. random variables. In this case, we say that {Ut : t = 0, 1, 2, . . .} follows
a geometric RW1 model. Note that, under this model,

Ut = exp(Y0 ) exp(Z1 + · · · + Zt )
≡ U0 exp(Z1 ) · · · exp(Zt )

where U0 = exp(Y0 ). These ideas may also be applied to RW2 and RW3.
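
As a minimal sketch of a geometric RW1 model, the following R code builds a random walk for log Ut and exponentiates it. The starting value, the number of periods, and the normal distribution of the increments are all assumptions chosen for illustration; the model itself requires only that Z1 , Z2 , . . . be i.i.d.

```r
# Simulate a geometric RW1 process (illustrative parameter values)
set.seed(42)
n_steps <- 60                                  # number of periods (arbitrary)
Z  <- rnorm(n_steps, mean = 0.01, sd = 0.05)   # i.i.d. increments; normality is assumed
Y0 <- log(100)                                 # log of an assumed starting value U_0 = 100
Y  <- Y0 + c(0, cumsum(Z))                     # Y_t = Y_0 + Z_1 + ... + Z_t
U  <- exp(Y)                                   # U_0, U_1, ..., U_T

# Equivalent multiplicative form: U_t = U_0 exp(Z_1) ... exp(Z_t)
U_mult <- exp(Y0) * c(1, cumprod(exp(Z)))
all.equal(U, U_mult)
```

The two constructions agree up to floating-point error, which is the identity displayed above.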

Application of Random Walk Models to Asset Prices

The martingale model for asset prices suggests that the stochastic process
corresponding to the prices of an asset, {Pt : t = 0, 1, 2, . . .}, might be usefully
modeled as a random walk. Under this model, the increments of the process
Pt − Pt−1 , corresponding to changes in the price of the asset, are “noise”;
for instance, if {Pt : t = 0, 1, 2, . . .} follows RW3, then the price changes form
a weak white noise process.
However, empirical analyses suggest that changes in prices are often
roughly proportional to the price, in the sense that stocks with higher prices
tend to exhibit larger price changes than stocks with lower prices, generally
speaking.
This behavior may be modeled by assuming that

$$P_t - P_{t-1} = W_t P_{t-1}, \quad t = 1, 2, \ldots,$$

where Wt is a random variable representing the proportional change in price.
Under this assumption, the conditional expectation of Xt = Pt − Pt−1 given
Pt−1 depends on Pt−1 in general. On the other hand, with pt = log Pt ,

$$p_t - p_{t-1} = \log\frac{P_t}{P_{t-1}} = \log(W_t + 1)$$

so that, letting Zt = log(Wt + 1), pt − pt−1 = Zt , t = 1, 2, . . . . Thus, if price
changes are proportional to the price, we might expect log-prices to follow a
random walk so that {Pt : t = 0, 1, 2, . . .} follows a geometric random walk.
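
The relationship between proportional price changes and log-price increments can be illustrated with a short R sketch. The initial price and the uniform distribution used for Wt are hypothetical choices made only so that the code runs; any distribution with Wt > −1 would do.

```r
# Proportional price changes imply additive changes in log-prices
set.seed(7)
n_steps <- 24
W <- runif(n_steps, min = -0.05, max = 0.05)   # hypothetical proportional changes W_t
P <- numeric(n_steps + 1)
P[1] <- 50                                     # hypothetical initial price P_0
for (i in seq_len(n_steps)) {
  P[i + 1] <- P[i] + W[i] * P[i]               # P_t - P_{t-1} = W_t P_{t-1}
}

p <- log(P)            # log-prices
Z <- log(1 + W)        # Z_t = log(W_t + 1)
all.equal(diff(p), Z)  # increments of the log-prices equal Z_t
```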
Note that the increments of the process {pt : t = 0, 1, 2, . . .} are simply
the log-returns. However, a basic argument showing that market efficiency
implies that log-returns are uncorrelated, along the lines of the one we used
in Proposition 3.2 for returns, is not available.
To see why such an argument fails, consider the framework of Section 3.3,
in which Pt = E(V |Ωt ). Then
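$$p_t = \log E(V \mid \Omega_t)$$

so that

$$r_t = p_t - p_{t-1} = \log\frac{E(V \mid \Omega_t)}{E(V \mid \Omega_{t-1})}.$$

It follows that

$$E(r_{t+1} r_t) = E\left\{\log\frac{E(V \mid \Omega_{t+1})}{E(V \mid \Omega_t)}\,\log\frac{E(V \mid \Omega_t)}{E(V \mid \Omega_{t-1})}\right\}.$$

Because of the presence of the log(·) function, this expression cannot be
usefully simplified.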
However, suppose that the returns Rt = (Pt − Pt−1 )/Pt−1 are small. Then

$$r_t = p_t - p_{t-1} = \log\frac{P_t}{P_{t-1}} = \log\left(1 + \frac{P_t - P_{t-1}}{P_{t-1}}\right) \doteq \frac{P_t - P_{t-1}}{P_{t-1}} = R_t.$$

For example, if Rt = 0.02, a fairly large value for a monthly return, then
Pt /Pt−1 = 1.02 and rt = log(1.02) = 0.01980 and approximating rt by Rt
yields an error of 0.0002, or about 1%.
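
The quality of the approximation can be tabulated for several return sizes. In the R sketch below, the values in R are illustrative; only Rt = 0.02 comes from the example above.

```r
# Error from approximating the log-return r by the return R
R <- c(0.001, 0.01, 0.02, 0.05, 0.10)   # illustrative returns
r <- log(1 + R)                         # corresponding log-returns
abs_err <- R - r                        # absolute approximation error
rel_err <- abs_err / r                  # error relative to the log-return
data.frame(R = R, r = round(r, 5),
           abs_err = signif(abs_err, 3), rel_err = signif(rel_err, 3))
```

Both the absolute and the relative error grow with the size of the return, so the approximation is best suited to short holding periods.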

Thus, if two returns are uncorrelated, it is reasonable to expect that
the corresponding log-returns will be approximately uncorrelated. Therefore,
although the martingale model for log-prices does not follow directly from the
efficient market assumption, the argument used in Section 3.3 suggests that
it may be reasonable to model {pt : t = 0, 1, 2, . . .} as a random walk.
The random walk hypothesis in finance generally refers to a geometric
random walk for asset prices. Suppose {Pt : t = 0, 1, 2, . . .} follows a geometric
random walk model; then the stochastic process corresponding to the
log-prices, {pt : t = 0, 1, 2, . . .}, follows a random walk model.
Specifically, under RW1, r1 , r2 , . . . are i.i.d. random variables; under RW2,
r1 , r2 , . . . are independent, each with mean μ and standard deviation σ, but
they are not necessarily identically distributed; RW3 weakens the
independence condition of RW2 to the condition that r1 , r2 , . . . are uncorrelated
random variables. The key idea in the random walk hypothesis is that the
past values r1 , r2 , . . . , rt do not provide any useful information about rt+1 .
As noted in the introduction to this chapter, it is important to keep in mind
that the random walk hypothesis does not imply that the information provided
by r1 , r2 , . . . , rt is useless in understanding the properties of rt+1 . For instance,
under any of the random walk models we have considered, r1 , . . . , rt , rt+1 each
has mean μ; hence, the sample mean of r1 , . . . , rt is an unbiased estimator
of μ, which is the expected value of rt+1 . Thus, the past values do provide a
type of indirect information about the future.

3.5 Tests of the Random Walk Hypothesis

According to the random walk hypothesis, asset prices follow a geometric
random walk. Although market efficiency suggests that a geometric random walk
may be appropriate for modeling price data, for the analyst, the important
issue is whether or not observed prices behave like observations from a geo-
metric random walk. Hence, a large number of statistical tests of the random
walk model have been proposed.
It is worth noting that, although the term “random walk” refers to
the asset log-prices, tests of the random walk model are typically based on the
properties of the log-returns, which are the increments corresponding to the
log-prices. Thus, these tests are designed to detect statistical relationships in
the log-returns of an asset; such relationships would contradict the random
walk hypothesis.
In this section, we consider four simple tests that are useful in this context;
each test is designed to detect departures from some form of a random walk for
log-prices; the first two tests considered test RW3 and the remaining two test
RW1. Note that, because of the relationships among the three forms of the
random walk model, rejection of the hypothesis that RW3 holds for log-prices
implies rejection of RW2 and RW1 for log-prices.

A Test Based on the Sample Autocorrelation Function
Let p0 , p1 , p2 , . . . denote log-prices of a given asset and let r1 , r2 , . . . denote the
corresponding log-returns. Under RW3 for {pt : t = 0, 1, 2, . . .}, r1 , r2 , . . . are
uncorrelated random variables, each with mean μ and standard deviation σ.
Let ρ(·) denote the autocorrelation function of {rt : t = 1, 2, . . .}. Then,
under RW3, ρ(h) = 0 for all h = 1, 2, . . . . Therefore, a test of the RW3 version
of the random walk hypothesis may be based on the sample autocorrelation
function, as described in Section 2.5.
Suppose we observe T periods of return data, r1 , r2 , . . . , rT , and let ρ̂(h),
h = 1, 2, . . ., denote the sample autocorrelation function. Although a test of
the random walk hypothesis can be based on any one sample autocorrelation,
a better approach is to construct a test statistic based on several sample
autocorrelations, such as the first m sample autocorrelations, for some given
value of m. A test statistic of this type is given by

$$B = T(T + 2) \sum_{h=1}^{m} \frac{\hat{\rho}(h)^2}{T - h}.$$

Note that B tends to be large when the sample autocorrelations are far from 0.
Under the null hypothesis that RW3 holds for log-prices, so that ρ(1) = ρ(2) =
· · · = ρ(m) = 0, B has approximately a chi-squared distribution with m degrees
of freedom. This is known as the Box–Ljung test.
In order to carry out the test, m, the number of lags used to compute B,
must be selected. When the data consist of five years of monthly returns, a
relatively small value of m should be used; for example, m = 12 is a reasonable
choice. For longer series of daily returns, a larger value of m could be used.
Example 3.3 Consider the monthly log-returns for Wal-Mart stock, stored
in the variable wmt.m.logret. To compute the test statistic B and the
corresponding p-value, we may use the Box.test function.
> Box.test(wmt.m.logret, lag=12, type="L")

Box-Ljung test

data: wmt.m.logret
X-squared = 9.9011, df = 12, p-value = 0.6246
The argument lag in the Box.test function specifies m, the number of
lags to be used and the type="L" argument specifies that the Box–Ljung test
be used.
These results indicate that, for m = 12, B = 9.9011 and the p-value is
0.6246. Therefore, based on this test, there is no evidence of
autocorrelation in the log-returns (at least up to lag 12), which is
consistent with our informal conclusion in Section 2.5.
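
To see how B is assembled from the sample autocorrelations, the statistic can be computed directly from the formula and compared with the output of Box.test. Because the Wal-Mart data are not reproduced here, the sketch below uses simulated returns as a stand-in; the mean and standard deviation used in the simulation are arbitrary.

```r
# Box-Ljung statistic computed from its definition (simulated stand-in data)
set.seed(123)
r <- rnorm(60, mean = 0.01, sd = 0.05)   # stand-in for 60 monthly log-returns
n_obs <- length(r)
m <- 12                                  # number of lags

# acf() returns the lag-0 value first, so drop it to keep lags 1, ..., m
rho_hat <- acf(r, lag.max = m, plot = FALSE)$acf[-1]

B <- n_obs * (n_obs + 2) * sum(rho_hat^2 / (n_obs - 1:m))
p_value <- 1 - pchisq(B, df = m)         # upper tail of chi-squared with m d.f.

bt <- Box.test(r, lag = m, type = "Ljung-Box")
c(B = B, p_value = p_value)
c(B = unname(bt$statistic), p_value = bt$p.value)   # should match the line above
```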

Variance-Ratio Test
The variance-ratio test is based on the following observation. Suppose that
RW3 holds for the log-prices, so that r1 , r2 , . . . , rT each has mean μ and
standard deviation σ and rt , rs are uncorrelated for all t ≠ s. Then

$$E(r_t + r_{t-1}) = E(r_t) + E(r_{t-1}) = \mu + \mu = 2\mu$$

and

$$\mathrm{Var}(r_t + r_{t-1}) = \mathrm{Var}(r_t) + \mathrm{Var}(r_{t-1}) = \sigma^2 + \sigma^2 = 2\sigma^2.$$

More generally, rt + rt−1 + · · · + rt−q+1 has mean qμ and variance $q\sigma^2$.
Note that rt + rt−1 + · · · + rt−q+1 is simply the q-period log-return at time t.
Therefore, if RW3 holds for log-prices, then there is a simple relationship
between the variance of multiperiod log-returns and the variance of single-
period log-returns.
This fact can be used to test RW3 by comparing an estimate of the
variance of
rt + rt−1 + · · · + rt−q+1 , t = q, . . . , T
to an estimate of the variance of r1 , r2 , . . . , rT ; if RW3 holds, the ratio of these
estimates should be roughly q.
For a given value of q, let

$$S_q^2 = \frac{\sum_{t=q}^{T} (r_t + r_{t-1} + \cdots + r_{t-q+1} - q\bar{r})^2}{T - q},$$

which is essentially the sample variance of rt + rt−1 + · · · + rt−q+1 , t = q,
q + 1, . . . , T , with the divisor equal to the sample size minus one, except that
instead of subtracting the sample mean of these values we subtract $q\bar{r}$, where

$$\bar{r} = \frac{1}{T} \sum_{t=1}^{T} r_t.$$

Let $S^2$ denote the usual sample variance of r1 , r2 , . . . , rT . The
variance-ratio statistic is given by

$$V_q = \frac{T}{T - q + 1}\,\frac{1}{q}\,\frac{S_q^2}{S^2}.$$

If RW3 holds, we expect that

$$\frac{1}{q}\,\frac{S_q^2}{S^2} \doteq 1;$$

the factor T /(T − q + 1) is an adjustment term designed to improve the
accuracy of the normal approximation to the distribution of Vq in small samples.
Note that, like the Box–Ljung test, the variance-ratio test is a test of the
correlation structure of the log-returns.

Under the null hypothesis that RW3 holds for the log-prices, $\sqrt{T}(V_q - 1)$
is approximately normally distributed with mean 0 and variance given by
2(2q − 1)(q − 1)/(3q). Therefore, the standardized test statistic is

$$\bar{V}_q = \sqrt{T}\,(V_q - 1)\sqrt{\frac{3q}{2(2q - 1)(q - 1)}}$$

and the null hypothesis is rejected for large values of $|\bar{V}_q|$. The p-value of the
test is given by

$$P(|Z| > |\bar{V}_{q,0}|)$$

where Z has a standard normal distribution and $\bar{V}_{q,0}$ is the observed value of
$\bar{V}_q$; hence,

$$P(|Z| > |\bar{V}_{q,0}|) = 2\{1 - \Phi(|\bar{V}_{q,0}|)\}$$

where Φ denotes the standard normal distribution function.
Example 3.4 Consider the statistic V3 applied to the log-returns on
Wal-Mart stock. This statistic may be calculated using the following com-
mands:
> x<-wmt.m.logret - mean(wmt.m.logret)
> x3<-x[3:60] + x[2:59] + x[1:58]
> (60/58)*(1/3)*(sum(x3^2)/57)/var(x)
[1] 0.90135
Here, x is the vector of mean-corrected log-returns for Wal-Mart stock and x3
is the vector of three-month mean-corrected log-returns,

rt − r̄ + rt−1 − r̄ + rt−2 − r̄, t = 3, 4, . . . , T.

The observed value of the statistic V3 is 0.90135.

To compute a p-value, we use the fact that, under the null hypothesis, V3
has mean 1 and variance

$$\frac{2(2q - 1)(q - 1)}{3q}\,\frac{1}{T} = \frac{2(5)(2)}{9}\,\frac{1}{60} = \frac{1}{27}.$$

Therefore, the observed value of the standardized test statistic is given by

$$\bar{V}_{3,0} = \sqrt{27}\,(0.90135 - 1) = -0.51260;$$

this corresponds to a two-tailed p-value of 0.6082, calculated by

> 2*(1-pnorm(0.51260))
[1] 0.6082
Therefore, there is no evidence to reject the null hypothesis that Wal-Mart
log-prices follow RW3. Note that here the function pnorm is the standard
normal distribution function.
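
The steps of Example 3.4 can be collected into a single function that works for any q. The function name vr_test is hypothetical (it is not part of any package mentioned in the text), and the data are simulated as a stand-in for wmt.m.logret. The base R function embed is used to form the overlapping q-period sums.

```r
# Variance-ratio test for general q (vr_test is a hypothetical helper)
vr_test <- function(r, q) {
  n_obs <- length(r)
  x   <- r - mean(r)                     # mean-corrected log-returns
  xq  <- rowSums(embed(x, q))            # q-period sums for t = q, ..., T
  Sq2 <- sum(xq^2) / (n_obs - q)         # S_q^2
  Vq  <- (n_obs / (n_obs - q + 1)) * (1 / q) * Sq2 / var(r)
  null_var <- 2 * (2 * q - 1) * (q - 1) / (3 * q * n_obs)  # variance of V_q under RW3
  z <- (Vq - 1) / sqrt(null_var)         # standardized statistic
  list(Vq = Vq, z = z, p_value = 2 * (1 - pnorm(abs(z))))
}

set.seed(11)
r_sim <- rnorm(60, mean = 0.01, sd = 0.05)   # simulated stand-in for monthly log-returns
vr_test(r_sim, q = 3)
vr_test(r_sim, q = 6)
```

For q = 3 and T = 60, the value of null_var inside the function is 1/27, in agreement with the calculation above.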

A similar conclusion is obtained using V6 . The observed value of V6 is
0.610, corresponding to a p-value of 0.222.

Runs Test
Not all types of dependence are reflected in correlation. Another approach
to detecting relationships in a series of returns is to look at patterns of
above-average and below-average returns.
More formally, let r1 , r2 , . . . , rT denote a sequence of log-returns and
let med(r1 , r2 , . . . , rT ) denote the sample median of r1 , r2 , . . . , rT . For
t = 1, 2, . . . , T , define

$$G_t = \begin{cases} 1 & \text{if } r_t > \mathrm{med}(r_1, \ldots, r_T) \\ 0 & \text{if } r_t \le \mathrm{med}(r_1, \ldots, r_T). \end{cases}$$

Then G1 , G2 , . . . , GT is a sequence of indicator variables showing if the return
in a given period exceeds the median (Gt = 1) or not (Gt = 0).
Suppose that r1 , r2 , . . . rT are i.i.d. random variables; that is, suppose that
RW1 holds. Then G1 , G2 , . . . , GT should exhibit a random pattern of zeros
and ones. On the other hand, if r1 , r2 , . . . , rT have some type of dependence
structure, or if certain features of the distribution of rt depend on t, then there
may be patterns of zeros and ones; for instance, if rt , rt+1 are dependent, then
Gt+1 = 1 may be more likely if Gt = 1 than if Gt = 0.
Therefore, we can test the hypothesis that p0 , p1 , p2 , . . . , pT follow RW1 by
counting the number of “runs” in the sequence G1 , G2 , . . . , GT , where a run is
defined as a maximal sequence of consecutive identical symbols. For example, if the sequence of indicator
variables is 0 0 0 1 1 1 0 0 1 1, there are four runs, while if the sequence is
0 1 0 0 1 1 0 1 1 0, there are seven runs.
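
Counting runs is straightforward in R using the base function rle, which computes run-length encodings. The helper names count_runs and median_indicators below are hypothetical; the two sequences are the ones given in the text.

```r
# Count runs in a sequence of indicators (helper names are hypothetical)
count_runs <- function(g) length(rle(g)$lengths)

runs_a <- count_runs(c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1))
runs_b <- count_runs(c(0, 1, 0, 0, 1, 1, 0, 1, 1, 0))
c(runs_a, runs_b)

# Indicators G_t: 1 if the return exceeds the sample median, 0 otherwise
median_indicators <- function(r) as.integer(r > median(r))
```

Applying count_runs(median_indicators(r)) to a vector r of log-returns gives the observed number of runs used in the test.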
For convenience, assume that there are T /2 zeros and T /2 ones in the sequence
G1 , G2 , . . . , GT . This holds if T is even and r1 , r2 , . . . , rT are
distinct. In general, the number of zeros and the number of ones are both
approximately equal to T /2 with high probability and the results described
below continue to hold.
Let M0 denote the observed number of runs and let M denote the number
of runs in a random sequence of length T with T /2 ones and T /2 zeros; to
compute a p-value for the test, we can compare M0 to the distribution of M .
Although the exact distribution of M is complicated, it may be shown that M
is approximately distributed as a binomial random variable with parameters
T and 1/2. To see why this might hold, consider building a sequence of length
T by adding randomly selected ones and zeros, one step at a time; at each
stage, there is a 50% chance of increasing the number of runs by one. This
fact may be used to calculate a p-value for the test.

Example 3.5 Consider the calculation of M0 and the corresponding p-value
for log-returns on Wal-Mart stock, stored in the variable wmt.m.logret.

These calculations can be performed using the function runs.test, which
is available in the randtests package (Caeiro and Mateus 2014).
> library(randtests)
Warning message:
package randtests was built under R version 3.2.3
> runs.test(wmt.m.logret)

Runs Test

data: wmt.m.logret
statistic = 1.0417, runs = 35, n1 = 30, n2 = 30, n = 60, p-value =
0.2976
alternative hypothesis: nonrandomness
Therefore, the p-value of the test is 0.2976 so that there is no evidence to
reject the null hypothesis that RW1 holds for the log-prices.
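
The reported statistic and p-value are consistent with the standard normal approximation for the number of runs, in which, with n1 ones and n2 zeros and n = n1 + n2, the number of runs has mean 1 + 2 n1 n2 /n and variance 2 n1 n2 (2 n1 n2 − n)/(n²(n − 1)). That runs.test uses exactly this approximation is an assumption here, although the sketch below reproduces the printed values from runs = 35, n1 = 30, and n2 = 30. The function name runs_z is hypothetical. For comparison, a p-value based on the binomial approximation described earlier is also computed; doubling the smaller tail probability is one possible convention for a two-sided test, not one specified in the text.

```r
# Normal approximation for the runs test (runs_z is a hypothetical helper)
runs_z <- function(runs, n1, n2) {
  n <- n1 + n2
  mean_runs <- 1 + 2 * n1 * n2 / n
  var_runs  <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1))
  z <- (runs - mean_runs) / sqrt(var_runs)
  c(statistic = z, p_value = 2 * (1 - pnorm(abs(z))))
}

res <- runs_z(runs = 35, n1 = 30, n2 = 30)
round(res, 4)

# Binomial(T, 1/2) approximation from the text, two-sided by doubling the smaller tail
p_binom <- min(1, 2 * min(pbinom(35, 60, 0.5), 1 - pbinom(34, 60, 0.5)))
p_binom
```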

Rescaled Range Test

The Box–Ljung, variance-ratio, and runs tests are useful for detecting associ-
ation among log-returns from nearby time periods; however, another way in
which the random walk hypothesis may fail is if the log-returns are related
over a long period of time. For instance, there may be multiyear periods dur-
ing which the monthly log-returns are generally (but not always) large. The
rescaled range test is designed to detect this type of long-range dependence.
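The test statistic is given by

$$H = \frac{\max_{1 \le k \le T} \sum_{t=1}^{k} (r_t - \bar{r}) - \min_{1 \le l \le T} \sum_{t=1}^{l} (r_t - \bar{r})}{S\sqrt{T}}$$

where S is the sample standard deviation of r1 , r2 , . . . , rT ; that is, H is the
range of the partial sums

$$\sum_{t=1}^{k} (r_t - \bar{r}), \quad k = 1, 2, \ldots, T,$$

divided by $S\sqrt{T}$.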

Large values of H are evidence against the null hypothesis that RW1 holds
for the log-prices.
A large value of H indicates that there are times t0 , t1 such that

$$\sum_{t=1}^{t_1} (r_t - \bar{r})$$

is a large positive value and

$$\sum_{t=1}^{t_0} (r_t - \bar{r})$$

is a large negative value; note that the values rt − r̄, t = 1, 2, . . . , T must sum
to 0. That is, there is a time period over which the log-returns differ greatly
from their sample mean.
To determine if the observed value of H is statistically significant, we
compare it to the critical values in Table 3.1.

TABLE 3.1
Critical Values for the Rescaled Range Test
Significance Level    Critical Value
0.10                  1.620
0.05                  1.747
0.025                 1.862
0.005                 2.098

Example 3.6 Consider calculation of H for the Wal-Mart log-returns in
wmt.m.logret. To calculate H in R, we can use the cumsum function, which
returns the cumulative sums of the values in a vector:

> x<-c(1, 3, -2, -4, 5)

> cumsum(x)
[1] 1 4 2 -2 3

Let

$$H_1 = \max_{1 \le k \le T} \sum_{j=1}^{k} (r_j - \bar{r})$$

and

$$H_2 = \min_{1 \le l \le T} \sum_{j=1}^{l} (r_j - \bar{r})$$

so that $H = (H_1 - H_2)/(S\sqrt{T})$.
For the Wal-Mart monthly log-returns, H1 and H2 may be calculated by

> H1<-max(cumsum(wmt.m.logret-mean(wmt.m.logret)))
> H1
[1] 0.085685
> H2<-min(cumsum(wmt.m.logret-mean(wmt.m.logret)))
> H2
[1] -0.19496

and H is given by

> H<-(H1 - H2)/(sd(wmt.m.logret)*(60^.5))

> H
[1] 0.83819

To bound the p-value for a test of RW1, we compare the observed value of
H to the critical values in Table 3.1. Because H = 0.83819 is smaller than
1.620, the critical value for significance level 0.10, the p-value is greater
than 0.10. Therefore, according to the rescaled range test, there is no evidence
to reject the hypothesis that the RW1 model holds for the log-prices of
Wal-Mart stock.
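
The calculation in Example 3.6 can be wrapped in a small function and compared with the critical values of Table 3.1 in one step. The function name rescaled_range is hypothetical, and the data are simulated as a stand-in for wmt.m.logret.

```r
# Rescaled range statistic H (rescaled_range is a hypothetical helper)
rescaled_range <- function(r) {
  dev_sums <- cumsum(r - mean(r))        # partial sums of deviations from the mean
  (max(dev_sums) - min(dev_sums)) / (sd(r) * sqrt(length(r)))
}

# Critical values copied from Table 3.1, named by significance level
crit <- c("0.10" = 1.620, "0.05" = 1.747, "0.025" = 1.862, "0.005" = 2.098)

set.seed(2024)
r_sim <- rnorm(60, mean = 0.01, sd = 0.05)   # simulated stand-in for monthly log-returns
H <- rescaled_range(r_sim)
H
H > crit   # TRUE at each level where the null hypothesis would be rejected
```

If every entry of H > crit is FALSE, the p-value exceeds 0.10, as in the Wal-Mart example.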

3.6 Do Stock Returns Follow the Random Walk Model?

Although assumptions regarding market efficiency suggest that asset log-returns
are approximately uncorrelated and, hence, some form of a geometric random
walk may be reasonable for asset prices, the issue is essentially an empirical one.
For an analyst, the important issue is whether or not observed prices behave
like observations from a random walk model. In the previous section, several
tests of the random walk model were presented. These tests show that, when
applied to log-returns on Wal-Mart stock, there is no evidence to reject the
hypothesis that the log-returns form a random sequence in the sense that the
log-returns in different time periods are unrelated statistically. In this section,
we extend those results by applying the tests to a wide range of stock returns.
Stocks for firms represented in the Standard & Poor’s (S&P) 100 index
were considered. The S&P 100 is a stock market index based on stocks from
100 large U.S. firms, spread across several different industries; stock market
indices will be discussed in detail in Chapter 8. For present purposes, we
may view the S&P 100 firms as a cross section of large U.S. companies. For
each stock, five years of monthly returns were analyzed for the period ending
December 31, 2014; of the 100 stocks represented in the S&P 100, four stocks
did not have five years of monthly returns available, leading to a set of 96
stocks for analysis.
For each of the 96 stocks, the four tests of the random walk model dis-
cussed in the previous section were performed; the Box–Ljung test uses m = 12
and the variance-ratio test is based on the statistic V3 . Figure 3.1 contains
the results of these tests in the form of histograms of the results for the 96
stocks. For the Box–Ljung test, the variance-ratio test, and the runs test, the
histograms contain the p-values of the tests; for the rescaled range test, the
histograms are based on values of the test statistics.
If the random walk model holds in general for stocks in the S&P 100,
p-values should be approximately distributed according to a uniform distri-
bution on the interval (0, 1); this result is a more general version of the result
that, if the random walk model holds for all stocks, then about 5% of the
stocks should have a p-value less than 0.05. Conversely, if the random walk
model does not hold for some stocks, we expect that there should be more
than 10 (roughly 10% of 96) p-values in the range from 0 to 0.10 (the first
interval on the histograms).
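
The behavior of the p-values under the null hypothesis can be illustrated by simulation. The sketch below generates 96 independent series of 60 i.i.d. normal returns, so that RW1 holds by construction, and applies the Box–Ljung test with m = 12 to each. The normal distribution and its parameters are assumptions made for illustration, and, because the chi-squared distribution of B is itself an approximation, the simulated p-values are only approximately uniform.

```r
# p-values of the Box-Ljung test when the random walk model holds (simulation)
set.seed(100)
n_stocks <- 96
n_months <- 60
p_vals <- replicate(n_stocks, {
  r <- rnorm(n_months, mean = 0.01, sd = 0.05)   # i.i.d. returns, so RW1 holds
  Box.test(r, lag = 12, type = "Ljung-Box")$p.value
})

sum(p_vals < 0.10)   # compare with the roughly 10 expected under uniformity

# Counts in the ten intervals (0, 0.1], ..., (0.9, 1], as in a histogram
counts <- hist(p_vals, breaks = seq(0, 1, by = 0.1), plot = FALSE)$counts
counts
```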
The histograms in Figure 3.1 suggest that neither the Box–Ljung test,
the runs test, nor the variance-ratio test contradicts the hypothesis that the
random walk model holds. In fact, the Box–Ljung test has a large number
of relatively large p-values, indicating that the autocorrelations of the stock
log-returns tend to be smaller than might be expected from a sequence of
uncorrelated random variables.

FIGURE 3.1
Results of tests of the random walk model for stocks in the S&P 100.
[Four histograms: Box–Ljung test, runs test, and variance ratio test with q = 3
(frequency against p-value); rescaled range test (frequency against test statistic).]

For the rescaled range test, we have a table of critical values for the test but
not a simple method for computing p-values; thus, the histogram presented
in Figure 3.1 is of the observed test statistics. In interpreting this histogram,
we may use the fact that the critical value for a test with level 10% is 1.620
and the critical value of a test with level 5% is 1.747; see Table 3.1. Therefore,
only 1 of the 96 stocks yields a significant result at the 10% level in the rescaled
range test.
According to these results, it is reasonable to conclude that, based on
the tests used, the properties of the log-returns of stocks for firms in the
S&P 100 are not inconsistent with the random walk model. However, these
results should not be considered to be strong evidence in favor of the random
walk hypothesis. To reach such a conclusion, we would need to consider the
power of the tests, the probability of rejecting the null hypothesis when it
does not hold. In the present context, the power is the probability of rejecting
the random walk model when the data, in fact, do not follow that model.
Therefore, evaluating the power of a test requires consideration of models for
a time series of log-returns that may be appropriate if the random walk does
not hold. If the power of a test is low against reasonable alternative models,
it may be the case that the random walk model does not hold, but the tests
used are unable to recognize this fact; this might be the case, for example, if
the sample sizes of the data used in the tests are relatively small.
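
The power of a test against a specific alternative can be estimated by simulation. As a hedged illustration, the sketch below takes a first-order autoregressive model, rt = φ rt−1 + et , as the alternative; this model, the values of φ, and the helper names simulate_ar1 and rejection_rate are all assumptions introduced here, not part of the analysis in the text. The estimated rejection rate of the Box–Ljung test with m = 12 and T = 60 is computed for φ = 0 (the null hypothesis), φ = 0.2, and φ = 0.5.

```r
# Estimated rejection rates of the Box-Ljung test (hypothetical AR(1) alternative)
simulate_ar1 <- function(n_obs, phi, sigma = 0.05) {
  e <- rnorm(n_obs, sd = sigma)
  r <- numeric(n_obs)
  r[1] <- e[1]                                   # simple (non-stationary) start
  for (i in 2:n_obs) r[i] <- phi * r[i - 1] + e[i]
  r
}

rejection_rate <- function(phi, n_obs = 60, m = 12, level = 0.05, n_rep = 500) {
  p_vals <- replicate(n_rep,
    Box.test(simulate_ar1(n_obs, phi), lag = m, type = "Ljung-Box")$p.value)
  mean(p_vals < level)                           # proportion of rejections
}

set.seed(2017)
rates <- sapply(c(0, 0.2, 0.5), rejection_rate)
names(rates) <- c("phi = 0", "phi = 0.2", "phi = 0.5")
rates
```

The rate for φ = 0 estimates the actual level of the test, while the other two estimate its power; a rate for φ = 0.2 that is not much larger than the level would indicate that modest autocorrelation is hard to detect with five years of monthly data.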
However, even if we cannot conclude with certainty that the random walk
hypothesis holds, the results of this section clearly show that there is not a
strong relationship between past returns on an asset and future returns; in
particular, the results suggest that it will be difficult to use past returns to
accurately predict future returns.
Hence, the remainder of the book focuses on methods that can be applied
when the returns on a given asset form a series of uncorrelated random vari-
ables. These methods use the fact that the returns on different assets in the
same time period are correlated in order to construct a combination of assets
yielding a large expected return and small risk. That is, rather than trying to
predict the future returns of different assets, we use the statistical properties
of a set of asset returns to guide investment decisions.

3.7 Suggestions for Further Reading

The random walk hypothesis is one of the most widely discussed topics in
financial statistics. The bestselling book, A Random Walk Down Wall Street,
by Princeton economist Burton G. Malkiel (Malkiel 2003), contains a detailed,
nontechnical, discussion of the random walk hypothesis and its implications;
in particular, this book discusses many of the criticisms of the random walk
hypothesis that have been raised and gives arguments against them. An
alternative view is provided by A Nonrandom Walk Down Wall Street, by
econometricians Andrew Lo and A. Craig MacKinlay (Lo and MacKinlay
2002), in which it is shown that there is an element of predictability in stock
prices; a key part of this argument is the variance-ratio test statistic, pre-
sented in Section 3.5. Although the details of the analyses presented in this
book are highly technical, the basic ideas are not; an accessible summary of
many of the issues raised by Lo and MacKinlay (2002) is given by Lo (1997).
Conditional expectation and its relationship to optimal prediction is dis-
cussed by Rice (2007, Section 4.4) and Ross (2006, Section 7.6). The discussion
of the relationship between efficient markets and the martingale model for
asset prices given in Section 3.3 is based on the work of Campbell et al. (1997,
Section 1.5); see also Fama (1965, 1970) and Samuelson (1965). A proof of
Proposition 3.1 on iterated conditional expectations is given by Severini (2005,
Chapter 2).
The three versions of a random walk process discussed in Section 3.4 are
based on the work of Campbell et al. (1997, Section 2.1); see
Fabozzi et al. (2006, Chapter 10) for further discussion. Statistical tests of
the random walk model, including the Box–Ljung test, the variance-ratio
test, and the rescaled-range test, are discussed in detail in Chapter 2 of
Campbell et al. (1997). The Box–Ljung statistic is discussed by Montgomery
et al. (2008, Section 2.6, where it is called the Ljung–Box statistic). A good
introduction to the runs test is presented by Newbold et al. (2013, Section
14.7); see Granger (1963) for a more detailed discussion, including its appli-
cation to economic data. Properties of the rescaled-range test are given by Lo
(1991); in particular, this paper discusses the derivation of the critical values
listed in Table 3.1.
Hypothesis tests and p-values are a central topic in statistics. See Newbold
et al. (2013, Chapters 9, 10) or Wooldridge (2013, Appendix C) for an
introduction and Hogg and Tanis (2006, Chapters 8 and 9) for a more advanced
discussion, including the important issue of the power of a test.

3.8 Exercises
1. Let X and Y denote random variables, each with mean 0, such that

E(Y |X) = X + a

for some constant a. Find a.

2. Let X and Y denote random variables and let

Ŷ = E(Y |X).

Show that
Cov(h(X), Ŷ ) = Cov(h(X), Y )
for any function h.
Does it follow that the correlation of h(X) and Ŷ is equal to the
correlation of h(X) and Y ? Why or why not?
3. Let Y and X denote random variables such that

$$Y = \beta_0 + \beta_1 X + \epsilon$$

where β0 and β1 are constants and ε is a random variable with
zero mean and standard deviation $\sigma_\epsilon$; assume that X and ε are
independent. Find the function of X that is the best predictor of
$Y^2$ and compare it to the square of the best predictor of Y .

4. Let Ω1 ⊂ Ω2 ⊂ · · · denote information sets and let P1 , P2 , . . . denote
a corresponding sequence of asset prices such that Pt ∈ Ωt . Suppose
that

$$E(P_{t+h} \mid \Omega_t) = (1 + \eta)^h P_t, \quad h = 0, 1, \ldots$$

for some constant η > −1.
a. Let Rt+1 = Pt+1 /Pt − 1 denote the return in period t + 1. Find
E(Rt+1 ).
b. For t ≠ s, find Cov(Rt , Rs ).
5. Let V denote the intrinsic value of a share of stock in a given firm
and let Ωt denote the set of financial information available at time
t, t = 1, 2, . . . . Let Pt denote the price of one share of stock in the
firm. However, instead of Pt = E(V |Ωt ), we have

Pt = E(V |Ωt ) + Zt

where Zt is a random variable such that E(Zt |Ωt ) = 0. That is, the
price Pt is approximately, but not exactly, equal to E(V |Ωt ).
Does the martingale property

E(Pt+h |Ωt ) = Pt , h = 1, 2, . . .

hold in this case? Why or why not?

6. Suppose that {Yt : t = 0, 1, . . .} follows RW3 and, for some given
integer m > 1, define a sequence of random variables X0 , X1 , X2 , . . . by
X0 = Y0 ,
X1 = Ym , X2 = Y2m , X3 = Y3m ,
and so on, so that Xt = Ytm .
Does {Xt : t = 0, 1, 2, . . .} follow RW3? Why or why not?
7. Suppose {Yt : t = 0, 1, . . .} follows RW3 and, for each t = 1, 2, . . .,
define Xt = aYt + b for constants a, b.
Does {Xt : t = 0, 1, . . .} follow RW3? Why or why not?
8. Let {Yt : t = 0, 1, . . .} and {Xt : t = 0, 1, . . .} denote stochastic pro-
cesses that both follow RW1. Assuming that {Yt : t = 1, 2, . . .} and
{Xt : t = 1, 2, . . .} are independent, does {Xt + Yt : t = 0, 1, . . .}
follow RW1? Why or why not?
9. Let {Yt : t = 0, 1, . . .} denote a random walk process. For a given
positive integer h, define

Xt = Yt+h − Yh , t = 0, 1, . . . .

Is {Xt : t = 0, 1, . . .} a random walk process? Why or why not?

Consider each of the three definitions of a random walk, RW1, RW2,
and RW3.

10. Let {Yt : t = 1, 2, . . .} denote a weakly stationary process; let $\sigma^2$
denote the variance of Yt and let ρ(·) denote the autocorrelation
function of the process.
For a given positive integer q, find an expression for

$$\frac{\mathrm{Var}(Y_1 + \cdots + Y_q)}{\mathrm{Var}(Y_1)}$$

as a function of ρ(·).
Interpret this result in terms of the properties of the
variance-ratio test.
11. Consider stock in Best Buy Company, Inc. (symbol BBY).
a. Using the R function get.hist.quote, download the adjusted
prices needed to calculate five years of monthly returns for the
period ending December 31, 2015, and compute the log-prices.
b. Compute five years of monthly log-returns corresponding to the
log-prices obtained in Part (a).
c. Using the function summary, calculate the summary statistics for
the returns computed in Part (b).
d. Using the function acf, calculate the sample autocorrelation
function for the log-returns computed in Part (b); use a
maximum lag of 12.
12. For the data calculated in Exercise 11, use the Box–Ljung test to
test the hypothesis that

H0 :ρ(1) = ρ(2) = · · · = ρ(12) = 0

where ρ(·) denotes the autocorrelation of the stochastic process
corresponding to the log-returns on Best Buy stock; see Example
3.3.
Is this result consistent with the hypothesis that the log-prices
of Best Buy stock follow RW3? Why or why not?
13. Use the variance ratio test to test the hypothesis that the log-
prices of Best Buy stock, calculated in Exercise 11, follow RW3;
see Example 3.4.
a. Using the log-returns on Best Buy stock, compute the variance
ratio test statistic V6 .
b. Find the mean and variance of V6 under the null hypothesis
that the stochastic process corresponding to the log-prices of
Best Buy stock follows RW3.
c. Determine the p-value of the test.
d. Based on these results, are the observed log-prices of Best Buy
stock consistent with the RW3 model? Why or why not?

14. For the log-returns on Best Buy stock calculated in Exercise 11,
calculate the p-value of the runs test; see Example 3.5.
Based on your results, what do you conclude regarding the
random walk hypothesis as applied to the log-prices of Best Buy
stock?
15. For the log-returns on Best Buy stock calculated in Exercise 11,
calculate the rescaled-range statistic H; see Example 3.6.
Based on this result, what do you conclude regarding the random
walk hypothesis as applied to the log-prices of Best Buy stock?

4
Portfolios

4.1 Introduction
Suppose we have a given amount of capital to invest and a number of possible
assets in which to invest. How should we place our investment in the various
assets? This is known as the portfolio selection problem and the mathematical
and statistical methods developed to solve it are known as portfolio theory.
If it were possible to accurately forecast the future returns of the assets we
could simply invest in the asset or assets with the largest predicted returns
over the future time period of interest. However, empirical evidence, such
as that presented in the previous chapter, suggests that such forecasting is
difficult at best.
Hence, in portfolio theory, we attempt to choose the combination of assets
that yields a portfolio return with desirable statistical properties; specifically,
we seek a large expected return, while minimizing the “risk” of the portfolio,
defined here as the standard deviation of the return. Thus, in an ideal case, the
return on the portfolio will have a large expected value and a small standard
deviation so that the portfolio realizes a large return with high probability.
However, complicating the situation is the fact that the two goals are typically
in conflict: Riskier assets generally have a higher expected return, as a reward
for assuming the risk.

4.2 Basic Concepts

Suppose there are N assets in the market and let R1 , R2 , . . . , RN denote ran-
dom variables representing their respective returns in a given time period.
Consider a portfolio in which we place a proportion wj of our total invest-
ment in asset j, j = 1, 2, . . . , N . The quantities w1 , w2 , . . . , wN are known as
the portfolio weights. Because wj represents the proportion invested in asset
j, we must have

$$w_1 + w_2 + \cdots + w_N = \sum_{j=1}^{N} w_j = 1.$$
The return on the portfolio, $R_p$, can be expressed in terms of the returns
on the individual assets and the portfolio weights. For each $j = 1, 2, \ldots, N$,
let $P_{j,0}$ and $P_{j,1}$ denote the prices of asset $j$ in periods 0 and 1, respectively,
so that

$$R_j = \frac{P_{j,1}}{P_{j,0}} - 1.$$

Suppose we wish to invest capital $C$ in the $N$ assets, according to the weights
$w_1, w_2, \ldots, w_N$. Then, in period 0, we buy

$$\frac{w_j C}{P_{j,0}}$$

shares of asset $j$, $j = 1, 2, \ldots, N$.
In period 1, the shares in asset $j$ are worth

$$\frac{w_j C}{P_{j,0}} P_{j,1}.$$

Thus, the total worth of the portfolio in period 1 is

$$\sum_{j=1}^{N} \frac{w_j C}{P_{j,0}} P_{j,1}$$

and the return on the portfolio is

$$\frac{\sum_{j=1}^{N} \frac{w_j C}{P_{j,0}} P_{j,1}}{C} - 1.$$

Writing $P_{j,1}/P_{j,0}$ as $R_j + 1$, the return may be written

$$\sum_{j=1}^{N} w_j (R_j + 1) - 1 = \sum_{j=1}^{N} w_j R_j.$$

That is, the return on the portfolio is a linear function of the individual asset
returns, with the coefficients given by the portfolio weights:

$$R_p = w_1 R_1 + w_2 R_2 + \cdots + w_N R_N = \sum_{j=1}^{N} w_j R_j.$$

The goal in the statistical approach to portfolio theory is to select the
portfolio weights so that $E(R_p)$ is large and the standard deviation of $R_p$, or
equivalently $\mathrm{Var}(R_p)$, is small.
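The derivation of the portfolio return can be checked numerically. The following Python sketch is an illustration only: the prices, weights, and capital are hypothetical numbers, and the function names are not from the text. It computes the return once by tracking shares and wealth, and once as the weighted sum of the asset returns.

```python
def asset_returns(p0, p1):
    """Simple returns R_j = P_{j,1} / P_{j,0} - 1 for each asset."""
    return [end / start - 1.0 for start, end in zip(p0, p1)]


def portfolio_return_from_wealth(weights, p0, p1, capital):
    """Return obtained by buying w_j * C / P_{j,0} shares and valuing them in period 1."""
    shares = [w * capital / start for w, start in zip(weights, p0)]
    worth = sum(n * end for n, end in zip(shares, p1))
    return worth / capital - 1.0


def portfolio_return(weights, returns):
    """Return obtained as the weighted sum of the individual asset returns."""
    return sum(w * r for w, r in zip(weights, returns))


# Hypothetical period-0 and period-1 prices for three assets (illustrative only).
p0 = [10.0, 15.0, 40.0]
p1 = [12.0, 21.0, 38.0]
weights = [0.2, 0.3, 0.5]  # must sum to 1

via_wealth = portfolio_return_from_wealth(weights, p0, p1, capital=1000.0)
via_weights = portfolio_return(weights, asset_returns(p0, p1))
print(via_wealth, via_weights)  # the two values agree up to rounding error
```

The amount of capital cancels in the wealth-based calculation, which is why the weighted-sum form does not depend on it.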
In this chapter, a number of basic concepts and results are presented,
focusing on the case in which N = 2; the general case will be considered in
the following chapter.
Thus, consider two assets, with returns $R_1$ and $R_2$. For $j = 1, 2$, let

$$\mu_j = E(R_j) \quad \text{and} \quad \sigma_j^2 = \mathrm{Var}(R_j)$$

and let $\rho_{12}$ denote the correlation of $R_1$ and $R_2$. Let $w_1$ and $w_2$ denote the
portfolio weights; because $w_1 + w_2 = 1$, we can take $w_1 = w$ and $w_2 = 1 - w$
for some real number $w$.
The portfolio placing weight $w$ on asset 1 and weight $1 - w$ on asset 2 has
return

$$R_p = wR_1 + (1 - w)R_2.$$

Then, using results on the mean and variance of a sum of random variables,

$$\mu_p \equiv \mu_p(w) = E(R_p) = wE(R_1) + (1 - w)E(R_2) = w\mu_1 + (1 - w)\mu_2$$

and

$$\sigma_p^2 \equiv \sigma_p^2(w) = \mathrm{Var}(R_p) = w^2\sigma_1^2 + (1 - w)^2\sigma_2^2 + 2w(1 - w)\rho_{12}\sigma_1\sigma_2.$$

The portfolio problem for the case of two assets is concerned with choosing $w$
so that $\mu_p(w)$ is large and $\sigma_p(w)$ is small.

Example 4.1 Consider two assets. Suppose that $\mu_1 = 0.10$, $\mu_2 = 0.20$,
$\sigma_1 = 0.20$, $\sigma_2 = 0.25$, and $\rho_{12} = 0.5$. Then the portfolio that places weight $w$ on
asset 1 and weight $1 - w$ on asset 2 has mean return

$$\mu_p(w) = w(0.10) + (1 - w)(0.20) = 0.20 - 0.10w$$

and return variance

$$\sigma_p^2(w) = w^2(0.20)^2 + (1 - w)^2(0.25)^2 + 2w(1 - w)(0.50)(0.20)(0.25) = 0.0625 - 0.075w + 0.0525w^2.$$

For instance, the portfolio placing half its weight on asset 1 and half its weight
on asset 2 has expected return

$$\mu_p(0.5) = 0.20 - 0.10(0.5) = 0.15$$

and return standard deviation

$$\sigma_p(0.5) = \left(0.0625 - 0.075(0.5) + 0.0525(0.5)^2\right)^{1/2} \doteq 0.195.$$
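The calculation in Example 4.1 is easy to reproduce in code. The sketch below assumes nothing beyond the mean and variance formulas for a two-asset portfolio given above; the function name `two_asset_moments` is a label chosen here for illustration.

```python
import math


def two_asset_moments(w, mu1, mu2, sigma1, sigma2, rho12):
    """Mean and standard deviation of the return w*R1 + (1 - w)*R2."""
    mean = w * mu1 + (1.0 - w) * mu2
    variance = (w ** 2 * sigma1 ** 2
                + (1.0 - w) ** 2 * sigma2 ** 2
                + 2.0 * w * (1.0 - w) * rho12 * sigma1 * sigma2)
    return mean, math.sqrt(variance)


# Parameter values from Example 4.1, with half the investment in each asset.
mean_half, sd_half = two_asset_moments(0.5, mu1=0.10, mu2=0.20,
                                       sigma1=0.20, sigma2=0.25, rho12=0.5)
print(round(mean_half, 3), round(sd_half, 3))
```

Setting `w` to 0 or 1 recovers the mean and standard deviation of asset 2 or asset 1 alone, which is a quick sanity check on the formula.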

Diversification
Note that, although the mean return on the portfolio depends only on the
mean returns on the individual assets, the risk of the portfolio also depends on
the relationship between the asset returns, as measured by their correlation.
This is a fundamental idea in portfolio theory; in particular, it plays an
important role in the concept of diversification, which refers to reducing the
risk of a portfolio by investing in many assets. We illustrate this concept by
considering two examples.


Example 4.2 Consider the case in which we have two assets with returns
R1 , R2 with the same mean and variance: E(R1 ) = E(R2 ) = μ and Var(R1 ) =
Var(R2 ) = σ2 . Furthermore, assume that R1 and R2 are uncorrelated.
Investing entirely in either asset 1 or asset 2 yields the same expected
return and the same risk. Now consider a portfolio consisting of both asset
1 and asset 2. Let w be the proportion of our investment in asset 1 so that
the portfolio weights are w1 = w and w2 = 1 − w. Then the expected return
on the portfolio is

μp (w) = E(Rp ) = wE(R1 ) + (1 − w)E(R2 ) = wμ + (1 − w)μ = μ.

Therefore, the expected return on the portfolio does not depend on the value
of w; in particular, investing in the portfolio yields the same expected return
as investing in either asset 1 or asset 2. However, including both assets in the
portfolio reduces risk.
Because $R_1$ and $R_2$ are uncorrelated,

$$\sigma_p^2(w) = \mathrm{Var}(R_p) = w^2 \mathrm{Var}(R_1) + (1 - w)^2 \mathrm{Var}(R_2) = (w^2 + (1 - w)^2)\sigma^2.$$

For $w = 0$ or $1$, $\sigma_p = \sigma$. However, for any other value of $w$, $0 < w < 1$, $\sigma_p < \sigma$.
Choosing $w = 1/2$ minimizes $\sigma_p$, yielding a value of $\sigma/\sqrt{2}$.
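The effect described in Example 4.2 can be seen by tabulating the ratio of the portfolio standard deviation to the common asset standard deviation over a grid of weights. This is a small sketch; the grid of weights is an arbitrary choice made here for illustration.

```python
import math


def sd_ratio_uncorrelated(w):
    """sigma_p / sigma for two uncorrelated assets with equal variance."""
    return math.sqrt(w ** 2 + (1.0 - w) ** 2)


grid = [k / 10 for k in range(11)]  # w = 0.0, 0.1, ..., 1.0
ratios = {w: sd_ratio_uncorrelated(w) for w in grid}
best_w = min(grid, key=sd_ratio_uncorrelated)

for w in grid:
    print(f"w = {w:.1f}: sigma_p / sigma = {ratios[w]:.4f}")
print("smallest ratio on the grid at w =", best_w)
```

On this grid the ratio equals 1 at the endpoints and is smallest at the equal-weight portfolio, where it equals one over the square root of 2.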

The scenario in this example is a special one since the assets have the
same mean and variance, and they are uncorrelated; the following example
generalizes this by assuming the asset returns are correlated.

Example 4.3 Consider the framework of Example 4.2 but now assume that
the correlation of R1 and R2 is ρ12 , −1 < ρ12 < 1. Consider the portfolio
placing equal weight on the two assets so that w1 = w2 = 1/2. Then Rp =
(1/2)R1 + (1/2)R2 ; hence, μp = E(Rp ) = μ as in Example 4.2. The variance
of Rp is given by

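$$\begin{aligned}
\sigma_p^2 = \mathrm{Var}(R_p) &= \frac{1}{4}\mathrm{Var}(R_1) + \frac{1}{4}\mathrm{Var}(R_2) + 2 \cdot \frac{1}{2} \cdot \frac{1}{2}\,\mathrm{Cov}(R_1, R_2) \\
&= \frac{1}{4}\sigma^2 + \frac{1}{4}\sigma^2 + \frac{1}{2}\rho_{12}\sigma^2 \\
&= \frac{1}{2}(1 + \rho_{12})\sigma^2.
\end{aligned}$$

Therefore, for any value of $\rho_{12}$, $-1 < \rho_{12} < 1$, $\sigma_p < \sigma$.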
When $\rho_{12}$ is close to 1, $\sigma_p$ is close to $\sigma$, the standard deviation of
$R_1$ and $R_2$; therefore, there is little benefit to including both assets in the
portfolio. This is not surprising because, when $\rho_{12} \doteq 1$, $R_1$ and $R_2$ have an
approximately linear relationship; hence, there is little diversity in including
both assets in the portfolio. On the other hand, the standard deviation of the
portfolio return is smallest when the asset returns are strongly negatively
correlated, so that one asset return tends to increase when the other decreases
and vice versa.
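To see how the correlation drives the benefit of diversification in Example 4.3, the ratio of the equal-weight portfolio standard deviation to the common asset standard deviation can be evaluated for several correlations. The correlation values below are arbitrary illustrative choices.

```python
import math


def equal_weight_sd_ratio(rho12):
    """sigma_p / sigma for the equal-weight portfolio: sqrt((1 + rho12) / 2)."""
    return math.sqrt((1.0 + rho12) / 2.0)


correlations = (-0.9, -0.5, 0.0, 0.5, 0.9)
for rho in correlations:
    print(f"rho12 = {rho:+.1f}: sigma_p / sigma = {equal_weight_sd_ratio(rho):.3f}")
```

The ratio increases with the correlation and stays below 1 for every correlation strictly between -1 and 1, in line with the discussion above.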


4.3 Negative Portfolio Weights: Short Sales

Although it may be more natural to consider portfolio weights that are con-
strained to be in the interval [0, 1], often it is desirable to allow weights to be
negative. This is illustrated in the following example.

Example 4.4 Let $R_1$, $R_2$ denote returns on two assets and suppose that
$E(R_1) = E(R_2) = \mu$, $\mathrm{Var}(R_1) = 0.3$, $\mathrm{Var}(R_2) = 0.1$, and suppose that the
correlation of $R_1$, $R_2$ is $\sqrt{3}/2$. If we construct a portfolio by investing a proportion
$w$ of our investment in asset 1 and $1 - w$ of our investment in asset 2, the return
on our portfolio is $R_p = wR_1 + (1 - w)R_2$. It is straightforward to show that
$R_p$ has mean $\mu_p(w) = \mu$ and variance $\sigma_p^2(w)$, given by

$$\sigma_p^2(w) = (0.3)w^2 + (0.1)(1 - w)^2 + 2\,\frac{\sqrt{3}}{2}\sqrt{(0.3)(0.1)}\,w(1 - w) = (0.1)(w^2 + w + 1).$$

Because all portfolios have the same expected return, we should choose w
to minimize risk. Note that σ2p (w) is a quadratic function of w with a positive
coeﬃcient of w2 ; hence, it can be minimized by taking the derivative, setting
it equal to 0, and solving for w, yielding w = −1/2. That is, the minimum
variance portfolio places weight −1/2 on asset 1 and weight 3/2 on asset 2.
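The minimization in Example 4.4 can be verified numerically. The sketch below evaluates the variance directly from the variances and correlation given in the example and locates the minimum from the vertex of the simplified quadratic; the variable names are chosen here for illustration.

```python
import math

VAR1, VAR2 = 0.3, 0.1          # Var(R1) and Var(R2) from Example 4.4
RHO12 = math.sqrt(3.0) / 2.0   # correlation of R1 and R2


def portfolio_variance(w):
    """Variance of w*R1 + (1 - w)*R2 for the assets of Example 4.4."""
    cov = RHO12 * math.sqrt(VAR1 * VAR2)
    return VAR1 * w ** 2 + VAR2 * (1.0 - w) ** 2 + 2.0 * w * (1.0 - w) * cov


# The simplified form is a*w^2 + b*w + c with a = b = c = 0.1,
# so the minimum is at the vertex w = -b / (2a).
a, b = 0.1, 0.1
w_min = -b / (2.0 * a)

print(w_min, portfolio_variance(w_min))
print(portfolio_variance(0.0), portfolio_variance(1.0))  # asset 2 alone, asset 1 alone
```

The variance at the minimizing weight is below the variance of either asset held alone, even though reaching it requires a negative weight on asset 1.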

Such a negative weight in a portfolio represents a “short sale” in which the
investor sells, rather than buys, the asset. Because the investor does not own
the asset in question, first she must borrow it before selling it; when the asset
is sold, the investor receives the current price for the asset. At the end of the
investment period, to pay back the loan, the investor must buy the asset at
its current price. Therefore, the entire procedure is the opposite of what takes
place when buying an asset.

Example 4.5 Consider the portfolio constructed in Example 4.4. Suppose
the price of asset 1 is $10 per share in period 0 and $12 per share in period
1 so that the realized value of R1 is (12 − 10)/10 = 0.2. For asset 2, suppose
the price is $15 per share in period 0 and $21 per share in period 1 so that
R2 is observed to be 0.4.
Suppose we have $100 to invest. Because w1 = −1/2, we borrow $50
(i.e., 1/2 × 100) worth of asset 1, 5 shares at $10 per share, and immediately
sell them, yielding $50. We now have $150 to invest in asset 2, which
allows us to buy 10 shares at $15 per share.
At the end of period 1, our 10 shares of asset 2 are worth $21 per share, or
$210 total. However, we must still repay our loan of 5 shares of asset 1. Each
share of asset 1 is now worth $12, so we must use $60 (i.e., 5 × 12) of our $210
to buy 5 shares of asset 1. Therefore, our net worth at the end of period 1 is
$210 − $60 = $150. This corresponds to a return of
(150 − 100)/100 = 0.5.
In terms of w1, w2, R1, and R2, the return on our portfolio is
w1 R1 + w2 R2 = −0.5R1 + 1.5R2 = −0.5(0.2) + 1.5(0.4) = 0.5
as determined previously.
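The bookkeeping in Example 4.5 can be written out step by step. In this sketch a negative share count stands for shares that were borrowed and sold; the variable names are illustrative, and the prices and weights are those of the example.

```python
capital = 100.0
w1, w2 = -0.5, 1.5                 # weights from Example 4.4
p1_start, p1_end = 10.0, 12.0      # price of asset 1 in periods 0 and 1
p2_start, p2_end = 15.0, 21.0      # price of asset 2 in periods 0 and 1

# Shares held in each asset; a negative number means shares borrowed and sold.
shares1 = w1 * capital / p1_start  # -5 shares of asset 1
shares2 = w2 * capital / p2_start  # 10 shares of asset 2

# Period-1 net worth: value of asset 2 minus the cost of buying back asset 1.
net_worth = shares1 * p1_end + shares2 * p2_end
realized_return = net_worth / capital - 1.0

# The same return from the weighted sum of the realized asset returns.
r1 = p1_end / p1_start - 1.0
r2 = p2_end / p2_start - 1.0
weighted_return = w1 * r1 + w2 * r2

print(shares1, shares2, net_worth, realized_return, weighted_return)
```

Both routes give a return of 0.5 up to rounding error, matching the calculation in the example.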

A short sale of an asset may be viewed as a loan, with the interest rate of
the loan equal to the return on that asset. Therefore, if the price of the asset
decreases over the loan period, so that the return is negative, then the interest
rate on the loan is effectively negative and the investor makes money on the
loan as well. Because of this, short sales are often described as an appropriate
investment for an asset expected to have a negative return. Although this may
be true, a short sale can be useful for controlling risk even in cases in which
the asset price is expected to increase. This is illustrated in Example 4.4.
Although portfolios with negative weights are convenient from a mathe-
matical point of view, and we will use them here, there are some important
practical considerations. Note that there is a fundamental difference between
a short sale and a more typical “long” position. If we purchase $100 of an
asset, the most we can lose is $100, which occurs if the share price decreases
to 0. However, if we borrow $100 worth of an asset, our potential losses
are unbounded because the price of the asset can, in principle, increase
indefinitely. Therefore, short sales are tightly regulated.
For instance, when shares are borrowed, a certain amount of collateral
must be placed in a “margin account,” effectively increasing the cost of the
loan. If the price of the stock increases, then more collateral may be needed
since the cost of buying back the stock is greater; in this case, the investor
receives a “margin call” demanding that additional funds be added to the
margin account. Also, short positions may be subject to a “forced buy-in,”
in which the lender of the asset requires that the loan be repaid immediately,
even if such a repayment requires the investor who borrowed the shares to
lose money.
Because of such considerations, analysts sometimes only consider portfolios
in which the weights are restricted to lie in the interval [0, 1], although such
constraints complicate the details of the analysis. Constraints of this type will
be considered in Section 5.8.

4.4 Optimal Portfolios of Two Assets

Consider two assets, with returns $R_1$ and $R_2$, respectively; for $j = 1, 2$, let
$R_j$ have mean and standard deviation $\mu_j$ and $\sigma_j$, respectively, and let $\rho_{12}$
denote the correlation of $R_1$ and $R_2$. The portfolio based on portfolio weights
$w_1 = w$ and $w_2 = 1 - w$ has return $R_p = wR_1 + (1 - w)R_2$. We summarize
the properties of the portfolio with return $R_p$ by the mean and standard
deviation of the return, $\mu_p(w)$ and $\sigma_p(w)$, respectively, viewed as functions
of $w$, $-\infty < w < \infty$. Recall that

$$\mu_p(w) = w\mu_1 + (1 - w)\mu_2$$

and

$$\sigma_p^2(w) = w^2\sigma_1^2 + (1 - w)^2\sigma_2^2 + 2w(1 - w)\rho_{12}\sigma_1\sigma_2.$$

In this section, we consider the problem of choosing the value of $w$.
As w varies, μp (w) and σp (w) vary; we may view these values as points
(σp (w), μp (w)) in the “risk-return space.” A plot of (σp (w), μp (w)) as w
varies is a useful way to understand the relationship between the expected
return and risk of a portfolio.
Example 4.6 Suppose that $\mu_1 = 0.2$, $\sigma_1 = 0.1$, $\mu_2 = 0.1$, $\sigma_2 = 0.05$, and
$\rho_{12} = 0.25$. Then $\mu_p(w) = w(0.2) + (1 - w)(0.1) = 0.1 + 0.1w$ and

$$\sigma_p^2(w) = w^2(0.1)^2 + (1 - w)^2(0.05)^2 + 2w(1 - w)(0.25)(0.1)(0.05) = 0.01w^2 - 0.0025w + 0.0025.$$

Figure 4.1 contains a plot of (σp (w), μp (w)) as w varies.

FIGURE 4.1
Expected return and risk for different portfolios in Example 4.6 (horizontal
axis: $\sigma_p$; vertical axis: $\mu_p$).

The curve in Figure 4.1 represents the set of all $(\sigma_p(w), \mu_p(w))$ pairs that
are available to the investor; it is known as the opportunity set. This term
will also be used to describe the corresponding portfolios; for instance, in the
previous example, a portfolio in the opportunity set has a value of $(\sigma_p, \mu_p)$ on
the curve in Figure 4.1.
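The opportunity set for the assets of Example 4.6 can be traced by evaluating the risk and expected return over a range of weights. The sketch below uses the simplified expressions from that example; the range and spacing of the weights are arbitrary choices made here.

```python
import math


def risk_return(w):
    """(sigma_p(w), mu_p(w)) for the assets of Example 4.6."""
    mean = 0.1 + 0.1 * w
    variance = 0.01 * w ** 2 - 0.0025 * w + 0.0025
    return math.sqrt(variance), mean


weights = [-1.0 + 0.25 * k for k in range(13)]  # w = -1.0, -0.75, ..., 2.0
opportunity_set = [risk_return(w) for w in weights]

for w, (sd, mean) in zip(weights, opportunity_set):
    print(f"w = {w:+.2f}: sigma_p = {sd:.4f}, mu_p = {mean:.4f}")
```

Plotting the second coordinate against the first would give a curve of the kind described for Figure 4.1; the points with `w = 0` and `w = 1` correspond to holding asset 2 or asset 1 alone.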
Note that, unless $\mu_1 = \mu_2$, for a given value $m$ there is exactly one value of
$w$ such that $\mu_p(w) = m$. On the other hand, for a given value $s > 0$, there
may be zero, one, or two values of $w$ such that $\sigma_p(w) = s$, depending on the
number of solutions to the quadratic equation $\sigma_p^2(w) - s^2 = 0$.

Example 4.7 Consider the assets described in Example 4.6. Note that, for
a given value of $\mu_p$, say $m$, the portfolio with that expected return may be
found by solving

$$\mu_p(w) = 0.1 + 0.1w = m$$

for $w$, yielding $w = 10m - 1$. On the other hand, for a given value of $\sigma_p$, say
$s$, there may be zero, one, or two portfolios with that standard deviation,
depending on the solutions to the quadratic equation

$$\sigma_p^2(w) = 0.01w^2 - 0.0025w + 0.0025 = s^2.$$

For instance, there are no portfolios with $\sigma_p(w) = 0.0375$ because there are
no (real) solutions to the equation

$$0.01w^2 - 0.0025w + 0.0025 = (0.0375)^2;$$

there are two portfolios with $\sigma_p(w) = 0.0625$, corresponding to the two
solutions to the equation

$$0.01w^2 - 0.0025w + 0.0025 = (0.0625)^2,$$

$w = (1 + \sqrt{10})/8 \doteq 0.52$ and $w = (1 - \sqrt{10})/8 \doteq -0.27$; there is one portfolio
with $\sigma_p(w) = \sqrt{15}/80$, corresponding to the one solution to the equation

$$0.01w^2 - 0.0025w + 0.0025 = \left(\frac{\sqrt{15}}{80}\right)^2 = \frac{15}{6400},$$

$w = 1/8$. These three possibilities are illustrated in Figure 4.2.
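The three cases in Example 4.7 follow from the discriminant of the quadratic equation. The following sketch solves for the weights with a given standard deviation; the function name and the numerical tolerance used to detect a repeated root are assumptions made here, since floating-point arithmetic rarely gives a discriminant of exactly zero.

```python
import math


def weights_with_sd(s, a=0.01, b=-0.0025, c=0.0025, tol=1e-12):
    """Real solutions w of a*w^2 + b*w + c = s^2 (defaults: Example 4.6 assets)."""
    disc = b * b - 4.0 * a * (c - s * s)
    if disc < -tol:
        return []                      # no portfolio has this standard deviation
    if abs(disc) <= tol:
        return [-b / (2.0 * a)]        # repeated root: the minimum-risk point
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


print(weights_with_sd(0.0375))                  # no solutions
print(weights_with_sd(0.0625))                  # two solutions
print(weights_with_sd(math.sqrt(15.0) / 80.0))  # one solution
```

The expected return for any of these weights then follows from the linear relation between the mean and `w`.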

Efficient Portfolios
For the case in which two portfolios have the same return standard deviation,
only the one with the larger return mean is of interest. For example, in
Figure 4.1, each portfolio with a $(\sigma_p(w), \mu_p(w))$ pair on the lower half of the
curve is dominated by a portfolio with a $(\sigma_p(w), \mu_p(w))$ pair on the upper
half of the curve. Thus, the portfolios corresponding to the upper half of the
curve are known as efficient portfolios.

FIGURE 4.2
Three possibilities for the solutions to $\sigma_p^2(w) = s^2$ in Example 4.7
(horizontal axis: $\sigma_p$; vertical axis: $\mu_p$).

FIGURE 4.3
Efficient frontier in Example 4.7 (horizontal axis: $\sigma_p$; vertical axis: $\mu_p$).

That is, for an efficient portfolio, it is not possible to have a larger expected
return without increasing the risk or, conversely, it is not possible to have lower
risk without decreasing the expected return. This upper half of the opportu-
nity set is known as the efficient frontier ; the efficient frontier for the assets
described in Example 4.7 is given in Figure 4.3. The term “efficient frontier”
will also be used to describe the portfolios with a $(\sigma_p(w), \mu_p(w))$ pair on
the efficient frontier.
Each portfolio on the efficient frontier has the largest possible expected
return for a given level of risk and the lowest possible risk for a given expected
return. Therefore, there is no objective way to choose from among the effi-
cient portfolios; such a choice depends on an investor’s view of the relative
importance of a portfolio’s expected return and risk.

The Minimum-Variance Portfolio

Suppose that our goal is simply to minimize risk, without consideration of the
expected return of the portfolio. Then we choose w to minimize the return
standard deviation σp (w) or, equivalently, to minimize the return variance
σ2p (w). The resulting portfolio is known as the minimum-variance portfolio.
To find the minimum-variance portfolio, we need to find the value of $w$
that minimizes $\sigma_p^2(w)$. To solve this minimization problem, we may use the
approach used in calculus. Note that

$$\frac{d\sigma_p^2(w)}{dw} = 2w\sigma_1^2 - 2(1 - w)\sigma_2^2 + 2(1 - 2w)\rho_{12}\sigma_1\sigma_2$$

and

$$\frac{d^2\sigma_p^2(w)}{dw^2} = 2\sigma_1^2 + 2\sigma_2^2 - 4\rho_{12}\sigma_1\sigma_2.$$

Clearly,

$$\frac{d^2\sigma_p^2(w)}{dw^2} \ge 2\sigma_1^2 + 2\sigma_2^2 - 4\sigma_1\sigma_2 = 2(\sigma_1 - \sigma_2)^2 \ge 0$$

and, provided that $\rho_{12} < 1$,

$$\frac{d^2\sigma_p^2(w)}{dw^2} > 0.$$

Hence, $\sigma_p^2(w)$ can be minimized by solving

$$w\sigma_1^2 - (1 - w)\sigma_2^2 + (1 - 2w)\rho_{12}\sigma_1\sigma_2 = 0$$

for $w$, yielding the solution

$$w = w_{mv} \equiv \frac{\sigma_2^2 - \rho_{12}\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}$$

and

$$1 - w_{mv} = \frac{\sigma_1^2 - \rho_{12}\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}.$$

Note that $\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 = \mathrm{Var}(R_1 - R_2)$. Using the fact that

$$\mathrm{Var}(R_1 - R_2) = \sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 = (\sigma_1 - \sigma_2)^2 + 2(1 - \rho_{12})\sigma_1\sigma_2,$$

it follows that if $\rho_{12} < 1$, then $\mathrm{Var}(R_1 - R_2) > 0$. Therefore, provided that
$\rho_{12} < 1$, the denominator in the expression for $w_{mv}$ is nonzero.
Example 4.8 Consider Example 4.6 in which $\sigma_1 = 0.1$, $\sigma_2 = 0.05$, and
$\rho_{12} = 0.25$. Then

$$w_{mv} = \frac{(0.05)^2 - (0.25)(0.1)(0.05)}{(0.1)^2 + (0.05)^2 - 2(0.25)(0.1)(0.05)} = \frac{1}{8}.$$

Recall that in Example 4.7 we saw that the quadratic equation
$\sigma_p^2(w) - 15/6400 = 0$ has a single root, at $w = 1/8$. Such a single root always occurs
at the point of minimum risk; see Figure 4.2.
Therefore, risk is minimized by placing 1/8 of our investment in asset 1.
Here, $\mu_1 = 0.2$ and $\mu_2 = 0.1$, so that the minimum-variance portfolio has
expected return

$$(1/8)\mu_1 + (7/8)\mu_2 = 0.1125$$

and, using the result in Example 4.6, the standard deviation of the return is

$$\left. \left(0.01w^2 - 0.0025w + 0.0025\right)^{1/2} \right|_{w = 1/8} \doteq 0.0484.$$

This gives the portfolio with minimum risk. However, it is only the optimal
choice if our goal is to minimize risk. For example, suppose that we are
willing to increase the standard deviation of the portfolio to 0.05; the
solutions to $0.01w^2 - 0.0025w + 0.0025 = (0.05)^2$ are $w = 0.25$ and $w = 0$. Only
the first of these corresponds to a portfolio on the efficient frontier (why?),
and its expected return is $0.1 + 0.1(0.25) = 0.125$. Hence, a 3.3% increase in
risk (i.e., 0.0484 to 0.05) yields an 11% increase in expected return (i.e., 0.1125
to 0.125).
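The minimum-variance weight is a closed-form expression, so it is straightforward to compute for any pair of assets. The sketch below is one possible implementation under the assumption that the correlation is less than 1; the function names are chosen here, and the second call reuses the variances and correlation of Example 4.4 as a cross-check.

```python
import math


def min_variance_weight(sigma1, sigma2, rho12):
    """Weight on asset 1 in the minimum-variance portfolio of two assets."""
    denom = sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho12 * sigma1 * sigma2  # Var(R1 - R2)
    if denom <= 0.0:
        raise ValueError("Var(R1 - R2) must be positive")
    return (sigma2 ** 2 - rho12 * sigma1 * sigma2) / denom


def portfolio_sd(w, sigma1, sigma2, rho12):
    """Standard deviation of w*R1 + (1 - w)*R2."""
    variance = (w ** 2 * sigma1 ** 2 + (1.0 - w) ** 2 * sigma2 ** 2
                + 2.0 * w * (1.0 - w) * rho12 * sigma1 * sigma2)
    return math.sqrt(variance)


# Assets of Example 4.6 / Example 4.8.
w_mv = min_variance_weight(0.1, 0.05, 0.25)
mu_mv = w_mv * 0.2 + (1.0 - w_mv) * 0.1
sd_mv = portfolio_sd(w_mv, 0.1, 0.05, 0.25)
print(w_mv, mu_mv, round(sd_mv, 4))

# Cross-check with Example 4.4, where the minimizing weight is negative.
w_mv_44 = min_variance_weight(math.sqrt(0.3), math.sqrt(0.1), math.sqrt(3.0) / 2.0)
print(w_mv_44)
```

Raising a `ValueError` when the denominator is not positive is a design choice made for this sketch; it covers the degenerate case in which the formula is undefined.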

The Risk-Aversion Criterion

The minimum-variance portfolio completely ignores the expected return of the
portfolio. An alternative, and usually preferable, approach is to use a criterion
that takes into account both the risk and the expected return of the portfolio.
Because of risk aversion, investors are generally willing to accept a lower
expected return if that lower expected return corresponds to lower risk as
well. Conversely, high portfolio risk may be tolerable if the portfolio has a
high expected return.
Consider the portfolio placing weight $w$ on asset 1; let $\mu_p(w)$ and $\sigma_p(w)$
denote the mean and standard deviation, respectively, of the return on that
portfolio. We might consider evaluating this portfolio by

$$f_\lambda(w) = \mu_p(w) - \frac{\lambda}{2}\sigma_p^2(w)$$

where $\lambda > 0$ is a given parameter, known as the risk aversion parameter.


The function $f_\lambda$ is a type of penalized mean return, with a penalty based
on the return variance; the magnitude of the penalty is controlled by the
parameter $\lambda$. Thus, larger values of $f_\lambda(\cdot)$ correspond to portfolios with a
greater expected return relative to the portfolio risk, with the tradeoff between
expected return and risk controlled by $\lambda$. For a given value of $\lambda$, let $w_\lambda$ denote
the value of $w$ that maximizes $f_\lambda(w)$.
To find $w_\lambda$, we may use an approach similar to the one used to find the
weights of the minimum-variance portfolio. Note that

$$f_\lambda'(w) = \mu_p'(w) - \lambda\sigma_p(w)\sigma_p'(w) = (\mu_1 - \mu_2) - \lambda\left(w\sigma_1^2 - (1 - w)\sigma_2^2 + (1 - 2w)\rho_{12}\sigma_1\sigma_2\right)$$

and

$$f_\lambda''(w) = -\lambda(\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2) = -\lambda\,\mathrm{Var}(R_1 - R_2) < 0$$

provided that $\rho_{12} < 1$. Hence, we can maximize $f_\lambda(w)$ by solving $f_\lambda'(w) = 0$
for $w$, yielding the solution

$$w_\lambda = \frac{\sigma_2^2 - \rho_{12}\sigma_1\sigma_2 + (\mu_1 - \mu_2)/\lambda}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}; \tag{4.1}$$

it follows that

$$1 - w_\lambda = \frac{\sigma_1^2 - \rho_{12}\sigma_1\sigma_2 - (\mu_1 - \mu_2)/\lambda}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}.$$

Example 4.9 Consider two assets such that $\mu_1 = 0.04$, $\mu_2 = 0.02$, $\sigma_1 = 0.2$,
$\sigma_2 = 0.1$, and $\rho_{12} = 0.25$. Then, using (4.1),

$$w = w_\lambda \equiv \frac{1}{8} + \frac{1}{2\lambda}, \quad \lambda > 0.$$

Figure 4.4 contains plots of μp (wλ ) and σp (wλ ) as λ varies. Note that, for large
values of λ, there is a large penalty on the variance of the return; hence, the
optimal portfolio has small risk. To achieve this, the portfolio must also have
small expected return. Conversely, for a small λ, the variance of the return is
largely irrelevant; hence, the optimal portfolio has large risk. As a reward for
the large risk incurred, the portfolio also has a large expected return.
When λ is small, the weight on asset 1 is large; hence, the weight on asset
2 is negative. That is, in order to achieve a large expected return, we must
borrow asset 2 (which has a low expected return) in order to buy asset 1
(which has a large expected return). For instance, if λ = 0.25, the optimal
portfolio places weight 2.125 on asset 1 and weight −1.125 on asset 2.
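The dependence of the optimal weight on the risk aversion parameter in Example 4.9 can be explored with a short computation. This sketch implements formula (4.1) and the penalized mean return; the function names and the particular values of the risk aversion parameter in the loop are illustrative choices made here.

```python
def penalized_mean(w, lam, mu1, mu2, sigma1, sigma2, rho12):
    """f_lambda(w) = mu_p(w) - (lambda / 2) * sigma_p^2(w)."""
    mean = w * mu1 + (1.0 - w) * mu2
    variance = (w ** 2 * sigma1 ** 2 + (1.0 - w) ** 2 * sigma2 ** 2
                + 2.0 * w * (1.0 - w) * rho12 * sigma1 * sigma2)
    return mean - 0.5 * lam * variance


def risk_aversion_weight(lam, mu1, mu2, sigma1, sigma2, rho12):
    """Weight on asset 1 that maximizes f_lambda, following formula (4.1)."""
    denom = sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho12 * sigma1 * sigma2
    return (sigma2 ** 2 - rho12 * sigma1 * sigma2 + (mu1 - mu2) / lam) / denom


params = dict(mu1=0.04, mu2=0.02, sigma1=0.2, sigma2=0.1, rho12=0.25)  # Example 4.9
for lam in (0.25, 1.0, 5.0, 100.0):
    w_lam = risk_aversion_weight(lam, **params)
    print(f"lambda = {lam:6.2f}: weight on asset 1 = {w_lam:.4f}, "
          f"weight on asset 2 = {1.0 - w_lam:.4f}")
```

As the risk aversion parameter grows, the weight approaches 1/8, the minimum-variance weight for these assets; for small values the weight on asset 2 becomes negative.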

FIGURE 4.4
Properties of the optimal portfolio as a function of λ in Example 4.9 (two
panels, expected return and risk, each plotted against λ).

4.5 Risk-Free Assets

In constructing a portfolio, we might consider the possibility of not investing
some of our capital. That is, if, as in the previous section, there are two
assets and wi denotes the proportion of our investment in asset i, i = 1, 2,
then we might have w1 + w2 < 1. The proportion not invested, 1 − w1 − w2 ,
does not contribute to the expected return on the portfolio, but it reduces the
proportion of the investment contributing to the risk of the portfolio.
A better approach is to invest in a risk-free asset, an asset that realizes a
small return but does so with complete certainty. Let $R_f$ denote the return on a
risk-free asset. Then $E(R_f) = \mu_f$, the risk-free rate of return, and $\mathrm{Var}(R_f) = 0$,
that is, $\Pr(R_f = \mu_f) = 1$.
Investment in a risk-free asset might contribute only a small return to
the portfolio but it reduces the risk of our investment, giving the investor a
convenient way to control risk; note that, since Rf has zero variance, it also
has zero covariance with any other asset return. Although we might consider
a simple “savings account” as a risk-free asset, the usual risk-free asset used
in portfolio analysis is a three-month U.S. Treasury Bill.
For instance, consider a portfolio consisting of a risk-free asset, with
return Rf , and a standard, “risky” asset, with return R. Let μ = E(R) and
σ2 = Var(R); assume that μ > μf . Suppose we invest a proportion w of our
investment in the risky asset, with the remainder invested in the risk-free
asset; assume that w > 0 so that we do not borrow the risky asset to buy the
risk-free asset. Then the return on our portfolio is wR + (1 − w)Rf .
This portfolio has expected return

$$\mu_0(w) = w\mu + (1 - w)\mu_f \tag{4.2}$$

and return standard deviation

$$\sigma_0(w) = w\sigma. \tag{4.3}$$

Note that, although $R_f$ and $\mu_f$ are equal with probability one, we will use
$R_f$ when referring to returns, for example, the return on the portfolio is
$wR + (1 - w)R_f$, and use $\mu_f$ when referring to properties of the distribution of
returns, for example, the expected return on the portfolio is $w\mu + (1 - w)\mu_f$.
Hence, the risk of the portfolio can be made as small as we like, by choosing
a value of $w$ close to 0. Of course, reducing the risk of the portfolio also reduces
the expected return. Note that solving (4.3) for $w$ yields

$$w = \frac{\sigma_0(w)}{\sigma}$$

and using this result in (4.2) yields the following relationship between the
expected return on such a portfolio and its risk:

$$\mu_0(w) = \mu_f + \frac{\mu - \mu_f}{\sigma}\,\sigma_0(w).$$
Example 4.10 Consider a risky asset with expected return $\mu = 0.02$ and
return standard deviation $\sigma = 0.1$; suppose that the risk-free return is
$\mu_f = 0.001$. Then

$$\frac{\mu - \mu_f}{\sigma} = \frac{0.02 - 0.001}{0.1} = 0.19$$

so that the mean return $\mu_0(w)$ and return standard deviation $\sigma_0(w)$ of the
portfolio placing weight $w$ on the risky asset and weight $1 - w$ on the risk-free
asset are related by

$$\mu_0(w) = 0.001 + 0.19\,\sigma_0(w).$$
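The linear relationship in Example 4.10 can be confirmed by computing the mean and standard deviation from (4.2) and (4.3) for a few weights and comparing against the line. The sketch assumes nonnegative weights on the risky asset, as in the text; the function name and the weights tried are illustrative.

```python
def risky_riskfree_moments(w, mu, sigma, mu_f):
    """Mean and standard deviation of w*R + (1 - w)*R_f, assuming w >= 0."""
    if w < 0.0:
        raise ValueError("this sketch assumes a nonnegative weight on the risky asset")
    return w * mu + (1.0 - w) * mu_f, w * sigma


mu, sigma, mu_f = 0.02, 0.1, 0.001   # values from Example 4.10
slope = (mu - mu_f) / sigma          # mean excess return per unit of risk

for w in (0.0, 0.5, 1.0, 1.5):
    mean, sd = risky_riskfree_moments(w, mu, sigma, mu_f)
    on_line = mu_f + slope * sd
    print(f"w = {w:.1f}: mu_0 = {mean:.5f}, sigma_0 = {sd:.3f}, line gives {on_line:.5f}")
```

Every computed pair lies on the line with intercept equal to the risk-free return and slope 0.19.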
Because of the simple relationship between the expected return and return
standard deviation for portfolios formed from a risky asset and a risk-free asset,
the set of risk, expected-return pairs available to the investor takes a
particularly simple form.
Example 4.11 Consider the asset and risk-free asset described in Example
4.10. Let $\mu_0(w)$ and $\sigma_0(w)$ denote the mean and standard deviation,
respectively, of the portfolio placing weight $w$ on the risky asset. Figure 4.5 contains
a plot of the efficient frontier (the solid line) together with the opportunity
set (the solid line together with the dotted line). The minimum risk portfolio
in this context is the one with $w = 0$ corresponding to the vertex in the plot,
which occurs at $(0, \mu_f)$; note that the portfolio with $w = 0$ is the one in which
the investor neither invests in nor borrows the risky asset.
The efficient frontier in this case is that part of the opportunity set
corresponding to $w \ge 0$, that is, the line segment with positive slope. The slope
of this line segment is (μ − μf )/σ, which may be viewed as a measure of the
mean excess return on the asset per unit of risk. Note that if we have two
possible risky assets, the one with the larger slope leads to portfolios with a
higher expected return for a given level of risk. This property plays an impor-
tant role in constructing a portfolio from two risky assets together with the
risk-free asset, and it will be discussed in detail in Section 4.6.

The risk-free return provides a convenient baseline for measuring the return
on an asset; for a given asset with return $R$, the quantity $R - R_f$ is known as
the excess return of the asset. Therefore, for the portfolio discussed earlier,
the expected excess return on the portfolio is proportional to the portfolio
risk:

$$\mu_0(w) - \mu_f = \frac{\mu - \mu_f}{\sigma}\,\sigma_0(w).$$

FIGURE 4.5
Efficient frontier and opportunity set in Example 4.11 (horizontal axis: $\sigma_0$;
vertical axis: $\mu_0$, with $\mu_f$ marked).

Note that a short sale of a risk-free asset corresponds to a loan at the
risk-free rate. Therefore, we are assuming that investors can borrow and
lend at the same rate, an assumption that is not true in practice for the
vast majority of investors. However, it may be considered to be a reason-
able approximation for an institutional investor, such as a pension fund or an
insurance company.
Thus, in some cases, in order to achieve a high expected return, the investor
must borrow at the risk-free rate and invest the proceeds from that loan in
the risky asset.
Example 4.12 Consider the assets discussed in Example 4.10 in which the
risky asset has mean return 0.02 and the risk-free return is 0.001. To construct
a portfolio with an expected return of 0.0295, we must use portfolio weights
w = 1.5 and 1 − w = −0.5 (note that (1.5)(0.02) + (−0.5)(0.001) = 0.0295);
that is, we must take a loan for an amount equal to half of our available capital
and invest our capital, together with the proceeds from the loan, in the risky
asset.
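The weight needed to reach a target expected return, as in Example 4.12, follows from solving (4.2) for w. The sketch below does this; the function name is chosen here, and the standard deviation 0.1 of the risky asset is taken from Example 4.10.

```python
def weight_for_target_mean(target, mu, mu_f):
    """Weight on the risky asset giving expected return `target`, from (4.2)."""
    if mu == mu_f:
        raise ValueError("the risky and risk-free expected returns must differ")
    return (target - mu_f) / (mu - mu_f)


w_target = weight_for_target_mean(0.0295, mu=0.02, mu_f=0.001)
loan_fraction = max(0.0, w_target - 1.0)  # amount borrowed, as a fraction of capital
risk = w_target * 0.1                     # sigma_0(w) = w * sigma with sigma = 0.1

print(w_target, loan_fraction, risk)
```

A weight above 1 on the risky asset means the investor borrows at the risk-free rate, and the portfolio risk then exceeds that of the risky asset itself.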

4.6 Portfolios of Two Risky Assets and a Risk-Free Asset
Now suppose that there are two risky assets available for investment, as in
Section 4.4, plus a risk-free asset, as discussed in the previous section. Consider
risky assets with returns R1 and R2 , respectively. For j = 1, 2, let μj = E(Rj )
and σ2j = Var(Rj ) and let ρ12 denote the correlation of R1 , R2 ; assume that
|ρ12 | < 1. Let Rf denote the return on the risk-free asset, and let μf = E(Rf );
recall that, by definition, Var(Rf ) = 0.
A portfolio consisting of the two risky assets together with the risk-free
asset has a return of the form

$$w_1 R_1 + w_2 R_2 + w_f R_f \tag{4.4}$$

where $w_1 + w_2 + w_f = 1$; assume that $w_f \neq 1$ so that the portfolio contains
one or both of the risky assets. Note that we can write (4.4) as

$$(1 - w_f)(\bar{w}_1 R_1 + \bar{w}_2 R_2) + w_f R_f$$

where $\bar{w}_j = w_j/(1 - w_f)$, $j = 1, 2$. Hence, $\bar{w}_1 + \bar{w}_2 = 1$.
That is, it is convenient to view the portfolio selection problem as having
two stages. In the first stage, we construct a portfolio of the two risky assets,
with weights w̄1 and w̄2 ; in the second stage, that portfolio is combined with
the risk-free asset by choosing the value of wf . However, when choosing w̄1
and w̄2 , it is important to keep in mind that the portfolio of the risky assets
will be combined with a risk-free asset.
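The two-stage view of the portfolio can be illustrated by rescaling the risky-asset weights and checking that both forms give the same return. In this sketch the weights and realized returns are hypothetical numbers chosen for illustration, and the function name is not from the text.

```python
def risky_subportfolio_weights(w1, w2, wf):
    """Rescaled weights w_j / (1 - w_f) of the two risky assets; requires w_f != 1."""
    if abs(w1 + w2 + wf - 1.0) > 1e-12:
        raise ValueError("the three weights must sum to 1")
    if wf == 1.0:
        raise ValueError("w_f = 1 leaves no investment in the risky assets")
    scale = 1.0 - wf
    return w1 / scale, w2 / scale


# Hypothetical weights and realized returns (illustrative numbers only).
w1, w2, wf = 0.3, 0.5, 0.2
r1, r2, rf = 0.04, -0.01, 0.001

wbar1, wbar2 = risky_subportfolio_weights(w1, w2, wf)
direct = w1 * r1 + w2 * r2 + wf * rf
two_stage = (1.0 - wf) * (wbar1 * r1 + wbar2 * r2) + wf * rf

print(wbar1, wbar2, direct, two_stage)
```

The rescaled weights sum to 1, so they describe a portfolio of the risky assets alone, and combining that portfolio with the risk-free asset reproduces the original return.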
Including the possibility of investing in a risk-free asset has a large effect
on the portfolio selection problem. Because we may always decrease risk by
investing in the risk-free asset, there is a sense in which the risk in the portfolio
of risky assets becomes less important.


Let $R_p$ denote the return on the portfolio of risky assets; once the portfolio
of risky assets has been selected, we are effectively back to the case of one risky
asset together with the risk-free asset.
The return on the entire portfolio may be written

$$(1 - w_f)R_p + w_f R_f.$$

This portfolio has expected return $(1 - w_f)\mu_p + w_f\mu_f$, where $\mu_p = E(R_p)$, and
return variance

$$(1 - w_f)^2\sigma_p^2.$$

For $w_f \ge 0$, a plot of the expected return of the portfolio versus its risk is
simply a line segment starting at $(0, \mu_f)$ and passing through $(\sigma_p, \mu_p)$, similar
to the efficient frontier in Example 4.11.
Example 4.13 Suppose two risky assets have returns with means μ1 = 0.05
and μ2 = 0.15, respectively, standard deviations σ1 = σ2 = 0.25, and corre-
lation ρ12 = 0.125 and suppose that there is a risk-free asset with expected
return μf = 0.025. Figure 4.6 gives a plot of the efficient frontier (the solid
line) for portfolios of the two risky assets. Thus, in the first stage of the port-
folio selection problem, we choose a portfolio from among those corresponding
to the points on the efficient frontier.
In choosing such a portfolio, the fact that the portfolio will be combined
with the risk-free asset plays an important role. Figure 4.7 contains a plot
of the risk-expected-return pairs of the portfolio, consisting of the two risky
assets plus the risk-free asset, for a particular choice of a portfolio of risky
assets.

FIGURE 4.6
Efficient frontier for portfolios of the two risky assets in Example 4.13. (Axes: risk, horizontal; expected return, vertical.)


FIGURE 4.7
Risk, mean–return pairs corresponding to a particular portfolio of risky assets
in Example 4.13. (Axes: risk, horizontal; expected return, vertical; the point (0, μf ) is marked on the vertical axis.)

This plot illustrates a number of points regarding combining a portfolio of
risky assets with a risk-free asset.
• The line segment connecting (0, μf ) and the point on the curve corre-
sponds to portfolios placing weight wf , 0 ≤ wf ≤ 1, on the risk-free
asset and weight 1 − wf on the portfolio of risky assets corresponding
to the point on the curve.
• The dashed line segment extending beyond the curve corresponds to
portfolios with wf < 0, for which the investor borrows at the risk-free
rate in order to purchase the portfolio of risky assets. Note that such
a portfolio has a smaller expected return than does a portfolio of
risky assets alone with the same risk; the latter correspond to
the points on the solid curve that lie above the dashed line segment.
• Thus, the portfolio problem consists of choosing a point on the curve
(i.e., choosing a portfolio of risky assets) together with choosing a
point on the half-line that starts at (0, μf ) and passes through the
point on the curve (i.e., choosing wf ).

Let (σp , μp ) denote the risk, mean–return pair for a portfolio of risky
assets; for example, in Figure 4.7, (σp , μp ) is a point on the curve. Because all
half-lines starting at (0, μf ) and passing through (σp , μp ) have the same
starting point, the different possible half-lines may be described by their
slopes,
$$\frac{\mu_p - \mu_f}{\sigma_p}.$$


FIGURE 4.8
Comparison of two portfolios of risky assets in Example 4.13. (Axes: risk, horizontal; expected return, vertical.)

Therefore, choosing between two risky portfolios is essentially choosing
between two possible slopes for such lines; see Figure 4.8 for such a comparison
in the context of Example 4.13.
Note that the half-line with the larger slope (the dashed line) is preferred
for any level of risk; that is, for any desired level of risk, the portfolio corre-
sponding to the larger slope has a greater expected return. This suggests that
the portfolio of risky assets with the value of (σp , μp ) that yields the largest
possible slope for the line connecting (0, μf ) and (σp , μp ) is optimal.

Sharpe Ratio
Consider a portfolio of risky assets with expected return μp and risk σp . The
slope of the line connecting (0, μf ) and (σp , μp ) is known as the Sharpe ratio
of the portfolio; hence, the Sharpe ratio is given by
$$\mathrm{SR} = \frac{\mu_p - \mu_f}{\sigma_p}.$$

The Sharpe ratio has a useful interpretation as the expected excess return on
the portfolio per unit of risk.
Note that the portfolio giving weight wf to the risk-free asset and weight
1 − wf to the portfolio of risky assets has expected return

(1 − wf )μp + wf μf

and return standard deviation |1 − wf |σp .


Suppose that an investor would like to choose wf so that the expected
return on the portfolio consisting of the two risky assets together with the
risk-free asset is m for some value m > μf . This may be achieved by choosing
wf to solve
$$(1 - w_f)\mu_p + w_f \mu_f = m,$$
which gives
$$w_f = \frac{\mu_p - m}{\mu_p - \mu_f};$$
note that here we may assume that μp > μf because, otherwise, there would be no
reason to invest in the portfolio of risky assets, and that 0 ≤ wf < 1 provided
that μf < m ≤ μp . If m > μp , then wf < 0, indicating that the investor will
need to borrow capital at the risk-free rate in order to attain an expected
return of m.
The resulting portfolio has risk

$$\left|1 - \frac{\mu_p - m}{\mu_p - \mu_f}\right|\sigma_p = \frac{m - \mu_f}{\mu_p - \mu_f}\,\sigma_p = \frac{m - \mu_f}{\mathrm{SR}},$$
where SR denotes the Sharpe ratio of the portfolio of risky assets. Hence,
the risk of the portfolio with expected return m is inversely proportional to
the Sharpe ratio of the portfolio of risky assets. That is, we should choose the
portfolio of risky assets to have the largest Sharpe ratio possible.
Conversely, if wf is chosen to achieve a given level of risk, s, then either
wf = 1 + s/σp or wf = 1 − s/σp . It is easy to show that the second of these
yields the larger expected return,

$$\left(1 - \frac{s}{\sigma_p}\right)\mu_f + \frac{s}{\sigma_p}\,\mu_p = \mu_f + s\,(\mathrm{SR}).$$
Hence, the expected excess return of the portfolio with risk s is proportional to
the Sharpe ratio of the portfolio of risky assets, showing again that we should
choose the portfolio of risky assets to have the largest Sharpe ratio possible.
Thus, we have proven the following important result. Note the previous
analysis is not limited to the case of two risky assets; it applies to any portfolio
of risky assets.
Proposition 4.1. Consider a portfolio consisting of the risk-free asset and a
portfolio of risky assets. Then the optimal portfolio of risky assets is the one
with the largest Sharpe ratio.
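The argument leading to Proposition 4.1 can be checked numerically. The following R sketch compares two hypothetical portfolios of risky assets; the means, standard deviations, and target return are assumed values chosen only for illustration and are not taken from the text. For each portfolio, it computes the weight wf needed to reach the target expected return m and the resulting risk, which should equal (m − μf )/SR.

```r
# Illustrative inputs (assumed values, not from the text)
mu_f <- 0.025                      # expected return on the risk-free asset
port <- data.frame(
  name  = c("A", "B"),             # two hypothetical portfolios of risky assets
  mu    = c(0.10, 0.12),           # their expected returns
  sigma = c(0.20, 0.30)            # their return standard deviations
)
m <- 0.08                          # target expected return, m > mu_f

port$SR   <- (port$mu - mu_f) / port$sigma        # Sharpe ratios
port$w_f  <- (port$mu - m) / (port$mu - mu_f)     # weight on the risk-free asset
port$risk <- abs(1 - port$w_f) * port$sigma       # risk of the combined portfolio

print(port)

# The risk of the combined portfolio equals (m - mu_f) / SR
risk_from_SR <- (m - mu_f) / port$SR
```

Under these assumed inputs, portfolio A has the larger Sharpe ratio and therefore attains the target expected return with less risk than portfolio B.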

Tangency Portfolio
Thus, to construct the optimal portfolio of risky assets with returns R1 and
R2 , we find w̄1 , w̄2 , with w̄1 + w̄2 = 1, so that the portfolio with return

w̄1 R1 + w̄2 R2

has the maximum possible Sharpe ratio.


Write w̄1 = w and w̄2 = 1 − w. Our goal is to find the value of w that
maximizes
$$f(w) = \frac{\mu_p(w) - \mu_f}{\sigma_p(w)}$$
where
$$\mu_p(w) = w\mu_1 + (1 - w)\mu_2$$
and
$$\sigma_p^2(w) = w^2\sigma_1^2 + (1 - w)^2\sigma_2^2 + 2w(1 - w)\rho_{12}\sigma_1\sigma_2.$$
We may maximize f (w) using standard results from calculus. Using the
rule for taking the derivative of a ratio, it follows that
$$f'(w) = \frac{\mu_p'(w) - f(w)\,\sigma_p'(w)}{\sigma_p(w)};$$
hence, the solution to f′(w) = 0 solves
$$\frac{\mu_p'(w)}{\sigma_p'(w)} = f(w). \tag{4.5}$$

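It may be shown that f (w) is maximized by
$$w = w_T \equiv \frac{(\mu_1 - \mu_f)\sigma_2^2 - (\mu_2 - \mu_f)\rho_{12}\sigma_1\sigma_2}{(\mu_2 - \mu_f)\sigma_1^2 + (\mu_1 - \mu_f)\sigma_2^2 - [(\mu_1 - \mu_f) + (\mu_2 - \mu_f)]\rho_{12}\sigma_1\sigma_2}; \tag{4.6}$$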
see Proposition 5.7 for a proof of this result in a more general setting.
Hence, the Sharpe ratio of the portfolio of the two risky assets is maximized
by choosing w = wT , as given by (4.6). The portfolio corresponding to w = wT
is known as the tangency portfolio, a term that is based on an important
property of the line connecting (0, μf ) and (σp (wT ), μp (wT )).
Consider a curve (x(z), y(z)) parameterized by a real number z. The tangent
vector to the curve at z is given by (x′(z), y′(z)) and the slope of the
tangent vector is y′(z)/x′(z). Thus, μp′(w)/σp′(w) is the slope of the tangent
vector to the curve (σp (w), μp (w)) at w and the fact that the first-order
condition (4.5) is satisfied at w = wT may be interpreted as the condition
that the slope of the line connecting (0, μf ) and (σp (wT ), μp (wT )) is equal
to the slope of the tangent vector to the efficient frontier at w = wT . Since
(σp (wT ), μp (wT )) is on the efficient frontier, it follows that the line connecting
(0, μf ) and (σp (wT ), μp (wT )) is tangent to the efficient frontier.
Example 4.14 As in Example 4.13, consider two risky assets with expected
returns μ1 = 0.05 and μ2 = 0.15, respectively, return standard deviations
σ1 = σ2 = 0.25, and correlation of the returns given by ρ12 = 0.125, and sup-
pose that there is a risk-free asset with expected return μf = 0.025. Then,
using the formula (4.6), the weight given to asset 1 in the tangency portfolio
is wT = 1/14 ≈ 0.071. Therefore, the optimal portfolio of the risky assets is
obtained by placing weight 1/14 on asset 1 and weight 13/14 on asset 2.
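The calculation in Example 4.14 can be reproduced with a few lines of R. The sketch below evaluates the formula (4.6) using the parameter values of the example and, as a cross-check, maximizes the Sharpe ratio numerically with the base R function optimize; the variable names and the search interval (−1, 2) are choices made here for illustration.

```r
# Parameter values from Example 4.14
mu1 <- 0.05; mu2 <- 0.15          # expected returns
s1  <- 0.25; s2  <- 0.25          # return standard deviations
rho <- 0.125                      # correlation of the returns
mu_f <- 0.025                     # expected return on the risk-free asset

# Sharpe ratio of the portfolio placing weight w on asset 1
sharpe <- function(w) {
  mu_p <- w * mu1 + (1 - w) * mu2
  sd_p <- sqrt(w^2 * s1^2 + (1 - w)^2 * s2^2 + 2 * w * (1 - w) * rho * s1 * s2)
  (mu_p - mu_f) / sd_p
}

# Closed-form tangency weight from (4.6)
e1 <- mu1 - mu_f                  # excess expected return of asset 1
e2 <- mu2 - mu_f                  # excess expected return of asset 2
c12 <- rho * s1 * s2              # covariance of the two returns
w_T <- (e1 * s2^2 - e2 * c12) / (e2 * s1^2 + e1 * s2^2 - (e1 + e2) * c12)

# Numerical maximization of the Sharpe ratio (search interval is an assumption)
w_num <- optimize(sharpe, interval = c(-1, 2), maximum = TRUE)$maximum

c(closed_form = w_T, numerical = w_num)
```

The two values should agree up to the tolerance of the numerical search, and the closed-form value is 1/14.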


FIGURE 4.9
Tangency portfolio in Example 4.14. (Axes: risk, horizontal; expected return, vertical.)

Figure 4.9 shows the efficient frontier in this example with the location of
the tangency portfolio (the point shown on the efficient frontier) and a dashed
line representing the risk-expected return pairs for portfolios constructed from
the tangency portfolio and the risk-free asset.
Note that, since the efficient frontier of the two risky assets lies below the
dashed line, for any desired level of risk, a portfolio based on the tangency
portfolio and the risk-free asset will have an expected return at least as large
as (and usually larger than) that of any portfolio on the efficient frontier with
that level of risk.

Consider the problem of finding the best combination of risky assets, with
returns R1 and R2 , and risk-free asset, with return Rf ; that is, consider the
problem of finding the optimal portfolio return of the form

wf Rf + (1 − wf )[wR1 + (1 − w)R2 ].

The previous results show that the optimal solution is to first take w = wT ,
corresponding to the tangency portfolio; this gives the optimal combination
of risky assets. Let μT and σT denote the mean and standard deviation of the
return on the tangency portfolio.
Given a desired level of risk s, we then choose wf so that wf Rf + (1 − wf )RT ,
where RT denotes the return on the tangency portfolio, has a standard
deviation equal to s, that is, take wf = 1 − s/σT ; alternatively,
given a desired value for the expected return, m, we find wf so that

$$w_f \mu_f + (1 - w_f)\mu_T = m.$$


Note that, according to this theory, all investors should use the same com-
bination of risky assets; only the proportion of the tangency portfolio versus
the risk-free asset depends on the investor’s goals.
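The two-stage procedure can be sketched in R using the values of Example 4.14. The first stage fixes the risky portfolio at the tangency portfolio; the second stage chooses wf to match a desired level of risk. The target risk of 0.10 and the variable names are assumptions made only for this illustration.

```r
# Inputs from Example 4.14
mu   <- c(0.05, 0.15)             # expected returns of the two risky assets
s    <- c(0.25, 0.25)             # return standard deviations
rho  <- 0.125                     # correlation of the returns
mu_f <- 0.025                     # expected return on the risk-free asset
w_T  <- 1 / 14                    # weight on asset 1 in the tangency portfolio

# Stage 1: mean and standard deviation of the tangency portfolio
w_bar   <- c(w_T, 1 - w_T)
Sigma   <- matrix(c(s[1]^2, rho * s[1] * s[2],
                    rho * s[1] * s[2], s[2]^2), 2, 2)
mu_T    <- sum(w_bar * mu)
sigma_T <- sqrt(drop(w_bar %*% Sigma %*% w_bar))

# Stage 2: weight on the risk-free asset for a desired risk (assumed value)
s_target <- 0.10
w_f <- 1 - s_target / sigma_T

# Weights on the three assets and the resulting expected return
weights <- c(risk_free = w_f,
             asset1 = (1 - w_f) * w_bar[1],
             asset2 = (1 - w_f) * w_bar[2])
exp_ret <- w_f * mu_f + (1 - w_f) * mu_T

weights
exp_ret
```

A different investor would change only s_target (or a target expected return); the relative weights of the two risky assets stay at 1/14 and 13/14.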

4.7 Suggestions for Further Reading

The statistical approach to portfolio theory, as described in this book, orig-
inated with Markowitz (1952); hence, it is often referred to as Markowitz
portfolio theory. See Benninga (2008, Chapter 8), Elton et al. (2007, Chapters 4
and 5), Francis and Kim (2013, Chapter 6), and Fabozzi et al. (2006, Chap-
ter 2) for further discussion of the results of this chapter, along with a number of
extensions, from the financial modeling perspective. Ruppert (2004, Chapter 5)
and Sclove (2013, Section 6.2) consider these topics from more of a statistical
point of view. Bernstein (2001) discusses the mean-variance approach to
portfolio selection, focusing on the main ideas, rather than on technical results;
Francis and Kim (2013, Chapter 5) solve a number of portfolio problems using
graphical techniques, which some readers may find to be a useful alternative to
the more mathematical approach used here.

4.8 Exercises
1. Suppose that there are N assets, with log-returns r1 , r2 , . . . , rN .
Consider a portfolio placing weight wj on asset j, j = 1, 2, . . . , N
and let rp denote the log-return on the portfolio.
Is it true that

rp = w1 r1 + w2 r2 + · · · + wN rN ?

Why or why not?

2. Consider two assets. Suppose that the return on asset 1 has expected
value 0.05 and standard deviation 0.1 and suppose that the return
on asset 2 has expected value 0.02 and standard deviation 0.05.
Suppose that the asset returns have correlation 0.4.
Consider a portfolio placing weight w on asset 1 and weight
1 − w on asset 2; let Rp denote the return on the portfolio.
Find the mean and variance of Rp as a function of w.
3. For the assets described in Exercise 2, plot the opportunity set of
possible (σp , μp ) pairs.
4. Consider two assets. Suppose that the return on asset 1 has expected
value 0.02 and standard deviation 0.05 and suppose that the return
on asset 2 has expected value 0.04 and standard deviation 0.06.


Consider an equally weighted portfolio in which each asset
receives weight 1/2 and let Rp denote the return on the portfolio.
Find the expected value of Rp and the variance of Rp as
functions of ρ12 , the correlation of the returns on the two assets.
5. Consider two assets. Suppose that the return on asset 1 has expected
value 0.004 and standard deviation 0.05 and suppose that the return
on asset 2 has expected value 0.002 and standard deviation 0.06.
Suppose that the asset returns have correlation 0.2.
Find the portfolios based on these assets that have a return
standard deviation of 0.045.
6. Consider two assets. Suppose that the return on asset 1 has expected
value 0.08 and standard deviation 0.1 and suppose that the return
on asset 2 has expected value 0.02 and standard deviation 0.05.
Suppose that the asset returns have correlation 0.25.
Consider the opportunity set corresponding to these assets. For
each of the following pairs of portfolio return variances (s²) and
portfolio return means (m) determine whether or not the pair is an
element of the opportunity set.
a. s² = 0.00375 and m = 0.05
b. s² = 0.01 and m = 0.1
c. s² = 0.00625 and m = 0.065
7. Consider two assets. Suppose that the return on asset 1 has expected
value 0.02 and standard deviation 0.1 and suppose that the return
on asset 2 has expected value 0.005 and standard deviation 0.1.
Suppose that the asset returns have correlation 0.5.
Find the weight given to asset 1 in the efficient portfolio with
return standard deviation 0.09.
8. Consider two assets with returns R1 and R2 . Suppose that

E(R1 ) = E(R2 ).

Describe the opportunity set and the efficient frontier for these
assets.
9. For the assets described in Exercise 2 find the minimum-variance
portfolio.
10. Let wmv denote the weight given to asset 1 in the minimum-variance
portfolio. Find conditions on ρ12 , in terms of σ1 , σ2 , so that 0 <
wmv < 1.
11. For the assets described in Exercise 5, find the weight of asset 1 in
the risk-averse portfolio with parameter λ.


12. Consider the risk-averse portfolio based on the risk-aversion parameter
λ. Describe the properties of the portfolio in the limiting case
in which λ → ∞ and the limiting case in which λ → 0. Interpret
the results.
13. Let wλ denote the weight given to asset 1 in the risk-averse portfolio
with parameter λ. Suppose we wish to enforce the restriction that
the weight given to asset 1 is in the interval [0, 1]. Let w̄λ denote the
weight given to asset 1 that maximizes the risk-aversion criterion
function under this restriction.
Give an expression for w̄λ in terms of wλ .
14. For the assets described in Exercise 2, find the tangency portfolio.
Suppose that μf = 0.005.
15. Consider two risky assets with returns R1 and R2 , respectively, and
a risk-free asset with return Rf . For j = 1, 2, let μj = E(Rj ) and
σj² = Var(Rj ) and let ρ12 denote the correlation of R1 and R2 . Let
wT denote the weight given to asset 1 in the tangency portfolio.
Suppose that the values of σ1 and σ2 used in determining the
tangency portfolio are incorrect and that the correct values are ασ1
and ασ2 , respectively, for some α > 0. Compare the tangency port-
folio based on the correct values to the tangency portfolio based on
the incorrect values.
16. Consider two risky assets with returns R1 and R2 , respectively, and
a risk-free asset with return Rf . For j = 1, 2, let μj = E(Rj ) and
σj² = Var(Rj ) and let μf = E(Rf ).
Suppose that the Sharpe ratios for the two assets are equal:
$$\frac{\mu_1 - \mu_f}{\sigma_1} = \frac{\mu_2 - \mu_f}{\sigma_2}.$$
Show that wT , the weight given to asset 1 in the tangency portfolio,
depends only on σ1 and σ2 and give an expression for wT .
17. Consider two assets with returns R1 and R2 , respectively, and a
risk-free asset with return Rf . For j = 1, 2, let μj = E(Rj ) and
σj² = Var(Rj ) and let μf = E(Rf ).
Suppose that μ1 = μf and that σ1 > σ2 . Assume that μ2 > μf .
Give conditions under which 0 < wT . Interpret your results.
18. Consider two assets with returns R1 and R2 , respectively, and
a risk-free asset with return Rf . For j = 1, 2, let μj = E(Rj ),
σj² = Var(Rj ), and let ρ12 denote the correlation of R1 and R2 . Let
μf = E(Rf ).
Let SRj = (μj − μf )/σj denote the Sharpe ratio of asset j, j =
1, 2. Find conditions on SR1 , SR2 , and ρ12 under which the tangency
portfolio consists entirely of asset 1.


19. Let R1 and R2 denote the returns on two assets. Suppose that
R2 = R1 + ε where E(ε) = 0 and R1 and ε are uncorrelated. Assume
that Var(ε) > 0. Thus, the return on asset 2 is equal to the return
on asset 1 plus “noise.”
a. Find the minimum variance portfolio of R1 , R2 .
b. Find the tangency portfolio of R1 , R2 .


5
Efficient Portfolio Theory

5.1 Introduction
In Chapter 4, we considered the problem of constructing a portfolio of two
risky assets, possibly together with a risk-free asset. The focus of this chapter
is the extension of those results to the general case of N risky assets, for an
integer N ≥ 2.
Although the basic approaches we will consider are the same as the ones
used in the N = 2 case, there are a number of important details in which
the general N case is different. The most obvious difference is in the scale
of the problem. When N = 2, the portfolio is described by a single weight,
w, representing the investment in asset 1, with 1 − w invested in asset 2; the
mean and variance of the portfolio return are linear and quadratic functions,
respectively, of w. The statistical properties of the portfolio returns depend
on five parameters: two mean returns, two return standard deviations, and
the correlation of the returns.
In the general N case, the mean and variance of the portfolio return
are functions of the weights w1 , w2 , . . . , wN ; because the weights must sum
to 1, there are effectively N − 1 weights that must be selected. The mean
and standard deviation of the portfolio return depend on N asset expected
returns, N return standard deviations, and N (N − 1)/2 correlations between
the returns on different assets, representing the ways in which the asset returns
are interrelated.
Perhaps the most important mathematical difference between the two cases
is that, in the two-asset case, there is only one portfolio with a given mean
return. That is, if we require a portfolio mean return of 0.02, for example, this
specifies the value of w that must be used. Furthermore, this value of w deter-
mines the value of the return standard deviation σp , so that there is only a single
possible value of σp for a given value of μp . That is, σp is a function of μp . When
N > 2, typically there are infinitely many portfolios with a given mean return.

5.2 Portfolios of N Assets

Consider a set of N assets with returns R1 , R2 , . . . , RN and consider a portfolio
placing weight wj on asset j; then w1 + w2 + · · · + wN = 1. The return on the


portfolio with weights w1 , w2 , . . . , wN is a function of the random variables
R1 , R2 , . . . , RN , namely $R_p = \sum_{j=1}^{N} w_j R_j$, and hence, the properties of Rp depend on
the properties of R1 , R2 , . . . , RN .
For instance, the expected return of the portfolio, E(Rp ), is a simple
function of the expected returns on the individual assets,

$$E(R_p) = \sum_{j=1}^{N} w_j E(R_j) = \sum_{j=1}^{N} w_j \mu_j$$

where μj = E(Rj ), j = 1, 2, . . . , N .
The standard deviation of Rp depends on the standard deviations of R1 ,
R2 , . . . , RN , but it also depends on the relationships among R1 , R2 , . . . , RN ,
as measured by their covariances or correlations. Specifically,

$$\mathrm{Var}(R_p) = \sum_{j=1}^{N} w_j^2\,\mathrm{Var}(R_j) + 2\sum_{i<j} w_i w_j\,\mathrm{Cov}(R_i, R_j) \tag{5.1}$$

where the second summation in this expression is the sum over all i, j from 1
to N for which i is less than j.
Let σj denote the standard deviation of Rj , j = 1, 2, . . . , N , let σij denote
the covariance of Ri , Rj for i, j = 1, 2, . . . , N , i ≠ j, and let σp denote the
standard deviation of the portfolio with weights w1 , w2 , . . . , wN . Then (5.1)
may be written

$$\sigma_p^2 = \sum_{j=1}^{N} w_j^2 \sigma_j^2 + 2\sum_{i<j} w_i w_j \sigma_{ij}. \tag{5.2}$$

Alternatively, σp may be expressed in terms of the asset return standard
deviations together with their correlations, ρij = σij /(σi σj ):

$$\sigma_p^2 = \sum_{j=1}^{N} w_j^2 \sigma_j^2 + 2\sum_{i<j} w_i w_j \rho_{ij} \sigma_i \sigma_j.$$

Matrix Notation
In the case of N assets, the expressions for the expected return and risk of a
portfolio may be conveniently expressed using matrix notation. Let
$$R = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_N \end{pmatrix}$$

or, equivalently, R = (R1 , R2 , . . . , RN )ᵀ, denote the vector of asset returns; note
that all vectors used here will be column vectors. Then R is a random vector;


it has mean vector

$$\mu = E(R) = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_N \end{pmatrix}.$$

Similarly, the portfolio weights w1 , w2 , . . . , wN describing a portfolio can
be represented by a weight vector w = (w1 , w2 , . . . , wN )ᵀ. Then
$R_p = \sum_{j=1}^{N} w_j R_j$, which may be written Rp = wᵀR.
Using this notation, the mean return on the portfolio is given by

$$\mu_p = \sum_{j=1}^{N} w_j \mu_j = w^T \mu,$$

which is simply the “dot product,” or inner product, of the mean and weight
vectors. The requirement that $\sum_{j=1}^{N} w_j = 1$ may be written $1_N^T w = 1$. Here
1N denotes the N -dimensional vector of all ones; when the dimension is clear
from the context, we will write it simply as 1. Because inner products are
symmetric in their arguments, we may also write μp = μᵀw and wᵀ1 = 1.
The simplification achieved by matrix notation is most apparent when
considering the variance of a portfolio return. Let Σ denote the covariance
matrix of R, the N × N matrix given by
$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1N} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{N1} & \sigma_{N2} & \cdots & \sigma_N^2 \end{pmatrix}.$$

Thus, Σij , the (i, j)th element of Σ, is Cov(Ri , Rj ) for i ≠ j and is Var(Ri ) for
i = j. Because Cov(Ri , Rj ) = Cov(Rj , Ri ), so that σij = σji , Σ is a symmetric
matrix.
A covariance matrix gives a particularly simple way of expressing the variance
of a linear function of a random vector. Let a denote an N -dimensional
vector, that is, an element of ℝ^N ; then $a^T R = \sum_{j=1}^{N} a_j R_j$ and

$$\mathrm{Var}(a^T R) = a^T \Sigma a.$$

To see why such a result holds, write a = (a1 , a2 , . . . , aN )ᵀ; then

$$\Sigma a = \begin{pmatrix} a_1\sigma_1^2 + a_2\sigma_{12} + \cdots + a_N\sigma_{1N} \\ a_1\sigma_{21} + a_2\sigma_2^2 + a_3\sigma_{23} + \cdots + a_N\sigma_{2N} \\ \vdots \\ a_1\sigma_{N1} + a_2\sigma_{N2} + \cdots + a_{N-1}\sigma_{N,N-1} + a_N\sigma_N^2 \end{pmatrix}$$


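and
$$\begin{aligned}
a^T \Sigma a &= \begin{pmatrix} a_1 & a_2 & \cdots & a_N \end{pmatrix}
\begin{pmatrix} a_1\sigma_1^2 + a_2\sigma_{12} + \cdots + a_N\sigma_{1N} \\ a_1\sigma_{21} + a_2\sigma_2^2 + a_3\sigma_{23} + \cdots + a_N\sigma_{2N} \\ \vdots \\ a_1\sigma_{N1} + a_2\sigma_{N2} + \cdots + a_{N-1}\sigma_{N,N-1} + a_N\sigma_N^2 \end{pmatrix} \\
&= (a_1^2\sigma_1^2 + a_1 a_2\sigma_{12} + \cdots + a_1 a_N\sigma_{1N}) \\
&\quad + (a_2 a_1\sigma_{21} + a_2^2\sigma_2^2 + a_2 a_3\sigma_{23} + \cdots + a_2 a_N\sigma_{2N}) \\
&\quad + \cdots + (a_N a_1\sigma_{N1} + \cdots + a_N a_{N-1}\sigma_{N,N-1} + a_N^2\sigma_N^2).
\end{aligned}$$

Note that, in this sum, each term of the form aj²σj² occurs once and each term
of the form aj ak σjk , j < k, occurs twice; hence, the sum is equal to

$$\sum_{j=1}^{N} a_j^2 \sigma_j^2 + 2\sum_{j<k} a_j a_k \sigma_{jk}.$$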

In particular, (5.1) for the variance of the return on the portfolio based on
the weight vector w may be written using matrix notation as

Var(Rp ) = wT Σw.

Furthermore, the covariance matrix may be used to calculate the covariance
of two linear functions of a random vector. Let a, b be elements of ℝ^N .
Then
$$\mathrm{Cov}(a^T R, b^T R) = a^T \Sigma b. \tag{5.3}$$
Conversely, if for a given N × N matrix A,

$$\mathrm{Cov}(a^T R, b^T R) = a^T A b$$

for any a, b ∈ ℝ^N , then A must be the covariance matrix of R.

Example 5.1 Consider a set of four assets and suppose that the return vector
R has mean vector (0.10, 0.20, 0.05, 0.10)T and covariance matrix
$$\Sigma = \begin{pmatrix} 0.05 & 0.01 & 0.02 & 0 \\ 0.01 & 0.10 & 0.05 & 0.02 \\ 0.02 & 0.05 & 0.20 & 0.10 \\ 0 & 0.02 & 0.10 & 0.20 \end{pmatrix}.$$

Consider the portfolio with the weight vector (0.20, 0.30, 0.10, 0.40)T .
To calculate the mean and variance of the portfolio return in R, we may
use the following commands.

> mu<-c(0.10, 0.20, 0.05, 0.10)
> Sigma<-matrix(c(0.05, 0.01, 0.02, 0, 0.01, 0.10, 0.05, 0.02,
+ 0.02, 0.05, 0.20, 0.10, 0, 0.02, 0.10, 0.20), 4, 4)
> Sigma
[,1] [,2] [,3] [,4]
[1,] 0.05 0.01 0.02 0.00
[2,] 0.01 0.10 0.05 0.02
[3,] 0.02 0.05 0.20 0.10
[4,] 0.00 0.02 0.10 0.20
> w<-c(0.2, 0.3, 0.1, 0.4)
> sum(w*mu)
[1] 0.125
> w%*%Sigma%*%w
[,1]
[1,] 0.0628
Therefore, the return on the portfolio has mean 0.125 and variance 0.0628; it
follows that the return standard deviation is given by
> (0.0628)^.5
[1] 0.251
In these calculations, the product w*mu of vectors w and mu creates a new
vector with the jth element given by the product of the jth elements of w and
mu; hence, sum(w*mu) returns the mean return corresponding to the weight
vector w and mean vector mu. The function matrix is used to construct a
matrix from a vector; note that the entries in the matrix are populated by
column, unless byrow=T is specified. The matrix multiplication operator in R is
%*%; in the expression w%*%Sigma%*%w, the first w is automatically interpreted
as a row vector and the second w is interpreted as a column vector.
Consider a second portfolio, with the weight vector (0.50, 0.10, 0.10, 0.30)T .
Then the covariance of the returns on the two portfolios may be calculated by
> w1<-c(0.5, 0.1, 0.1, 0.3)
> w1%*%Sigma%*%w
[,1]
[1,] 0.0487
Thus, the covariance of the returns on the two portfolios is 0.0487.
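As a check on the equivalence of the summation form (5.1) and the matrix form of the portfolio variance, the following R sketch computes both for the first portfolio of Example 5.1. The explicit double loop is written only for illustration; in practice the matrix expression is preferable.

```r
# Weight vector and covariance matrix from Example 5.1
w <- c(0.2, 0.3, 0.1, 0.4)
Sigma <- matrix(c(0.05, 0.01, 0.02, 0,    0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10, 0,    0.02, 0.10, 0.20), 4, 4)

# Summation form (5.1): variance terms plus twice the covariance terms with i < j
N <- length(w)
var_sum <- sum(w^2 * diag(Sigma))
for (i in 1:(N - 1)) {
  for (j in (i + 1):N) {
    var_sum <- var_sum + 2 * w[i] * w[j] * Sigma[i, j]
  }
}

# Matrix form; drop() converts the 1 x 1 matrix to a number
var_mat <- drop(w %*% Sigma %*% w)

c(summation_form = var_sum, matrix_form = var_mat)
```

Both values equal 0.0628, the variance reported in the example.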

Because the variance of aᵀR must be nonnegative for any a ∈ ℝ^N , we
must have
$$a^T \Sigma a \ge 0 \quad \text{for all } a \in \mathbb{R}^N;$$
a matrix satisfying this property is said to be nonnegative definite; hence, all
covariance matrices are nonnegative definite.
If, in addition,

$$a^T \Sigma a = 0 \quad \text{if and only if} \quad a = 0_N,$$

where 0N denotes the zero vector in ℝ^N , then Σ is said to be positive definite.


In the present context, Σ is positive definite provided that there is no
nontrivial linear function of R1 , R2 , . . . , RN that has zero variance (a “trivial”
linear function being one with all coefficients equal to zero). In particular,
since portfolio weights must sum to 1, if the covariance matrix of the returns is
positive definite, any portfolio has positive variance. Unless stated otherwise,
we will always assume that the covariance matrix of any return vector is
positive definite.
Example 5.2 Consider the covariance matrix Sigma described in Example
5.1. One way to determine if a symmetric matrix is positive definite is to
compute its eigenvalues; if the eigenvalues are all positive, then the matrix
is positive definite; if the eigenvalues are all nonnegative, then the matrix is
nonnegative definite.
Therefore, to determine if Sigma is positive definite, we can compute its
eigenvalues using the eigen function:
> eigen(Sigma)$values
[1] 0.3127 0.1188 0.0731 0.0455
All eigenvalues are positive; therefore, the matrix is positive definite.
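Note that we may write Σ in terms of the asset return standard deviations
σ1 , σ2 , . . . , σN and their correlations, ρij , i, j = 1, . . . , N , i ≠ j. Using the fact
that σij = ρij σi σj , we may write
$$\Sigma = \begin{pmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_N \end{pmatrix}
\begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1N} \\ \rho_{21} & 1 & \cdots & \rho_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{N1} & \rho_{N2} & \cdots & 1 \end{pmatrix}
\begin{pmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_N \end{pmatrix}.$$
The matrix
$$C = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1N} \\ \rho_{21} & 1 & \cdots & \rho_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{N1} & \rho_{N2} & \cdots & 1 \end{pmatrix}$$
is known as the correlation matrix of R and the matrix
$$V = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_N^2 \end{pmatrix}$$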


is known as the variance matrix of R. Then

$$\Sigma = V^{1/2} C V^{1/2}$$

where $V^{1/2}$ denotes the matrix square root of V , the diagonal matrix with the
asset standard deviations on the diagonal.
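The decomposition of Σ into standard deviations and correlations can be verified in R for the covariance matrix of Example 5.1. The sketch below uses the base R function cov2cor to obtain the correlation matrix; the object names are choices made here for illustration.

```r
# Covariance matrix from Example 5.1
Sigma <- matrix(c(0.05, 0.01, 0.02, 0,    0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10, 0,    0.02, 0.10, 0.20), 4, 4)

sd_vec <- sqrt(diag(Sigma))       # asset return standard deviations
V_half <- diag(sd_vec)            # square root of the variance matrix V
C <- cov2cor(Sigma)               # correlation matrix of R

# Rebuild Sigma from its standard deviations and correlations
Sigma_rebuilt <- V_half %*% C %*% V_half
all.equal(Sigma_rebuilt, Sigma)
```

The final line should return TRUE, confirming that the product of the three matrices reproduces Σ up to rounding error.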
The requirement that Σ be nonnegative definite can be expressed in terms
of C: using the fact that all σj are nonnegative, Σ is nonnegative definite
if and only if C is nonnegative definite. This condition on C is the N -asset
analogue of the requirement that −1 ≤ ρ12 ≤ 1 in the two-asset case.
Similarly, Σ is positive definite if and only if all σj are positive and C
is positive definite, the analogue of the condition that −1 < ρ12 < 1 in the
two-asset case. In particular, the condition that −1 < ρij < 1 for all i ≠ j is
not sufficient for the return covariance matrix Σ to be positive definite.
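To see that correlations strictly between −1 and 1 do not guarantee a valid correlation matrix, consider the following R sketch. The matrix C below is a hypothetical example constructed for this illustration: assets 1 and 2 and assets 1 and 3 are strongly positively correlated, while assets 2 and 3 are strongly negatively correlated.

```r
# A symmetric matrix with unit diagonal and all off-diagonal entries in (-1, 1)
C <- matrix(c( 1.0,  0.9,  0.9,
               0.9,  1.0, -0.9,
               0.9, -0.9,  1.0), 3, 3)

# Eigenvalues, returned in decreasing order
ev <- eigen(C, symmetric = TRUE)$values
ev
min(ev) < 0                       # TRUE: C is not nonnegative definite

# The eigenvector for the smallest eigenvalue gives a linear combination
# whose "variance" a' C a would be negative
a <- eigen(C, symmetric = TRUE)$vectors[, 3]
neg_var <- drop(t(a) %*% C %*% a)
neg_var
```

Because one eigenvalue is negative, no random vector can have C as its correlation matrix, even though every pairwise correlation lies strictly between −1 and 1.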

Diversification
In Section 4.2, we saw the benefits of a diversified portfolio of two assets.
When there are N assets available, the analysis is more complicated but the
potential benefits of diversification are even greater.
Example 5.3 Consider a set of N assets and suppose that the returns on the
assets all have mean μ, standard deviation σ, and that they are uncorrelated.
Therefore, μ, the mean vector of the returns, may be written μ1, and Σ,
the covariance matrix of the returns, is σ2 IN where IN denotes the N × N
identity matrix; when the dimension of the identity matrix is clear from the
context, we will write it simply as I.
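Consider an equally-weighted portfolio with weights w1 = w2 = · · · = wN =
1/N , which we may write as w = (1/N )1. Then the expected return on the
portfolio is
$$w^T \boldsymbol{\mu} = \frac{1}{N}\mathbf{1}^T(\mu\mathbf{1}) = \frac{\mu}{N}\mathbf{1}^T\mathbf{1} = \frac{\mu}{N}N = \mu$$
and the variance of the portfolio return is
$$w^T \Sigma w = \left(\frac{1}{N}\mathbf{1}\right)^T (\sigma^2 I) \left(\frac{1}{N}\mathbf{1}\right) = \frac{\sigma^2}{N^2}\mathbf{1}^T I \mathbf{1} = \frac{\sigma^2}{N};$$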
note that, for any matrix A, 1T A1 is the sum of all elements of A. Thus, the
larger the number of assets under consideration, the smaller is the variance
of the equally-weighted portfolio; that is, the larger the number of assets,
the greater is the benefit of diversification, at least in this simple setting.
Furthermore, when the asset returns are uncorrelated, as we have assumed,
the portfolio variance approaches zero as N increases.
The previous example shows that when there are a large number of assets
with uncorrelated returns, then it is possible to construct a portfolio with
a small standard deviation. The following example shows that having the
returns be uncorrelated is important for this result to hold.


Example 5.4 Consider the same scenario as in Example 5.3, except that now
assume that the correlation of any two asset returns is ρ, where 0 < ρ < 1.
Then,
$$\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho & \cdots & \cdots & \rho \\ \rho & 1 & \rho & \cdots & \rho \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \rho & \cdots & \rho & 1 & \rho \\ \rho & \cdots & \cdots & \rho & 1 \end{pmatrix}. \tag{5.4}$$

Recall that Σ must be a positive-definite matrix; thus, all the eigenvalues of
Σ must be positive. Note that all the rows of Σ sum to σ²[1 + (N − 1)ρ]; hence, the
vector 1 is an eigenvector of Σ and, because Σ1 = σ²[1 + (N − 1)ρ]1, it follows
that σ²[1 + (N − 1)ρ] is an eigenvalue of Σ. Hence, we must have ρ > −1/(N − 1)
and, for large N , this lower bound is close to 0. It may be shown that the
remaining eigenvalues of Σ are all σ²(1 − ρ) so that ρ must satisfy −1/(N − 1) <
ρ < 1.
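Because
$$\mathbf{1}^T \Sigma \mathbf{1} = N\sigma^2 + (N^2 - N)\rho\sigma^2,$$

the variance of the equally-weighted portfolio is given by

$$\sigma_p^2 = w^T \Sigma w = \frac{\sigma^2}{N} + \left(1 - \frac{1}{N}\right)\rho\sigma^2.$$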

Therefore, for any 0 < ρ < 1, σp < σ. However, unlike the case of uncorrelated
assets, if ρ > 0, the standard deviation of the portfolio return cannot be made
arbitrarily small by including more assets: as N increases, the portfolio
variance decreases toward ρσ² but never falls below it.
If ρ is negative, it is possible for σp to be close to zero. However, a negative
value of ρ is generally unrealistic—try to imagine a large number of assets such
that, for any pair, a greater than average return on one corresponds to less
than average returns on the others.
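The limit on diversification described in this example can be illustrated numerically. The sketch below is an illustration under assumed values of sigma and rho; the helper name equal_weight_var is hypothetical. It compares the variance of the equally-weighted portfolio, computed directly from the covariance matrix, with the closed-form expression and with the floor ρσ².

```r
# Variance of the equally-weighted portfolio when every pair of assets
# has correlation rho and every asset has standard deviation sigma.
# (Hypothetical helper; sigma and rho below are illustrative values.)
equal_weight_var <- function(N, sigma, rho) {
  Sigma <- sigma^2 * ((1 - rho) * diag(N) + rho * matrix(1, N, N))
  w <- rep(1 / N, N)
  drop(t(w) %*% Sigma %*% w)
}

sigma <- 0.3
rho <- 0.4
Ns <- c(2, 10, 100, 1000)

v <- sapply(Ns, equal_weight_var, sigma = sigma, rho = rho)
closed <- sigma^2 / Ns + (1 - 1 / Ns) * rho * sigma^2

# Direct computation, closed form, and the limiting value rho * sigma^2
cbind(N = Ns, numeric = v, closed_form = closed, floor = rho * sigma^2)
```

The variance decreases with N but stays above rho * sigma^2, in contrast to the uncorrelated case, where it is sigma^2 / N.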
It may be shown that, for Σ of the form (5.4), the minimum possible
variance of a portfolio return is
$$\frac{\sigma^2}{N} + \left(1 - \frac{1}{N}\right)\rho\sigma^2$$
so that the equally-weighted portfolio has the smallest possible return
variance; this result will be discussed in Example 5.7.

Although the specific results given here depend on the form of Σ, the basic
conclusion holds more generally. That is, when asset returns are positively
correlated, as is usually the case, diversification generally reduces the risk of
a portfolio, but there are limits to its benefits.


5.3 Minimum-Risk Frontier

We begin by describing the set of mean returns and return standard deviations
that are available to the investor based on a given set of assets. Let R denote
an N × 1 vector of asset returns, with mean vector μ and covariance matrix Σ.
Therefore, the return on the portfolio based on a weight vector w is given by
wT R, which has mean μp = wT μ and standard deviation σp , where σ2p =
wT Σw.
Consider portfolios with an expected return of m, where m is a given real
number. This is achieved by a portfolio corresponding to a weight vector w
satisfying
wT μ = m; (5.5)
of course, the weights must sum to 1 so that the requirement (5.5) is in addition
to the requirement that wT 1 = 1. It follows that, for N > 2, assuming there
are at least three distinct values in the vector μ, there are infinitely many
weight vectors leading to a portfolio with expected return m.
Therefore, when N > 2, the set of (σp , μp ) pairs for all possible portfolios
is a region in $\mathbb{R}^2$, known as the opportunity set; recall that in the N = 2
case, the opportunity set is a curve. Figure 5.1 shows the general form of the
opportunity set, that is, the set of possible risk-mean pairs, for a typical N > 2
case.
Let Rp1 and Rp2 denote the returns on two portfolios, both with mean
return m, for some real number m. Because E(Rp1 ) = E(Rp2 ), if Var(Rp1 ) <
Var(Rp2 ), then portfolio 1 is preferred to portfolio 2. That is, although there
are infinitely many portfolios with a given mean return, we are only interested
in the one with the smallest return standard deviation.

FIGURE 5.1
Opportunity set for an N > 2 case (horizontal axis: σp; vertical axis: μp).

FIGURE 5.2
Minimum risk corresponding to an expected return of m (horizontal axis: σp;
vertical axis: μp, with the level m marked).

Graphically, the portfolio with a given mean return m that has the small-
est standard deviation corresponds to the leftmost value on the line segment
of points (σp , m) in the opportunity set; see Figure 5.2, where the value of
(σp , μp ) for a minimum risk portfolio with mean m is indicated with a dot.
The solid horizontal line represents the possible values of σp corresponding to
μp = m.
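The weight vector for this portfolio may be obtained by minimizing
Var(Rp ) subject to the restriction that E(Rp ) = m. That is, we solve a
constrained minimization problem:
$$\begin{aligned} \underset{w \in \mathbb{R}^N}{\text{minimize}} \quad & w^T \Sigma w \\ \text{subject to} \quad & w^T \mu = m \\ & w^T \mathbf{1} = 1. \end{aligned} \tag{5.6}$$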

The weight vector for the minimum-risk portfolio for any given mean return
m is the solution to a minimization problem of this type. In terms of the
opportunity set of possible pairs (σp , μp ), as shown in Figure 5.1, solving (5.6)
for each value of m finds the left boundary of the set; see Figure 5.3. This
boundary is known as the minimum-risk frontier.
For convenience, we will use the term “minimum-risk frontier” to refer
to risk-expected return pairs of the form (σp , μp ) as well as to refer to the
corresponding portfolios and their weight vectors. For instance, the statement
that a weight vector is on the minimum-risk frontier means that the
value of (σp , μp ) for the portfolio corresponding to that weight vector is on
the minimum-risk frontier.

FIGURE 5.3
An example of a minimum-risk frontier (horizontal axis: σp; vertical axis: μp).

The minimum-risk frontier corresponds to the portfolios with the smallest
risk for a given expected return. Note that the minimum-risk frontier for the
N -asset case is similar in many respects to the opportunity set in the two-asset
case. For instance, it includes portfolios for which there is a portfolio with the
same risk but a larger expected return.
The efficient frontier consists of those portfolios with the smallest risk for
a given expected return as well as the largest expected return for a given risk;
recall that this is the same definition used in the N = 2 case. Efficient frontiers
for an arbitrary number of assets will be discussed in detail in Section 5.5.
The remainder of this section is devoted to describing the minimum-risk
frontier. Although the results are somewhat technical, they all rely on a simple
result from linear algebra, the Cauchy–Schwarz inequality, together with some
basic properties of quadratic functions of a single variable.
The following lemma gives a statement of the Cauchy–Schwarz inequality;
the proof, which is available in many books on linear algebra, is omitted.

Lemma 5.1 (Cauchy–Schwarz Inequality). Let x and y be elements of
$\mathbb{R}^n$, for some n = 1, 2, . . . . Then
$$(x^T y)^2 \le (x^T x)(y^T y)$$
with equality if and only if either one of x and y is the zero vector or x = cy
for some scalar c.


Because $x^T y \le |x^T y|$, the Cauchy–Schwarz inequality also implies that
$$x^T y \le (x^T x)^{1/2} (y^T y)^{1/2}$$
with equality if either one of x, y is the zero vector or x = cy for some c > 0.

Zero-Investment Portfolios
Consider a vector $v \in \mathbb{R}^N$ such that $v^T \mathbf{1} = 0$. Then the portfolio based on the
vector v, that is, with return v T R, has no net investment—the coordinates
of v that are greater than zero are offset by one or more coordinates that take
negative values. A portfolio based on weights given by such a vector is said to
be a zero-investment portfolio; the vector v will be called a zero-investment
weight vector, to distinguish such vectors from standard weight vectors that
sum to 1.
Let w1 , w2 be two portfolio weight vectors. Then v = w1 − w2 satisfies
v T 1 = 0 so that w1 − w2 is a zero-investment weight vector. Conversely, if
w is a portfolio weight vector and v is a zero-investment weight vector, then
w + v is also a portfolio weight vector.
Define a set of N -dimensional vectors by
$$V_0 = \{v \in \mathbb{R}^N : v^T \mathbf{1} = 0,\ v^T \mu = 0\}. \tag{5.7}$$
If δ ∈ V0 , then δT R is the return on a zero-investment portfolio that has zero
expected return; in this chapter, δ generally will be used to denote an element
of V0 .
Elements of V0 play a central role in describing the minimum-risk frontier
because if w satisfies the constraints $w^T \mathbf{1} = 1$ and $w^T \mu = m$ and v ∈ V0 ,
then w + v also satisfies these constraints:
$$(w + v)^T \mathbf{1} = w^T \mathbf{1} + v^T \mathbf{1} = 1 + 0 = 1$$
and
$$(w + v)^T \mu = w^T \mu + v^T \mu = m + 0 = m.$$
Conversely, if w1 and w2 are weight vectors satisfying the constraint that
$w_j^T \mu = m$ for j = 1, 2, then w1 − w2 ∈ V0 . Note that V0 is a linear subspace
of $\mathbb{R}^N$, with dimension N − 2 provided that μ is not a multiple of 1.
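A basis for V0 can be computed numerically as the orthogonal complement of the span of 1 and μ. The sketch below uses a QR decomposition in base R for this; it is one convenient construction, not the approach of the text, and the mean vector is an illustrative assumption (the values that appear in Example 5.5).

```r
# Illustrative mean vector (the values used in Example 5.5)
mu <- c(0.10, 0.20, 0.05, 0.10)
N <- length(mu)

# Columns of C are the two constraint directions: 1 and mu
C <- cbind(rep(1, N), mu)

# Full QR decomposition: the first two columns of Q span the columns
# of C, so the last N - 2 columns form an orthonormal basis of V0
Q <- qr.Q(qr(C), complete = TRUE)
B <- Q[, 3:N, drop = FALSE]

# Every basis vector is orthogonal to both 1 and mu
round(t(B) %*% C, 12)

# Adding an element of V0 to a weight vector preserves both constraints
w <- rep(1 / N, N)
v <- drop(B %*% c(0.3, -0.2))
c(sum(w), sum(w + v))              # weights still sum to 1
c(sum(w * mu), sum((w + v) * mu))  # mean return is unchanged
```

The number of columns of B is N − 2, matching the dimension of V0 when μ is not a multiple of 1.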

Characterizing the Minimum-Risk Frontier

Note that the constrained minimization problem (5.6) is an N -dimensional
problem; that is, the decision variable in the minimization problem is
N -dimensional. However, we can describe its solutions by looking at all
“one-dimensional subproblems” of this N -dimensional problem. This is a
useful technique that will appear several times in this and later chapters.
One result of this type is the following proposition, which gives a simple
characterization of the minimum-risk frontier.


Proposition 5.1. Consider a vector of portfolio returns R with mean vector
μ and covariance matrix Σ. The portfolio with weight vector $\hat{w}$ is on the
minimum-risk frontier if and only if
$$\hat{w}^T \Sigma \delta = 0 \quad \text{for all } \delta \in V_0.$$

Proof. Suppose that the portfolio corresponding to $\hat{w}$ is on the minimum-risk
frontier. Let $\hat{m} = \hat{w}^T \mu$ so that $\hat{w}$ solves the constrained minimization problem
(5.6) with $m = \hat{m}$. Note that this implies that $\hat{w}^T \mathbf{1} = 1$. Let δ be an element
of V0 ; we want to show that $\hat{w}^T \Sigma \delta = 0$.
Note that for any real number z, $\hat{w} + z\delta$ satisfies
$$(\hat{w} + z\delta)^T \mathbf{1} = \hat{w}^T \mathbf{1} + z\delta^T \mathbf{1} = 1 + z(0) = 1$$
and
$$(\hat{w} + z\delta)^T \mu = \hat{w}^T \mu + z\delta^T \mu = \hat{m} + z(0) = \hat{m}.$$
That is, weight vectors of the form $\hat{w} + z\delta$ satisfy the constraints in the
minimization problem (5.6).
Define
$$f(z) = (\hat{w} + z\delta)^T \Sigma (\hat{w} + z\delta), \quad z \in \mathbb{R}.$$
Because $\hat{w}$ is the weight vector of the minimum-risk portfolio with mean
return $\hat{m}$, and for any z, $\hat{w} + z\delta$ is the weight vector of a portfolio with mean
return $\hat{m}$, it follows that f (z) must be minimized at z = 0. Note that the
function f (z) is a quadratic function of the real-valued variable z. Therefore, we
may use the properties of the function f (·) of a single variable to describe the
properties of a portfolio on the minimum-risk frontier.
Because f is a quadratic, and hence differentiable, function of z that is
minimized at z = 0, we must have $f'(0) = 0$. Note that
$$f(z) = \hat{w}^T \Sigma \hat{w} + 2z\,\hat{w}^T \Sigma \delta + z^2\,\delta^T \Sigma \delta$$
so that
$$f'(z) = 2\hat{w}^T \Sigma \delta + 2z\,\delta^T \Sigma \delta$$
and $f'(0) = 2\hat{w}^T \Sigma \delta$. It follows that $\hat{w}^T \Sigma \delta = 0$. Because the value of δ in V0
is arbitrary, we have shown that if the portfolio with weight vector $\hat{w}$ is on
the minimum-risk frontier, then
$$\hat{w}^T \Sigma \delta = 0 \quad \text{for all } \delta \in V_0.$$

Now suppose that $\hat{w} \in \mathbb{R}^N$ satisfies $\hat{w}^T \mathbf{1} = 1$ and $\hat{w}^T \Sigma \delta = 0$ for all δ ∈ V0 .
We must show that this condition implies that the portfolio with the weight
vector $\hat{w}$ is on the minimum-risk frontier.
Let $\hat{m} = \hat{w}^T \mu$. We need to show that, for any weight vector w satisfying
$w^T \mu = \hat{m}$,
$$\hat{w}^T \Sigma \hat{w} \le w^T \Sigma w.$$

Note that $w - \hat{w} \in V_0$; it follows that
$$\hat{w}^T \Sigma (w - \hat{w}) = 0. \tag{5.8}$$
Consider $w^T \Sigma w$. To take advantage of (5.8), write
$$w^T \Sigma w = (\hat{w} + (w - \hat{w}))^T \Sigma (\hat{w} + (w - \hat{w}))$$
and expand this expression as
$$\hat{w}^T \Sigma \hat{w} + 2\hat{w}^T \Sigma (w - \hat{w}) + (w - \hat{w})^T \Sigma (w - \hat{w}). \tag{5.9}$$
Note that here we have used the fact that, because $\hat{w}^T \Sigma (w - \hat{w})$ is a scalar,
$$\hat{w}^T \Sigma (w - \hat{w}) = (w - \hat{w})^T \Sigma \hat{w}.$$
By (5.8), the cross-product term in (5.9) is zero. Therefore,
$$w^T \Sigma w = \hat{w}^T \Sigma \hat{w} + (w - \hat{w})^T \Sigma (w - \hat{w}).$$
Furthermore, because Σ is positive definite,
$$(w - \hat{w})^T \Sigma (w - \hat{w}) \ge 0.$$
It follows that $w^T \Sigma w \ge \hat{w}^T \Sigma \hat{w}$. Because this holds for any w satisfying
$w^T \mathbf{1} = 1$ and $w^T \mu = \hat{m}$, it follows that $\hat{w}$ solves the constrained minimization
problem (5.6) with $m = \hat{m}$; that is, $\hat{w}$ is on the minimum-risk frontier.
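The condition in Proposition 5.1 can be checked numerically. The sketch below rests on assumptions not made in the text: it uses the standard Lagrange-multiplier solution of (5.6), $\hat{w} = \Sigma^{-1} C (C^T \Sigma^{-1} C)^{-1} b$ with $C = (\mathbf{1}, \mu)$ and $b = (1, m)^T$, wrapped in a hypothetical helper min_risk_weights, and it borrows the mean vector and covariance matrix that appear in Example 5.5.

```r
# Data as in Example 5.5 (used here only for illustration)
mu <- c(0.10, 0.20, 0.05, 0.10)
Sigma <- matrix(c(0.05, 0.01, 0.02, 0.00,
                  0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10,
                  0.00, 0.02, 0.10, 0.20), 4, 4)

# Hypothetical helper: closed-form solution of problem (5.6) obtained
# from the Lagrange conditions, w = Sigma^{-1} C (C' Sigma^{-1} C)^{-1} b
min_risk_weights <- function(m, mu, Sigma) {
  C <- cbind(1, mu)
  SinvC <- solve(Sigma, C)
  drop(SinvC %*% solve(t(C) %*% SinvC, c(1, m)))
}

w_hat <- min_risk_weights(0.2, mu, Sigma)

# Orthonormal basis of V0: orthogonal complement of span{1, mu}
B <- qr.Q(qr(cbind(1, mu)), complete = TRUE)[, 3:4, drop = FALSE]

# Proposition 5.1: w_hat' Sigma delta = 0 for every delta in V0
drop(t(w_hat) %*% Sigma %*% B)

# The function f(z) from the proof, for one particular delta in V0
delta <- drop(B %*% c(0.5, -1))
z <- seq(-1, 1, by = 0.25)
f <- sapply(z, function(zz) {
  w <- w_hat + zz * delta
  drop(t(w) %*% Sigma %*% w)
})
cbind(z, f)   # the smallest value of f occurs at z = 0
```

Moving away from w_hat along any direction in V0 keeps both constraints satisfied but increases the variance, which is the content of the proof.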

Proposition 5.1 is an important result in portfolio theory. Its importance
does not lie in finding specific portfolios on the minimum-risk frontier, because
simple numerical methods are available for that purpose; rather, the result is
important because it yields several useful properties of portfolios on the
minimum-risk frontier.
For instance, it follows that portfolios on the minimum-risk frontier are
unique in the sense that, if two portfolios on the minimum risk frontier have
the same mean return, then they must have the same weight vector. This
result is formally stated in the following corollary to Proposition 5.1; the
proof is left as an exercise.

Corollary 5.1. For j = 1, 2, let R̂pj denote the return on the portfolio with
weight vector ŵj . Suppose that both portfolios are on the minimum-risk
frontier and that E(R̂p1 ) = E(R̂p2 ). Then ŵ1 = ŵ2 .

The conclusion of Proposition 5.1 may also be expressed as a property
of the covariances of portfolio returns. This result is given in the following
corollary; the proof is left as an exercise.
Corollary 5.2. Let $\hat{R}_p$ denote the return on a portfolio on the minimum-risk
frontier and let R0 denote the return on a zero-investment portfolio that has
zero expected return. Then
$$\operatorname{Cov}(\hat{R}_p, R_0) = 0.$$
Alternatively, let R1 , R2 denote the returns on two assets satisfying
E(R1 ) = E(R2 ). Then
$$\operatorname{Cov}(\hat{R}_p, R_1) = \operatorname{Cov}(\hat{R}_p, R_2).$$

Therefore, according to Corollary 5.2, if $\hat{R}_p$ is the return of a portfolio on
the minimum-risk frontier and R is the return on any asset, then $\operatorname{Cov}(\hat{R}_p, R)$
is a function of E(R).

Portfolios Constructed from Portfolios on the Minimum-Risk Frontier
An important feature of the necessary and sufficient conditions given in
Proposition 5.1 is that if weight vectors $\hat{w}_1$ and $\hat{w}_2$ are on the minimum-risk frontier
then affine combinations of $\hat{w}_1$, $\hat{w}_2$ are on the minimum-risk frontier; that is,
weight vectors of the form
$$z\hat{w}_1 + (1 - z)\hat{w}_2, \quad -\infty < z < \infty$$
are on the minimum-risk frontier. Note that a weight vector of this form
may be viewed as the weight vector of a portfolio constructed from the two
portfolios having weight vectors $\hat{w}_1$ and $\hat{w}_2$, respectively.
To establish this result, first note that
$$(z\hat{w}_1 + (1 - z)\hat{w}_2)^T \mathbf{1} = z\hat{w}_1^T \mathbf{1} + (1 - z)\hat{w}_2^T \mathbf{1} = z(1) + (1 - z)(1) = 1$$
using the fact that $\hat{w}_1^T \mathbf{1}$ and $\hat{w}_2^T \mathbf{1}$ are both 1. Furthermore, if $\hat{w}_j^T \Sigma \delta = 0$ for
all δ ∈ V0 , for j = 1, 2 then for all δ ∈ V0
$$(z\hat{w}_1 + (1 - z)\hat{w}_2)^T \Sigma \delta = z\hat{w}_1^T \Sigma \delta + (1 - z)\hat{w}_2^T \Sigma \delta = 0.$$
It now follows from Proposition 5.1 that the portfolio with weight vector
$z\hat{w}_1 + (1 - z)\hat{w}_2$ is on the minimum-risk frontier.
A formal statement of this result is given in the following corollary. Although
the result in Corollary 5.3 applies to two portfolios, clearly it can be extended
to a portfolio formed from a finite number of portfolios on the minimum-risk
frontier.


Corollary 5.3. Suppose that $\hat{w}_1$ and $\hat{w}_2$ are the weight vectors of two
portfolios on the minimum-risk frontier. Then, for any $z \in \mathbb{R}$, $z\hat{w}_1 + (1 - z)\hat{w}_2$ is
the weight vector of a portfolio on the minimum-risk frontier.

Let $\hat{w}_1$ and $\hat{w}_2$ be the two weight vectors in Corollary 5.3 and let
$m_j = \hat{w}_j^T \mu$, j = 1, 2; that is, the portfolio with weight vector $\hat{w}_j$ has mean return
$m_j$, j = 1, 2. Note that the portfolio with weight vector $z\hat{w}_1 + (1 - z)\hat{w}_2$ has
mean return $zm_1 + (1 - z)m_2$; if $m_1 \ne m_2$, then any real number can be written
as $zm_1 + (1 - z)m_2$ for some z. Therefore, according to Corollary 5.3,
together with the uniqueness result in Corollary 5.1, the weight vector of any
portfolio on the minimum-risk frontier can be written in terms of the weight
vectors $\hat{w}_1$ and $\hat{w}_2$; the details are given in the following result.

Lemma 5.2. Let m1 and m2 denote distinct real numbers; for j = 1, 2, let
$\hat{w}_j$ denote the weight vector of the minimum-risk portfolio with mean return
$m_j$. Then, for any given $m \in \mathbb{R}$,
$$w_m \hat{w}_1 + (1 - w_m)\hat{w}_2$$
is the weight vector of the minimum-risk portfolio with mean return m, where
$$w_m = \frac{m - m_2}{m_1 - m_2}.$$
Lemma 5.2 shows that the entire minimum-risk frontier may be generated
from two portfolios; thus, it is sometimes called the two-fund theorem. This
result shows that, in some respects, portfolio theory for an arbitrary number
of assets is essentially the same as portfolio theory for the case of two assets,
as discussed in the previous chapter.
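The two-fund result in Lemma 5.2 is easy to check numerically. The following sketch is an illustration under stated assumptions: it uses the standard closed-form Lagrange solution of (5.6) in a hypothetical helper min_risk_weights (this formula is not derived in the text), the data that appear in Example 5.5, and arbitrarily chosen mean returns m1, m2, and m.

```r
# Data as in Example 5.5 (used here only for illustration)
mu <- c(0.10, 0.20, 0.05, 0.10)
Sigma <- matrix(c(0.05, 0.01, 0.02, 0.00,
                  0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10,
                  0.00, 0.02, 0.10, 0.20), 4, 4)

# Hypothetical helper: closed-form solution of problem (5.6)
min_risk_weights <- function(m, mu, Sigma) {
  C <- cbind(1, mu)
  SinvC <- solve(Sigma, C)
  drop(SinvC %*% solve(t(C) %*% SinvC, c(1, m)))
}

# Two "funds" on the minimum-risk frontier (m1 and m2 are arbitrary)
m1 <- 0.10
m2 <- 0.20
w1 <- min_risk_weights(m1, mu, Sigma)
w2 <- min_risk_weights(m2, mu, Sigma)

# Target mean return and the mixing weight w_m from Lemma 5.2
m <- 0.25
wm <- (m - m2) / (m1 - m2)

# Combination of the two funds versus the directly computed solution
w_combo <- wm * w1 + (1 - wm) * w2
w_direct <- min_risk_weights(m, mu, Sigma)
rbind(w_combo, w_direct)
```

Here wm is negative because the target mean lies outside the interval between m1 and m2, which corresponds to a short position in the first fund.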
For instance, the variance of a portfolio on the minimum-risk frontier is a
quadratic function of its mean return; that is, the set of (mean, variance) pairs
for portfolios on the minimum-risk frontier is a parabola. It follows that there
is a minimum possible variance of portfolios on the minimum-risk frontier and
this minimum is achieved by a single portfolio, called the minimum-variance
portfolio. The properties of the minimum-variance portfolio will be discussed
in the following section. Furthermore, if one portfolio on the minimum-risk
frontier has a given variance, then, unless that variance is the minimum
variance, there is a second portfolio on the minimum-risk frontier with the
same variance, as was illustrated in Figure 4.1.

Calculating the Weight Vector of a Portfolio on the Minimum-Risk Frontier
The results thus far in this section give some important properties of the
minimum-risk frontier. We now consider the problem of calculating the weight
vector that solves the constrained minimization problem (5.6) for a particular
value of m, given values for Σ and μ.
A constrained minimization problem of this form is an example of a
quadratic programming problem. The type of quadratic programming problems
we will consider here generally have three components: a quadratic objective
function of the form
$$\frac{1}{2} x^T D x - d^T x$$
where x is the decision variable, a vector taking values in $\mathbb{R}^N$, D is a known
N × N matrix and d is a known N × 1 vector; a set of equality constraints of
the form $A^T x = b$ where A is a known N × k matrix and b is a known k × 1
vector; and a set of inequality constraints on the elements of x. Note that
the minimization problem (5.6) does not include any inequality constraints;
hence, here we consider minimizing the objective function subject to equality
constraints. Portfolio problems with inequality constraints will be considered
in Section 5.8.
Quadratic programming problems of this type are well-studied and soft-
ware to obtain numerical solutions is widely available. In R, we can use the
function solve.QP in the package quadprog (Turlach and Weingessel 2013)
to solve a general quadratic programming problem as described previously.
The function solve.QP has several arguments; Dmat corresponds to the
matrix D, dvec corresponds to the vector d, Amat corresponds to the matrix
A, and bvec corresponds to the vector b. There is an additional argument meq,
which specifies the number of equality constraints: the first meq columns of
Amat are treated as equality constraints, with the remaining columns
corresponding to inequality constraints.
Because, in the present context, all constraints are equality constraints, the
value given for meq is simply the number of columns of Amat or, equivalently,
the number of rows of t(Amat), the transpose of Amat.
The following example describes how to use R to ﬁnd the weight vectors
of portfolios on the minimum-risk frontier.

Example 5.5 Consider the set of four assets described in Example 5.1, with
mean return vector $(0.10, 0.20, 0.05, 0.10)^T$ and return covariance matrix
$$\Sigma = \begin{pmatrix} 0.05 & 0.01 & 0.02 & 0 \\ 0.01 & 0.10 & 0.05 & 0.02 \\ 0.02 & 0.05 & 0.20 & 0.10 \\ 0 & 0.02 & 0.10 & 0.20 \end{pmatrix}.$$

These are stored in the R variables mu and Sigma, respectively.

> mu
[1] 0.10 0.20 0.05 0.10
> Sigma
[,1] [,2] [,3] [,4]
[1,] 0.05 0.01 0.02 0.00
[2,] 0.01 0.10 0.05 0.02
[3,] 0.02 0.05 0.20 0.10
[4,] 0.00 0.02 0.10 0.20


Suppose we wish to find the portfolio on the minimum-risk frontier that
has mean return 0.2. Then the objective function is
$$w^T \Sigma w$$
and the constraints are $w^T \mathbf{1} = 1$ and $w^T \mu = 0.2$, which may also be written
as $\mathbf{1}^T w = 1$ and $\mu^T w = 0.2$.
Therefore, in the notation of solve.QP, d is the zero vector of length 4,
D is 2Σ, A is a 4 × 2 matrix with the first column given by a vector of all
ones and the second column given by the mean vector μ, and b is the vector
(1, 0.2). Thus, the constraint $A^T w = b$ specifies that the portfolio weights
sum to 1 and that the mean return on the portfolio is 0.2.
Therefore, the R commands needed to solve this constrained optimization
problem are as follows:

> library(quadprog)
> A<-cbind(c(1,1,1,1), mu)
> t(A)
[,1] [,2] [,3] [,4]
1.0 1.0 1.00 1.0
0.1 0.2 0.05 0.1
> mrf1<-solve.QP(Dmat=2*Sigma, dvec=rep(0,4), Amat=A, bvec=c(1, 0.2),
+ meq=2)

Note that cbind combines two vectors or matrices by the columns; when the
vector c(1,1,1,1) is used in this context, it is interpreted as a column vector.
The weight vector that minimizes the objective function is the component
$solution of the result of the function solve.QP; therefore, the solution to
the constrained minimization problem is

> mrf1$solution
[1] 0.362 0.813 -0.374 0.199

Thus, the mean and standard deviation of the return on the portfolio
corresponding to the weight vector mrf1$solution are given by

> sum(mrf1$solution*mu)
[1] 0.2
> (mrf1$solution%*%Sigma%*%mrf1$solution)^.5
[,1]
[1,] 0.265

The calculations are easily repeated for other values of m. For example,
for m = 0.25,

> mrf2<-solve.QP(Dmat=2*Sigma, dvec=rep(0,4), Amat=A, bvec=c(1, 0.25),
+ meq=2)
> mrf2$solution
[1] 0.179 1.198 -0.604 0.227
> sum(mrf2$solution*mu)
[1] 0.25
> (mrf2$solution%*%Sigma%*%mrf2$solution)^.5
[,1]
[1,] 0.373
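The calculation can be repeated over a grid of mean returns to trace out the minimum-risk frontier. The sketch below is a cross-check that does not require quadprog: it assumes the standard closed-form Lagrange solution of the equality-constrained problem (5.6), wrapped in a hypothetical helper min_risk_weights, rather than the solve.QP call used in the example. The grid m_grid is an arbitrary choice.

```r
# Data from Example 5.5
mu <- c(0.10, 0.20, 0.05, 0.10)
Sigma <- matrix(c(0.05, 0.01, 0.02, 0.00,
                  0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10,
                  0.00, 0.02, 0.10, 0.20), 4, 4)

# Hypothetical helper: closed-form solution of problem (5.6),
# w = Sigma^{-1} C (C' Sigma^{-1} C)^{-1} b with C = (1, mu), b = (1, m)
min_risk_weights <- function(m, mu, Sigma) {
  C <- cbind(1, mu)
  SinvC <- solve(Sigma, C)
  drop(SinvC %*% solve(t(C) %*% SinvC, c(1, m)))
}

# Mean and standard deviation of the minimum-risk portfolio for each m
m_grid <- seq(0.05, 0.30, by = 0.05)
frontier <- t(sapply(m_grid, function(m) {
  w <- min_risk_weights(m, mu, Sigma)
  c(mean = sum(w * mu), sd = sqrt(drop(t(w) %*% Sigma %*% w)))
}))
round(frontier, 3)

# Weights for m = 0.2, for comparison with mrf1$solution
round(min_risk_weights(0.2, mu, Sigma), 3)
```

The rows for m = 0.2 and m = 0.25 should agree, up to rounding, with the standard deviations obtained from solve.QP in the example; plotting frontier[, "sd"] against frontier[, "mean"] gives a curve of the kind shown in Figure 5.3.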

5.4 The Minimum-Variance Portfolio

Choosing from among the portfolios on the minimum-risk frontier requires
some consideration of the relative importance of the mean return and risk of
a portfolio. In this section, we consider the simplest analysis of this type, based
on the belief that only risk is important when evaluating a portfolio. Under
this assumption, the portfolio with minimum return variance is optimal; we
will call this portfolio the minimum-variance portfolio.
As in the N = 2 case discussed in the previous chapter, the minimum-
variance portfolio also provides a useful reference point for evaluating the
mean and standard deviation of the returns on portfolios, and it will play a
role in several results in this chapter.
Let $w_{mv}$ denote the weight vector of the minimum-variance portfolio and
let $R_{mv} = w_{mv}^T R$ denote the corresponding portfolio return. Then
$$w_{mv}^T \Sigma w_{mv} \le w^T \Sigma w \quad \text{for any } w \in \mathbb{R}^N \text{ such that } w^T \mathbf{1} = 1.$$
The following proposition gives a useful characterization of the minimum-
variance portfolio, stating that the covariance of the return on the minimum-
variance portfolio and any other portfolio is constant, not depending on the
portfolio under consideration. The proof proceeds by showing that, if this
were not the case, then we could construct a portfolio with variance smaller
than that of the minimum-variance portfolio.
Proposition 5.2. A portfolio with return $\hat{R}$ is the minimum-variance portfolio
if and only if
$$\operatorname{Cov}(\hat{R}, R_p) = \operatorname{Var}(\hat{R}) \tag{5.10}$$
for $R_p = w^T R$, for any weight vector w.
Proof. First suppose that $\hat{R}$ is the return on the minimum-variance portfolio
so that $\hat{R} = R_{mv}$. Let w be the weight vector corresponding to a portfolio
with return $R_p$. For a given real number z, consider the portfolio with weight
vector $w_{mv} + z(w - w_{mv})$, which has return $R_{mv} + z(R_p - R_{mv})$. Define
$$f(z) = \operatorname{Var}(R_{mv} + z(R_p - R_{mv})), \quad -\infty < z < \infty.$$
Because $R_{mv}$ is the return on the minimum-variance portfolio, f (z) is
minimized at z = 0. Note that f (z) is a quadratic function of z; hence, $f'(0) = 0$.
By expanding the variance used to define f (z),
$$f(z) = \operatorname{Var}(R_{mv}) + 2z\operatorname{Cov}(R_{mv}, R_p - R_{mv}) + z^2 \operatorname{Var}(R_p - R_{mv}).$$
It follows that
$$f'(0) = 2\operatorname{Cov}(R_{mv}, R_p - R_{mv})$$
so that
$$\operatorname{Cov}(R_{mv}, R_p - R_{mv}) = 0;$$
using properties of covariance,
$$\operatorname{Cov}(R_{mv}, R_p - R_{mv}) = \operatorname{Cov}(R_{mv}, R_p) - \operatorname{Cov}(R_{mv}, R_{mv}) = \operatorname{Cov}(R_{mv}, R_p) - \operatorname{Var}(R_{mv})$$
so that
$$\operatorname{Cov}(R_{mv}, R_p) = \operatorname{Var}(R_{mv}),$$
as stated in the proposition.
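Now suppose that $\operatorname{Cov}(\hat{R}, R_p) = \operatorname{Var}(\hat{R})$ holds for any portfolio return $R_p$.
Because
$$\begin{aligned} \operatorname{Var}(R_p) &= \operatorname{Var}\left(\hat{R} + (R_p - \hat{R})\right) \\ &= \operatorname{Var}(\hat{R}) + 2\operatorname{Cov}(\hat{R}, R_p - \hat{R}) + \operatorname{Var}(R_p - \hat{R}) \\ &= \operatorname{Var}(\hat{R}) + 2\left(\operatorname{Cov}(\hat{R}, R_p) - \operatorname{Var}(\hat{R})\right) + \operatorname{Var}(R_p - \hat{R}) \\ &= \operatorname{Var}(\hat{R}) + \operatorname{Var}(R_p - \hat{R}), \end{aligned}$$
it follows that
$$\operatorname{Var}(R_p) \ge \operatorname{Var}(\hat{R}).$$
Because this holds for any portfolio return $R_p$, $\hat{R}$ must be the return on
the minimum-variance portfolio.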

One consequence of Proposition 5.2 is that the return on the minimum-
variance portfolio is uncorrelated with the return on any zero-investment
portfolio. Note that, if there were a zero-investment portfolio with weight
vector v such that $R_{mv}$ and $R_0 = v^T R$ are correlated, then we could find a
constant c such that the portfolio with return $R_{mv} + cR_0$ has a smaller return
variance than does $R_{mv}$.
The following corollary gives a formal statement of this result; the proof
is left as an exercise.

Corollary 5.4. Let Rmv denote the return on the minimum-variance portfo-
lio and let R0 denote the return on a zero-investment portfolio. Then

Cov(Rmv , R0 ) = 0.

The characterization of the minimum-variance portfolio given in
Proposition 5.2 can be used to suggest a form for $w_{mv}$, the weight vector of the
minimum-variance portfolio.
Let R1 , R2 , . . . , RN denote the returns on the N assets under
consideration. Then, treating each asset as a portfolio in Proposition 5.2, for any
j = 1, 2, . . . , N
$$\operatorname{Cov}(R_{mv}, R_j) = \operatorname{Cov}(R_j, R_{mv}) = \operatorname{Cov}(R_j, w_{mv}^T R) = e_j^T \Sigma w_{mv} = c \tag{5.11}$$
for some constant c, where $e_j$ denotes the jth column of the N × N identity
matrix; that is, it is the vector in $\mathbb{R}^N$ consisting of all zeros, except for the
jth element, which is 1. Of course, by Proposition 5.2, the constant c must be
$\operatorname{Var}(R_{mv})$; however, that fact is not needed here.
Therefore, for each j = 1, 2, . . . , N ,
$$e_j^T \Sigma w_{mv} = c$$
and combining these, it follows that
$$I \Sigma w_{mv} = c\mathbf{1}$$
so that
$$\Sigma w_{mv} = c\mathbf{1}.$$
It follows that
$$w_{mv} = c\Sigma^{-1} \mathbf{1}. \tag{5.12}$$
Because the weights in $w_{mv}$ must sum to 1, c must be
$$\frac{1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}.$$
The following result uses the Cauchy–Schwarz inequality to show directly
that the weight vector of the minimum-variance portfolio is of the form given
in (5.12).
Proposition 5.3. Let R denote the return vector for a set of assets and let
Σ denote the covariance matrix of R. Then the weight vector of the
minimum-variance portfolio is given by
$$w_{mv} = \frac{\Sigma^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}.$$
Proof. The variance of the return on a portfolio based on weight vector w is
given by $w^T \Sigma w$, which can be written
$$(\Sigma^{1/2} w)^T (\Sigma^{1/2} w).$$
Using the Cauchy–Schwarz inequality with $x = \Sigma^{1/2} w$ and $y = \Sigma^{-1/2} \mathbf{1}$,
$$\left((\Sigma^{1/2} w)^T (\Sigma^{-1/2} \mathbf{1})\right)^2 \le \left((\Sigma^{1/2} w)^T (\Sigma^{1/2} w)\right)\left((\Sigma^{-1/2} \mathbf{1})^T (\Sigma^{-1/2} \mathbf{1})\right) \tag{5.13}$$
with equality if $\Sigma^{1/2} w = c\Sigma^{-1/2} \mathbf{1}$ for some scalar c.
Note that (5.13) may be written
$$(w^T \mathbf{1})^2 \le \left(w^T \Sigma w\right)\left(\mathbf{1}^T \Sigma^{-1} \mathbf{1}\right)$$
and because $w^T \mathbf{1} = 1$,
$$w^T \Sigma w \ge \frac{1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}$$
with equality if $w = c\Sigma^{-1} \mathbf{1}$; that is, the weight vector of the minimum-
variance portfolio must be of the form $c\Sigma^{-1} \mathbf{1}$ for some constant c. Since
the weights must sum to 1, we must have
$$c = \frac{1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}},$$
proving the result.

Example 5.6 Consider a set of three assets, with mean returns 0.25, 0.125,
and 0.3, respectively, and suppose that the returns have covariance matrix
$$\Sigma = \begin{pmatrix} 0.25 & 0.1 & 0.24 \\ 0.1 & 0.16 & 0.096 \\ 0.24 & 0.096 & 0.36 \end{pmatrix}. \tag{5.14}$$
Hence, the asset returns have standard deviations 0.5, 0.4, and 0.6,
respectively, and their correlation matrix is
$$\begin{pmatrix} 1 & 0.5 & 0.8 \\ 0.5 & 1 & 0.4 \\ 0.8 & 0.4 & 1 \end{pmatrix}. \tag{5.15}$$

The mean vector and covariance matrix may be entered into R using the
commands

> mu<-c(0.25, 0.125, 0.3)

> Sig<-matrix(c(0.25, 0.1, 0.24, 0.1, 0.16, 0.096, 0.24, 0.096,
+ 0.36),3,3)
> Sig
[,1] [,2] [,3]
[1,] 0.25 0.100 0.240
[2,] 0.10 0.160 0.096
[3,] 0.24 0.096 0.360

To calculate the weights of the minimum-variance portfolio, we may use the
solve function. Let A denote an m × m invertible matrix and let b denote
an m × 1 vector; let A and b denote the corresponding R variables. Then
solve(A, b) returns $A^{-1} b$; the function with the second argument omitted,
that is, solve(A), returns $A^{-1}$.


Thus, wmv may be calculated by

> w0<-solve(Sig, c(1,1,1))

> w_mv<-w0/sum(w0)
> w_mv
[1] 0.243 0.713 0.044
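The properties of the minimum-variance portfolio stated in Proposition 5.2 and Corollary 5.4 can be verified for this example. The sketch below recomputes w_mv from the covariance matrix (5.14); the zero-investment weight vector v is an arbitrary illustrative choice.

```r
# Covariance matrix (5.14) from Example 5.6
Sig <- matrix(c(0.25, 0.1, 0.24, 0.1, 0.16, 0.096, 0.24, 0.096,
                0.36), 3, 3)

# Minimum-variance weights: Sigma^{-1} 1, rescaled to sum to 1
w0 <- solve(Sig, rep(1, 3))
w_mv <- w0 / sum(w0)
var_mv <- drop(t(w_mv) %*% Sig %*% w_mv)

# Proposition 5.2: the covariance of each asset return with R_mv
# equals Var(R_mv), so all entries of Sig %*% w_mv coincide
drop(Sig %*% w_mv)
var_mv

# The same value via the constant c = 1 / (1' Sigma^{-1} 1)
1 / sum(w0)

# Corollary 5.4: zero covariance with a zero-investment portfolio
v <- c(1, -2, 1)   # arbitrary weights summing to zero
drop(t(v) %*% Sig %*% w_mv)
```

Since sum(w0) is $\mathbf{1}^T \Sigma^{-1} \mathbf{1}$, the last two quantities printed before the zero-investment check give the minimum variance in two equivalent ways.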

Example 5.7 Consider a set of N assets with covariance matrix of the form
$$\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho & \cdots & \cdots & \rho \\ \rho & 1 & \rho & \cdots & \rho \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \rho & \cdots & \rho & 1 & \rho \\ \rho & \cdots & \cdots & \rho & 1 \end{pmatrix} \tag{5.16}$$
where 0 ≤ ρ < 1. Recall that, under this condition, Σ is positive-definite; see
Example 5.4. Thus, for this covariance matrix, all asset returns have standard
deviation σ and the correlation between any two returns is ρ.
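To calculate the weight vector of the minimum-variance portfolio we need
$\Sigma^{-1} \mathbf{1}$. Recall from Example 5.4 that 1 is an eigenvector of Σ; with the factor
σ² in (5.16) included, the corresponding eigenvalue is σ²(1 + (N − 1)ρ). It follows that
$$\mathbf{1} = \Sigma^{-1} \Sigma \mathbf{1} = \sigma^2 (1 + (N - 1)\rho)\,\Sigma^{-1} \mathbf{1}$$
and, hence, that
$$\Sigma^{-1} \mathbf{1} = \frac{1}{\sigma^2 (1 + (N - 1)\rho)}\,\mathbf{1}.$$
Because
$$\mathbf{1}^T \Sigma^{-1} \mathbf{1} = \frac{N}{\sigma^2 (1 + (N - 1)\rho)},$$
it follows that the weight vector of the minimum-variance portfolio is given by
$$\frac{\Sigma^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} = \frac{1}{N}\mathbf{1}$$
so that the equally weighted portfolio is the minimum-variance portfolio.
To find the variance of the minimum-variance portfolio, we use the fact
that
$$\frac{1}{N}\mathbf{1}^T \Sigma\, \frac{1}{N}\mathbf{1} = \frac{\sigma^2}{N^2}\left(N + N(N - 1)\rho\right) = \frac{\sigma^2}{N} + \left(1 - \frac{1}{N}\right)\rho\sigma^2.$$
This is the minimum possible variance for a portfolio based on a return vector
with a covariance matrix of the form (5.16).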
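The conclusion of this example can be confirmed numerically by applying the general formula of Proposition 5.3 to a matrix of the form (5.16). In the sketch below, the values of N, sigma, and rho are illustrative assumptions.

```r
# Covariance matrix of the form (5.16); N, sigma, rho are illustrative
N <- 6
sigma <- 0.3
rho <- 0.4
Sigma <- sigma^2 * ((1 - rho) * diag(N) + rho * matrix(1, N, N))

# Minimum-variance weights from Proposition 5.3
w0 <- solve(Sigma, rep(1, N))
w_mv <- w0 / sum(w0)
w_mv   # every weight equals 1 / N

# Minimum variance: general formula versus the closed form above
var_general <- 1 / sum(w0)
var_closed <- sigma^2 / N + (1 - 1 / N) * rho * sigma^2
c(general = var_general, closed_form = var_closed)
```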


5.5 The Efficient Frontier

The minimum-risk frontier consists of those portfolios that have the smallest
return standard deviation for a given value of the mean return. However, for
some portfolios on the minimum-risk frontier, there is another portfolio with
the same risk, but with a larger mean return; see Section 4.4 for a similar
situation in the N = 2 case.
The efficient frontier consists of those portfolios on the minimum-risk
frontier that also have the largest expected return for a given level of risk.
For instance, the efficient frontier corresponding to Figure 5.3 is given in
Figure 5.4; as shown in the graph, it corresponds to the “top half” of the
minimum-risk frontier.
Therefore, when choosing a portfolio based on the mean and standard
deviation of the portfolio return, there is never any reason to choose one that is
not on the efficient frontier. Like “minimum-risk frontier,” the term “efficient
frontier” will refer to risk, expected-return pairs, as well as to their
corresponding portfolios and weight vectors. A portfolio on the efficient frontier
will be said to be an efficient portfolio and the set of all efficient portfolios is
also known as the efficient set. The efficient frontier is a fundamental concept
in portfolio theory and in this section we consider its properties.
The example illustrated in Figure 4.2 suggests that, if two portfolios on
the minimum-risk frontier have the same risk, but different mean returns,
then one of the portfolios has a mean return greater than the mean return on
the minimum-variance portfolio and the other has a mean return less than that
of the minimum-variance portfolio. The following result shows that this is, in
fact, the case.

FIGURE 5.4
An example of an efficient frontier (horizontal axis: σp; vertical axis: μp).

Lemma 5.3. Suppose there are two distinct portfolios on the minimum-risk
frontier, with returns $R_{p1}$ and $R_{p2}$, respectively. Let $\mu_j = E(R_{pj})$, j = 1, 2
and suppose that $\operatorname{Var}(R_{p1}) = \operatorname{Var}(R_{p2})$. Then
$$\mu_{mv} = \frac{1}{2}(\mu_1 + \mu_2)$$
where $\mu_{mv}$ denotes the mean return on the minimum-variance portfolio.
It follows that either
$$\mu_1 < \mu_{mv} < \mu_2 \quad \text{or} \quad \mu_2 < \mu_{mv} < \mu_1.$$

Proof. Let $w_j$ denote the weight vector corresponding to return $R_{pj}$, j = 1, 2.
Because the two portfolios are on the minimum-risk frontier, and they have
different mean returns (by Corollary 5.1, because the portfolios are distinct),
it follows from Lemma 5.2 that the weight vector of
any portfolio on the minimum-risk frontier may be written
$$w_z = zw_1 + (1 - z)w_2$$
for some real number z.
Note that the variance of the return on the portfolio based on weight vector
$w_z$ is given by
$$\begin{aligned} w_z^T \Sigma w_z &= z^2 w_1^T \Sigma w_1 + (1 - z)^2 w_2^T \Sigma w_2 + 2z(1 - z) w_2^T \Sigma w_1 \\ &= \left(z^2 + (1 - z)^2\right) w_1^T \Sigma w_1 + 2z(1 - z) w_2^T \Sigma w_1 \\ &= 2\left(w_1^T \Sigma w_1 - w_2^T \Sigma w_1\right) z^2 + 2\left(w_2^T \Sigma w_1 - w_1^T \Sigma w_1\right) z + w_1^T \Sigma w_1, \end{aligned}$$
using the fact that $w_1^T \Sigma w_1 = w_2^T \Sigma w_2$.
Note that, because Σ is positive definite, the correlation of the returns
with weight vectors $w_1$ and $w_2$ is less than one; it follows that
$$w_2^T \Sigma w_1 < w_1^T \Sigma w_1. \tag{5.17}$$
Hence, $w_z^T \Sigma w_z$ is a quadratic function of z with a positive coefficient of $z^2$.
It follows that $w_z^T \Sigma w_z$ can be minimized by solving
$$\frac{d}{dz} w_z^T \Sigma w_z = (4z - 2)\left(w_1^T \Sigma w_1 - w_2^T \Sigma w_1\right) = 0.$$
The result in (5.17) shows that $w_1^T \Sigma w_1 - w_2^T \Sigma w_1 \ne 0$; therefore, 4z − 2 = 0
or z = 1/2.

T&F Cat #K31368 — K31368 C005— page 119 — 6/14/2017 — 22:05

120 Introduction to Statistical Methods for Financial Models

Thus, the portfolio on the minimum-risk frontier with the smallest variance
has weight vector
1 1
w1 + w2 ;
2 2
because the minimum variance portfolio is on the minimum-risk frontier, we
must have
1 1
w1 + w2 = wmv .
2 2
Furthermore, we must have
1 1
μ1 + μ2 = μmv .
2 2
Clearly, this cannot hold if either μ1 and μ2 are both greater than μmv or if
both μ1 and μ2 are less than μmv . The result follows.
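
Lemma 5.3 can be checked numerically. The following is a minimal sketch in base R, assuming the three-asset mean vector and covariance matrix of Example 5.6 (values as printed in Example 5.10). It also assumes, as a standard fact rather than something shown at this point in the text, that portfolios of the form w_mv ± t·Σ^{-1}(μ − μmv 1) lie on the minimum-risk frontier; the names `v`, `w1`, `w2`, and `port_var` are illustrative only.

```r
# Assets of Example 5.6 (assumed values, as printed in Example 5.10).
mu  <- c(0.25, 0.125, 0.30)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
ones <- rep(1, 3)

# Minimum-variance portfolio: proportional to Sig^{-1} 1.
w_mv  <- solve(Sig, ones) / sum(solve(Sig, ones))
mu_mv <- sum(w_mv * mu)

# Assumed frontier direction (a zero-investment vector).
v <- solve(Sig, mu - mu_mv * ones)

# Two distinct frontier portfolios placed symmetrically about w_mv.
w1 <- w_mv + 0.5 * v
w2 <- w_mv - 0.5 * v

port_var <- function(w) drop(t(w) %*% Sig %*% w)
var1 <- port_var(w1); var2 <- port_var(w2)
mu1  <- sum(w1 * mu);  mu2  <- sum(w2 * mu)

print(c(var1 = var1, var2 = var2))                  # the two variances agree
print(c(average = (mu1 + mu2) / 2, mu_mv = mu_mv))  # average mean equals mu_mv
```

The two portfolios have the same variance, one mean lies above μmv and the other below, and their average reproduces μmv, as the lemma states.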

The most important consequence of Lemma 5.3 is the following
result describing the efficient frontier; the proof follows immediately from
Lemma 5.3.
Proposition 5.4. The efficient frontier consists of those portfolios on the
minimum-risk frontier with a mean return greater than or equal to μmv .
Thus, the efficient frontier is the “top half” of the minimum-risk frontier,
as suggested previously. It follows that many of the results describing the
minimum-risk frontier are easily converted into results for the efficient frontier.
The following corollaries give such conversions of Lemmas 5.1 and 5.2,
respectively.
Corollary 5.5. Define V0 as in (5.7). Consider a vector of portfolio returns
R with mean vector μ and covariance matrix Σ. The portfolio with weight
vector ŵ is on the efficient frontier if and only if
ŵT Σδ = 0 for all δ ∈ V0
and
ŵT μ ≥ μmv .
Corollary 5.6. Let w0 denote the weight vector of a portfolio on the effi-
cient frontier that is not the minimum-variance portfolio and let wmv denote
the weight vector of the minimum-variance portfolio. Then the portfolio with
weight vector
zw0 + (1 − z)wmv ,
where z ≥ 0, is on the efficient frontier.
Proof. By Lemma 5.2, the portfolio with weight vector zw0 + (1 − z)wmv
is on the minimum-risk frontier. This portfolio has expected return
μmv + z(μ0 − μmv ), where μ0 is the expected return on the portfolio with
weight vector w0 ; note that, since that portfolio is on the efficient frontier,
μ0 ≥ μmv . The result follows.


5.6 Risk-Aversion Criterion

Choosing a portfolio from the efficient set requires some consideration of the
trade-off between higher mean return and greater risk. The minimum variance
portfolio deals with this issue by ignoring the mean return completely and
simply minimizing risk. In this section, we consider an N -asset version of the
risk-aversion criterion analyzed in Section 4.4 that is a function of both the
mean return and the variance of the return; we then choose the portfolio that
maximizes this function.
Let R denote an N-dimensional vector of asset returns and let w denote
the weight vector for a portfolio. Then the portfolio return $w^T R$ has expected
return $w^T \mu$ and return variance $w^T \Sigma w$, where μ and Σ denote the mean
vector and covariance matrix, respectively, of R.
The risk-aversion criterion function is given by

$$w^T \mu - \frac{\lambda}{2} w^T \Sigma w \tag{5.18}$$

where the risk-aversion parameter λ > 0 is given. This function has the same
interpretation as the risk-aversion criterion function considered in Section 4.4.
It may be viewed as a penalized mean return, with the penalty based on the
variance of the return; the extent to which the variance penalizes the mean
return is controlled by the parameter λ. An investor primarily interested in a
large mean return with a high tolerance for risk might choose λ to be small.
On the other hand, an investor with a strong preference for a low-risk portfolio
might choose λ to be large.
Our goal is to choose the weight vector w that maximizes (5.18); we denote
the maximizer by wλ . The following result gives an explicit expression for wλ . For
lack of a better term, we will refer to the portfolio with weight vector wλ as
the risk-averse portfolio with parameter λ.
Proposition 5.5. Let R denote the return vector for a set of assets, let μ
denote the mean vector of R, and let Σ denote the covariance matrix of R.
For a given value of λ > 0, the weight vector that maximizes

$$w^T \mu - \frac{\lambda}{2} w^T \Sigma w , \tag{5.19}$$

subject to the restriction $w^T \mathbf{1} = 1$, is given by

$$w_\lambda = w_{mv} + \frac{1}{\lambda} \bar{v}$$

where wmv is the weight vector of the minimum variance portfolio,

$$\bar{v} = \Sigma^{-1} (\mu - \mu_{mv} \mathbf{1})$$

and μmv is the mean return on the minimum-variance portfolio.


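Proof. To maximize (5.19) subject to the restriction $w^T \mathbf{1} = 1$, we can use the
method of Lagrange multipliers. The function (5.19) is modified to

$$w^T \mu - \frac{\lambda}{2} w^T \Sigma w + \theta (w^T \mathbf{1} - 1) \tag{5.20}$$

for $\theta \in \mathbb{R}$ and then it is maximized over $w \in \mathbb{R}^N$ for each θ. The solution will
depend on θ, which is then chosen so that $w^T \mathbf{1} = 1$.
Because, for a vector $q \in \mathbb{R}^N$,

$$-\frac{\lambda}{2} (w - q)^T \Sigma (w - q) = -\frac{\lambda}{2} w^T \Sigma w + \lambda w^T \Sigma q - \frac{\lambda}{2} q^T \Sigma q ,$$

taking

$$q = \frac{1}{\lambda} \Sigma^{-1} \mu + \frac{\theta}{\lambda} \Sigma^{-1} \mathbf{1},$$

it follows that

$$w^T \mu - \frac{\lambda}{2} w^T \Sigma w + \theta (w^T \mathbf{1} - 1) = -\frac{\lambda}{2} (w - q)^T \Sigma (w - q) + A$$

where A is a term not depending on w. Hence, (5.20) is maximized over
$w \in \mathbb{R}^N$ by

$$w_\lambda(\theta) = q = \frac{1}{\lambda} \Sigma^{-1} \mu + \frac{\theta}{\lambda} \Sigma^{-1} \mathbf{1}. \tag{5.21}$$

To complete the maximization, we choose θ so that $w_\lambda(\theta)^T \mathbf{1} = 1$; because,
using (5.21),

$$w_\lambda(\theta)^T \mathbf{1} = \frac{1}{\lambda} \mu^T \Sigma^{-1} \mathbf{1} + \frac{\theta}{\lambda} \mathbf{1}^T \Sigma^{-1} \mathbf{1},$$

it follows that

$$\theta = \frac{\lambda - \mu^T \Sigma^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} .$$

Substituting this expression into (5.21) yields

$$w_\lambda = w_{mv} + \frac{1}{\lambda} \Sigma^{-1} \left( \mu - \frac{\mu^T \Sigma^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} \mathbf{1} \right)$$

where wmv is the weight vector of the minimum variance portfolio.
The term

$$\frac{\mu^T \Sigma^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}$$

appearing in the expression for wλ is the mean return on the minimum
variance portfolio,

$$\mu_{mv} = w_{mv}^T \mu = \frac{\mathbf{1}^T \Sigma^{-1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} \mu .$$

Thus, we can write

$$w_\lambda = w_{mv} + \frac{1}{\lambda} \bar{v}$$

where

$$\bar{v} = \Sigma^{-1} (\mu - \mu_{mv} \mathbf{1}) \tag{5.22}$$

as stated in the proposition.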

Thus, the optimal weight vector corresponding to the investor's choice
of the risk-aversion parameter λ starts with the weight vector of the
minimum variance portfolio and adds 1/λ times v̄. Note that, because the weights
in both wλ and wmv must sum to 1, v̄ must be the weight vector of a
zero-investment portfolio; this may be confirmed directly:

$$\bar{v}^T \mathbf{1} = (\mu - \mu_{mv} \mathbf{1})^T \Sigma^{-1} \mathbf{1} = \mu^T \Sigma^{-1} \mathbf{1} - \mu_{mv} \mathbf{1}^T \Sigma^{-1} \mathbf{1} = 0,$$

using the fact that

$$\mu_{mv} = \frac{\mu^T \Sigma^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} .$$

Example 5.8 Consider the assets described in Example 5.6, with mean
return vector (0.25, 0.125, 0.3) and covariance matrix given by (5.14); these
are stored in variables mu and Sig, respectively. Recall that the weight vec-
tor of the minimum-variance portfolio was determined in Example 5.6 and is
given by

> w_mv
[1] 0.243 0.713 0.044

The vector v̄ defined in Proposition 5.5 may be calculated by the following
commands.

> m<-sum(w_mv*mu)
> m
[1] 0.163
> vbar<-solve(Sig, mu - m*c(1,1,1))
> vbar
[1] 0.194 -0.607 0.413

The weight vector wλ corresponding to any value of λ may now be
calculated using w_mv and vbar. For instance, for λ = 1, wλ is

> w_mv + vbar
[1] 0.437 0.106 0.457

and for λ = 4,

> w_mv + vbar/4
[1] 0.292 0.561 0.147

FIGURE 5.5
Weights of the risk-averse portfolio in Example 5.8 as λ varies (weight of each of assets 1, 2, and 3 plotted against λ).

Figure 5.5 contains a plot of the weights for the three assets as λ varies.
Note that the weight for asset 1 is relatively constant, while the weight for
asset 2 is small, or even negative, for λ < 1 and increases rapidly as λ increases
over the range (1, 3). For λ > 5, the weights are stable, being approximately
equal to the weights of the minimum variance portfolio.
For λ = 1, the mean return of the portfolio with weight vector wλ is
> sum((w_mv + vbar)*mu)
[1] 0.260
and the standard deviation of the return is
> (t(w_mv + vbar)%*%Sig%*%(w_mv + vbar))^.5
[,1]
[1,] 0.489
For λ = 4, the mean and standard deviation of the return are 0.187 and 0.386,
respectively. Thus, for small λ, there is less of a penalty on the variance
of return; it follows that the optimal portfolio has a larger return standard
deviation and, hence, a larger mean return. Figure 5.6 contains a plot of the
mean and standard deviation of the return on the portfolio with weight vector
wλ as λ varies.
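
The calculation in Example 5.8 can be wrapped in a small function and checked against the criterion it is meant to maximize. This is a minimal sketch in base R, assuming the mean vector and covariance matrix of Example 5.6 as printed in Example 5.10; the helper names `risk_averse_weights` and `criterion` are hypothetical and do not come from the text.

```r
# Assets of Example 5.6 (assumed values, as printed in Example 5.10).
mu  <- c(0.25, 0.125, 0.30)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)

# Hypothetical helper: closed-form weights w_mv + vbar / lambda (Proposition 5.5).
risk_averse_weights <- function(mu, Sig, lambda) {
  ones  <- rep(1, length(mu))
  s1    <- solve(Sig, ones)
  w_mv  <- s1 / sum(s1)
  mu_mv <- sum(w_mv * mu)
  vbar  <- solve(Sig, mu - mu_mv * ones)
  list(w = w_mv + vbar / lambda, w_mv = w_mv, vbar = vbar)
}

# Risk-aversion criterion: mean return minus (lambda / 2) times return variance.
criterion <- function(w, mu, Sig, lambda) {
  sum(w * mu) - (lambda / 2) * drop(t(w) %*% Sig %*% w)
}

lambda <- 4
res  <- risk_averse_weights(mu, Sig, lambda)
best <- criterion(res$w, mu, Sig, lambda)

# Compare with nearby portfolios whose weights still sum to 1.
set.seed(1)
perturbed <- replicate(1000, {
  delta <- rnorm(3, sd = 0.05)
  delta <- delta - mean(delta)   # zero-sum perturbation keeps the budget constraint
  criterion(res$w + delta, mu, Sig, lambda)
})

print(round(res$w, 3))           # weights for lambda = 4
print(sum(res$vbar))             # vbar is a zero-investment vector (sum near 0)
print(all(perturbed <= best))    # no perturbed portfolio beats the closed form
```

The random perturbations are only a sanity check, not a proof; they illustrate that the closed-form weights are a maximizer among portfolios satisfying the budget constraint.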

FIGURE 5.6
Mean and standard deviation of the risk-averse portfolio in Example 5.8 as λ
varies.

Properties of Risk-Averse Portfolios

Let Rmv denote the return on the minimum-variance portfolio. Because
the minimum-variance portfolio is on the minimum-risk frontier and the
portfolio based on v̄ is a zero-investment portfolio, it follows from Corollary 5.4
that

$$\operatorname{Cov}(R_{mv}, \bar{v}^T R) = w_{mv}^T \Sigma \bar{v} = 0.$$

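This result may also be established using a direct argument:

$$
\begin{aligned}
\operatorname{Cov}(R_{mv}, \bar{v}^T R) &= \operatorname{Cov}(w_{mv}^T R, \bar{v}^T R) \\
&= w_{mv}^T \Sigma \bar{v} \\
&= \frac{1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} \mathbf{1}^T \Sigma^{-1} \Sigma \Sigma^{-1} (\mu - \mu_{mv} \mathbf{1}) \\
&= \frac{1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}} \left( \mathbf{1}^T \Sigma^{-1} \mu - \mu_{mv} \mathbf{1}^T \Sigma^{-1} \mathbf{1} \right) \\
&= 0.
\end{aligned}
$$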

This fact is useful in computing the mean and variance of the return
corresponding to wλ .

Corollary 5.7. Let R denote the return vector for a set of assets, let μ denote
the mean vector of R, and let Σ denote the covariance matrix of R. For a given
value of λ > 0, the portfolio with weight vector wλ , as given in Proposition
5.5, has mean return

$$\mu_\lambda = \mu_{mv} + \frac{1}{\lambda} (\mu - \mu_{mv} \mathbf{1})^T \Sigma^{-1} (\mu - \mu_{mv} \mathbf{1})$$

and return variance

$$\sigma_\lambda^2 = \sigma_{mv}^2 + \frac{1}{\lambda} (\mu_\lambda - \mu_{mv})$$

where μmv and σmv denote the mean and standard deviation, respectively, of
the return on the minimum variance portfolio.

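Proof. Using the expression for v̄,

$$\bar{v}^T \mu = (\mu - \mu_{mv} \mathbf{1})^T \Sigma^{-1} \mu ;$$

this result, together with the fact that

$$(\mu - \mu_{mv} \mathbf{1})^T \Sigma^{-1} \mathbf{1} = \mu^T \Sigma^{-1} \mathbf{1} - \mu_{mv} \mathbf{1}^T \Sigma^{-1} \mathbf{1} = 0,$$

leads to the expression for μλ given in the statement of the corollary.

Note that

$$\bar{v}^T \Sigma \bar{v} = (\mu - \mu_{mv} \mathbf{1})^T \Sigma^{-1} \Sigma \Sigma^{-1} (\mu - \mu_{mv} \mathbf{1}) = (\mu - \mu_{mv} \mathbf{1})^T \Sigma^{-1} (\mu - \mu_{mv} \mathbf{1}) ;$$

because $w_{mv}^T \Sigma \bar{v} = 0$, the return variance is $\sigma_\lambda^2 = \sigma_{mv}^2 + \bar{v}^T \Sigma \bar{v} / \lambda^2$, and the
expression for $\sigma_\lambda^2$ follows from noting that

$$(\mu - \mu_{mv} \mathbf{1})^T \Sigma^{-1} (\mu - \mu_{mv} \mathbf{1})$$

is simply λ(μλ − μmv ), as shown earlier.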

The result in Corollary 5.7 may be used to interpret the risk-aversion
parameter λ used to construct the portfolio with weight vector wλ . According
to the expressions for μλ and $\sigma_\lambda^2$ given in Corollary 5.7,

$$\lambda = \frac{\mu_\lambda - \mu_{mv}}{\sigma_\lambda^2 - \sigma_{mv}^2} .$$

Therefore, the value of λ may be chosen to set the ratio of two quantities: the
amount by which the mean return of the portfolio exceeds that of the minimum
variance portfolio, and the amount by which its return variance exceeds that of
the minimum variance portfolio.
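
The formulas of Corollary 5.7 and the interpretation of λ can be verified numerically. The sketch below uses base R and assumes the assets of Example 5.6 (values as printed in Example 5.10); the variable names are illustrative.

```r
# Assets of Example 5.6 (assumed values, as printed in Example 5.10).
mu  <- c(0.25, 0.125, 0.30)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
ones <- rep(1, 3)

w_mv   <- solve(Sig, ones) / sum(solve(Sig, ones))
mu_mv  <- sum(w_mv * mu)
var_mv <- drop(t(w_mv) %*% Sig %*% w_mv)
vbar   <- solve(Sig, mu - mu_mv * ones)

# Quadratic form (mu - mu_mv 1)' Sig^{-1} (mu - mu_mv 1).
Q <- sum(vbar * (mu - mu_mv * ones))

lambda <- 4
w_lam  <- w_mv + vbar / lambda

# Mean and variance computed directly from the weights ...
mean_direct <- sum(w_lam * mu)
var_direct  <- drop(t(w_lam) %*% Sig %*% w_lam)

# ... and from the expressions in Corollary 5.7.
mean_formula <- mu_mv + Q / lambda
var_formula  <- var_mv + (mean_formula - mu_mv) / lambda

# Ratio of excess mean to excess variance recovers lambda.
lambda_implied <- (mean_direct - mu_mv) / (var_direct - var_mv)

print(c(mean_direct = mean_direct, mean_formula = mean_formula))
print(c(var_direct = var_direct, var_formula = var_formula))
print(lambda_implied)
```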
It is not surprising that the risk-averse portfolios are on the efficient frontier.
However, the converse is also true: every portfolio on the efficient frontier
is a risk-averse portfolio with parameter λ, for some λ > 0. This result is given
in the following proposition.

Proposition 5.6. The portfolio with weight vector wp is on the efficient frontier
if and only if either wp = wλ for some λ > 0 or wp = wmv . Here wλ
denotes the weight vector of the risk-averse portfolio with parameter λ, as
defined in Proposition 5.5.


Proof. First suppose that wp = wλ for some λ > 0. Suppose the portfolio with
weight vector wp is not on the minimum risk frontier; then there is a portfolio
with the same mean return but a smaller return variance. However, such a
portfolio would have a larger value of the risk-aversion criterion (for any
value of λ), which contradicts the fact that the portfolio with weight vector
wp is the risk-averse portfolio with parameter λ. It follows that the portfolio
with weight vector wp is on the minimum risk frontier. By Corollary 5.7,
together with the fact that Σ is positive definite, it follows that

$$\mu_\lambda - \mu_{mv} > 0;$$

it follows that the portfolio with weight vector wp is on the efficient frontier.
Note that, because the minimum-variance portfolio is on the efficient frontier,
this result also holds if wp = wmv . Therefore, if wp = wλ for some λ > 0 or
wp = wmv , then the portfolio with weight vector wp is on the efficient frontier.
Now suppose the portfolio with weight vector wp is on the efficient frontier,
but it is not the minimum-variance portfolio; let μp = E(Rp ) and note
that μp > μmv . According to Corollary 5.7, the risk-averse portfolio with
parameter λ has expected return

$$\mu_\lambda = \mu_{mv} + \frac{1}{\lambda} (\mu - \mu_{mv} \mathbf{1})^T \Sigma^{-1} (\mu - \mu_{mv} \mathbf{1}).$$

Because

$$(\mu - \mu_{mv} \mathbf{1})^T \Sigma^{-1} (\mu - \mu_{mv} \mathbf{1}) > 0,$$

there exists a λp > 0 such that the expected return on the portfolio with
weight vector wλp is μp . Furthermore, the portfolio with weight vector wλp
is on the efficient frontier. By the uniqueness of portfolios on the efficient
frontier, it follows that wp = wλp . Using the fact that the minimum-variance
portfolio is on the efficient frontier, it follows that if the portfolio with weight
vector wp is on the efficient frontier, then either wp = wλ for some λ > 0 or
wp = wmv , proving the result.
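
The second half of the proof is constructive: solving the expression for μλ in Corollary 5.7 for λ gives the parameter that attains a target mean return above μmv. A minimal sketch in base R follows, assuming the assets of Example 5.6 (values as printed in Example 5.10); the function name `lambda_for_mean` and the target value 0.20 are hypothetical choices for illustration.

```r
# Assets of Example 5.6 (assumed values, as printed in Example 5.10).
mu  <- c(0.25, 0.125, 0.30)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
ones <- rep(1, 3)

w_mv  <- solve(Sig, ones) / sum(solve(Sig, ones))
mu_mv <- sum(w_mv * mu)
vbar  <- solve(Sig, mu - mu_mv * ones)
Q     <- sum(vbar * (mu - mu_mv * ones))   # (mu - mu_mv 1)' Sig^{-1} (mu - mu_mv 1)

# Hypothetical helper: lambda_p with mu_lambda = target, from Corollary 5.7.
lambda_for_mean <- function(target) {
  stopifnot(target > mu_mv)                # only means above mu_mv are attainable
  Q / (target - mu_mv)
}

target   <- 0.20
lambda_p <- lambda_for_mean(target)
w_p      <- w_mv + vbar / lambda_p

print(lambda_p)
print(sum(w_p * mu))   # reproduces the target mean return
```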

Therefore, the risk-aversion approach gives an alternative way of
parameterizing the portfolios on the efficient frontier. Instead of defining efficient
portfolios as those with minimum risk for a given mean return m ≥ μmv or
as those with the maximum mean return for a given level of portfolio risk,
we can define them as those portfolios maximizing the risk-aversion criterion
function for some value of 0 < λ ≤ ∞.

Finding wλ Using Quadratic Programming

Although Proposition 5.5 gives an expression for wλ , the weight vector of the
risk-averse portfolio with risk-aversion parameter λ, it is often more convenient
to find wλ numerically. The R function solve.QP in the package quadprog
that was used in Example 5.5 to find the weight vector of a portfolio on the
minimum-risk frontier with a given mean return may also be used to compute
the weight vector maximizing the risk-aversion criterion function.
Recall that in solve.QP the objective function is of the form

$$\frac{1}{2} x^T D x - d^T x ,$$

which is minimized with respect to x. This is equivalent to maximizing the
objective function

$$d^T x - \frac{1}{2} x^T D x .$$

The constraint on the weight vector w, $w^T \mathbf{1} = 1$, is easily included using the
argument Amat to solve.QP. Equality constraints on x are given by $A^T x = b$
for a given matrix A and a given vector b.
The following example illustrates how solve.QP can be used to find the
weight vector wλ .

Example 5.9 Consider the assets described in Example 5.6 and analyzed in
Example 5.8. The assets have mean return vector (0.25, 0.125, 0.3) and return
covariance matrix given by (5.14); these are stored in variables mu and Sig,
respectively.
The same basic approach used in Example 5.5 can be used here, except
that now the objective function is of the form

$$w^T \mu - \frac{\lambda}{2} w^T \Sigma w$$

and in the present context the only constraint is that the portfolio weights
sum to 1.
Thus, the arguments of solve.QP that define the objective function are
Dmat=lambda*Sig and dvec=mu, where lambda denotes the value of the
risk-aversion parameter λ. To specify the constraint that the weights sum to
1, we take Amat to be matrix(rep(1,3), 3, 1); in this command, rep(1,3)
is a vector consisting of 1 repeated three times and matrix forms a matrix
from that vector. The remaining arguments are bvec=1, which specifies that
the weights sum to 1, and meq=1, which indicates that the constraint is an
equality constraint.
Consider calculation of the weights of the risk-averse portfolio correspond-
ing to λ = 1. These may be obtained using the R command

> library(quadprog)
> ra<-solve.QP(Dmat=Sig, dvec=mu, Amat=matrix(rep(1,3),3,1),
+ bvec=1, meq=1)
> ra$solution
[1] 0.437 0.106 0.457

Note that the result matches that obtained in Example 5.8.
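
When the only constraint is that the weights sum to 1, the quadratic program reduces to a linear system, namely the first-order conditions of the Lagrangian. The sketch below is a base R cross-check under that assumption; it is not how solve.QP works internally, and the function name `kkt_weights` is hypothetical. It assumes the assets of Example 5.6 (values as printed in Example 5.10).

```r
# Assets of Example 5.6 (assumed values, as printed in Example 5.10).
mu  <- c(0.25, 0.125, 0.30)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)

# Hypothetical helper: solve  lambda * Sig w + nu * 1 = mu,  1'w = 1
# for the weights w and the multiplier nu in one linear system.
kkt_weights <- function(mu, Sig, lambda) {
  n    <- length(mu)
  ones <- rep(1, n)
  K    <- rbind(cbind(lambda * Sig, ones), c(ones, 0))
  sol  <- solve(K, c(mu, 1))
  unname(sol[1:n])               # drop the multiplier, keep the weights
}

# Closed form of Proposition 5.5 for comparison.
ones   <- rep(1, 3)
w_mv   <- solve(Sig, ones) / sum(solve(Sig, ones))
vbar   <- solve(Sig, mu - sum(w_mv * mu) * ones)

w_kkt    <- kkt_weights(mu, Sig, lambda = 1)
w_closed <- w_mv + vbar

print(round(w_kkt, 3))
print(max(abs(w_kkt - w_closed)))   # agreement up to rounding error
```

For inequality constraints this shortcut no longer applies, which is where a quadratic programming routine such as solve.QP is needed.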


5.7 The Tangency Portfolio

So far in this chapter, we have considered the problem of choosing a portfolio
based on a set of N risky assets. In this section, we consider including a
risk-free asset in the portfolio. Let Rf denote the return on the risk-free asset;
let μf = E(Rf ) and recall that Var(Rf ) = 0. We will assume that μf < μmv ,
where μmv is the mean return on the minimum-variance portfolio.
In selecting the weights for such a portfolio, we may use the same approach
utilized in the two-asset case: first choose a portfolio of the N risky assets
and then combine that portfolio with the risk-free asset. When selecting the
weights of the N risky assets, it is important to use the fact that the resulting
portfolio will be combined with the risk-free asset.
In particular, as noted in Section 4.6, the result of Proposition 4.1
continues to hold in this setting. That is, when the portfolio of risky assets is
being combined with the risk-free asset, the optimal portfolio is the one that
maximizes the Sharpe ratio

$$\frac{E(R_p) - \mu_f}{(\operatorname{Var}(R_p))^{1/2}} ,$$

the portfolio known as the tangency portfolio.
The following result gives an expression for the weight vector of the tan-
gency portfolio in terms of the mean vector μ and the covariance matrix Σ of
the vector R of asset returns.
Proposition 5.7. Let R denote the return vector for a set of assets, let μ
denote the mean vector of R, and let Σ denote the covariance matrix of R.
Then the weight vector for the tangency portfolio is given by

$$w_{T} = \frac{\Sigma^{-1} (\mu - \mu_f \mathbf{1})}{\mathbf{1}^T \Sigma^{-1} (\mu - \mu_f \mathbf{1})}$$

where μf denotes the expected return on the risk-free asset.
Proof. The Sharpe ratio of the portfolio based on weight vector w is given by

$$\frac{w^T (\mu - \mu_f \mathbf{1})}{(w^T \Sigma w)^{1/2}} . \tag{5.23}$$

To find the tangency portfolio, we need to find the weight vector that
maximizes (5.23). Define $b = \Sigma^{1/2} w$ and $d = \Sigma^{-1/2} (\mu - \mu_f \mathbf{1})$. Then (5.23) can be
written

$$\frac{b^T d}{(b^T b)^{1/2}} . \tag{5.24}$$

By the Cauchy–Schwarz inequality,

$$\frac{b^T d}{(b^T b)^{1/2}} \le (d^T d)^{1/2}$$

with equality if b = cd for a scalar c > 0.

Therefore, (5.24) is maximized over b by cd for any c > 0. It follows that
(5.23) is maximized when

$$\Sigma^{1/2} w = c \Sigma^{-1/2} (\mu - \mu_f \mathbf{1})$$

for c > 0. That is, wT is of the form

$$w_{T} = c \Sigma^{-1} (\mu - \mu_f \mathbf{1})$$

for some c > 0. For the weights in wT to sum to 1, we need

$$c = \frac{1}{\mathbf{1}^T \Sigma^{-1} (\mu - \mu_f \mathbf{1})} .$$

Note that

$$\mathbf{1}^T \Sigma^{-1} (\mu - \mu_f \mathbf{1}) = (\mu_{mv} - \mu_f) \mathbf{1}^T \Sigma^{-1} \mathbf{1} > 0$$

using the fact that Σ is positive definite, along with the assumption that
μf < μmv .
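
The Cauchy–Schwarz argument implies that the maximal Sharpe ratio equals $(d^T d)^{1/2}$, the square root of $(\mu - \mu_f \mathbf{1})^T \Sigma^{-1} (\mu - \mu_f \mathbf{1})$. The following base R sketch checks this bound, assuming the assets of Example 5.6 (values as printed in Example 5.10) and a risk-free return of 0.01; the helper names `sharpe` and `tangency_weights` are hypothetical.

```r
# Assets of Example 5.6 (assumed values, as printed in Example 5.10).
mu  <- c(0.25, 0.125, 0.30)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
rf <- 0.01   # assumed risk-free return

# Sharpe ratio in the form (5.23): w'(mu - rf 1) / sqrt(w' Sig w).
sharpe <- function(w, mu, Sig, rf) {
  sum(w * (mu - rf)) / sqrt(drop(t(w) %*% Sig %*% w))
}

# Tangency weights of Proposition 5.7.
tangency_weights <- function(mu, Sig, rf) {
  x <- solve(Sig, mu - rf)
  x / sum(x)
}

w_T   <- tangency_weights(mu, Sig, rf)
s_T   <- sharpe(w_T, mu, Sig, rf)
bound <- sqrt(sum((mu - rf) * solve(Sig, mu - rf)))   # (d'd)^(1/2)

# Sharpe ratios of randomly drawn portfolios with weights summing to 1.
set.seed(2)
others <- replicate(1000, {
  w <- rnorm(3)
  sharpe(w / sum(w), mu, Sig, rf)
})

print(round(w_T, 3))
print(c(sharpe_tangency = s_T, bound = bound))
print(max(others) <= s_T)
```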

The role of the tangency portfolio here is the same as in the N = 2 case:
When constructing a portfolio consisting of risky assets plus the risk-free asset,
all investors should use the tangency portfolio as their portfolio of risky assets.

Example 5.10 Consider the assets described in Example 5.6, with mean
return vector stored in the variable mu and covariance matrix of the returns
stored in Sig:

> mu
[1] 0.250 0.125 0.300
> Sig
     [,1]  [,2]  [,3]
[1,] 0.25 0.100 0.240
[2,] 0.10 0.160 0.096
[3,] 0.24 0.096 0.360

Suppose that the risk-free asset has return μf = 0.01. Then the weight
vector of the tangency portfolio is given by

> w_T<-solve(Sig, mu-0.01)/sum(solve(Sig, mu-0.01))
> w_T
[1] 0.424 0.148 0.428

This tangency portfolio has the Sharpe ratio

> (sum(w_T*mu) - 0.01)/(w_T%*%Sig%*%w_T)^.5
      [,1]
[1,] 0.511


An Alternative Characterization of the Tangency Portfolio

The weight vector of the tangency portfolio, wT , is the vector $w \in \mathbb{R}^N$ that
maximizes the Sharpe ratio

$$\frac{w^T (\mu - \mu_f \mathbf{1})}{(w^T \Sigma w)^{1/2}}$$

subject to the restriction that $w^T \mathbf{1} = 1$. However, there is another way to
describe wT that is sometimes useful.
Note that, for any scalar c > 0,

$$\frac{(cw)^T (\mu - \mu_f \mathbf{1})}{((cw)^T \Sigma (cw))^{1/2}} = \frac{w^T (\mu - \mu_f \mathbf{1})}{(w^T \Sigma w)^{1/2}} . \tag{5.25}$$

Consider two weight vectors w1 , w2 such that

$$w_j^T (\mu - \mu_f \mathbf{1}) > 0, \quad j = 1, 2.$$

Using (5.25),

$$\frac{w_1^T (\mu - \mu_f \mathbf{1})}{(w_1^T \Sigma w_1)^{1/2}} \ge \frac{w_2^T (\mu - \mu_f \mathbf{1})}{(w_2^T \Sigma w_2)^{1/2}}$$

if and only if

$$\frac{(c w_1)^T (\mu - \mu_f \mathbf{1})}{((c w_1)^T \Sigma (c w_1))^{1/2}} \ge \frac{(d w_2)^T (\mu - \mu_f \mathbf{1})}{((d w_2)^T \Sigma (d w_2))^{1/2}} \tag{5.26}$$

for any c > 0 and d > 0. That is, in maximizing the Sharpe ratio, it is not
necessary for the weights to sum to 1; recall that this fact was used in the
proof of Proposition 5.7, when we found the vector in $\mathbb{R}^N$ that maximizes the
Sharpe ratio and then rescaled it to sum to 1.
Let

$$\bar{c} = \frac{1}{w_1^T (\mu - \mu_f \mathbf{1})}$$

and

$$\bar{d} = \frac{1}{w_2^T (\mu - \mu_f \mathbf{1})} .$$

Then taking c = c̄ and d = d̄ in (5.26), it follows that

$$\frac{w_1^T (\mu - \mu_f \mathbf{1})}{(w_1^T \Sigma w_1)^{1/2}} \ge \frac{w_2^T (\mu - \mu_f \mathbf{1})}{(w_2^T \Sigma w_2)^{1/2}}$$

if and only if

$$\frac{(\bar{c} w_1)^T (\mu - \mu_f \mathbf{1})}{((\bar{c} w_1)^T \Sigma (\bar{c} w_1))^{1/2}} \ge \frac{(\bar{d} w_2)^T (\mu - \mu_f \mathbf{1})}{((\bar{d} w_2)^T \Sigma (\bar{d} w_2))^{1/2}} . \tag{5.27}$$

Note that, by definition of c̄ and d̄,

$$(\bar{c} w_1)^T (\mu - \mu_f \mathbf{1}) = 1$$

and

$$(\bar{d} w_2)^T (\mu - \mu_f \mathbf{1}) = 1.$$

If u1 and u2 satisfy

$$u_j^T (\mu - \mu_f \mathbf{1}) = 1, \quad j = 1, 2$$

then

$$\frac{u_1^T (\mu - \mu_f \mathbf{1})}{(u_1^T \Sigma u_1)^{1/2}} \ge \frac{u_2^T (\mu - \mu_f \mathbf{1})}{(u_2^T \Sigma u_2)^{1/2}}$$

if and only if

$$u_1^T \Sigma u_1 \le u_2^T \Sigma u_2 .$$

That is, we can describe the weight vector of the tangency portfolio as
being proportional to the vector $u \in \mathbb{R}^N$ that minimizes

$$u^T \Sigma u \tag{5.28}$$

subject to the restriction that $u^T (\mu - \mu_f \mathbf{1}) = 1$; the proportionality constant
is chosen so that the weights sum to 1.
Example 5.11 Consider the assets described in Example 5.6 and analyzed in
Example 5.10; recall that in Example 5.10, the weights of the tangency port-
folio were calculated using the expression Σ−1 (μ − μf 1)/(1T Σ−1 (μ − μf 1))
and were shown to be
> w_T
[1] 0.424 0.148 0.428
In this example, we calculate the weights of the tangency portfolio using
its characterization as a scalar multiple of the vector that minimizes (5.28) subject
to the restriction that $u^T (\mu - \mu_f \mathbf{1}) = 1$.
The return mean vector and covariance matrix are stored in R variables
mu and Sig, respectively; the risk-free rate of return is taken to be 0.01.
To use solve.QP to minimize (5.28) subject to the restriction given
previously, we construct the constraint matrix A.tan by
> A.tan<-matrix(mu - 0.01, 3, 1)
and use the commands
> tan1<-solve.QP(Dmat=2*Sig, dvec=c(0,0,0), Amat=A.tan, bvec=1,
+ meq=1)
> tan1$solution/sum(tan1$solution)
[1] 0.424 0.148 0.428
Note that this result matches the one obtained in Example 5.10.
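
Because the minimization of (5.28) has a single equality constraint, its solution can also be written in closed form. The sketch below assumes the standard Lagrange-multiplier solution u = Σ^{-1}e / (e^T Σ^{-1} e), with e = μ − μf 1, which is not derived in the text; it uses base R, the assets of Example 5.6 (values as printed in Example 5.10), and a risk-free return of 0.01.

```r
# Assets of Example 5.6 (assumed values, as printed in Example 5.10).
mu  <- c(0.25, 0.125, 0.30)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
e <- mu - 0.01                     # excess mean returns, risk-free return 0.01

# Assumed closed-form minimizer of u' Sig u subject to u'e = 1.
Sinv_e <- solve(Sig, e)
u      <- Sinv_e / sum(e * Sinv_e)
min_var <- drop(t(u) %*% Sig %*% u)

# Other vectors satisfying the constraint: add a perturbation orthogonal to e.
set.seed(3)
alt_var <- replicate(1000, {
  z     <- rnorm(3)
  delta <- z - (sum(z * e) / sum(e * e)) * e   # delta'e = 0
  u_alt <- u + delta
  drop(t(u_alt) %*% Sig %*% u_alt)
})

print(sum(u * e))                  # constraint holds: equals 1
print(round(u / sum(u), 3))        # rescaled to sum to 1: tangency weights
print(all(alt_var >= min_var))     # no feasible alternative has smaller variance
```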


5.8 Portfolio Constraints

The optimal portfolios described in this chapter are all derived under the
assumption that the only restriction on the weight vector w is that the weights
sum to 1, wT 1 = 1. In practice, analysts often place constraints on the vector
of weights available to the investor.
Such constraints often have only minor effects on the basic approaches
described in this chapter; for instance, the minimum risk frontier still consists
of those portfolios with the smallest risk for a given expected return, but now
such portfolios must satisfy the constraints.
On the other hand, constraints may have large effects on specific results
and on the numerical solutions to the optimization problems used to calculate
the weight vectors of optimal portfolios. For instance, Corollary 5.3, which
states that if the portfolios with weight vectors w1 and w2 are on the
minimum-risk frontier, then the portfolio with weight vector zw1 + (1 − z)w2 ,
for any z, is on the minimum-risk frontier, may not hold since the portfolio
with weight vector zw1 + (1 − z)w2 may not satisfy the constraints.
In this section, we consider the calculation of the weight vectors of the
risk-averse and tangency portfolios, when those weight vectors are subject to
certain types of constraints.
First, consider determination of the weight vector that maximizes the
risk-aversion criterion function

$$w^T \mu - \frac{\lambda}{2} w^T \Sigma w \tag{5.29}$$

subject to some commonly used constraints. A constrained maximization
problem of this form is another example of a quadratic programming problem.
The R function solve.QP that was used in Example 5.9 to find the weight vectors
of the risk-averse portfolio subject only to the constraint that the weights
sum to 1 can be used to solve many constrained minimization problems.
Recall that the function solve.QP in the package quadprog can be used
to maximize an objective function of the form

$$d^T x - \frac{1}{2} x^T D x \tag{5.30}$$

with respect to the vector x, which is subject to equality and inequality
constraints on x. The constraints are of the form

$$A^T x \; (=, \ge) \; b$$

for a matrix A and vector b, where (=, ≥) denotes either equality or inequality
of the form ≥ on a component-wise basis.
Inequality constraints may be included in solve.QP by specifying appropriately
the arguments Amat, corresponding to the matrix A described earlier,
and bvec, corresponding to the vector b. The argument meq of solve.QP indicates
the number of equality constraints; these must correspond to the first
columns of Amat.
The following example illustrates in detail the maximization of the
risk-aversion criterion function subject to the constraint that the weight on
asset j, wj , is nonnegative for each j. That is, the portfolio cannot contain a
short position on any asset. We will write this constraint as w ≥ 0.
Example 5.12 Consider the assets described in Example 5.6, with mean
return vector (0.25, 0.125, 0.3) and covariance matrix given by

$$\Sigma = \begin{pmatrix} 0.25 & 0.1 & 0.24 \\ 0.1 & 0.16 & 0.096 \\ 0.24 & 0.096 & 0.36 \end{pmatrix} .$$

Consider maximization of the risk-aversion criterion function (5.29) as
discussed in Example 5.8.
For λ = 0.5, the weight vector wλ is given by

> w_mv + 2*vbar
[1] 0.632 -0.501 0.869

which includes a substantial short position on asset 2. Hence, we might
consider maximizing (5.29) when λ = 0.5, subject to the constraint that all asset
weights are nonnegative.
The arguments to solve.QP are Dmat and dvec, which specify the objective
function as described in (5.30), and Amat, bvec, and meq, which specify the
equality and inequality constraints.
Thus, to maximize the risk-aversion criterion function, dvec is taken to
be μ, the vector of asset means, and Dmat is taken to be λΣ, where Σ is the
covariance matrix of the asset returns. The constraints in our problem are

$$\mathbf{1}^T w = 1, \quad w_1 \ge 0, \quad w_2 \ge 0, \quad w_3 \ge 0;$$

thus,

$$A^T = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and

$$b = (1, 0, 0, 0)^T .$$

The argument meq indicates the number of equality constraints; thus, in
this example, meq = 1, indicating that the constraints are given by

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} x \;\; \begin{matrix} = \\ \ge \\ \ge \\ \ge \end{matrix} \;\; \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} .$$

Therefore, the R commands needed to solve this constrained optimization
problem are as follows:

> library(quadprog)
> A<-cbind(c(1,1,1), diag(3))
> t(A)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    0    0
[3,]    0    1    0
[4,]    0    0    1
> b<-c(1, 0, 0, 0)
> qpsol<-solve.QP(Dmat=(.5)*Sig, dvec=mu, Amat=A, bvec=b, meq=1)

Note that diag(3) returns a 3 × 3 identity matrix and cbind combines two
vectors or matrices by the columns; when the vector c(1,1,1) is used in
this context, it is interpreted as a column vector. The weight vector that
maximizes the objective function is the component $solution of the result of
the function solve.QP; therefore, for this problem, the weight vector is

> qpsol$solution
[1] 0.154 0.000 0.846

This weight vector maximizes the risk-aversion criterion function based on
λ = 0.5, subject to the restriction that the weights are nonnegative.
Thus, the expected return of the portfolio that maximizes the risk-aversion
criterion function with λ = 0.5, subject to the no-short-positions constraint, is

> sum(qpsol$solution*mu)
[1] 0.292

and the return standard deviation is

> ((qpsol$solution)%*%Sig%*%qpsol$solution)^.5
[,1]
[1,] 0.571

These may be compared to the mean and standard deviation of the return
of the optimizing portfolio that is not subject to the constraint

> sum((w_mv + 2*vbar)*mu)
[1] 0.356
> ((w_mv + 2*vbar)%*%Sig%*%(w_mv + 2*vbar))^.5
      [,1]
[1,] 0.727

Thus, the no-short-positions constraint leads not only to a lower expected
return but also to a lower risk. In terms of the objective function (5.29), for
the constrained solution, the value is
> 0.292 - (0.5/2)*(0.571^2)
[1] 0.210
while for the unconstrained solution, the value is
> 0.356 - (0.5/2)*(0.727^2)
[1] 0.224
Thus, as expected, the unconstrained optimal value is higher than the
constrained optimal value.
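
The two objective values can also be computed directly from the weight vectors rather than from rounded means and standard deviations. This is a minimal base R sketch; the function name `criterion` is hypothetical, the constrained weights are taken as the rounded values printed in Example 5.12, and the asset data are those of Example 5.6.

```r
# Assets of Example 5.6 (values as given in Example 5.12).
mu  <- c(0.25, 0.125, 0.30)
Sig <- matrix(c(0.25, 0.100, 0.240,
                0.10, 0.160, 0.096,
                0.24, 0.096, 0.360), nrow = 3, byrow = TRUE)
ones <- rep(1, 3)

# Risk-aversion criterion (5.29).
criterion <- function(w, lambda) {
  sum(w * mu) - (lambda / 2) * drop(t(w) %*% Sig %*% w)
}

# Unconstrained maximizer for lambda = 0.5 (Proposition 5.5).
w_mv  <- solve(Sig, ones) / sum(solve(Sig, ones))
vbar  <- solve(Sig, mu - sum(w_mv * mu) * ones)
w_unc <- w_mv + vbar / 0.5

# Constrained weights, assumed equal to the rounded output of Example 5.12.
w_con <- c(0.154, 0.000, 0.846)

val_con <- criterion(w_con, lambda = 0.5)
val_unc <- criterion(w_unc, lambda = 0.5)
print(c(constrained = val_con, unconstrained = val_unc))
```

Working from the weights avoids compounding rounding error, although here the conclusion is the same: the unconstrained value is the larger of the two.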

Holding Constraints
Another commonly used type of constraint is the holding constraint, of
the form Lj ≤ wj ≤ Uj , j = 1, . . . , N , where Lj and Uj are lower and upper
bounds, respectively, on the proportion of the investment in asset j. These
may also be handled using the function solve.QP.
Example 5.13 Consider the assets analyzed in Example 5.12. The weight
vector that maximizes the risk-aversion criterion function (5.29) for λ = 1 is
given by
> w_mv + vbar
[1] 0.437 0.106 0.457
Suppose we would like our portfolio to allocate between 25% and 75% of
our investment to each asset; that is, we would like to enforce the constraints

0.25 ≤ wj ≤ 0.75 for j = 1, 2, 3.

To express these constraints as lower bounds on functions of the weights, we
write them as

$$w_j \ge 0.25, \quad -w_j \ge -0.75, \quad j = 1, 2, 3.$$

Thus, the matrix Amat used in solve.QP is taken to be

> Ah<-cbind(c(1,1,1), diag(3), -1*diag(3))
> t(Ah)
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    0    0
[3,]    0    1    0
[4,]    0    0    1
[5,]   -1    0    0
[6,]    0   -1    0
[7,]    0    0   -1

and the vector bvec is taken to be

> bh<-c(1, 0.25, 0.25, 0.25, -0.75, -0.75, -0.75)

The solution to the maximization problem is then given by

> w1<-solve.QP(Sig, mu, Amat=Ah, bvec=bh, meq=1)$solution
> w1
[1] 0.30 0.25 0.45

The corresponding portfolio has a mean return 0.241 and return standard
deviation of 0.455. These can be compared to the mean return and return
standard deviation of the unconstrained optimal portfolio, given by 0.260 and
0.489, respectively.
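
The construction of Amat and bvec for holding constraints follows the same pattern for any number of assets, so it can be packaged in a helper. The sketch below is in base R; the function name `holding_constraints` and its return structure are hypothetical, and the feasibility check uses the solution printed in Example 5.13.

```r
# Hypothetical helper: build the solve.QP constraint arguments for
# sum(w) = 1 and lower[j] <= w[j] <= upper[j].
holding_constraints <- function(lower, upper) {
  n <- length(lower)
  stopifnot(length(upper) == n, all(lower <= upper))
  list(Amat = cbind(rep(1, n), diag(n), -diag(n)),  # one column per constraint
       bvec = c(1, lower, -upper),
       meq  = 1)                                    # first column is an equality
}

hc <- holding_constraints(lower = rep(0.25, 3), upper = rep(0.75, 3))
print(t(hc$Amat))
print(hc$bvec)

# Check that the solution reported in Example 5.13 satisfies every constraint.
w     <- c(0.30, 0.25, 0.45)
slack <- drop(t(hc$Amat) %*% w) - hc$bvec
print(slack)   # first entry near 0 (equality), the rest nonnegative
```

A zero slack in one of the inequality rows indicates a binding constraint; here the lower bound on asset 2 is binding.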

Other types of constraints can be handled in a similar manner. For example,
suppose that N > 5; we might want the total of the portfolio weights on
assets 1 through 5 to be at least 0.50, that is,

$$w_1 + w_2 + w_3 + w_4 + w_5 \ge 0.50.$$

Or if we want the weight on asset 1 to be at least as great as the weight on
asset 2, we may use the constraint

$$w_1 - w_2 \ge 0.$$

By properly choosing the argument Amat, solve.QP can solve many con-
strained optimization problems of this type.
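
Each such linear constraint contributes one column to Amat and one entry to bvec. The following base R sketch builds these for a hypothetical case with N = 6 assets; the candidate weight vector `w` is made up purely to illustrate the feasibility check.

```r
N <- 6   # hypothetical number of assets (N > 5)

# Column for w1 + ... + w5 >= 0.50.
a_group <- c(rep(1, 5), rep(0, N - 5))
# Column for w1 - w2 >= 0.
a_order <- c(1, -1, rep(0, N - 2))

# Budget constraint first (equality, meq = 1), then the two inequalities.
Amat <- cbind(rep(1, N), a_group, a_order)
bvec <- c(1, 0.50, 0)

# Illustrative weight vector, checked against all three constraints.
w  <- c(0.3, 0.2, 0.1, 0.1, 0.1, 0.2)
ok <- drop(t(Amat) %*% w) >= bvec - 1e-12
print(ok)
```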

Maximizing the Sharpe Ratio under Nonnegativity Constraints

In Example 5.11, it was shown that the Sharpe ratio may be maximized by
minimizing the variance of the portfolio return subject to a restriction on
the portfolio mean return; therefore, solve.QP may be used to calculate the
weights of the tangency portfolio. That approach is based on the following
property of Sharpe ratios:

$$\frac{w_1^T (\mu - \mu_f \mathbf{1})}{(w_1^T \Sigma w_1)^{1/2}} \ge \frac{w_2^T (\mu - \mu_f \mathbf{1})}{(w_2^T \Sigma w_2)^{1/2}}$$

if and only if

$$\frac{(c w_1)^T (\mu - \mu_f \mathbf{1})}{((c w_1)^T \Sigma (c w_1))^{1/2}} \ge \frac{(d w_2)^T (\mu - \mu_f \mathbf{1})}{((d w_2)^T \Sigma (d w_2))^{1/2}} \tag{5.31}$$

for any c > 0 and d > 0.
Therefore, the same approach can be used to find the portfolio that
maximizes the Sharpe ratio under constraints, provided that a weight vector w
satisfies the constraints if and only if cw satisfies the constraints for any c > 0.
Note that this condition is satisfied for nonnegativity constraints of the form
w ≥ 0; however, it is not satisfied for other types of constraints, such as the
holding constraints considered in Example 5.13.
The details are described in the following example.
Example 5.14 Consider the set of four assets used in Example 5.1 and let mu
and Sigma denote the R variables containing the mean vector and covariance
matrix, respectively, of the returns,
> mu
[1] 0.10 0.20 0.05 0.10
> Sigma
     [,1] [,2] [,3] [,4]
[1,] 0.05 0.01 0.02 0.00
[2,] 0.01 0.10 0.05 0.02
[3,] 0.02 0.05 0.20 0.10
[4,] 0.00 0.02 0.10 0.20
and take the risk-free rate to be 0.01.
Then the weight vector of the tangency portfolio is given by
> solve(Sigma, mu-0.01)/sum(solve(Sigma, mu-0.01))
[1] 0.482 0.559 -0.223 0.182
Alternatively, we can calculate this weight vector using solve.QP, as described
in Example 5.11.
> wT<-solve.QP(Dmat=2*Sigma, dvec=rep(0,4), Amat=cbind(mu-0.01),
+ bvec=1, meq=1)$solution
> wT/sum(wT)
[1] 0.482 0.559 -0.223 0.182
We can include nonnegativity constraints in the calculation by taking the
argument Amat to be the matrix
> A.shrp<-cbind(mu-0.01, diag(4))
> t(A.shrp)
[,1] [,2] [,3] [,4]
[1,] 0.09 0.19 0.04 0.09
[2,] 1.00 0.00 0.00 0.00
[3,] 0.00 1.00 0.00 0.00
[4,] 0.00 0.00 1.00 0.00
[5,] 0.00 0.00 0.00 1.00
taking bvec to be the vector
> b.shrp<-c(1, rep(0,4))
> b.shrp
[1] 1 0 0 0 0


and taking meq=1. Thus, the portfolio with nonnegative weights that
maximizes the Sharpe ratio is given by
> wT.nn<-solve.QP(Dmat=2*Sigma, dvec=rep(0,4), Amat=A.shrp,
+ bvec=b.shrp, meq=1)$solution
> wT.nn/sum(wT.nn)
[1] 0.425 0.494 0.000 0.081
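To see what the nonnegativity restriction costs, the following sketch computes the Sharpe ratios of the two portfolios found in Example 5.14, using the rounded weights reported there. The helper name sharpe is an assumption introduced here; the inputs mu, Sigma, and the risk-free rate are those of the example. The last line checks numerically the scaling property in (5.31).

```r
# Inputs copied from Example 5.14.
mu <- c(0.10, 0.20, 0.05, 0.10)
Sigma <- matrix(c(0.05, 0.01, 0.02, 0.00,
                  0.01, 0.10, 0.05, 0.02,
                  0.02, 0.05, 0.20, 0.10,
                  0.00, 0.02, 0.10, 0.20), nrow = 4, byrow = TRUE)
rf <- 0.01

# Sharpe ratio of a portfolio with weight vector w (helper name is illustrative).
sharpe <- function(w) sum(w * (mu - rf)) / sqrt(drop(t(w) %*% Sigma %*% w))

w.tan <- c(0.482, 0.559, -0.223, 0.182)  # unconstrained weights, as reported
w.nn  <- c(0.425, 0.494, 0.000, 0.081)   # nonnegative weights, as reported

c(unconstrained = sharpe(w.tan), nonnegative = sharpe(w.nn))

# Rescaling the weights by any c > 0 leaves the Sharpe ratio unchanged,
# which is why the solution can be normalized to sum to one at the end.
all.equal(sharpe(w.nn), sharpe(3 * w.nn))
```

The constrained portfolio necessarily has a Sharpe ratio no larger than that of the tangency portfolio; the comparison shows how much is given up by ruling out short positions.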

Any constraint of the form $a^T w = 0$ or $a^T w \ge 0$, where $a$ is a given
vector in $\mathbb{R}^N$, can be handled by the same method. For example, if
N = 5, the constraint w1 + w2 + w3 ≥ w4 + w5 satisfies the condition that the
constraint holds if and only if cw1 + cw2 + cw3 ≥ cw4 + cw5 for any c > 0.
Hence, we may use the same general approach used in Example 5.14 to find the
weight vector that maximizes the Sharpe ratio subject to this constraint.
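A minimal sketch of how such a constraint could be laid out for solve.QP, following the pattern of Example 5.14, is given below. The vector mu.ex of mean excess returns and the trial weights are made-up values used only to show the layout; they do not come from the text.

```r
# Hypothetical example with N = 5 assets: w1 + w2 + w3 >= w4 + w5,
# written as a^T w >= 0 with a = (1, 1, 1, -1, -1).
a <- c(1, 1, 1, -1, -1)

# mu.ex is a made-up vector of mean excess returns (illustration only).
mu.ex <- c(0.04, 0.08, 0.02, 0.10, 0.03)

# Column 1: equality w^T mu.ex = 1 (so meq = 1); column 2: a^T w >= 0.
A.hom <- cbind(mu.ex, a)
b.hom <- c(1, 0)

# The constraint is unchanged by rescaling, since a^T (c w) = c * a^T w.
w <- c(0.30, 0.20, 0.10, 0.25, 0.15)
scale.check <- sapply(c(0.5, 1, 4), function(c0) sum(a * (c0 * w)) >= 0)
scale.check
```

Because the right-hand side of the constraint is zero, the solution returned by solve.QP can be rescaled to sum to one without violating it, which is the property the method relies on.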

5.9 Suggestions for Further Reading

Efficient portfolio theory is one of the cornerstones of quantitative finance and
financial engineering, and it is discussed in many texts in these fields; see, for
example, Campbell et al. (1997, Section 5.2), Francis and Kim (2013, Chap-
ter 7), and Qian et al. (2007, Chapter 2) for useful introductions to this area.
Hult et al. (2012, Chapter 4) provide a detailed treatment of portfolio selection
under the risk-aversion criterion. Merton (1972) offers clear proofs of many
of the fundamental results of efficient portfolio theory and Markowitz (1987)
provides a comprehensive treatment of the subject, although at a more math-
ematically advanced level. Michaud (1989) presents an interesting discussion
of the practical usefulness of the theory; see also Jobson and Korkie (1981).
The method of Lagrange multipliers is a useful technique for solving many
constrained optimization problems; see, for example, Stewart (2015, Section
14.8) and Larson and Edwards (2014, Section 13.10) for further discussion.

5.10 Exercises
1. Consider a three-dimensional return vector R with mean vector
given by (0.04, 0.03, 0.05) and covariance matrix given by
$$\begin{pmatrix} 0.05 & 0.05 & 0.025 \\ 0.05 & 0.10 & 0.08 \\ 0.025 & 0.08 & 0.075 \end{pmatrix}.$$

Let Rp1 denote the return on the portfolio with weight vector
(1/3, 1/3, 1/3) and let Rp2 denote the return on the portfolio with
weight vector (0.4, 0.4, 0.2).
a. Find the mean and standard deviation of Rp1 ; see Example 5.1.
b. Find the mean and standard deviation of Rp2 .


c. Find the correlation of Rp1 and Rp2 .

d. Based on these results, is one of the portfolios preferable to the
other? Why or why not?
2. Consider a four-dimensional return vector R with mean vector given
by (0.02, 0.10, 0.05, 0.06) and covariance matrix given by
$$\begin{pmatrix} 0.02 & 0.01 & 0.01 & 0 \\ 0.01 & 0.05 & 0.02 & 0 \\ 0.01 & 0.02 & 0.03 & 0 \\ 0 & 0 & 0 & 0.04 \end{pmatrix}.$$

a. Using the computational method described in Example 5.5, find
the portfolio on the minimum-risk frontier with a mean return
of 0.05. Find the return standard deviation of the portfolio.
b. Repeat Part (a) for mean returns of 0.06 and 0.07.
c. Based on these results, is it possible to say with certainty that
any of these portfolios is not on the eﬃcient frontier? Why or
why not? If it is possible, which ones are not on the eﬃcient
frontier?
3. (Corollary 5.1) For j = 1, 2, let R̂pj denote the return on the port-
folio with weight vector ŵj . Suppose that both portfolios are on
the minimum-risk frontier and that E(R̂p1 ) = E(R̂p2 ). Show that
ŵ1 = ŵ2 .
4. (Corollary 5.2) Let R̂p denote the return on a portfolio on the
minimum-risk frontier and let Rp1 , Rp2 denote the returns on two
portfolios having the same expected return. Show that

Cov(R̂p , Rp1 ) = Cov(R̂p , Rp2 ).

5. Consider a five-dimensional return vector R with mean vector given
by (0.25, 0.20, 0.30, 0.275, 0.15) and covariance matrix given by

$$\begin{pmatrix} 1.0 & 0.40 & 0.60 & 0.5 & 0.30 \\ 0.4 & 0.70 & 0.50 & 0.4 & 0.25 \\ 0.6 & 0.50 & 1.30 & 0.6 & 0.35 \\ 0.5 & 0.40 & 0.60 & 1.0 & 0.30 \\ 0.3 & 0.25 & 0.35 & 0.3 & 0.50 \end{pmatrix}.$$

Find the weight vector of the minimum-variance portfolio; see
Example 5.6.
6. Let Rmv denote the return on the minimum-variance portfolio and
let R0 denote the return on a zero-investment portfolio. Show that

Cov(Rmv , R0 ) = 0.


Does the converse hold? That is, suppose that a portfolio return
Rp satisfies
Cov(Rp , R0 ) = 0
for the return R0 on any zero-investment portfolio. Does it follow
that Rp is the return on the minimum-variance portfolio? Why or
why not?
7. Let Rmv denote the return on the minimum-variance portfolio and
let Rp denote the return on another portfolio. Find an expression
for
$$\frac{\mathrm{Var}(R_p)}{\mathrm{Var}(R_{mv})}$$
in terms of the correlation of Rp and Rmv .
8. Consider a market consisting of N assets and let R denote the
vector of asset returns; let μ denote the vector of mean returns and
let Σ denote the covariance matrix of R.
Let ŵ denote the weight vector of a portfolio on the efficient
frontier and let w̃ denote the weight vector of another portfolio that
is not on the efficient frontier. Suppose that the two portfolios have
the same mean return and let γ = Var(ŵT R)/Var(w̃T R) denote the
ratio of the variances of the portfolio returns; note that, since the
portfolio with weight vector ŵ is on the efficient frontier, 0 < γ < 1.
Find the correlation of returns on the two portfolios as a function
of γ.
9. Let λ1 , λ2 be nonnegative real numbers and let Rj denote the return
on the risk-averse portfolio with the risk-aversion parameter λj ,
j = 1, 2.
Consider the portfolio with a return of the form

Rp = wR1 + (1 − w)R2

where 0 < w < 1.

Show that Rp is the return on the risk-averse portfolio with
risk-aversion parameter λp and ﬁnd λp in terms of λ1 , λ2 , and w.
10. Suppose that the risk-aversion criterion is based on excess returns
rather than on standard returns. Therefore, for a given value of
λ > 0, let w̃λ denote the maximizer of
$$w^T(\mu - \mu_f \mathbf{1}) - \frac{\lambda}{2}\, w^T \Sigma w$$
over $w \in \mathbb{R}^N$, subject to the restriction that $\mathbf{1}^T w = 1$.
Find w̃λ and show how it relates to wλ , the weight vector of the
risk-averse portfolio based on parameter λ.


11. Consider a set of three assets with the mean return vector and
return covariance matrix as given in Exercise 1.
Using the approach described in Example 5.8, find wmv , the
weight vector of the minimum-variance portfolio, and v̄, the weight
vector of the zero-investment portfolio given in the statement of
Proposition 5.5. Use those results to give the weight vectors of the
risk-averse portfolio with parameters λ = 1 and λ = 5.
12. Consider a set of three assets with the mean return vector and
return covariance matrix as given in Exercise 1.
Find the mean and variance of the return on the risk-averse
portfolio based on a risk-aversion parameter λ.
13. Consider the set of five assets with the mean return vector and
return covariance matrix as given in Exercise 5.
Using the R function, solve.QP, as in Example 5.9, find the
weight vector of the risk-averse portfolio based on the risk-aversion
parameter λ = 1. Find the mean return and return standard
deviation of the portfolio.
14. Consider a vector of asset returns with mean vector μ and covari-
ance matrix Σ. Find an expression for the Sharpe ratio of the
tangency portfolio in terms of μ, Σ, and μf .
15. Consider a market consisting of N assets and let R denote the
vector of asset returns. Let μ denote the vector of mean returns
and let Σ denote the covariance matrix of R. Let μf denote the
return on the risk-free asset.
Find conditions on μ under which the minimum-variance port-
folio is the same as the tangency portfolio.
16. Use the general expression for the weight vector of the tangency
portfolio given in Section 5.7 to derive the expression for the
weight vector of the tangency portfolio for the N = 2 case given
in Section 4.6.
17. Let wT denote the weight vector of the tangency portfolio and let
wλ denote the weight vector of the risk-averse portfolio based on
the risk-aversion parameter λ.
Show that there exists λT > 0 such that wT = wλT and give an
expression for λT .
18. Consider a set of five assets with the mean return vector and return
covariance matrix as given in Exercise 5. Assume that the risk-free
rate of return is μf = 0.01.
Find wT , the weight vector of the tangency portfolio; see
Example 5.10.


19. Consider a set of six assets with return vector R. Suppose that the
mean vector of R − Rf 1 is given by (0.04, 0.08, 0.02, 0.10, 0.03, 0.06)
and that the covariance matrix of R is given by
$$\begin{pmatrix} 0.20 & 0.02 & 0.03 & 0.04 & 0.05 & 0.06 \\ 0.02 & 0.50 & 0.06 & 0.08 & 0.10 & 0.12 \\ 0.03 & 0.06 & 0.20 & 0.12 & 0.15 & 0.18 \\ 0.04 & 0.08 & 0.12 & 0.80 & 0.20 & 0.24 \\ 0.05 & 0.10 & 0.15 & 0.20 & 1.20 & 0.30 \\ 0.06 & 0.12 & 0.18 & 0.24 & 0.30 & 0.80 \end{pmatrix}.$$
Find wT , the weight vector of the tangency portfolio; see Exam-
ple 5.10.
20. Consider the set of five assets with the mean return vector and
return covariance matrix as given in Exercise 5. Assume that the
risk-free rate of return is μf = 0.01.
Find the Sharpe ratio of the tangency portfolio and compare
it to the Sharpe ratios of the equally-weighted portfolio and the
minimum-variance portfolio.
21. Let wλ denote the weight vector of the risk-averse portfolio with
parameter λ, as given by Proposition 5.5. Write wλ in terms of
wmv , the weight vector of the minimum-variance portfolio, and
wT , the weight vector of the tangency portfolio.
22. For the three assets with the mean return vector and return covari-
ance matrix given in Exercise 1, determine the weight vector
that maximizes the risk-aversion criterion function with parameter
λ = 5, subject to the constraint that all weights are nonnegative;
see Example 5.12.
Find the mean and variance of the return on the resulting port-
folio and compare these to the mean and variance of the return on
the risk-averse portfolio based on λ = 5.
23. Consider the set of five assets with the mean return vector and
return covariance matrix as given in Exercise 5.
Suppose we want to find the portfolio weights that maximize
the risk-aversion criterion function with parameter λ = 1 subject
to the constraints that the portfolio weights are all nonnegative
and that the sum of the weights given to assets 1 and 2 is equal to
the sum of the weights given to assets 4 and 5. That is, in terms
of the weight vector w = (w1 , w2 , w3 , w4 , w5 )T , we want to enforce
the constraints that wj ≥ 0 for j = 1, 2, . . . , 5 and that
w1 + w2 = w4 + w5 .

Find the optimal weight vector and calculate the mean return
and return standard deviation of the corresponding portfolio.

Compare these to the mean return and return variance of the
unconstrained risk-averse portfolio with parameter λ = 1 found in
Exercise 13.
24. Consider the set of six assets with the mean excess return vector
and return covariance matrix as specified in Exercise 19. Using the
approach described in Example 5.14, find the weight vector of the
portfolio that maximizes the Sharpe ratio subject to the restriction
that all weights are nonnegative.
25. Consider the set of six assets with the mean excess return vector and
return covariance matrix as specified in Exercise 19. Find the weight
vector of the portfolio that maximizes the Sharpe ratio subject
to the restriction that the weight vector (w1 , w2 , w3 , w4 , w5 , w6 )T
satisfies
w1 + w2 + w3 ≤ w4 + w5 + w6 .


6
Estimation

6.1 Introduction
The portfolio theory developed in the previous chapters is based on prop-
erties of the distribution of asset returns, speciﬁcally their means, standard
deviations, and correlations. Of course, in practice, these parameters are all
unknown and must be estimated.
The simplest approach to estimating such parameters is to use the cor-
responding sample versions based on historical data; for instance, we can
estimate a mean return by the sample mean of a series of observed returns.
Such methods often work fairly well, particularly when analyzing just a few
assets. However, in many cases, better estimators are available.
In this chapter, several methods of estimating the parameters needed
for portfolio analysis are presented. Other methods, which build on those
discussed here, are covered in the following chapters.

6.2 Basic Sample Statistics

The most straightforward approach to estimating parameters of return distri-
butions is to use empirical estimators based on observed returns. Let Rj,t
denote the return on asset j for period t; here we use monthly returns,
although the methods can be applied to other return intervals as well. We
assume that T periods of data are available so that t = 1, 2, . . . , T . For Rj,t ,
we generally use returns that have been adjusted for dividends and stock splits,
as discussed in Section 2.3 and as provided by services such as Yahoo Finance.
When obtaining return data, we must choose the observation period, which
is sometimes called the sampling horizon. The observation period refers to the
total time period over which data are collected. The choice of the observation
period must balance two competing considerations. A longer period gives us
more data points that may yield more accurate estimators. However, this
higher accuracy is only available if the parameters being estimated are con-
stant over the period being sampled. For instance, it may be tempting to use
50 years of monthly returns to estimate the mean return on a stock. However,
it is unlikely that the return on a stock in 1966 will be relevant to investment
decisions made in 2016. From a statistical point of view, using a long
observation period may lead to a bias in the estimator due to the fact that the
parameters are not constant over that period. In practice, observation periods
in the range of 3–10 years are typically used. With shorter return intervals,
shorter observation periods are sometimes used.
We assume that, over the observation period, the parameters of interest are
constant. For instance, consider asset j, for some j = 1, 2, . . . , N ; we assume
that for each t = 1, 2, . . . , T , $E(R_{j,t}) = \mu_j$ and $\mathrm{Var}(R_{j,t}) = \sigma_j^2$. Returns on
an asset in different time periods are assumed to be uncorrelated so that
for $t \ne s$, $\mathrm{Cov}(R_{j,t}, R_{j,s}) = 0$; that is, for a given asset, the sequence
of returns consists of uncorrelated random variables all with the same mean
and standard deviation. Let $\rho_{jk}$ denote the correlation of $R_{j,t}$ and $R_{k,t}$ for any
j, k = 1, 2, . . . , N , $j \ne k$; thus, we allow returns on different assets in
the same time period to be correlated. Returns on different assets in different
time periods are assumed to be uncorrelated: $\mathrm{Cov}(R_{j,t}, R_{k,s}) = 0$ for $t \ne s$.
In many cases, we are interested in the mean excess return on an asset.
Let Rf,t denote the return on the risk-free asset at time t. It is important to
realize that, even though the return on the risk-free asset has zero variance,
the risk-free rate itself changes over time. When estimating the mean excess
return on asset j, we assume that the excess returns Rj,t − Rf,t , t = 1, 2, . . . , T
are uncorrelated random variables, each with a given expected value, which
we denote by μj − μf .

Estimation of Return Means and Standard Deviations

The simplest estimator of μj , the expected return on asset j, is given by the
sample mean of Rj,1 , Rj,2 , . . . , Rj,T :

$$\bar{R}_j = \frac{1}{T}\sum_{t=1}^{T} R_{j,t}.$$

The corresponding estimator of the mean excess return, μj − μf , is given by

$$\frac{1}{T}\sum_{t=1}^{T} (R_{j,t} - R_{f,t}) = \bar{R}_j - \bar{R}_f$$

where

$$\bar{R}_f = \frac{1}{T}\sum_{t=1}^{T} R_{f,t}.$$
Now consider estimation of the return standard deviations or the return
variances. Note that here we have a choice: we have defined $\sigma_j^2$ to be the
variance of $R_{j,t}$; however, because the return on the risk-free asset has zero
variance, $\sigma_j^2$ is also the variance of $R_{j,t} - R_{f,t}$. Therefore, to estimate
$\sigma_j^2$, we may use either the sample variance of the returns

$$R_{j,1}, R_{j,2}, \ldots, R_{j,T}$$

or the sample variance of the excess returns

$$R_{j,1} - R_{f,1},\; R_{j,2} - R_{f,2},\; \ldots,\; R_{j,T} - R_{f,T}.$$

Because $R_{f,t}$ changes with t, these estimates will differ.

To some extent, the choice depends on the context. For instance, if we
are simply analyzing the properties of the returns on an asset, as we did in
Chapter 2, the sample variance of the returns is generally appropriate; on the
other hand, if we are using the results to construct an eﬃcient portfolio, as
we did in Chapters 4 and 5, then excess returns are often relevant and, in
those cases, it would be appropriate to use the sample variance of the excess
returns.
Both approaches are used here; of course, given a result for an estimator
based on excess returns, it is a simple matter to describe an analogous estima-
tor based on standard returns and vice versa. Note that if values in the series
Rf,t , t = 1, 2, . . . , T are approximately constant, which is often the case, there
will be only minor diﬀerences in the two estimates.
The sample variance of the excess returns on asset j is given by

$$S_j^2 = \frac{1}{T-1}\sum_{t=1}^{T} \{R_{j,t} - R_{f,t} - (\bar{R}_j - \bar{R}_f)\}^2;$$

the sample standard deviation $S_j$ is simply the square root of the sample
variance, $S_j = \sqrt{S_j^2}$.
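The formulas above can be checked directly in R. The sketch below uses simulated returns (the seed, sample size, and distributional parameters are arbitrary assumptions, not data from the text) and confirms that the hand-computed quantities agree with the built-in functions mean, var, and sd.

```r
set.seed(1)                      # simulated data; values are illustrative only
T.obs <- 60
ret <- rnorm(T.obs, mean = 0.01, sd = 0.05)       # hypothetical monthly returns
rf  <- 0.0001 + runif(T.obs, 0, 0.0001)           # hypothetical risk-free returns

ex <- ret - rf                   # excess returns

# Sample mean excess return: equals the difference of the two sample means.
mean.ex <- sum(ex) / T.obs
all.equal(mean.ex, mean(ret) - mean(rf))

# Sample variance with divisor T - 1, and the sample standard deviation.
S2 <- sum((ex - mean.ex)^2) / (T.obs - 1)
S  <- sqrt(S2)
all.equal(c(S2, S), c(var(ex), sd(ex)))
```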

Example 6.1 Consider returns on Wal-Mart stock. In Chapter 2, we calculated
the monthly returns for the five-year period from January 2010 to
December 2014, which we placed in the variable wmt.m.ret. Therefore, to
calculate the excess returns, we need five years of monthly returns on a
risk-free asset.
As noted in Chapter 4, a standard choice for the risk-free return is the
return on a three-month Treasury Bill. This can be obtained from the Federal
Reserve website,

http://www.federalreserve.gov/releases/h15/data.htm

which contains an Excel spreadsheet with historical values dating back to
1934, found under the “Treasury Bills (secondary market)” heading.
Once the spreadsheet is downloaded, the simplest way to use the data in R
is to copy them from Excel by highlighting the relevant values, copying them
to the “clipboard” (i.e., by highlighting the cells and clicking “copy”), and
using the scan function, which reads the data into a vector:
> rff<-scan(file="clipboard")
Read 60 items
> head(rff)
[1] 0.06 0.11 0.15 0.16 0.16 0.12

Alternatively, the file name may be specified in scan; it is often convenient to
use file=file.choose(), which allows the user to select the file from a menu.
The values in the variable rff are annual returns as percentages, which
must be converted to proportional monthly returns using the command
> rfree<-(1 + rff/100)^(1/12)-1
> head(rfree)
[1] 4.999e-05 9.162e-05 1.249e-04 1.332e-04 1.332e-04 9.995e-05
For example, 12 months of compounded returns at a rate of $4.999 \times 10^{-5}$
yield a yearly rate of 0.06%:
> (1 + 4.999e-05)^12
[1] 1.0006
The excess returns for Wal-Mart stock can then be obtained as the
difference of the Wal-Mart returns and the risk-free rate:
> wmt.ex<-wmt.m.ret-rfree
> mean(wmt.ex)
[1] 0.01088
> sd(wmt.ex)
[1] 0.04416
Note that the standard deviation of the standard Wal-Mart returns is also
approximately 0.04416; retaining more significant digits shows that the stan-
dard deviation of the excess returns is 0.0441599 while the standard deviation
of the standard returns is 0.0441572, a difference that is meaningless in
practice.

Sample Covariances and Correlations

The same basic approach used to estimate return means and standard
deviations may be used to estimate a covariance or correlation of the returns on
different assets. Using the excess returns, the sample covariance of the returns
on assets j and k is given by

$$S_{jk} = \frac{1}{T-1}\sum_{t=1}^{T} \{R_{j,t} - R_{f,t} - (\bar{R}_j - \bar{R}_f)\}\{R_{k,t} - R_{f,t} - (\bar{R}_k - \bar{R}_f)\}.$$

The correlation of the returns on assets j and k can be estimated by the sample
correlation, sometimes called the sample correlation coefficient,

$$\hat{\rho}_{jk} \equiv \frac{S_{jk}}{S_j S_k}.$$

Note that the sample correlation has many of the same properties as the
correlation between two random variables; for example, it takes values in the
interval [−1, 1] and it is not affected by linear transformations of the data
with positive slope.
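A short numerical check of these definitions is sketched below with simulated data (the seed and parameters are arbitrary assumptions). It computes the sample covariance and correlation from the formulas and compares them with cov and cor, and then verifies that shifting and rescaling each series by a positive constant leaves the sample correlation unchanged.

```r
set.seed(2)                      # simulated excess returns; illustrative only
x <- rnorm(60, 0.01, 0.05)
y <- 0.3 * x + rnorm(60, 0.01, 0.04)

# Sample covariance with divisor T - 1 and the implied sample correlation.
S.xy   <- sum((x - mean(x)) * (y - mean(y))) / (length(x) - 1)
rho.xy <- S.xy / (sd(x) * sd(y))
all.equal(c(S.xy, rho.xy), c(cov(x, y), cor(x, y)))

# Shifting and rescaling by positive constants leaves the correlation unchanged.
rho.transformed <- cor(100 * x + 5, 12 * y - 1)
all.equal(rho.transformed, rho.xy)
```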


Example 6.2 Let sbux.ex denote five years of excess monthly returns on
Starbucks stock (symbol SBUX), calculated using the same procedure used
for wmt.ex in Example 6.1. The sample covariance between the excess returns
on Wal-Mart and Starbucks stock may be calculated using the cov function:
> cov(wmt.ex, sbux.ex)
          [,1]
[1,] 0.0002081
The corresponding sample correlation may be calculated using the cor
function:
> cor(wmt.ex, sbux.ex)
        [,1]
[1,] 0.07769

Statistical Properties of the Estimators

The sample mean, standard deviation, and correlation are used in many
applications of statistics and their properties are well-known.
Consider the properties of $\bar{R}_j$ and $S_j$ under the assumption that
$R_{j,1}, R_{j,2}, \ldots, R_{j,T}$ are independent, identically distributed random variables.
Then $E(\bar{R}_j) = \mu_j$, so that $\bar{R}_j$ is an unbiased estimator of $\mu_j$, and
$\mathrm{Var}(\bar{R}_j) = \sigma_j^2/T$. Furthermore, the sampling distribution of $\bar{R}_j$ is
approximately normal. More formally, we say that the standardized estimator
$\sqrt{T}(\bar{R}_j - \mu_j)/\sigma_j$ converges in distribution to a standard normal
distribution as $T \to \infty$, a consequence of the central limit theorem (CLT).
The same result holds under the
weaker assumption that $R_{j,1}, R_{j,2}, \ldots, R_{j,T}$ are uncorrelated.

Example 6.3 Suppose that the returns on an asset have mean 0.01 and
standard deviation 0.05. Then the sample mean return $\bar{R}_j$ has expected value
0.01 and standard deviation $0.05/\sqrt{T}$, where T is the number of observations.
For example, for T = 60, $\bar{R}_j$ is approximately normally distributed with mean
0.01 and standard deviation

$$\frac{0.05}{\sqrt{60}} = 0.00645.$$

Since the 75th percentile of the standard normal distribution is approximately
2/3, the lower and upper quartiles of the distribution of $\bar{R}_j$ are
approximately

$$0.01 \pm \frac{2}{3}(0.00645) = 0.00570 \text{ and } 0.0143.$$
That is, based on a sample of size 60, there is about a 50% chance that the
sample mean return will be within 0.0043 of the true mean return.
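The calculation in Example 6.3 can be reproduced in R using qnorm for the normal percentile rather than the 2/3 approximation; the variable names below are illustrative choices, not from the text.

```r
sigma <- 0.05; mu <- 0.01; T.obs <- 60

se  <- sigma / sqrt(T.obs)       # standard deviation of the sample mean
z75 <- qnorm(0.75)               # exact 75th percentile, close to 2/3

# Lower and upper quartiles of the approximate normal distribution
# of the sample mean.
quartiles <- mu + c(-1, 1) * z75 * se
round(c(se = se, z75 = z75), 5)
round(quartiles, 4)
```

The exact percentile differs slightly from 2/3, so the quartiles agree with those in the example only up to the accuracy of that approximation.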

The sampling distribution of the sample mean return may be used to
construct an approximate confidence interval for the asset’s true mean return
in the usual way. Let $\bar{R}_j$ and $S_j$ denote the sample mean and sample standard
deviation of a series of returns on asset j; then

$$\bar{R}_j \pm 1.96\,\frac{S_j}{\sqrt{T}}$$

is an approximate 95% confidence interval for $\mu_j$, the mean return on the
asset; the same approach may be used to construct an approximate
confidence interval for the mean excess return. Recall that the estimated standard
deviation of an estimator, $S_j/\sqrt{T}$ in this case, is known as its standard error.
When T is small, a confidence interval based on the t-distribution might be
used, provided that it is reasonable to assume that the returns are normally
distributed. However, when analyzing return data, T is generally large enough
that we may use the approximate confidence interval described earlier that
does not require normally-distributed returns.
Example 6.4 Let μW − μf denote the mean excess return on a share of
Wal-Mart stock. Let R̄W and R̄f denote the sample means of the returns
on Wal-Mart stock and the risk-free asset, respectively, and let SW denote
the sample standard deviation of the excess returns. Using the results in
Example 6.1, R̄W − R̄f = 0.0109 and SW = 0.0442; therefore, an approximate
95% confidence interval for μW − μf is given by
$$0.0109 \pm 1.96\,\frac{0.0442}{\sqrt{60}} = 0.0109 \pm 0.0112 = (-0.0003,\ 0.0221).$$
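The interval in Example 6.4 can be computed from the reported summary statistics with a small helper function. The function name ci.mean and its arguments are assumptions introduced for this sketch; it uses qnorm to obtain the normal quantile, which is approximately 1.96 for a 95% interval.

```r
# Approximate confidence interval for a mean from summary statistics.
# The function name ci.mean is illustrative, not from the text.
ci.mean <- function(xbar, s, n, level = 0.95) {
  z  <- qnorm(1 - (1 - level) / 2)   # about 1.96 for level = 0.95
  se <- s / sqrt(n)                  # standard error of the sample mean
  c(lower = xbar - z * se, upper = xbar + z * se)
}

# Summary statistics reported in Examples 6.1 and 6.4.
ci <- ci.mean(0.0109, 0.0442, 60)
round(ci, 4)
```

With the data available, the same function could be called as ci.mean(mean(wmt.ex), sd(wmt.ex), length(wmt.ex)).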
It may be of interest to compare the returns on two assets. Consider
assets i and j; let Ri,1 , Ri,2 , . . . , Ri,T denote the returns on asset i and
let Rj,1 , Rj,2 , . . . , Rj,T denote the returns on asset j. Suppose that we are
interested in estimating the difference in the mean returns, μi − μj , where
μi = E(Ri,t ) and μj = E(Rj,t ). To estimate μi − μj , we can simply use the
difference in the sample means, R̄i − R̄j .
However, to construct a confidence interval for μi − μj , we must take into
account the fact that the returns on different assets in the same time period are
generally correlated. That is, Cov(Ri,t , Rj,t ) ≠ 0. The data Ri,1 , Ri,2 , . . . , Ri,T
and Rj,1 , Rj,2 , . . . , Rj,T may be viewed as “matched pairs” data, where the
pair is based on the time period. Therefore, we should consider the data to
be T pairs of the form (Ri,t , Rj,t ), t = 1, 2, . . . , T , where the pairs for different
time periods are uncorrelated, but the returns in each pair are correlated.
To construct a confidence interval for μi − μj in such cases, we compute
the differences of the pairs’ returns, Yt = Ri,t − Rj,t , t = 1, 2, . . . , T . Then
Y1 , Y2 , . . . , YT are uncorrelated random variables with mean μi − μj and
standard deviation σd ; although it is possible to write σd in terms of the asset
return standard deviations and the correlation of the asset returns, such an
expression is not needed because σd may be estimated directly.


Therefore, an approximate 95% confidence interval for μi − μj is given by

$$\bar{Y} \pm 1.96\,\frac{S_Y}{\sqrt{T}}$$

where $\bar{Y}$ and $S_Y$ are the sample mean and sample standard deviation,
respectively, of $Y_1, Y_2, \ldots, Y_T$. Note that $\bar{Y} = \bar{R}_i - \bar{R}_j$ but $S_Y$ cannot be written as
a function of the individual return standard deviations alone. Note also that
$Y_t$ is the difference of the excess returns on the two assets.
Example 6.5 Consider the five years of monthly excess returns on Wal-Mart
and Starbucks stock, stored in the variables wmt.ex and sbux.ex, respectively.
Then the differences of the returns may be calculated using
> ret.d<-wmt.ex-sbux.ex
Note that ret.d also contains the differences of the standard returns on the
two assets.
The sample mean and standard deviation of the differences are given by
> mean(ret.d)
[1] -0.0134
> sd(ret.d)
[1] 0.0722
Therefore, an approximate 95% confidence interval for μW − μS , the
difference in the mean returns of the two stocks, is given by
$$-0.0134 \pm 1.96\,\frac{0.0722}{\sqrt{60}} = -0.0134 \pm 0.0183 = (-0.0317,\ 0.0049).$$
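The matched-pairs calculation can be sketched with simulated data as follows. The two return series are generated with a shared component so that they are correlated within each period; the seed, means, and standard deviations are arbitrary assumptions and are not the Wal-Mart and Starbucks data. The final line illustrates why the standard deviation of the differences cannot be obtained from the two individual standard deviations alone: the sample covariance also enters.

```r
set.seed(3)                       # simulated, positively correlated returns
T.obs  <- 60
common <- rnorm(T.obs, 0, 0.03)   # shared component inducing correlation
r.i <- 0.010 + common + rnorm(T.obs, 0, 0.03)
r.j <- 0.015 + common + rnorm(T.obs, 0, 0.05)

y  <- r.i - r.j                   # matched-pair differences
ci <- mean(y) + c(-1, 1) * 1.96 * sd(y) / sqrt(T.obs)
ci

# The sample variance of the differences depends on the sample covariance
# as well as on the two individual sample variances.
all.equal(var(y), var(r.i) + var(r.j) - 2 * cov(r.i, r.j))
```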
The sample correlation of the returns on two assets, $\hat{\rho}_{ij}$, is also
approximately normally distributed, with the mean given by the corresponding true
correlation, $\rho_{ij}$, and standard error $(1 - \rho_{ij}^2)/\sqrt{T}$. Thus, an approximate 95%
confidence interval for $\rho_{ij}$ is given by

$$\hat{\rho}_{ij} \pm 1.96\,\frac{1 - \hat{\rho}_{ij}^2}{\sqrt{T}}.$$
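As an illustration, the interval can be evaluated for the sample correlation of the Wal-Mart and Starbucks excess returns reported in Example 6.2, with T = 60. The helper name ci.cor is an assumption introduced for this sketch.

```r
# Approximate 95% confidence interval for a correlation, using the
# standard error (1 - r^2) / sqrt(n). The name ci.cor is illustrative.
ci.cor <- function(r, n) {
  se <- (1 - r^2) / sqrt(n)
  c(lower = r - 1.96 * se, upper = r + 1.96 * se)
}

# Sample correlation from Example 6.2 with T = 60 monthly observations.
ci.rho <- ci.cor(0.07769, 60)
ci.rho
```

The resulting interval is wide and contains zero, which indicates how imprecisely a small correlation is estimated from five years of monthly data.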
Many of the quantities estimated when analyzing return data have sim-
ilar sampling distributions that can be used to determine standard errors
and approximate confidence intervals, and these will be discussed as needed
throughout the remainder of the book.

6.3 Estimation of the Mean Vector and Covariance Matrix
In practice, we are often interested in estimating the means, standard devi-
ations, and correlations for the returns, or excess returns, on several assets.

Hence, it is useful to consider estimation of the mean vector and covariance
matrix of a vector of asset returns.
Suppose that there are N assets under consideration. Let $R_t$ be the N × 1
vector of asset returns at time period t of the form

$$R_t = \begin{pmatrix} R_{1,t} \\ R_{2,t} \\ \vdots \\ R_{N,t} \end{pmatrix}, \quad t = 1, 2, \ldots, T.$$

Let μ and Σ denote the mean vector and covariance matrix, respectively,
of $R_t$.
An estimator of μ is given by the sample mean vector of the returns,

$$\bar{R} = \frac{1}{T}\sum_{t=1}^{T} R_t.$$

The vector of mean excess returns, $\mu - \mu_f \mathbf{1}$, may be estimated by the
sample mean vector of the excess returns

$$\bar{R}_E = \bar{R} - \bar{R}_f \mathbf{1} = \begin{pmatrix} \bar{R}_1 - \bar{R}_f \\ \bar{R}_2 - \bar{R}_f \\ \vdots \\ \bar{R}_N - \bar{R}_f \end{pmatrix}.$$
To estimate Σ, we can use the sample covariance matrix S, calculated
from either the standard returns or the excess returns; here we use the excess
returns. The sample covariance matrix is an N × N matrix with the (j, k)th
element given by $S_j^2$ if $k = j$ and $\hat{\rho}_{jk} S_j S_k$ if $k \ne j$. The same information,
in a form that is easier to interpret, is provided by the asset excess-return
sample standard deviations, $S_1, S_2, \ldots, S_N$, together with the corresponding
sample correlation matrix, $\hat{C}$, the N × N matrix with ones on the diagonal
and the (j, k)th element given by $\hat{\rho}_{jk}$ for $j \ne k$.
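The equivalence between the covariance matrix and the pair (standard deviations, correlation matrix) can be checked numerically. The sketch below uses a simulated 60 × 3 matrix of excess returns with made-up asset labels; cov2cor is the base R function that converts a covariance matrix to the corresponding correlation matrix.

```r
set.seed(4)                        # simulated excess returns; illustrative only
X <- matrix(rnorm(60 * 3, 0.01, 0.05), nrow = 60, ncol = 3)
colnames(X) <- c("A", "B", "C")    # hypothetical asset labels

S     <- cov(X)                    # sample covariance matrix
s.dev <- sqrt(diag(S))             # sample standard deviations S_1, ..., S_N
C.hat <- cov2cor(S)                # sample correlation matrix

# Rebuild S from the standard deviations and the correlations:
# the (j, k)th element is rho_jk * S_j * S_k.
S.rebuilt <- diag(s.dev) %*% C.hat %*% diag(s.dev)
all.equal(S.rebuilt, S, check.attributes = FALSE)
```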

Data Matrix
When describing the sample mean vector and the sample covariance matrix, it
is often convenient to express them in terms of a data matrix. The data matrix
for the excess returns, which we denote here by X, is the T × N matrix with
row t given by the vector of excess returns at time t, $(R_t - R_{f,t}\mathbf{1})^T$:

$$X = \begin{pmatrix} R_{1,1} - R_{f,1} & R_{2,1} - R_{f,1} & \cdots & R_{N,1} - R_{f,1} \\ R_{1,2} - R_{f,2} & R_{2,2} - R_{f,2} & \cdots & R_{N,2} - R_{f,2} \\ \vdots & \vdots & \cdots & \vdots \\ R_{1,T} - R_{f,T} & R_{2,T} - R_{f,T} & \cdots & R_{N,T} - R_{f,T} \end{pmatrix}.$$

Thus, the jth column of X is the time series of excess returns on asset j and
row t of X is the vector of N asset excess returns at time t.


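The sample mean vector and the sample covariance matrix have simple
expressions in terms of X. The (column) vector of sample mean excess returns
may be written

$$\bar{R}_E = \bar{R} - \bar{R}_f \mathbf{1} = \frac{1}{T} X^T \mathbf{1}_T. \tag{6.1}$$

The sample covariance matrix has a particularly simple expression in terms
of X:

$$S = \frac{1}{T-1}\,(X - \mathbf{1}_T \bar{R}_E^T)^T (X - \mathbf{1}_T \bar{R}_E^T). \tag{6.2}$$

Note that

$$\mathbf{1}_T \bar{R}_E^T = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} \begin{pmatrix} \bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \end{pmatrix} = \begin{pmatrix} \bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \\ \bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \\ \vdots & \vdots & \cdots & \vdots \\ \bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \end{pmatrix}$$

so that $X - \mathbf{1}_T \bar{R}_E^T$ is given by

$$\begin{pmatrix} R_{1,1} - R_{f,1} - (\bar{R}_1 - \bar{R}_f) & R_{2,1} - R_{f,1} - (\bar{R}_2 - \bar{R}_f) & \cdots & R_{N,1} - R_{f,1} - (\bar{R}_N - \bar{R}_f) \\ R_{1,2} - R_{f,2} - (\bar{R}_1 - \bar{R}_f) & R_{2,2} - R_{f,2} - (\bar{R}_2 - \bar{R}_f) & \cdots & R_{N,2} - R_{f,2} - (\bar{R}_N - \bar{R}_f) \\ \vdots & \vdots & \ddots & \vdots \\ R_{1,T} - R_{f,T} - (\bar{R}_1 - \bar{R}_f) & R_{2,T} - R_{f,T} - (\bar{R}_2 - \bar{R}_f) & \cdots & R_{N,T} - R_{f,T} - (\bar{R}_N - \bar{R}_f) \end{pmatrix}.$$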
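The matrix expressions (6.1) and (6.2) translate directly into R. The sketch below applies them to a simulated data matrix (the dimensions, seed, and distribution are arbitrary assumptions) and checks the results against colMeans and cov.

```r
set.seed(5)                         # simulated data matrix; illustrative only
T.obs <- 60; N <- 4
X <- matrix(rnorm(T.obs * N, 0.01, 0.05), nrow = T.obs, ncol = N)

ones <- matrix(1, nrow = T.obs, ncol = 1)   # the T x 1 vector of ones

# Equation (6.1): sample mean vector of the excess returns.
R.bar.E <- drop(t(X) %*% ones) / T.obs
all.equal(R.bar.E, colMeans(X))

# Equation (6.2): sample covariance matrix from the centered data matrix.
X.c <- X - ones %*% t(R.bar.E)      # subtract the column means from each row
S   <- t(X.c) %*% X.c / (T.obs - 1)
all.equal(S, cov(X))
```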

Example 6.6 Consider the returns on the stocks of eight large companies,
Apple (symbol AAPL), Baxter International (BAX), Coca-Cola (KO), CVS
Health Corporation (CVS), Exxon Mobil (XOM), IBM (IBM), Johnson &
Johnson (JNJ), and Walt Disney (DIS). These companies were chosen to
represent large companies from a variety of industries.
For each stock, ﬁve years of monthly excess returns were calculated for the
period ending December 31, 2014. In R, each vector of 60 excess returns was
stored as a variable with the name of the stock symbol (e.g., aapl for Apple).
To calculate the parameters of the distribution of the return vector Rt ,
it is convenient to have all of the data stored in a single matrix, with each
column corresponding to a particular stock; this can be done using the cbind
command:

> big8<-cbind(aapl, bax, ko, cvs, xom, ibm, jnj, dis)


Then big8 is a 60 × 8 matrix of excess returns; it corresponds to the data
matrix X described previously.
When reading the output from various functions, it is helpful to have the
columns of big8 labeled; this may be achieved using the following command:

> colnames(big8)<-c("AAPL", "BAX", "KO", "CVS", "XOM", "IBM",
+ "JNJ", "DIS")
> head(big8)
AAPL BAX KO CVS XOM IBM JNJ
[1,] -0.0886 -0.0186 -0.0483 0.00753 -0.0552 -0.06506 -0.0241
[2,] 0.0653 -0.0116 -0.0283 0.04254 0.0153 0.04353 0.0098
[3,] 0.1483 0.0272 0.0517 0.08313 0.0303 0.00845 0.0348
[4,] 0.1109 -0.1888 -0.0283 0.01211 0.0117 0.00571 -0.0139
[5,] -0.0163 -0.1058 -0.0385 -0.06216 -0.1019 -0.02415 -0.0852
[6,] -0.0209 -0.0309 -0.0168 -0.15344 -0.0562 -0.01431 0.0129
DIS
[1,] -0.0838
[2,] 0.0571
[3,] 0.1174
[4,] 0.0552
[5,] -0.0930
[6,] -0.0576

Descriptive statistics for the returns may now be calculated. Although
we could use matrix expressions like the one in (6.1) to obtain such results,
a simpler approach is to use the apply command, which applies a function
to the margins of a matrix or, more generally, an array. For instance, the
command apply(big8, MARGIN=2, FUN=mean) applies the function mean to the
“margin” of the matrix big8 designated by the MARGIN argument, here given
by “2,” to denote columns, the second dimension of the matrix big8. As with
many other R functions, the argument names can be omitted provided that
the order of the arguments is respected.

> apply(big8, MARGIN=2, FUN=mean)

AAPL BAX KO CVS XOM IBM JNJ DIS
0.02540 0.00740 0.00978 0.02119 0.00825 0.00598 0.01153 0.02075
> apply(big8, 2, sd)
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0739 0.0556 0.0412 0.0578 0.0459 0.0458 0.0386 0.0579

Therefore, the excess returns on Apple stock, for example, have sample mean
0.0254 and sample standard deviation 0.0739.
Of course, one or both of the results of the aforementioned apply function
may be assigned to a variable. For example,

> Rbar<-apply(big8, 2, mean)


To calculate the sample correlation matrix of the data in the matrix big8,
we use the cor function with the data matrix as the argument.
> cor(big8)
AAPL BAX KO CVS XOM IBM JNJ DIS
AAPL 1.000 0.193 0.260 0.329 0.303 0.319 0.145 0.346
BAX 0.193 1.000 0.310 0.330 0.327 0.381 0.473 0.196
KO 0.260 0.310 1.000 0.310 0.338 0.197 0.493 0.348
CVS 0.329 0.330 0.310 1.000 0.442 0.244 0.421 0.537
XOM 0.303 0.327 0.338 0.442 1.000 0.520 0.408 0.650
IBM 0.319 0.381 0.197 0.244 0.520 1.000 0.206 0.348
JNJ 0.145 0.473 0.493 0.421 0.408 0.206 1.000 0.323
DIS 0.346 0.196 0.348 0.537 0.650 0.348 0.323 1.000
Thus, the excess returns on Apple and Disney stocks have correlation 0.346, for
example. Note that the sample correlation matrix, like all correlation matrices,
is symmetric.
The sample covariance matrix may be calculated using the cov function:
> Smat<-cov(big8)
> Smat
AAPL BAX KO CVS XOM IBM
AAPL 0.005460 0.000794 0.000790 0.001405 0.001028 0.001080
BAX 0.000794 0.003088 0.000709 0.001061 0.000835 0.000969
KO 0.000790 0.000709 0.001694 0.000737 0.000638 0.000371
CVS 0.001405 0.001061 0.000737 0.003339 0.001174 0.000646
XOM 0.001028 0.000835 0.000638 0.001174 0.002109 0.001094
IBM 0.001080 0.000969 0.000371 0.000646 0.001094 0.002099
JNJ 0.000413 0.001016 0.000783 0.000939 0.000723 0.000365
DIS 0.001479 0.000631 0.000830 0.001797 0.001729 0.000923
JNJ DIS
AAPL 0.000413 0.001479
BAX 0.001016 0.000631
KO 0.000783 0.000830
CVS 0.000939 0.001797
XOM 0.000723 0.001729
IBM 0.000365 0.000923
JNJ 0.001491 0.000722
DIS 0.000722 0.003352
Thus, a second way to compute the standard deviations of these eight stocks
is to use the square root of the diagonal elements of Smat. The diagonal
of a square matrix may be extracted using the diag command. Hence, the
estimated standard deviations are given by
> diag(Smat)^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0739 0.0556 0.0412 0.0578 0.0459 0.0458 0.0386 0.0579
matching the results obtained previously.
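
The link between the sample covariance matrix and the sample correlation matrix can be checked directly: each correlation is the corresponding covariance divided by the product of the two standard deviations. The sketch below uses simulated data with hypothetical asset labels (A to D), since the original return vectors are not reproduced here.

```r
set.seed(2)
# Simulated returns for four hypothetical assets; illustrative only
X <- matrix(rnorm(60 * 4, sd = 0.05), nrow = 60)
colnames(X) <- c("A", "B", "C", "D")

S <- cov(X)
s <- sqrt(diag(S))                 # sample standard deviations
C_from_S <- S / outer(s, s)        # divide S[i, j] by s_i * s_j

all.equal(C_from_S, cor(X))        # same result as cor()
all.equal(cov2cor(S), cor(X))      # cov2cor() performs the same conversion
isSymmetric(C_from_S)              # correlation matrices are symmetric
```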


Some Properties of the Sample Covariance Matrix

Some basic properties of the sample covariance matrix S are easily obtained
from its expression in terms of the data matrix. For instance, it follows from
(6.2) that S is nonnegative definite. To see this, let u be an N × 1 vector.
Then $u^T S u = d^T d$ where
$$
d = \frac{1}{\sqrt{T-1}} (X - \mathbf{1}_T \bar{R}_E^T) u
$$
is a T × 1 vector. Writing $d = (d_1, d_2, \ldots, d_T)^T$, $d^T d = \sum_{t=1}^T d_t^2$; it follows
that $d^T d \ge 0$ and, hence, $u^T S u \ge 0$ for any $u \in \mathbb{R}^N$.
There is another important implication of (6.2) for the properties of S. Like
X, $X - \mathbf{1}_T \bar{R}_E^T$ is a T × N matrix. Hence, although S is an N × N matrix, the
rank of S is, at most, min(N, T). In fact, because each column of
$X - \mathbf{1}_T \bar{R}_E^T$ sums to zero, the rank of S is at most min(N, T − 1). Therefore,
if T < N, that is, if the number of time periods is less than the number of
assets, then S cannot be invertible and, hence, it cannot be positive definite;
the same is true when T = N. Suppose that we are analyzing five years of
monthly returns, T = 60. Then for S to be positive definite, we can
consider, at most, 59 assets.
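
These two properties can be illustrated numerically. The sketch below, based on simulated data with illustrative dimensions, inspects the eigenvalues of a sample covariance matrix when T exceeds N and when T is smaller than N. Because each column of the centered data matrix sums to zero, the number of nonzero eigenvalues cannot exceed T − 1.

```r
set.seed(3)

# Case 1: more time periods than assets (T = 60, N = 8)
X1 <- matrix(rnorm(60 * 8), nrow = 60)
ev1 <- eigen(cov(X1), symmetric = TRUE, only.values = TRUE)$values
min(ev1) > 0                        # all eigenvalues positive here

# Case 2: fewer time periods than assets (T = 10, N = 15)
X2 <- matrix(rnorm(10 * 15), nrow = 10)
S_small <- cov(X2)
ev2 <- eigen(S_small, symmetric = TRUE, only.values = TRUE)$values
num_rank <- sum(ev2 > 1e-10 * max(ev2))   # count eigenvalues that are not numerically zero
num_rank                            # cannot exceed T - 1 = 9
min(ev2) > -1e-10                   # still nonnegative definite, up to rounding
```

In the second case S_small is a 15 × 15 matrix of deficient rank, so solve(S_small) would fail or give unreliable results.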
In addition to the sample covariance matrix being singular when N is larger
than T , if N and T are roughly the same magnitude, as is often the case with
financial data, then there are features of the sample covariance matrix that
are not, in general, accurate estimators of the corresponding features of Σ.
The basic issue is somewhat obvious. The covariance matrix Σ contains
N(N + 1)/2 parameters. We have NT data points and, if $N \doteq T$, $NT \doteq N^2$.
Therefore, we are trying to estimate a large number of parameters with,
relatively speaking, a small amount of data. This occurs even though, when T is
relatively large, any one element of S, $S_{ij}$, provides an accurate estimator of
$\Sigma_{ij}$, the corresponding element of Σ.
Because the issue arises with the large number of covariances in Σ, not
with the variances, consider the properties of the sample correlation matrix
$\hat{C}$ as an estimator of the N × N correlation matrix C. If T is very large, so
that we have many observations, and N/T is near zero, so that the number of
assets is small relative to the number of data points, then $\hat{C} \doteq C$ with high
probability; more formally, $\hat{C}$ converges in probability to C as T → ∞ and
N remains fixed. Here, convergence in probability of a sequence of matrices is
defined elementwise.
However, if T is large but q = N/T is not near zero, so that N is also large,
there are ways in which $\hat{C}$ is a poor estimator of C. For instance, it may be
shown that
$$
\operatorname{tr}(\hat{C}^{-1}) \doteq \frac{1}{1-q} \operatorname{tr}(C^{-1})
$$

where tr(A) denotes the trace of a matrix A. Recall that the trace of a
matrix is the sum of its diagonal elements; it is also equal to the sum of its
eigenvalues.


Although we are not interested directly in the trace of $C^{-1}$, many of the
weight vectors we derived in the previous chapter are based on the inverse
of the asset return covariance matrix. Hence, this result suggests that, when
N/T is not small, such weight vectors are not well-estimated by replacing the
return covariance matrix with the corresponding sample covariance matrix.
For example, if we have five years of monthly returns on 40 assets, so that
N = 40 and T = 60, then q = 2/3 and the trace of $\hat{C}^{-1}$ will be approximately
three times the trace of $C^{-1}$, suggesting that $\hat{C}^{-1}$ is, in some respects, a poor
estimator of $C^{-1}$.
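
A small simulation gives a rough sense of this inflation. The sketch below assumes independent standard normal returns, so that the true correlation matrix is the identity and the trace of its inverse equals N. It uses N = 200 and T = 300 rather than N = 40 and T = 60: the ratio q = 2/3 is the same, but the approximation is a large-sample one and a single simulated data set is less noisy with larger dimensions. All names and settings are illustrative assumptions.

```r
set.seed(4)
N <- 200                                     # number of assets (illustrative)
T_obs <- 300                                 # number of time periods (illustrative)
q <- N / T_obs                               # q = 2/3

# Independent returns, so the true correlation matrix is the identity
X <- matrix(rnorm(T_obs * N), nrow = T_obs)
C_hat <- cor(X)

tr_true <- N                                 # tr(C^{-1}) = tr(I_N) = N
tr_hat <- sum(diag(solve(C_hat)))            # trace of the inverse sample correlation matrix
ratio <- tr_hat / tr_true

c(ratio = ratio, predicted = 1 / (1 - q))    # the two values should be of similar size
```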

6.4 Weighted Estimators

One drawback of estimators based on basic sample statistics, such as the
ones considered in the previous sections, is that they require a relatively long
series of data to have even moderate accuracy. Furthermore, in computing
the sample mean and sample standard deviation, the observation at t = 1 is
treated identically to the observation at t = T . For instance, if we are using
five years of monthly data, the returns from five years ago receive the same
weight in the estimates as do the returns from last month. Thus, the estimates
are sensitive to the assumption that the means and standard deviations of the
monthly returns are constant over time; if this assumption is questionable,
then such estimators may be biased as estimators of the parameters of the
return distribution in the current period.
Thus, it is sometimes desirable to use a weighted estimator, in which
more recent data receive more weight than do data from the first time period
being used. Such an estimator gives some protection against violations of the
assumption of constant parameter values over time.
Consider estimation of a mean return $\mu_j$ based on a series of returns
$R_{j,1}, R_{j,2}, \ldots, R_{j,T}$ on an asset; the same approach can be used to estimate
the mean excess return on an asset. An exponentially weighted moving average
(EWMA) estimator of $\mu_j$ is of the form
$$
\hat{\mu}_{wj} = \frac{\sum_{t=1}^{T} \gamma^{T-t} R_{j,t}}{\sum_{t=1}^{T} \gamma^{T-t}}
$$
for a constant γ, 0 < γ < 1, known as the decay parameter. Thus, the estimator
is a weighted average of the returns $R_{j,1}, R_{j,2}, \ldots, R_{j,T}$, with the weights
proportional to $\gamma^{T-t}$; the value of γ used controls how quickly the weights
decrease as t decreases.
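Note that, under the assumption that $R_{j,1}, R_{j,2}, \ldots, R_{j,T}$ each has
mean $\mu_j$,
$$
E(\hat{\mu}_{wj}) = \frac{\sum_{t=1}^{T} \gamma^{T-t} E(R_{j,t})}{\sum_{t=1}^{T} \gamma^{T-t}}
= \frac{\sum_{t=1}^{T} \gamma^{T-t} \mu_j}{\sum_{t=1}^{T} \gamma^{T-t}} = \mu_j
$$
so that $\hat{\mu}_{wj}$ is an unbiased estimator of $\mu_j$.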


If $E(R_{j,t})$ changes with t, then the weighted estimator will often have
smaller bias than does the sample mean return. For instance, suppose that
$E(R_{j,t}) = a + b(T - t)$, t = 1, 2, . . . , T, so that $R_{j,1}$ has an expected value
a + b(T − 1) and $R_{j,T}$ has an expected value a. Then, it is straightforward
to show that $\bar{R}_j$ has expected value
$$
a + b\,\frac{T-1}{2}
$$
and, using properties of geometric series, that $\hat{\mu}_{wj}$ has an expected value
$$
a + b\left(\frac{\gamma}{1-\gamma} - T\,\frac{\gamma^{T}}{1-\gamma^{T}}\right).
$$
Thus, as an estimator of the expected return in period T, $E(R_{j,T})$, $\bar{R}_j$ has
bias
$$
b\,\frac{T-1}{2}
$$
and $\hat{\mu}_{wj}$ has bias
$$
b\left(\frac{\gamma}{1-\gamma} - T\,\frac{\gamma^{T}}{1-\gamma^{T}}\right).
$$
For instance, suppose that a = 0.02, b = 0.01/59, and T = 60 so that the
asset mean return is 0.03 in period 1 and it decreases linearly to 0.02 in
period 60. Then the sample mean has bias 0.005 as an estimator of E(Rj,T )
and the EWMA estimator using γ = 0.95 has a bias of about 0.0027, roughly
half that of the sample mean. A smaller value of γ reduces the bias; for
instance, using γ = 0.9, the EWMA estimator has a bias of about 0.0015.
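
The bias figures quoted above can be reproduced with a few lines of R. The sketch below evaluates the expected value of each estimator directly from the assumed linear model for E(Rj,t) and also evaluates the closed-form bias expression for comparison. The function and variable names (ewma_bias, closed_form, mu_t, and so on) are illustrative and do not come from the text.

```r
# Linear model for the period means: E(R_{j,t}) = a + b (T - t)
a <- 0.02
b <- 0.01 / 59
T_obs <- 60
t_idx <- 1:T_obs
mu_t <- a + b * (T_obs - t_idx)            # mean return in each period

# Bias of the sample mean as an estimator of E(R_{j,T}) = a
bias_mean <- mean(mu_t) - a

# Bias of the EWMA estimator, computed directly from the weights
ewma_bias <- function(gamma) {
  w <- gamma^(T_obs - t_idx)               # weights gamma^(T - t)
  sum(w * mu_t) / sum(w) - a
}

# Closed-form bias expression for the EWMA estimator
closed_form <- function(gamma) {
  b * (gamma / (1 - gamma) - T_obs * gamma^T_obs / (1 - gamma^T_obs))
}

round(c(sample_mean = bias_mean,
        ewma_0.95 = ewma_bias(0.95),
        ewma_0.90 = ewma_bias(0.90)), 4)
all.equal(ewma_bias(0.95), closed_form(0.95))   # direct and closed-form values agree
```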
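Now consider the variance of the EWMA estimator. Suppose that
$R_{j,1}, R_{j,2}, \ldots, R_{j,T}$ each has standard deviation $\sigma_j$ and are uncorrelated. Then
$$
\operatorname{Var}(\hat{\mu}_{wj})
= \frac{\sum_{t=1}^{T} (\gamma^{T-t})^2 \operatorname{Var}(R_{j,t})}{\left(\sum_{t=1}^{T} \gamma^{T-t}\right)^2}
= \frac{\sum_{t=1}^{T} (\gamma^{T-t})^2}{\left(\sum_{t=1}^{T} \gamma^{T-t}\right)^2}\, \sigma_j^2
$$
so that the variance of $\hat{\mu}_{wj}$ depends on γ.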

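Note that, when T is reasonably large, in the sense that $\gamma^T \doteq 0$, then
$$
\sum_{t=1}^{T} \gamma^{T-t} = \sum_{t=0}^{T-1} \gamma^{t} \doteq \frac{1}{1-\gamma}; \tag{6.3}
$$
it follows that
$$
\frac{\sum_{t=1}^{T} (\gamma^{T-t})^2}{\left(\sum_{t=1}^{T} \gamma^{T-t}\right)^2}
= \frac{\sum_{t=1}^{T} (\gamma^2)^{T-t}}{\left(\sum_{t=1}^{T} \gamma^{T-t}\right)^2}
\doteq \frac{\dfrac{1}{1-\gamma^2}}{\left(\dfrac{1}{1-\gamma}\right)^2}
= \frac{1-\gamma}{1+\gamma};
$$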


and, hence,
$$
\operatorname{Var}(\hat{\mu}_{wj}) \doteq \frac{1-\gamma}{1+\gamma}\, \sigma_j^2.
$$

Because the variance of the average of M returns is $\sigma_j^2/M$, the
effective sample size corresponding to the decay parameter γ is the value of
M satisfying
$$
\frac{1}{M} = \frac{1-\gamma}{1+\gamma} \quad \text{or} \quad M = \frac{1+\gamma}{1-\gamma}.
$$
For example, an EWMA estimator with γ = 0.75 has roughly the same
variance as the average of (1 + 0.75)/(1 − 0.75) = 7 observations, while an
EWMA estimator based on γ = 0.95 has an effective sample size of 39. There-
fore, there is a price to pay for using a weighted estimator, in terms of the
variance of the estimator, for reducing possible bias arising from violation of
the constant-mean assumption. Note that, as is often the case in estimation
problems, a small value of γ tends to yield a smaller bias but a larger variance;
for a large value of γ, the reverse is true. Also note that going back further
in time to increase the number of observations has only a minor effect on the
variance of a weighted estimator once γT is small.
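
The effective sample size and the quality of the large-T approximation are easy to examine numerically. In the sketch below, eff_n and var_factor are hypothetical helper names introduced for illustration; var_factor computes the exact ratio of the sum of squared weights to the squared sum of weights for a finite series.

```r
# Effective sample size M = (1 + gamma) / (1 - gamma)
eff_n <- function(gamma) (1 + gamma) / (1 - gamma)

# Exact variance factor sum(w^2) / sum(w)^2 for a series of length T_obs
var_factor <- function(gamma, T_obs) {
  w <- gamma^((T_obs - 1):0)               # weights gamma^(T - t), t = 1, ..., T
  sum(w^2) / sum(w)^2
}

eff_n(c(0.75, 0.95))                       # effective sample sizes for the two examples

# Exact factor versus the approximation (1 - gamma) / (1 + gamma)
c(exact = var_factor(0.9, 60), approx = (1 - 0.9) / (1 + 0.9))

# Doubling the length of the series changes the factor very little
c(T60 = var_factor(0.9, 60), T120 = var_factor(0.9, 120))
```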
The remaining issue in implementing the weighted estimators is selection of
the decay parameter γ. Although it is possible to use a more formal approach,
such as choosing γ to maximize some measure of predictive accuracy, γ is often
chosen subjectively, looking at properties such as the effective sample size
and relying on past experience with weighted estimators in similar estimation
problems. In many applications, values of γ in the range from 0.90 to 0.98 are
used.

Example 6.7 Consider the monthly returns on Wal-Mart stock, stored in
the variable wmt.m.ret; see Example 6.1.
To calculate the weighted sample mean of the returns, note that 59:0 is
a vector of the form 59, 58, . . . , 1, 0; and, hence, using a decay parameter of
0.9, 0.9^(59:0) is a vector of the form $(0.9)^{59}, (0.9)^{58}, \ldots, (0.9), 1$. Then we
may calculate the weighted sample mean using the function weighted.mean,
which takes a data vector as the first argument and a vector of weights as the
second argument. Note that the weights do not need to be normalized to sum
to 1; the normalization is done by the function. Hence, the EWMA estimate
of the mean return on Wal-Mart stock based on the decay parameter 0.9 is
given by

> weighted.mean(wmt.m.ret, 0.9^(59:0))

[1] 0.0158

The value of the weighted estimator changes with the value of the decay
parameter γ and, as γ approaches one, the weighted estimator approaches the
sample mean.


> weighted.mean(wmt.m.ret, w=0.9^(59:0))

[1] 0.0158
> weighted.mean(wmt.m.ret, w=0.95^(59:0))
[1] 0.01313
> weighted.mean(wmt.m.ret, w=0.97^(59:0))
[1] 0.01231
> weighted.mean(wmt.m.ret, w=0.99^(59:0))
[1] 0.01144
> mean(wmt.m.ret)
[1] 0.01094
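
The behavior seen in this example can be checked on any series. The following sketch uses a simulated series ret as a stand-in for wmt.m.ret, which is not reproduced here, and a hypothetical helper ewma_mean that implements the EWMA formula directly. It confirms that weighted.mean gives the same value with or without normalized weights and that the estimate is close to the sample mean when γ is close to one.

```r
set.seed(5)
ret <- rnorm(60, mean = 0.01, sd = 0.05)   # simulated stand-in for a monthly return series

# EWMA estimate of the mean, written out from its definition
ewma_mean <- function(x, gamma) {
  w <- gamma^((length(x) - 1):0)           # most recent observation gets weight 1
  sum(w * x) / sum(w)
}

w <- 0.9^(59:0)
all.equal(ewma_mean(ret, 0.9), weighted.mean(ret, w))            # same estimate
all.equal(weighted.mean(ret, w), weighted.mean(ret, w / sum(w))) # normalization is not needed

# With gamma very close to 1, the weights are nearly equal
abs(ewma_mean(ret, 0.9999) - mean(ret))
```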

The EWMA approach can also be applied to variances. For a given value
of γ, the weighted estimator of the return variance is given by
$$
\hat{\sigma}^2_{wj} = \frac{\sum_{t=1}^{T} \gamma^{T-t} (R_{j,t} - \hat{\mu}_{wj})^2}{\sum_{t=1}^{T} \gamma^{T-t}} \tag{6.4}
$$

where μ̂wj is the EWMA estimate of μj based on this same value of γ.

Example 6.8 Consider the monthly returns on Wal-Mart stock stored in
the R variable wmt.m.ret. Calculation of a weighted estimator of the return
variance is simplified by first constructing a weight variable. Suppose we take
the decay parameter to be 0.9. Define a variable wgt by

> wgt<-(0.9^(59:0))/sum(0.9^(59:0))

This weight vector can be used to calculate the weighted estimate of the
mean return on Wal-Mart stock using the command

> muhat_w<-weighted.mean(wmt.m.ret, w=wgt)

> muhat_w
[1] 0.0158

and the weighted estimate of the return variance for Wal-Mart stock is given by

> weighted.mean((wmt.m.ret-muhat_w)^2, w=wgt)

[1] 0.00261

Weighted Estimators of the Mean Vector and Covariance Matrix
The EWMA approach may also be applied to estimation of the mean vector
and covariance matrix of a set of asset returns. The basic approach is the
same as that used to estimate the mean or variance of the return on a single
asset; hence, the discussion here focuses on an example.


Example 6.9 Consider Example 6.6, which considered the returns on the
stocks of eight large companies. Recall that the data for this example are
stored in the variable big8.
The calculations needed to estimate the mean vector and covariance matrix
of the excess returns may all be performed using the cov.wt function, which
calculates weighted estimates of the mean vector and covariance matrix based
on a data matrix.
Suppose we take the decay parameter to be γ = 0.97; the corresponding
weight vector may be calculated using the command
> wgt<-0.97^(59:0)
Note that, for use in the function cov.wt, the weight vector does not need to
be standardized to sum to 1.
To calculate the weighted estimates of the mean return vector and
covariance matrix, we use the command
> big8.wt<-cov.wt(big8, wt=wgt, method="ML")
The default value for the method argument is unbiased, which applies a
multiplicative adjustment to the result to yield an unbiased estimator of the
true covariance matrix, analogous to dividing by T − 1 instead of by T when
calculating the unweighted sample covariance matrix. Specifying method="ML"
returns an estimate that corresponds to (6.4) and does not include such an
adjustment.
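
To make the role of the method argument concrete, the sketch below computes the weighted mean vector and the weighted covariance matrix by hand and compares them with the output of cov.wt. It uses simulated data for three hypothetical assets rather than big8. The comparison also shows how the two methods are related: the default estimate equals the "ML" estimate divided by one minus the sum of the squared normalized weights.

```r
set.seed(6)
# Simulated returns for three hypothetical assets; illustrative only
X <- matrix(rnorm(60 * 3, mean = 0.01, sd = 0.05), nrow = 60)

w <- 0.97^(59:0)
w <- w / sum(w)                          # normalized weights

mu_w <- colSums(w * X)                   # weighted mean of each column
Xc <- sweep(X, 2, mu_w)                  # subtract the weighted means column by column
S_ml <- t(Xc) %*% (w * Xc)               # sum over t of w_t (x_t - mu)(x_t - mu)^T

fit_ml <- cov.wt(X, wt = w, method = "ML")
fit_ub <- cov.wt(X, wt = w, method = "unbiased")

all.equal(mu_w, fit_ml$center)           # hand-computed means match $center
all.equal(S_ml, fit_ml$cov)              # hand-computed matrix matches the "ML" result
all.equal(fit_ub$cov, fit_ml$cov / (1 - sum(w^2)))   # relation between the two methods
```

With equal weights 1/T, the factor 1 − sum(w^2) equals (T − 1)/T, which recovers the usual T − 1 divisor.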
The result of cov.wt is a list containing several components. For example,
the estimated mean returns are given in the component $center of big8.wt:
> big8.wt$center
AAPL BAX KO CVS XOM IBM JNJ
0.024478 0.008672 0.009056 0.025728 0.006367 0.000341 0.013607
DIS
0.023101
These may be compared to the unweighted sample means, as calculated earlier.
> Rbar
AAPL BAX KO CVS XOM IBM JNJ DIS
0.02540 0.00740 0.00978 0.02119 0.00825 0.00598 0.01153 0.02075
In most cases, the weighted and unweighted estimates are similar; however,
for IBM, the weighted estimate is much smaller than the unweighted estimate,
suggesting that the returns on IBM stock may be decreasing over time.
The estimated covariance matrix is available in the component $cov.
Hence, the estimated weighted sample standard deviations may be obtained
by using the diag function, which returns the diagonal elements of a matrix.
> diag(big8.wt$cov)^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0720 0.0462 0.0427 0.0511 0.0434 0.0473 0.0374 0.0505


FIGURE 6.1
Plot of excess returns on Baxter stock. [Figure not reproduced; vertical axis: Return, horizontal axis: Time.]

These can be compared to the unweighted estimates.

> diag(Sighat)^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0739 0.0556 0.0412 0.0578 0.0459 0.0458 0.0386 0.0579
Note that, for some stocks, there is a relatively large difference between the
weighted and unweighted estimates. For instance, for BAX, the unweighted
estimate is about 20% larger than the weighted estimate. This may be
explained by a plot of the BAX excess returns versus time, given in Figure 6.1.
Note that there is much more variability at the earlier time points than at the
later time points. Therefore, the weighted estimator, which gives low weight
to the returns at the beginning of the observed time period, is smaller than
the unweighted estimator, which treats each return equally.
The weighted and unweighted estimates of the correlations may be
obtained using the function cov2cor, which computes a correlation matrix
from a covariance matrix.
> cov2cor(big8.wt$cov)
AAPL BAX KO CVS XOM IBM JNJ DIS
AAPL 1.000 0.199 0.312 0.310 0.247 0.183 0.169 0.254
BAX 0.199 1.000 0.209 0.325 0.298 0.404 0.388 0.153
KO 0.312 0.209 1.000 0.260 0.261 0.171 0.566 0.316
CVS 0.310 0.325 0.260 1.000 0.444 0.137 0.383 0.588
XOM 0.247 0.298 0.261 0.444 1.000 0.420 0.384 0.603
IBM 0.183 0.404 0.171 0.137 0.420 1.000 0.174 0.285
JNJ 0.169 0.388 0.566 0.383 0.384 0.174 1.000 0.301
DIS 0.254 0.153 0.316 0.588 0.603 0.285 0.301 1.000


> cov2cor(Sighat)
AAPL BAX KO CVS XOM IBM JNJ DIS
AAPL 1.000 0.193 0.260 0.329 0.303 0.319 0.145 0.346
BAX 0.193 1.000 0.310 0.330 0.327 0.381 0.473 0.196
KO 0.260 0.310 1.000 0.310 0.338 0.197 0.493 0.348
CVS 0.329 0.330 0.310 1.000 0.442 0.244 0.421 0.537
XOM 0.303 0.327 0.338 0.442 1.000 0.520 0.408 0.650
IBM 0.319 0.381 0.197 0.244 0.520 1.000 0.206 0.348
JNJ 0.145 0.473 0.493 0.421 0.408 0.206 1.000 0.323
DIS 0.346 0.196 0.348 0.537 0.650 0.348 0.323 1.000

Although the weighted and unweighted estimates are generally close to
each other, there are some exceptions. For these data, the weighted estimates
tend to be closer to zero than are the unweighted estimates.

6.5 Shrinkage Estimators

The estimators of the mean return vector and return covariance matrix based
on the sample mean return vector and sample covariance matrix require only
weak assumptions regarding the asset returns. However, as we have discussed,
when analyzing many assets, such estimators may not work well in practice.
One approach to improving on those simple estimators is to make some
further assumptions regarding the return distributions, thus reducing the
number of parameters to be estimated. For instance, we might assume that
all assets have the same mean return, so that only one mean return needs to
be estimated.
Such estimators tend to work well when data follow the assumptions used.
However, as in the example in which all assets have the same mean return,
the assumptions that lead to the greatest simplification in the analysis are
often unrealistic. In such cases, combining the assumption-based estimators
with estimators based on simple sample statistics, a procedure known as
shrinkage, often leads to estimators that have better properties than either
the assumption-based estimators or those based on sample statistics.
Let Rt be the N × 1 vector of asset returns at time period t, t = 1, 2, . . . , T ;
let μ and Σ denote the mean vector and covariance matrix of Rt . We first
consider estimation of the mean vector μ; the techniques are then applied to
the more difficult, and in many ways more important, problem of estimating
Σ. The same approach could be used for estimation of the mean excess returns
but, to keep the notation simple, the discussion will focus on mean returns;
excess returns will be considered in the examples.
In Section 6.3, we considered the estimator of μ = (μ1 , μ2 , . . . , μN )T given
by the vector of return sample means R̄; for this estimator, each μj , the mean
return on asset j is estimated by the corresponding sample mean R̄j .


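Now consider the estimator of $\boldsymbol{\mu}$ based on the assumption that all asset
mean returns are equal: $\mu_1 = \mu_2 = \cdots = \mu_N \equiv \mu$ so that the mean vector $\boldsymbol{\mu}$
is of the form
$$
\boldsymbol{\mu} = \begin{pmatrix} \mu \\ \mu \\ \vdots \\ \mu \end{pmatrix}.
$$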
Note that, since the same risk-free rate applies to each asset, the model in which
the μj are equal is equivalent to one in which the excess mean returns are equal.
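Under this model, the common mean μ can be estimated by the sample
mean of $\bar{R}_1, \bar{R}_2, \ldots, \bar{R}_N$:
$$
\hat{\mu} = \frac{1}{N} \sum_{j=1}^{N} \bar{R}_j.
$$
Then each $\mu_j$ may be estimated by $\hat{\mu}$, j = 1, 2, . . . , N or, equivalently, the
mean vector $\boldsymbol{\mu}$ can be estimated by the vector
$$
\hat{\boldsymbol{\mu}} = \begin{pmatrix} \hat{\mu} \\ \hat{\mu} \\ \vdots \\ \hat{\mu} \end{pmatrix}.
$$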

The advantage of estimating $\mu_j$ by $\hat{\mu}$ rather than by $\bar{R}_j$ is that we may use
NT observations to estimate one mean, leading to a smaller standard error
for the estimate. Specifically, the variance of $\hat{\mu}$ may be obtained by noting
that we may write
$$
\hat{\mu} = \frac{1}{T} \sum_{t=1}^{T} \bar{R}_{\cdot t}
$$
where $\bar{R}_{\cdot t}$ is the sample mean of all the asset returns in time period t:
$$
\bar{R}_{\cdot t} = \frac{1}{N} \sum_{j=1}^{N} R_{j,t}.
$$

Note that $\bar{R}_{\cdot t}$ may be written
$$
\bar{R}_{\cdot t} = \frac{1}{N} \mathbf{1}^T R_t. \tag{6.5}
$$
Under the assumption that the returns in different time periods are
uncorrelated, $\bar{R}_{\cdot t}$, t = 1, 2, . . . , T, are uncorrelated; also, under the assumption that
$R_t$ has covariance matrix Σ, $\operatorname{Var}(\bar{R}_{\cdot t})$ does not depend on t. It follows
that
$$
\operatorname{Var}(\hat{\mu}) = \frac{1}{T} \operatorname{Var}(\bar{R}_{\cdot t}).
$$


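Using the expression (6.5),
$$
\operatorname{Var}(\bar{R}_{\cdot t}) = \frac{1}{N^2} \mathbf{1}^T \Sigma \mathbf{1}
$$
where Σ is the covariance matrix of the vector of asset returns. It follows that
$$
\operatorname{Var}(\hat{\mu}) = \frac{1}{N^2} \frac{1}{T} \mathbf{1}^T \Sigma \mathbf{1}. \tag{6.6}
$$
Note that $\mathbf{1}^T \Sigma \mathbf{1}$ is simply the sum of all the elements of Σ.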
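
The following sketch evaluates (6.6) for a small hypothetical covariance matrix, once with the quadratic form and once with sum, and then compares the resulting standard deviation with those of the individual sample means. The matrix Sigma and the values of N and T_obs are made up for illustration.

```r
N <- 3                                     # number of assets (illustrative)
T_obs <- 60                                # number of time periods (illustrative)

# Hypothetical covariance matrix; the entries are chosen only for illustration
Sigma <- matrix(c(0.0040, 0.0010, 0.0005,
                  0.0010, 0.0030, 0.0008,
                  0.0005, 0.0008, 0.0020), nrow = N, byrow = TRUE)

ones <- rep(1, N)
var_quad <- as.numeric(t(ones) %*% Sigma %*% ones) / (N^2 * T_obs)  # equation (6.6)
var_sum <- sum(Sigma) / (N^2 * T_obs)      # same value, using the sum of all elements
c(var_quad, var_sum)

# Ratio of the standard deviation of the pooled estimate to that of each sample mean
sqrt(var_sum) / sqrt(diag(Sigma) / T_obs)
```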
Example 6.10 Consider the stocks of eight large companies, as described
in Example 6.6. The sample mean excess returns of the individual stocks are
stored in the variable Rbar:
> Rbar
AAPL BAX KO CVS XOM IBM JNJ DIS
0.02540 0.00740 0.00978 0.02119 0.00825 0.00598 0.01153 0.02075
and the overall sample mean excess return is
> mean(Rbar)
[1] 0.0138
This value provides an estimate of the asset return means under the
assumption that all the return means are equal.
Using (6.6), the standard deviation of μ̂ may be estimated by
> (sum(Sighat)/(60*8*8))^.5
[1] 0.00439
Note that sum applied to a matrix returns the sum of all the elements in the
matrix.
This standard deviation may be compared to the standard errors of the
sample means of the individual asset returns.
> (diag(Sighat)/60)^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.00954 0.00717 0.00531 0.00746 0.00593 0.00591 0.00499 0.00747
Hence, the standard error of μ̂ is roughly 50%–90% as large as the standard
error of R̄j .

Of course, the drawback of the estimator $\hat{\mu}$ is that we do not believe that
$\mu_1 = \mu_2 = \cdots = \mu_N$ and, if this assumption is not true, then $\hat{\mu}$ is a biased
estimator of $\mu_j$. Specifically,
$$
E(\hat{\mu}) = \frac{1}{N} \sum_{i=1}^{N} \mu_i
$$


so that the bias of $\hat{\mu}$ as an estimator of $\mu_j$ is
$$
E(\hat{\mu}) - \mu_j = \frac{1}{N} \sum_{i=1}^{N} \mu_i - \mu_j.
$$
If $\mu_1, \mu_2, \ldots, \mu_N$ are nearly equal, then in general this bias will be small;
however, it is possible that, for some j, $\hat{\mu}$ will have a large bias.
These results are true for model-based estimators more generally. We
expect the variance of a model-based estimator to be smaller than the vari-
ance of a simple empirical estimator, in some cases, much smaller. However,
if the model assumptions are not valid, the model-based estimators tend to
be biased; in some cases, the bias is small but, in others, it may be large.
In the previous example, as well as in (nearly) all examples of this type, we
know that the assumption μ1 = μ2 = · · · = μN is not literally true. However,
we might think that it is approximately true, in the sense that μ1 , μ2 , . . . , μN
are “close to” one another. Therefore, we might consider estimating μ by
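combining the vector of asset sample mean returns $\bar{R}$ and the vector
$\hat{\boldsymbol{\mu}} = (\hat{\mu}, \hat{\mu}, \ldots, \hat{\mu})^T = \hat{\mu}\mathbf{1}$ using a weighted average of the form
$$
\psi \hat{\boldsymbol{\mu}} + (1 - \psi) \bar{R}
$$
for some weight ψ.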

To choose the weight ψ, we consider the distance between R̄ and μ̂, relative
to the sampling variability in R̄; if R̄ and μ̂ are “close,” then the assumption
that all μj are equal appears to be reasonable for these data and we give more
weight to μ̂.
The squared Euclidean distance between $\bar{R}$ and $\hat{\boldsymbol{\mu}}$ is given by
$$
\sum_{j=1}^{N} (\bar{R}_j - \hat{\mu})^2;
$$
and, hence, a standardized measure of this distance is given by
$$
\tau^2 = \frac{1}{N} \sum_{j=1}^{N} (\bar{R}_j - \hat{\mu})^2.
$$
If the $\mu_j$ are approximately equal, then we expect $\tau^2$ to be relatively small;
on the other hand, if there is a great deal of variability in the $\mu_j$, then we
expect $\tau^2$ to be relatively large.
To choose the value of ψ to use in the estimates of the $\mu_j$, we compare
$\tau^2$ to a measure of the sampling variability in a sample mean return. The
estimated variance of the sampling distribution of $\bar{R}_j$ is given by $S_j^2/T$, where


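$S_j^2$ is the sample return variance for asset j. Hence, the average estimated
variance of a sample mean return is given by $\bar{S}^2/T$ where
$$
\bar{S}^2 = \frac{1}{N} \sum_{j=1}^{N} S_j^2.
$$
Thus, we take the weight ψ to be
$$
\psi = \frac{\bar{S}^2/T}{\bar{S}^2/T + \tau^2} = \frac{1}{1 + T\tau^2/\bar{S}^2}
$$
and estimate $\mu_j$ by
$$
\psi \hat{\mu} + (1 - \psi) \bar{R}_j;
$$
alternatively, we can think of estimating $\boldsymbol{\mu}$ by
$$
\hat{\boldsymbol{\mu}}_S = \psi \begin{pmatrix} \hat{\mu} \\ \hat{\mu} \\ \vdots \\ \hat{\mu} \end{pmatrix} + (1 - \psi) \bar{R}
= \psi \hat{\boldsymbol{\mu}} + (1 - \psi) \bar{R}. \tag{6.7}
$$
Hence, if the variation between $\bar{R}_1, \bar{R}_2, \ldots, \bar{R}_N$, as measured by $\tau^2$, is small
relative to $\bar{S}^2/T$, we give more weight to $\hat{\boldsymbol{\mu}}$; on the other hand, if $\tau^2$ is large
relative to $\bar{S}^2/T$, we give more weight to $\bar{R}$.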
An estimator of the form (6.7) is known as a shrinkage estimator because
it takes the individual asset sample means R̄1 , R̄2 , . . . , R̄N and “shrinks” them
toward the overall sample mean return μ̂. The hope is that the higher bias
from including μ̂ in the estimator is more than oﬀset by the lower variance.
A related beneﬁt is that shrinkage estimates tend to be more stable, in the
sense that if one of the R̄j happens to be abnormally large or small, the
shrinkage estimate is often more reasonable.
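
The steps leading to (6.7) can be collected in a small function. The sketch below is one possible way to organize the calculation; the name shrink_means and the simulated data are assumptions made for illustration. The function takes a T × N data matrix and returns the weight ψ together with the shrinkage estimates of the means.

```r
# Shrinkage estimate of the mean vector, following the steps leading to (6.7)
shrink_means <- function(X) {
  T_obs <- nrow(X)
  Rbar <- colMeans(X)                      # individual sample means
  mu <- mean(Rbar)                         # overall mean (common-mean estimate)
  tau2 <- mean((Rbar - mu)^2)              # variability among the sample means
  S2bar <- mean(apply(X, 2, var))          # average sample variance
  psi <- (S2bar / T_obs) / (S2bar / T_obs + tau2)
  list(psi = psi, estimate = psi * mu + (1 - psi) * Rbar)
}

set.seed(7)
# Simulated returns for eight hypothetical assets; illustrative only
X <- matrix(rnorm(60 * 8, mean = 0.01, sd = 0.05), nrow = 60)
fit <- shrink_means(X)

fit$psi                                    # weight given to the overall mean
range(colMeans(X))                         # spread of the sample means
range(fit$estimate)                        # the shrunk estimates are less spread out
```

Because the estimates are weighted averages of each sample mean and the overall mean, their average equals the overall mean and their range is never wider than that of the sample means.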
Example 6.11 Consider the returns on eight large companies as described
in Example 6.6 and as analyzed in Example 6.10. Recall that the variable
Rbar contains the sample means of the asset excess returns; the sample
variances of the asset returns, as well as the value of $\tau^2$, may be calculated
as follows.
> Rbar
AAPL BAX KO CVS XOM IBM JNJ DIS
0.02540 0.00740 0.00978 0.02119 0.00825 0.00598 0.01153 0.02075
> S2<-apply(big8, 2, var)
> S2
AAPL BAX KO CVS XOM IBM JNJ DIS
0.00546 0.00309 0.00169 0.00334 0.00211 0.00210 0.00149 0.00335
> tau2<-mean((Rbar-mean(Rbar))^2)
> tau2
[1] 4.89e-05


These results may then be used to calculate the weight ψ and the vector
of estimated asset mean returns.
> psi<-(mean(S2)/60)/(tau2 + (mean(S2)/60))
> psi
[1] 0.491
> muhat<-psi*mean(Rbar) + (1-psi)*Rbar
> muhat
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0197 0.0105 0.0117 0.0176 0.0110 0.0098 0.0126 0.0173
Thus, the estimated mean returns for the eight stocks are weighted averages
of the individual sample mean returns and the average mean return for all
stocks.
Of course, we do not know which estimator, $\bar{R}$ or the shrinkage estimator $\hat{\boldsymbol{\mu}}_S$,
is the more accurate estimator; this issue will be considered in Section 6.7.

Shrinkage Estimation of a Covariance Matrix

The same basic approach used to construct a shrinkage estimate of the mean
vector μ may be used to estimate the covariance matrix Σ.
First consider an estimator based on assumptions regarding the form of Σ.
The simplest such assumptions are that the returns all have the same variance,
$\operatorname{Var}(R_{j,t}) = \sigma^2$ for all j = 1, 2, . . . , N and t = 1, 2, . . . , T, and that the returns
on different assets are uncorrelated, $\operatorname{Cov}(R_{j,s}, R_{k,t}) = 0$ for all $j \neq k$ and all
t, s = 1, 2, . . . , T.
Under these assumptions, Σ is of the form
$$
\Sigma = \sigma^2 I_N \tag{6.8}
$$
where $I_N$ is the N × N identity matrix and $\sigma^2$ is an unknown parameter. This
matrix is often referred to as the target matrix for the shrinkage estimation.
The parameter $\sigma^2$ may be estimated by the average sample variance,
$$
\hat{\sigma}^2 = \frac{1}{N} \sum_{j=1}^{N} S_j^2.
$$

Under the assumption (6.8), this is an unbiased estimator of $\sigma^2$ and, hence,
$\hat{\Sigma} = \hat{\sigma}^2 I_N$ is an unbiased estimator of Σ. Like the estimator of the common
mean μ, $\hat{\sigma}^2$ can be expected to have a relatively small variance compared to
those of the individual sample variances $S_1^2, S_2^2, \ldots, S_N^2$. Furthermore, since
the covariances and correlations are assumed to be zero, there is no estimation
error incurred for these.
Of course, the assumptions on which Σ̂ is based are very strong, and
obviously, they do not hold in practice. Hence, following the method used
when estimating the mean return of an asset, we use a shrinkage estimator of
the form
ψΣ̂ + (1 − ψ)S


where S is the sample covariance matrix. The same basic reasoning used in
estimating a mean return applies here as well: By taking a weighted average
of $\hat{\Sigma}$ and S, we hope to form an estimator with the best properties of
each.
The remaining issue is selection of ψ. A value of ψ close to one leads to
an estimator that is close to the model-based estimator $\hat{\Sigma}$, while for a value
close to zero the resulting estimator is close to the sample covariance matrix.
Although the details are fairly complicated, the basic idea in choosing the
value of ψ used in the shrinkage estimate of the covariance matrix is the same
as that used in estimating the asset return means. Let $a^2$ denote an estimate
of the squared distance between Σ and $\sigma^2 I$ and let $b^2$ denote a measure of
the variability in S as an estimator of Σ. Then the optimal value of ψ is of
the form $b^2/(a^2 + b^2)$.
To calculate a shrinkage estimate of a covariance matrix in R, using
this model, we may use the function shrinkcovmat.equal, available in the
package ShrinkCovMat (Touloumis 2015).
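To make the form of the estimator concrete, the following is a minimal sketch of the weighted average ψΣ̂ + (1 − ψ)S for a value of ψ supplied by the user. The function name shrink_to_identity and the simulated data matrix X are assumptions made for illustration; the sketch does not estimate ψ and is not the implementation used by shrinkcovmat.equal. Here the rows of X are time periods and the columns are assets.

```r
# Sketch: shrink a sample covariance matrix toward the target sigma^2 * I.
# `shrink_to_identity` and `psi` are illustrative names; psi is supplied by
# the user here, not estimated from the data.
shrink_to_identity <- function(X, psi) {
  stopifnot(psi >= 0, psi <= 1)
  S <- cov(X)                           # sample covariance matrix (columns = assets)
  sigma2_hat <- mean(diag(S))           # average sample variance
  target <- sigma2_hat * diag(ncol(X))  # target matrix sigma^2 * I_N
  psi * target + (1 - psi) * S          # weighted average of target and S
}

set.seed(1)                                   # arbitrary seed
X <- matrix(rnorm(60 * 3, sd = 0.05), 60, 3)  # simulated returns on 3 assets
Sig_sh <- shrink_to_identity(X, psi = 0.2)
```

With psi = 0 the function returns the sample covariance matrix and with psi = 1 it returns the target matrix; the average of the diagonal elements is the same for every value of psi, while the off-diagonal elements are multiplied by 1 − psi.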

Example 6.12 Consider the returns on eight large companies as described in
Example 6.6 and as analyzed in the previous examples of this section. Let Σ
denote the covariance matrix of the excess returns for the stocks of the eight
firms. The argument of shrinkcovmat.equal is a data matrix in which the
rows represent the different stocks and the columns represent different time
periods (i.e., the form of the data matrix is the transpose of the way we have
defined the data matrix previously).

> library(ShrinkCovMat)
> cov.shrnk<-shrinkcovmat.equal(t(big8))

The value of the shrinkcovmat.equal function contains a number of components.
The one of most interest is the estimated covariance matrix, given
by $Sigmahat; $Target gives the estimated “target” matrix, in this case, the
diagonal matrix with diagonal element equal to the average of the return vari-
ances, and $lambdahat contains the estimated shrinkage parameter, which we
have denoted by ψ.

> cov.shrnk$Sigmahat
        AAPL     BAX      KO     CVS     XOM     IBM     JNJ     DIS
AAPL 0.00503 0.00067 0.00066 0.00118 0.00086 0.00091 0.00035 0.00124
BAX  0.00067 0.00305 0.00059 0.00089 0.00070 0.00081 0.00085 0.00053
KO   0.00066 0.00059 0.00188 0.00062 0.00053 0.00031 0.00066 0.00070
CVS  0.00118 0.00089 0.00062 0.00326 0.00098 0.00054 0.00079 0.00151
XOM  0.00086 0.00070 0.00053 0.00098 0.00223 0.00092 0.00061 0.00145
IBM  0.00091 0.00081 0.00031 0.00054 0.00092 0.00222 0.00031 0.00077
JNJ  0.00035 0.00085 0.00066 0.00079 0.00061 0.00031 0.00171 0.00061
DIS  0.00124 0.00053 0.00070 0.00151 0.00145 0.00077 0.00061 0.00327

> cov.shrnk$Target
[,1] [,2] [,3] [,4] [,5]
[1,] 0.00283 0.00000 0.00000 0.00000 0.00000
[2,] 0.00000 0.00283 0.00000 0.00000 0.00000
[3,] 0.00000 0.00000 0.00283 0.00000 0.00000
[4,] 0.00000 0.00000 0.00000 0.00283 0.00000
[5,] 0.00000 0.00000 0.00000 0.00000 0.00283
[6,] 0.00000 0.00000 0.00000 0.00000 0.00000
[7,] 0.00000 0.00000 0.00000 0.00000 0.00000
[8,] 0.00000 0.00000 0.00000 0.00000 0.00000
[,6] [,7] [,8]
[1,] 0.00000 0.00000 0.00000
[2,] 0.00000 0.00000 0.00000
[3,] 0.00000 0.00000 0.00000
[4,] 0.00000 0.00000 0.00000
[5,] 0.00000 0.00000 0.00000
[6,] 0.00283 0.00000 0.00000
[7,] 0.00000 0.00283 0.00000
[8,] 0.00000 0.00000 0.00283
> cov.shrnk$lambdahat
[1] 0.162

For interpreting the results, it is often convenient to consider the vector
of estimated standard deviations and the estimated correlation matrix. The
estimated variances may be obtained by using the diag command, which
returns the diagonal of a matrix. The correlation matrix may be obtained by
the command cov2cor, which takes a covariance matrix as its argument and
returns the corresponding correlation matrix.
> diag(cov.shrnk$Sigmahat)^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0710 0.0552 0.0433 0.0571 0.0472 0.0471 0.0413 0.0572
> cov2cor(cov.shrnk$Sigmahat)
      AAPL   BAX    KO   CVS   XOM   IBM   JNJ   DIS
AAPL 1.000 0.170 0.215 0.291 0.257 0.271 0.118 0.306
BAX  0.170 1.000 0.248 0.282 0.269 0.313 0.373 0.168
KO   0.215 0.248 1.000 0.250 0.262 0.152 0.367 0.281
CVS  0.291 0.282 0.250 1.000 0.366 0.202 0.334 0.462
XOM  0.257 0.269 0.262 0.366 1.000 0.413 0.311 0.537
IBM  0.271 0.313 0.152 0.202 0.413 1.000 0.157 0.287
JNJ  0.118 0.373 0.367 0.334 0.311 0.157 1.000 0.256
DIS  0.306 0.168 0.281 0.462 0.537 0.287 0.256 1.000

These results may be compared to the sample variances and the sample
correlation matrix.

> apply(big8, 2, var)^.5

AAPL BAX KO CVS XOM IBM JNJ DIS
0.0739 0.0556 0.0412 0.0578 0.0459 0.0458 0.0386 0.0579
> cor(big8)
AAPL BAX KO CVS XOM IBM JNJ DIS
AAPL 1.000 0.193 0.260 0.329 0.303 0.319 0.145 0.346
BAX 0.193 1.000 0.310 0.330 0.327 0.381 0.473 0.196
KO 0.260 0.310 1.000 0.310 0.338 0.197 0.493 0.348
CVS 0.329 0.330 0.310 1.000 0.442 0.244 0.421 0.537
XOM 0.303 0.327 0.338 0.442 1.000 0.520 0.408 0.650
IBM 0.319 0.381 0.197 0.244 0.520 1.000 0.206 0.348
JNJ 0.145 0.473 0.493 0.421 0.408 0.206 1.000 0.323
DIS 0.346 0.196 0.348 0.537 0.650 0.348 0.323 1.000

Note that the shrinkage estimates of standard deviation are all closer to
0.0532, the square root of the average sample variance 0.00283, and the
shrinkage correlation estimates are all closer to zero than are the estimates
based on the sample covariance matrix.
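As a rough check on how these numbers arise, the shrinkage estimate can be reproduced by hand from the rounded values printed in this example, so the agreement is only approximate. With estimated shrinkage parameter 0.162 and target diagonal element 0.00283, each diagonal element of the shrinkage estimate is a weighted average of 0.00283 and the sample variance, and each off-diagonal element is the sample covariance multiplied by 1 − 0.162 = 0.838. For AAPL, whose sample standard deviation is 0.0739,

$$0.162\,(0.00283) + 0.838\,(0.0739)^2 \approx 0.000458 + 0.004577 = 0.005035, \qquad \sqrt{0.005035} \approx 0.0710.$$

For the pair AAPL and BAX, the sample covariance is approximately $0.193 \times 0.0739 \times 0.0556 \approx 0.000793$, so that

$$0.838\,(0.000793) \approx 0.00066, \qquad \frac{0.00066}{(0.0710)(0.0552)} \approx 0.17,$$

in line with the printed shrinkage covariance 0.00067 and shrinkage correlation 0.170.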

6.6 Estimation of Portfolio Weights

In Chapters 4 and 5, expressions for the weight vectors of a number of dif-
ferent portfolios were presented; these expressions are generally functions of
the means, standard deviations, and correlations of the returns on the various
assets under consideration. For instance, the weight vector for the tangency
portfolio is given by
$$w_T = \frac{\Sigma^{-1}(\mu - \mu_f \mathbf{1})}{\mathbf{1}^T \Sigma^{-1}(\mu - \mu_f \mathbf{1})}$$
where μ − μf 1 is the vector of mean excess returns for the assets and Σ is
the covariance matrix of the return vector of the assets.
Of course, in practice, parameters such as μ and Σ are unknown; and,
hence, weight vectors such as wT are unknown and must be estimated. The
simplest approach is to use a plug-in estimator in which a weight vector is
estimated by simply “plugging in” estimators for any unknown parameters;
that is, functions of unknown parameters are estimated by replacing any
parameter by an appropriate estimator.
For instance, the estimator of wT using the sample mean of the excess
returns and the sample covariance matrix is given by

$$\frac{S^{-1}(\bar{R} - \bar{R}_f \mathbf{1})}{\mathbf{1}^T S^{-1}(\bar{R} - \bar{R}_f \mathbf{1})}.$$
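The plug-in idea can be sketched as two small functions that take whatever estimates are available and return normalized weights. The function names and the toy inputs below are assumptions made for illustration; any estimate of the covariance matrix (the sample covariance matrix or a shrinkage estimate) and any estimate of the mean excess returns could be passed in.

```r
# Sketch of plug-in weight estimators; function names are illustrative.
# Sigma_hat: estimated covariance matrix
# mu_ex_hat: estimated vector of mean excess returns
min_var_weights <- function(Sigma_hat) {
  a <- solve(Sigma_hat, rep(1, nrow(Sigma_hat)))  # Sigma^{-1} 1
  a / sum(a)                                      # normalize to sum to 1
}
tangency_weights <- function(Sigma_hat, mu_ex_hat) {
  a <- solve(Sigma_hat, mu_ex_hat)                # Sigma^{-1} (mu - mu_f 1)
  a / sum(a)                                      # normalize to sum to 1
}

# Toy inputs (made up for illustration): three uncorrelated assets
Sigma_toy <- diag(c(1, 2, 4))
mu_toy <- c(0.02, 0.02, 0.02)
w_mv_toy <- min_var_weights(Sigma_toy)             # proportional to 1/variance
w_tan_toy <- tangency_weights(Sigma_toy, mu_toy)
```

In this toy case the assets are uncorrelated with equal mean excess returns, so both portfolios weight each asset in proportion to the reciprocal of its variance. Using solve(Sigma_hat, b) rather than forming the inverse matrix explicitly follows the commands used in the examples of this section.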

Example 6.13 Consider the stocks of eight large companies as discussed
in Example 6.6. Suppose we would like to estimate the weight vector of
the minimum-variance portfolio. Recall that the sample covariance matrix
of the returns in the data matrix big8 is stored in the matrix Smat. Hence,
the estimated weight vector of the minimum-variance portfolio is given by

> w_mv<-solve(Smat, rep(1, 8))/sum(solve(Smat, rep(1, 8)))

> w_mv
AAPL BAX KO CVS XOM IBM JNJ DIS
0.030 -0.001 0.268 0.032 0.065 0.269 0.345 -0.008

Thus, w_mv may be viewed as an estimate of the weight vector of the minimum-
variance portfolio constructed from the stocks represented in the data matrix
big8.
Alternatively, we could use a shrinkage estimator of the covariance matrix
to estimate the weights of the minimum-variance portfolio. Recall that the
shrinkage estimate based on a target matrix of the form $\sigma^2 I$ is stored
in the matrix cov.shrnk$Sigmahat; see Example 6.12. The corresponding
estimate of the minimum-variance portfolio weight vector is given by

> w_mv.sh<-solve(cov.shrnk$Sigmahat, rep(1, 8))/
+ sum(solve(cov.shrnk$Sigmahat, rep(1, 8)))
> w_mv.sh
AAPL BAX KO CVS XOM IBM JNJ DIS
0.041 0.051 0.246 0.048 0.096 0.224 0.274 0.020

The two estimates are generally similar but there are some differences; for
instance, there are no negative weights in the shrinkage estimate.
For the weights of the tangency portfolio, the estimates based on the
sample mean returns and the sample covariance matrix are given by

> w_tan<-solve(Smat, Rbar)/sum(solve(Smat, Rbar))

> w_tan
AAPL BAX KO CVS XOM IBM JNJ DIS
0.290 -0.097 0.037 0.258 -0.420 0.020 0.506 0.405


and the corresponding estimates based on the shrinkage estimates are given by

> w_tan.sh<-solve(cov.shrnk$Sigmahat, muhat)/
+ sum(solve(cov.shrnk$Sigmahat, muhat))
> w_tan.sh
AAPL BAX KO CVS XOM IBM JNJ DIS
0.149 0.006 0.175 0.146 -0.044 0.115 0.300 0.152

Here the differences in the weights are greater than those seen for the
weights of the minimum-variance portfolio. In general, the weights based on
the shrinkage estimates are less extreme, with the largest differences
occurring for the largest positive and largest negative weights. This is not
surprising given the nature of shrinkage estimates.

An alternative implementation of the plug-in approach is to replace any
unknown parameters in the objective function used to define the weight vector
by appropriate estimators. Then optimizing such an estimated objective
function yields an estimate of the corresponding portfolio weight vector.
For instance, to estimate the weight vector of the risk-averse portfolio
based on the risk-aversion parameter λ, which maximizes

$$w^T \mu - \frac{\lambda}{2} w^T \Sigma w,$$
we can maximize the estimator of the criterion function given by

$$w^T \bar{R} - \frac{\lambda}{2} w^T S w.$$
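When the only restriction is that the weights sum to one, the maximizer of a criterion of this form can be written in closed form using a Lagrange multiplier, which gives a simple check on a numerical optimizer. The sketch below is an illustration under that assumption (no sign constraints on the weights); the function name risk_averse_weights and the toy inputs are made up for the example.

```r
# Sketch: maximize w'mu - (lambda/2) w'Sigma w subject to sum(w) = 1,
# with no sign constraints. `risk_averse_weights` is an illustrative name.
risk_averse_weights <- function(mu_hat, Sigma_hat, lambda) {
  ones <- rep(1, length(mu_hat))
  a <- solve(Sigma_hat, mu_hat)        # Sigma^{-1} mu
  b <- solve(Sigma_hat, ones)          # Sigma^{-1} 1
  gamma <- (sum(a) - lambda) / sum(b)  # Lagrange multiplier for sum(w) = 1
  (a - gamma * b) / lambda             # solves mu - lambda*Sigma*w = gamma*1
}

# Toy inputs, made up for illustration
mu_toy <- c(0.010, 0.015, 0.020)
Sigma_toy <- matrix(c(0.0040, 0.0010, 0.0005,
                      0.0010, 0.0030, 0.0008,
                      0.0005, 0.0008, 0.0050), 3, 3)
w_ra <- risk_averse_weights(mu_toy, Sigma_toy, lambda = 5)
```

The resulting weights sum to one, and the gradient of the criterion at w_ra has equal components, which is the first-order condition for the constrained problem. Once nonnegativity constraints are imposed, no such closed form is available in general and a quadratic programming routine is needed.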
Example 6.14 Consider estimating the weight vector of a risk-averse port-
folio of the stocks represented in the data matrix big8. Recall that, given
the mean vector and covariance matrix of the returns on the assets under
consideration, the weight vector of the risk-averse portfolio may be obtained
as the solution to a quadratic programming problem. In R, we can calculate
such a solution using the function solve.QP in the package quadprog; see
Example 5.9.
The following commands can be used to estimate the portfolio weights of
the risk-averse portfolio with parameter λ = 5 for the stocks represented in
the data matrix big8.

> library(quadprog)
> mu8<-apply(big8, 2, mean) + mean(rfree)
> A1<-cbind(rep(1, 8))
> ra8.5<-solve.QP(Dmat=5*Smat, dvec=mu8, Amat=A1, bvec=1,
+ meq=1)$solution
> ra8.5
[1] 0.606 -0.213 -0.243 0.532 -1.007 -0.281 0.702 0.905


Here mu8 is the vector of sample mean returns; the matrix big8 contains
excess returns so that the sample mean of the risk-free rate must be added
back in.
Note that the estimated weight vector contains large short positions on
four stocks. Hence, we might consider enforcing the requirement that all asset
weights be nonnegative. The function solve.QP can also be used to calculate
the weight vector that maximizes the risk-aversion criterion subject to such a
restriction; see Example 5.12.
> A2<-cbind(rep(1, 8), diag(8))
> b2<-c(1, rep(0, 8))
> ra8.5.nn<-solve.QP(Dmat=5*Smat, dvec=mu8, Amat=A2, bvec=b2,
+ meq=1)$solution
> round(ra8.5.nn, 5)
[1] 0.392 0.000 0.000 0.343 0.000 0.000 0.000 0.265
The function round is used here so that values very close to 0 are written as 0.
Thus, with the nonnegativity constraint, only three stocks are represented
in the risk-averse portfolio with λ = 5: AAPL, CVS, and DIS. It is inter-
esting to note that the weight of JNJ, which is 0.702 in the unconstrained
risk-averse portfolio, is zero in the constrained portfolio.

It is important to keep in mind that the weight vectors obtained using
these procedures are estimates of the underlying “true” weight vector. Hence,
it is of interest to determine some measure of the sampling variability of the
estimates; methods for studying the sampling distribution of an estimator are
discussed in the following section.

6.7 Using Monte Carlo Simulation to Study the Properties of Estimators
In Section 6.2, the statistical properties of some simple estimators were con-
sidered, such as the mean and standard deviation of the sampling distribution
of the sample mean return on an asset. However, such properties are difficult
to derive for more complicated estimators, such as shrinkage estimators or
estimators of portfolio weights.
In such cases, an alternative approach to studying the behavior of esti-
mators is to use Monte Carlo simulation, in which observations are simulated
from a known distribution, estimates of interest are calculated, and those sim-
ulated values of the estimates are used to assess the sampling distributions of
the estimators. In this section, the Monte Carlo method is used to estimate
properties of the sampling distributions of a number of estimators.
The basic approach is illustrated in the following example.
Example 6.15 In Example 6.3, we considered the properties of the sample
mean return based on the observation of 60 returns, independently distributed


according to a distribution with mean 0.01 and standard deviation 0.05. In this
example, we perform a similar analysis, using Monte Carlo simulation in place
of the theoretical properties discussed in Section 6.2.
To simulate a sequence of 60 such returns, we may use the command rnorm.
Specifically,
> ret<-rnorm(60, mean=0.01, sd=0.05)
draws 60 random variables independently from a normal distribution with
mean 0.01 and standard deviation 0.05 and places the result in the variable
ret, a vector of length 60. The sample mean of this vector,
> mean(ret)
[1] 0.0178
represents a sample of size one from the distribution of the random variable
$\bar{R}_j = \sum_{t=1}^{T} R_{j,t}/60$, with $T = 60$, where $R_{j,1}, R_{j,2}, \ldots, R_{j,60}$ have the
distribution described previously.
Of course, a sample of size one does not provide much information. Hence,
we are generally interested in a large sample of such simulated sample means.
To simulate many sample means at one time, we may use the following proce-
dure. We begin by simulating a matrix of returns, where each row corresponds
to a vector of 60 returns.
> ret_mat<-matrix(rnorm(60*10000, mean=0.01, sd=0.05), 10000, 60)
Here the function rnorm simulates 600,000 independent returns, each normally
distributed with mean 0.01 and standard deviation 0.05, and the function
matrix arranges these in a 10,000 × 60 matrix.
Using apply, we may now calculate the sample mean of each row:
> ret_mean<-apply(ret_mat, 1, mean)
The variable ret_mean now contains a sample of size 10,000 from the distribution
of R̄j. We may now analyze ret_mean as we would any sample of
observations. For instance,
> mean(ret_mean)
[1] 0.01002
> sd(ret_mean)
[1] 0.00642
are estimates of the mean and standard deviation, respectively, of the sampling
distribution of R̄j .
Recall that, in this scenario, the mean and standard deviation of this sam-
pling distribution may be calculated exactly and, in Example 6.3, they were
shown to be 0.01 and 0.00645, respectively. Thus, the Monte Carlo estimates
closely match the true values; generally speaking, even closer agreement could
be obtained by increasing the number of Monte Carlo replications, that is, by
increasing the number of rows in the matrix ret_mat.
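The effect of the number of replications can be quantified by a Monte Carlo standard error: because mean(ret_mean) is itself an average of B simulated values, its simulation error is roughly sd(ret_mean) divided by the square root of B. The sketch below illustrates this calculation; the seed and the variable names B, mc_est, and mc_se are assumptions made for illustration, and rowMeans is used as an equivalent, faster alternative to apply(ret_mat, 1, mean).

```r
set.seed(123)                      # arbitrary seed, used only for this sketch
B <- 10000                         # number of Monte Carlo replications
ret_mat <- matrix(rnorm(60 * B, mean = 0.01, sd = 0.05), B, 60)
ret_mean <- rowMeans(ret_mat)      # same values as apply(ret_mat, 1, mean)

mc_est <- mean(ret_mean)           # Monte Carlo estimate of the mean of R-bar
mc_se <- sd(ret_mean) / sqrt(B)    # Monte Carlo standard error of mc_est
```

With B = 10,000 and a standard deviation of about 0.0064, the Monte Carlo standard error is about 0.000064; since it decreases like one over the square root of B, quadrupling the number of replications roughly halves it.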
Other properties of the sampling distribution of R̄j may be found in the
same way. For instance,
> summary(ret_mean)
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.01260 0.00571 0.01003 0.01002 0.01440 0.03110
Thus, estimates of the lower and upper quartiles of the sampling distribution of
R̄j are given by 0.00571 and 0.0144, respectively. Recall that, in Example 6.3,
we approximated these by 0.00570 and 0.0143, respectively. The advantage of
the Monte Carlo method is that these values were obtained using only numerical
methods, without using any analytical approximations. One drawback of
the Monte Carlo method is that, if the analysis is repeated, different results
will be obtained, although if a large Monte Carlo sample size is used, such
as the 10,000 used here, the differences are generally slight. For instance, if
the calculations described previously are repeated, the sample mean and standard
deviation of the simulated return sample means are 0.01009 and 0.00643,
respectively.
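If exactly reproducible results are wanted, one common approach is to fix the state of the random number generator with set.seed before simulating. The following sketch illustrates this; the seed value and the variable names run1 and run2 are arbitrary choices made for illustration, and a smaller number of replications is used to keep the example short.

```r
set.seed(2017)                                   # arbitrary seed value
run1 <- rowMeans(matrix(rnorm(60 * 1000, mean = 0.01, sd = 0.05), 1000, 60))
set.seed(2017)                                   # same seed, same random draws
run2 <- rowMeans(matrix(rnorm(60 * 1000, mean = 0.01, sd = 0.05), 1000, 60))
identical(run1, run2)                            # TRUE
```

Fixing the seed makes a given analysis repeatable, but it does not remove the simulation error; results obtained with a different seed will still differ slightly.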
A histogram of the simulated mean returns gives some information about
the shape of the sampling distribution; such a plot can be produced by the
command hist(ret_mean). The result is given in Figure 6.2, and it supports
the conclusion of the CLT that the sampling distribution of R̄j is approxi-
mately normal. A normal probability plot, as discussed in Section 2.5, could
also be considered.

FIGURE 6.2
Histogram of simulated sample mean returns. (Plot title: Histogram of ret_mean; horizontal axis: ret_mean, from −0.01 to 0.03; vertical axis: Frequency, from 0 to 2500.)


Given the form of the estimator—a sample mean—along with the distri-
bution of the returns, which are assumed to be independent and normally
distributed, it is not surprising that the distribution of R̄j is approximately
normal; of course, in this case, it is well-known that the distribution of R̄j
is exactly normal. A more extensive study of this type would likely change
the assumed distribution of the returns in order to study the effect of distributional
assumptions on the properties of the estimator. With Monte Carlo
simulation, such changes are generally easy to implement.
Example 6.16 Consider the Monte Carlo simulation described in Example
6.15, which analyzed the properties of the sample mean return based on the
observation of 60 returns, independently distributed according to a distri-
bution with mean 0.01 and standard deviation 0.05, but now suppose that
the observations are not normally distributed. Empirical studies have shown
that the t-distribution is often useful for modeling return data; thus, here we
assume that the returns follow a t-distribution with six degrees of freedom,
with location and scale parameters chosen to achieve the desired mean and
standard deviation.
Changing the distribution used in the Monte Carlo simulation from the
normal distribution to the t-distribution amounts to replacing the function
rnorm used in Example 6.15 with the function rt, which simulates observations
from a t-distribution. However, the properties of the t-distribution are such
that some care is needed in order to achieve the required results.
The degrees of freedom in rt are specified by the argument df; for example,
rt(1, df=6) returns one observation from a t-distribution with six degrees
of freedom. The complications arise when specifying the mean and standard
deviation of the distribution. A random variable with a t-distribution with ν
degrees of freedom has mean 0 and variance ν/(ν − 2), provided that ν > 2;
thus, a random variable with a t-distribution with six degrees of freedom has
mean 0 and standard deviation √1.5. It follows that to draw a random sample
of size n from a t-distribution with mean 0.01, standard deviation 0.05, and
six degrees of freedom, we use the command
rt(n, df=6)*(0.05/(1.5^.5))+0.01
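The same rescaling works for any degrees of freedom greater than two, so it can be wrapped in a small helper. The function rt_scaled below is a hypothetical convenience function written for illustration, not part of base R; it divides by the standard deviation of the standard t-distribution, multiplies by the desired standard deviation, and then shifts by the desired mean.

```r
# Illustrative helper (not part of base R): draws from a t-distribution
# rescaled to have the requested mean and standard deviation.
rt_scaled <- function(n, df, mean = 0, sd = 1) {
  stopifnot(df > 2)                    # the variance is finite only for df > 2
  scale <- sd / sqrt(df / (df - 2))    # undo the sd of a standard t variate
  rt(n, df = df) * scale + mean
}

set.seed(42)                           # arbitrary seed
x <- rt_scaled(200000, df = 6, mean = 0.01, sd = 0.05)
c(mean(x), sd(x))                      # should be near 0.01 and 0.05
```

For df = 6 the scale factor is 0.05/(1.5^.5), so rt_scaled(n, df=6, mean=0.01, sd=0.05) has the same distribution as the command given above.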
Therefore, to repeat the Monte Carlo study described in Example 6.15,
but with a t-distribution with six degrees of freedom replacing the
normal distribution, we use the commands
> ret_mat_t<-matrix(rt(60*10000, df=6)*(0.05/(1.5^.5))+0.01,
+ 10000, 60)
> ret_mean_t<-apply(ret_mat_t, 1, mean)
> mean(ret_mean_t)
[1] 0.00999
> sd(ret_mean_t)
[1] 0.00641
> summary(ret_mean_t)


Min. 1st Qu. Median Mean 3rd Qu. Max.

-0.01340 0.00566 0.01000 0.00999 0.01430 0.03450

Note that the results are generally similar to those obtained in Example
6.15, suggesting that the sampling distribution of the sample mean is close to
normal even if the returns follow a t-distribution. This is not surprising given
that each sample mean is based on 60 observations; the approximation given
by the CLT tends to be accurate for that sample size.

In the examples considered thus far in this section, Monte Carlo simulation
was applied to an estimator whose properties may be determined using ana-
lytic methods. However, the Monte Carlo method is most useful when such
analytic results are not readily available. Many such cases occur when the
estimator under consideration is a function of the returns on several assets,
so that the correlation structure of the returns plays a role in the sampling
distribution.

Simulating a Return Vector

To handle such cases using Monte Carlo simulation, we need a method of
simulating a random vector in which the component random variables are
correlated. For instance, let R1 , R2 , . . . , RT denote N -dimensional random
vectors, with Rt representing the vector of asset returns at time t. These
random vectors may be viewed as the multivariate analogues of the returns
Rj,1, Rj,2, . . . , Rj,60 analyzed in Example 6.15. Note that although it is often
reasonable to assume that Rt and Rs are uncorrelated or even independent,
for t ≠ s, the components of Rt, which represent the returns on different assets
at time t, cannot realistically be modeled as uncorrelated random variables.
Consider simulation of Rt under the assumption that this random vector
has mean vector μ and covariance matrix Σ for given values of μ and Σ. For
the distribution of Rt, we first use the multivariate normal distribution; we
then will consider a multivariate version of a t-distribution.
A random vector $Y = (Y_1, Y_2, \ldots, Y_N)^T$ has a multivariate normal distribution
if any linear function of $Y$, that is, any function of the form $\sum_{j=1}^{N} a_j Y_j$
for constants $a_1, a_2, \ldots, a_N$, has a univariate normal distribution. The parameters
of the multivariate normal distribution are its mean vector and its
covariance matrix.
To simulate random variates with a multivariate normal distribution
in R, we use the function mvrnorm, which is available in the package MASS
(Venables and Ripley 2002); the following example illustrates how mvrnorm
may be used to conduct a simulation study similar to the one conducted in
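Example 6.15. The arguments of mvrnorm are n, the number of samples
requested, mu, the mean vector of the distribution, and Sigma, the covariance
matrix of the distribution; the commands below abbreviate the last argument
name to Sig, which R accepts through partial matching of argument names.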
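To see how correlated normal variates can be built from independent ones, the following sketch uses a Cholesky factor of the covariance matrix. This is one common construction, shown only for illustration; it is not claimed to be the method used inside mvrnorm, and the function name rmvnorm_chol and the 2 × 2 covariance matrix are assumptions made for the example.

```r
# Sketch: simulate n rows from N(mu, Sigma) using a Cholesky factor.
# `rmvnorm_chol` is an illustrative name, not a function from MASS.
rmvnorm_chol <- function(n, mu, Sigma) {
  N <- length(mu)
  U <- chol(Sigma)                       # upper triangular, t(U) %*% U = Sigma
  Z <- matrix(rnorm(n * N), n, N)        # independent standard normal variates
  sweep(Z %*% U, 2, mu, "+")             # each row has covariance Sigma, mean mu
}

set.seed(7)                              # arbitrary seed
Sigma_toy <- matrix(c(0.2, 0.1, 0.1, 0.2), 2, 2)
x <- rmvnorm_chol(100000, mu = c(0.1, 0.2), Sigma = Sigma_toy)
```

Each row of Z %*% U is a linear function of independent normal variates, so it is multivariate normal by the definition above, and its covariance matrix is t(U) %*% U = Sigma.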


Example 6.17 Consider a three-dimensional multivariate normal distribution
with mean vector $(0.1, 0.2, 0.3)^T$ and covariance matrix
$$\begin{pmatrix} 0.2 & 0.1 & 0.1 \\ 0.1 & 0.2 & 0.1 \\ 0.1 & 0.1 & 0.2 \end{pmatrix}. \tag{6.9}$$

Thus, each of the three component random variables in the random vector
has variance 0.2 and the correlation between any two such random variables
is 0.5.
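The correlation follows directly from the entries of (6.9): each covariance is 0.1 and each variance is 0.2, so for any two components,

$$\rho = \frac{0.1}{\sqrt{0.2}\,\sqrt{0.2}} = \frac{0.1}{0.2} = 0.5.$$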
The following commands generate one sample from this distribution.

> library(MASS)
> mu0<-c(0.1,0.2,0.3)
> Sig0<-matrix(c(0.2,0.1,0.1,0.1,0.2,0.1,0.1,0.1,0.2), 3, 3)
> Sig0
[,1] [,2] [,3]
[1,] 0.2 0.1 0.1
[2,] 0.1 0.2 0.1
[3,] 0.1 0.1 0.2
> mvrnorm(n=1, mu=mu0, Sig=Sig0)
[1] -0.3335 0.0529 0.0982

For n = 1, the result of the function is a vector; for n > 1, the result is a
matrix, with each row corresponding to a simulated random vector. For exam-
ple, to draw four random vectors from the multivariate normal distribution
with mean vector mu0 and covariance matrix Sig0, and to store the result in
the variable ret.sim, we use the commands

> ret.sim<-mvrnorm(4, mu=mu0, Sig=Sig0)
> ret.sim
[,1] [,2] [,3]
[1,] -0.193 0.1644 -0.273
[2,] 0.297 0.3012 -0.148
[3,] 0.603 -0.0955 1.085
[4,] -0.149 0.2805 -0.331

Therefore, if the returns on a set of three assets are modeled as random vectors
with mean mu0 and covariance matrix Sig0, and the return vectors in different
time periods are independent, then the matrix ret.sim represents the returns
on the three assets over four time periods, with each row representing a time
period. Hence, ret.sim is a simulated data matrix of the type described in
Section 6.3 and is similar to the observed data matrix big8.
Simulated values of functions of these returns may be calculated in the
usual way. For instance, simulated return sample means for the three assets
are given by

> apply(ret.sim, 2, mean)

[1] 0.1393 0.1627 0.0833


Now consider replications of this procedure. For the case of a single asset, in
which the returns are given in a vector, independent replications of the returns
may be stored in a matrix, with each row corresponding to a series of asset
returns. When simulating a vector of returns, one replication of the simulation
is already a matrix. Although it is possible to handle the replications using a
three-dimensional array, a generalization of a matrix (an interesting exercise,
for those so inclined), the simplest approach is to use a loop.
For instance, to obtain 10,000 simulated values of the return sample means
for the three assets in the example, we can use the following commands:
> ret_means<-matrix(0, 10000, 3)
> for (j in 1:10000){
+ ret_sim<-mvrnorm(4, mu=mu0, Sig=Sig0)
+ ret_means[j, ]<-apply(ret_sim, 2, mean)
+ }
The command ret_means<-matrix(0, 10000, 3) creates a matrix that will
store the results. Each iteration of the loop simulates a vector of sample means
of the asset returns and stores them in a row of ret_means.
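For readers curious about the three-dimensional array approach mentioned above, one possible sketch is the following. It simulates all of the return vectors in a single call and then reshapes them; the variable names n_per, B, sims_arr, and ret_means_arr and the seed are assumptions made for illustration, and the layout of the array (period, replication, asset) is one choice among several.

```r
library(MASS)                                  # for mvrnorm
mu0 <- c(0.1, 0.2, 0.3)
Sig0 <- matrix(c(0.2, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.2), 3, 3)

set.seed(11)                                   # arbitrary seed
n_per <- 4                                     # time periods per replication
B <- 10000                                     # number of replications
sims <- mvrnorm(n_per * B, mu = mu0, Sigma = Sig0)  # (n_per*B) x 3 matrix
# Rows 1..4 form replication 1, rows 5..8 form replication 2, and so on
sims_arr <- array(sims, dim = c(n_per, B, 3))  # [period, replication, asset]
ret_means_arr <- apply(sims_arr, c(2, 3), mean) # B x 3 matrix of sample means
```

Because the rows produced by mvrnorm are independent, any grouping of them into blocks of four gives valid replications; the reshaping relies on R storing matrices and arrays in column-major order. The loop remains the more transparent option, especially when the statistic computed in each replication is more complicated than a vector of means.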
The results in ret_means may now be used to estimate properties of the
sampling distribution. For instance, the mean of the sampling distribution of
the vector of sample mean returns is estimated by
> apply(ret_means, 2, mean)
[1] 0.0967 0.2007 0.2991
which is close to the known mean vector (0.1, 0.2, 0.3)T .

Many statistical methods tend to work particularly well for normally dis-
tributed data. Thus, in conducting a Monte Carlo study, it is often of interest
to consider other distributions in addition to the multivariate normal.
The multivariate t-distribution is a multivariate generalization of the t-
distribution; its relationship to the univariate t-distribution is similar to the
relationship the multivariate normal distribution has to the univariate normal
distribution.
To simulate random variates with a multivariate t-distribution in R, we
use the function rmvt, which is available in the package mvtnorm (Genz et al.
2016). In the function rmvt, we may specify the number of samples (the
argument n), the degrees of freedom (the argument df), and the “scale” matrix
(the argument sigma). Let A denote the scale matrix; then the covariance
matrix of the distribution is given by Σ = A(ν/(ν − 2)), where ν denotes the
degrees of freedom of the distribution. The following example illustrates the
use of rmvt.
Example 6.18 Consider a three-dimensional random vector with the mean
vector and covariance matrix given in Example 6.17 and stored in the R
variables mu0 and Sig0, respectively. Furthermore, assume that the random
vector has a multivariate t-distribution.


To generate one sample from the multivariate t-distribution with six
degrees of freedom, mean vector mu0, and covariance matrix Sig0, we use
the following command.

> library(mvtnorm)
> rmvt(n=1, df=6, sigma=Sig0)/(1.5^.5) + mu0
[,1] [,2] [,3]
[1,] 0.172 0.677 0.0995

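Note that rmvt draws a random vector from a multivariate t-distribution with
mean vector 0; adding the vector mu0 to the result modifies the mean vector
to mu0. Dividing the result by 1.5^.5 rescales the draws so that their covariance
matrix is Sig0 rather than 1.5 times Sig0, since ν/(ν − 2) = 1.5 for ν = 6.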
To simulate several random vectors, we can increase the value of n. How-
ever, in order to add the mean vector to the simulated random vectors, we
must construct a matrix of mean vectors. For instance, suppose that we wish
to simulate four random vectors. Then

> rmvt(n=4, df=6, sigma=Sig0)/(1.5^.5)

[,1] [,2] [,3]
[1,] 1.2761 -0.2582 -0.367
[2,] -0.0084 0.0747 -0.676
[3,] 0.7659 0.3709 0.588
[4,] 0.4114 0.1233 0.324

returns a data matrix with the correct covariance matrix but each value is
simulated from a distribution with mean zero.
To add the mean vector to the result of rmvt, note that
matrix(mu0, 4, 3, byrow=T) returns a matrix with each row equal to mu0:

> matrix(mu0, 4, 3, byrow=T)

[,1] [,2] [,3]
[1,] 0.1 0.2 0.3
[2,] 0.1 0.2 0.3
[3,] 0.1 0.2 0.3
[4,] 0.1 0.2 0.3

Thus, the command

> rmvt(n=4, df=6, sigma=Sig0)/(1.5^.5)+matrix(mu0, 4, 3, byrow=T)

[,1] [,2] [,3]
[1,] -0.0498 0.4837 -0.2124
[2,] -0.4936 0.7804 0.0633
[3,] 0.2842 0.0878 0.3022
[4,] -0.0518 -0.0329 -0.1120

returns random variates simulated from the correct distribution.


To verify that the command is working as desired, we may simulate a data
matrix with 1000 observations on the random vector,

> x<-rmvt(n=1000, df=6, sigma=Sig0)/(1.5^.5)+matrix(mu0, 1000, 3,
+ byrow=T)

and verify that the sample mean vector and sample covariance matrix are
close to mu0 and Sig0, respectively.

> apply(x, 2, mean)

[1] 0.110 0.203 0.293
> cov(x)
[,1] [,2] [,3]
[1,] 0.220 0.1116 0.1063
[2,] 0.112 0.2055 0.0956
[3,] 0.106 0.0956 0.2043
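The steps of this example can be collected into a single helper that takes the desired mean vector and covariance matrix directly. The sketch below is written in base R and builds the multivariate t-distribution from its usual representation as a multivariate normal vector divided by the square root of an independent chi-squared variate over its degrees of freedom; it is an illustration under that assumption, not the implementation used by rmvt, and the name rmvt_mean_cov is hypothetical.

```r
# Hypothetical helper: draw n vectors from a multivariate t-distribution
# with df degrees of freedom, mean vector mu, and covariance matrix Sigma.
rmvt_mean_cov <- function(n, df, mu, Sigma) {
  stopifnot(df > 2)                    # covariance matrix exists only for df > 2
  N <- length(mu)
  A <- Sigma * (df - 2) / df           # scale matrix implied by Sigma
  Z <- matrix(rnorm(n * N), n, N) %*% chol(A)  # rows are N(0, A)
  W <- rchisq(n, df = df) / df         # one mixing variable per row
  X <- Z / sqrt(W)                     # row i is divided by sqrt(W[i])
  sweep(X, 2, mu, "+")                 # add mu to every row
}

mu0 <- c(0.1, 0.2, 0.3)
Sig0 <- matrix(c(0.2, 0.1, 0.1, 0.1, 0.2, 0.1, 0.1, 0.1, 0.2), 3, 3)
set.seed(5)                            # arbitrary seed
x <- rmvt_mean_cov(1000, df = 6, mu = mu0, Sigma = Sig0)
```

Using sweep to add mu to each row avoids constructing a matrix of mean vectors, and converting the covariance matrix to a scale matrix inside the function avoids the separate division by 1.5^.5.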

We now consider two examples illustrating how Monte Carlo simulation
may be used to better understand the properties of statistical methods used
to analyze financial data.

Using Monte Carlo Simulation to Describe the Sampling Distribution of a Statistic
In Chapters 4 and 5, a number of different approaches to constructing portfolio
weights were discussed; these methods are based on the means, standard
deviations, and correlations of the returns on the assets under consideration,
along with some criteria for choosing the weights. Here we use Monte Carlo
simulation to assess the variability in the estimated weights for the tangency
portfolio of two assets.
Let μj and σj denote the mean and standard deviation, respectively, of
the return on asset j, j = 1, 2, and let ρ12 denote the correlation of the returns
on the two assets. Then, according to the analysis in Section 4.6, the tangency
portfolio places weight
$$w_T = \frac{(\mu_1 - \mu_f)\sigma_2^2 - (\mu_2 - \mu_f)\rho_{12}\sigma_1\sigma_2}{(\mu_2 - \mu_f)\sigma_1^2 + (\mu_1 - \mu_f)\sigma_2^2 - [(\mu_1 - \mu_f) + (\mu_2 - \mu_f)]\rho_{12}\sigma_1\sigma_2} \tag{6.10}$$

on asset 1, where μf denotes the return on the risk-free asset. Alternatively,
we may use the formula for the tangency weights for a vector of assets,
$$\frac{\Sigma^{-1}(\mu - \mu_f \mathbf{1})}{\mathbf{1}^T \Sigma^{-1}(\mu - \mu_f \mathbf{1})},$$
which is easier to evaluate numerically. It is straightforward to show that this
formula reduces to (6.10) for N = 2.
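The agreement between the two formulas can also be checked numerically. The sketch below uses the two-asset values that appear in Example 6.19 (mean excess returns 0.025 and 0.010, variances 0.0055 and 0.0017, covariance 0.0008); the function name tangency_w1 is hypothetical. The first term of the numerator uses the variance of asset 2, which is what the matrix formula implies when the 2 × 2 covariance matrix is inverted.

```r
# Illustrative helper: weight on asset 1 in the two-asset tangency portfolio.
# m1, m2 are mean excess returns (mu_j - mu_f); s1, s2 are standard deviations.
tangency_w1 <- function(m1, m2, s1, s2, rho) {
  num <- m1 * s2^2 - m2 * rho * s1 * s2
  den <- m2 * s1^2 + m1 * s2^2 - (m1 + m2) * rho * s1 * s2
  num / den
}

# Inputs taken from Example 6.19
Sig1 <- matrix(c(0.0055, 0.0008, 0.0008, 0.0017), 2, 2)
mu1 <- c(0.025, 0.010)
s <- sqrt(diag(Sig1))                 # standard deviations
rho <- Sig1[1, 2] / (s[1] * s[2])     # correlation

w_formula <- tangency_w1(mu1[1], mu1[2], s[1], s[2], rho)
w_matrix <- (solve(Sig1, mu1) / sum(solve(Sig1, mu1)))[1]
c(w_formula, w_matrix)                # both are approximately 0.496
```

Both calculations give a weight of about 0.496 on the first asset, matching the value reported in Example 6.19.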


Example 6.19 Consider two assets such that the excess return on the first
asset has mean 0.025 and variance 0.0055 and the excess return on the second
asset has mean 0.010 and variance 0.0017; take the covariance of the returns
to be 0.0008. These are the observed values (slightly rounded) for Apple stock
and Coca-Cola stock based on the big8 data. Define variables mu1 and Sig1
to represent the mean vector and covariance matrix of the excess returns:
> mu1<-c(0.025, 0.010)
> Sig1<-matrix(c(0.0055, 0.0008, 0.0008, 0.0017), 2, 2)
Thus, the tangency portfolio has weights
> solve(Sig1, mu1)/sum(solve(Sig1, mu1))
[1] 0.496 0.504
so that this portfolio places roughly half its investment on each asset.
Simulating the data according to a multivariate normal distribution, as
described previously in this section, we may draw a sample from the sampling
distribution of the weight on Apple stock (note that, since the weights sum
to 1, we only need to consider one of the weights).
> wgts<-rep(0, 10000)
> sharpe<-rep(0, 10000)
> for (j in 1:10000){
+ ret_sim<-mvrnorm(60, mu=mu1, Sig=Sig1)
+ mean_sim<-apply(ret_sim, 2, mean)
+ sig_sim<-cov(ret_sim)
+ wgt_sim<-solve(sig_sim, mean_sim)/sum(solve(sig_sim,
+ mean_sim))
+ wgts[j]<-wgt_sim[1]
+ sharpe[j]<-sum(mean_sim*wgt_sim)/((wgt_sim%*%sig_sim%*%
+ wgt_sim)^.5)
+ }
The vector wgts contains a sample of size 10,000 from the sampling distribution
of wT, the weight on Apple stock in the tangency portfolio of Apple and
Coca-Cola stocks, each based on 60 observations; the vector sharpe contains
the estimated Sharpe ratio corresponding to the estimated tangency portfolio.
The sampling distribution of wT can be summarized using the usual functions;
for example,
> mean(wgts)
[1] 0.64
> median(wgts)
[1] 0.487
Hence, although the median weight of 0.487 is close to the true weight of
0.496, the mean weight is considerably larger than the true weight, suggesting
a skewed distribution.

The quantiles of the sampling distribution are a useful summary of the
distribution; these may be calculated using the quantile function.
> probvec<-c(0.01, 0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.99)
> quantile(wgts, prob=probvec)
1% 5% 10% 25% 50% 75% 90% 95% 99%
-0.439 0.095 0.190 0.324 0.487 0.713 1.057 1.476 3.565
These results show that there is considerable variability in the estimated
weights. For instance, in more than 10% of the Monte Carlo simulations
the weight for Apple stock is greater than one and, hence, the weight for
Coca-Cola is negative.
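The simulation can also be reproduced without any add-on package. The following is a minimal sketch, not the procedure used above: it assumes a Cholesky factor of Sig1 in place of mvrnorm, an arbitrary seed, and 5,000 rather than 10,000 replications to keep the run short; the names U, n_rep, n_obs, and wgts_chk are illustrative.

```r
set.seed(1)  # arbitrary seed, used only so that the run is repeatable
mu1 <- c(0.025, 0.010)
Sig1 <- matrix(c(0.0055, 0.0008, 0.0008, 0.0017), 2, 2)
n_rep <- 5000
n_obs <- 60

U <- chol(Sig1)  # upper-triangular factor satisfying t(U) %*% U = Sig1
wgts_chk <- numeric(n_rep)
for (j in seq_len(n_rep)) {
  # rows of z %*% U have covariance Sig1; sweep() adds the mean vector
  z <- matrix(rnorm(n_obs * 2), n_obs, 2)
  ret_sim <- sweep(z %*% U, 2, mu1, "+")
  a <- solve(cov(ret_sim), colMeans(ret_sim))
  wgts_chk[j] <- a[1] / sum(a)  # estimated tangency weight on the first asset
}

print(median(wgts_chk))    # should be near the true weight
print(mean(wgts_chk > 1))  # proportion of replications with weight above one
```

The exact values depend on the seed, so they will not match the output shown above digit for digit.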
The maximum Sharpe ratios also exhibit a high degree of variability:
> quantile(sharpe, probs=probvec)
1% 5% 10% 25% 50% 75% 90% 95% 99%
0.018 0.189 0.235 0.311 0.399 0.491 0.575 0.629 0.737
These values may be compared to the Sharpe ratio of the tangency portfolio
based on the true distribution, which is 0.373.
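The value 0.373 can be checked directly from mu1 and Sig1. The sketch below computes the Sharpe ratio of the tangency portfolio in two ways: from the portfolio mean and standard deviation, and from the closed form sqrt(mu' Sigma^{-1} mu), which applies here because the elements of solve(Sig1, mu1) have a positive sum. The names w_tan, sharpe_true, and sharpe_closed are illustrative.

```r
# Mean vector and covariance matrix of the excess returns, as in Example 6.19
mu1 <- c(0.025, 0.010)
Sig1 <- matrix(c(0.0055, 0.0008, 0.0008, 0.0017), 2, 2)

# Tangency weights: proportional to solve(Sig1, mu1), scaled to sum to 1
w_tan <- solve(Sig1, mu1) / sum(solve(Sig1, mu1))

# Sharpe ratio: mean excess return divided by return standard deviation
sharpe_true <- sum(w_tan * mu1) / sqrt(drop(t(w_tan) %*% Sig1 %*% w_tan))

# Closed form for the maximum Sharpe ratio
sharpe_closed <- sqrt(sum(mu1 * solve(Sig1, mu1)))

print(round(w_tan, 3))
print(round(c(sharpe_true, sharpe_closed), 3))
```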

The analysis in Example 6.19 is based on the assumption that the returns
in a given period have a multivariate normal distribution; we may repeat
the analysis using the assumption that the returns have a multivariate t-
distribution with six degrees of freedom.
Example 6.20 Consider the two assets described in Example 6.19, with mean
excess return vector given in mu1 and return covariance matrix given in Sig1.
Simulating the data according to a multivariate t-distribution with six
degrees of freedom, as described previously in this section, we may draw a
sample from the sampling distribution of the weight on Apple stock using the
following commands.
> wgts.t<-rep(0, 10000)
> sharpe.t<-rep(0, 10000)
> for (j in 1:10000){
+ ret_sim<-rmvt(60, df=6, sigma=Sig1)/(1.5^.5) + matrix(mu1,
+ 60, 2, byrow=T)
+ mean_sim<-apply(ret_sim, 2, mean)
+ sig_sim<-cov(ret_sim)
+ wgt_sim<-solve(sig_sim, mean_sim)/sum(solve(sig_sim,
+ mean_sim))
+ wgts.t[j]<-wgt_sim[1]
+ sharpe.t[j]<-sum(mean_sim*wgt_sim)/((wgt_sim%*%sig_sim%*%
+ wgt_sim)^.5)
+ }
The vector wgts.t contains a sample of size 10,000 from the sampling distribution
of the weight on Apple stock in the tangency portfolio of Apple and
Coca-Cola stocks under the assumption that the returns have a t-distribution.
Thus, the sampling distribution of this weight may be summarized by its
quantiles:
> quantile(wgts.t, probs=probvec)
1% 5% 10% 25% 50% 75% 90% 95% 99%
-0.638 0.085 0.180 0.320 0.486 0.710 1.054 1.418 3.487
Note that the results are very similar to those in Example 6.19.
The quantiles of the sampling distribution of the maximum Sharpe ratio
are given by
> quantile(sharpe.t, probs=probvec)
1% 5% 10% 25% 50% 75% 90% 95% 99%
-0.060 0.187 0.236 0.315 0.404 0.500 0.588 0.641 0.746
Again, these results are similar to those in Example 6.19.
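The division by 1.5^.5 in the rmvt call reflects the fact that a t-distribution with six degrees of freedom has variance 6/(6 − 2) = 1.5, so the draws must be rescaled for the simulated returns to have covariance matrix Sig1. A univariate sketch of the same idea, using the base function rt and an arbitrary seed (the names var_t, x_raw, and x_scaled are illustrative):

```r
set.seed(2)  # arbitrary seed, used only so that the run is repeatable
df <- 6
var_t <- df / (df - 2)  # theoretical variance of a t-distribution with df > 2

x_raw <- rt(200000, df = df)     # unscaled t draws
x_scaled <- x_raw / sqrt(var_t)  # rescaled to have unit variance

print(var_t)
print(var(x_raw))     # close to 1.5
print(var(x_scaled))  # close to 1
```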

The results of the previous two examples suggest that tangency weights
based on sample data are not well-determined. There are two general reasons for
this. One is the variability in the estimates of the mean vector and the covariance
matrix, as we have discussed in this section. The other is that, although the
tangency portfolio maximizes the Sharpe ratio, many portfolios have a Sharpe
ratio close to the maximum value and, hence, the tangency portfolio itself is not
very well-defined. This fact may be illustrated in the following example.
Example 6.21 Consider the framework in Example 6.19. Recall that the tangency
portfolio, which maximizes the Sharpe ratio, places weight 0.496 on
Apple stock. Figure 6.3 contains a plot of the Sharpe ratio of a portfolio of
Apple and Coca-Cola stocks, using the distribution of returns described in
Example 6.19, as a function of w, the weight placed on Apple stock. The dotted
line corresponds to a Sharpe ratio of 0.354, which is 95% of the maximum value;
given the results in Example 6.19, this is a small difference from the maximum
Sharpe ratio relative to the variation in estimating the maximum Sharpe ratio.
The values of w corresponding to a Sharpe ratio of 95% of the maximum
value are roughly 0.29 and 0.81. Thus, for a wide range of values of w, the
Sharpe ratio is close to the maximum value.

FIGURE 6.3
Plot of the Sharpe ratio versus w in Example 6.21 (horizontal axis: w, from 0 to 1.0; vertical axis: Sharpe ratio, from 0.20 to 0.40).
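The two values of w quoted in this example can be recovered numerically. The following sketch defines the Sharpe ratio as a function of w and uses uniroot to solve for the weights at which it equals 95% of its maximum; the function name sharpe_w and the variables target, w_lo, and w_hi are illustrative.

```r
mu1 <- c(0.025, 0.010)
Sig1 <- matrix(c(0.0055, 0.0008, 0.0008, 0.0017), 2, 2)

# Sharpe ratio of the portfolio with weight w on Apple and 1 - w on Coca-Cola
sharpe_w <- function(w) {
  wv <- c(w, 1 - w)
  sum(wv * mu1) / sqrt(drop(t(wv) %*% Sig1 %*% wv))
}

# Tangency weight on Apple and 95% of the maximum Sharpe ratio
w_tan <- solve(Sig1, mu1)[1] / sum(solve(Sig1, mu1))
target <- 0.95 * sharpe_w(w_tan)

# Solve sharpe_w(w) = target on each side of the tangency weight
w_lo <- uniroot(function(w) sharpe_w(w) - target, c(0, w_tan))$root
w_hi <- uniroot(function(w) sharpe_w(w) - target, c(w_tan, 1))$root

print(round(c(target, w_lo, w_hi), 3))
```

A plot along the lines of Figure 6.3 could then be drawn with, for example, curve(Vectorize(sharpe_w)(x), 0, 1) and abline(h=target, lty=3).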

Using Monte Carlo Simulation to Compare Estimators
In many cases, there are two or more estimators that might be used to esti-
mate a given parameter; and, hence, it is important to use the one that can
be expected to yield more accurate estimates. In such cases, Monte Carlo
simulation is often useful in evaluating the estimators.
The following example compares the shrinkage estimator of a mean return,
as described in Section 6.5, with the estimator based on the sample mean of
the return vectors.
Example 6.22 The procedure used to calculate the shrinkage estimate of
asset mean returns in Example 6.11, based on the data in the matrix big8,
may be summarized as follows.
> Rbar<-apply(big8, 2, mean)
> S2<-apply(big8, 2, var)
> tau2<-mean((Rbar-mean(Rbar))^2)
> psi<-(mean(S2)/60)/(tau2 + (mean(S2)/60))
> muhat<-psi*mean(Rbar) + (1-psi)*Rbar
Suppose we simulate returns using the approach described in Example
6.17, taking mu and Sig to be the estimates from the big8 data set; then a
simulated value for the vector of shrinkage estimates may be calculated using
the aforementioned procedure, with a simulated data matrix replacing big8.
The following commands repeat such a procedure 10,000 times.
> library(MASS)
> shrnk_means<-matrix(0, 10000, 8)
> sample_means<-matrix(0, 10000, 8)
> for (j in 1:10000){
+ ret_sim<-mvrnorm(60, mu=Rbar, Sigma=Sighat)
+ Rb<-apply(ret_sim, 2, mean)
+ S2<-apply(ret_sim, 2, var)
+ tau2<-mean((Rb-mean(Rb))^2)
+ psi<-(mean(S2)/60)/(tau2 + (mean(S2)/60))
+ muhat<-psi*mean(Rb) + (1-psi)*Rb
+ shrnk_means[j, ]<-muhat
+ sample_means[j, ]<-Rb
+ }
The matrix shrnk_means contains a sample of 10,000 from the sampling distribution
of the vector of shrinkage estimates of the mean asset returns, with
one set of estimates in each row of the matrix. The matrix sample_means
contains the sample means from each iteration in a similar format.
The results in shrnk_means and sample_means may be summarized using
the usual functions. For instance, the mean vector of the sampling distribution
of the shrinkage estimator may be obtained by
> apply(shrnk_means, 2, mean)
[1] 0.0211 0.0097 0.0113 0.0183 0.0104 0.0090 0.0122 0.0180
and the bias of the shrinkage estimator is given by
> apply(shrnk_means, 2, mean)-Rbar
AAPL BAX KO CVS XOM IBM JNJ DIS
-0.0043 0.0023 0.0015 -0.0029 0.0022 0.0030 0.0007 -0.0027
Thus, the shrinkage estimator appears to be biased; the standard error of the
estimated bias is the sample standard deviation of the values in each column
of shrnk_means divided by the square root of the sample size, which in this
context is 10,000, the number of Monte Carlo replications:
> apply(shrnk_means, 2, sd)/10000^.5
[1] 7.97e-05 5.67e-05 4.39e-05 6.36e-05 5.01e-05 4.92e-05
4.36e-05 6.33e-05
Therefore, for each asset, the estimated bias is much greater in magnitude than
its standard error; for instance, the ratios of the estimated biases to their standard
errors, that is, the t-statistics for testing the hypothesis of no estimator bias,
are given by
AAPL BAX KO CVS XOM IBM JNJ DIS
-53.6 40.9 33.6 -45.4 42.9 60.5 16.1 -43.3
Hence, based on these results, we conclude that the shrinkage estimators are
biased; this is to be expected because the shrinkage estimate of an asset mean
return is a weighted average of the sample mean return for that asset and the
sample mean return for all assets.
A similar analysis can be done on the sample mean returns. The estimated
biases are given by
> apply(sample_means, 2, mean)-Rbar
AAPL BAX KO CVS XOM IBM
-2.86e-05 3.98e-05 -7.07e-05 -4.33e-05 -4.45e-05 -1.64e-05
JNJ DIS
-5.85e-05 -4.29e-05
and the ratios of the estimated biases to their standard errors are
> (apply(sample_means, 2, mean)-Rbar)/(apply(sample_means, 2,
+ sd)/10000^.5)
AAPL BAX KO CVS XOM IBM JNJ DIS
-0.300 0.558 -1.354 -0.576 -0.748 -0.279 -1.161 -0.573
As expected, none of the estimated biases is statistically significant at the 5%
level.
To compare the accuracies of these estimators, we can look at the mean
squared error (MSE) of the estimators. For an estimator θ̂ of a real-valued
parameter θ, the MSE is given by

$$\mathrm{MSE}(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right];$$

the MSE is therefore a type of expected squared distance between the estimator
and the parameter we are trying to estimate. Thus, a smaller value of the
MSE indicates a more accurate estimator, on average.
It may be shown that

$$\mathrm{MSE}(\hat{\theta}) = \mathrm{bias}(\hat{\theta})^2 + \mathrm{Var}(\hat{\theta}),$$

where bias(θ̂) = E(θ̂) − θ is the bias of the estimator. Hence, the MSE com-
bines the bias of an estimator with the variability of its sampling distribution,
as measured by the variance. Often the square root of the MSE is reported (the
“root mean squared error” or RMSE), for the same reason that the standard
deviation is often preferred to the variance as a measure of variability.
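The decomposition of the MSE into squared bias plus variance follows by adding and subtracting E(θ̂) inside the square. A brief derivation sketch, using only the definitions given above:

```latex
\begin{aligned}
\mathrm{MSE}(\hat{\theta})
  &= E\left[\left(\hat{\theta} - E(\hat{\theta}) + E(\hat{\theta}) - \theta\right)^2\right] \\
  &= E\left[\left(\hat{\theta} - E(\hat{\theta})\right)^2\right]
     + 2\left(E(\hat{\theta}) - \theta\right) E\left[\hat{\theta} - E(\hat{\theta})\right]
     + \left(E(\hat{\theta}) - \theta\right)^2 \\
  &= \mathrm{Var}(\hat{\theta}) + \mathrm{bias}(\hat{\theta})^2 .
\end{aligned}
```

The cross term vanishes because E[θ̂ − E(θ̂)] = 0, and E(θ̂) − θ is a constant, so it may be taken outside the expectation.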
For the shrinkage estimators, the MSE may be estimated using the results
in shrnk_means.
> shrnk.mse<-(apply(shrnk_means, 2, mean)-Rbar)^2 +
+ apply(shrnk_means, 2, var)
Monte Carlo estimates of the RMSE of the estimators are therefore
> shrnk.mse^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.00904 0.00613 0.00463 0.00699 0.00546 0.00575 0.00442 0.00689
The analogous results for the sample mean returns are
> mean.mse<-(apply(sample_means, 2, mean)-Rbar)^2 +
+ apply(sample_means, 2, var)
> mean.mse^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.00954 0.00714 0.00522 0.00752 0.00595 0.00587 0.00504 0.00749


The ratios of the RMSE values for the shrinkage estimators to the RMSE
values for the sample mean returns are given by
> sort((shrnk.mse/mean.mse)^.5)
BAX JNJ KO XOM DIS CVS AAPL IBM
0.858 0.876 0.887 0.917 0.921 0.929 0.948 0.980
Note that the sort function puts the vector in increasing order.
Thus, for all of the stocks, the shrinkage estimator has a smaller estimated
RMSE than does the sample mean estimator. That is, the biases of the
shrinkage estimators are more than offset by their smaller standard deviations,
leading to more accurate estimates, on average.
It is important to keep in mind that the results in this example, like all
results based on Monte Carlo simulation, are simply estimates of the true
properties of the estimators under consideration. In particular, if the Monte
Carlo analysis is repeated, the results will change; however, with a large Monte
Carlo sample size, such as the 10,000 used here, the changes tend to be small.
For instance, if the analysis in this example is repeated, the new estimates of
the ratios of the RMSE values are given by
BAX JNJ KO DIS XOM CVS AAPL IBM
0.865 0.880 0.888 0.915 0.919 0.920 0.938 0.979
Note that, although the ratios have all changed from the original estimates,
the changes are minor and the general conclusions regarding the estimators
do not change.
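Because the comparison above depends on the big8 data, a self-contained version may be useful for experimentation. The sketch below is an illustration under stated assumptions rather than a reproduction of the example: the true means mu_true and standard deviations sd_true are hypothetical, the returns on different assets are taken to be independent, only 2,000 replications are used, and the RMSE is computed directly as the root of the average squared error about the true means.

```r
set.seed(3)  # arbitrary seed, used only so that the run is repeatable
n_obs <- 60
n_rep <- 2000
mu_true <- seq(0.008, 0.015, length.out = 8)  # hypothetical true mean returns
sd_true <- rep(0.06, 8)                       # hypothetical return standard deviations

shrnk <- matrix(0, n_rep, 8)
smean <- matrix(0, n_rep, 8)
for (j in seq_len(n_rep)) {
  # independent normal returns (an assumption); column k holds asset k
  ret_sim <- matrix(rnorm(n_obs * 8, mean = rep(mu_true, each = n_obs),
                          sd = rep(sd_true, each = n_obs)), n_obs, 8)
  Rb <- colMeans(ret_sim)
  S2 <- apply(ret_sim, 2, var)
  tau2 <- mean((Rb - mean(Rb))^2)
  psi <- (mean(S2) / n_obs) / (tau2 + mean(S2) / n_obs)
  shrnk[j, ] <- psi * mean(Rb) + (1 - psi) * Rb  # shrinkage estimates
  smean[j, ] <- Rb                               # sample means
}

# RMSE of each estimator about the true means, asset by asset
rmse <- function(est) sqrt(colMeans(sweep(est, 2, mu_true)^2))
rmse_ratio <- rmse(shrnk) / rmse(smean)
print(round(rmse_ratio, 3))
```

Changing the seed and rerunning gives a sense of how much the estimated ratios vary from one Monte Carlo run to the next.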

6.8 Suggestions for Further Reading

Parameter estimation is a central topic in statistics. Rice (2007, Chapter 8)
presents a good, general discussion of parameter estimation. Johnson and
Wichern (2007, Chapter 3) consider properties of the sample mean vector and
sample covariance matrix based on observation of a random vector; see also
Jobson and Korkie (1980). Fabozzi et al. (2006, Chapter 8) discuss estimation
from the perspective of finance. The properties of a sample covariance matrix
when the number of variables is close to the number of time periods observed
are studied in an area of statistics known as random matrix theory; see, for
example, Fabozzi et al. (2006, Chapter 8) for a useful summary. The result
in Section 6.2 on the trace of the inverse of the sample correlation matrix is
attributed to Bouchaud and Potters (2011), who present further discussion of
the implications of such a result.
EWMA estimators are discussed by Miller (2012, Chapter 10) and by
Alexander (2008, Section II.3.8). Shrinkage estimation, also known as empirical
Bayes estimation, is an important statistical method; see Efron and
Morris (1970) for a useful introduction and Carlin and Louis (2000) for a
book-length treatment. The shrinkage estimator of the mean vector presented
here is attributed to DeMiguel et al. (2013), who show that it performs well
in portfolio analysis; see Fabozzi et al. (2006, Chapter 9) for an alternative
approach. Jorion (1986) and Ledoit and Wolf (2004) discuss shrinkage estimation
of expected returns and the return covariance matrix, respectively, in
the context of portfolio analysis.
Monte Carlo simulation is an extremely useful technique for understanding
the properties of statistical methods in complex settings. Evans and Rosenthal
(2004, Chapter 4) offer a good introduction to the use of Monte Carlo methods
in understanding sampling distributions. Ross (2013) provides a detailed
introduction to simulation in general; see Hammersley and Handscomb (1964)
for a more advanced introduction to Monte Carlo methods. The Monte Carlo
methods described here are closely related to the statistical method known as
the parametric bootstrap; see, for example, Efron and Tibshirani (1993, Section
6.5) and Davison and Hinkley (1997, Section 2.2). The suggestion to model
asset returns as random variables having a t-distribution with six degrees of
freedom is attributed to Praetz (1972); see also Blattberg and Gonedes (1974)
and Gray and French (1990). The sensitivity of optimal portfolio weights to
estimation error is discussed by Best and Grauer (1991) and DeMiguel et al.
(2009).

6.9 Exercises
1. Consider the returns on two stocks, Papa John’s International, Inc.
(symbol PZZA), and Bed Bath & Beyond, Inc. (BBBY). For each
stock, calculate five years of monthly returns for the period ending
December 31, 2015.
a. Calculate the sample mean of the returns on each stock.
b. Calculate the sample standard deviation of the returns on each
stock.
c. Calculate the sample correlation of the returns on the two stocks.
2. Repeat Exercise 1 using five years of daily returns for the period
ending December 31, 2015.
Compare the sample means and sample standard deviations
for the daily returns to the corresponding values for the monthly
returns. Do the relationships between monthly and daily values dis-
cussed in Section 2.5 appear to hold at least approximately? For
the comparisons, round the results to three significant figures.
Compare the sample correlation of the daily returns to the
sample correlation of the monthly returns.
3. Using data on three-month Treasury Bills obtained from the Federal
Reserve website, calculate five years of monthly risk-free rates
for the period ending December 31, 2015. Calculate five years of
monthly excess returns for this period for Papa John’s and Bed
Bath & Beyond stock.
a. Calculate the sample mean of the excess returns on each stock.
b. Calculate the sample standard deviation of the excess returns on
each stock.
c. Calculate the sample correlation of the returns on the two stocks
using the excess returns.
d. Compare the results obtained in Parts (b) and (c) to those
obtained in Exercise 1. For the comparison, round the results
to three significant figures.
4. Calculate approximate 95% confidence intervals for the mean
monthly excess return on Papa John’s stock and the mean monthly
excess return on Bed Bath & Beyond stock. Using the procedure
discussed in Example 6.5, calculate an approximate 95% confidence
interval for the difference in mean monthly excess returns on these
two stocks.
5. Using a decay parameter of 0.93, calculate weighted estimates of
the mean and standard deviation of the monthly excess return on
Papa John’s stock; see Examples 6.7 and 6.8. Compare the results
to the (unweighted) sample mean and sample standard deviation.
6. Construct a data matrix consisting of five years of monthly excess
returns on five stocks, Papa John’s International, Inc. (symbol
PZZA), Bed Bath & Beyond, Inc. (BBBY), Netflix, Inc. (NFLX),
Time Warner, Inc. (TWX), and Verizon Communications, Inc.
(VZ); use returns for the time period ending December 31, 2015.
Add column names corresponding to the stock symbols to the data
matrix.
a. Calculate the sample mean of the excess returns of each stock.
b. Calculate the sample mean of the risk-free rate R̄f and use that
to calculate the sample mean of the standard returns of each
stock.
c. Calculate the sample standard deviation of the excess returns of
each stock.
d. Calculate the sample covariance matrix of the excess returns of
the five stocks.
e. Calculate the sample correlation matrix of the excess returns of
the five stocks.
7. Consider the data matrix constructed in Exercise 6, consisting of
five years of monthly excess returns on five stocks. Let μ1, μ2, . . . , μ5
denote the respective mean returns on the stocks. Using the procedure
described in Example 6.11, construct shrinkage estimates of
μ1 − μf, μ2 − μf, . . . , μ5 − μf where μf is the mean return on the
risk-free asset.
8. Let S denote an N × N sample covariance matrix. Show that
ψσ²I + (1 − ψ)S is positive-definite for any σ² > 0 and any ψ satisfying 0 < ψ ≤ 1.
9. Consider the data matrix constructed in Exercise 6, consisting of
five years of monthly excess returns on five stocks. Let Σ denote the
covariance matrix of the returns on the five stocks. Using the pro-
cedure described in Example 6.12, construct a shrinkage estimate
of Σ using a matrix of the form σ²I as the target matrix.
Compare the shrinkage estimates of the asset return standard
deviations to the sample standard deviations. Compare the corre-
sponding shrinkage estimate of the correlation matrix of the assets
to the sample correlation matrix.
10. Consider the data matrix constructed in Exercise 6, consisting of
five years of monthly excess returns on five stocks. Using the sample
covariance matrix based on the excess returns, estimate the weights
of the minimum-variance portfolio for the five stocks corresponding
to symbols PZZA, BBBY, NFLX, TWX, and VZ; see Example 6.13.
11. Consider the data matrix constructed in Exercise 6, consisting
of five years of monthly excess returns on five stocks. For these
stocks, estimate the weights of the risk-averse portfolio based on a
risk-aversion parameter λ = 8; see Example 6.14.
12. Repeat the calculation in Exercise 11, adding the restriction that
all portfolio weights must be nonnegative.
13. Consider the data matrix constructed in Exercise 6, consisting of
five years of monthly excess returns on five stocks. Using the shrink-
age estimate of the return covariance matrix calculated in Exercise
9, estimate the weights of the minimum-variance portfolio for the
five stocks corresponding to symbols PZZA, BBBY, NFLX, TWX,
and VZ; see Example 6.13.
14. Consider the data matrix constructed in Exercise 6, consisting of
five years of monthly excess returns on five stocks. Using the sam-
ple means and the sample covariance matrix based on the excess
returns, estimate the weights of the tangency portfolio for the
five stocks corresponding to symbols PZZA, BBBY, NFLX, TWX,
and VZ.
15. Consider the data matrix constructed in Exercise 6, consisting
of five years of monthly excess returns on five stocks. Using the
shrinkage estimate of the mean excess return vector calculated in
Exercise 7 and the shrinkage estimate of the return covariance
matrix calculated in Exercise 9, estimate the weights of the tan-
gency portfolio for the five stocks corresponding to symbols PZZA,
BBBY, NFLX, TWX, and VZ.


16. Consider an asset with return R; let μ = E(R) and σ² = Var(R);
let Rf denote the return on the risk-free asset and let μf = E(Rf).
Let Rp denote a portfolio return of the form wR + (1 − w)Rf.
Suppose that w is chosen to achieve a return standard deviation
of σ∗ for some given σ∗ > 0; then w, the weight given to the risky
asset in the portfolio, must be taken to be σ∗/σ.
Now suppose that σ is unknown and must be estimated based
on a sample of 60 observed returns on the risky asset; let S denote
the sample standard deviation of these returns. Then the estimated
weight given to the risky asset in the portfolio is σ∗/S and the actual
standard deviation of the portfolio return based on this estimate is

σ̄ = (σ∗/S) σ.

The purpose of this exercise is to use Monte Carlo simulation
to study the properties of the quantity σ̄ in this setting. Take σ∗ =
0.02 and suppose that σ = 0.04; because the analysis uses only the
standard deviations, we may take μ = 0 without loss of generality.
a. Using the function rnorm, construct a 10,000 × 60 matrix
of observations drawn independently from a normal distribution
with mean 0 and standard deviation 0.04; see Example
6.15. Using the apply function, compute the sample standard
deviation of each row.
b. Using the vector of sample standard deviations, calculate a vec-
tor of values of σ̄, using σ∗ = 0.02 and σ = 0.04. This vector
represents a sample from the distribution of σ̄.
c. Using the quantile function, estimate the quantiles of the dis-
tribution of σ̄ corresponding to probabilities 0.01, 0.05, 0.10,
0.20, 0.50, 0.80, 0.90, 0.95, 0.99. Summarize the results.
d. Repeat Parts (a)–(c) using a t-distribution with six degrees of
freedom in place of the normal distribution. To do this, the
function rnorm is replaced by the function rt; see Example 6.16.
17. The purpose of this exercise is to use Monte Carlo simulation to
compare the properties of the weighted estimator of a mean return
and the estimator based on the sample mean for the case in which
the true mean returns and the true returns standard deviations
increase over time.
Consider 60 periods of returns on a given asset. Suppose
that, for t = 1, 2, . . . , 60, the mean return in period t is 0.01 +
(0.01/59)(t − 1) and the return standard deviation in period t is
0.02 + (0.02/59)(t − 1). Hence, the mean return is a linear function
of t, taking the value 0.01 in period 1 and the value 0.02 in
period 60; the return standard deviation is 0.02 in period 1 and 0.04
in period 60.
A matrix of return values corresponding to this scenario may be
simulated using the command
> ret_mat<-matrix(rnorm(60*10000, mean=0.01 + (0.01/59)*
+ (0:59), sd=0.02 + (0.02/59)*(0:59)), 10000, 60,
+ byrow=T)

Note that the rnorm function recycles the values specified in the
arguments mean and sd. Because of the way in which they are recycled, we
must populate the matrix of returns by row, instead of by column,
which is the default; this is achieved by the argument byrow=T.
a. Construct a simulated return matrix using the command given
previously.
b. Consider a decay parameter of γ = 0.93. By constructing a vector
of the form γ^(60−t) for t = 1, 2, . . . , 60, using a loop, calculate
the 10,000 EWMA estimates of the mean return.
c. Using the return matrix from Part (a) together with the apply
function, calculate a vector of 10,000 sample mean returns.
d. Calculate the sample mean and standard deviation of the
EWMA estimates calculated in Part (b) and the sample means
calculated in Part (c).
e. Consider the estimates obtained by the two methods as estimates
of the mean return in period 60, 0.02. Using the
results from the Monte Carlo simulation, estimate the bias and
RMSE of each estimator. Based on these results, which estimator
is preferable?
f. Estimate the RMSE for EWMA estimators based on different
values of the decay parameter, γ = 0.90 and γ = 0.95. Which
EWMA estimator has the smallest estimated RMSE?
18. The goal of this exercise is to use Monte Carlo simulation to study
the behavior of estimates of the weights for tangency and minimum-variance
portfolios. Consider a three-dimensional vector of asset
returns with mean excess return vector of the form $c\,(1, 1, 1)^T$ for some c
and covariance matrix of the form

$$\sigma^2 \begin{pmatrix} 1 & \rho & \rho \\ \rho & 1 & \rho \\ \rho & \rho & 1 \end{pmatrix}$$

for some σ² > 0 and some 0 < ρ < 1.
Specifically, consider the case in which c = 0.02, σ = 0.05, and
ρ = 0.2.


a. Find the weights of the tangency and minimum-variance portfolios
for the mean vector and covariance matrix described
earlier.
b. Use the approach described in Example 6.17 to repeatedly simu-
late 60 observations from this return distribution; for each set of
simulated excess returns, estimate the mean vector and covari-
ance matrix of the returns. Use these estimates to calculate
the weights of the tangency and minimum-variance portfolios.
Repeat the procedure 10,000 times so that the result is a
10,000 × 3 matrix of tangency portfolio weights and a 10,000 × 3
matrix of minimum-variance portfolio weights.
c. Using apply, calculate the means and standard deviations of the
three columns of the matrix of tangency weights; repeat the cal-
culation for the matrix of minimum-variance weights. Does the
estimator of tangency weights appear to be unbiased? Does the
estimator of the minimum-variance weights appear to be unbi-
ased? Does either of the estimators appear to be more accurate
than the other?
d. Plot the first two columns of the matrix of tangency weights
(note that because the weights sum to 1, one of the three columns
is redundant). Use the limits −1/2 and 7/6 for both the x and
y axes (use the xlim and ylim arguments to the plot function).
How would you expect the plot to appear if the weights are
estimated extremely accurately? Based on the plot, what can
you conclude about the sampling distribution of the estimated
tangency portfolio weights?
e. Repeat the previous question using the estimated minimum-variance
weights. Are there any important differences between
the plot of estimated tangency weights and the plot of estimated
minimum-variance weights?

7 Capital Asset Pricing Model

7.1 Introduction
In Chapters 4 and 5, we considered portfolio theory in which information
about the means, variances, and correlations of asset returns is used to con-
struct portfolios that are optimal according to certain criteria. In this chapter,
we turn this around—we analyze an optimal portfolio and see what this opti-
mality implies about the distribution of the asset returns. This analysis leads
to important properties of the relationship between the returns on a given
asset and the returns on the optimal portfolio.
According to the theory described in Sections 4.6 and 5.7, an investor
choosing a portfolio of risky assets to combine with the risk-free asset should
always choose the tangency portfolio. This is true for any level of risk preferred
by the investor: to achieve low levels of risk, more of the investment
is placed in the risk-free asset, while investors able to tolerate higher levels of
risk place more of their investment in the tangency portfolio, even borrowing
to do so, if desired. Because, according to this theory, all investors use the
same combination of risky assets, that is, the tangency portfolio, the market
as a whole gives us useful information about the tangency portfolio.
The market portfolio is a portfolio of assets in which the weight placed
on asset j is equal to the investment in asset j, as a proportion of the total
investment in the market. The market portfolio may be viewed as a type of
“consensus portfolio” for all investors.
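As a small illustration of this definition, the weights of the market portfolio are simply the market values of the assets divided by the total market value. The values below are hypothetical and serve only to show the calculation; the name mkt_value is illustrative.

```r
# Hypothetical total investment in each of four assets (illustrative values only)
mkt_value <- c(A = 500, B = 300, C = 150, D = 50)

# Market-portfolio weight on asset j: investment in j as a proportion of the total
w_mkt <- mkt_value / sum(mkt_value)
print(w_mkt)
```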
According to portfolio theory, all investors should use the tangency port-
folio so that this consensus portfolio should be identical to the tangency
portfolio. Therefore, we do not need to calculate the weights of the tangency
portfolio; we can observe them by calculating the investment in each asset in
the market. Furthermore, the equivalence of the market and tangency port-
folios has important implications for the relationship between the returns on
an asset and the returns on the market portfolio, which is summarized in
the capital asset pricing model (CAPM ). This model is a starting point for a
number of models describing the behavior of asset returns.
Of course, such an analysis must be based on a number of assumptions.
Specifically, we assume the following:
• Asset prices are in equilibrium, with supply equaling demand for
each asset.
• Investors make their investment decisions based on the expected
returns and the standard deviation of the returns on their investments.
This information is freely available to all investors.
• All investors hold a combination of the portfolio that maximizes
the Sharpe ratio, that is, the tangency portfolio, and the risk-free
asset. The proportion of the investment in the risk-free asset varies
by investor.

7.2 Security Market Line

Consider a portfolio consisting of the tangency portfolio together with the
risk-free asset. The following result shows that the Sharpe ratio of such a
portfolio is equal to the Sharpe ratio of the tangency portfolio.
Lemma 7.1. Let μT and σT denote the mean and standard deviation, respectively,
of the return on the tangency portfolio. Let Rp denote the return on a
portfolio consisting of the tangency portfolio together with the risk-free asset,
with positive weight placed on the tangency portfolio, and let μp and σp denote
the mean and standard deviation, respectively, of Rp. Then

$$\frac{\mu_p - \mu_f}{\sigma_p} = \frac{\mu_T - \mu_f}{\sigma_T} \tag{7.1}$$

where μf is the return on the risk-free asset.
Proof. For the portfolio with return Rp, let w denote the proportion of the
portfolio invested in the tangency portfolio. Then

$$\mu_p = w\mu_T + (1 - w)\mu_f$$

and

$$\sigma_p = w\sigma_T;$$

recall that, by assumption, w > 0. Hence μp − μf = w(μT − μf), and dividing
by σp = wσT gives the result.

The result also follows from noting that any such portfolio has (σp, μp)
falling on the line connecting (0, μf) and (σT, μT).
Note that the condition that the portfolio in Lemma 7.1 places positive
weight on the tangency portfolio is the condition that the investor does not
take a short position in the tangency portfolio in order to buy the risk-free
asset.
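Lemma 7.1 and the role of the positive-weight condition can be checked numerically. The sketch below uses hypothetical values for μT, σT, and μf (they are not taken from any data set) and treats the risk-free return as a constant, so that the portfolio standard deviation is |w|σT.

```r
# Hypothetical values for the tangency portfolio and the risk-free asset
mu_T <- 0.012
sigma_T <- 0.045
mu_f <- 0.002

w <- c(0.25, 0.5, 1, 1.5)          # positive weights on the tangency portfolio
mu_p <- w * mu_T + (1 - w) * mu_f  # mean return of the combined portfolio
sigma_p <- w * sigma_T             # the risk-free asset contributes no variance

sharpe_p <- (mu_p - mu_f) / sigma_p
sharpe_T <- (mu_T - mu_f) / sigma_T
print(sharpe_p)  # the same value for every positive w
print(sharpe_T)

# With a short position in the tangency portfolio the sign is reversed
w_neg <- -0.5
sharpe_neg <- (w_neg * mu_T + (1 - w_neg) * mu_f - mu_f) / (abs(w_neg) * sigma_T)
print(sharpe_neg)
```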
According to the argument given in the introduction to this chapter, the
return on the tangency portfolio may be viewed as the return on the market
portfolio. Therefore, we may write the result (7.1) as

$$\frac{\mu_p - \mu_f}{\sigma_p} = \frac{\mu_m - \mu_f}{\sigma_m} \tag{7.2}$$

where μm, σm are the mean and standard deviation, respectively, of the return
on the market portfolio. This equation may also be written as

$$\mu_p = \mu_f + \frac{\mu_m - \mu_f}{\sigma_m}\,\sigma_p; \tag{7.3}$$

this form emphasizes the relationship between the expected return on the
portfolio and the portfolio risk. The relationship given in (7.3) is known as
the capital market line.
In this section, we show that a similar relationship holds for the return on
any asset, such as a single stock or a portfolio. Based on the assumptions dis-
cussed in Section 7.1, the market portfolio is equivalent to the tangency port-
folio, and hence, it maximizes the Sharpe ratio among all portfolios. Therefore,
modifying the weight given to asset i in the market portfolio must decrease
the Sharpe ratio. This fact may be used to derive a relationship between the
expected return on asset i and the expected return on the market portfolio.

Proposition 7.1. Let Ri denote the return on a given asset, let μi = E(Ri )
and let σ2i = Var(Ri ). Let Rm denote the return on the market portfolio, let
μm = E(Rm ), let σ2m = Var(Rm ), and let ρi denote the correlation of Ri and
Rm . Then
$$\mu_i - \mu_f = \rho_i \frac{\sigma_i}{\sigma_m} (\mu_m - \mu_f) \tag{7.4}$$
where μf is the return on the risk-free asset.

Proof. Consider a new portfolio, formed by combining asset i with the market
portfolio. Let wi denote the weight given to asset i so that the new portfolio
has return Rp = wi Ri + (1 − wi )Rm . It follows that

$$\mu_p(w_i) \equiv E(R_p) = w_i \mu_i + (1 - w_i)\mu_m$$

and

$$\sigma_p^2(w_i) \equiv \operatorname{Var}(R_p) = w_i^2 \sigma_i^2 + (1 - w_i)^2 \sigma_m^2 + 2 w_i (1 - w_i) \rho_i \sigma_i \sigma_m .$$

Viewing this as a two-asset portfolio, we know that the tangency portfolio
occurs at wi = 0, because Rm is the return on the tangency portfolio. That is,
$$\frac{\mu_p(w_i) - \mu_f}{\sigma_p(w_i)},$$
the Sharpe ratio of the portfolio with return Rp , is maximized at wi = 0. Based
on the analysis in Section 4.6, we know that μp (wi ) and σp (wi ) must satisfy
the tangency condition at wi = 0:
$$\frac{d\mu_p(w_i)/dw_i \,\big|_{w_i=0}}{d\sigma_p(w_i)/dw_i \,\big|_{w_i=0}} = \frac{\mu_p(0) - \mu_f}{\sigma_p(0)} = \frac{\mu_m - \mu_f}{\sigma_m}. \tag{7.5}$$

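Here
$$\frac{d\mu_p(w_i)}{dw_i} = \mu_i - \mu_m$$
and
$$\frac{d\sigma_p^2(w_i)}{dw_i} = 2 w_i \sigma_i^2 - 2(1 - w_i)\sigma_m^2 + 2(1 - 2w_i)\rho_i \sigma_i \sigma_m$$
so that
$$\left.\frac{d\sigma_p^2(w_i)}{dw_i}\right|_{w_i=0} = 2\rho_i \sigma_i \sigma_m - 2\sigma_m^2 .$$

Note that
$$\frac{d\sigma_p^2(w_i)}{dw_i} = 2\sigma_p(w_i)\,\frac{d\sigma_p(w_i)}{dw_i};$$
therefore,
$$\left.\frac{d\sigma_p(w_i)}{dw_i}\right|_{w_i=0} = \frac{\left. d\sigma_p^2(w_i)/dw_i \right|_{w_i=0}}{2\sigma_p(0)} = \frac{\rho_i \sigma_i \sigma_m - \sigma_m^2}{\sigma_p(0)} .$$
Because μp (0) = μm and σp (0) = σm ,

$$\left.\frac{d\sigma_p(w_i)}{dw_i}\right|_{w_i=0} = \rho_i \sigma_i - \sigma_m$$

and the tangency condition (7.5) may be written

$$\frac{\mu_i - \mu_m}{\rho_i \sigma_i - \sigma_m} = \frac{\mu_m - \mu_f}{\sigma_m} .$$
Rearranging this expression,

$$\mu_i - \mu_m = \left(\rho_i \frac{\sigma_i}{\sigma_m} - 1\right)(\mu_m - \mu_f)$$
or
$$\mu_i - \mu_f = \rho_i \frac{\sigma_i}{\sigma_m}(\mu_m - \mu_f) \tag{7.6}$$
as stated in (7.4) in the proposition.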

The result given in Proposition 7.1 is known as the capital asset pricing
model, often abbreviated as CAPM. It may also be written as
$$\frac{\mu_i - \mu_f}{\sigma_i} = \rho_i \frac{\mu_m - \mu_f}{\sigma_m} \tag{7.7}$$
so that the Sharpe ratio of a given asset is equal to the Sharpe ratio of the
market portfolio times the correlation of the asset’s return with the return on
the market portfolio.
That is, according to Proposition 7.1, the only way for an asset to
have a large Sharpe ratio is for its returns to be highly correlated with the
market returns. On the other hand, an asset with returns that are approximately
uncorrelated with the market returns necessarily has a small Sharpe
ratio.
Note that, because ρi ≤ 1, the relationship in (7.7) is consistent with the
assumption that the market portfolio has the largest possible Sharpe ratio.

Example 7.1 Suppose that the monthly return on the market portfolio has
an expected value μm = 0.025 and a standard deviation σm = 0.04, and sup-
pose that the risk-free rate of return is μf = 0.005. Consider an asset with a
return with expected value and standard deviation of μi and σi , respectively,
and let ρi denote the correlation of the asset’s return with the market return.
Then
$$\frac{\mu_i - 0.005}{\sigma_i} = \rho_i \frac{0.025 - 0.005}{0.04} = 0.5\rho_i$$
so that the Sharpe ratio of the asset is ρi /2.

Define
$$\beta_i = \rho_i \frac{\sigma_i}{\sigma_m} = \frac{\operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)} . \tag{7.8}$$
Then the relationship given in Proposition 7.1 may also be written

μi − μf = βi (μm − μf );

this equation is known as the security market line (SML). It shows that the
expected excess return on an asset is proportional to its value of βi . Thus, the
parameter βi describes an important property of an asset and analysts often
refer to the “beta” of an asset; the interpretation of beta will be discussed in
detail in the following section.
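
As a small illustration of (7.8) and the SML, the following Python sketch computes beta in both of its equivalent forms and the expected return implied by the SML. The function names are hypothetical; the market values are those of Example 7.1, while the asset's correlation and standard deviation are assumed purely for illustration.

```python
def beta_from_correlation(rho_i, sigma_i, sigma_m):
    """Beta as in (7.8): beta_i = rho_i * sigma_i / sigma_m."""
    return rho_i * sigma_i / sigma_m


def beta_from_covariance(cov_im, var_m):
    """Equivalent form of (7.8): beta_i = Cov(R_i, R_m) / Var(R_m)."""
    return cov_im / var_m


def sml_expected_return(beta_i, mu_m, mu_f):
    """Expected return implied by the SML: mu_f + beta_i * (mu_m - mu_f)."""
    return mu_f + beta_i * (mu_m - mu_f)


# Market values from Example 7.1; the asset values below are assumptions.
mu_m, sigma_m, mu_f = 0.025, 0.04, 0.005
rho_i, sigma_i = 0.6, 0.05

beta_i = beta_from_correlation(rho_i, sigma_i, sigma_m)
cov_im = rho_i * sigma_i * sigma_m          # Cov(R_i, R_m) = rho_i * sigma_i * sigma_m
mu_i = sml_expected_return(beta_i, mu_m, mu_f)

print(f"beta_i = {beta_i:.4f}, SML expected return = {mu_i:.4f}")
```

Both forms of beta agree because Cov(Ri , Rm ) = ρi σi σm .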

Relationship to Linear Regression Analysis

The parameter βi = ρi (σi /σm ) is closely related to the slope parameter in a
linear regression analysis. Consider the problem of finding constants a, b to
minimize
$$E\left[(R_i - R_f - a - b(R_m - R_f))^2\right] \tag{7.9}$$
where Rf denotes the return on the risk-free asset; recall that, as discussed in
Section 4.5, we will use Rf when referring to returns, as in the aforementioned
expression, and will use μf when referring to properties of the distribution of
returns, as in the SML, for example.
This criterion can be described as the mean squared error (MSE) of
a + b(Rm − Rf ) as an approximation to the excess return Ri − Rf ; hence, the
goal is to ﬁnd a and b so that a + b(Rm − Rf ) best approximates Ri − Rf
in a certain sense. Note that this may be viewed as a “population-level”
least-squares problem.

Recall that, for any random variable X, E(X²) = E(X)² + Var(X) and,
for any constant c, Var(c + X) = Var(X). It follows that
$$E\left[(R_i - R_f - a - b(R_m - R_f))^2\right] = \left(E\left[R_i - R_f - a - b(R_m - R_f)\right]\right)^2 + \operatorname{Var}(R_i - bR_m). \tag{7.10}$$

Let â, b̂ denote the values of a, b, respectively, that minimize (7.9), or
equivalently, the expression in (7.10). Then, given b̂, â minimizes
(E[Ri − Rf − a − b̂(Rm − Rf )])² with respect to a. It follows that

$$\hat{a} = E(R_i) - \mu_f - \hat{b}\,E(R_m - R_f) = \mu_i - \mu_f - \hat{b}(\mu_m - \mu_f)$$

so that
$$\left(E\left[R_i - R_f - \hat{a} - \hat{b}(R_m - R_f)\right]\right)^2 = 0.$$

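Therefore, b̂ is the value of b that minimizes Var(Ri − bRm ). Note that

$$\operatorname{Var}(R_i - bR_m) = \sigma_i^2 + b^2 \sigma_m^2 - 2b\operatorname{Cov}(R_i, R_m)$$

is a quadratic function of b, with a positive coefficient for b². It follows that b̂
solves
$$\left.\frac{d\operatorname{Var}(R_i - bR_m)}{db}\right|_{b=\hat{b}} = 0$$
so that
$$\hat{b} = \frac{\operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)} = \beta_i ,$$
as defined in (7.8).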
Because b̂ = βi ,
â = μi − μf − βi (μm − μf ).
Therefore, according to Proposition 7.1, â = 0. It follows that

E[(Ri − Rf − βi (Rm − Rf ))2 ] ≤ E[(Ri − Rf − a − b(Rm − Rf ))2 ] (7.11)

for any a and b. That is, the linear function of Rm − Rf that best approximates
Ri − Rf in the sense of MSE is βi (Rm − Rf ).
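
The population-level argument above has a direct sample analogue, sketched below in plain Python. The data are simulated under assumed parameter values (a hypothetical beta of 0.75, a constant risk-free return, and normally distributed returns); none of these choices come from the text. The sketch computes the sample version of Cov(Ri , Rm )/Var(Rm ), the corresponding intercept, and checks that perturbing either value does not reduce the sample MSE, in the spirit of (7.11).

```python
import random

rng = random.Random(0)           # fixed seed so the sketch is reproducible
n = 20000
r_f = 0.005                      # assumed constant risk-free return
beta_true = 0.75                 # hypothetical beta used to simulate the data

# Simulated market excess returns x and asset excess returns y,
# generated so that y = beta_true * x + zero-mean noise.
x = [rng.gauss(0.025, 0.04) - r_f for _ in range(n)]
y = [beta_true * xi + rng.gauss(0.0, 0.03) for xi in x]


def mean(values):
    return sum(values) / len(values)


m_x, m_y = mean(x), mean(y)
cov_xy = sum((xi - m_x) * (yi - m_y) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - m_x) ** 2 for xi in x) / (n - 1)

b_hat = cov_xy / var_x           # sample analogue of Cov(R_i, R_m) / Var(R_m)
a_hat = m_y - b_hat * m_x        # sample analogue of a-hat; near 0 under the CAPM


def mse(a, b):
    """Sample mean squared error of a + b*x as an approximation to y."""
    return mean([(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)])


print(f"b_hat = {b_hat:.4f}, a_hat = {a_hat:.5f}, MSE = {mse(a_hat, b_hat):.6f}")
```

Because the simulated data satisfy the SML by construction, the fitted intercept is close to zero; with real return data, a nonzero intercept is exactly what the CAPM rules out in expectation.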

7.3 Implications of the CAPM

The relationship given by the SML is the most obvious conclusion of the
CAPM: the expected excess return on an asset is equal to the expected excess
return on the market portfolio times the asset’s value of beta; alternatively,
using (7.7), the Sharpe ratio of a given asset is equal to the Sharpe ratio of
the market portfolio times the correlation of the asset’s return with the market
return. However, there are a number of different implications of this result and,
in this section, we consider several of these.
The CAPM, as given in Proposition 7.1, describes a relationship between
the expected return on an asset and the expected return on the market
portfolio in terms of the standard deviations of the returns and their correlation.
However, that result also implies a relationship for the returns themselves.
Corollary 7.1. Let Ri denote the return on an asset, let Rm denote the
return on the market portfolio, let Rf denote the return on the risk-free asset,
and let βi = Cov(Ri , Rm )/Var(Rm ). Then we may write
Ri − Rf = βi (Rm − Rf ) + Zi
for a random variable Zi that has mean 0 and that is uncorrelated with Rm .
Proof. Note that Zi may be written
Zi = Ri − Rf − βi (Rm − Rf ), (7.12)
where βi is as given in the statement of the corollary.
Then, according to Proposition 7.1, Zi has expected value 0:
E(Zi ) = μi − μf − βi (μm − μf ) = 0.
Furthermore, using properties of covariance,
Cov(Zi , Rm ) = Cov(Ri − Rf − βi (Rm − Rf ), Rm )
= Cov(Ri , Rm ) − βi Var(Rm ) = 0
so that Zi is uncorrelated with the market return.

It is important to note that it is always true that

Ri − Rf = βi (Rm − Rf ) + Zi
where Zi and Rm are uncorrelated; the fact that βi = Cov(Ri , Rm )/Var(Rm )
implies that Cov(Zi , Rm ) = 0. The role of the CAPM is to show that
E(Zi ) = 0.
Example 7.2 As in Example 7.1, suppose that the return on the market
portfolio has an expected value μm = 0.025 and a standard deviation σm =
0.04 and suppose that the risk-free rate of return is μf = 0.005. Consider an
asset with a return with a mean and standard deviation of μi = 0.02 and
σi = 0.05, respectively, and suppose that βi = 0.75.
Let Ri and Rm denote the returns on the asset and the market portfolio,
respectively. Then according to (7.12)
Ri − 0.005 = 0.75(Rm − 0.005) + Zi.
That is, the excess return on asset i can be expressed as 0.75 times the excess
return on the market plus a random quantity that is uncorrelated with the
market return and that has expected value zero.

The random variable Zi defined by (7.12) is uncorrelated with the market
return; however, it may be correlated with other economic variables.
In Chapter 10, we will consider models that extend the relationship in (7.12)
by including additional variables.
Note that, according to the expression for Ri − Rf given in Corollary
7.1, along with the fact that Zi and Rm are uncorrelated, we may write
the variance of Ri in terms of two components,
$$\operatorname{Var}(R_i) = \operatorname{Var}(\beta_i R_m) + \operatorname{Var}(Z_i) = \beta_i^2 \operatorname{Var}(R_m) + \operatorname{Var}(Z_i).$$
The term βi² Var(Rm ) represents the component of Var(Ri ) that is “due
to the market” or that “may be explained by the market”; that is, because
the returns on an individual asset are, in general, correlated with the market
return, and the market return fluctuates, some of the variation of an asset’s
returns can be explained by this variation in the market return. The sec-
ond component, Var(Zi ), may be interpreted as the nonmarket component of
Var(Ri ).
Thus,
$$\beta_i^2 \operatorname{Var}(R_m) / \operatorname{Var}(R_i)$$
denotes the proportion of the variance of Ri that is “explained by the market”;
it is important to keep in mind that when we say “due to the market” or
“explained by the market,” we are referring specifically to an asset return’s
linear relationship with the market return. Because
$$\beta_i^2 \operatorname{Var}(R_m) / \operatorname{Var}(R_i) = \rho_i^2 ,$$
this measure of the proportion of the variance of Ri that is due to the market
is simply one of the standard interpretations of the correlation coefficient ρi .
Example 7.3 As in the previous example in this section, suppose that the
return on the market portfolio has μm = 0.025 and σm = 0.04 and that a
given asset has μi = 0.02, σi = 0.05, and βi = 0.75. Then the component of
the variance of asset i that is explained by the market is
$$\beta_i^2 \sigma_m^2 = (0.75)^2 (0.04)^2 = (0.03)^2 = 0.0009$$
and the proportion of the variance of asset i that is due to the market is
$$0.0009/(0.05)^2 = 0.36.$$
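
The decomposition used in this example can be written as a short Python sketch; the function name is hypothetical, and the inputs are the values of Example 7.3.

```python
def variance_decomposition(beta_i, sigma_i, sigma_m):
    """Split Var(R_i) into market and nonmarket components.

    Returns (market component, nonmarket component, proportion due to market),
    where the market component is beta_i^2 * Var(R_m).
    """
    total_var = sigma_i ** 2
    market_var = beta_i ** 2 * sigma_m ** 2
    nonmarket_var = total_var - market_var      # this is Var(Z_i)
    return market_var, nonmarket_var, market_var / total_var


market_var, nonmarket_var, proportion = variance_decomposition(
    beta_i=0.75, sigma_i=0.05, sigma_m=0.04
)
print(f"market: {market_var:.4f}, nonmarket: {nonmarket_var:.4f}, "
      f"proportion due to market: {proportion:.2f}")
```

The proportion equals the squared correlation, which here corresponds to ρi = 0.6.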

Therefore, the random variable Zi in the relationship

Ri − Rf = βi (Rm − Rf ) + Zi
increases the risk of asset i beyond the level attributable to the asset’s rela-
tionship with the market. As noted earlier, decomposing the variance in this
way does not require the CAPM. The role of the CAPM is to show that Zi
has zero mean; that is, the additional variance as a result of Zi does not lead
to an increase in the expected return of the asset. This idea is explored further
as follows.

Relationship between Risk and Reward

Consider two assets with returns Ri and Rj , respectively, such that μi ≡
E(Ri ) > μf and μj ≡ E(Rj ) > μf and let βi , βj denote the respective values
of beta for those assets. Because of the relationship between the expected
return on an asset and the expected return on a market portfolio, as given by
the CAPM, and the relationship between the variance of an asset’s return and
the variance of the market return, as discussed earlier, there is a relationship
between the risk of an asset and the corresponding “reward,” as measured by
the asset’s expected return.
Suppose that Var(Rj ) > Var(Ri ) so that asset j is “riskier” than asset i.
If the additional risk of asset j is attributable entirely to the diﬀerence in the
assets’ market components of variance, then

$$\operatorname{Var}(R_j) - \operatorname{Var}(R_i) = (\beta_j^2 - \beta_i^2)\operatorname{Var}(R_m);$$

hence, βj² > βi². Note that, because μi , μj , and μm are all
greater than μf , βi and βj must be positive, and it follows that βj > βi . Therefore,

μj − μf = βj (μm − μf ) > βi (μm − μf ) = μi − μf ;

so that μj > μi . That is, an investor who assumes additional market risk by
investing in asset j is rewarded with a higher expected return.
On the other hand, suppose that the additional risk of asset j is
attributable entirely to the difference in the assets’ nonmarket components
of variance. If the market components of the variances of assets i and j are
equal, then βi² σm² = βj² σm² so that βi = βj . It follows that μi = μj ; that is,
there is no “reward” for the additional nonmarket risk.
Now suppose that the difference between Var(Rj ) and Var(Ri ) is because of
differences in both the market and the nonmarket components of the variances.
Then the same argument holds, except that μj − μi depends only on the
difference between the market components of variance.
Specifically,

$$\mu_j - \mu_i = (\beta_j - \beta_i)(\mu_m - \mu_f) = (\beta_j \sigma_m - \beta_i \sigma_m)\,\frac{\mu_m - \mu_f}{\sigma_m} .$$
Note that βi σm and βj σm are the square roots of the market components of
variance for assets i and j, respectively. We will refer to βi σm as the market
component of risk for the asset; this market component may also be written
as ρi σi .
Thus, the difference (μj − μi ) is proportional to the difference in the mar-
ket components of risk for the two assets. This consequence of the CAPM is
often summarized by saying that there is a reward for assuming risk but only
for the market component of risk; there is no benefit in investing in an asset
that has a large nonmarket component of risk.

Example 7.4 Suppose that the return on the market portfolio has μm =
0.025 and σm = 0.04; let μf = 0.005. Consider an asset with a return that has
mean μi , standard deviation σi , and correlation with the market return of ρi .
Then
$$\beta_i = \rho_i \frac{\sigma_i}{\sigma_m} = 25 \rho_i \sigma_i$$
and, hence,
$$\mu_i = \mu_f + \beta_i(\mu_m - \mu_f) = 0.005 + 25\rho_i \sigma_i (0.025 - 0.005) = 0.005 + \tfrac{1}{2}\rho_i \sigma_i .$$
Assume that ρi > 0. Let γi² denote the component of the variance of the
return on asset i that is due to the market, so that γi² = ρi² σi². Then
$$\mu_i = 0.005 + \tfrac{1}{2}\gamma_i .$$
That is, the expected return on the asset is a linear function of its market
component of risk, γi .

Clearly, this type of relationship holds in general.

Corollary 7.2. Let Ri denote the return on an asset and let μi and σi denote
the mean and standard deviation, respectively, of Ri . Assume that μi > μf ,
where μf denotes the expected return on the risk-free asset. Let Rm denote the
return on the market portfolio, let μm and σm denote the mean and standard
deviation, respectively, of Rm , and let ρi denote the correlation of Ri and Rm .
Then
$$\mu_i - \mu_f = (\rho_i \sigma_i)\,\frac{\mu_m - \mu_f}{\sigma_m} . \tag{7.13}$$
Note that ρi σi is the market component of risk for the asset so that, accord-
ing to the corollary, the excess return on an asset is proportional to its market
component of risk.

7.4 Applying the CAPM to a Portfolio

Suppose that there are N assets in the market, with returns R1 , R2 , . . . , RN ,
and suppose that the SML holds for all assets:
μi − μf = βi (μm − μf ) (7.14)
where μi = E(Ri ), βi = Cov(Ri , Rm )/Var(Rm ), μm = E(Rm ), Rm is the
return on the market portfolio, and μf is the risk-free rate of return.
Consider a portfolio based on weights w1 , w2 , . . . , wN and let

$$R_p = \sum_{i=1}^{N} w_i R_i$$

denote its return. Then βp , the value of beta for the portfolio, may be written
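$$\beta_p = \frac{\operatorname{Cov}(R_p, R_m)}{\operatorname{Var}(R_m)} = \frac{\operatorname{Cov}\left(\sum_{i=1}^{N} w_i R_i, R_m\right)}{\operatorname{Var}(R_m)} = \frac{\sum_{i=1}^{N} w_i \operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)} = \frac{\sum_{i=1}^{N} w_i \beta_i \operatorname{Var}(R_m)}{\operatorname{Var}(R_m)} = \sum_{i=1}^{N} w_i \beta_i .$$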

Because

$$\mu_p = E(R_p) = \sum_{i=1}^{N} w_i \mu_i ,$$

it follows from (7.14) that

$$\mu_p - \mu_f = \sum_{i=1}^{N} w_i (\mu_i - \mu_f) = \sum_{i=1}^{N} w_i \beta_i (\mu_m - \mu_f) = \beta_p (\mu_m - \mu_f).$$

That is, the SML holds for the portfolio as well.

Therefore, when we say that the CAPM holds for a given set of assets it
follows that it holds for all portfolios constructed from those assets as well.
Example 7.5 Consider four assets, with returns R1 , R2 , R3 , and R4 , respec-
tively, and let Rm denote the return on the market portfolio. Suppose that
standard deviations of the returns on the four assets are 0.02, 0.05, 0.01, and
0.04, respectively, and that the standard deviation of Rm is 0.01. Let ρi denote
the correlation of Ri and Rm , for i = 1, 2, 3, and 4 and suppose that ρ1 = 0.6,
ρ2 = 0.1, ρ3 = 0.8, and ρ4 = 0.2. Then the values of beta for the four assets
are given by
$$\beta_1 = (0.6)\frac{0.02}{0.01} = 1.2, \quad \beta_2 = (0.1)\frac{0.05}{0.01} = 0.5, \quad \beta_3 = (0.8)\frac{0.01}{0.01} = 0.8,$$
and
$$\beta_4 = (0.2)\frac{0.04}{0.01} = 0.8.$$
Let Rp denote the return on an equally-weighted portfolio of the four
assets; then βp , the value of beta for the portfolio is given by

βp = 0.25β1 + 0.25β2 + 0.25β3 + 0.25β4 = 0.825.

Alternatively, the value of βp may be obtained from the properties of Rp .

Note that

$$\operatorname{Cov}(R_i, R_m) = \beta_i \sigma_m^2 = 0.0001\,\beta_i , \quad i = 1, 2, 3, \text{ and } 4$$

where σm is the standard deviation of Rm . Hence,

Cov(R1 , Rm ) = 0.00012, Cov(R2 , Rm ) = 0.00005, Cov(R3 , Rm ) = 0.00008,
and
Cov(R4 , Rm ) = 0.00008.
Using properties of covariance, it follows that
$$\operatorname{Cov}(R_p, R_m) = (0.25)(0.00012) + (0.25)(0.00005) + (0.25)(0.00008) + (0.25)(0.00008) = 0.0000825$$
and, hence, that
$$\beta_p = \frac{\operatorname{Cov}(R_p, R_m)}{\sigma_m^2} = \frac{0.0000825}{0.0001} = 0.825,$$
matching the result obtained previously.
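
Both routes to the portfolio beta can be checked with a short Python sketch using the inputs of Example 7.5. The variable names are illustrative only; the covariances are computed from Cov(Ri , Rm ) = ρi σi σm .

```python
# Inputs from Example 7.5.
sigmas = [0.02, 0.05, 0.01, 0.04]      # standard deviations of the four assets
rhos = [0.6, 0.1, 0.8, 0.2]            # correlations with the market return
sigma_m = 0.01                         # standard deviation of the market return
weights = [0.25, 0.25, 0.25, 0.25]     # equally weighted portfolio

# Route 1: weighted average of the individual betas.
betas = [rho * sigma / sigma_m for rho, sigma in zip(rhos, sigmas)]
beta_p_weighted = sum(w * b for w, b in zip(weights, betas))

# Route 2: covariance of the portfolio return with the market return,
# divided by the variance of the market return.
covs = [rho * sigma * sigma_m for rho, sigma in zip(rhos, sigmas)]
cov_pm = sum(w * c for w, c in zip(weights, covs))
beta_p_cov = cov_pm / sigma_m ** 2

print(f"betas = {[round(b, 4) for b in betas]}")
print(f"beta_p (weighted) = {beta_p_weighted:.4f}, beta_p (covariance) = {beta_p_cov:.4f}")
```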

7.5 Mispriced Assets

For a given asset with return Ri , let
αi = μi − μf − βi (μm − μf ) (7.15)
where βi = Cov(Ri , Rm )/σ2m , Rm is the return on the market portfolio, μm =
E(Rm ), σ2m = Var(Rm ), μi = E(Ri ), and μf is the return on the risk-free
asset. According to the CAPM,
αi = 0.
However, suppose that αi > 0; that is, suppose that for a given asset the
conclusion of Proposition 7.1 does not hold. As in Section 7.2, let Rp denote
the return on a portfolio consisting of the market portfolio and asset i, with
return of the form Rp = wi Ri + (1 − wi )Rm , for some weight wi . Let
μp (wi ) = E(Rp ) = wi μi + (1 − wi )μm
and
σ2p (wi ) = Var(Rp ) = wi2 σ2i + (1 − wi )2 σ2m + 2wi (1 − wi )Cov(Ri , Rm )
= wi2 σ2i + (1 − wi )2 σ2m + 2wi (1 − wi )βi σ2m .
Define
$$f(w_i) = \frac{\mu_p(w_i) - \mu_f}{\sigma_p(w_i)}$$
to be the Sharpe ratio of this portfolio, as a function of wi . Then, using the
results in Section 7.2,
$$f'(w_i) = \frac{d\mu_p(w_i)/dw_i}{\sigma_p(w_i)} - \frac{\mu_p(w_i) - \mu_f}{\sigma_p(w_i)}\,\frac{d\sigma_p(w_i)/dw_i}{\sigma_p(w_i)}$$
and
$$f'(0) = \frac{\mu_i - \mu_m}{\sigma_m} - \frac{\mu_m - \mu_f}{\sigma_m}\,\frac{d\sigma_p(w_i)/dw_i \,\big|_{w_i=0}}{\sigma_p(0)} .$$
We have seen that
$$d\sigma_p(w_i)/dw_i \,\big|_{w_i=0} = \frac{\rho_i \sigma_i \sigma_m - \sigma_m^2}{\sigma_m}$$
so that, using the fact that ρi = βi σm /σi , we may write
$$f'(0) = \frac{\mu_i - \mu_m}{\sigma_m} - \frac{\mu_m - \mu_f}{\sigma_m}(\beta_i - 1) = \frac{\mu_i - \mu_f - \beta_i(\mu_m - \mu_f)}{\sigma_m} = \frac{\alpha_i}{\sigma_m} .$$
Therefore, if αi > 0, then f′(0) > 0 so that adding a small investment in asset
i to the market portfolio increases the Sharpe ratio. Stated another way, the
market portfolio does not contain enough of asset i to maximize the Sharpe
ratio.
Let Qi denote the number of shares of asset i in the market and let Pi
denote the price of one share of asset i. Then the weight given to asset i in
the market portfolio is
$$\frac{Q_i P_i}{C}$$
where C denotes the total investment in the market, known as the
market capitalization.
When αi > 0, the weight given to asset i in the market portfolio is too
small; that is, the ratio Qi Pi /C is too small. Therefore, Pi , the price of asset
i, should be higher on average. It follows that, according to the CAPM, an
asset with αi > 0 is mispriced and its price is too low. Conversely, the price
of an asset with αi < 0 is too high; according to the CAPM, its price should
be lower on average.
Example 7.6 Suppose that Rm , the return on the market portfolio, has mean
0.025 and standard deviation 0.04 and that the risk-free rate is μf = 0.005.
Consider an asset with return Ri that has mean 0.02 and standard deviation
0.08, and suppose that the correlation of Ri and Rm is ρi = 0.30.
Then
βi = ρi σi /σm = (0.30)(0.08)/(0.04) = 0.60
and, hence, according to the CAPM,
μi = μf + βi (μm − μf ) = 0.005 + 0.60(0.025 − 0.005) = 0.017.
However, μi = 0.02, so that
αi = μi − μf − βi (μm − μf ) = 0.02 − 0.017 = 0.003.
Therefore, the price of asset i is too low.
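
The link between αi and the Sharpe ratio can be checked numerically. The Python sketch below uses the values of Example 7.6 to compute αi and then approximates the derivative at wi = 0 of the Sharpe ratio of the portfolio with return wi Ri + (1 − wi )Rm by a central difference; the step size h and the function name are choices made only for this illustration. The derivative should agree with αi /σm .

```python
import math

# Values from Example 7.6.
mu_m, sigma_m, mu_f = 0.025, 0.04, 0.005
mu_i, sigma_i, rho_i = 0.02, 0.08, 0.30

beta_i = rho_i * sigma_i / sigma_m
alpha_i = mu_i - mu_f - beta_i * (mu_m - mu_f)


def sharpe(w):
    """Sharpe ratio of the portfolio placing weight w on asset i and 1 - w on the market."""
    mu_p = w * mu_i + (1 - w) * mu_m
    var_p = (w ** 2 * sigma_i ** 2 + (1 - w) ** 2 * sigma_m ** 2
             + 2 * w * (1 - w) * rho_i * sigma_i * sigma_m)
    return (mu_p - mu_f) / math.sqrt(var_p)


h = 1e-6                                           # finite-difference step (assumed)
slope_at_zero = (sharpe(h) - sharpe(-h)) / (2 * h)

print(f"alpha_i = {alpha_i:.4f}")
print(f"numerical slope at 0 = {slope_at_zero:.6f}, alpha_i / sigma_m = {alpha_i / sigma_m:.6f}")
```

Since αi > 0 here, the slope is positive, so a small positive weight on asset i raises the Sharpe ratio above that of the market portfolio.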

The CAPM given in Proposition 7.1 follows from the assumptions presented
in Section 7.1 as follows: Asset prices are in equilibrium, investment
decisions are based on the means and standard deviations of the returns, and
all investors hold a combination of the tangency portfolio and the risk-free
asset. Therefore, if the conclusion of Proposition 7.1 does not hold, then one
or more of the assumptions must be incorrect.
For instance, it may be that market prices are not in equilibrium. This
suggests that if αi > 0, then the price of asset i needs to increase in order
to reach equilibrium, at which point αi will be 0. This leads to an expected
return for asset i that is larger than the expected return given by the CAPM.
The case of αi < 0 is similar except that we expect the return on asset i to be
lower than what is implied by the CAPM.
Alternatively, it may be that prices are in equilibrium but that investors
hold inefficient portfolios so that the market portfolio is inefficient in the sense
that it does not maximize the Sharpe ratio. Thus, if αi > 0, the demand for
asset i is lower than it would be if the market portfolio were efficient, leading
to a price for asset i that is too low.

The Role of the Efficiency of the Market Portfolio

The analysis in this section shows that, if the CAPM does not hold for asset
i, that is, if αi as deﬁned previously is not 0, then the market portfolio can
be improved by including more or less of asset i. On the other hand, if the
CAPM does hold for asset i, then changing the weight of asset i in the market
portfolio cannot increase its Sharpe ratio. This suggests the following converse
to Proposition 7.1: If the SML holds for all assets in the market, then the
market portfolio must have the maximum Sharpe ratio.

Proposition 7.2. Consider a set of assets with returns R1 , R2 , . . . , RN and
let Rm denote the return on the market portfolio, which is not necessarily
equivalent to the tangency portfolio. Suppose that the SML (7.4) holds for
each asset:
$$\mu_i - \mu_f = \beta_i(\mu_m - \mu_f) \tag{7.16}$$
where μi = E(Ri ), βi = Cov(Ri , Rm )/Var(Rm ), and μm = E(Rm ).
Let Rp denote the return on a portfolio based on weights w1 , w2 , . . . , wN
so that

$$R_p = \sum_{i=1}^{N} w_i R_i$$

and let μp and σp denote the mean and standard deviation, respectively, of
Rp . Then
$$\frac{\mu_p - \mu_f}{\sigma_p} \le \frac{\mu_m - \mu_f}{\sigma_m} \tag{7.17}$$
with equality if and only if Rp and Rm have correlation one. That is, the
market portfolio has the maximum possible Sharpe ratio.

Proof. Using the form of the SML given by (7.7), together with the properties
of portfolios discussed in Section 7.4, it follows that
$$\frac{\mu_p - \mu_f}{\sigma_p} = \rho_p \frac{\mu_m - \mu_f}{\sigma_m} \tag{7.18}$$

where ρp denotes the correlation of Rp and Rm .

The result now follows from the fact that ρp ≤ 1.
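
The step leading to (7.18) can be spelled out as follows; this is a sketch of the omitted algebra, and it assumes that μm > μf , which is needed for the inequality in (7.17) to run in the stated direction. By the result of Section 7.4, the SML holds for the portfolio, and its beta can be written in terms of the correlation ρp :

```latex
\begin{align*}
\mu_p - \mu_f &= \beta_p (\mu_m - \mu_f),
  \qquad
  \beta_p = \frac{\operatorname{Cov}(R_p, R_m)}{\sigma_m^2}
          = \rho_p \frac{\sigma_p}{\sigma_m}, \\
\frac{\mu_p - \mu_f}{\sigma_p}
  &= \rho_p \, \frac{\mu_m - \mu_f}{\sigma_m}
  \;\le\; \frac{\mu_m - \mu_f}{\sigma_m}
  \qquad \text{because } \rho_p \le 1 \text{ and } \mu_m - \mu_f > 0 .
\end{align*}
```

Equality holds exactly when ρp = 1.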

The CAPM shows that if a given portfolio is efficient in the sense that it
maximizes the Sharpe ratio, then the SML holds for all assets with respect
to that efficient portfolio. Proposition 7.2 shows that if the SML holds for all
assets in the market with respect to a given market portfolio, then that market
portfolio must maximize the Sharpe ratio. Therefore, there is a sense in which
the CAPM, as stated in Proposition 7.1, is actually a statement about the
efficiency of the market portfolio.
The result in Proposition 7.2 may be stated in an alternative form, which
is given in the following corollary; the proof of Proposition 7.2 may be easily
adapted to prove this result.

Corollary 7.3. Let Rp∗ denote a given portfolio and for any arbitrary asset
with return R deﬁne

α(R) = E(R) − μf − β(R)(μ∗p − μf )

where μ∗p = E(Rp∗ ) and β(R) = Cov(R, Rp∗ )/Var(Rp∗ ). Consider the set of
assets for which α(R) = 0. Then the asset with return Rp∗ is in this set and
it has the maximum Sharpe ratio among all portfolios formed from assets in
this set.

7.6 The CAPM without a Risk-Free Asset

The CAPM discussed in this chapter is based on the assumption that all
the investors choose a combination of the risk-free asset and a portfolio of
risky assets. According to eﬃcient portfolio theory, this portfolio of risky
assets is the tangency portfolio for all investors. Thus, the market portfolio
is equivalent to the tangency portfolio so that the market portfolio has the
properties of the tangency portfolio. It is important to note that the optimality
of the tangency portfolio in this context is based on the fact that investors
combine their portfolio of risky assets with the risk-free asset.
In this section, we consider a version of the CAPM that holds without
relying on the existence of a risk-free asset. There are two important implications
of this change for the CAPM. The more obvious one is that we cannot
include the risk-free rate, μf , in the model. The second, less obvious, change
is that the tangency portfolio is no longer the unambiguous optimal portfolio
and, hence, we may no longer assume that the market portfolio is equivalent
to the tangency portfolio.
Instead, we assume that each investor holds a portfolio of risky assets
that lies on the efficient frontier, but these portfolios may vary by investor.
According to Proposition 5.2, portfolios constructed from assets lying on
the efficient frontier are also on the efficient frontier provided that the mean
return of the portfolio is greater than the mean return on the minimum vari-
ance portfolio. Hence, we may still assume that the market portfolio lies on
the efficient frontier. However, it is not necessarily equal to the tangency
portfolio.
Let Rm denote the return on the market portfolio. We assume that
if there is another portfolio, with return Rp , such that E(Rp ) = E(Rm ),
then Var(Rp ) ≥ Var(Rm ); alternatively, if Var(Rp ) = Var(Rm ), then E(Rp ) ≤
E(Rm ). Note that these assumptions state simply that the market portfolio
lies on the efficient frontier.
The proof of the CAPM given in Proposition 7.1 is based on the fact that
the market portfolio has the maximum Sharpe ratio among all assets and,
hence, modifying the weight given to any asset cannot increase the Sharpe
ratio. For the version of the CAPM considered in this section, we use a similar
argument based on the efficiency of the market portfolio.
Let Ri denote the return on asset i. Suppose that we can construct a
portfolio consisting of asset i together with the market portfolio that has
the same expected return as the market portfolio; then the variance of that
portfolio must be at least as large as that of the market portfolio. We may try
to use this fact to establish a relationship similar to that in the SML.
However, it is clear that such an approach cannot work—unless asset i
has the same expected return as the market portfolio, a portfolio including
both asset i and the market portfolio cannot have the same expected return as
the market portfolio. Hence, we need to include a third asset in the portfolio.
Because we would like the eventual result to focus on the relationship between
the return on asset i and the return on the market portfolio, we might consider
an asset with a return that is uncorrelated with the market return.
Let Rz denote the return on an asset satisfying Cov(Rz , Rm ) = 0 and
E(Rz ) ≠ E(Rm ). At the end of this section it will be shown that such a
portfolio exists. Note that Cov(Rz , Rm ) = 0 implies that the value of beta
corresponding to Rz is zero; therefore, the asset with return Rz is known as
the zero-beta portfolio.

Proposition 7.3. Let Rm denote the return on the market portfolio and let
Rz denote the return on the corresponding zero-beta portfolio; let μm = E(Rm )
and μz = E(Rz ). Consider an asset with return Ri ; let μi = E(Ri ) and let
βi = Cov(Ri , Rm )/Var(Rm ). Then

μi − μz = βi (μm − μz ). (7.19)

Proof. For a real number θ, consider the zero-investment portfolio with return

$$R_i + (\theta - 1)R_m - \theta R_z = \theta(R_i - R_z) + (1 - \theta)(R_i - R_m); \tag{7.20}$$

hence, this portfolio places weight 1 on asset i, weight θ − 1 on the market
portfolio, and weight −θ on the zero-beta portfolio. Note that the expected
value of (7.20) is
$$\theta(\mu_i - \mu_z) + (1 - \theta)(\mu_i - \mu_m).$$
Let
$$\theta_0 = \frac{\mu_m - \mu_i}{\mu_m - \mu_z}$$
and let
$$R_0 = \theta_0(R_i - R_z) + (1 - \theta_0)(R_i - R_m).$$
Then

$$\begin{aligned}
E(R_0) &= \frac{\mu_m - \mu_i}{\mu_m - \mu_z}(\mu_i - \mu_z) + \left(1 - \frac{\mu_m - \mu_i}{\mu_m - \mu_z}\right)(\mu_i - \mu_m) \\
&= \frac{(\mu_m - \mu_i)(\mu_i - \mu_z) + (\mu_i - \mu_z)(\mu_i - \mu_m)}{\mu_m - \mu_z} = 0.
\end{aligned}$$
Thus, R0 is the return on a zero-investment portfolio that has zero
expected return. Because the market portfolio is on the efficient frontier, it
now follows from Corollary 5.2 that Cov(R0 , Rm ) = 0. Note that

$$\begin{aligned}
\operatorname{Cov}(R_0, R_m) &= \operatorname{Cov}(R_i + (\theta_0 - 1)R_m - \theta_0 R_z, R_m) \\
&= \operatorname{Cov}(R_i, R_m) + (\theta_0 - 1)\operatorname{Var}(R_m) \\
&= (\beta_i - (1 - \theta_0))\operatorname{Var}(R_m).
\end{aligned} \tag{7.21}$$

Therefore,
$$\beta_i = 1 - \theta_0$$
and, using the expression for θ0 ,
$$\beta_i = 1 - \frac{\mu_m - \mu_i}{\mu_m - \mu_z} = \frac{\mu_i - \mu_z}{\mu_m - \mu_z},$$
proving the result.

That is, a form of the SML holds with μz replacing μf . The pricing model
given by (7.19) is known as the zero-beta CAPM.
Note that Proposition 7.3 requires only that the market portfolio is on
the eﬃcient frontier, which is weaker than the condition that the market
portfolio is the tangency portfolio required in Proposition 7.1. Hence, one
might consider the possibility of weakening the conditions of Proposition 7.1
to require only that the market portfolio is on the eﬃcient frontier, using the
method of proof used in Proposition 7.3 with the risk-free asset playing the role
of the zero-beta portfolio. However, in Proposition 7.3, it is important to keep
in mind that the efficiency of the market portfolio is with respect to all assets
under consideration; if a risk-free asset is available, then the market portfolio
must be efficient with respect to portfolios that include the risk-free asset.
Thus, such efficiency requires that the market portfolio is equivalent to the
tangency portfolio; that is, it is not possible to use the approach of Proposition
7.3 to weaken the conditions used to establish the SML in Proposition 7.1.

Existence of the Zero-Beta Portfolio

Proposition 7.3 is based on the existence of the zero-beta portfolio; thus, we
now show that such a zero-beta portfolio always exists, provided that the
market portfolio is not the same as the minimum-variance portfolio. It may
be shown that the market portfolio is equivalent to the minimum-variance
portfolio if and only if all investors hold the minimum-variance portfolio.
Lemma 7.2. Let Rm denote the return on the market portfolio, with variance
σm², and let Rmv denote the return on the minimum-variance portfolio, with
variance σmv². If σm² > σmv², then there exists a portfolio with return Rz such
that Cov(Rz , Rm ) = 0 and E(Rz ) ≠ E(Rm ).
The return Rz may be written
$$R_z = \frac{1}{\sigma_{mv}^2 - \sigma_m^2}\left(\sigma_{mv}^2 R_m - \sigma_m^2 R_{mv}\right).$$
Proof. Consider the portfolio with return $\phi R_m + (1 - \phi) R_{mv}$ for some real
number $\phi$. Recall that, according to Proposition 5.2, the covariance of $R_{mv}$
with the return on any other portfolio is equal to $\mathrm{Var}(R_{mv})$. Therefore,
$$\mathrm{Cov}(R_{mv}, R_m) = \mathrm{Var}(R_{mv}) = \sigma^2_{mv}.$$
It follows that, for any real number $\phi$,
$$\mathrm{Cov}(\phi R_m + (1 - \phi) R_{mv}, R_m) = \phi\,\mathrm{Var}(R_m) + (1 - \phi)\,\mathrm{Var}(R_{mv}) = \phi \sigma^2_m + (1 - \phi) \sigma^2_{mv}.$$
Let
$$\phi_z = \frac{\sigma^2_{mv}}{\sigma^2_{mv} - \sigma^2_m}$$
and define
$$R_z = \phi_z R_m + (1 - \phi_z) R_{mv}.$$
Then
$$\mathrm{Cov}(R_z, R_m) = \phi_z \sigma^2_m + (1 - \phi_z) \sigma^2_{mv} = 0.$$
Note that because $\sigma^2_{mv} < \sigma^2_m$, $\phi_z < 0$ and the efficiency of the market
portfolio implies that $E(R_{mv}) < E(R_m)$. It follows that
$$E(R_z) = E(R_m) + (1 - \phi_z)[E(R_{mv}) - E(R_m)] < E(R_m);$$
that is, $E(R_z) \neq E(R_m)$, as required.
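
The construction in the proof can be checked numerically. The following R sketch uses hypothetical values for the variances and means of the market and minimum-variance portfolio returns (none of these numbers come from the text); it computes the weight φz and confirms that the resulting covariance with the market is zero and that the mean is below the market mean.

```r
# Hypothetical inputs (assumptions for illustration only)
sigma2_m  <- 0.0050   # Var(R_m)
sigma2_mv <- 0.0030   # Var(R_mv); must be smaller than Var(R_m)
mu_m      <- 0.010    # E(R_m)
mu_mv     <- 0.006    # E(R_mv); efficiency of the market implies mu_mv < mu_m

# Weight on the market portfolio in the zero-beta portfolio
phi_z <- sigma2_mv / (sigma2_mv - sigma2_m)

# Cov(R_z, R_m) = phi_z Var(R_m) + (1 - phi_z) Var(R_mv),
# using Cov(R_mv, R_m) = Var(R_mv)
cov_zm <- phi_z * sigma2_m + (1 - phi_z) * sigma2_mv

# E(R_z) = phi_z E(R_m) + (1 - phi_z) E(R_mv)
mu_z <- phi_z * mu_m + (1 - phi_z) * mu_mv

c(phi_z = phi_z, cov_zm = cov_zm, mu_z = mu_z)
```

With these inputs the weight on the market portfolio is negative, so the zero-beta portfolio requires a short position in the market portfolio.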

7.7 Using the CAPM to Describe the Expected Returns on a Set of Assets
In Section 7.3, we considered several diﬀerent interpretations of the CAPM.
These interpretations are based on an analysis of the properties of a single asset
return and how that return relates to the return on the market portfolio. In
this section, another interpretation is given, based on analyzing the expected
returns of a set of assets.
Consider a set of K assets, with returns $R_1, R_2, \ldots, R_K$, and let
$$\mathbf{R}_K = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_K \end{pmatrix}$$
denote the corresponding return vector; $\mathbf{R}_K$ might include all the returns on
all stocks in a given market, the returns on a subset of those stocks, or the
returns on a set of portfolios.
Let $\mu_k = E(R_k)$, $k = 1, 2, \ldots, K$, and let
$$\boldsymbol{\mu}_K = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_K \end{pmatrix}$$
be the corresponding vector of expected returns. According to the classical
form of the CAPM, as given by Proposition 7.1,
$$\mu_k = \mu_f + \beta_k (\mu_m - \mu_f), \quad k = 1, 2, \ldots, K \tag{7.22}$$
where $\mu_m$ is the expected return of the market portfolio, $\mu_f$ is the risk-free
rate, and
$$\beta_k = \frac{\mathrm{Cov}(R_k, R_m)}{\mathrm{Var}(R_m)}, \quad k = 1, 2, \ldots, K.$$
Let
$$\boldsymbol{\beta}_K = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix}.$$
Then the set of K equations given by (7.22) may be written
$$\boldsymbol{\mu}_K = \mu_f \mathbf{1}_K + (\mu_m - \mu_f) \boldsymbol{\beta}_K. \tag{7.23}$$
That is, the vector of expected asset returns may be written as a linear
function of the vector of betas and the vector of all ones.

According to the relationship in (7.23), the differences in the values of μk for
the different assets may be described in terms of the differences in β1 , β2 , . . . , βK .
For instance, suppose we plot the points (β1 , μ1 ), (β2 , μ2 ), . . . , (βK , μK ). In
such a plot, the points will fall on a line with slope μm − μf and intercept μf ;
see Figure 7.1 for a hypothetical example.
Example 7.7 Consider the four assets described in Example 7.5. Recall that,
for these assets, the values of beta are given by
$$\beta_1 = 1.2, \quad \beta_2 = 0.5, \quad \beta_3 = 0.8, \quad \text{and} \quad \beta_4 = 0.8.$$
Thus, the beta vector for the assets is
$$\begin{pmatrix} 1.2 \\ 0.5 \\ 0.8 \\ 0.8 \end{pmatrix}.$$
Suppose that $\mu_f = 0.002$ and $\mu_m = 0.01$. Then the vector of expected
returns on the four assets may be written
$$0.002 \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} + (0.01 - 0.002) \begin{pmatrix} 1.2 \\ 0.5 \\ 0.8 \\ 0.8 \end{pmatrix}$$
so that the points $(\beta_k, \mu_k)$, $k = 1, 2, 3, 4$, fall on the line with slope 0.01 − 0.002 = 0.008 and
intercept 0.002, as described previously; alternatively, this relationship may be
described by stating that the vector of expected excess returns is proportional
to the vector of asset betas.
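
The calculation in this example can be reproduced in R. The sketch below uses the values of μf, μm, and the betas given in the example; the variable names are chosen here for illustration.

```r
mu_f <- 0.002                      # risk-free rate from the example
mu_m <- 0.01                       # expected market return from the example
beta <- c(1.2, 0.5, 0.8, 0.8)      # betas of the four assets

ones <- rep(1, length(beta))       # the vector of all ones

# Vector form of the SML: mu = mu_f * 1 + (mu_m - mu_f) * beta
mu <- mu_f * ones + (mu_m - mu_f) * beta
mu

# Expected excess returns are proportional to the betas:
# each ratio equals the slope mu_m - mu_f
(mu - mu_f) / beta
```

Assets 3 and 4 have the same beta and therefore the same expected return under the CAPM.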
FIGURE 7.1
Hypothetical Plot of μk versus βk (vertical axis: μk, with the intercept μf marked; horizontal axis: βk).

A relationship similar to (7.23) is implied by the zero-beta form of the CAPM:
$$\boldsymbol{\mu}_K = \mu_z \mathbf{1}_K + (\mu_m - \mu_z) \boldsymbol{\beta}_K. \tag{7.24}$$
In (7.23), it is generally assumed that μf is known, while in (7.24), μz is
generally considered to be an unknown parameter.

7.8 Suggestions for Further Reading

The CAPM is one of the central results of modern portfolio theory. The
classical CAPM, given here as Proposition 7.1, is attributed to Sharpe (1964),
Lintner (1965), and Mossin (1966). Roll (1977) and Ross (1977) show that
the CAPM follows directly from the assumption that the market portfolio is
efficient. The zero-beta form of the CAPM is attributed to Black (1972).
Francis and Kim (2013, Chapter 12) present a detailed discussion of the
CAPM and its derivation. Francis and Kim (2013, Chapter 13) discuss a
number of extensions of the CAPM, including the zero-beta CAPM. Campbell
et al. (1997, Chapter 5) describe the CAPM from a more statistical
perspective, including empirical tests of the CAPM. Fabozzi et al. (2006, Chapter 7)
discuss asset pricing models in general, including the CAPM as a special case.

7.9 Exercises
1. Consider an asset with expected return 0.04 and suppose that the
expected return on the market portfolio is 0.06. Assuming that the SML holds
for the asset and that the risk-free return is 0.004, find the value of
beta for the asset.
2. Use the relationship given by the CAPM, as stated in (7.7), along
with the assumption that the market portfolio is equivalent to the
tangency portfolio, to establish the result in Lemma 7.1.
3. Consider an asset with return $R$ and let $R_m$ denote the return on
the market portfolio. Let $\mu = E(R)$, $\mu_m = E(R_m)$, $\sigma^2 = \mathrm{Var}(R)$,
and $\sigma^2_m = \mathrm{Var}(R_m)$, and let $\mu_f$ denote the return on the risk-free
asset. Suppose that $\sigma = \sigma_m / 2$ and that
$$\frac{\mu - \mu_f}{\sigma} = \frac{1}{2} \, \frac{\mu_m - \mu_f}{\sigma_m}.$$
Assuming that the SML holds for this asset, find its value of β.
4. Consider an asset with return $R$ and let $R_m$ denote the return on
the market portfolio. Suppose that $R$ and $R_m$ are related by
$$R = 0.002 + 0.9 R_m + \epsilon$$
where $\epsilon$ is a random variable satisfying $E(\epsilon) = 0$ and
$\mathrm{Cov}(R_m, \epsilon) = 0$.
Assuming that the SML holds for this asset, find the value of
$\mu_f$, the return on the risk-free asset.
5. Consider two assets, with returns R1 and R2 , respectively, and let
Rm denote the return on the market portfolio. For j = 1, 2, let
μj = E(Rj ), let σ2j = Var(Rj ), and let ρj denote the correlation of
Rj and Rm . Let μf denote the expected return on the risk-free asset
and assume that μj > μf for j = 1, 2. Assume that the SML holds
for both assets.
For each of the sets of parameter values given as follows, state
that asset 1 has the greater mean return, that asset 2 has the greater
mean return, or that it is not possible to determine the greater mean
return based on the information given.
a. Suppose that ρ1 = 0.6, σ1 = 0.2, ρ2 = 0.5, and σ2 = 0.3.
b. For j = 1, 2, let SRj denote the Sharpe ratio of asset j. Suppose
that
SR1 = 0.9 and SR2 = 1.2.
c. For j = 1, 2, let $\gamma^2_j = \rho^2_j \sigma^2_j$; note that $\gamma^2_j$ is the market component
of variance for the return on asset j. Suppose that $\gamma^2_1 = 0.5$ and
$\gamma^2_2 = 0.8$.
6. Consider two assets with returns R1 and R2 , respectively, and let
Rm denote the return on the market portfolio. Suppose that the
SML holds for both assets, with β = 0.80 for asset 1 and β = 0.90
for asset 2. Does it follow that the correlation of R2 and Rm is
greater than the correlation of R1 and Rm ? Why or why not?
7. Let Rmv denote the return on the minimum-variance portfolio and
let Rm denote the return on the market portfolio. Suppose that the
correlation of Rmv and Rm is 0.4. Find the value of beta in the
SML for the minimum-variance portfolio.
8. Consider an asset with return Ri . Suppose that the variance of Ri is
0.04 and that the market component of the variance of Ri is 0.03. Let
μf denote the risk-free return and assume that μi ≡ E(Ri ) > μf .
a. Find the correlation of Ri and Rm , the return on the market
portfolio.
b. If the Sharpe ratio of the market portfolio is 0.12, ﬁnd μi − μf .
9. Consider a set of N assets and let Rλ∗ denote the return on the
risk-averse portfolio based on risk-aversion parameter λ. Let Ri denote
the return on a given asset in this set. Find E(Ri ) − E(Rλ∗ ), in terms
of Cov(Ri , Rλ∗ ), Var(Rλ∗ ), and λ.

10. Consider an asset with return R, where R has standard deviation
0.03 and suppose that the return on the market has standard deviation
0.02. If the value of beta for the asset is 0.6, find the market
component of Var(R) and find the proportion of Var(R) that is due
to the market.
11. Consider an asset for which the SML holds with β = 1.1. If the
return on the market portfolio has standard deviation 0.04, what is
the smallest value that the return standard deviation for the asset
can take?
12. Consider two assets with mean returns μ1 and μ2 , respectively, and
assume that μj > μf , j = 1, 2, where μf denotes the risk-free rate.
If the market component of variance of asset 1 is larger than
the market component of variance of asset 2, does it follow that the
Sharpe ratio of asset 1 is greater than the Sharpe ratio of asset 2?
Why or why not?
13. Consider a market of N assets, where N ≥ 5. Let R1 , R2 , R3 , and R4
denote the returns on four of the assets and let Rm denote the
return on the market portfolio. Suppose that the covariance matrix
of $(R_1, R_2, R_3, R_4, R_m)^T$ is given by
$$\begin{pmatrix}
0.012 & 0.004 & 0.005 & 0.0066 & 0.0055 \\
0.004 & 0.008 & 0.0036 & 0.005 & 0.004 \\
0.005 & 0.0036 & 0.012 & 0.0054 & 0.0045 \\
0.0066 & 0.005 & 0.0054 & 0.01 & 0.006 \\
0.0055 & 0.004 & 0.0045 & 0.006 & 0.005
\end{pmatrix}. \tag{7.25}$$
For instance, $\mathrm{Cov}(R_1, R_m) = 0.0055$ and $\mathrm{Var}(R_m) = 0.0050$.
Suppose that the mean excess return on the market portfolio is
0.04 and that the SML holds for all assets.
a. Calculate beta for each of the four assets.
b. Find the minimum-variance portfolio subject to the restriction
that beta for the portfolio is 1. Calculate the return variance
and the expected excess return for the portfolio.
c. Find the minimum-variance portfolio subject to the restriction
that beta for the portfolio lies in the interval [0.95, 1.05].
Calculate the return variance and the expected excess return for the
portfolio.
d. Find the minimum-variance portfolio subject to the restrictions
that beta for the portfolio lies in the interval [0.95, 1.05] and
that the portfolio weights are nonnegative. Calculate the return
variance and the expected excess return for the portfolio.
14. Suppose the market portfolio has mean return μm = 0.075 and
return standard deviation σm = 0.14; suppose that μf , the risk-free
return, is 0.005.

Suppose that asset i has mean return μi and return standard
deviation σi and let ρi denote the correlation of the return on asset
i with the return on the market portfolio. Based on the CAPM,
determine if the asset is overpriced, underpriced, or priced correctly
if the asset parameters take the following values:
a. μi = 0.035, σi = 0.3, and ρi = 0.2
b. μi = 0.045, σi = 0.2, and ρi = 0.6
c. μi = 0.075, σi = 0.15, and ρi = 0.8
15. Consider a market portfolio with return Rm . Suppose that there
exists an asset that is not included in the market portfolio but for
which the SML with respect to the market portfolio holds. Thus, if
R0 denotes the return on this asset,

$$R_0 - R_f = \beta_0 (R_m - R_f) + Z$$
where E(Z) = 0, Cov(Z, Rm ) = 0, and Rf denotes the return on
the risk-free asset.
Find the tangency portfolio consisting of the market portfolio
(viewed as a single asset) and the asset with return R0 . Interpret
the result.
16. Consider a market without a risk-free asset and let $R_m$ denote
the return on the market portfolio, which is assumed to be efficient.
Let $R_z$ denote the return on the zero-beta portfolio defined
in Proposition 7.3. That is,
$$R_z = \phi_z R_m + (1 - \phi_z) R_{mv}$$
where $R_{mv}$ is the return on the minimum-variance portfolio and
$$\phi_z = \frac{\sigma^2_{mv}}{\sigma^2_{mv} - \sigma^2_m}.$$
Here $\sigma^2_{mv} = \mathrm{Var}(R_{mv})$ and $\sigma^2_m = \mathrm{Var}(R_m)$.

a. Is the zero-beta portfolio with return Rz on the minimum-risk
frontier? Why or why not?
b. Is the zero-beta portfolio with return Rz on the eﬃcient frontier?
Why or why not?
17. Consider a market without a risk-free asset and let Rm denote the
return on the market portfolio, which is assumed to be eﬃcient.
Suppose that there are two zero-beta portfolios with respect to Rm
and let Rz1 and Rz2 denote their respective returns. Show that

E(Rz1 ) = E(Rz2 ).

8 The Market Model

8.1 Introduction
The capital asset pricing model (CAPM) describes a relationship between the
expected return on an asset and the expected return on a “market portfolio,”
which is assumed to be equivalent to the tangency portfolio. The value of
the parameter β for an asset gives important information regarding both the
expected return and the risk of an asset.
However, the CAPM describes a theoretical relationship that is based on
a number of assumptions that are diﬃcult, or impossible, to verify. In this
chapter, we consider the market model, a statistical model for the relationship
between observed asset returns and the observed returns on a type of market
portfolio. This model is a form of linear regression model that can be estimated
using standard techniques. It is consistent with the CAPM in many respects
and, hence, the estimates from the market model give useful information
regarding the statistical properties of asset returns.

8.2 Market Indices

A key component of the CAPM is the market portfolio, consisting of all
marketable assets. However, in many respects, the market portfolio is a
hypothetical concept rather than an observable feature of the market. For instance,
because some investors might prefer to invest in real estate or art instead of
stocks, these assets are part of the market and should be included in the
market portfolio. It might even be argued that because an investor might sell
stocks to pay for a child’s education, certain types of human capital must also
be included.
Therefore, it is clear that we cannot hope to accurately measure the return
on a true market portfolio. However, we can use the return on a stock market
index as a proxy for the return on a market portfolio. Hence, in this section,
we consider the properties of such indices.

Cap-Weighted Indices
Suppose that a market index is to be based on N assets, with respective prices
$P_{1,t}, P_{2,t}, \ldots, P_{N,t}$ at time t; here $P_{j,t}$ is the price of one share of asset j at
time t. Let $Q_j$ denote the number of shares of asset j available in the market,
j = 1, 2, . . . , N. Then the amount invested in asset j at time t is $Q_j P_{j,t}$; this is
known as the market capitalization of asset j at time t. The total capitalization
of the entire market, or the market cap, at time t is given by
$$\sum_{j=1}^{N} Q_j P_{j,t}.$$

A cap-weighted index based on these assets is of the form
$$I_t = \frac{\sum_{j=1}^{N} Q_j P_{j,t}}{D_t}, \quad t = 1, 2, \ldots$$
where Dt is a divisor used to rescale the total capitalization. The divisor is
modiﬁed periodically so that the index provides a continuous measure of the
value of these assets and it is not aﬀected by certain corporate actions, such
as mergers, or changes to the set of assets used to form the index.
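For simplicity, assume that the divisor is constant over time. Then the
return on the index at time t + 1 is given by
$$\frac{I_{t+1}}{I_t} - 1 = \frac{\sum_{j=1}^{N} Q_j P_{j,t+1}}{\sum_{j=1}^{N} Q_j P_{j,t}} - 1.$$
Note that
$$\frac{\sum_{j=1}^{N} Q_j P_{j,t+1}}{\sum_{j=1}^{N} Q_j P_{j,t}} - 1 = \frac{\sum_{j=1}^{N} Q_j (P_{j,t+1} - P_{j,t})}{\sum_{j=1}^{N} Q_j P_{j,t}} = \frac{\sum_{j=1}^{N} Q_j P_{j,t} R_{j,t+1}}{\sum_{j=1}^{N} Q_j P_{j,t}}$$
where
$$R_{j,t+1} = \frac{P_{j,t+1}}{P_{j,t}} - 1$$
denotes the return on asset j at time t + 1.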
Let
$$w^c_{j,t} = \frac{Q_j P_{j,t}}{\sum_{k=1}^{N} Q_k P_{k,t}}$$
denote the proportion of the market cap corresponding to asset j at time t.
Note that
$$\sum_{j=1}^{N} w^c_{j,t} = 1.$$
The weights $w^c_{j,t}$, j = 1, 2, . . . , N, are known as capitalization weights, or simply
cap weights, at time t.
The return on the index at time t + 1 may be written
$$\frac{I_{t+1}}{I_t} - 1 = \sum_{j=1}^{N} w^c_{j,t} R_{j,t+1},$$
which is identical to the return on the portfolio with weights $w^c_{j,t}$,
j = 1, 2, . . . , N. If the divisor changes from time t to time t + 1, then that change
also plays a role in the return on the index.
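
The equivalence between the index return and the return on the cap-weighted portfolio can be illustrated with a small R sketch. The share counts, prices, and divisor below are hypothetical values chosen for illustration, and the divisor is assumed to be constant over the period.

```r
# Hypothetical data for N = 3 assets (assumptions for illustration only)
Q      <- c(100, 50, 200)     # number of shares of each asset
P_t    <- c(20, 80, 5)        # prices at time t
P_next <- c(22, 76, 5.5)      # prices at time t + 1
D      <- 10                  # divisor, assumed constant from t to t + 1

# Return on the cap-weighted index
I_t    <- sum(Q * P_t) / D
I_next <- sum(Q * P_next) / D
index_return <- I_next / I_t - 1

# Return on the portfolio with cap weights at time t
w_cap  <- Q * P_t / sum(Q * P_t)
R_next <- P_next / P_t - 1
portfolio_return <- sum(w_cap * R_next)

c(index = index_return, portfolio = portfolio_return)
```

The two numbers agree, and the value of the constant divisor has no effect on the return.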
In this discussion, the weights $w^c_{j,t}$, j = 1, 2, . . . , N, are based on the total
number of shares of the asset, $Q_j$, along with the prices of the assets. However,
not all shares of an asset are always available to investors; for instance, for
stocks, there may be blocks of shares held by directors of the company. Shares
that are available to investors are said to be part of the float. Therefore, shares
held by directors are generally not part of the float.
Some indices exclude shares that are not part of the float from the
index calculation. Let $\tilde{Q}_j$ denote the number of available shares of asset
j, j = 1, . . . , N. Then
$$\tilde{I}_t = \frac{\sum_{j=1}^{N} \tilde{Q}_j P_{j,t}}{D_t}$$
represents a float-adjusted index. It has the same general properties as a
cap-weighted index. The ratio $\tilde{Q}_j / Q_j$ is known as the investable weight factor of
the asset.
A commonly used cap-weighted index is the Standard & Poor's (S&P) 500
index, which is based on 500 large-capitalization stocks, representing about
80% of the total market capitalization. Other cap-weighted indices include
the Russell 3000 index, which is based on the 3000 stocks with the largest
capitalizations, representing about 98% of the total market capitalization;
the Russell 1000 index, which includes the 1000 largest stocks, in terms of
capitalization, of those used in the Russell 3000 index and that represents
about 92% of the total market capitalization; and the Wilshire 5000 index,
which is based on the stocks of all publicly traded companies trading on a U.S.
stock exchange. The S&P 500 index, the Russell 3000 index, and the Russell
1000 index are all float-adjusted; the Wilshire 5000 index is not float-adjusted,
although there is a float-adjusted version available.
Example 8.1 Data on stock market indices are available from Yahoo
Finance. For instance, the S&P 500 index is available using the symbol
^GSPC; in general, the symbols used for stock market indices start with the
character ^. Hence, returns on the S&P 500 index may be calculated using
the same method used to calculate the returns on a stock. Suppose that the
monthly excess returns on the S&P 500 index have been calculated for the time
period January 2010 to December 2014 and are stored in the R variable sp500.
> mean(sp500)
[1] 0.0109
> sd(sp500)
[1] 0.0376
The monthly excess returns on the Russell 1000 (Yahoo Finance symbol
^RUI), Russell 3000 (^RUA), and Wilshire 5000 (^W5000) indices for the
same period are stored in the variables r1000, r3000, and w5000, respectively,
and the matrix indices contains all four of the indices considered; the first
few rows of this matrix are given as follows:

> head(indices)
       SP500   R1000   R3000   W5000
[1,] -0.0370 -0.0370 -0.0370 -0.0344
[2,]  0.0284  0.0305  0.0316  0.0323
[3,]  0.0587  0.0597  0.0613  0.0615
[4,]  0.0146  0.0174  0.0205  0.0207
[5,] -0.0821 -0.0815 -0.0811 -0.0812
[6,] -0.0540 -0.0573 -0.0591 -0.0562

It is clear from these few values that the returns on these four indices
are generally, but not always, similar. Therefore, it is not surprising that the
means and standard deviations of the returns on the four indices are generally
close.

> apply(indices, 2, mean)
 SP500  R1000  R3000  W5000
0.0109 0.0111 0.0112 0.0112
> apply(indices, 2, sd)
 SP500  R1000  R3000  W5000
0.0376 0.0383 0.0391 0.0390

Furthermore, the returns are highly correlated.

> cor(indices)
      SP500 R1000 R3000 W5000
SP500 1.000 0.999 0.997 0.996
R1000 0.999 1.000 0.999 0.999
R3000 0.997 0.999 1.000 1.000
W5000 0.996 0.999 1.000 1.000

Thus, the smallest correlation among the four indices is 0.996, between the
S&P 500 index and the Wilshire 5000 index; this is not surprising given that,
of the four indices, the Wilshire 5000 represents the most stocks while the
S&P 500 represents the fewest.
It is worth noting that, even though the return means and standard
deviations of the different indices are in close agreement and the indices are highly
correlated, often there is considerable variation in the returns on the different
indices in a given time period. For instance, in period 4, the returns on the four
indices are 0.0146, 0.0174, 0.0205, and 0.0207, respectively. The high
correlations tell us that this variation among indices is small relative to the variation
within each index, a consequence of the fact that even the returns on a broad
stock market index such as the Wilshire 5000 are quite variable.

Price-Weighted Indices
Another approach to computing a market index is to simply sum the prices
$P_{1,t}, P_{2,t}, \ldots, P_{N,t}$ of the stocks represented in the index. Let
$$J_t = \frac{\sum_{j=1}^{N} P_{j,t}}{D_t}$$
where $D_t$ is a divisor, with properties similar to those of a divisor for a
cap-weighted index.
Suppose the divisor does not change from period t to period t + 1. Then
the return on the index $J_t$ at time t + 1 is
$$\frac{J_{t+1}}{J_t} - 1 = \frac{\sum_{j=1}^{N} P_{j,t+1}}{\sum_{j=1}^{N} P_{j,t}} - 1 = \frac{\sum_{j=1}^{N} (P_{j,t+1} - P_{j,t})}{\sum_{j=1}^{N} P_{j,t}}.$$
Writing $P_{j,t+1} - P_{j,t} = P_{j,t} R_{j,t+1}$,
$$\frac{J_{t+1}}{J_t} - 1 = \sum_{j=1}^{N} \frac{P_{j,t}}{\sum_{k=1}^{N} P_{k,t}} R_{j,t+1}$$
so that the return on $J_t$ is equal to the return on a portfolio with asset weights
$$w^p_{j,t} = \frac{P_{j,t}}{\sum_{k=1}^{N} P_{k,t}}, \quad j = 1, 2, \ldots, N.$$
Therefore, an index of this type is said to be a price-weighted index.
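
As a quick numerical check of this result, the following R sketch computes the return on a price-weighted index and the return on the portfolio with price weights. The prices and divisor are hypothetical values used only for illustration, and the divisor is assumed not to change over the period.

```r
# Hypothetical prices for N = 3 stocks (assumptions for illustration only)
P_t    <- c(20, 80, 5)        # prices at time t
P_next <- c(22, 76, 5.5)      # prices at time t + 1
D      <- 0.5                 # divisor, assumed constant from t to t + 1

# Return on the price-weighted index
J_t    <- sum(P_t) / D
J_next <- sum(P_next) / D
index_return <- J_next / J_t - 1

# Return on the portfolio with price weights at time t
w_price <- P_t / sum(P_t)
R_next  <- P_next / P_t - 1
portfolio_return <- sum(w_price * R_next)

c(index = index_return, portfolio = portfolio_return)
```

In this sketch the highest-priced stock receives the largest weight regardless of how many shares are outstanding, which is the main practical difference from a cap-weighted index.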

The most well-known price-weighted index is the Dow Jones Industrial
Average, which is based on the stocks of 30 large companies, in a variety of
industries; the Yahoo Finance symbol for the Dow Jones Industrial Average
is ^DJI.

Example 8.2 Suppose that the monthly excess returns on the Dow Jones
Industrial Average for the period January 2010 to December 2014 have been
calculated and are stored in the R variable djia.

> mean(djia)
[1] 0.00950
> sd(djia)
[1] 0.0347

Thus, the mean and standard deviation of the returns on the Dow Jones
Industrial Average are close to, but slightly different from, those based on the
returns on the cap-weighted indices considered earlier. Similarly, its returns
are highly correlated with those of the cap-weighted indices, but the
correlations are not as large as those among the four cap-weighted indices; of course,
the Dow Jones is based on only 30 stocks, so we should not expect the same
level of agreement seen earlier.
> cor(djia, indices)
     SP500 R1000 R3000 W5000
[1,] 0.977 0.971 0.968 0.967

8.3 The Model and Its Estimation

For t = 1, 2, . . . , T , let Ri,t denote the return on asset i at time t. Let Rm,t
denote the return on a market index at time t and let Rf,t denote the return on
the risk-free asset at time t. Recall that, although the return on the risk-free
asset has zero variance, the rate of return itself varies over time.
Assume that the stochastic process given by $\{(R_{i,t} - R_{f,t}, R_{m,t} - R_{f,t})^T :
t = 1, 2, \ldots\}$ is weakly stationary; weak stationarity of a pair of random
variables holds if any real-valued linear function of the random variables is weakly
stationary in the usual sense. Hence, under weak stationarity, the means,
variances, and covariances of $R_{i,t}$ and $R_{m,t}$ do not depend on t.
The market model states that
$$R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T \tag{8.1}$$
where $\alpha_i$, $\beta_i$ are unknown parameters and $\epsilon_{i,1}, \epsilon_{i,2}, \ldots, \epsilon_{i,T}$ are unobserved
random variables each with mean zero and variance $\sigma^2_{\epsilon,i}$. Furthermore, we
assume that
$$\mathrm{Cov}(\epsilon_{i,t}, R_{m,t}) = 0, \quad t = 1, 2, \ldots, T. \tag{8.2}$$
The random variables $\epsilon_{i,t}$, t = 1, 2, . . . , T, are known as the residual
returns. They may be interpreted as the component of the excess return on
the asset that is uncorrelated with the market returns; see Corollary 7.1 for a
similar random variable in the context of the CAPM.
Note that some analysts define the residual returns to be $\alpha_i + \epsilon_{i,t}$,
t = 1, 2, . . . , T, so that $\alpha_i$ represents the mean residual return, while here we
define the residual returns to have mean zero; however, the basic idea is the
same: residual returns represent that part of an asset’s returns that remains
after accounting for a linear relationship with the market return.
Note that assumption (8.2) is equivalent to the assumption that the
parameter $\beta_i$ in (8.1) can be expressed in terms of $\mathrm{Cov}(R_{i,t}, R_{m,t})$ and
$\mathrm{Var}(R_{m,t})$:
$$\beta_i = \frac{\mathrm{Cov}(R_{i,t}, R_{m,t})}{\mathrm{Var}(R_{m,t})}. \tag{8.3}$$
To see this, first note that, if $\mathrm{Cov}(\epsilon_{i,t}, R_{m,t}) = 0$, then, by (8.1),
$$\mathrm{Cov}(R_{i,t}, R_{m,t}) = \beta_i \mathrm{Var}(R_{m,t}),$$
which yields the expression (8.3) for $\beta_i$.

Conversely, according to (8.1),
$$\mathrm{Cov}(R_{i,t}, R_{m,t}) = \beta_i \mathrm{Var}(R_{m,t}) + \mathrm{Cov}(\epsilon_{i,t}, R_{m,t}); \tag{8.4}$$
recall that $\mathrm{Cov}(R_{m,t}, R_{f,t}) = 0$. Therefore, if (8.3) holds, then
$$\mathrm{Cov}(R_{i,t}, R_{m,t}) = \mathrm{Cov}(R_{i,t}, R_{m,t}) + \mathrm{Cov}(\epsilon_{i,t}, R_{m,t})$$
so that $\mathrm{Cov}(\epsilon_{i,t}, R_{m,t}) = 0$.

We also assume that the errors are uncorrelated,
$$\mathrm{Cov}(\epsilon_{i,t}, \epsilon_{i,s}) = 0 \quad \text{for all } t, s = 1, 2, \ldots, T, \ t \neq s$$
and that
$$\mathrm{Cov}(\epsilon_{i,t}, R_{m,s}) = 0 \quad \text{for all } t, s = 1, 2, \ldots, T.$$
Therefore, the market model is a regression model with response variable
$$Y_t = R_{i,t} - R_{f,t}, \quad t = 1, 2, \ldots, T,$$
the excess returns of asset i, and predictor variable
$$X_t = R_{m,t} - R_{f,t}, \quad t = 1, 2, \ldots, T,$$
the excess returns on a market index:
$$Y_t = \alpha + \beta X_t + \epsilon_t, \quad t = 1, 2, \ldots, T$$
where $\alpha = \alpha_i$, $\beta = \beta_i$, and $\epsilon_t = \epsilon_{i,t}$. Under the assumptions described here, the
errors $\epsilon_1, \epsilon_2, \ldots, \epsilon_T$ have mean zero, constant variance, and are uncorrelated;
furthermore, $\mathrm{Cov}(\epsilon_t, X_t) = 0$. A model of this type for $(Y_t, X_t)$, t = 1, 2, . . . , T,
is often described as a “simple linear regression model.”

Relationship to the CAPM

The market model is similar to the CAPM, but there are important diﬀerences.
The CAPM is a model for the relationship between the expected excess return
on an asset and the expected excess return on a hypothetical market portfolio,
which is assumed to achieve the maximum possible Sharpe ratio, while the
market model is a statistical model for observed excess returns on an asset
and the observed excess returns on a market index. Note that, even though
the CAPM and the market model use the same basic notation for the returns
on the market portfolio and the returns on a market index, these two sets of
returns are not identical.
Taking expectations in (8.1), the market model implies that
$$\mu_i - \mu_f = \alpha_i + \beta_i (\mu_m - \mu_f),$$
where $\mu_i - \mu_f = E(R_{i,t} - R_{f,t})$ and $\mu_m - \mu_f = E(R_{m,t} - R_{f,t})$. Thus, if
the return on the market index used in the market model may be viewed as
the return on the market portfolio, the market model and the CAPM describe
similar relationships. The most important difference between these models is
that, under the CAPM, the efficiency of the market portfolio implies that the
intercept parameter satisfies αi = 0; conversely, if αi = 0, the market model
implies a form of the CAPM using the returns on a market index in place of
the returns on the market portfolio.

Interpretation of βi
As in the CAPM, the parameter βi in the market model is a measure of the
relationship between the excess returns on an asset and the excess returns
on the market index and, in many respects, the interpretation of βi follows
from the interpretation of beta in the CAPM, as discussed in Chapter 7.
For instance, it may be used to decompose the variance of an asset’s returns
into market and nonmarket components; this will be discussed in detail in
Section 8.5.
An alternative interpretation of βi is as a measure of the sensitivity of an
asset’s excess returns to the excess return on the market index. However, such
an interpretation does not follow directly from the assumptions of the market
model as given in this chapter. In particular, the interpretation of βi as a
measure of sensitivity is valid only if the relationship between Ri,t − Rf,t and
Rm,t − Rf,t is a linear one.
That is, suppose that the condition that $\epsilon_{i,t}$ and $R_{m,t}$ are uncorrelated is
strengthened to $E(\epsilon_{i,t} \mid R_{m,t} - R_{f,t}) = 0$; then
$$E(R_{i,t} - R_{f,t} \mid R_{m,t} - R_{f,t} = r) = \alpha_i + \beta_i r.$$
It follows that
$$\beta_i = \frac{d}{dr} E(R_{i,t} - R_{f,t} \mid R_{m,t} - R_{f,t} = r)$$
and $\beta_i$ may be interpreted as the measure of sensitivity described previously.
However, if $E(\epsilon_{i,t} \mid R_{m,t} - R_{f,t})$ is a nonzero function of $R_{m,t} - R_{f,t}$, then
$$\frac{d}{dr} E(R_{i,t} - R_{f,t} \mid R_{m,t} - R_{f,t} = r)$$
might not be equal to $\beta_i$. Fortunately, it is generally reasonable to assume
that $E(\epsilon_{i,t} \mid R_{m,t} - R_{f,t}) = 0$ does hold and, hence, the interpretation of $\beta_i$ as
a measure of sensitivity is typically appropriate.
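
The distinction between uncorrelated errors and errors with conditional mean zero can be illustrated by simulation. The R sketch below is an artificial example, not taken from the text: the simulated "market" excess returns x are normal, and the error term is a centered quadratic function of x, so that it has mean zero and is uncorrelated with x but its conditional mean given x is not zero. All numerical values are hypothetical.

```r
set.seed(1)
n <- 100000
x <- rnorm(n, mean = 0, sd = 0.04)     # simulated market excess returns

# Error with mean zero and Cov(eps, x) = 0 (by symmetry of x),
# but E(eps | x) = 5 * (x^2 - 0.04^2) is a nonzero function of x
eps <- 5 * (x^2 - 0.04^2)
y <- 0.001 + 0.9 * x + eps             # simulated asset excess returns

# The least-squares slope estimates Cov(y, x) / Var(x), which is 0.9 here
fit <- lm(y ~ x)
coef(fit)["x"]

# The conditional mean of y given x = r is 0.001 + 0.9 r + 5 (r^2 - 0.04^2),
# so its derivative at r is 0.9 + 10 r, which equals beta only at r = 0
slope_at <- function(r) 0.9 + 10 * r
slope_at(c(-0.05, 0, 0.05))
```

In this artificial case the covariance-based beta is still well defined, but it describes the sensitivity of the conditional mean only near r = 0.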

Estimation
We now consider estimation of the parameters of the market model. As
discussed previously, the market model may be viewed as a simple linear
regression model with response variable $Y_t = R_{i,t} - R_{f,t}$, where $R_{i,t}$ is the
return on a specific asset in period t, and $R_{f,t}$ is the risk-free rate in period t,
and predictor variable $X_t = R_{m,t} - R_{f,t}$, where $R_{m,t}$ is the return of a market
index in period t.
Therefore, the parameters $\alpha_i$ and $\beta_i$ may be estimated using ordinary least
squares. The formulas for the estimators are
$$\hat{\beta}_i = \frac{\sum_{t=1}^{T} (Y_t - \bar{Y})(X_t - \bar{X})}{\sum_{t=1}^{T} (X_t - \bar{X})^2}$$
and
$$\hat{\alpha}_i = \bar{Y} - \hat{\beta}_i \bar{X}$$
where $\bar{Y}$ and $\bar{X}$ are the sample means of the $Y_t$ and $X_t$, respectively; these
expressions are sometimes useful for studying the properties of the estimators,
but they are not needed for numerical work.
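
To see that these formulas agree with the output of standard regression software, the following R sketch applies them to simulated data and compares the results with those from the function lm. The data are simulated stand-ins for excess returns, with hypothetical parameter values; they are not the return data used elsewhere in this chapter.

```r
set.seed(123)
n_months <- 60

# Simulated stand-ins for excess returns (hypothetical parameter values)
x <- rnorm(n_months, mean = 0.01, sd = 0.04)                  # market index
y <- 0.001 + 0.8 * x + rnorm(n_months, mean = 0, sd = 0.03)   # asset

# Least-squares estimators computed directly from the formulas
beta_hat  <- sum((y - mean(y)) * (x - mean(x))) / sum((x - mean(x))^2)
alpha_hat <- mean(y) - beta_hat * mean(x)

# The same estimates from lm
fit <- lm(y ~ x)

rbind(formulas = c(alpha_hat, beta_hat), lm = coef(fit))
```

The slope estimate is also equal to cov(x, y) / var(x), the sample analog of (8.3).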
Thus, the remaining issue is selection of the data to be used in the analysis:
the market index, the risk-free asset, the return interval, and the observation
period.
As discussed in Section 8.2, the “market portfolio” is a hypothetical
concept; hence, in estimating the parameters of the market model, we use a
market index chosen to measure the general behavior of the equity market.
The most commonly used index in this context is the S&P 500 index. Although
it includes only 500 stocks, the return on the S&P 500 is generally believed
to reflect the return on the entire market. There are a number of broader
indices that can be used such as the Russell 3000 index and the Wilshire 5000
index. As shown in Example 8.1, the S&P 500 index, the Russell 1000 index,
the Russell 3000 index, and the Wilshire 5000 index are generally highly
correlated with each other; hence, the choice from among these indices has a
relatively small impact on the estimates. Here we will use the return on the
S&P 500 index as the return on the market portfolio.
For the risk-free rate to use in the analysis, we will use the return on
a 3-month Treasury Bill, as discussed in Example 6.1. These are generally
reported as annual percentage rates, which must be converted to proportional
monthly rates. Let $R_{fa}$ be an annual percentage rate; recall that this may be
converted to a monthly rate by
$$R_f = (1 + R_{fa}/100)^{1/12} - 1.$$
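
This conversion is easily carried out in R. The sketch below defines a small helper function, named here annual_pct_to_monthly for illustration, and applies it to a few hypothetical annual percentage rates; these are not actual Treasury Bill rates.

```r
# Convert an annual percentage rate to a proportional monthly rate
annual_pct_to_monthly <- function(rate_annual_pct) {
  (1 + rate_annual_pct / 100)^(1 / 12) - 1
}

# Hypothetical annual percentage rates (for illustration only)
tbill_annual_pct <- c(0.06, 0.11, 0.15)

rf_monthly <- annual_pct_to_monthly(tbill_annual_pct)
rf_monthly

# Compounding the monthly rate over 12 months recovers the annual rate,
# expressed as a proportion
(1 + rf_monthly)^12 - 1
```

Because the function is vectorized, it can be applied directly to a vector containing one annual rate per month of the observation period.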

For the return interval, we could use daily, weekly, monthly, quarterly, or
yearly returns. The return interval should reﬂect the investment horizon of
interest. For instance, if investment decisions are made on a monthly basis,
it generally makes sense to use monthly returns. Here we will use monthly
returns.
The observation period refers to the number of return intervals to use in
the analysis; for example, for monthly data, we need to choose how many
months of data to include. For a given return interval, a longer observation
period clearly yields more data and smaller standard errors. However, in using
a longer observation period, we are implicitly assuming that β is constant over
that time. Over a short observation period, this may be reasonable, but such
an assumption becomes questionable as the observation period increases due
to changes in the firms under consideration or changes in economic conditions.
Three to five years is commonly used. Here we will use five years of monthly
data.

Example 8.3 Consider the monthly excess returns on IBM stock, which we
assume have been calculated and stored in the variable ibm; as discussed in
Example 8.1, the excess returns on the S&P 500 index are stored in the vari-
able sp500. As with any statistical analysis, before estimating the parameters
of the linear regression model relating ibm to sp500, it is a good idea to plot
the data; such a plot is given in Figure 8.1. The plot indicates an approximately
linear relationship between the variables that would be accurately described by
the market model.
To estimate the parameters of the market model in R, we use the function
lm. The syntax of the command to estimate the market model relating returns
on IBM stock to returns on the S&P 500 index, as contained in the variables
ibm and sp500, respectively, is

> lm(ibm~sp500)

FIGURE 8.1
Plot of IBM monthly returns versus the returns on the S&P 500 index (vertical axis: return on IBM stock; horizontal axis: return on S&P 500 index).

The expression ibm~sp500 is known as a model formula and may be read
as “ibm is described by sp500.” The screen output from the command contains
α̂ and β̂, the least-squares estimates of the parameters α and β, respectively,
in the market model for the returns on IBM stock:
> lm(ibm~sp500)
Coefficients:
(Intercept) sp500
-0.000707 0.618789
Therefore, for the data under consideration, β̂ = 0.619 and α̂ = −0.000707.
However, much more information is available from the command, and it
may be accessed by using certain extractor functions. Therefore, it is often
useful to save the results of the lm function in a variable, which can be accessed
as necessary:
> ibm.mm<-lm(ibm~sp500)
The variable ibm.mm now contains the results of the linear regression anal-
ysis relating the returns on IBM stock to the returns on the S&P 500 index.
The summary command may be used to display a summary of the results:
> summary(ibm.mm)
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.000707   0.005358   -0.13      0.9
sp500        0.618789   0.138073    4.48  3.5e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0398 on 58 degrees of freedom

Multiple R-squared: 0.257, Adjusted R-squared: 0.244
F-statistic: 20.1 on 1 and 58 DF, p-value: 3.54e-05
This output contains much useful information. For instance, the standard
error of β̂ is 0.138, so that an approximate 95% confidence interval for β is
given by
0.619 ± (1.96)(0.138) = (0.349, 0.889).
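
The interval above can be reproduced in R as a quick check. The sketch below uses the rounded estimate and standard error from the summary output; the t-based interval is an addition for comparison and is not part of the calculation in the text.

```r
beta_hat <- 0.619   # rounded estimate of beta from the summary output
se_beta  <- 0.138   # rounded standard error of the estimate

# Normal-approximation interval, matching the calculation in the text
ci_normal <- beta_hat + c(-1, 1) * qnorm(0.975) * se_beta

# Alternative (an assumption beyond the text): use the t distribution with
# the 58 residual degrees of freedom, which gives a slightly wider interval
ci_t <- beta_hat + c(-1, 1) * qt(0.975, df = 58) * se_beta

round(ci_normal, 3)
round(ci_t, 3)
```

In a session where the fitted object ibm.mm is available, confint(ibm.mm) returns t-based intervals for both coefficients directly.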
An estimate of the error standard deviation in the market model for returns
on IBM stock, $\sigma_\epsilon$, is given by the “Residual standard error” on the output.
Hence, using $\hat\sigma_\epsilon$ to denote the estimate of $\sigma_\epsilon$ based on least-squares
regression, $\hat\sigma_\epsilon = 0.0398$.
It is also useful to know that this information may be accessed directly
through the components of the result of the summary function. For instance,
in the IBM example, summary(ibm.mm)$coefficients is a 2 × 4 matrix:
> summary(ibm.mm)$coefficients
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.000707    0.00536  -0.132 8.96e-01
sp500        0.618789    0.13807   4.482 3.54e-05

For example, the estimate of β is given by summary(ibm.mm)$coefficients[2,1]:
> summary(ibm.mm)$coefficients[2,1]
[1] 0.619
Other useful components are $sigma, which contains σ̂ , and $r.squared,
which contains R-squared for the regression:
> summary(ibm.mm)$sigma
[1] 0.0398
> summary(ibm.mm)$r.squared
[1] 0.257
The lm command may be used with a matrix argument as the response
variable in order to provide results for several stocks at one time. Recall that
in Chapter 6 we analyzed the variable big8, a matrix containing the monthly
returns for the stocks of eight large companies; see Example 6.6 for details.
To obtain the market model estimates of αi , βi for all the stocks represented
in the data matrix big8, we use the command
> big8.mm<-lm(big8~sp500)
> big8.mm
Coefficients:
                 AAPL       BAX       KO      CVS
(Intercept)  0.015361 -0.000353 0.004547 0.009559
sp500        0.920307  0.716442 0.485664 1.071935
                  XOM       IBM      JNJ      DIS
(Intercept) -0.001335 -0.000707 0.005637 0.007799
sp500        0.878737  0.618789 0.540500 1.193193
Thus, the values of β̂ for these stocks range from 0.486 for Coca-Cola to 1.193
for Disney.
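
Since the variables ibm, sp500, and big8 are not reproduced here, the following sketch illustrates the same workflow on simulated data. All numerical values, the seed, and the names mkt and stocks are assumptions made for illustration only.

```r
set.seed(1)                      # arbitrary seed for reproducibility
n_months <- 60                   # five years of monthly data, as in the text

# Simulated market excess returns (illustrative values only)
mkt <- rnorm(n_months, mean = 0.01, sd = 0.04)

# Two hypothetical stocks with "true" betas 0.6 and 1.2 and alpha = 0
true_beta <- c(A = 0.6, B = 1.2)
stocks <- sapply(true_beta, function(b) b * mkt + rnorm(n_months, sd = 0.04))

fit <- lm(stocks ~ mkt)          # matrix response: one regression per column
coef(fit)                        # 2 x 2 matrix: rows (Intercept), mkt
coef(fit)["mkt", ]               # estimated betas, one per stock

# The same estimates come from fitting each column separately
beta_separate <- apply(stocks, 2, function(y) coef(lm(y ~ mkt))[2])
all.equal(unname(coef(fit)["mkt", ]), unname(beta_separate))
```

The final comparison shows that a matrix response is only a convenience: each column is regressed on the market return separately.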

8.4 Testing the Hypothesis that an Asset Is Priced Correctly
Recall that, according to the CAPM, αi ≠ 0 indicates that asset i is mispriced,
with αi > 0 corresponding to an asset with a price that is too low and αi < 0
corresponding to an asset with a price that is too high; see Section 7.5.
Therefore, a test of the hypothesis αi = 0 in the market model may be used
as a test of the hypothesis that the asset is priced correctly; rejection of this
hypothesis suggests that the price of the asset is either too low or too high. It is
important to keep in mind that such a conclusion is a statement about the mar-
ket in periods 1, 2, . . . , T and is not necessarily a statement about future prices.
The p-value for such a test is available in the result extracted using the
summary function on the results from lm.

Example 8.4 Recall that, for IBM stock, the summary of the results of the lm
function includes the following:

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.000707   0.005358   -0.13      0.9
sp500        0.618789   0.138073    4.48  3.5e-05 ***

The p-value for testing α = 0 is 0.9; therefore, we do not reject the hypothesis
that the IBM stock is priced correctly.
To calculate the p-values for testing that α = 0 for each of the stocks with
returns included in the big8 variable, we may use the apply function. Note
that the [1, 4] element of the $coefficients component of the output from
the summary function is the p-value for testing α = 0,

> summary(lm(ibm~sp500))$coefficients[1, 4]
[1] 0.8955

which is rounded to 0.9 in the output from summary.

Define a function f.alphapval that takes a vector of excess returns as an
argument and returns this p-value:

> f.alphapval<-function(y)
+ {summary(lm(y~sp500))$coefficients[1, 4]}
> f.alphapval(ibm)
[1] 0.8955

The p-values for the eight stocks may then be calculated using apply

> apply(big8, 2, f.alphapval)

AAPL BAX KO CVS XOM IBM JNJ DIS
0.0883 0.9576 0.3675 0.0945 0.7591 0.8955 0.2110 0.1224

Therefore, for these eight stocks, the hypothesis that the stock is priced cor-
rectly is never rejected at the 0.05 level. Two stocks have a p-value less than
0.10—Apple, which has α̂ = 0.0154 and a p-value of 0.088, and CVS, which
has α̂ = 0.00956 and a p-value of 0.095.

Stock Screening and Multiple Testing

It is tempting to use tests of αj = 0 to screen a large number of stocks, hoping
to find a few that are mispriced. However, when testing many hypotheses in
this way, it is important to be aware of the multiple testing problem.
Suppose that we are testing m null hypotheses, each of the form H0 : αj = 0.
Recall that a p-value has the property that, if the null hypothesis is true, then
the probability is approximately 0.05 that the p-value will be less than or equal
to 0.05; more generally, the p-value is approximately distributed as a uniform
random variable on the interval (0, 1) when the null hypothesis is true.

Therefore, even if αj = 0 for all j = 1, 2, . . . , m, we expect about 5% of the
p-values to be less than 0.05. For instance, if we are testing αj = 0 for 100
stocks, even if all stocks are priced correctly, we expect about five significant
p-values, defining significance in terms of a 0.05 level, that is, choosing each
test to have a probability of Type I error of 0.05. Thus, if a few of the 100
p-values are significant, it may be inappropriate to conclude that those stocks
are mispriced.
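
The size of the problem can be illustrated with a short calculation. The expected count follows directly from the argument above; the probability of at least one significant result additionally assumes that the tests are independent, which is a simplifying assumption and not a claim made in the text.

```r
m     <- 100    # number of tests, as in the example in the text
level <- 0.05   # level of each individual test

# Expected number of p-values <= level when every null hypothesis is true
expected_false_positives <- m * level

# Probability of at least one p-value <= level, under the added assumption
# that the m tests are independent
prob_at_least_one <- 1 - (1 - level)^m

expected_false_positives
prob_at_least_one
```

Under these assumptions, observing at least one "significant" stock among 100 correctly priced stocks is close to certain.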
A simple way to deal with this issue is to modify the criterion for a signif-
icant p-value. Suppose that we take the null hypothesis to be the hypothesis
that all αj are 0; that is, consider the null hypothesis
H0 : αj = 0, j = 1, 2, . . . , m
or, equivalently,
H0 : α1 = α2 = · · · = αm = 0.
For this testing problem, a Type I error corresponds to the event of rejecting
αj = 0 for any j when, in fact, all αj are 0. We can test this hypothesis using
the p-values from the tests of the individual hypotheses αj = 0, which we
denote by q1 , q2 , . . . , qm , respectively.
Suppose we want the level of our test of H0 : α1 = α2 = · · · = αm = 0 to
be, at most, 0.05. If we reject αj = 0 when qj ≤ c∗ , for some threshold c∗ ,
then the probability of a Type I error is
P (q1 ≤ c∗ ∪ q2 ≤ c∗ ∪ · · · ∪ qm ≤ c∗ )
calculated under the assumption that
α1 = α2 = · · · = αm = 0.
Exact calculation of this probability requires the joint distribution of
(q1 , q2 , . . . , qm ); hence, it is difficult, if not impossible, without making strong
assumptions.
However, it is generally possible to bound the probability. Recall that, for
two events A and B,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
and, hence,
P(A ∪ B) ≤ P(A) + P(B).
An induction argument can be used to show that for events A1 , A2 , . . . , Am

$$P(A_1 \cup A_2 \cup \cdots \cup A_m) \le \sum_{j=1}^{m} P(A_j),$$

a result known as the Bonferroni inequality.

Hence,

$$P(q_1 \le c^* \cup q_2 \le c^* \cup \cdots \cup q_m \le c^*) \le \sum_{j=1}^{m} P(q_j \le c^*). \tag{8.5}$$

Using the fact that a p-value has a uniform distribution under the null
hypothesis, P(qj ≤ c∗ ) = c∗ . It follows that

P(q1 ≤ c∗ ∪ q2 ≤ c∗ ∪ · · · ∪ qm ≤ c∗ ) ≤ mc∗ .

Therefore, to guarantee that our test has a level less than or equal to 0.05,
we can choose c∗ = 0.05/m. Then the probability of concluding that any of
the assets is mispriced when all are priced correctly is less than or equal to
0.05. Clearly, the same approach may be used for any desired level.
Hence, to address the multiple-testing problem, we modify the criterion for
a significant p-value from 0.05 to 0.05/m, where m is the number of hypotheses
being tested; this is known as the Bonferroni method. An equivalent approach
is to calculate “adjusted p-values,” given by mqj , j = 1, 2, . . . , m; if mqj > 1,
we set the adjusted p-value to 1. The adjusted p-values can then be evaluated
using the usual criteria; for instance, we can compare the adjusted p-values
to 0.05 for a test with level 0.05.
Example 8.5 Consider stocks for firms represented in the S&P 100 index;
stocks in the S&P 100 index are a subset of those in the S&P 500 index,
representing a cross section of large U.S. companies. For each stock, five years
of monthly returns were analyzed for the period ending December 31, 2014;
only 96 of the 100 stocks had five years of monthly returns available.
For each of these 96 stocks, the p-value of the test of αj = 0 described
earlier was calculated; the results are stored in the variable sp96.pv
> head(sp96.pv)
[1] 0.0883 0.2450 0.5338 0.9436 0.1488 0.0397
Thirteen of the p-values are less than 0.05, with the smallest at 0.0043.
> sort(sp96.pv)[1:15]
 [1] 0.00426 0.00548 0.00930 0.01299 0.01801 0.01960 0.02715 0.02891 0.03139
[10] 0.03458 0.03966 0.04254 0.04811 0.05394 0.05474
For a test with level 0.05, the Bonferroni-corrected criterion is 0.05/96 =
0.00052; all of the p-values exceed this threshold. Thus, although the p-values
suggest that some of the stocks might be mispriced, after adjusting for multiple
testing, we do not reject the hypothesis that all stocks are priced correctly.
Alternatively, if we compute the adjusted p-values, by multiplying the
p-values by 96, we see that the smallest adjusted p-value is 0.41 (96 times
0.0043), leading to the same conclusion.
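
The Bonferroni calculations in this example can be sketched in R using only the 15 smallest p-values printed above, since the full vector sp96.pv is not reproduced here. The name p_small is an assumption; p.adjust is the standard R function, and its argument n is set to 96 to indicate that the vector is a subset of 96 tests.

```r
# The 15 smallest p-values reported in Example 8.5
p_small <- c(0.00426, 0.00548, 0.00930, 0.01299, 0.01801, 0.01960, 0.02715,
             0.02891, 0.03139, 0.03458, 0.03966, 0.04254, 0.04811, 0.05394,
             0.05474)
m <- 96                          # number of hypotheses tested

threshold <- 0.05 / m            # Bonferroni-corrected criterion
any(p_small <= threshold)        # is any hypothesis rejected?

# Equivalent check with adjusted p-values; n = m tells p.adjust that the
# vector is a subset of m tests
p_bonf <- p.adjust(p_small, method = "bonferroni", n = m)
min(p_bonf)
```

With the complete vector available, p.adjust(sp96.pv, method="bonferroni") would give the adjusted p-values for all 96 stocks without the n argument.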

False Discovery Rate

An important drawback of the Bonferroni method is that it is generally con-
servative, in the sense that the actual level of the test is less than 0.05; this is
particularly true when m is large, as is often the case when analyzing stock
return data. In the present context, this property means that there is a
tendency for the procedure to conclude that all stocks are priced correctly even
when one or more is mispriced.
An alternative approach to designing tests of many hypotheses is to control
the false discovery rate (FDR) rather than to control the probability of a Type
I error. Suppose we conduct a series of tests of the hypotheses that a stock is
priced correctly, that is, of the hypotheses of the form αj = 0, and that, based on
the procedure used, we conclude that m0 of the stocks are mispriced; that is,
m0 of the hypotheses that αj = 0 are rejected. Let m1 denote the number of
those rejected hypotheses for which αj is actually 0.
We refer to a rejected hypothesis as a “discovery” and an incorrectly
rejected hypothesis as a “false discovery.” In the present context, a false dis-
covery occurs if we conclude that a stock is mispriced when it is not. The false
discovery proportion is defined as m1 /m0 provided that m0 > 0; if m0 = 0, it
is taken to be 0.
Note that the false discovery proportion is a random variable; the FDR is
the expected value of this random variable. Therefore, the FDR is the expected
proportion of rejected null hypotheses that were rejected incorrectly.
It is important to note that although the level of a test and its FDR
are related, they are fundamentally different measures. The level of a test of
α1 = α2 = · · · = αm = 0 is the probability of rejecting this hypothesis, that is,
of concluding that at least one αj is nonzero when all are actually 0. The FDR
measures the expected proportion of those cases in which αj = 0 is rejected for
which αj is actually 0. Hence, procedures that control the FDR do not control
the level of the test. However, the FDR is an intuitively appealing concept
in many applications, such as stock screening; furthermore, the procedures
that control the FDR have higher power than those based on the Bonferroni
correction, so that we are more likely to discover mispriced stocks.
Let qj denote the p-value of the usual test of αj = 0, j = 1, 2, . . . , m.
To control the FDR at F , instead of comparing each qj to a given threshold
value, as in the Bonferroni method, we use the following procedure. First, order
the p-values and let q(1) , q(2) , . . . , q(m) denote the ordered values, so that q(1) is
the smallest p-value, q(2) is the second smallest, and so on. Then, starting with
j = 1 and moving through the list of p-values, we compare q(j) to (j/m)F .
If q(j) > (j/m)F for all j = 1, 2, . . . , m, then we do not reject any of the
hypotheses. Otherwise, find the largest j for which q(j) ≤ (j/m)F ; denote this
value by j ∗ . Then we reject the hypotheses corresponding to q(1) , q(2) , . . . , q(j ∗ ) .
Although this procedure is a bit complicated, fortunately, there is an R func-
tion that computes the corresponding adjusted p-values that can be compared
to a given threshold in the usual way.
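
The step-up procedure just described can be sketched directly in R. The function name fdr_reject and the p-values below are made up for illustration; the last line checks the decisions against those obtained from the adjusted p-values returned by p.adjust with method="fdr".

```r
# Step-up procedure from the text: returns TRUE for rejected hypotheses.
# `fdr_reject` is a hypothetical helper name.
fdr_reject <- function(q, fdr_level) {
  m <- length(q)
  ord <- order(q)                          # indices that sort the p-values
  below <- q[ord] <= (seq_len(m) / m) * fdr_level
  reject <- rep(FALSE, m)
  if (any(below)) {
    j_star <- max(which(below))            # largest j with q_(j) <= (j/m) F
    reject[ord[seq_len(j_star)]] <- TRUE   # reject the j_star smallest
  }
  reject
}

# Made-up p-values for illustration
q <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.061, 0.074, 0.205, 0.212, 0.360)

# The same decisions follow from comparing adjusted p-values to the level
identical(fdr_reject(q, 0.10), p.adjust(q, method = "fdr") <= 0.10)
```

Note that the third and fourth p-values exceed their own thresholds but are still rejected, because the fifth p-value falls below its threshold; this is what distinguishes the step-up rule from a simple comparison of each p-value with (j/m)F.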
Although the conventional choice for the level of a test is 0.05, that is
not necessarily the best choice for the FDR. For instance, an FDR of 0.10
or larger may be reasonable. In particular, if the tests of αj = 0 are used to
screen stocks for further investigation, a threshold as large as 0.20 may be
appropriate.

Example 8.6 Consider stocks for firms represented in the S&P 100 index
analyzed in Example 8.5; consider testing αj = 0 for these stocks, controlling
the FDR at 0.10.
The p-values for testing αj = 0 for each of the 96 stocks are stored in the
variable sp96.pv. To compute the p-values adjusted for controlling the FDR,
we use the following command:
> sp96.pv.fdr<-p.adjust(sp96.pv, method="fdr")
> head(sp96.pv)
[1] 0.0883 0.2450 0.5338 0.9436 0.1488 0.0397
> head(sp96.pv.fdr)
[1] 0.403 0.523 0.733 0.971 0.468 0.340
The function p.adjust can perform a number of different adjustments; using
the argument method="fdr" specifies the adjustment to control the FDR, as
described earlier.
The minimum adjusted p-value is given by
> min(sp96.pv.fdr)
[1] 0.26
Because this value exceeds 0.10, we do not reject αj = 0 for any of the 96
stocks. If the minimum adjusted p-value had not exceeded 0.10, we would
reject the hypothesis that αj = 0 for those assets with an adjusted p-value less
than or equal to 0.10.

8.5 Decomposition of Risk

When discussing the CAPM, it was shown that the variance of an asset’s
return may be expressed in terms of market and nonmarket components; see
Section 7.3. Here we use the market model to estimate such components of
variance.
Consider asset i for which the market model (8.1) holds. Since $\epsilon_{i,t}$ is
uncorrelated with $R_{m,t}$,
$$\sigma_i^2 \equiv \mathrm{Var}(R_{i,t}) = \mathrm{Var}(\beta_i R_{m,t}) + \mathrm{Var}(\epsilon_{i,t}) = \beta_i^2 \sigma_m^2 + \sigma_{\epsilon,i}^2,$$
where $\sigma_m^2 = \mathrm{Var}(R_{m,t})$ and $\sigma_{\epsilon,i}^2 = \mathrm{Var}(\epsilon_{i,t})$. Therefore, the risk of asset i, as
measured by the variance, may be decomposed into two components, the
market component, $\beta_i^2 \sigma_m^2$, and the nonmarket component, $\sigma_{\epsilon,i}^2$.
Estimates of $\beta_i$ and $\sigma_{\epsilon,i}^2$ are available from the results of the linear regression
analysis used to estimate the market model. To estimate $\sigma_i^2$ and $\sigma_m^2$, we
interpret these as variances of excess returns:
$$\sigma_i^2 = \mathrm{Var}(R_{i,t} - R_{f,t}) \quad \text{and} \quad \sigma_m^2 = \mathrm{Var}(R_{m,t} - R_{f,t}),$$
where $R_{f,t}$ denotes the return on the risk-free asset at time t.
Let $S_i^2$ denote the sample variance of $R_{i,t} - R_{f,t}$, t = 1, 2, . . . , T, let $S_m^2$
denote the sample variance of $R_{m,t} - R_{f,t}$, t = 1, 2, . . . , T, and let $\hat\beta_i$ and $\hat\sigma_{\epsilon,i}^2$
denote the estimators from least-squares regression, as discussed in Section 8.3.
Then,
$$S_i^2 \doteq \hat\beta_i^2 S_m^2 + \hat\sigma_{\epsilon,i}^2; \tag{8.6}$$
note that the relationship does not hold exactly due to the different divisors
used in the estimates: $S_i^2$ and $S_m^2$ use T − 1, while $\hat\sigma_{\epsilon,i}^2$ uses T − 2 to account
for an additional degree of freedom lost when basing the estimator on the
residuals from the regression. It follows that

$$S_i^2 = \hat\beta_i^2 S_m^2 + \frac{T-2}{T-1}\,\hat\sigma_{\epsilon,i}^2;$$

hence, except when T is very small, the relationship described in (8.6) holds
to a high degree of approximation.
The proportion of the variance of $R_{i,t}$ explained by the return on the
market index can be estimated by

$$\frac{\hat\beta_i^2 S_m^2}{S_i^2},$$

which is simply the R-squared value for the regression.
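
Both the exact decomposition and the R-squared identity can be checked numerically. The sketch below uses simulated excess returns; the seed, sample size, and parameter values are arbitrary assumptions, not data from the text.

```r
set.seed(2)                                  # arbitrary seed
n <- 60
x <- rnorm(n, mean = 0.01, sd = 0.04)        # simulated market excess returns
y <- 0.002 + 0.8 * x + rnorm(n, sd = 0.05)   # simulated asset excess returns

fit       <- lm(y ~ x)
beta_hat  <- unname(coef(fit)[2])
sigma_hat <- summary(fit)$sigma              # uses divisor n - 2

market_part    <- beta_hat^2 * var(x)        # var() uses divisor n - 1
nonmarket_part <- (n - 2) / (n - 1) * sigma_hat^2

# Exact decomposition of the sample variance of y
all.equal(var(y), market_part + nonmarket_part)

# Proportion explained by the market equals R-squared
all.equal(market_part / var(y), summary(fit)$r.squared)
```

Both comparisons should return TRUE up to floating-point tolerance, since they are algebraic identities of least-squares regression with an intercept.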

Example 8.7 Consider the market model for the returns on IBM stock; recall
that the output from the linear regression analysis corresponding to the market
model for IBM stock is stored in the variable ibm.mm. The R-squared value
for the regression is available using the function summary.

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.000707   0.005358   -0.13      0.9
sp500        0.618789   0.138073    4.48  3.5e-05 ***

Residual standard error: 0.0398 on 58 degrees of freedom

Multiple R-squared: 0.257, Adjusted R-squared: 0.244
F-statistic: 20.1 on 1 and 58 DF, p-value: 3.54e-05

Therefore, 25.7% of the variability in IBM excess returns is attributable
to the market. This result may be used to decompose the sample variance of
IBM excess returns:

0.00210 = (0.257)(0.00210) + (1 − 0.257)(0.00210) = 0.00054 + 0.00156.

The R-squared value may also be obtained directly by accessing the
$r.squared component of the result of summary:

> summary(ibm.mm)$r.squared
[1] 0.257

The apply function can be used to calculate the R-squared for all eight of the
stocks represented in big8. Define a function f.rsq by
> f.rsq<-function(y){summary(lm(y~sp500))$r.squared}
Then the R-squared values may be calculated by
> apply(big8, 2, f.rsq)
AAPL BAX KO CVS XOM IBM JNJ DIS
0.219 0.234 0.196 0.485 0.516 0.257 0.276 0.599
Thus, for the eight stocks with return data in the big8 variable, the R-squared
values range from 0.196 for Coca-Cola to 0.599 for Disney. Therefore, nearly
60% of the variation in the returns on Disney stock, as measured by the
return variance, can be explained by variation in the market return; on the
other hand, for Coca-Cola stock, less than 20% of the variation in the returns
can be explained by the market.

8.6 Shrinkage Estimation and Adjusted Beta

Often, we are interested in estimating β for several assets. For instance, sup-
pose we are analyzing N stocks and let β1 , β2 , . . . , βN denote their respective
parameters in the market model (8.1). In such cases, we may use shrinkage
estimation, following the general approach described in Section 6.5.
Thus, in shrinkage estimation, we combine a simple estimator of βi ,
such as the least-squares estimator, with an estimator based on assumptions
regarding the parameters β1 , β2 , . . . , βN , by taking a weighted average of the
two estimators.
Let β̂1 , β̂2 , . . . , β̂N denote the least-squares estimators of β1 , β2 , . . . , βN ,
respectively. For the assumption-based estimators, we may consider the
assumption that β1 = β2 = · · · = βN ≡ β for some β. To estimate β, we may
use the average of the least-squares estimators:

$$\bar\beta = \frac{1}{N}\sum_{i=1}^{N} \hat\beta_i.$$

Then a shrinkage estimator of βi is given by a weighted average of β̂i and β̄.

When β̂1 , β̂2 , . . . , β̂N are approximately equal, we give more weight to β̄.
When β̂1 , β̂2 , . . . , β̂N do not follow the assumption of equal beta, in the sense
that there is a great deal of variability in β̂1 , β̂2 , . . . , β̂N , then more weight is
given to β̂i .
In order to choose the weights given to β̂i and β̄, we adapt the procedure
used in Section 6.5 when estimating mean returns. Let SE(β̂i ) denote the
standard error of β̂i , as given in the output of the lm function, and let

$$\overline{\mathrm{SE}^2} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{SE}(\hat\beta_i)^2.$$

Then a shrinkage estimator of βi is given by

$$\psi\bar\beta + (1 - \psi)\hat\beta_i$$

where
$$\psi = \frac{\overline{\mathrm{SE}^2}}{\overline{\mathrm{SE}^2} + \tau_\beta^2}$$
and
$$\tau_\beta^2 = \frac{1}{N}\sum_{i=1}^{N} (\hat\beta_i - \bar\beta)^2.$$

Example 8.8 One use for shrinkage estimation is in estimating the values of β
for a number of assets for which it is reasonable to expect similar relationships
with the market index. For example, here we consider the four airline stocks,
American Airlines Group, Inc. (symbol AAL), Delta Air Lines, Inc. (DAL),
Southwest Airlines Co. (LUV), and United Continental Holdings, Inc.
(UAL).
Five years of monthly returns for the period ending December 31, 2014,
were computed for these stocks. The results are stored in variables with
the name matching the stock symbol; for example, the returns on Ameri-
can Airlines stock are stored in the variable aal. The returns for all the four
stocks are stored as a matrix in the variable air, which is similar to the vari-
able big8. Estimates of beta for each of the four stocks may be computed as
follows:

> air.mm<-lm(air~sp500)
> air.beta<-air.mm$coefficients[2,]
> air.beta
AAL DAL LUV UAL
0.610 0.825 1.016 0.679

Therefore, β̄ and τ2β may be calculated by

> beta.bar<-mean(air.beta)
> beta.bar
[1] 0.782
> tausq.beta<-mean((air.beta-beta.bar)^2)
> tausq.beta
[1] 0.0242

To compute SE(β̂i ) for each stock, we may use the apply function. Note
that the [2, 2] element of the component $coefficients of summary applied
to the output from the lm function yields the standard error of β̂.
Define a function f.betase by

> f.betase<-function(y){ summary(lm(y~sp500))$coefficients[2,2]}

Then the vector of standard errors of β̂i can be computed by

> air.betase<-apply(air, 2, f.betase)
> air.betase
AAL DAL LUV UAL
0.5628 0.3405 0.2400 0.3879
and $\overline{\mathrm{SE}^2}$ is given by
> sesq.bar<-mean(air.betase^2)
> sesq.bar
[1] 0.160
It follows that the weight ψ used in the shrinkage estimator is given by
> air.psi<-sesq.bar/(sesq.bar+tausq.beta)
> air.psi
[1] 0.869
and the shrinkage estimates of beta for the four airline stocks are given by
> air.psi*beta.bar + (1-air.psi)*air.beta
AAL DAL LUV UAL
0.760 0.788 0.813 0.769
Recall that the least-squares estimates are given by
AAL DAL LUV UAL
0.610 0.825 1.016 0.679
Note that the variation of β̂1 , β̂2 , β̂3 , β̂4 is small relative to the standard
errors of the β̂j so that the shrinkage estimates are all relatively close to the
average of the β̂j .

An important feature of the shrinkage estimator described previously is
that the same weight ψ is used for each asset. However, in many cases, the
standard error of β̂i varies considerably for the different assets. For assets for
which beta is estimated accurately, in the sense that SE(β̂i ) is relatively small,
we may want to give more weight to β̂i ; on the other hand, for assets for which
SE(β̂i ) is relatively large, it may be preferable to give relatively little weight
to β̂i .
Thus, it may be desirable to use asset-specific values of ψ, ψ1 , ψ2 , . . . , ψN ,
particularly when SE(β̂i ), i = 1, 2, . . . , N exhibit large variation. Let

$$\psi_i = \frac{\mathrm{SE}(\hat\beta_i)^2}{\mathrm{SE}(\hat\beta_i)^2 + \tau_\beta^2}.$$

Then an alternative shrinkage estimator of βi is given by

ψi β̄ + (1 − ψi )β̂i .

Example 8.9 Consider the four airline stocks analyzed in Example 8.8.
Recall that the excess return data for the four stocks are stored in the vari-
ables aal, dal, luv, and ual; the data matrix for these four assets is stored in
the variable air. The variable sp500 contains the excess returns on the S&P
500 index.
Consider estimation of β for American Airlines stock. Using the results of
the lm function applied to the market model for American Airlines,
> summary(lm(aal~sp500))
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.0459     0.0218    2.10     0.04 *
sp500         0.6095     0.5628    1.08     0.28

Residual standard error: 0.162 on 58 degrees of freedom

Multiple R-squared: 0.0198, Adjusted R-squared: 0.00292
F-statistic: 1.17 on 1 and 58 DF, p-value: 0.283
we have that SE(β̂i ) = 0.5628; recall that here $\tau_\beta^2$ is given by 0.0242. Hence,
the weight for AAL is
$$\frac{(0.5628)^2}{(0.5628)^2 + 0.0242} = 0.9290.$$
It follows that the shrinkage estimate of beta for AAL is given by

(0.929)β̄ + (1 − 0.929)(0.6095) = (0.929)(0.7822) + (1 − 0.929)(0.6095) = 0.7699.

To calculate the shrinkage estimates of β for all four stocks, recall that the
variable air.betase contains the standard errors of β̂i for the four stocks
> air.betase
AAL DAL LUV UAL
0.5628 0.3405 0.2400 0.3879
The vector of weights used in this procedure along with the vector of
shrinkage estimates of β for all four stocks may now be computed as follows:
> psi.air<-(air.betase^2)/(tausq.beta + air.betase^2)
> psi.air
AAL DAL LUV UAL
0.929 0.827 0.704 0.861
> psi.air*beta.bar + (1-psi.air)*air.beta
AAL DAL LUV UAL
0.770 0.790 0.851 0.768
Recall that the shrinkage estimates based on a global value for ψ are given by
AAL DAL LUV UAL
0.760 0.788 0.813 0.769

The two sets of estimates are very similar. The greatest difference occurs
for LUV; note that LUV has the largest value of β̂i and the smallest value of
the standard error of β̂i .
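
The two shrinkage estimators of Examples 8.8 and 8.9 can be combined in a single function. This is a sketch only: the name shrink_beta and its global argument are assumptions, and the inputs are the rounded estimates and standard errors printed in the examples, so the results agree with the values in the text only up to rounding.

```r
# `shrink_beta` is a hypothetical helper. With global = TRUE it uses one
# weight psi for all assets; otherwise it uses asset-specific weights psi_i.
shrink_beta <- function(beta_hat, se, global = TRUE) {
  beta_bar <- mean(beta_hat)
  tausq    <- mean((beta_hat - beta_bar)^2)
  se_sq    <- if (global) mean(se^2) else se^2
  psi      <- se_sq / (se_sq + tausq)
  psi * beta_bar + (1 - psi) * beta_hat
}

# Rounded estimates and standard errors reported in Examples 8.8 and 8.9
air_beta <- c(AAL = 0.610, DAL = 0.825, LUV = 1.016, UAL = 0.679)
air_se   <- c(AAL = 0.5628, DAL = 0.3405, LUV = 0.2400, UAL = 0.3879)

round(shrink_beta(air_beta, air_se, global = TRUE), 3)
round(shrink_beta(air_beta, air_se, global = FALSE), 3)
```

Writing the estimator this way makes the only difference between the two versions explicit: whether the squared standard errors are averaged before the weight is formed.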

Adjusted Beta
When estimating β for a large number of assets, a simpler type of shrinkage
estimator is sometimes used. Since the value of beta for the entire market is,
by definition, equal to 1, it is often reasonable to use 1 in place of β̄. Using
global values for the weights given to 1 and β̂i leads to an estimator of the
form
$$(1 - k) + k\hat\beta_i \tag{8.7}$$
for some constant k. Often k = 2/3 is used, yielding the estimator of βi
given by
$$\hat\beta_{i,\mathrm{adj}} = \frac{1}{3} + \frac{2}{3}\hat\beta_i, \tag{8.8}$$
which is known as adjusted beta. It is sometimes attributed to analysts at
the brokerage firm Merrill Lynch, Pierce, Fenner & Smith, Inc. (Vasicek
1973); this type of adjusted beta is used most notably by the financial data
firm Bloomberg L. P. (www.bloomberg.com) so it is sometimes referred to as
“Bloomberg adjusted beta.”
This estimator has the advantage of requiring only β̂i in order to estimate
βi . However, it has the drawbacks of always shrinking the estimates toward
1 and of always using the weights 1/3 and 2/3; these choices may not be
appropriate in all cases.
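Adjusted beta is simple to compute. The following sketch applies (8.7) with the default k = 2/3 to three of the least-squares estimates reported for the big8 stocks; the function name adjust_beta is an assumption.

```r
# Adjusted beta with shrinkage toward 1; k = 2/3 gives equation (8.8).
# `adjust_beta` is a hypothetical helper name.
adjust_beta <- function(beta_hat, k = 2 / 3) {
  (1 - k) + k * beta_hat
}

# Least-squares estimates of beta for three of the stocks in big8
beta_hat <- c(KO = 0.486, IBM = 0.619, DIS = 1.193)
round(adjust_beta(beta_hat), 3)
```

Estimates below 1 are pulled up and estimates above 1 are pulled down, while an estimate of exactly 1 is left unchanged.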
Example 8.10 Stocks for firms represented in the S&P 100 index were con-
sidered; these data were also analyzed in Example 8.5. For each of the 96
stocks, five years of monthly returns were analyzed for the period ending
December 31, 2014.
For each stock, the least-squares estimate β̂i was calculated along with the
adjusted beta for stock i, β̂i,adj . To measure the accuracy in these estimates as
predictions of future beta values, they were compared with the least-squares
estimates of β based on the 12 monthly returns in 2015, which we denote by
β̂∗i , i = 1, 2, . . . , 96.
The average error in the least-squares estimates is given by

$$\frac{1}{96}\sum_{i=1}^{96} |\hat\beta_i - \hat\beta_i^*| = 0.369;$$

for adjusted beta, the average error is

$$\frac{1}{96}\sum_{i=1}^{96} |\hat\beta_{i,\mathrm{adj}} - \hat\beta_i^*| = 0.321.$$

Thus, use of adjusted beta reduces the error in predicting the estimates of
β for 2015 by about 13%.

For comparison, the shrinkage estimates of the βi were also calculated.
The average error in these estimates in predicting β̂∗i is 0.347 using a global
value of ψ and 0.344 using asset-specific values of ψ. Thus, at least for this
example, adjusted beta appears to be at least as successful as the shrinkage
estimators in predicting future beta values.
It is worth noting that none of the estimators are particularly accurate in
predicting future beta estimates. In evaluating these results, it is important
to keep in mind that the future beta values used for comparison are estimates,
with their own sampling variability, in addition to the sampling variability of
the least-squares, adjusted, and shrinkage estimates of beta.

8.7 Applying the Market Model to Portfolios

Although the discussion in this chapter has focused on the analysis of
individual securities, the same approach may be applied to a portfolio.
Consider a portfolio based on N assets, with returns R1,t , R2,t , . . . , RN,t in
period t. Let w1 , w2 , . . . , wN denote the portfolio weights so that the return
on the portfolio in period t is given by

$$R_{p,t} = \sum_{i=1}^{N} w_i R_{i,t}, \quad t = 1, 2, \ldots, T.$$

Suppose that the market model holds for each asset so that, for each
i = 1, 2, . . . , N ,
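$$R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T, \tag{8.9}$$
where $\epsilon_{i,t}$ has mean 0 and is uncorrelated with the market return $R_{m,t}$. Then

$$\begin{aligned}
R_{p,t} - R_{f,t} &= \sum_{i=1}^{N} w_i (R_{i,t} - R_{f,t}) \\
&= \sum_{i=1}^{N} w_i \left(\alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}\right) \\
&= \sum_{i=1}^{N} w_i \alpha_i + \left(\sum_{i=1}^{N} w_i \beta_i\right)(R_{m,t} - R_{f,t}) + \sum_{i=1}^{N} w_i \epsilon_{i,t}, \quad t = 1, 2, \ldots, T.
\end{aligned}$$

Note that

$$E\left(\sum_{i=1}^{N} w_i \epsilon_{i,t}\right) = \sum_{i=1}^{N} w_i E(\epsilon_{i,t}) = 0$$
and, because $\mathrm{Cov}(\epsilon_{i,t}, R_{m,t}) = 0$ for all i = 1, 2, . . . , N,
$$\mathrm{Cov}\left(\sum_{i=1}^{N} w_i \epsilon_{i,t},\, R_{m,t}\right) = \sum_{i=1}^{N} w_i \mathrm{Cov}(\epsilon_{i,t}, R_{m,t}) = 0, \quad t = 1, 2, \ldots, T.$$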

It follows that the market model holds for the portfolio with parameters

$$\alpha_p = \sum_{i=1}^{N} w_i \alpha_i \quad \text{and} \quad \beta_p = \sum_{i=1}^{N} w_i \beta_i.$$

Furthermore, the least-squares estimators of αp and βp follow these
relationships as well. That is, if α̂i and β̂i are the least-squares estimators of αi
and βi , then α̂p and β̂p , the least-squares estimators of αp and βp , respectively,
are given by

$$\hat\alpha_p = \sum_{i=1}^{N} w_i \hat\alpha_i \quad \text{and} \quad \hat\beta_p = \sum_{i=1}^{N} w_i \hat\beta_i.$$

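To see why this holds, consider the N = 2 case. For j = 1, 2, let $Y_{j,t} = R_{j,t} - R_{f,t}$,
and let $X_t = R_{m,t} - R_{f,t}$. Then, as discussed in Section 8.3, for
j = 1, 2,
$$\hat\beta_j = \frac{\sum_{t=1}^{T} (Y_{j,t} - \bar Y_j)(X_t - \bar X)}{\sum_{t=1}^{T} (X_t - \bar X)^2}$$
where $\bar Y_j$ is the sample mean of the $Y_{j,t}$.

Now consider a portfolio with return $R_{p,t} = wR_{1,t} + (1 - w)R_{2,t}$ at time t.
The least-squares estimator of beta for the portfolio may be written
$$\hat\beta_p = \frac{\sum_{t=1}^{T} (Y_{p,t} - \bar Y_p)(X_t - \bar X)}{\sum_{t=1}^{T} (X_t - \bar X)^2}$$
where $Y_{p,t} = R_{p,t} - R_{f,t}$ and $\bar Y_p$ is the sample mean of the $Y_{p,t}$. Because

$$Y_{p,t} = wY_{1,t} + (1 - w)Y_{2,t}$$

it follows that
$$\begin{aligned}
\hat\beta_p &= \frac{\sum_{t=1}^{T} \left(wY_{1,t} + (1 - w)Y_{2,t} - w\bar Y_1 - (1 - w)\bar Y_2\right)(X_t - \bar X)}{\sum_{t=1}^{T} (X_t - \bar X)^2} \\
&= w\,\frac{\sum_{t=1}^{T} (Y_{1,t} - \bar Y_1)(X_t - \bar X)}{\sum_{t=1}^{T} (X_t - \bar X)^2} + (1 - w)\,\frac{\sum_{t=1}^{T} (Y_{2,t} - \bar Y_2)(X_t - \bar X)}{\sum_{t=1}^{T} (X_t - \bar X)^2} \\
&= w\hat\beta_1 + (1 - w)\hat\beta_2.
\end{aligned}$$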

The argument for α̂p is similar.
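
The linearity of the least-squares estimators in the portfolio weights can be checked numerically. The following sketch uses simulated excess returns rather than the big8 data; the variable names (x, Y, w) and all parameter values are assumptions chosen only for illustration.

```r
# Simulated check that market-model estimates are linear in the weights.
# All data are simulated; x, Y, and w are hypothetical names.
set.seed(1)
T.len <- 60                        # number of time periods
N <- 4                             # number of assets
x <- rnorm(T.len, 0.01, 0.04)      # excess market returns
alpha <- c(0.002, 0, -0.001, 0.004)
beta  <- c(0.6, 0.9, 1.1, 1.4)

# Excess asset returns, one column per asset
Y <- sapply(1:N, function(i) alpha[i] + beta[i] * x + rnorm(T.len, 0, 0.03))
w <- c(0.1, 0.2, 0.3, 0.4)         # fixed weights summing to 1

coef.assets <- coef(lm(Y ~ x))                    # 2 x N matrix of estimates
coef.port   <- coef(lm(as.vector(Y %*% w) ~ x))   # direct portfolio regression
coef.comb   <- as.vector(coef.assets %*% w)       # weighted combination

print(rbind(direct = coef.port, combined = coef.comb))
```

The two rows should agree up to floating-point error for any fixed weight vector, not only for equal weights.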

Example 8.11 Consider the eight stocks with returns stored in the variable
big8. Recall that the results from the market model regression on all
eight stocks are stored in the variable big8.mm. The estimated regression
coefficients from the eight regression analyses are available in the component
coefficients of big8.mm:
> big8.mm$coefficients
AAPL BAX KO CVS XOM IBM JNJ
(Intercept) 0.015 -0.00035 0.0045 0.0096 -0.0013 -0.00071 0.0056
sp500 0.920 0.71644 0.4857 1.0719 0.8787 0.61879 0.5405
DIS
(Intercept) 0.0078
sp500 1.1932
These estimates form a 2 × 8 matrix; therefore, the second row of this
matrix contains the eight estimates of β:
> big8.mm$coefficients[2,]
AAPL BAX KO CVS XOM IBM JNJ DIS
0.9203 0.7164 0.4857 1.0719 0.8787 0.6188 0.5405 1.1932
Consider the equally weighted portfolio of the eight stocks; the returns on
this portfolio may be calculated as
> big8.port<-apply(big8, 1, mean)
yielding a vector consisting of the average return in each time period.
Alternatively, we can perform the calculation using matrix multiplication
> big8.port<-big8%*%rep(1/8, 8)
Here, rep(1/8, 8) is a vector of length 8 of the form (1/8, 1/8, . . . , 1/8).
The estimates of αp and βp may be calculated directly from the returns on
the portfolio.
> lm(big8.port~sp500)
Coefficients:
(Intercept) sp500
0.00506 0.80320
These estimates can also be obtained as the averages of the coefficient
estimates from the analyses on the eight individual stocks.
> apply(big8.mm$coefficients, 1, mean)
(Intercept) sp500
0.00506 0.80320

Time-Dependent Portfolio Weights

Note that the analysis thus far in this section is based on the assumption that
the portfolio weights do not depend on t. However, in some cases, the portfolio
weights may change over time; for instance, the holdings in a mutual fund may
be modified to account for changing economic conditions or changing beliefs
about the future returns of certain stocks.
Let w1,t , w2,t , . . . , wN,t denote the weights at time t so that

$$\sum_{i=1}^{N} w_{i,t} = 1 \quad \text{for } t = 1, 2, \ldots, T.$$

If the market model holds for each asset, that is, if (8.9) holds, then, for each
t = 1, 2, . . . , T ,

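$$\begin{aligned}
R_{p,t} - R_{f,t} &= \sum_{i=1}^{N} w_{i,t} (R_{i,t} - R_{f,t}) \\
&= \sum_{i=1}^{N} w_{i,t} \alpha_i + \left(\sum_{i=1}^{N} w_{i,t} \beta_i\right)(R_{m,t} - R_{f,t}) + \sum_{i=1}^{N} w_{i,t} \epsilon_{i,t} \\
&= \alpha_{p,t} + \beta_{p,t} (R_{m,t} - R_{f,t}) + \epsilon_{p,t}
\end{aligned}$$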

where

$$\alpha_{p,t} = \sum_{i=1}^{N} w_{i,t} \alpha_i \quad \text{and} \quad \beta_{p,t} = \sum_{i=1}^{N} w_{i,t} \beta_i.$$

Note that the conditions that $E(\epsilon_{p,t}) = 0$ and $\mathrm{Cov}(\epsilon_{p,t}, R_{m,t}) = 0$ for each
t = 1, 2, . . . , T continue to hold so that the market model holds for the portfolio
in each specific time period; however, the values of α and β for the portfolio
now depend on t.
If the weights are approximately constant over time, for example, if the
portfolio corresponds to a specific investment strategy with periodic minor
adjustments, then it may be reasonable to assume that αp,t and βp,t are
approximately constant over time and, hence, the market model is appropriate
for the portfolio. On the other hand, if major changes are made regularly to
the portfolio so that, in effect, there is a different portfolio in each time period,
then the market model assumption of constant αp and βp is inappropriate.
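
To see how much the portfolio beta can move when the weights change, consider the following small sketch. The two asset betas and the weight path are hypothetical values chosen for illustration; they do not come from the data used in this chapter.

```r
# Portfolio beta under time-dependent weights (hypothetical values).
beta  <- c(0.5, 1.5)                       # betas of two assets
T.len <- 12                                # number of periods
w1 <- seq(0.9, 0.1, length.out = T.len)    # weight on asset 1 in each period
W  <- cbind(w1, 1 - w1)                    # T x 2 weight matrix; rows sum to 1

# beta_{p,t} = sum_i w_{i,t} * beta_i, one value per period
beta.pt <- as.vector(W %*% beta)
round(beta.pt, 3)
```

Here the portfolio beta drifts from 0.6 to 1.4 as the holdings shift from the first asset to the second, so a single constant βp would describe the portfolio poorly.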

8.8 Diversification and the Market Model

In Chapter 4, we saw that it is generally possible to reduce the risk of a
portfolio by diversification. In this section, we look at the implications of the
market model on diversification.
Consider two assets, with returns $R_{i,t}$ and $R_{j,t}$, respectively, at time t. Let
$\sigma_i^2 = \mathrm{Var}(R_{i,t})$, $\sigma_j^2 = \mathrm{Var}(R_{j,t})$, and let $\rho_{ij}$ denote the correlation of $R_{i,t}$ and $R_{j,t}$.
Suppose that the market model holds for each asset so that

$$R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}, \qquad t = 1, 2, \ldots, T$$

and
$$R_{j,t} - R_{f,t} = \alpha_j + \beta_j (R_{m,t} - R_{f,t}) + \epsilon_{j,t}, \qquad t = 1, 2, \ldots, T.$$
Now consider a portfolio of assets i and j, with return of the form

Rp,t = wRi,t + (1 − w)Rj,t ,

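where 0 < w < 1. Let $\sigma_p^2 = \mathrm{Var}(R_{p,t})$; then

$$\sigma_p^2 = w^2 \sigma_i^2 + (1 - w)^2 \sigma_j^2 + 2w(1 - w)\rho_{ij} \sigma_i \sigma_j.$$

Suppose, for simplicity, that $\sigma_i^2 = \sigma_j^2$. Then

$$\begin{aligned}
\sigma_p^2 &= \left(w^2 + (1 - w)^2 + 2w(1 - w)\rho_{ij}\right) \sigma_i^2 \\
&= \left(1 - 2(1 - \rho_{ij})w(1 - w)\right) \sigma_j^2.
\end{aligned}$$

Because w(1 − w) > 0 for 0 < w < 1 and $\rho_{ij} < 1$, it follows that $\sigma_p^2 < \sigma_j^2$ for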
any 0 < w < 1; see Section 4.2 for further details. That is, for any 0 < w < 1,
the risk of the portfolio is less than that of either of the two assets used to
form it. However, diversification has very different effects on the market and
nonmarket components of risk.
As discussed in the previous section, the market model holds for the
portfolio:

$$R_{p,t} - R_{f,t} = \alpha_p + \beta_p (R_{m,t} - R_{f,t}) + \epsilon_{p,t}, \qquad t = 1, 2, \ldots, T$$

where
αp = wαi + (1 − w)αj and βp = wβi + (1 − w)βj .
Then the market component of the variance of the portfolio return is given
by $\beta_p^2 \sigma_m^2$.
Suppose that, without loss of generality, $\beta_i \le \beta_j$. Then

$$\beta_i \le \beta_p \le \beta_j.$$

Therefore, provided that $\beta_i > 0$, as is typically the case,

$$\beta_i^2 \sigma_m^2 \le \beta_p^2 \sigma_m^2 \le \beta_j^2 \sigma_m^2$$

so that the market component of variance for the portfolio return lies between
the market components of return variance for the two assets. Thus, the market
component of return variance for the portfolio cannot be reduced below the
smaller of the market components of return variance for the two assets.
In particular, if $\beta_i = \beta_j$ then $\beta_p = \beta_i$ and, hence, the market component of
the variance of $R_{p,t}$ is $\beta_i^2 \sigma_m^2$, the same as the market components of variance
for returns on each of assets i and j. That is, in this case, diversification
does not reduce the market component of risk. Therefore, the reduction in
the variance of the portfolio return, as compared to the return variances of
assets i and j, is entirely because of a reduction in the nonmarket component
of variance.
Example 8.12 Consider two assets, asset i and asset j; assume that $\sigma_i = \sigma_j = 0.5$,
$\beta_i = \beta_j = 0.8$, and $\rho_{ij}$, the correlation of the returns of the two
assets, is 0.2. Suppose that the market portfolio has return standard deviation
of 0.4. Then the equally weighted portfolio of these assets has return variance

$$\left(\frac{1}{2}\right)^2 \sigma_i^2 + \left(\frac{1}{2}\right)^2 \sigma_j^2 + 2\,\frac{1}{2}\,\frac{1}{2}\,\rho_{ij} \sigma_i \sigma_j = 0.15.$$

That is, the risk of 0.5 for each asset is reduced to $\sqrt{0.15} \doteq 0.39$ for the
portfolio.
The market component of the variance of each of the two asset returns is

$$(0.8)^2 (0.4)^2 = 0.1024,$$

and for each asset, the nonmarket component of the return variance is

0.25 − 0.1024 = 0.1476.

The equally weighted portfolio has βp = 0.8; hence, its market component
of return variance is also 0.1024. It follows that the nonmarket component of
the return variance for the portfolio is

0.15 − 0.1024 = 0.0476.

Thus, each asset has a return variance of 0.25, with a market component of
0.1024. The return variance of the portfolio is 0.15, but the market component
of that variance is the same as the market component of the return variance
for the two assets, 0.1024. The nonmarket component of return variance for
each of the two assets is 0.1476, while the nonmarket component of return
variance for the portfolio is only 0.0476.
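
The arithmetic in this example can be reproduced directly in R. The following sketch uses only the values stated in the example; the variable names are our own and are not used elsewhere in the text.

```r
# Decomposition of return variance for the two-asset example.
sigma.i <- 0.5; sigma.j <- 0.5     # return standard deviations
beta.i  <- 0.8; beta.j  <- 0.8     # market-model betas
rho.ij  <- 0.2                     # correlation of the two returns
sigma.m <- 0.4                     # market return standard deviation
w <- 0.5                           # equal weights

var.p <- w^2 * sigma.i^2 + (1 - w)^2 * sigma.j^2 +
  2 * w * (1 - w) * rho.ij * sigma.i * sigma.j
beta.p   <- w * beta.i + (1 - w) * beta.j
mkt.p    <- beta.p^2 * sigma.m^2             # market component, portfolio
nonmkt.p <- var.p - mkt.p                    # nonmarket component, portfolio
nonmkt.i <- sigma.i^2 - beta.i^2 * sigma.m^2 # nonmarket component, each asset

c(var.p = var.p, sd.p = sqrt(var.p), market = mkt.p,
  nonmarket.p = nonmkt.p, nonmarket.i = nonmkt.i)
```

Changing rho.ij shows that only the nonmarket component responds to the correlation between the assets when the betas are equal.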

Recall that, according to the CAPM, it is the market component of risk
that is rewarded by a higher expected return, as discussed in Section 7.3. Thus,
not only does diversification tend to reduce risk, the reduction is greater for
the nonmarket component of risk, the component of risk that is not rewarded
with a higher expected return.

Portfolios of Several Assets

Similar considerations apply to a portfolio of N assets. Let R1,t , R2,t , . . . , RN,t
denote the asset returns at time t and suppose that the market model holds
for each asset, that is, for each i = 1, 2, . . . , N ,

$$R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}, \qquad t = 1, 2, \ldots, T.$$

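Let

$$\alpha = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_N \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{pmatrix}, \quad \text{and} \quad
\epsilon_t = \begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \\ \vdots \\ \epsilon_{N,t} \end{pmatrix}$$

denote N × 1 vectors. Consider a portfolio based on a weight vector $w \in \mathbb{R}^N$,
with return $R_{p,t}$ at time t. Then

$$R_{p,t} - R_{f,t} = \alpha_p + \beta_p (R_{m,t} - R_{f,t}) + \epsilon_{p,t}$$

where $\alpha_p = w^T \alpha$, $\beta_p = w^T \beta$, and $\epsilon_{p,t} = w^T \epsilon_t$.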

Let $\Sigma_\epsilon$ denote the covariance matrix of $\epsilon_t$. Then the market component of
$\mathrm{Var}(R_{p,t})$ is $\beta_p^2 \sigma_m^2$ and the nonmarket component is

$$\sigma_{\epsilon,p}^2 \equiv \mathrm{Var}(\epsilon_{p,t}) = \mathrm{Var}(w^T \epsilon_t) = w^T \Sigma_\epsilon w.$$

Because of the benefits of diversification, the variance of the return on the
portfolio tends to be small relative to the variances of the returns on the
individual assets; as in the two-asset case, this is generally because of a reduction
in the nonmarket components of return variance.

Example 8.13 Consider the equally weighted portfolio of eight stocks ana-
lyzed in Example 8.11, with returns in big8. For the eight individual stocks,
the standard deviations are given by

> apply(big8, 2, sd)

AAPL BAX KO CVS XOM IBM JNJ DIS
0.0739 0.0556 0.0412 0.0578 0.0459 0.0458 0.0386 0.0579

Now consider estimation of the nonmarket component of risk for each of
the stocks. Note that $\hat{\sigma}_{\epsilon,i}$ for asset i is available from the results of the lm
function by extracting the sigma component of the results from the summary
function; for example,

> summary(lm(ibm~sp500))$sigma
[1] 0.0398

Define a function f.sighat by

> f.sighat<-function(y){summary(lm(y~sp500))$sigma}

Note that

> f.sighat(ibm)
[1] 0.0398

The nonmarket components of risk (that is, the standard deviations
corresponding to the nonmarket components of variance) for the eight stocks
may now be calculated using the apply function in the usual way as follows:

> apply(big8, 2, f.sighat)

AAPL BAX KO CVS XOM IBM JNJ DIS
0.0659 0.0490 0.0372 0.0418 0.0322 0.0398 0.0331 0.0370

Now consider the equally weighted portfolio of the stocks represented in
big8; recall that the returns for this portfolio are stored in the variable
big8.port and that the estimate of β for the portfolio is β̂p = 0.803. The
total risk and the nonmarket risk for this portfolio may be calculated using
the functions sd and f.sighat, respectively.

> sd(big8.port)
[1] 0.0340
> f.sighat(big8.port)
[1] 0.0158

Note that the standard deviation of the portfolio return is about two-thirds as
large as the average return standard deviation for the eight stocks, given by

> mean(apply(big8, 2, sd))

[1] 0.0521

However, the standard deviation corresponding to the nonmarket component
of variance for the portfolio is only about one-third as large as the
average nonmarket component of return standard deviation for the eight stocks,
given by

> mean(apply(big8, 2, f.sighat))

[1] 0.042
Recall that, for the observed returns on the S&P 500 index, $S_m^2 = 0.00141$;
it follows that the market component of risk for the “big8” portfolio is

$$\hat{\beta}_p S_m = (0.803)\sqrt{0.00141} = 0.0302.$$

This value is similar to the market components of risk for the eight individual
stocks:

> big8.mm$coefficients[2,]*sd(sp500)
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0346 0.0269 0.0182 0.0403 0.0330 0.0232 0.0203 0.0448

Therefore, the total risk of the portfolio, 0.0340, is smaller than the
total risks of the individual stocks, which range from 0.0386 to 0.0739; this
difference is attributable primarily to a decrease in the nonmarket risk.

Because the total variance of the portfolio return consists of the market
component, which is similar to the market components of return variance for
the individual assets in the portfolio, and the nonmarket component, which
tends to be much less than the individual nonmarket components of return
variance, the proportion of return variance explained by the market return
tends to be higher for the portfolio than for the individual assets. That is,
R-squared for the portfolio tends to be larger than R-squared for the individual
stocks.
Example 8.14 Consider the returns for the eight stocks stored in the variable
big8 and analyzed in the previous example. Recall that the R-squared values
for these stocks are given by
> apply(big8, 2, f.rsq)
AAPL BAX KO CVS XOM IBM JNJ DIS
0.219 0.234 0.196 0.485 0.516 0.257 0.276 0.599
The R-squared value for the equally weighted portfolio of the eight stocks
may be calculated using f.rsq as well:
> f.rsq(big8.port)
[1] 0.787
Therefore, R-squared for the portfolio is considerably larger than R-squared
for the individual stocks.
Note that a relatively large value of R-squared for a portfolio indicates
that it is well diversified, in the sense that most of its risk is because of its
relationship with the market portfolio which, by definition, is diversified.
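
The pattern seen in Examples 8.13 and 8.14 can be explored without the big8 data. The following sketch simulates eight assets from a market model with uncorrelated residual returns; all parameter values are assumptions, and the form of f.rsq shown here (extracting r.squared from the summary of lm) is our assumption about how such a function might be written, with the simulated market return mkt playing the role of sp500.

```r
# Simulated illustration of diversification under the market model.
set.seed(2)
T.len <- 60; N <- 8
mkt  <- rnorm(T.len, 0.01, 0.04)            # simulated excess market returns
beta <- seq(0.5, 1.2, length.out = N)       # hypothetical betas

# Excess returns with uncorrelated residuals (standard deviation 0.04)
ret  <- sapply(beta, function(b) b * mkt + rnorm(T.len, 0, 0.04))
port <- rowMeans(ret)                       # equally weighted portfolio

f.sighat <- function(y) summary(lm(y ~ mkt))$sigma      # nonmarket risk
f.rsq    <- function(y) summary(lm(y ~ mkt))$r.squared  # assumed form of f.rsq

c(mean.sd = mean(apply(ret, 2, sd)), port.sd = sd(port))
c(mean.sighat = mean(apply(ret, 2, f.sighat)), port.sighat = f.sighat(port))
c(mean.rsq = mean(apply(ret, 2, f.rsq)), port.rsq = f.rsq(port))
```

Because the portfolio residuals are the averages of the asset residuals, the portfolio values of sd and f.sighat cannot exceed the corresponding averages over the assets; how much R-squared increases depends on the simulated values.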

Some Further Results on Portfolio Risk

The properties of the portfolios in Examples 8.13 and 8.14 hold in general,
at least to some degree. As noted previously, let R1,t , R2,t , . . . , RN,t denote
the asset returns at time t and suppose that the market model holds for each
asset so that
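$$R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}, \qquad t = 1, 2, \ldots, T$$
for i = 1, 2, . . . , N . Let $\epsilon_t$ denote the vector $(\epsilon_{1,t}, \epsilon_{2,t}, \ldots, \epsilon_{N,t})^T$ and let $\Sigma_\epsilon$
denote the covariance matrix of $\epsilon_t$.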
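Suppose that for any i and j with i ≠ j, the residual returns $\epsilon_{i,t}$ and $\epsilon_{j,t}$ are
uncorrelated. Then
$$\Sigma_\epsilon = \begin{pmatrix}
\sigma_{\epsilon,1}^2 & 0 & \cdots & 0 \\
0 & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \sigma_{\epsilon,N}^2
\end{pmatrix} \tag{8.10}$$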
where $\sigma_{\epsilon,j}^2 = \mathrm{Var}(\epsilon_{j,t})$. The assumption that $\Sigma_\epsilon$ is a diagonal matrix is
a strong assumption that leads to the single-index model for the returns
$R_{1,t}, R_{2,t}, \ldots, R_{N,t}$. The single-index model and its implications for portfolio
theory are covered in detail in Chapter 9.
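Consider the equally weighted portfolio with weight vector $w = (1/N)\mathbf{1}$.
Then
$$\sigma_{\epsilon,p}^2 = \frac{1}{N^2} \mathbf{1}^T \Sigma_\epsilon \mathbf{1} = \frac{1}{N} \left( \frac{1}{N} \sum_{j=1}^{N} \sigma_{\epsilon,j}^2 \right) = \frac{1}{N} \bar{\sigma}_\epsilon^2$$
where
$$\bar{\sigma}_\epsilon^2 = \frac{1}{N} \sum_{j=1}^{N} \sigma_{\epsilon,j}^2$$
is the average nonmarket component of the return variance of the assets.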
Therefore, if N is large and the residual returns for different assets are uncor-
related, then the nonmarket component of return variance for an equally
weighted portfolio tends to be small.
It is important to note that such a conclusion depends on the assumption
that $\Sigma_\epsilon$ is a diagonal matrix, that is, on the assumption that $\epsilon_{i,t}$ and $\epsilon_{j,t}$
are uncorrelated for i ≠ j. For a general covariance matrix $\Sigma_\epsilon$ the minimum
possible nonmarket return variance can be found by choosing w to minimize
$w^T \Sigma_\epsilon w$ subject to $w^T \mathbf{1} = 1$; note that this is the same as finding the weight
vector for the minimum variance portfolio, but with $\Sigma_\epsilon$ replacing $\Sigma$, the
covariance matrix of the returns.
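Therefore, according to Proposition 5.3, $w^T \Sigma_\epsilon w$ is minimized by
$$\tilde{w} = \frac{\Sigma_\epsilon^{-1} \mathbf{1}}{\mathbf{1}^T \Sigma_\epsilon^{-1} \mathbf{1}}$$
and the minimum nonmarket component of return variance is given by
$$\tilde{w}^T \Sigma_\epsilon \tilde{w} = \frac{1}{\mathbf{1}^T \Sigma_\epsilon^{-1} \mathbf{1}}.$$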
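For instance, suppose that $\Sigma_\epsilon$ is of the form $\sigma_\epsilon^2 M_\rho$ where
$$M_\rho = \begin{pmatrix}
1 & \rho & \cdots & \rho \\
\rho & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \rho \\
\rho & \cdots & \rho & 1
\end{pmatrix} \tag{8.11}$$
and $\sigma_\epsilon^2 > 0$; see Example 5.7. Then all residual returns have standard deviation
$\sigma_\epsilon$ and any pair of residual returns for different assets in the same time period
has correlation ρ. In Example 5.7, it was shown that the equally weighted
portfolio is the minimum variance portfolio in this case and it has variance

$$\frac{\sigma_\epsilon^2}{N^2} \mathbf{1}^T M_\rho \mathbf{1} = \frac{N + N(N - 1)\rho}{N^2}\, \sigma_\epsilon^2 = \left( \rho + \frac{1 - \rho}{N} \right) \sigma_\epsilon^2.$$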
Therefore, the nonmarket component of return variance when $\Sigma_\epsilon = \sigma_\epsilon^2 M_\rho$
is never less than $\rho \sigma_\epsilon^2$. Thus, even for a large number of assets and a diversified
portfolio, the nonmarket component of the variance is not negligible unless
$\rho \sigma_\epsilon^2$ is close to 0.
It is worth noting that, even when it is not negligible, the nonmarket
component of risk for the portfolio, which is approximately $\sqrt{\rho}\, \sigma_\epsilon$ for large N ,
is still less than that of the individual assets, which is given by $\sigma_\epsilon$.
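
The minimum nonmarket variance for an equicorrelated residual covariance matrix can be verified numerically. The sketch below builds such a matrix for hypothetical values of N, ρ, and the residual standard deviation (all three are assumptions) and compares the minimized quadratic form with the closed-form expression.

```r
# Minimum nonmarket variance when residuals are equicorrelated.
N <- 50; rho <- 0.3; sigma.eps <- 0.04     # hypothetical values
M.rho <- matrix(rho, N, N)
diag(M.rho) <- 1
Sigma.eps <- sigma.eps^2 * M.rho           # residual covariance matrix

one <- rep(1, N)
w.tilde <- solve(Sigma.eps, one)           # proportional to Sigma^{-1} 1
w.tilde <- w.tilde / sum(w.tilde)          # normalize so the weights sum to 1

min.var     <- as.numeric(t(w.tilde) %*% Sigma.eps %*% w.tilde)
formula.var <- (rho + (1 - rho) / N) * sigma.eps^2

c(min.var = min.var, formula = formula.var, lower.bound = rho * sigma.eps^2)
```

The computed weights are all equal to 1/N, and increasing N moves the minimized variance toward the lower bound without reaching it.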

8.9 Measuring Portfolio Performance

A primary goal of portfolio theory and methodology is the construction of
portfolios that yield large returns. However, as we have seen, large returns
are generally associated with large risk. Therefore, in assessing portfolio per-
formance, we need to consider both the average return on the portfolio and
the variability of the returns as measured by the standard deviation or some
similar measure.
One such measure of portfolio performance that we have used is the Sharpe
ratio, given by (μp − μf )/σp , where μp and σp denote the mean and standard
deviation, respectively, of the return on a given portfolio and μf denotes the
return on the risk-free asset. The Sharpe ratio is estimated by $\widehat{SR} = (\bar{R}_p - \bar{R}_f)/S_p$,
where $\bar{R}_p - \bar{R}_f$ and $S_p$ are the sample mean and sample standard
deviation, respectively, of the observed excess returns.
In this section, we consider some alternative measures of portfolio per-
formance based on the estimates of the market model parameters for a
portfolio.

Treynor Ratio
The Sharpe ratio considers the mean excess return of a portfolio relative to the
portfolio risk, as measured by standard deviation of the returns. An important
aspect of this measure is that it uses total risk, including both the market and
the nonmarket components.
According to the CAPM, the expected excess return on an asset depends
on the market component of its risk, as measured by βp σm , where βp denotes
the value of beta for the portfolio, assumed to be positive, and σm is the
standard deviation of the return on the market portfolio. Therefore, portfolios
with large values of βp are expected to have larger returns than portfolios with
values of βp close to 0.
A version of the Sharpe ratio with total risk replaced by the market com-
ponent of risk is given by (μp − μf )/(βp σm ). Note that the market risk, as
measured by σm , is the same for each portfolio considered.
The Treynor ratio measures the excess return of a portfolio relative to its
value of beta:
$$TR = \frac{\mu_p - \mu_f}{\beta_p}.$$
Here we can interpret βp as the portfolio’s value of beta in the market model.

Recall that we may write

$$\beta_p = \rho_p \frac{\sigma_p}{\sigma_m}$$
where ρp denotes the correlation between the portfolio’s return and the return
on the market index. It follows that the Treynor ratio and the Sharpe ratio
of a portfolio are related by
$$TR = \frac{\sigma_m}{\rho_p}\,(SR).$$

Therefore, the Treynor ratio rewards portfolios that have a large Sharpe ratio
while having returns that have low correlation with the returns on the market
index.
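
The relationship between the two ratios also holds exactly for the sample versions, because the least-squares estimate of beta equals the sample correlation times the ratio of the sample standard deviations. The following sketch checks this on simulated excess returns; the data and variable names are hypothetical.

```r
# Sample Treynor ratio versus (S_m / r_p) times the sample Sharpe ratio.
set.seed(3)
T.len <- 60
mkt  <- rnorm(T.len, 0.01, 0.04)                     # excess market returns
port <- 0.002 + 0.9 * mkt + rnorm(T.len, 0, 0.02)    # excess portfolio returns

beta.hat <- unname(coef(lm(port ~ mkt))[2])   # least-squares estimate of beta
sharpe   <- mean(port) / sd(port)             # estimated Sharpe ratio
treynor  <- mean(port) / beta.hat             # estimated Treynor ratio

c(treynor = treynor, via.sharpe = sd(mkt) / cor(port, mkt) * sharpe)
```

The two entries should agree up to floating-point error.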
To estimate the Treynor ratio, we may use the least-squares estimator β̂p
of the parameter βp in the market model applied to the portfolio returns. Then
an estimator of the Treynor ratio is given by

$$\widehat{TR} = \frac{\bar{R}_p - \bar{R}_f}{\hat{\beta}_p}.$$

Example 8.15 Although the methods described here may be applied to any
portfolio of assets, investors often choose to invest in mutual funds, which
are essentially professionally managed, regulated portfolios. Therefore, the
performance measures described in this chapter will be illustrated based on
mutual funds. A mutual fund combines the capital of a number of investors
for the purpose of investing it on their behalf. The investments made by a
fund form a type of portfolio and may include investments in stocks, bonds,
and cash. A given mutual fund has a specific investment strategy, as described
in its prospectus, but the exact investments generally vary over time; in that
respect, a mutual fund differs from the portfolios we have been considering.
In spite of this variation over time, we will analyze the returns on a mutual
fund as if they are the returns on a given portfolio.
We will consider the returns of four mutual funds: Vanguard U.S.
Growth Portfolio Fund (symbol VWUSX), T. Rowe Price New Horizons
Fund (PRNHX), Fidelity Select Utilities Portfolio (FSUTX), and BlackRock
Natural Resources Trust (MDGRX). These funds differ in their investment
objectives. The Vanguard U.S. Growth Portfolio Fund focuses on stocks
exhibiting long-term capital appreciation; T. Rowe Price New Horizons Fund
also looks for long-term growth but focuses on the stocks of small companies,
before they are widely recognized. The other two funds are more specialized.
Fidelity Select Utilities Portfolio invests in companies in the utilities industry
and BlackRock Natural Resources Trust invests in companies with substantial
assets in natural resources.
The value of a share in a mutual fund changes over time, like the price
of a stock, and these share values may be downloaded from Yahoo Finance,
using the same procedure we used to download stock price information.
Furthermore, the returns on the mutual funds are calculated in the same way.
The data matrix funds contains five years of monthly excess returns on these
four funds for the period ending December 31, 2014; thus, funds plays the
same role that big8 played in the analysis of the stock returns of eight large
companies. Note that, although these are excess returns, we will generally
refer to them simply as “returns.”

> head(funds)
VWUSX PRNHX FSUTX MDGRX
[1,] -0.06323 -0.0325 -0.0439 -0.0574
[2,] 0.03687 0.0452 -0.0120 0.0323
[3,] 0.06054 0.0830 0.0375 0.0238
[4,] 0.00576 0.0398 0.0342 0.0299
[5,] -0.08571 -0.0646 -0.0559 -0.0929
[6,] -0.05843 -0.0643 -0.0135 -0.0513

The mean and standard deviation of the fund returns may be calculated
using the apply function.

> funds.mean<-apply(funds, 2, mean)

> funds.mean
VWUSX PRNHX FSUTX MDGRX
0.01262 0.01736 0.01185 0.00341
> funds.sd<-apply(funds, 2, sd)
> funds.sd
VWUSX PRNHX FSUTX MDGRX
0.0440 0.0474 0.0331 0.0626

Estimates of αp and βp for the four funds can be obtained using the same
approach used to estimate the parameters of the market model for the “big8”
stocks in Example 8.3.

> funds.mm<-lm(funds~sp500)
> funds.alpha<-funds.mm$coefficients[1,]
> funds.beta<-funds.mm$coefficients[2,]
> funds.alpha
VWUSX PRNHX FSUTX MDGRX
0.000377 0.005150 0.006691 -0.010968
> funds.beta
VWUSX PRNHX FSUTX MDGRX
1.123 1.120 0.473 1.319

Using the least-squares estimates, it is straightforward to estimate the
Treynor ratio for the funds.

> funds.treynor<-funds.mean/funds.beta
> funds.treynor
VWUSX PRNHX FSUTX MDGRX
0.01124 0.01550 0.02505 0.00259

Therefore, the Utilities Portfolio has the largest estimated Treynor ratio.
Note that the estimate of beta for this fund is much lower than those of the
other funds and its estimated correlation with the return on the S&P 500
index is much lower than those of the other funds.

> cor(sp500, funds)

VWUSX PRNHX FSUTX MDGRX
SP500 0.959 0.888 0.537 0.791

For comparison, we can estimate the Sharpe ratio for each fund.

> funds.sharpe<-funds.mean/funds.sd
> funds.sharpe
VWUSX PRNHX FSUTX MDGRX
0.2870 0.3665 0.3578 0.0545

Hence, the New Horizons Fund has the largest estimated Sharpe ratio and the
Natural Resources Trust has the smallest.
In interpreting these results, it is important to keep in mind that the
Treynor and Sharpe ratios calculated here are estimates of the underlying
“true” ratios corresponding to the funds considered. In particular, in evalu-
ating and comparing funds, it is important to take into account the sampling
variability of the estimates, as measured by their standard errors. Such issues
will be considered in detail in the following section.

Jensen’s Alpha
According to the CAPM, the expected return on a portfolio depends on its
value of beta:
μp = μf + βp (μm − μf ).
Thus, a portfolio with μp > μf + βp (μm − μf ) has a larger expected return
than predicted by the CAPM for its value of beta. A similar interpretation
holds if μp < μf + βp (μm − μf ).
Therefore, one way to evaluate the performance of a portfolio is to compare
its mean return to what is predicted by the CAPM. Let

αp = μp − μf − βp (μm − μf );

αp is known as Jensen’s alpha for the portfolio.

When αp > 0, the mean portfolio return is greater than expected for the
portfolio’s value of β; when αp < 0, the mean return is less than expected.
Jensen’s alpha has the advantage of being easy to interpret because it is mea-
sured on the same scale as returns. Recall that αp may also be interpreted in
terms of a portfolio that is over- or underpriced, as discussed in Section 8.4.
Jensen’s alpha may be estimated by α̂p , the least-squares estimator of the
intercept in the market model regression for the portfolio.
Example 8.16 For the four funds represented in the data matrix funds, the
output from estimation of the market model is stored in the variable funds.mm
and the estimates of alpha are stored in the variable funds.alpha.

> funds.alpha
VWUSX PRNHX FSUTX MDGRX
0.000377 0.005150 0.006691 -0.010968

Hence, the U.S. Growth Portfolio, the New Horizons Fund, and the Utilities
Portfolio all have an estimated mean return greater than that predicted by
the CAPM, with the largest difference for the Utilities Portfolio; the estimated
mean return of the Natural Resources Trust is less than what is predicted by
the CAPM.
As discussed in the previous example, it is important to keep in mind that
the values reported here are only estimates of the true parameter values.

Appraisal Ratio
One drawback of Jensen’s alpha is that it does not explicitly incorporate
portfolio risk. A portfolio with a large value of αp , but with large risk, may
be less desirable than a less-risky portfolio that has a smaller value of αp .
The market component of return variance for a portfolio, β2p σ2m , is directly
tied to its predicted mean return according to the CAPM through the param-
eter βp . Thus, in evaluating the value of αp for a portfolio, we compare it to
the portfolio’s nonmarket component of risk.
The appraisal ratio is a risk-adjusted form of Jensen’s alpha given by
$$\frac{\alpha_p}{\sigma_{\epsilon,p}},$$

where $\sigma_{\epsilon,p}$ is the error standard deviation in the market model for the portfolio.
Thus, the appraisal ratio is the difference between the expected return on
the portfolio and its predicted expected return according to the CAPM based
on the value of βp , relative to the nonmarket component of risk for the portfo-
lio. Recall that, according to the CAPM, nonmarket risk is not compensated
by a larger expected return. Thus, a portfolio with a large appraisal ratio is
apparently realizing some reward for its nonmarket risk.
There is an alternative interpretation of the appraisal ratio as the increase
in the Sharpe ratio that may be achieved by combining the portfolio under
consideration with the market portfolio; this result will be discussed in detail
in Section 9.6.
Example 8.17 Consider the four mutual funds with return data stored in
the variable funds. The estimates $\hat{\sigma}_{\epsilon,p}$ for these assets may be calculated using
the apply function with the function f.sighat defined in Example 8.13:

> funds.s<-apply(funds, 2, f.sighat)

> funds.s
VWUSX PRNHX FSUTX MDGRX
0.0125 0.0220 0.0282 0.0386

Using these results, the appraisal ratios are easily calculated.

> funds.appraisal<-funds.alpha/funds.s
> funds.appraisal
VWUSX PRNHX FSUTX MDGRX
0.0301 0.2344 0.2374 -0.2843
Hence, the New Horizons Fund and the Utilities Portfolio have the largest
estimated appraisal ratios for the four funds considered; as with the other per-
formance measures, these values are only estimates of the funds’ true appraisal
ratios.
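
The four performance measures discussed in this section can be collected in a single function. The sketch below is one possible way to organize the calculation; perf.measures is a hypothetical helper that does not appear elsewhere in the text, and the two funds are simulated rather than taken from the funds data.

```r
# Hypothetical helper: performance measures from excess returns y and
# excess market returns x.
perf.measures <- function(y, x) {
  fit <- lm(y ~ x)
  alpha.hat <- unname(coef(fit)[1])      # Jensen's alpha (intercept)
  beta.hat  <- unname(coef(fit)[2])      # market-model beta
  sig.hat   <- summary(fit)$sigma        # nonmarket risk
  c(sharpe    = mean(y) / sd(y),
    treynor   = mean(y) / beta.hat,
    jensen    = alpha.hat,
    appraisal = alpha.hat / sig.hat)
}

# Simulated data for two hypothetical funds
set.seed(4)
mkt   <- rnorm(60, 0.01, 0.04)
fundA <- 0.004 + 0.5 * mkt + rnorm(60, 0, 0.03)
fundB <- -0.002 + 1.2 * mkt + rnorm(60, 0, 0.02)

round(sapply(list(A = fundA, B = fundB),
             function(y) perf.measures(y, mkt)), 4)
```

Each column of the result holds the estimated Sharpe ratio, Treynor ratio, Jensen's alpha, and appraisal ratio for one fund; as with the examples above, these are estimates subject to sampling variability.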

8.10 Standard Errors of Estimated Performance Measures
The estimated portfolio performance measures we have discussed are just
estimates of a portfolio’s “true” performance measures. Therefore, when inter-
preting such results, it is important to assess the uncertainty in these estimates
by calculating their standard errors.
For some statistics, such as a sample mean or an estimated regression
parameter, calculating the standard error is straightforward. For instance, for
the sample mean return $\bar{R}_p$, the standard error is given by $S_p/\sqrt{T}$, where $S_p$
is the sample standard deviation of the returns and T is the sample size. In
some cases, the standard error of a statistic is given as part of the R output
of the function used to calculate the statistic. For instance, the standard error
of Jensen’s alpha, the estimated intercept parameter in the market model, is
available from the regression output from estimating the market model using
the lm function.
One role of the standard error of an estimate is in calculating an approximate
confidence interval for a parameter; for instance, an approximate 95%
confidence interval for $\mu_p$ is given by
$$\bar{R}_p \pm 1.96\, \frac{S_p}{\sqrt{T}}.$$

Example 8.18 Consider the returns on the Vanguard U.S. Growth Portfolio,
which are stored in the variable vwusx. Output from estimating the market
model using the lm function includes the table
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000377 0.001687 0.22 0.82
sp500 1.122536 0.043467 25.83 <2e-16 ***
Therefore, the estimate of Jensen’s alpha for this fund is 0.000377 and
the standard error is 0.001687, leading to an approximate 95% confidence
interval of
$$0.000377 \pm 1.96(0.001687) = (-0.00293, 0.00368).$$
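
The estimate and standard error can also be extracted from the regression output programmatically rather than read from the printed table. The following sketch does this for a simulated fund (the data and the names fund and mkt are hypothetical) and then repeats the interval calculation of this example using the printed values.

```r
# Approximate 95% confidence interval for Jensen's alpha.
set.seed(5)
mkt  <- rnorm(60, 0.01, 0.04)                       # simulated market returns
fund <- 0.001 + 1.1 * mkt + rnorm(60, 0, 0.012)     # simulated fund returns

tab <- coef(summary(lm(fund ~ mkt)))   # Estimate, Std. Error, t value, Pr(>|t|)
alpha.hat <- tab["(Intercept)", "Estimate"]
alpha.se  <- tab["(Intercept)", "Std. Error"]
alpha.hat + c(-1.96, 1.96) * alpha.se  # interval for the simulated fund

# Interval from the estimate and standard error printed in Example 8.18
ci.vwusx <- 0.000377 + c(-1.96, 1.96) * 0.001687
round(ci.vwusx, 5)
```

The last line reproduces the interval (−0.00293, 0.00368) given above.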

For more general statistics, such as an estimated Sharpe ratio, simple
expressions for the standard error are not available. Hence, here we consider
a general method of computing a standard error based on Monte Carlo
simulation.
Monte Carlo simulation was considered in Section 6.7. The approach used
in that section was based on assumptions regarding the distribution of the
data. One drawback of such an approach is that the results depend on the
assumptions used; here we use the observed data as the basis for the simulated
data so that no such distributional assumptions are required.
We begin with a simple example to illustrate the mechanics of the Monte
Carlo procedure. Let $R_{j,1}, R_{j,2}, \ldots, R_{j,T}$ denote the returns on a given asset
and consider calculating the standard error of the sample mean return $\bar{R}_j$.
Of course, in this case, we know that this standard error is given by $S_j/\sqrt{T}$
where $S_j$ is the sample standard deviation of the returns. However, suppose
that such a formula for this standard error is not available.
The standard error of R̄j is an estimate of the standard deviation of
the sampling distribution of R̄j . This sampling distribution is based on the
assumption that Rj,1 , Rj,2 , . . . , Rj,T is a random sample from some population,
in this case, the (hypothetical) population of all possible returns on the asset
under consideration. Hence, here we assume that the Rj,t are independent,
identically distributed random variables.
Suppose that the population values are known; denote them by R̃j,k , k =
1, 2, . . . , K, so that the population of return values may be written
P = {R̃j,1 , R̃j,2 , . . . , R̃j,K }.
Then we can calculate the standard deviation of R̄j by drawing repeated
samples of size T from P, calculating the sample mean of each sample and
then calculating the standard deviation of these simulated sample means of
returns.
That is, for each i = 1, 2, . . . , I, let
$$R_{j,1}^{(i)}, R_{j,2}^{(i)}, \ldots, R_{j,T}^{(i)}$$
be a sample with replacement drawn from P, and let $\bar{R}_j^{(i)} = \sum_{t=1}^{T} R_{j,t}^{(i)}/T$
denote the corresponding sample mean. Then
$$\bar{R}_j^{(1)}, \bar{R}_j^{(2)}, \ldots, \bar{R}_j^{(I)}$$
is a sample from the distribution of $\bar{R}_j$.

The standard error of R̄j may now be calculated as the sample standard
deviation of the values $\bar{R}_j^{(i)}$, i = 1, 2, . . . , I. Provided that I is sufficiently large,
that is, provided that we draw a sufficiently large number of samples from P,
the standard error calculated in this way will be an accurate estimator of the
standard deviation of the sampling distribution of R̄j .
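The known-population procedure just described can be sketched in R. This is only an illustration under stated assumptions: the population pop is hypothetical (10,000 values generated from a normal distribution), and the names pop, T.len, and I.reps are not from the text.

```r
# Hypothetical "known" population P (an assumption for illustration only)
set.seed(123)
pop <- rnorm(10000, mean = 0.01, sd = 0.05)

T.len  <- 60     # sample size T
I.reps <- 5000   # number of simulated samples I

# Draw I samples of size T with replacement from P; keep each sample mean
sim.means <- replicate(I.reps, mean(sample(pop, T.len, replace = TRUE)))

# Simulation-based standard error of the sample mean
se.sim <- sd(sim.means)

# For comparison: population standard deviation divided by sqrt(T)
pop.sd    <- sqrt(mean((pop - mean(pop))^2))
se.theory <- pop.sd / sqrt(T.len)

c(simulated = se.sim, theoretical = se.theory)
```

With I this large, the two values should agree closely; the remaining gap is simulation noise.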
The flaw in this approach, of course, is that we do not have the popula-
tion P. In Section 6.7, we dealt with this issue by making some assumptions
regarding the distribution of the data and used those assumptions to draw
the Monte Carlo samples. Here we use the information regarding P provided
by the observed values Rj,1 , Rj,2 , . . . , Rj,T themselves.
Let PO = {Rj,1 , Rj,2 , . . . , Rj,T } denote the set of observed return values.
Then we may estimate the standard error of R̄j by replacing the hypothetical
population P of return values by the observed set of return values, PO . This
procedure is known as the bootstrap because we are apparently estimating the
standard error by “pulling ourselves up by our bootstraps”; that is, we esti-
mate the standard error without any assistance in the form of distributional
assumptions.
Example 8.19 Consider estimating the mean return on the New Horizons
Fund; recall that five years of monthly returns on this mutual fund are stored
in the variable prnhx. For illustration, we will use only the first five of these
values, which are stored in prnhx5:
> prnhx5<-prnhx[1:5]
> prnhx5
[1] -0.0325 0.0452 0.0830 0.0398 -0.0646
The sample mean of these five values is 0.0142 with a standard error of 0.0272:
> mean(prnhx5)
[1] 0.0142
> sd(prnhx5)/(5^.5)
[1] 0.0272
Now consider the simulation-based approach to estimating this standard
error. The function sample may be used to draw a random sample from a set
of integers. Specifically, sample(5, replace=T) draws a random sample with
replacement from the set {1, 2, 3, 4, 5}:
> samp<-sample(5, replace=T)
> samp
[1] 2 1 1 5 3


Using the sampled integers as the indices of the vector prnhx5 yields a random
sample with replacement from the set of returns values in prnhx5:
> prnhx5[samp]
[1] 0.0452 -0.0325 -0.0325 -0.0646 0.0830
The sample mean of prnhx5[samp] yields a simulated value of the sample
mean return R̄j for this asset:
> mean(prnhx5[samp])
[1] -0.0003
This procedure may be repeated multiple times; for example,
> mean(prnhx5[sample(5, replace=T)])
[1] 0.0286
> mean(prnhx5[sample(5, replace=T)])
[1] 0.0142
and so on.
Note that each time mean(prnhx5[sample(5, replace=T)]) is calculated,
a new set of random numbers is drawn. Suppose we perform this
procedure 1000 times, storing the sample means in the variable prnhx5.boot;
these values represent a type of random sample drawn from the distribution
of the sample mean of five returns on the New Horizons Fund. Here are the
first eight values.
> prnhx5.boot[1:8]
[1] -0.0003 0.02860 0.01420 0.05270 -0.15400 0.03400
[7] 0.01523 0.00447
The sample standard deviation of prnhx5.boot yields an estimate of the
standard error of R̄j for this fund.
> sd(prnhx5.boot)
[1] 0.0247
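The text does not show how the 1000 values in prnhx5.boot were generated. One way to do it, assuming the base R function replicate is used, is sketched here; the seed is an arbitrary choice made only for reproducibility, so the numbers will not match those printed above.

```r
# First five returns, as printed in Example 8.19
prnhx5 <- c(-0.0325, 0.0452, 0.0830, 0.0398, -0.0646)

set.seed(2017)  # arbitrary seed (an assumption), for reproducibility only

# Repeat the resample-and-average step 1000 times
prnhx5.boot <- replicate(1000, mean(prnhx5[sample(5, replace = TRUE)]))

length(prnhx5.boot)   # number of simulated sample means
sd(prnhx5.boot)       # bootstrap standard error of the sample mean
```

Each simulated mean necessarily lies between the smallest and largest of the five observed returns.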
Note that the bootstrap standard error is close to, but not exactly
the same as, the value given by the usual formula for the standard error
of the sample mean, 0.0272. There are two reasons for the difference. One is
that the sample standard deviation uses a divisor of T − 1; it may be shown
that the estimate of the standard deviation implicitly used by the bootstrap
method is equivalent to the one with a divisor of T . The effect of these different
divisors is highlighted in the example because in that case T = 5. In more
realistic settings, such as the analysis of five years of monthly data, T = 60
and √(60/59) = 1.0084 so that the difference is unlikely to be important.
The other reason for the difference between the bootstrap standard error
and the usual value is that the bootstrap method is based on a random sample.
If the method is repeated, a different standard error will be obtained. For
instance, the bootstrap method was repeated three times, with results


> sd(prnhx5.boot1)
[1] 0.0237
> sd(prnhx5.boot2)
[1] 0.0245
> sd(prnhx5.boot3)
[1] 0.0246

If a very large bootstrap sample size is used, we expect that the result will
be closer to that obtained by the usual method. For example, prnhx5.boot10k
contains a random sample of size 10,000 from the distribution of R̄j for the
New Horizons Fund.

> sd(prnhx5.boot10k)
[1] 0.0241

Note that √(5/4) = 1.118 and 0.0241 × 1.118 = 0.0269 so that, after accounting
for the difference in divisors, the result is nearly identical to the 0.0272
obtained from the usual formula.
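The effect of the two divisors can be checked directly. The sketch below uses the five returns as printed, rounded to four decimals, so the last digit may differ slightly from the values in the text; the names T.len, se.usual, and se.limit are illustrative.

```r
prnhx5 <- c(-0.0325, 0.0452, 0.0830, 0.0398, -0.0646)
T.len  <- length(prnhx5)

# Usual standard error: sample sd (divisor T - 1) over sqrt(T)
se.usual <- sd(prnhx5) / sqrt(T.len)

# Standard deviation with divisor T, the version implicit in the bootstrap
sd.T     <- sqrt(mean((prnhx5 - mean(prnhx5))^2))
se.limit <- sd.T / sqrt(T.len)

# The ratio of the two equals sqrt(T / (T - 1))
round(c(usual = se.usual, limit = se.limit, ratio = se.usual / se.limit), 4)
```

The divisor-T value, roughly 0.024, is the quantity around which the bootstrap standard errors reported above (0.0237 to 0.0247) scatter.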

Obviously, the bootstrap method is not needed to calculate the standard
error of a sample mean. However, it is extremely useful for calculating the
standard error of more complicated statistics for which a simple formula is
not available. Although it is possible to carry out the calculations for any
statistic by following the procedure described earlier for the sample mean,
fortunately, there are convenient R functions available for that purpose.
Here we will use the function boot in the package boot (Canty and Ripley
2015). This function takes three arguments (there are other, optional argu-
ments for which we will use the default values). The most important of these
is a function calculating the statistic of interest for a given set of data.

Example 8.20 Consider estimation of the Sharpe ratio based on a sequence
of excess returns; we will write a function Sharpe to compute the Sharpe ratio.
A function to be used in boot must take two arguments: the data, in the
form of a vector or matrix, and the indices of the data values to be used
in the computation, similar to the way the indices vector was used in the
aforementioned sample mean example.
Consider the function

> Sharpe<-function(x, ind){mean(x[ind])/sd(x[ind])}

This function takes the values in x corresponding to the indices in the vector
ind and uses those values to compute the Sharpe ratio. For example, consider
the excess return data in the vector prnhx. To use all the data, we set ind to
1:60, the vector of integers from 1 to 60.

> Sharpe(prnhx, 1:60)

[1] 0.367


This yields the same result as computing the Sharpe ratio directly for the data
in prnhx:
> mean(prnhx)/sd(prnhx)
[1] 0.367
If 1:60 is replaced by 1:5, the result is the Sharpe ratio based on the first
five values; recall that these are stored in the variable prnhx5.
> Sharpe(prnhx, 1:5)
[1] 0.233
> mean(prnhx5)/sd(prnhx5)
[1] 0.233

The other arguments to boot are data, the data used in calculating the
statistic of interest, and R, the number of bootstrap replications to be used.
Example 8.21 Consider estimation of the Sharpe ratio for the New Horizons
Fund. To calculate the standard error of the estimated Sharpe ratio for the
data in prnhx based on a bootstrap sample size of 1000, we use the command
> library(boot)
> boot(prnhx, Sharpe, 1000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
    original     bias  std. error
t1*    0.367  0.00582       0.138
The output gives the value of the estimate, under the heading “origi-
nal”; hence, the estimated Sharpe ratio for these data is 0.367, as calculated
previously. The standard error, given under the “std. error” heading, is 0.138.

The output of the boot function includes an estimate of the bias of the
estimator; recall that the bias is the expected value of the estimator minus
the true value of the parameter.
A bias-corrected estimate may be formed by subtracting the bias from
the estimate; for example, a bias-corrected estimate of the Sharpe ratio for
the New Horizons Fund is 0.367 − 0.006 = 0.361. Whenever the bias is small
relative to the standard error, the impact of the bias correction is small and,
hence, it may be ignored. A simple rule of thumb is that the estimated bias
may be ignored when it is less than one-fourth of the standard error; of course,
such a guideline will not be appropriate in all cases.
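The quantities printed by boot can also be recovered from the object it returns, which is convenient when forming a bias-corrected estimate. The sketch below uses hypothetical simulated returns in place of prnhx (an assumption, since the fund data are not reproduced here); t0 and t are components of the object returned by boot.

```r
library(boot)

# Sharpe ratio statistic in the form required by boot, as in Example 8.20
Sharpe <- function(x, ind){mean(x[ind])/sd(x[ind])}

# Hypothetical excess returns standing in for prnhx (an assumption)
set.seed(42)
ret <- rnorm(60, mean = 0.015, sd = 0.05)

b <- boot(data = ret, statistic = Sharpe, R = 1000)

est      <- b$t0               # value under the "original" heading
bias.est <- mean(b$t) - b$t0   # value under the "bias" heading
se.est   <- sd(b$t)            # value under the "std. error" heading

est - bias.est                 # bias-corrected estimate
abs(bias.est) < se.est / 4     # rule of thumb described in the text
```

Storing the result in b rather than printing it directly makes it easy to reuse the same bootstrap replications for both the bias correction and the standard error.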
The usefulness of the bootstrap method arises from the fact that it can be
applied to a wide range of statistics, by modifying the function used as the
argument to boot. For instance, it may be applied when the statistic under
consideration depends on the returns of more than one asset; this is illustrated
in the following example.


Example 8.22 Consider calculation of the standard error of the estimated
Treynor ratio for the New Horizons Fund.
The first step in using the bootstrap method is to construct a function that
calculates the estimated Treynor ratio based on a set of returns, together with
a vector of indices. The complication in this example is that the Treynor ratio
depends on two sets of returns: the returns on the asset under consideration
and the returns on the market index, used to calculate β̂p . Hence, we take
the data for the estimation procedure to be a matrix, in which the first column
is the asset (excess) returns and the second column is the (excess) returns on
the market index, in this case, the S&P 500 index. The indices variable will then
select the rows of the matrix to be included in the estimate. This approach is
implemented in the following function:

> Treynor<-function(rmat, ind){

+ ret<-rmat[ind, 1]
+ mkt<-rmat[ind, 2]
+ beta<-lm(ret~mkt)$coefficients[2]
+ mean(ret)/beta}

In this function, the return data are input in the matrix rmat, which is
assumed to have two columns, the first with the return data for the asset
and the second with the return data for the market index. The variable ind
contains the indices of the returns to be used in the estimation of the Treynor
ratio. The first two lines of the function extract the relevant return data using
ind and place them in two variables ret and mkt, which contain the return
data for the asset and for the market index, respectively, corresponding to
ind. The third line obtains the estimate of beta for the data in ret and mkt,
and the final line returns the estimate of the Treynor ratio.
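As a quick check of the function's logic, the least-squares slope it extracts equals the sample covariance of the two return series divided by the sample variance of the market returns. The sketch below verifies this on hypothetical simulated data; the series ret and mkt and their parameter values are assumptions, not the fund data.

```r
set.seed(7)
mkt <- rnorm(60, 0.008, 0.04)                   # hypothetical market excess returns
ret <- 0.002 + 1.2 * mkt + rnorm(60, 0, 0.03)   # hypothetical asset excess returns

# Treynor ratio function, as defined in the text
Treynor <- function(rmat, ind){
  ret <- rmat[ind, 1]
  mkt <- rmat[ind, 2]
  beta <- lm(ret ~ mkt)$coefficients[2]
  mean(ret)/beta}

rmat <- cbind(ret, mkt)

tr.lm  <- unname(Treynor(rmat, 1:60))              # via lm, as in the text
tr.cov <- mean(ret) / (cov(ret, mkt) / var(mkt))   # via covariance/variance
c(tr.lm, tr.cov)

# Resampling rows keeps each asset return paired with its market return
ind <- sample(60, replace = TRUE)
Treynor(rmat, ind)
```

Because ind selects whole rows of the matrix, the dependence between the asset and market returns in each period is preserved in every bootstrap sample.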
For example, using the function with the first argument taken to be
cbind(prnhx, sp500), the matrix formed by combining prnhx and sp500
as column vectors, and taking the second argument to be the sequence of
integers 1:60, yields the estimated Treynor ratio for the New Horizons Fund

> Treynor(cbind(prnhx, sp500), 1:60)

0.01550

in agreement with what was obtained in Example 8.15.

We are now in position to apply the boot function. Suppose we are interested
in estimating the standard error of the estimated Treynor ratio $\widehat{TR}$
for the data in the variable prnhx.
This is obtained from the command

> boot(cbind(prnhx, sp500), Treynor, 10000)

ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
    original       bias  std. error
t1*   0.0155  -1.21e-05     0.00534


Therefore, the standard error for the estimated Treynor ratio for the New
Horizons Fund is 0.00534; the estimated bias is very small relative to the
standard error and, hence, it may be ignored.

Comparison of Portfolios
A common goal in calculating measures of portfolio performance is to compare
portfolios. Hence, we may be interested in estimating the difference between
measures of performance for two portfolios. Estimation of such a difference is
straightforward. We may estimate the difference in performance measures by
the difference of corresponding estimates. To calculate the standard error of
such a difference, we may again use the bootstrap procedure, as implemented
in the boot function, by defining the inputs to boot appropriately. This is
illustrated in the following example.
Example 8.23 Suppose that we are interested in comparing the Sharpe
ratios of the U.S. Growth Portfolio and the New Horizons Fund. The estimated
Sharpe ratios are 0.287 for the U.S. Growth Portfolio and 0.367 for
the New Horizons Fund, suggesting that the Sharpe ratio for the New Horizons
Fund is larger. However, these are only estimates and it is of interest to
take into account the sampling variability in evaluating the difference in the
estimates.
First, consider calculation of the standard error for each of these individ-
ual estimates. The return data for the U.S. Growth Portfolio are stored in
the variable vwusx and the returns for the New Horizons Fund are stored
in the variable prnhx. Recall that Sharpe is a function that calculates the
Sharpe ratio of an asset, which can be used in the function boot.
The standard errors for the individual Sharpe ratios are given by
> boot(vwusx, Sharpe, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
    original     bias  std. error
t1*    0.287  0.00786       0.141

> boot(prnhx, Sharpe, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
    original     bias  std. error
t1*    0.367  0.00751       0.137
Thus, the standard error of the estimated Sharpe ratio for the U.S. Growth
Portfolio is 0.141 and the standard error of the estimated Sharpe ratio for the
New Horizons Fund is 0.137.
Now consider calculation of the standard error for the difference in two
estimated Sharpe ratios. Note that we cannot use a standard error based only on the
individual standard errors because the two estimates are likely to be correlated.
Hence, we use an approach similar to that used when calculating the
standard error for the difference of means for matched-pair data.
Define a function Sharpe_diff by
> Sharpe_diff<-function(rets, ind){
+ Sharpe(rets[,1], ind)-Sharpe(rets[,2], ind)
+ }
This function takes the return data in the matrix rets, with the returns for
the first asset in column 1 and the returns for the second asset in column 2,
and computes the difference in the Sharpe ratios corresponding to the indices
in ind.
To form a matrix with columns given by the variables vwusx and prnhx, we
use the cbind function. Therefore, the standard error of the difference in Sharpe
ratios is given by
> boot(cbind(vwusx, prnhx), Sharpe_diff, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
    original     bias  std. error
t1*  -0.0795  0.00208      0.0589
Note that the estimated difference, −0.0795, agrees with the difference in
the estimated Sharpe ratios calculated in Example 8.15, 0.2870 − 0.3665. The
standard error of the difference is 0.0589 and, hence, the difference is not
statistically significant at the 5% level; that is, a 95% confidence interval
for the difference includes zero. The estimated bias is small relative to the
standard error and may be ignored.
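The confidence interval mentioned here can be written out explicitly. The sketch below assumes the usual normal-approximation interval, estimate ± 1.96 standard errors; the text does not state which interval construction is intended, so this is one reasonable reading rather than the definitive calculation.

```r
est <- -0.0795   # estimated difference in Sharpe ratios (Example 8.23)
se  <-  0.0589   # bootstrap standard error of the difference

z  <- qnorm(0.975)               # approximately 1.96
ci <- est + c(-1, 1) * z * se    # approximate 95% confidence interval
round(ci, 3)

ci[1] < 0 && ci[2] > 0           # TRUE when the interval includes zero
```

The interval runs from about −0.195 to about 0.036, so it includes zero, in agreement with the conclusion above.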
It is worth noting that the standard error of the difference of the esti-
mates based on the standard errors of the individual estimates along with the
assumption that the estimates are uncorrelated,
$$\left[ (0.141)^2 + (0.137)^2 \right]^{1/2} = 0.197$$
is much larger than the estimate given previously. This is because the method
used to obtain the value 0.197 ignores the fact that the returns on the two
funds are correlated.
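The size of the gap can be interpreted using the identity Var(A − B) = Var(A) + Var(B) − 2 Cov(A, B). Solving for the correlation gives a rough, back-of-the-envelope value for the correlation between the two Sharpe ratio estimates; this implied correlation is computed here from the rounded standard errors and is not a figure reported in the text.

```r
se1     <- 0.141    # standard error, U.S. Growth Portfolio
se2     <- 0.137    # standard error, New Horizons Fund
se.diff <- 0.0589   # bootstrap standard error of the difference

# Standard error of the difference if the estimates were uncorrelated
se.indep <- sqrt(se1^2 + se2^2)

# Correlation implied by se.diff^2 = se1^2 + se2^2 - 2 * rho * se1 * se2
rho <- (se1^2 + se2^2 - se.diff^2) / (2 * se1 * se2)

round(c(se.indep = se.indep, rho = rho), 3)
```

An implied correlation of roughly 0.9 is consistent with two equity funds whose returns move closely together, which is why the paired bootstrap gives a much smaller standard error.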
It is important to keep in mind that the results based on the bootstrap
method are based on the random numbers generated by the boot function.
Hence, if the procedure is repeated, the results will vary. It is generally a good
idea to repeat the standard error calculation in order to assess this variation; if
it is large enough to affect the conclusions of the analysis, then the bootstrap
sample size should be increased.
Example 8.24 Consider calculation of the standard error of the difference of
the Sharpe ratios for the U.S. Growth Portfolio and the New Horizons Fund.
Recall that in Example 8.23, the standard error was found to be 0.0589.


Repeating the calculation twice yields

> boot(cbind(vwusx, prnhx), Sharpe_diff, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
    original      bias  std. error
t1*  -0.0795  0.000631      0.0596

> boot(cbind(vwusx, prnhx), Sharpe_diff, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
    original      bias  std. error
t1*  -0.0795  0.000612      0.0591
Clearly, the variation in the estimated standard error is fairly small and the
conclusion that the estimated difference of the Sharpe ratios is not statis-
tically significant is not affected. Although the estimated biases obtained
indicate some variation, it clearly does not affect the conclusion that the
bias is negligible.
Given these additional results, it is reasonable to include them in our cal-
culation of the standard error. For instance, to estimate the standard error, we
could average the values 0.0589, 0.0596, and 0.0591, leading to a new estimate
of 0.0592.
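The averaging step, together with a simple measure of how much the three runs disagree, can be written as follows; the variable name se.runs is illustrative.

```r
# Standard errors from the three bootstrap runs (Examples 8.23 and 8.24)
se.runs <- c(0.0589, 0.0596, 0.0591)

round(mean(se.runs), 4)                 # averaged standard error
diff(range(se.runs)) / mean(se.runs)    # relative spread across the runs
```

The spread across runs is a little over 1% of the average, which supports the statement that the run-to-run variation is small.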

8.11 Suggestions for Further Reading

Although many factors affect the return on a security, the return on a market
portfolio is generally considered to be the most important and, hence, the
market model presented in this chapter is discussed in many books on finan-
cial modeling; see, for example, Benninga (2008, Chapter 2), Campbell et al.
(1997, Chapter 4), and Fama (1976, Chapter 4). Bradfield (2003) presents a
detailed discussion of estimating the parameters of the market model address-
ing a number of practical issues; see also Damodaran (1999). The relationship
between assumptions on the error term in the model and the interpretation
of beta is discussed by Severini (2016).
Standard & Poor's has published a technical report providing many details
of stock market indices; it is available at

http://www.spindices.com/documents/methodologies/methodology-index-math.pdf

Many authors define the market model in terms of regular returns instead
of excess returns; that is the case, for example, with Campbell et al. (1997)
and Fama (1976). The interpretation of β is the same in both cases, although
the value of the estimate generally changes slightly. The interpretation of the
intercept, on the other hand, depends on the type of return used; however, it
is a simple matter to translate the parameter for the model based on regular
returns to the parameter based on excess returns and vice versa.
As discussed in Section 8.3, the market model may be viewed as a sim-
ple linear regression model and, hence, standard least-squares-based methods
may be used for inference for the parameters of the model; Newbold et al.
(2013, Chapter 11) offer a good introduction to these methods and include
a discussion of their application to the market model in their Section 11.8.
Like least-squares estimators in general, the least-squares estimator of beta
is sensitive to outliers; see Martin and Simin (2003) for discussion of an
outlier-resistant estimator.
Further details on shrinkage estimation of β are given by Elton et al. (2007,
Chapter 7), Francis and Kim (2013, Section 17.4), and Vasicek (1973). Mul-
tiple testing and the Bonferroni correction are discussed by Tamhane and
Dunlop (2000, Section 6.3.9), who also present other useful information on
hypothesis tests and their properties. The method of controlling the expected
FDR is known as the Benjamini-Hochberg method. Foulkes (2009, Chapter 4)
presents a good introduction to this and other methods of multiple testing;
although the subject of this book is statistical genetics, a field in which multi-
ple testing is routinely used, it is not difficult to see how the methods described
there may also be applied to problems in financial statistics.
Modigliani and Pogue (1974) offer a detailed discussion of the implications
of the market model for understanding risk. An extension of the market model
allows beta to change over time; see Ruppert (2004, Section 7.10) for an
introduction to such models.
Good general discussions of measures of portfolio performance are available
from Elton et al. (2007, Chapter 25) and Francis and Kim (2013, Chapter 18).
The bootstrap is an important method in statistics that can be used to calcu-
late standard errors and confidence intervals for a wide range of statistics; see,
for example, Efron and Tibshirani (1993) and Davison and Hinkley (1997).
In particular, the boot package used here is attributed to Davison and Hinkley
(1997).

8.12 Exercises
1. Calculate five years of monthly excess returns on Bed Bath &
Beyond, Inc. stock (symbol BBBY) and five years of monthly excess
returns on the S&P 500 index (symbol ^GSPC) for the period ending
December 31, 2015. For the risk-free rate, use the return on the
three-month Treasury Bill, available on the Federal Reserve website.
Using these data, estimate the parameters α and β of the market
model for Bed Bath & Beyond stock, using the S&P 500 as the
market index; see Example 8.3. Give the standard error of each
estimate. Based on these results, does it appear that Bed Bath &
Beyond stock is priced correctly?
2. Repeat Exercise 1 using the Wilshire 5000 index (symbol ^W5000)
as the market index. Compare the results to those obtained in
Exercise 1.
3. Calculate five years of monthly excess returns for the period end-
ing December 31, 2015, for five stocks, Papa John’s International,
Inc. (symbol PZZA), Bed Bath & Beyond, Inc. (BBBY), Netflix,
Inc. (NFLX), Time Warner, Inc. (TWX), and Verizon Commu-
nications, Inc. (VZ); for the risk-free rate, use the return on the
three-month Treasury Bill, available on the Federal Reserve web-
site. Using the S&P 500 index (symbol ^GSPC) as the market index,
calculate the estimate of beta for the market model for each stock;
see Example 8.3.
Which stock is most sensitive to the market? Which stock is
least sensitive?
4. Consider the return data for the five stocks given in Exercise 3 and
take the market index to be the S&P 500 index.
a. Using the Bonferroni method, test the hypothesis that all five
stocks are priced correctly; see Example 8.5. Using a significance
level of 0.05, what do you conclude?
b. Repeat the analysis controlling the expected FDR at the level
0.10; see Example 8.6. Do your conclusions change?
5. For each of the five stocks listed in Exercise 3, estimate $\sigma_{\epsilon,i}$, the
standard deviation corresponding to the nonmarket component of
the return variance, and give the value of $R^2$ for the market model
regression; see Example 8.7. Use the S&P 500 as the market index.
For which stock is the proportion of the return variance
explained by the market the greatest? For which stock is it the
smallest?
6. For the return data on the five stocks given in Exercise 3, use the
shrinkage method to estimate the values of beta in the market model
with the S&P 500 index as the market index. Compute the esti-
mates using a global value of the weight ψ, as in Example 8.8, and
then repeat the analysis using an asset-specific value of ψ, as in
Example 8.9.
Which shrinkage method appears to be more appropriate here?
Why?
7. For the return data on the five stocks given in Exercise 3, calculate
the value of adjusted beta for each stock. Use the S&P 500 index
as the market index.

8. Consider the five stocks given in Exercise 3. Suppose that we construct
a portfolio of these stocks placing weight 0.1 on PZZA, 0.2
on BBBY, 0.3 on NFLX, 0.3 on TWX, and 0.1 on VZ.
Using the S&P 500 index as the market index, find the estimate
of beta for the portfolio based on five years of monthly excess returns
for the period ending December 31, 2015; see Example 8.11.
9. For the portfolio considered in Exercise 8, estimate the return stan-
dard deviation along with the standard deviations corresponding
to the market and nonmarket components of return variance; see
Example 8.11. Estimate the proportion of its return variance that
is explained by the market.
10. Consider three mutual funds, Copley (symbol COPLX), Edgewood
Growth Retail (EGFFX), and Fidelity Contrafund (FCNTX). Using
five years of monthly excess returns for the period ending December
31, 2015, and the S&P 500 index as the market index, determine
which of these three funds is the most diversified. Justify your
answer.
11. Consider the returns on the three mutual funds described in
Exercise 10. Estimate the values of alpha and beta in the market
model for an equally weighted portfolio of the three funds using the
S&P 500 as the market index. Is there evidence that the portfolio
is priced incorrectly? Why or why not? Estimate the proportion of
the portfolio return variance that is explained by the market.
12. Consider the returns on three mutual funds described in Exercise 10.
For each fund, estimate the Treynor ratio using the market model
with the S&P 500 index as the market index; see Example 8.15.
According to these estimates, rank the performances of funds. Com-
pare the results to the ranking based on the estimated Sharpe
ratios.
13. Consider two assets. Let Tj denote the Treynor ratio and let βj
denote the value of beta in the market model for asset j, j = 1, 2.
Suppose we construct a portfolio of these two assets, giving weight
w to asset 1 and weight 1 − w to asset 2. Let μj denote the mean
return on asset j, j = 1, 2 and let μf denote the mean return on
the risk-free asset. Assume that μj > μf , j = 1, 2.
a. Find the Treynor ratio of the portfolio, as a function of
T1 , T2 , β1 , β2 , and w.
b. Find the value of w, 0 ≤ w ≤ 1, that maximizes the Treynor
ratio of the portfolio.
14. Consider the returns on the three mutual funds described in
Exercise 10. For each fund, estimate the appraisal ratio using
the market model with the S&P 500 index as the market index;
see Example 8.17. According to these results, rank the performances
of the funds. Compare the results to the ranking based
on the estimated Sharpe ratios.
15. Consider the returns on the three mutual funds described in
Exercise 10. For each fund, use the bootstrap method as described
in Example 8.22 to calculate the standard error of the estimated
Treynor ratio. Use the estimates from the market model with the
S&P 500 index as the market index. Calculate an approximate 95%
confidence interval for the true Treynor ratio of each fund. Take the
bootstrap sample size to be 10,000.
16. Consider the returns on the three mutual funds described in
Exercise 10. For each fund, use the bootstrap method to calculate
the bias of the estimated appraisal ratios and calculate the bias-
corrected estimate. Use the market model with the S&P 500 index
as the market index and a bootstrap sample size of 10,000.
17. Consider the returns on the three mutual funds described in
Exercise 10. Estimate the difference in the Sharpe ratios for the
Copley and Edgewood funds and use the bootstrap method to calculate
the standard error of the estimate; see Example 8.23. Use the
market model with the S&P 500 index as the market index and a
bootstrap sample size of 10,000.
Repeat the procedure for the difference in the Sharpe ratios of
the Copley and Fidelity funds and for the difference in the Sharpe
ratios of the Edgewood and Fidelity funds.
Based on these results, what do you conclude regarding the
differences in the Sharpe ratios of the three funds?


9
The Single-Index Model

9.1 Introduction
In the previous chapter, the market model, which relates the return on an
asset to the return on a market index, was presented. According to the market
model, the excess return on an asset may be written in terms of two random
variables, the excess return on a market index and the residual return, which
is uncorrelated with the return on the market index.
The decomposition given by the market model is useful for understanding
the properties of the expected return and risk of an asset. However, the market
model applies only to a single asset and, hence, it is not useful in explaining
the relationships among the returns on several assets, as required in portfolio
theory.
In this chapter, we consider the single-index model, which is a type of
extension of the market model to a set of N assets. In particular, the single-
index model leads to a simple model for the covariance matrix of an asset
return vector.

9.2 The Model

Consider the market model for asset $i$, with returns $R_{i,1}, R_{i,2}, \ldots, R_{i,T}$:
$$R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T \tag{9.1}$$
where $R_{m,t}$ denotes the return on a market index at time $t$, $R_{f,t}$ denotes
the risk-free rate at time $t$, $\epsilon_{i,t}$ is an error term that has mean zero and
is uncorrelated with $R_{m,t}$, as discussed in Section 8.3, and $\alpha_i$ and $\beta_i$ are
parameters.
As we have seen, the market model is useful for understanding the rela-
tionship between the return on an asset and the return on the market, as
reflected in a market index; in particular, it gives a decomposition of the
variance of an asset’s returns into market and nonmarket components. The
single-index model uses the same approach to describe the covariance struc-
ture of a set of assets; thus, the single-index model is a model for the returns
on a set of assets.


Consider a set of N assets with returns R1,t , R2,t , . . . , RN,t , in period t.

Suppose that, for each asset, the market model (9.1) holds; then we have N
equations describing the asset returns in terms of the returns on the market
index: for t = 1, 2, . . . , T ,
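$$\begin{aligned}
R_{1,t} - R_{f,t} &= \alpha_1 + \beta_1 (R_{m,t} - R_{f,t}) + \epsilon_{1,t} \\
R_{2,t} - R_{f,t} &= \alpha_2 + \beta_2 (R_{m,t} - R_{f,t}) + \epsilon_{2,t} \\
&\;\;\vdots \\
R_{N,t} - R_{f,t} &= \alpha_N + \beta_N (R_{m,t} - R_{f,t}) + \epsilon_{N,t}.
\end{aligned}$$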

Writing these equations in matrix form, we have

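$$R_t - R_{f,t}\mathbf{1} = \alpha + (R_{m,t} - R_{f,t})\beta + \epsilon_t \tag{9.2}$$
where
$$R_t = \begin{pmatrix} R_{1,t} \\ R_{2,t} \\ \vdots \\ R_{N,t} \end{pmatrix}, \quad
\alpha = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_N \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{pmatrix},$$
and
$$\epsilon_t = \begin{pmatrix} \epsilon_{1,t} \\ \epsilon_{2,t} \\ \vdots \\ \epsilon_{N,t} \end{pmatrix}.$$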
As with the market model, we assume that the stochastic process
$$\left\{ \left( (R_t - R_{f,t}\mathbf{1})^T, \; R_{m,t} - R_{f,t} \right)^T : t = 1, 2, \ldots \right\}$$
is weakly stationary; we may interpret this condition as requiring that, for
any $b \in \mathbb{R}^N$ and any scalar $c$, the real-valued stochastic process
$$b^T (R_t - R_{f,t}\mathbf{1}) + c(R_{m,t} - R_{f,t}), \quad t = 1, 2, \ldots$$
satisfies the conditions of weak stationarity given in Section 2.4. In particular,
$\epsilon_1, \epsilon_2, \ldots$ is a weakly stationary process. Furthermore, we assume that $\epsilon_t$ and
$R_{m,t}$ are uncorrelated, in the sense that for any $b \in \mathbb{R}^N$, $b^T \epsilon_t$ and $R_{m,t}$ are
uncorrelated; we may write this condition as
$$\mathrm{Cov}(\epsilon_t, R_{m,t}) = 0.$$
Also, we assume that
$$\mathrm{Cov}(\epsilon_t, \epsilon_s) = 0$$
for any $t, s$, $t \neq s$, which we interpret to mean that for any $a, b \in \mathbb{R}^N$, $a^T \epsilon_t$
and $b^T \epsilon_s$ are uncorrelated; in particular, $\epsilon_{i,t}$ and $\epsilon_{j,s}$ are uncorrelated for any
$i, j = 1, \ldots, N$ and any $t \neq s$.


At this point, all we have done is rewrite the market models for the N assets
using matrix notation. The single-index model goes further, by assuming that
the covariance matrix of $\epsilon_t$ is a diagonal matrix,
$$\Sigma_\epsilon = \begin{pmatrix}
\sigma^2_{\epsilon,1} & 0 & \cdots & 0 \\
0 & \sigma^2_{\epsilon,2} & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \sigma^2_{\epsilon,N}
\end{pmatrix}.$$

That is, the single-index model is an extension of the market model to a set of
asset returns in which the market model holds for each asset and the residual
returns for different assets are uncorrelated.
We will say that the single-index model holds for $R_t$ whenever $R_t$ follows
the model given in (9.2) and the preceding assumptions are satisfied.

9.3 Covariance Structure of Returns under the Single-Index Model
The single-index model yields a simple expression for the covariance matrix
of Rt , as described in the following proposition.

Proposition 9.1. For each $t = 1, 2, \ldots, T$, let $R_t$ denote an $N \times 1$ vector of
asset returns and suppose that the single-index model holds for $R_t$.
Let $\Sigma$ denote the covariance matrix of $R_t$ and let $\Sigma_\epsilon$ denote the covariance
matrix of $\epsilon_t$. Then
\[
\Sigma = \sigma_m^2 \beta\beta^T + \Sigma_\epsilon \tag{9.3}
\]
where $\sigma_m^2 = \operatorname{Var}(R_{m,t})$.

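Proof. Let $a$ and $b$ be elements of $\mathbb{R}^N$ and consider $\operatorname{Cov}(a^T R_t, b^T R_t)$.
According to the single-index model,
\[
\begin{aligned}
\operatorname{Cov}(a^T R_t, b^T R_t)
&= \operatorname{Cov}(a^T\beta R_{m,t} + a^T\epsilon_t,\; b^T\beta R_{m,t} + b^T\epsilon_t) \\
&= \operatorname{Cov}(a^T\beta R_{m,t}, b^T\beta R_{m,t}) + \operatorname{Cov}(a^T\beta R_{m,t}, b^T\epsilon_t) \\
&\quad + \operatorname{Cov}(a^T\epsilon_t, b^T\beta R_{m,t}) + \operatorname{Cov}(a^T\epsilon_t, b^T\epsilon_t).
\end{aligned}
\]
Note that $a^T\beta R_{m,t} = (a^T\beta)R_{m,t}$ where $(a^T\beta)$ is a scalar; it follows that
\[
\operatorname{Cov}(a^T\beta R_{m,t}, b^T\beta R_{m,t}) = (a^T\beta)(b^T\beta)\sigma_m^2
= (a^T\beta)(\beta^T b)\sigma_m^2 = a^T\beta\beta^T b\,\sigma_m^2.
\]
By the properties of covariance matrices,
\[
\operatorname{Cov}(a^T\epsilon_t, b^T\epsilon_t) = a^T \Sigma_\epsilon b
\]
and, by the assumptions of the single-index model,
\[
\operatorname{Cov}(a^T\beta R_{m,t}, b^T\epsilon_t) = \operatorname{Cov}(a^T\epsilon_t, b^T\beta R_{m,t}) = 0.
\]
It follows that
\[
\begin{aligned}
\operatorname{Cov}(a^T R_t, b^T R_t)
&= (a^T\beta)(b^T\beta)\sigma_m^2 + a^T\Sigma_\epsilon b \\
&= (a^T\beta)(\beta^T b)\sigma_m^2 + a^T\Sigma_\epsilon b \\
&= a^T\left(\beta\beta^T\sigma_m^2 + \Sigma_\epsilon\right) b.
\end{aligned}
\]
Since this holds for any vectors $a, b$, it follows that $\beta\beta^T\sigma_m^2 + \Sigma_\epsilon$ must be the
covariance matrix of $R_t$.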

Under the assumption that $\Sigma_\epsilon$ is a diagonal matrix, the correlation between
the returns of any two assets is attributable entirely to the fact that the returns
are related to the market index. That is, the correlation structure of $R_t$ is
attributable to a “single index,” the return on the portfolio corresponding
to $R_{m,t}$.
The primary assumption underlying the single-index model, that the
covariance matrix of $\epsilon_t$ is a diagonal matrix, is a strong one, but it greatly
simplifies the structure of the covariance matrix of $R_t$. In general, the covariance
matrix of $R_t$ has $N(N+1)/2$ parameters: $N$ variances plus $(N^2 - N)/2$
covariances. Under the single-index model, (9.3) has $2N + 1$ parameters: the $N$
elements of $\beta$, the $N$ diagonal elements of $\Sigma_\epsilon$, and $\sigma_m^2$. For example, for $N = 50$
assets, the covariance matrix of $R_t$ has, in general, 1275 unknown parameters
(50 variances plus 1225 covariances), while under the single-index model it has only 101.
Furthermore, the parameters in (9.3) can be estimated using the regression
output from applying the market model to each of the N assets together with
the sample variance of the excess returns on the market index.
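
The parameter counts above are easy to tabulate. The following is a minimal sketch; the function names n_general and n_single_index are introduced here for illustration only.

```r
# Number of free parameters in an unrestricted N x N covariance matrix
# versus the single-index form sigma_m^2 * beta beta^T + Sigma_eps.
n_general      <- function(N) N * (N + 1) / 2   # N variances + N(N-1)/2 covariances
n_single_index <- function(N) 2 * N + 1         # N betas + N residual variances + sigma_m^2

N <- c(5, 50, 500)
data.frame(N = N, general = n_general(N), single_index = n_single_index(N))
```

The gap grows quadratically in N, which is why the diagonal assumption matters most when the number of assets is large relative to the number of return observations.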

Correlation of Asset Returns under the Single-Index Model
To better understand the implications of the single-index model, consider the
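properties of two assets, $i$ and $j$. Let $\sigma_i^2$ and $\sigma_j^2$ denote the variances of $R_{i,t}$
and $R_{j,t}$, respectively. Then, under either the market model or the single-index
model, $\sigma_i^2$ and $\sigma_j^2$ can be decomposed into market and nonmarket components:
\[
\sigma_i^2 = \beta_i^2\sigma_m^2 + \sigma_{\epsilon,i}^2
\quad\text{and}\quad
\sigma_j^2 = \beta_j^2\sigma_m^2 + \sigma_{\epsilon,j}^2.
\]
Let $\rho_i$ and $\rho_j$ denote the correlations of $R_{i,t}$ and $R_{j,t}$, respectively, with
$R_{m,t}$. Under either the market model or the single-index model,
\[
\operatorname{Cov}(R_{i,t}, R_{m,t}) = \beta_i\sigma_m^2
\quad\text{and}\quad
\operatorname{Cov}(R_{j,t}, R_{m,t}) = \beta_j\sigma_m^2;
\]
it follows that
\[
\rho_i = \frac{\beta_i\sigma_m^2}{\sigma_m\sqrt{\beta_i^2\sigma_m^2 + \sigma_{\epsilon,i}^2}}
= \frac{\beta_i\sigma_m}{\sqrt{\beta_i^2\sigma_m^2 + \sigma_{\epsilon,i}^2}}
\]
and
\[
\rho_j = \frac{\beta_j\sigma_m}{\sqrt{\beta_j^2\sigma_m^2 + \sigma_{\epsilon,j}^2}}.
\]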

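Now consider $\rho_{ij}$, the correlation of $R_{i,t}$ and $R_{j,t}$. Under the single-index
model, the covariance of $R_{i,t}$ and $R_{j,t}$ is the $(i, j)$th element of
\[
\sigma_m^2\beta\beta^T + \Sigma_\epsilon;
\]
hence, because $\Sigma_\epsilon$ is diagonal, for $i \neq j$,
\[
\operatorname{Cov}(R_{i,t}, R_{j,t}) = \beta_i\beta_j\sigma_m^2.
\]
It follows that
\[
\rho_{ij} = \frac{\beta_i\beta_j\sigma_m^2}
{\sqrt{\beta_i^2\sigma_m^2 + \sigma_{\epsilon,i}^2}\,\sqrt{\beta_j^2\sigma_m^2 + \sigma_{\epsilon,j}^2}}
= \rho_i\rho_j.
\]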

Thus, we have proven the following result.

Corollary 9.1. For each $t = 1, 2, \ldots, T$, let $R_t$ denote an $N \times 1$ vector of
asset returns and suppose that $\Sigma$, the covariance matrix of $R_t$, is of the form
\[
\Sigma = \sigma_m^2\beta\beta^T + \Sigma_\epsilon \tag{9.4}
\]
where $\Sigma_\epsilon$ is a diagonal matrix.
Let $\rho_{ij}$ denote the correlation of $R_{i,t}$ and $R_{j,t}$, let $\rho_i$ denote the correlation
of $R_{i,t}$ and $R_{m,t}$, and let $\rho_j$ denote the correlation of $R_{j,t}$ and $R_{m,t}$. Then for
$i, j = 1, 2, \ldots, N$, $i \neq j$,
\[
\rho_{ij} = \rho_i\rho_j.
\]

Thus, under the single-index model, the correlation of the returns on any
two assets is equal to the product of the correlations of each asset’s returns
with the returns on the market index.

Example 9.1 Suppose that for a set of three assets, the single-index
model holds with $\beta = (0.8, 0.5, 1.1)^T$, $\sigma_{\epsilon,1} = 0.20$, $\sigma_{\epsilon,2} = 0.25$, $\sigma_{\epsilon,3} = 0.10$, and
$\sigma_m = 0.05$. Then the covariance matrix of the assets is given by

> beta<-c(0.8, 0.5, 1.1)

> Sig1<-(0.05^2)*(beta%*%t(beta)) + diag(c(0.2,0.25, 0.1)^2)
> Sig1
[,1] [,2] [,3]
[1,] 0.0416 0.00100 0.00220
[2,] 0.0010 0.06313 0.00138
[3,] 0.0022 0.00138 0.01303

The corresponding correlation matrix can be obtained using the function
cov2cor, which converts a covariance matrix into a correlation matrix.

> cov2cor(Sig1)
[,1] [,2] [,3]
[1,] 1.0000 0.0195 0.0945
[2,] 0.0195 1.0000 0.0480
[3,] 0.0945 0.0480 1.0000

Recall that, for a given asset, βi = ρi (σi /σm ) where σ2i = Var(Ri,t ).
Therefore, the vector of correlations (ρ1 , ρ2 , ρ3 )T of each asset’s returns with
the returns on the market index is given by

> cor_vec<-beta*(0.05/(diag(Sig1)^.5))
> cor_vec
[1] 0.1961 0.0995 0.4819

Products of the form $\rho_i\rho_j$ for $i \neq j$ may be easily obtained from the off-diagonal
elements of cor_vec%*%t(cor_vec):

> cor_vec%*%t(cor_vec)
[,1] [,2] [,3]
[1,] 0.0385 0.0195 0.0945
[2,] 0.0195 0.0099 0.0480
[3,] 0.0945 0.0480 0.2322

Note that the off-diagonal elements of the matrix cor_vec%*%t(cor_vec) are
identical to the off-diagonal elements of cov2cor(Sig1).

Partial Correlation
This property of the correlation between the returns of any two assets in a
given time period may be described in terms of their partial correlation. If, as
in the single-index model, the correlation between Ri,t and Rj,t is attributable
entirely to the fact that both assets’ returns are linearly related to the market
return, then ρij = ρi ρj . It follows that

ρij − ρi ρj

measures how far the correlation of $R_{i,t}$ and $R_{j,t}$ departs from the value it
would take if that correlation were due entirely to the relationships of the
assets’ returns to the market return.
The partial correlation coefficient of $R_{i,t}$ and $R_{j,t}$ given $R_{m,t}$ is a scaled
version of this difference:
\[
\rho_{ij\cdot m} = \frac{\rho_{ij} - \rho_i\rho_j}{\sqrt{(1 - \rho_i^2)(1 - \rho_j^2)}}. \tag{9.5}
\]

This partial correlation coefficient describes the extent to which Ri,t and
Rj,t are linearly related, after removing the effect of the market return on this
relationship; this is often expressed by saying that ρij·m gives the correlation
between Ri,t and Rj,t “controlling for” the market return. Like the usual
correlation coefficient, it is based on the assumption that the relationships
among $R_{i,t}$, $R_{j,t}$, and $R_{m,t}$ are all linear ones. If the single-index model holds,
then $\rho_{ij\cdot m} = 0$ for all $i, j = 1, 2, \ldots, N$, $i \neq j$.
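
As a check on this last statement, the partial correlation in (9.5) can be evaluated for the population parameters of Example 9.1. The following is a minimal sketch; the function name partial_cor is introduced here for illustration and is not part of any package.

```r
# Partial correlation of assets i and j given the market, equation (9.5).
partial_cor <- function(rho_ij, rho_i, rho_j) {
  (rho_ij - rho_i * rho_j) / sqrt((1 - rho_i^2) * (1 - rho_j^2))
}

# Parameters of Example 9.1
beta  <- c(0.8, 0.5, 1.1)
sig_e <- c(0.20, 0.25, 0.10)
sig_m <- 0.05

Sig   <- sig_m^2 * (beta %*% t(beta)) + diag(sig_e^2)  # covariance matrix, as in (9.3)
rho   <- cov2cor(Sig)                                   # asset-asset correlations
rho_m <- beta * sig_m / sqrt(diag(Sig))                 # asset-market correlations

# Partial correlation of assets 1 and 2 given the market; zero up to rounding error
pc_12 <- partial_cor(rho[1, 2], rho_m[1], rho_m[2])
pc_12
```

With sample correlations in place of the population values, the result will generally differ from zero even when the single-index model holds, which is why a formal test is used in the example that follows.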

Example 9.2 In this chapter, we apply the single-index model to the returns
on the stocks of five companies, Cablevision Systems Corp. (symbol CVC),
Edison International (EIX), Expedia, Inc. (EXPE), Humana, Inc. (HUM),
and Wal-Mart Stores, Inc. (WMT). Five years of monthly returns for the
period ending December 31, 2014, were calculated for each stock and stored
in variables with the same name as the stock symbol; for example, cvc contains
the excess returns for Cablevision.
The matrix of excess returns for all five stocks is stored in the variable
stks, which is analogous to the matrix stored in the variable big8 used in
Example 6.6, as well as other examples in Chapters 6 through 8. The vari-
able sp500 contains similar excess returns on the Standard & Poors (S&P)
500 index.
An estimate of the partial correlation coefficient is given by replacing the
correlation coefficients in (9.5) by the sample correlation coefficients. Although
such an estimate is easily calculated using results from the cor function, it
is convenient to use the pcor.test function in package ppcor, which also
includes a test of ρij·m = 0.
For instance, consider the estimated partial correlation of the returns on
Edison stock and Wal-Mart stock given the returns on the S&P 500 index.

> library(ppcor)
> cor(eix, wmt)
V1
V1 0.2769
> pcor.test(eix, wmt, sp500)
estimate p.value statistic n gp Method
1 0.1574 0.229 1.203 60 1 pearson

Therefore, the sample correlation coefficient for the returns on Edison and
Wal-Mart stock is 0.277 and the estimated partial correlation coefficient is
0.157. The test of the null hypothesis that the partial correlation coefficient
for Edison and Wal-Mart returns, controlling for the returns on the S&P 500
index, is zero has p-value 0.229. Hence, there is no evidence to reject the null
hypothesis, and it appears that the correlation between Edison and Wal-Mart
stock returns is attributable entirely to their relationships with the market
return, as measured by the return on the S&P 500 index. That is, there is no
evidence to reject the hypothesis that the single-index model holds for Edison
and Wal-Mart stock.

To calculate the partial correlation coefficients for all pairs of assets, we can
use nested loops that calculate the partial correlation and the corresponding
p-value for the returns on each pair of stocks.
> pcor<-pvalue<-matrix(0, 5, 5)
> for (i in 1:4){
+ for (j in (i+1):5){
+ res<-pcor.test(stks[,i], stks[, j], sp500)
+ pcor[i, j]<-res[1, 1]
+ pvalue[i, j]<-res[1, 2]
+ }
+ }
The matrix pcor will contain the estimated partial correlation coefficients and
the matrix pvalue will contain the associated p-values. Note, although there
are five columns in the return matrix stks, i runs from 1 to 4 and j runs
from i + 1 to 5 to avoid computing each estimate and p-value twice and to
avoid computing the partial correlation coefficient of a vector of returns with
itself.
It is easier to read the results if we add column names and row names to
the result matrices:
> rownames(pcor)<-colnames(stks)
> colnames(pcor)<-colnames(stks)
> rownames(pvalue)<-colnames(stks)
> colnames(pvalue)<-colnames(stks)
> pcor
CVC EIX EXPE HUM WMT
CVC 0 -0.0856 -0.0354 -0.0318 -0.1018
EIX 0 0.0000 -0.0396 -0.0285 0.1574
EXPE 0 0.0000 0.0000 -0.2486 0.1526
HUM 0 0.0000 0.0000 0.0000 -0.0667
WMT 0 0.0000 0.0000 0.0000 0.0000
> pvalue
CVC EIX EXPE HUM WMT
CVC 0 0.517 0.789 0.8101 0.440
EIX 0 0.000 0.765 0.8297 0.229
EXPE 0 0.000 0.000 0.0527 0.244
HUM 0 0.000 0.000 0.0000 0.614
WMT 0 0.000 0.000 0.0000 0.000
Because all p-values are relatively large, there is no evidence that the single-
index model is inappropriate.
When there are many zeros in a table, it is often convenient to replace
them by a different symbol; this is particularly true when, as in the present
case, the zeros do not provide any information—they are simply placeholders.
This can be achieved by converting the matrix to a “table” and then using
the function print, which includes an argument for the symbol used for zeros.
For example,
> print(as.table(pvalue), zero.print=".")
CVC EIX EXPE HUM WMT
CVC . 0.5167 0.7892 0.8101 0.4398
EIX . . 0.7649 0.8297 0.2290
EXPE . . . 0.0527 0.2436
HUM . . . . 0.6138
WMT . . . . .
Using nested loops in this way generally works well when analyzing a small
or moderate number of assets. However, when analyzing a large number of
assets, it may be preferable to use one of the vector-based functions available
in R, such as outer, which is often more eﬃcient in such cases.
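
One way such a vectorized calculation might look is sketched below. It applies (9.5) to every pair of assets at once, using outer to form the required products. The data are simulated and the names mkt, rets, and pc_mat are hypothetical; only the point estimates are computed, not the p-values returned by pcor.test.

```r
set.seed(1)
T_obs <- 60
mkt  <- rnorm(T_obs, 0.01, 0.04)                    # hypothetical market excess returns
rets <- cbind(A = 0.9 * mkt + rnorm(T_obs, 0, 0.05),
              B = 0.5 * mkt + rnorm(T_obs, 0, 0.03),
              C = 1.2 * mkt + rnorm(T_obs, 0, 0.06))

r_m <- drop(cor(rets, mkt))   # sample correlation of each asset with the market
r   <- cor(rets)              # sample correlation matrix of the assets

# (9.5) for all pairs: outer() builds the N x N arrays of pairwise products
pc_mat <- (r - outer(r_m, r_m)) / sqrt(outer(1 - r_m^2, 1 - r_m^2))
diag(pc_mat) <- NA            # the diagonal is not a meaningful partial correlation
pc_mat
```

Each off-diagonal entry equals the correlation between the residuals from regressing the two assets' returns on the market return, so the result can be checked against a direct regression-based calculation for any single pair.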
Note that when testing a large number of hypotheses, as in the previous
example, it is important to be aware of the multiple testing issue, as discussed
in Section 8.4. That is, when interpreting the p-values for tests of ρij·m = 0 for
a large number of stocks, we expect a few small p-values even if the single-index
model holds for all stocks. As discussed in Section 8.4, the Bonferroni method
may be used in such cases to calculate an adjusted p-value that is valid for test-
ing the hypothesis that the single-index model holds for all stocks considered.
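
For the ten p-values reported in Example 9.2, the Bonferroni adjustment can be sketched using the base R function p.adjust. The p-values below are copied from the printed table; the element names are labels added here for readability.

```r
# p-values for the ten pairs of stocks, as printed in Example 9.2
pvals <- c(CVC.EIX = 0.5167, CVC.EXPE = 0.7892, CVC.HUM = 0.8101, CVC.WMT = 0.4398,
           EIX.EXPE = 0.7649, EIX.HUM = 0.8297, EIX.WMT = 0.2290,
           EXPE.HUM = 0.0527, EXPE.WMT = 0.2436, HUM.WMT = 0.6138)

# Bonferroni: multiply each p-value by the number of tests, capping the result at 1
p_adj <- p.adjust(pvals, method = "bonferroni")
min(p_adj)   # adjusted p-value for the joint hypothesis: 10 * 0.0527 = 0.527
```

The smallest unadjusted p-value, 0.0527 for Expedia and Humana, becomes 0.527 after adjustment, so the joint hypothesis that all ten partial correlations are zero is not rejected at conventional levels.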

9.4 Estimation
As in the previous section, consider a set of N assets, with returns
R1,t , R2,t , . . . , RN,t at time t, t = 1, 2, . . . , T , and suppose the single-index
model (9.2) holds. In this section, we consider estimation of the parameters of
the model: the vector β, β = (β1 , β2 , . . . , βN )T, the vector α = (α1 , α2 , . . . , αN )T,
and the standard deviations of the residual returns, $\sigma_{\epsilon,1}, \sigma_{\epsilon,2}, \ldots, \sigma_{\epsilon,N}$.
Note that, under the assumptions of the single-index model, the parameter
estimates for each asset may be obtained by estimating the parameters of the
market model for that asset. Hence, parameter estimation for the single-index
model uses the methods discussed in Section 8.3.
Example 9.3 Consider the stock returns for the ﬁve companies listed in
Example 9.2. Using the same procedure used in Chapter 8 for the data in big8,
the results for the market model applied to the returns in stks are stored in
the variable stks.mm. The estimated regression coeﬃcients are therefore in the
$coefficients component of stks.mm.
> stks.mm<-lm(stks~sp500)
> stks.mm$coefficients
CVC EIX EXPE HUM WMT
(Intercept) -9.75e-05 0.00914 0.0108 0.0162 0.0059
sp500 1.07e+00 0.47407 0.8902 0.6516 0.4568

Therefore, the estimates α̂ and β̂ may be obtained from stks.mm$coefficients:

> stks.alpha<-stks.mm$coefficients[1,]
> stks.alpha
CVC EIX EXPE HUM WMT
-9.75e-05 9.14e-03 1.08e-02 1.62e-02 5.90e-03
> stks.beta<-stks.mm$coefficients[2,]
> stks.beta
CVC EIX EXPE HUM WMT
1.073 0.474 0.890 0.652 0.457

The estimates $\hat\sigma_{\epsilon,j}$, $j = 1, 2, \ldots, 5$, may be calculated using the procedure
described in Example 8.13, in which the apply function is used with the function
f.sighat we have defined; the results are stored in the variable stks.s.
Therefore, the residual standard deviations are given by
> stks.s<-apply(stks, 2, f.sighat)
> stks.s
CVC EIX EXPE HUM WMT
0.0822 0.0457 0.1055 0.0695 0.0410
Thus, the parameters of the single-index model may be estimated using the
methods used to estimate the parameters of the market model.
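
Because the data in stks and the function f.sighat from Example 8.13 are not reproduced here, the following sketch illustrates the same steps on simulated returns. All names (mkt, rets, fit, sig_hat) are hypothetical, and the divisor T - 2 used for the residual variance is an assumption that matches the convention of summary applied to an lm fit; it may differ from the divisor used by f.sighat.

```r
set.seed(2)
T_obs <- 60
mkt  <- rnorm(T_obs, 0.01, 0.04)                    # hypothetical market excess returns
rets <- cbind(A = 0.002 + 0.9 * mkt + rnorm(T_obs, 0, 0.05),
              B = 0.001 + 0.5 * mkt + rnorm(T_obs, 0, 0.03))

fit <- lm(rets ~ mkt)          # one market-model regression per column of rets
alpha_hat <- coef(fit)[1, ]    # intercepts
beta_hat  <- coef(fit)[2, ]    # slopes (betas)

# Residual standard deviation for each asset, using divisor T - 2 (assumed)
sig_hat <- sqrt(colSums(resid(fit)^2) / (T_obs - 2))
rbind(alpha_hat, beta_hat, sig_hat)
```

Fitting all assets in a single call to lm with a matrix response gives the same coefficients as fitting each asset separately, since the regressions share the same explanatory variable.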

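Under the assumptions of the single-index model, the covariance matrix
of the asset returns is of the form
\[
\Sigma = \sigma_m^2\beta\beta^T + \Sigma_\epsilon \tag{9.6}
\]
where
\[
\Sigma_\epsilon = \begin{pmatrix}
\sigma_{\epsilon,1}^2 & 0 & \cdots & 0 \\
0 & \sigma_{\epsilon,2}^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \sigma_{\epsilon,N}^2
\end{pmatrix}.
\]
Let
\[
\hat\Sigma_\epsilon = \begin{pmatrix}
\hat\sigma_{\epsilon,1}^2 & 0 & \cdots & 0 \\
0 & \hat\sigma_{\epsilon,2}^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \hat\sigma_{\epsilon,N}^2
\end{pmatrix}.
\]
Then an estimate of $\Sigma$ based on the single-index model is given by
\[
\hat\Sigma = S_m^2\hat\beta\hat\beta^T + \hat\Sigma_\epsilon \tag{9.7}
\]
where $S_m^2$ is the sample variance of $R_{m,1} - R_{f,1}, R_{m,2} - R_{f,2}, \ldots, R_{m,T} - R_{f,T}$.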

Example 9.4 Consider the data analyzed in Example 9.3. The matrix $\hat\Sigma_\epsilon$
is the diagonal matrix with diagonal elements $\hat\sigma_{\epsilon,1}^2, \ldots, \hat\sigma_{\epsilon,5}^2$. Hence, for this
example, it may be formed from the estimates in stks.s using the diag
function, which forms a diagonal matrix from a vector of diagonal elements.
> stks.Sigeps<-diag(stks.s^2)
> rownames(stks.Sigeps)<-colnames(stks.Sigeps)<-c(labels(stks.s))
> print(as.table(stks.Sigeps), zero.print=".")

CVC EIX EXPE HUM WMT

CVC 0.00676 . . . .
EIX . 0.00209 . . .
EXPE . . 0.01114 . .
HUM . . . 0.00483 .
WMT . . . . 0.00168

The command
> rownames(stks.Sigeps)<-colnames(stks.Sigeps)<-c(labels(stks.s))
assigns the labels from the vector stks.s to the row and column names of the
matrix stks.Sigeps. The print command is used with as.table in order to
print the zeros in the matrix as dots.
The matrix β̂β̂T may be obtained using matrix multiplication with the
vector stks.beta. Recall that the matrix multiplication operator is %*% and
t is the transpose function.
> stks.beta%*%t(stks.beta)
CVC EIX EXPE HUM WMT
[1,] 1.152 0.509 0.955 0.699 0.490
[2,] 0.509 0.225 0.422 0.309 0.217
[3,] 0.955 0.422 0.792 0.580 0.407
[4,] 0.699 0.309 0.580 0.425 0.298
[5,] 0.490 0.217 0.407 0.298 0.209
The estimate $\hat\Sigma$ defined in (9.7) may now be obtained by combining
diag(stks.s^2) with stks.beta%*%t(stks.beta) and the sample variance
of the returns on the S&P 500 index.
> stks.Sig<-var(c(sp500))*(stks.beta%*%t(stks.beta)) +
+ diag(stks.s^2)
> stks.Sig
CVC EIX EXPE HUM WMT
[1,] 0.008388 0.000718 0.001348 0.000987 0.000692
[2,] 0.000718 0.002410 0.000595 0.000436 0.000305
[3,] 0.001348 0.000595 0.012255 0.000818 0.000574
[4,] 0.000987 0.000436 0.000818 0.005428 0.000420
[5,] 0.000692 0.000305 0.000574 0.000420 0.001979

Note that the variance of the returns on the S&P 500 index is calculated by
var(c(sp500)) rather than by var(sp500) since sp500 is a 60 × 1 matrix
rather than a vector and, hence, var(sp500) returns a 1 × 1 matrix, rather
than a scalar; c(sp500) converts sp500 to a vector.
The estimate stks.Sig may be compared to the sample covariance matrix
of the returns in stks.

> cov(stks)
CVC EIX EXPE HUM WMT
CVC 0.008273 0.000401 0.001046 0.000808 0.000354
EIX 0.000401 0.002374 0.000407 0.000347 0.000596
EXPE 0.001046 0.000407 0.012067 -0.000974 0.001223
HUM 0.000808 0.000347 -0.000974 0.005346 0.000233
WMT 0.000354 0.000596 0.001223 0.000233 0.001950

The two estimates appear to be generally similar, although it is difficult
to compare covariance matrices in this way. A useful alternative is to look at
the standard deviations of the asset returns along with the correlation matrix
of the return vector.
Recall that the diag function may also be used to extract the diagonal from
a matrix. Therefore, the estimated standard deviations based on the single-
index model may be compared to the sample standard deviations by compar-
ing the values in diag(stks.Sig)^.5 to those in diag(cov(stks))^.5.
Because
\[
\operatorname{Var}(R_{j,t}) = \beta_j^2\sigma_m^2 + \sigma_{\epsilon,j}^2,
\]
the two estimates of standard deviation should be very close, with the
difference due to the slightly different divisors used in the estimates.

> diag(stks.Sig)^.5
[1] 0.0916 0.0491 0.1107 0.0737 0.0445
> diag(cov(stks))^.5
CVC EIX EXPE HUM WMT
0.0910 0.0487 0.1098 0.0731 0.0442

That is, the estimates of the asset return standard deviations based on the
single-index model are essentially the same as the return sample standard
deviations.
The correlation matrix corresponding to stks.Sig is given by

> cov2cor(stks.Sig)
CVC EIX EXPE HUM WMT
[1,] 1.000 0.160 0.133 0.146 0.170
[2,] 0.160 1.000 0.110 0.120 0.140
[3,] 0.133 0.110 1.000 0.100 0.116
[4,] 0.146 0.120 0.100 1.000 0.128
[5,] 0.170 0.140 0.116 0.128 1.000

this may be compared to the sample correlation matrix of the excess returns,

> cor(stks)
CVC EIX EXPE HUM WMT
CVC 1.0000 0.0905 0.1047 0.1215 0.0881
EIX 0.0905 1.0000 0.0761 0.0973 0.2769
EXPE 0.1047 0.0761 1.0000 -0.1212 0.2522
HUM 0.1215 0.0973 -0.1212 1.0000 0.0721
WMT 0.0881 0.2769 0.2522 0.0721 1.0000

In looking for discrepancies between the two matrices, it is often useful to
compute their difference.

> cov2cor(stks.Sig)-cor(stks)
CVC EIX EXPE HUM WMT
[1,] 0.0000 0.0691 0.0283 0.0247 0.0817
[2,] 0.0691 0.0000 0.0334 0.0232 -0.1370
[3,] 0.0283 0.0334 0.0000 0.2216 -0.1357
[4,] 0.0247 0.0232 0.2216 0.0000 0.0560
[5,] 0.0817 -0.1370 -0.1357 0.0560 0.0000

Another approach to estimating the covariance matrix of the asset returns
is to use a shrinkage estimate, as discussed in Section 6.5. For instance, the
shrinkage estimate of the covariance matrix of the returns on the stocks
represented in the data matrix stks, based on the target matrix taken
to be a matrix of the form $\sigma^2 I$, may be obtained using the function
shrinkcovmat.equal in the package ShrinkCovMat (Touloumis 2015). The
estimates of the asset standard deviations are given by

> stks.shrink<-shrinkcovmat.equal(t(stks))$Sigmahat
> diag(stks.shrink)^.5
CVC EIX EXPE HUM WMT
0.0878 0.0571 0.1029 0.0742 0.0542

These estimates are similar to those based on the single-index model, with
some diﬀerences, particularly for EIX and WMT. The shrinkage estimate of
the correlation matrix is given by

> cov2cor(stks.shrink)
CVC EIX EXPE HUM WMT
CVC 1.0000 0.0604 0.0874 0.0936 0.0561
EIX 0.0604 1.0000 0.0524 0.0618 0.1452
EXPE 0.0874 0.0524 1.0000 -0.0963 0.1655
HUM 0.0936 0.0618 -0.0963 1.0000 0.0437
WMT 0.0561 0.1452 0.1655 0.0437 1.0000

Although the three estimates of the return correlation matrix are similar,
there are some differences. The most important is the correlation between
Humana and Expedia stocks. Note that, according to the sample covariance
matrix and the shrinkage estimate of the covariance matrix, the returns on
these stocks are negatively correlated. However, both stocks have positive
estimates of beta, 0.890 for Expedia and 0.652 for Humana. Therefore, according
to the single-index model, both stocks are positively correlated with the market
index and, hence, they are positively correlated with each other. Specifically,
the estimate of the correlation based on the single-index model is 0.100,
while the sample correlation is −0.121 and the shrinkage estimate is −0.096.
Because the shrinkage estimate is a weighted average of the sample covariance
matrix and a scaled identity matrix, it is not surprising that the shrinkage
correlation estimates are closer to zero than are the sample correlations. The
shrinkage estimates also tend to be closer to zero than are the correlation
estimates based on the single-index model.
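
The effect of shrinkage on the correlations can be illustrated directly. The sketch below forms a weighted average of the sample covariance matrix printed in Example 9.4 and a scaled identity matrix. The weight lambda = 0.3 is a hypothetical value chosen for illustration; the function shrinkcovmat.equal selects its own weight from the data, so the numbers here will not match those reported above.

```r
nm <- c("CVC", "EIX", "EXPE", "HUM", "WMT")
# cov(stks) as printed in Example 9.4
S <- matrix(c(0.008273, 0.000401,  0.001046,  0.000808, 0.000354,
              0.000401, 0.002374,  0.000407,  0.000347, 0.000596,
              0.001046, 0.000407,  0.012067, -0.000974, 0.001223,
              0.000808, 0.000347, -0.000974,  0.005346, 0.000233,
              0.000354, 0.000596,  0.001223,  0.000233, 0.001950),
            5, 5, dimnames = list(nm, nm))

lambda <- 0.3              # hypothetical shrinkage weight, for illustration only
nu     <- mean(diag(S))    # scale of the target matrix nu * I
S_shrink <- (1 - lambda) * S + lambda * nu * diag(5)

off <- upper.tri(S)        # positions of the distinct off-diagonal elements
cbind(sample = cov2cor(S)[off], shrunk = cov2cor(S_shrink)[off])
```

Shrinking toward a scaled identity matrix scales every covariance by 1 - lambda while adding a positive amount to every variance, so each correlation moves toward zero without changing sign.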

9.5 Applications to Portfolio Analysis

In Proposition 9.1, it was shown that the single-index model implies a simple
form for the covariance matrix of asset returns:
\[
\Sigma = \sigma_m^2\beta\beta^T + \Sigma_\epsilon \tag{9.8}
\]
where $\Sigma_\epsilon$ is an $N \times N$ diagonal matrix, $\beta$ denotes the vector of betas with
respect to the return on a market index, and $\sigma_m^2$ is the variance of the return
on the market index. In this section, the implications of this form of the
covariance matrix for portfolio theory are considered.
Recall that the inverse covariance matrix, $\Sigma^{-1}$, plays a central role in
calculating the weight vectors of several types of optimal portfolios. For instance,
the weight vector of a “risk-averse” portfolio, as derived in Section 5.6, is of
the form
\[
w_{mv} + \frac{1}{\lambda}\Sigma^{-1}(\mu - \mu_{mv}\mathbf{1}) \tag{9.9}
\]
where $w_{mv}$ is the weight vector of the minimum-variance portfolio, given by
\[
w_{mv} = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^T\Sigma^{-1}\mathbf{1}},
\]
$\lambda$ is a scalar constant, and $\mu$ is the vector of asset return means. The weight
vector of the tangency portfolio is given by
\[
w_T = \frac{1}{\mathbf{1}^T\Sigma^{-1}(\mu - \mu_f\mathbf{1})}\,\Sigma^{-1}(\mu - \mu_f\mathbf{1})
\]
where $\mu_f$ denotes the risk-free rate; see Proposition 5.7.
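
The two weight vectors above can be computed without forming the inverse explicitly. The following sketch uses the covariance matrix of Example 9.1; the vector mu_excess of values of μ − μf 1 is hypothetical and chosen only for illustration.

```r
# Covariance matrix of Example 9.1
beta  <- c(0.8, 0.5, 1.1)
Sigma <- 0.05^2 * (beta %*% t(beta)) + diag(c(0.20, 0.25, 0.10)^2)

mu_excess <- c(0.010, 0.006, 0.012)   # hypothetical values of mu - mu_f * 1
ones <- rep(1, 3)

# solve(Sigma, x) returns Sigma^{-1} x without computing Sigma^{-1} itself
w_mv <- solve(Sigma, ones)
w_mv <- w_mv / sum(w_mv)              # minimum-variance weights
w_T  <- solve(Sigma, mu_excess)
w_T  <- w_T / sum(w_T)                # tangency weights

rbind(w_mv, w_T)
```

Solving the linear system is generally preferable to calling solve(Sigma) and multiplying, since it requires less computation and tends to be more stable numerically.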
When the single-index model holds,
\[
\mu - \mu_f\mathbf{1} = \alpha + \beta(\mu_m - \mu_f)
\]
and
\[
\Sigma = \sigma_m^2\beta\beta^T + \Sigma_\epsilon \tag{9.10}
\]
where $\sigma_m^2 = \operatorname{Var}(R_m)$ and $\Sigma_\epsilon$ is a diagonal matrix.
To estimate the weight vector of a given portfolio under the single-index
model, we can simply use the single-index-model estimates of Σ and μ − μf 1
in the expression for the portfolio weights. The following example illustrates
this for the case of the tangency portfolio.
Example 9.5 Consider the example of the returns on the stocks of the five
companies discussed in the examples in this chapter. Recall that stks.Sig is
an estimate of the return covariance matrix under the assumption that the
single-index model holds and stks.alpha and stks.beta are estimates of α
and β, respectively. Therefore, an estimate of wT is given by
> wgt<-solve(stks.Sig, stks.alpha + stks.beta*mean(sp500))
> w_T_si<-wgt/sum(wgt)
> w_T_si
CVC EIX EXPE HUM WMT
0.009 0.353 0.080 0.269 0.289
Note that sp500 contains the excess returns on the S&P 500 index.
The weights calculated here may be compared to the estimated weights
calculated without assuming that the single-index model holds.
> w_1T<-solve(cov(stks), apply(stks, 2, mean))
> w_T<-w_1T/sum(w_1T)
> w_T
CVC EIX EXPE HUM WMT
0.035 0.331 0.119 0.314 0.201
The two sets of weights are similar but there are differences; this is not
surprising because the single-index model imposes a strong assumption on
the relationship among the returns of the different stocks.
Note that the differences in the estimates are a consequence of the dif-
ferences in the estimates of Σ. Because of the properties of least-squares
estimates, the estimate of μ − μf 1 under the single-index model agrees with
the estimate based on the sample means of the excess returns:
> apply(stks, 2, mean)
CVC EIX EXPE HUM WMT
0.0116 0.0143 0.0205 0.0233 0.0109
> stks.alpha + stks.beta*mean(sp500)
CVC EIX EXPE HUM WMT
0.0116 0.0143 0.0205 0.0233 0.0109
The estimates of the tangency portfolio weights calculated earlier can also
be compared to the estimates based on a shrinkage estimate of the covariance
matrix. Using the shrinkage estimate based on a target matrix of the
form $\sigma^2 I$, which is stored in the variable stks.shrink (see Example 9.4), the
estimate of the tangency weight vector is given by
> w.sh1<-solve(stks.shrink, apply(stks, 2, mean))
> w.sh1/sum(w.sh1)
CVC EIX EXPE HUM WMT
0.061 0.278 0.149 0.331 0.180
The three sets of estimates are generally similar, with some diﬀerences; as
noted previously, such diﬀerences are not unexpected.

Note that many of the portfolio weight vectors we have considered depend
on the return covariance matrix Σ through its inverse Σ−1 . Under the form
of Σ using the single-index model, it is possible to derive a relatively simple
expression for this inverse. Although such an expression is not needed for
numerical work, it is useful for understanding how such portfolio weights are
related to the parameters of the single-index model.
Hence, we now consider such an expression for Σ−1 ; we begin with some
useful results on matrix inverses.

Some Preliminary Results on Matrix Inverses

We first consider the inverse of a matrix of the form $I + c\,dd^T$ where $d$ is an
$N \times 1$ vector, $I$ is the $N \times N$ identity matrix, and $c$ is a scalar. This result
will then be used to derive the inverse of a matrix of the form (9.8).
Lemma 9.1. For any scalar $c$ and any vector $d \in \mathbb{R}^N$ such that $1 + c\,d^T d \neq 0$,
\[
(I + c\,dd^T)^{-1} = I - \frac{c}{1 + c\,d^T d}\,dd^T.
\]
Proof. The result may be established by verifying that both
\[
(I + c\,dd^T)\left(I - \frac{c}{1 + c\,d^T d}\,dd^T\right) \tag{9.11}
\]
and
\[
\left(I - \frac{c}{1 + c\,d^T d}\,dd^T\right)(I + c\,dd^T) \tag{9.12}
\]
are equal to the identity matrix.
Consider (9.11).
\[
(I + c\,dd^T)\left(I - \frac{c}{1 + c\,d^T d}\,dd^T\right)
= I + c\,dd^T - \frac{c}{1 + c\,d^T d}\,dd^T - \frac{c^2}{1 + c\,d^T d}\,dd^T dd^T. \tag{9.13}
\]
Note that, because $d^T d$ is a scalar, we may write
\[
dd^T dd^T = d(d^T d)d^T = (d^T d)\,dd^T
\]
and, hence,
\[
\begin{aligned}
c\,dd^T - \frac{c}{1 + c\,d^T d}\,dd^T - \frac{c^2}{1 + c\,d^T d}\,dd^T dd^T
&= c\left(1 - \frac{1}{1 + c\,d^T d} - \frac{c\,d^T d}{1 + c\,d^T d}\right) dd^T \\
&= c\left(1 - \frac{1 + c\,d^T d}{1 + c\,d^T d}\right) dd^T = 0.
\end{aligned}
\]
It follows that the right-hand side of (9.13) is equal to $I$. A similar argument
can be used to show (9.12).
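
The identity in Lemma 9.1 is easy to check numerically. The following sketch uses an arbitrary simulated vector and an arbitrary scalar; the names d, c0, A, and A_inv are introduced here for illustration.

```r
set.seed(3)
N   <- 4
d   <- rnorm(N)       # arbitrary vector in R^N
c0  <- 0.7            # arbitrary scalar; here 1 + c0 * d'd is positive
dtd <- sum(d^2)       # the scalar d^T d

A     <- diag(N) + c0 * (d %*% t(d))
A_inv <- diag(N) - (c0 / (1 + c0 * dtd)) * (d %*% t(d))   # formula of Lemma 9.1

max(abs(A %*% A_inv - diag(N)))   # zero up to rounding error
```

The formula replaces a general N x N matrix inversion by a single inner product and an outer product, which is the source of the simple expressions derived in the remainder of this section.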

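The result in Lemma 9.1 may now be used to determine the inverse of a
matrix of the form $\sigma_m^2\beta\beta^T + \Sigma_\epsilon$.
Lemma 9.2. Let $\Sigma_\epsilon$ be an $N \times N$ positive-definite matrix and let $\beta$ be an
element of $\mathbb{R}^N$. Then, for all $\sigma_m^2 \geq 0$,
\[
\left(\sigma_m^2\beta\beta^T + \Sigma_\epsilon\right)^{-1}
= \Sigma_\epsilon^{-1} - \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,
\Sigma_\epsilon^{-1}\beta\beta^T\Sigma_\epsilon^{-1}. \tag{9.14}
\]
Proof. Note that we may write
\[
\sigma_m^2\beta\beta^T + \Sigma_\epsilon = \Sigma_\epsilon + \sigma_m^2\beta\beta^T
= \Sigma_\epsilon^{1/2}\left(I + \sigma_m^2(\Sigma_\epsilon^{-1/2}\beta)(\Sigma_\epsilon^{-1/2}\beta)^T\right)\Sigma_\epsilon^{1/2}.
\]
Hence,
\[
(\Sigma_\epsilon + \sigma_m^2\beta\beta^T)^{-1}
= \Sigma_\epsilon^{-1/2}\left(I_N + \sigma_m^2(\Sigma_\epsilon^{-1/2}\beta)(\Sigma_\epsilon^{-1/2}\beta)^T\right)^{-1}\Sigma_\epsilon^{-1/2}. \tag{9.15}
\]
Applying Lemma 9.1 with $c = \sigma_m^2$ and $d = \Sigma_\epsilon^{-1/2}\beta$, it follows that
\[
\left(I_N + \sigma_m^2(\Sigma_\epsilon^{-1/2}\beta)(\Sigma_\epsilon^{-1/2}\beta)^T\right)^{-1}
= I_N - \frac{c}{1 + c\,(\Sigma_\epsilon^{-1/2}\beta)^T(\Sigma_\epsilon^{-1/2}\beta)}\,
(\Sigma_\epsilon^{-1/2}\beta)(\Sigma_\epsilon^{-1/2}\beta)^T.
\]
Using this result in (9.15) yields
\[
(\sigma_m^2\beta\beta^T + \Sigma_\epsilon)^{-1} = (\Sigma_\epsilon + \sigma_m^2\beta\beta^T)^{-1}
= \Sigma_\epsilon^{-1} - \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,
\Sigma_\epsilon^{-1}\beta\beta^T\Sigma_\epsilon^{-1} \tag{9.16}
\]
as given in the statement of the lemma.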
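
The expression (9.14) can be checked numerically against the result of solve. The sketch below uses the parameters of Example 9.1, for which the residual covariance matrix is diagonal, so that multiplying by its inverse amounts to elementwise division; the variable names are introduced here for illustration.

```r
beta   <- c(0.8, 0.5, 1.1)          # parameters of Example 9.1
sig2_e <- c(0.20, 0.25, 0.10)^2     # residual variances (diagonal of Sigma_eps)
sig2_m <- 0.05^2                    # market variance

Sigma <- sig2_m * (beta %*% t(beta)) + diag(sig2_e)

b_s   <- beta / sig2_e                        # Sigma_eps^{-1} beta (diagonal case)
denom <- 1 + sig2_m * sum(beta^2 / sig2_e)    # 1 + sigma_m^2 beta' Sigma_eps^{-1} beta
Sigma_inv <- diag(1 / sig2_e) - (sig2_m / denom) * (b_s %*% t(b_s))   # (9.14)

max(abs(Sigma_inv - solve(Sigma)))   # agreement up to rounding error
```

In the diagonal case the inverse requires only O(N) divisions plus one outer product, rather than a general matrix inversion.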

Note that Lemma 9.2 does not require $\Sigma_\epsilon$ to be a diagonal matrix; however,
when it is a diagonal matrix, a simple expression for $\Sigma_\epsilon^{-1}$ is available. Lemma
9.2 also shows that the covariance matrix $\sigma_m^2\beta\beta^T + \Sigma_\epsilon$ is invertible provided
that $\Sigma_\epsilon$ is invertible. The same argument shows that the estimated covariance
matrix $S_m^2\hat\beta\hat\beta^T + \hat\Sigma_\epsilon$ is invertible under the weak condition that
$\hat\sigma_{\epsilon,j}^2 > 0$ for $j = 1, 2, \ldots, N$.
The explicit expression for the inverse of $\Sigma$ in this setting can be used
to derive explicit expressions for the weight vectors of some of the optimal
portfolios we have considered in terms of $\beta$ and $\Sigma_\epsilon$.

Weight Vector of the Tangency Portfolio under the Single-Index Model
The explicit expression for the matrix inverse in Lemma 9.2 may be used to
obtain an explicit expression for the weights of the tangency portfolio. As
noted previously, such an expression is not needed for numerical work, but it
is useful for understanding how the tangency portfolio weights depend on the
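parameters $\alpha$, $\beta$, and $\Sigma_\epsilon$.
Proposition 9.2. Consider a set of $N$ assets and let $R_t$ denote the
corresponding vector of asset returns at time $t$. Assume that $R_t$ follows the
single-index model so that $\mu$, the mean vector of $R_t$, is of the form
\[
\mu = \mu_f\mathbf{1} + \alpha + (\mu_m - \mu_f)\beta
\]
and $\Sigma$, the covariance matrix of $R_t$, is of the form
\[
\Sigma = \sigma_m^2\beta\beta^T + \Sigma_\epsilon
\]
where $\Sigma_\epsilon$ is a diagonal matrix with diagonal elements $\sigma_{\epsilon,1}^2, \sigma_{\epsilon,2}^2, \ldots, \sigma_{\epsilon,N}^2$.
Let $w_T$ denote the weight vector of the tangency portfolio based on $R_t$.
Then
\[
w_T = \frac{v}{\mathbf{1}^T v}
\]
where
\[
v = \Sigma_\epsilon^{-1}\left(\alpha + (\mu_m - \mu_f - \gamma)\beta\right)
\]
and
\[
\begin{aligned}
\gamma &= \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f\mathbf{1}) \\
&= \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,\beta^T\Sigma_\epsilon^{-1}\left(\alpha + (\mu_m - \mu_f)\beta\right).
\end{aligned}
\]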

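Proof. For a given value of the mean vector $\mu$, consider calculation of the
weight vector of the tangency portfolio when $\Sigma$ is given by $\sigma_m^2\beta\beta^T + \Sigma_\epsilon$. Using
(9.14), this weight vector is proportional to
\[
\left(\Sigma_\epsilon^{-1} - \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,
\Sigma_\epsilon^{-1}\beta\beta^T\Sigma_\epsilon^{-1}\right)(\mu - \mu_f\mathbf{1}), \tag{9.17}
\]
which may be written as
\[
\Sigma_\epsilon^{-1}(\mu - \mu_f\mathbf{1}) - \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,
\Sigma_\epsilon^{-1}\beta\left(\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f\mathbf{1})\right).
\]
Because $\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f\mathbf{1})$ is a scalar,
\[
\frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,
\Sigma_\epsilon^{-1}\beta\left(\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f\mathbf{1})\right) = \gamma\,\Sigma_\epsilon^{-1}\beta
\]
where
\[
\gamma = \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,\beta^T\Sigma_\epsilon^{-1}(\mu - \mu_f\mathbf{1})
= \frac{\sigma_m^2}{1 + \sigma_m^2\beta^T\Sigma_\epsilon^{-1}\beta}\,\beta^T\Sigma_\epsilon^{-1}\left(\alpha + (\mu_m - \mu_f)\beta\right).
\]
Therefore, (9.17) may be written as
\[
\Sigma_\epsilon^{-1}(\mu - \mu_f\mathbf{1}) - \gamma\,\Sigma_\epsilon^{-1}\beta;
\]
substituting $\alpha + (\mu_m - \mu_f)\beta$ for $\mu - \mu_f\mathbf{1}$ shows that, under the single-index
model, $w_T$ is proportional to $v$, as defined in the statement of the proposition.
The result now follows from noting that the tangency weights must sum to 1.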
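
The closed-form weights of Proposition 9.2 can be compared with weights computed directly from the full covariance matrix. In the sketch below, the betas and residual variances are those of Example 9.1, while the alphas and the market premium mkt_prem are hypothetical values chosen for illustration.

```r
beta     <- c(0.8, 0.5, 1.1)          # betas of Example 9.1
sig2_e   <- c(0.20, 0.25, 0.10)^2     # residual variances of Example 9.1
sig2_m   <- 0.05^2                    # market variance of Example 9.1
alpha    <- c(0.002, -0.001, 0.003)   # hypothetical alphas
mkt_prem <- 0.006                     # hypothetical value of mu_m - mu_f

mu_excess <- alpha + mkt_prem * beta  # mu - mu_f * 1 under the single-index model

# gamma and v as defined in Proposition 9.2 (Sigma_eps is diagonal)
gam <- sig2_m * sum(beta * mu_excess / sig2_e) /
  (1 + sig2_m * sum(beta^2 / sig2_e))
v <- (alpha + (mkt_prem - gam) * beta) / sig2_e
w_closed <- v / sum(v)

# Direct calculation from the full covariance matrix, for comparison
Sigma    <- sig2_m * (beta %*% t(beta)) + diag(sig2_e)
w_direct <- solve(Sigma, mu_excess)
w_direct <- w_direct / sum(w_direct)

rbind(w_closed, w_direct)
```

The two rows agree up to rounding error, which also confirms that the sign in front of (μm − μf)β inside γ is positive.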

This result is useful for gaining some insight into the relationship
between weights of the tangency portfolio and the parameters
$(\alpha_j, \beta_j, \sigma_{\epsilon,j}^2)$, j = 1, 2, . . . , N . Note that $v_j$, the jth element of v,
may be written as
$$v_j = \alpha_j/\sigma_{\epsilon,j}^2 + \delta\, \beta_j/\sigma_{\epsilon,j}^2, \quad j = 1, 2, \ldots, N$$
where $\delta = \mu_m - \mu_f - \gamma$. That is, the tangency weight for asset j is
proportional to a linear function of $\alpha_j/\sigma_{\epsilon,j}^2$ and $\beta_j/\sigma_{\epsilon,j}^2$.
If α = 0, as would be the case if the capital asset pricing model (CAPM)
holds, then
$$v = (\mu_m - \mu_f - \gamma_0)\, \Sigma_\epsilon^{-1}\beta$$
where
$$\gamma_0 = \frac{\sigma_m^2}{1 + \sigma_m^2 \beta^T \Sigma_\epsilon^{-1} \beta}\, \beta^T \Sigma_\epsilon^{-1}\beta\, (\mu_m - \mu_f).$$
That is, v is proportional to $\Sigma_\epsilon^{-1}\beta$ so that the weight given to asset j in
the tangency portfolio is proportional to $\beta_j/\sigma_{\epsilon,j}^2$. Such weights may be viewed
as a simple approximation to the tangency weights that holds when the |αj |
are small. According to this approximation, asset j receives a large weight in
the tangency portfolio when βj is large relative to the residual variance for
asset j.
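The closed-form expression for v can be checked numerically against the direct calculation of the tangency weights. The sketch below uses made-up parameter values (alpha, beta, s2_eps, mu_m_ex, and s2_m are assumptions chosen only for illustration, not estimates from the text) and compares the two routes.

```r
# Hypothetical single-index parameters for N = 4 assets (illustrative only)
alpha   <- c(0.002, -0.001, 0.004, 0.000)
beta    <- c(0.8, 1.1, 1.3, 0.6)
s2_eps  <- c(0.004, 0.006, 0.010, 0.003)  # residual variances
mu_m_ex <- 0.006                          # mu_m - mu_f
s2_m    <- 0.0016                         # variance of the market return

# Direct route: weights proportional to Sigma^{-1} (mu - mu_f 1)
Sigma    <- s2_m * beta %o% beta + diag(s2_eps)
mu_ex    <- alpha + mu_m_ex * beta
w_direct <- as.numeric(solve(Sigma, mu_ex))
w_direct <- w_direct / sum(w_direct)

# Closed form: v_j = alpha_j / s2_eps_j + delta * beta_j / s2_eps_j
q        <- sum(beta^2 / s2_eps)          # beta' Sigma_eps^{-1} beta
gamma    <- s2_m / (1 + s2_m * q) * sum(beta * mu_ex / s2_eps)
delta    <- mu_m_ex - gamma
v        <- alpha / s2_eps + delta * beta / s2_eps
w_closed <- v / sum(v)

all.equal(w_direct, w_closed)
```

Because the closed form only needs the diagonal of the residual covariance matrix, it avoids inverting the full N × N covariance matrix.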
Example 9.6 For the stocks analyzed in Example 9.5, consider the approximation
to the tangency portfolio weights based on $\beta_j/\sigma_{\epsilon,j}^2$, j = 1, . . . , 5.
Recall that the estimates of βj for these stocks are stored in the variable
stks.beta and the estimates of $\sigma_{\epsilon,j}$ are stored in the variable stks.s. Then
the approximations to the tangency portfolio weights are given by
> stks.beta/(stks.s^2)/sum(stks.beta/(stks.s^2))
CVC EIX EXPE HUM WMT
0.182 0.260 0.092 0.155 0.311
These weights may be compared to the estimated tangency portfolio
weights based on the single-index model
> w_T_si
CVC EIX EXPE HUM WMT
0.009 0.353 0.080 0.269 0.289


9.6 Active Portfolio Management and the Treynor–Black Method
According to the assumptions used to derive the CAPM, the market portfolio
is the optimal choice for investors; see Section 7.1. In practice, such a market
portfolio is unobservable; hence, we use a suitably chosen market index as
a substitute. If we accept the optimality of the portfolio corresponding to
the market index, then the portfolio problem is solved—simply invest in a
portfolio that tracks the market index along with the risk-free asset. This is
sometimes referred to as a passive approach to investing because the investor
does not need to choose the individual assets or their weights in constructing
a portfolio. However, if the portfolio corresponding to the market index is not
efficient, then it may be possible to modify it in order to improve portfolio
performance, a process known as active portfolio management.
Active portfolio management is based on a benchmark portfolio, assumed
to have desirable properties—high expected return and low risk—but not
assumed to satisfy any optimality criteria. The goal in active portfolio man-
agement is to construct a portfolio with a return that is highly correlated with
the return on the benchmark portfolio but that improves on it by having a
higher expected return and/or lower risk.
The benchmark portfolio is generally taken to be the portfolio based on
a market index, such as the S&P 500 index. In this section, we will take the
benchmark portfolio to be the portfolio corresponding to the market index
used in the single-index model; we will refer to this portfolio as the market
portfolio. However, it is important to keep in mind that we do not assume
that the market portfolio is efficient or nearly efficient; on the contrary, it is
exactly because of the inefficiency of the market portfolio that active portfolio
management is considered to be a worthwhile activity.
Consider a set of N assets and let Rt be the corresponding vector of asset
returns at time t, t = 1, 2, . . . , T . We assume that the single-index model holds
for Rt so that, as described in Section 9.2,

$$E(R_t) = \alpha + (\mu_m - \mu_f)\beta$$

and Σ, the covariance matrix of $R_t$, is of the form

$$\Sigma = \sigma_m^2 \beta\beta^T + \Sigma_\epsilon,$$

where $\Sigma_\epsilon$ is a diagonal matrix.

Let $w_p$ denote the weight vector of a given portfolio and let $R_{p,t} = w_p^T R_t$
denote the corresponding return variable, t = 1, 2, . . . , T . Then

$$E(R_{p,t}) = w_p^T \alpha + w_p^T \beta (\mu_m - \mu_f) = \alpha_p + \beta_p (\mu_m - \mu_f),$$

where $\alpha_p = w_p^T \alpha$ and $\beta_p = w_p^T \beta$ are the values of alpha and beta, respectively,
for the portfolio. The variance of the portfolio return is given by

$$\begin{aligned}
\mathrm{Var}(R_{p,t}) &= w_p^T \left(\sigma_m^2 \beta\beta^T + \Sigma_\epsilon\right) w_p \\
&= \sigma_m^2 (w_p^T \beta)(\beta^T w_p) + w_p^T \Sigma_\epsilon w_p \\
&= \sigma_m^2 \beta_p^2 + w_p^T \Sigma_\epsilon w_p .
\end{aligned}$$
The goal in active portfolio management is to choose wp so that the resulting
portfolio outperforms the market portfolio.
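As a quick numerical check of the variance decomposition into a market component and a residual component, the following sketch evaluates both sides for made-up inputs (the values of beta, s2_eps, s2_m, and w_p are assumptions for illustration only).

```r
# Hypothetical parameters (illustrative only)
beta   <- c(0.8, 1.1, 1.3)
s2_eps <- c(0.004, 0.006, 0.010)  # diagonal of the residual covariance matrix
s2_m   <- 0.0016                  # variance of the market return
w_p    <- c(0.5, 0.3, 0.2)        # portfolio weights

Sigma  <- s2_m * beta %o% beta + diag(s2_eps)
beta_p <- sum(w_p * beta)         # portfolio beta

var_full  <- drop(t(w_p) %*% Sigma %*% w_p)          # w' Sigma w
var_split <- s2_m * beta_p^2 + sum(w_p^2 * s2_eps)   # market + residual parts
c(full = var_full, split = var_split)
```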
The approach used here to construct such a portfolio is based on the
following idea. We may view the market portfolio as a single asset in which we
may place an investment; hence, we have, eﬀectively, N + 1 assets from which
to form a portfolio. That is, we consider the market portfolio to be a tradeable
asset. This is, in fact, essentially the case as there are a number of mutual
funds constructed to track various market indices. We then choose the optimal
weights for these N + 1 assets. The result is a combination of the market
portfolio and the N assets under consideration. This is known as the Treynor–
Black method.

Adding a Single Asset to the Market Portfolio

We begin by considering the simplest case, in which we attempt to improve on
the market portfolio by combining it with a single asset. Suppose we have an
asset with return Ri,t at time t and let Rm,t denote the return on the market
portfolio at time t. Under the assumption that the single-index model holds
with respect to Rm,t for asset i, we can write
$$R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T.$$
Here $R_{f,t}$ denotes the return on the risk-free asset at time t and
$\epsilon_{i,1}, \epsilon_{i,2}, \ldots, \epsilon_{i,T}$ is a sequence of uncorrelated, mean-zero random variables
that are uncorrelated with the $R_{m,t}$. Note that because we are only considering
one asset, in this case, the single-index model is equivalent to the market
model.
Consider the tangency portfolio formed from asset i and the market port-
folio. Recall that, when forming the tangency portfolio from two assets, the
weight given to asset 1 is given by
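$$\frac{(\mu_1 - \mu_f)\sigma_2^2 - (\mu_2 - \mu_f)\rho_{12}\sigma_1\sigma_2}{(\mu_2 - \mu_f)\sigma_1^2 + (\mu_1 - \mu_f)\sigma_2^2 - \left[(\mu_1 - \mu_f) + (\mu_2 - \mu_f)\right]\rho_{12}\sigma_1\sigma_2}, \tag{9.18}$$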
where μ1 , μ2 are the mean returns on the two assets, σ1 , σ2 are the return
standard deviations, and ρ12 is the correlation of the returns.
Taking the asset with return Ri,t to be asset 1 and the market portfolio
to be asset 2, here
$$\mu_1 - \mu_f = E(R_{i,t}) - \mu_f = \alpha_i + \beta_i (\mu_m - \mu_f),$$
where $\mu_m = E(R_{m,t})$,
$$\sigma_1^2 = \mathrm{Var}(R_{i,t}) = \beta_i^2 \sigma_m^2 + \sigma_{\epsilon,i}^2$$
where $\sigma_m^2 = \mathrm{Var}(R_{m,t})$ and $\sigma_{\epsilon,i}^2 = \mathrm{Var}(\epsilon_{i,t})$,
$$\rho_{12}\sigma_1\sigma_2 = \mathrm{Cov}(R_{i,t}, R_{m,t}) = \mathrm{Cov}(\beta_i R_{m,t}, R_{m,t}) = \beta_i \sigma_m^2,$$
$\mu_2 = \mu_m$, and $\sigma_2 = \sigma_m$.
Using these values in (9.18), and simplifying, leads to the expression
$$w_i^* = \frac{\alpha_i/\sigma_{\epsilon,i}^2}{(\mu_m - \mu_f)/\sigma_m^2 + (1 - \beta_i)\alpha_i/\sigma_{\epsilon,i}^2} \tag{9.19}$$
for the weight given to asset 1, which in this case is asset i. The weight given
to the market portfolio is therefore
$$1 - w_i^* = \frac{(\mu_m - \mu_f)/\sigma_m^2 - \beta_i \alpha_i/\sigma_{\epsilon,i}^2}{(\mu_m - \mu_f)/\sigma_m^2 + (1 - \beta_i)\alpha_i/\sigma_{\epsilon,i}^2}.$$
We will refer to the portfolio placing weight $w_i^*$ on asset i and weight $1 - w_i^*$ on
the market portfolio as the Treynor–Black portfolio of asset i and the market
portfolio.
If asset i is priced correctly relative to the market portfolio, in the sense
that αi = 0, then the optimal weight to give asset i is 0; that is, we cannot
improve the market portfolio by including more or less of asset i. However,
if $\alpha_i \neq 0$, then the market portfolio may be improved by combining it with
asset i.
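The simplification from the two-asset tangency formula (9.18) to the Treynor–Black weight (9.19) can be verified numerically. In this sketch the function names tangency_w1 and tb_w, and all input values, are hypothetical and introduced only for illustration.

```r
# Weight on asset 1 in the two-asset tangency portfolio, formula (9.18)
tangency_w1 <- function(mu1_ex, mu2_ex, sd1, sd2, rho) {
  num <- mu1_ex * sd2^2 - mu2_ex * rho * sd1 * sd2
  den <- mu2_ex * sd1^2 + mu1_ex * sd2^2 - (mu1_ex + mu2_ex) * rho * sd1 * sd2
  num / den
}

# Weight on asset i in the Treynor-Black portfolio, formula (9.19)
tb_w <- function(alpha, beta, s2_eps, mu_m_ex, s2_m) {
  (alpha / s2_eps) / (mu_m_ex / s2_m + (1 - beta) * alpha / s2_eps)
}

# Hypothetical inputs (illustrative only)
alpha <- 0.004; beta <- 1.2; s2_eps <- 0.005
mu_m_ex <- 0.006; s2_m <- 0.0016

sd1 <- sqrt(beta^2 * s2_m + s2_eps)   # standard deviation of asset i
sd2 <- sqrt(s2_m)                     # standard deviation of the market
rho <- beta * s2_m / (sd1 * sd2)      # correlation implied by the model

w_918 <- tangency_w1(alpha + beta * mu_m_ex, mu_m_ex, sd1, sd2, rho)
w_919 <- tb_w(alpha, beta, s2_eps, mu_m_ex, s2_m)
all.equal(w_918, w_919)

# An asset with alpha equal to zero receives zero weight
w_zero <- tb_w(0, beta, s2_eps, mu_m_ex, s2_m)
```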
The Treynor–Black portfolio of asset i and the market portfolio has mean
excess return
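$$\begin{aligned}
w_i^* (\mu_i - \mu_f) + (1 - w_i^*)(\mu_m - \mu_f) &= w_i^* \left(\alpha_i + \beta_i (\mu_m - \mu_f)\right) + (1 - w_i^*)(\mu_m - \mu_f) \\
&= w_i^* \alpha_i + (w_i^* \beta_i + 1 - w_i^*)(\mu_m - \mu_f) \\
&= \frac{1}{c}\left(\alpha_i^2/\sigma_{\epsilon,i}^2 + (\mu_m - \mu_f)^2/\sigma_m^2\right)
\end{aligned}$$
where
$$c = (\mu_m - \mu_f)/\sigma_m^2 + (1 - \beta_i)\alpha_i/\sigma_{\epsilon,i}^2$$
is the denominator in the expressions for $w_i^*$ and $1 - w_i^*$.
The variance of the return on this portfolio is given by
$$\begin{aligned}
&(w_i^*)^2 \sigma_i^2 + (1 - w_i^*)^2 \sigma_m^2 + 2 w_i^* (1 - w_i^*)\,\mathrm{Cov}(R_i, R_m) \\
&\quad = (w_i^*)^2 (\beta_i^2 \sigma_m^2 + \sigma_{\epsilon,i}^2) + (1 - w_i^*)^2 \sigma_m^2 + 2 w_i^* (1 - w_i^*) \beta_i \sigma_m^2 \\
&\quad = (w_i^*)^2 \sigma_{\epsilon,i}^2 + (w_i^* \beta_i + 1 - w_i^*)^2 \sigma_m^2 \\
&\quad = \frac{1}{c^2}\left(\alpha_i^2/\sigma_{\epsilon,i}^2 + \left(\frac{\mu_m - \mu_f}{\sigma_m^2}\right)^2 \sigma_m^2\right) \\
&\quad = \frac{1}{c^2}\left(\alpha_i^2/\sigma_{\epsilon,i}^2 + \frac{(\mu_m - \mu_f)^2}{\sigma_m^2}\right).
\end{aligned}$$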


It follows that the squared Sharpe ratio of the Treynor–Black portfolio is
given by
$$\frac{\left(\alpha_i^2/\sigma_{\epsilon,i}^2 + (\mu_m - \mu_f)^2/\sigma_m^2\right)^2}{\alpha_i^2/\sigma_{\epsilon,i}^2 + (\mu_m - \mu_f)^2/\sigma_m^2} = \alpha_i^2/\sigma_{\epsilon,i}^2 + (\mu_m - \mu_f)^2/\sigma_m^2. \tag{9.20}$$

Note that $(\mu_m - \mu_f)^2/\sigma_m^2$ is the squared Sharpe ratio of the market portfolio.
Therefore, if $\alpha_i = 0$, the Sharpe ratio of the Treynor–Black portfolio is
equal to the Sharpe ratio of the market portfolio. However, if $\alpha_i \neq 0$, the
Sharpe ratio of the Treynor–Black portfolio is greater than that of the market
portfolio. The quantity $\alpha_i/\sigma_{\epsilon,i}$ is known as the appraisal ratio of the asset;
recall that we considered the appraisal ratio as a measure of portfolio performance
in Section 8.9. In the present context, the magnitude of the appraisal
ratio indicates the possible improvement in the Sharpe ratio of the market
portfolio that may be achieved by including more or less of the asset.

Example 9.7 Suppose that Rm , the return on the market portfolio, has mean
0.025 and standard deviation 0.04 and that the risk-free rate is μf = 0.005.
Consider an asset with return Ri with mean 0.02, standard deviation 0.08,
and suppose that the correlation of Ri and Rm is ρi = 0.30 so that βi = 0.60
and
$$\sigma_{\epsilon,i}^2 = (0.08)^2 - (0.60)^2 (0.04)^2 = 0.00582.$$

In Example 7.6, it was shown that αi = 0.008 so that asset i is
mispriced, with a price that is too low.
Therefore, we can improve on the market portfolio by constructing a port-
folio based on the market portfolio and asset i. The weight given to asset i in
such a portfolio is

$$\frac{\alpha_i/\sigma_{\epsilon,i}^2}{(\mu_m - \mu_f)/\sigma_m^2 + (1 - \beta_i)\alpha_i/\sigma_{\epsilon,i}^2}
= \frac{(0.008)/(0.00582)}{(0.025 - 0.005)/(0.04)^2 + (1 - 0.60)(0.008)/(0.00582)} = 0.105;$$

the market portfolio receives weight 1 − 0.105 = 0.895.

The squared Sharpe ratio of this portfolio is

$$(0.008)^2/(0.00582) + (0.025 - 0.005)^2/(0.04)^2 = 0.261$$

corresponding to a Sharpe ratio of $\sqrt{0.261} = 0.511$. This can be compared to
the Sharpe ratio of the market portfolio, (0.025 − 0.005)/(0.04) = 0.5.

Of course, in practice, these quantities must be estimated based on
observed return data.
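The arithmetic in Example 9.7 can be reproduced in a few lines of R. This is a sketch only: the variable names are introduced here for illustration, and the value of alpha_i is taken as quoted in the example rather than recomputed.

```r
mu_m <- 0.025; sd_m <- 0.04; mu_f <- 0.005
sd_i <- 0.08; rho_i <- 0.30
alpha_i <- 0.008                        # value quoted in the example

beta_i <- rho_i * sd_i / sd_m           # beta implied by the correlation
s2_eps <- sd_i^2 - beta_i^2 * sd_m^2    # residual variance

cc  <- (mu_m - mu_f) / sd_m^2 + (1 - beta_i) * alpha_i / s2_eps
w_i <- (alpha_i / s2_eps) / cc          # weight on asset i

# Mean excess return and variance of the combined portfolio
mean_ex   <- w_i * alpha_i + (w_i * beta_i + 1 - w_i) * (mu_m - mu_f)
var_tb    <- w_i^2 * s2_eps + (w_i * beta_i + 1 - w_i)^2 * sd_m^2
sharpe_tb <- mean_ex / sqrt(var_tb)

round(c(weight = w_i, sharpe_tb = sharpe_tb,
        sharpe_mkt = (mu_m - mu_f) / sd_m), 3)
```

Computing the Sharpe ratio from the portfolio mean and variance, rather than from (9.20), gives an independent check that the two routes agree.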


Example 9.8 Consider the stock in Apple Inc. (symbol AAPL); here we
analyze five years of monthly returns for the period ending December 31,
2014. The variables aapl.alpha and aapl.s contain the estimates of $\alpha_i$ and
$\sigma_{\epsilon,i}$, respectively, for this asset. Then the estimated appraisal ratio for Apple is
> aapl.alpha/aapl.s
(Intercept)
0.233
The estimated Sharpe ratio of the S&P 500 index is 0.290. Thus, according
to (9.20), the estimated Sharpe ratio of the Treynor–Black portfolio based on
Apple stock together with the S&P 500 index is
$$\left((0.233)^2 + (0.290)^2\right)^{1/2} = 0.372.$$

The weight placed on Apple in this portfolio is estimated to be

> cc<-mean(sp500)/var(sp500) + (1-aapl.beta)*aapl.alpha/aapl.s^2
> (aapl.alpha/aapl.s^2)/cc
0.442
Here, aapl.beta is the estimate of beta for Apple stock.

Of course, these results are estimates based on observed data; hence, it
is important to keep in mind the sampling properties of the results. We have
discussed two approaches to assessing such properties—the Monte Carlo sim-
ulation method used in Section 6.7 to study the sampling distribution of an
estimator and the bootstrap method used in Section 8.10 to calculate standard
errors and to estimate bias. Although either approach could be used here, for
simplicity, we consider only the bootstrap method, as in the following example.
Example 9.9 Consider estimation of the appraisal ratio of Apple stock, as
discussed in Example 9.8. Recall that the estimate obtained using five years
of monthly data for the period ending December 31, 2014, is 0.233; the return
data for Apple stock are stored in the variable aapl.
To obtain the standard error of this estimate, we use the procedure
described in Section 8.10; specifically, we follow the method described in
Example 8.22 for calculating the standard error of an estimated Treynor ratio.
Define a function appraisal that may be used to compute the appraisal
ratio of a stock:
> appraisal<-function(rmat, ind){
+ ret<-rmat[ind, 1]
+ mkt<-rmat[ind, 2]
+ mm<-lm(ret~mkt)
+ alpha<-mm$coefficients[1]
+ sighat<-summary(mm)$sigma
+ alpha/sighat
+ }


Then the standard error of the estimated appraisal ratio for Apple may be
calculated using
> library(boot)
> boot(cbind(aapl, sp500), appraisal, 10000)
ORDINARY NONPARAMETRIC BOOTSTRAP

Bootstrap Statistics :
original bias std. error
t1* 0.233 0.00571 0.141
Thus, an approximate 95% conﬁdence interval for the true appraisal ratio
of Apple stock is

0.233 ± 1.96(0.141) = (−0.043, 0.509).

Hence, although there is some evidence to suggest that increasing the invest-
ment in Apple stock leads to a portfolio with a larger Sharpe ratio, a formal
test of the hypothesis that the true appraisal ratio is zero does not reject the
null hypothesis at the 5% level.
A similar approach may be used to calculate a standard error for the
estimated weight given to Apple stock in the Treynor–Black portfolio. The
standard error based on a bootstrap sample size of 10,000 was calculated
to be 6.84. An extremely large value such as this should be interpreted as
an indication that there is large variability in bootstrap replications of the
Treynor–Black weight.
In order to investigate this variability, the vector of bootstrap estimates of
the Treynor–Black weight may be saved to a variable using
> aapl.boot<-boot(cbind(aapl, sp500), tb.wgt, 10000)
Here tb.wgt is a user-defined function, similar to appraisal defined earlier,
that calculates the Treynor–Black weight. The bootstrap replications of the
statistic specified in the boot function are stored in the component $t of the
result, in this case aapl.boot$t.
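The text does not list tb.wgt. One possible form, modeled on appraisal and on the weight calculation in Example 9.8, is sketched below. It is an assumption rather than the original function: it takes column 1 of rmat to hold the asset returns and column 2 the market returns, both as excess returns. The simulated data are for illustration only.

```r
# Hypothetical sketch of tb.wgt (not the original definition)
tb.wgt <- function(rmat, ind){
  ret <- rmat[ind, 1]                      # asset returns for the resample
  mkt <- rmat[ind, 2]                      # market returns for the resample
  mm <- lm(ret ~ mkt)                      # market-model regression
  alpha <- unname(mm$coefficients[1])
  beta <- unname(mm$coefficients[2])
  s2 <- summary(mm)$sigma^2                # estimated residual variance
  cc <- mean(mkt)/var(mkt) + (1 - beta)*alpha/s2
  (alpha/s2)/cc                            # estimated Treynor-Black weight
}

# Simulated returns, for illustration only
set.seed(1)
mkt <- rnorm(60, mean = 0.01, sd = 0.04)
ret <- 0.005 + 1.1*mkt + rnorm(60, sd = 0.06)
w_hat <- tb.wgt(cbind(ret, mkt), 1:60)
w_hat
```

A function of this form has the two-argument signature (data, indices) that boot requires, so it could be passed to boot in the same way as appraisal.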
The sample quantiles of the 10,000 values in aapl.boot$t may be
calculated using
> quantile(aapl.boot$t, prob=c(0.01, 0.05, 0.10, 0.50, 0.90,
+ 0.95, 0.99))
1% 5% 10% 50% 90% 95% 99%
-0.54860 0.00923 0.09006 0.42768 1.23355 1.77756 5.23613
These results show that there is considerable variability in these values, sug-
gesting that the Treynor–Black weight is not accurately estimated, at least
with the amount of data considered here.
It is worth noting that this same high degree of variability is observed
if other assets are analyzed in place of Apple stock. Also, recall that in
Example 6.20, which considered a Monte Carlo study of the properties of
estimated weights for the tangency portfolio, this same large variability was
observed. Thus, given that the Treynor–Black portfolio is a type of tangency
portfolio, the results obtained here are not surprising.

The Treynor–Black Portfolio of N Assets Together with the Market Portfolio
We now consider the case in which there are N assets available for investment,
together with the market portfolio, which plays the role of the (N + 1)st asset.
Assume that the single-index model holds for all N + 1 assets. Because
the (N + 1)st asset is the market index, $\alpha_{N+1} = 0$, $\beta_{N+1} = 1$, and $\sigma_{\epsilon,N+1}^2 = 0$.
We then construct a tangency portfolio from these N + 1 assets. Therefore,
instead of deriving the tangency portfolio weights for the N assets, as in
Proposition 9.2, we start with the market index as one asset and derive an
expression for how the market index must be modiﬁed when forming the
tangency portfolio.
Note that the set of N assets need not contain all the assets in the market.
In fact, if the set of N assets includes all assets in the market, then the
single-index model cannot hold. To see this, note that in this case, the return
on the market portfolio at time t would be of the form $\sum_{j=1}^N w_{m,j} R_{j,t}$ for
some market portfolio weights $w_{m,1}, w_{m,2}, \ldots, w_{m,N}$; hence, the error term
in the market model for the market portfolio is of the form $\sum_{j=1}^N w_{m,j}\epsilon_{j,t}$.
Because the error term in the market model for the market portfolio must
be zero, the variance of $\sum_{j=1}^N w_{m,j}\epsilon_{j,t}$ must be zero, contradicting the
single-index-model assumption of uncorrelated error terms. Hence, we assume that
the set of N assets is a subset of the assets in the market, for which the
single-index model holds.
Let Σ+ denote the covariance matrix of the returns on the N + 1 assets.
Then Σ+ may be written as a partitioned matrix,
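$$\Sigma_+ = \begin{pmatrix} \sigma_m^2 \beta\beta^T + \Sigma_\epsilon & \sigma_m^2 \beta \\ \sigma_m^2 \beta^T & \sigma_m^2 \end{pmatrix}. \tag{9.21}$$
Similarly, we may write the mean vector of the N + 1 assets as
$$\mu_+ = \begin{pmatrix} \mu \\ \mu_m \end{pmatrix},$$
where
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_N \end{pmatrix}$$
denotes the mean-vector of the original N assets, and μm is the expected
return on the market index.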


According to Proposition 5.7, the weight vector of the tangency portfolio
of the N + 1 assets is proportional to

$$\Sigma_+^{-1} (\mu_+ - \mu_f \mathbf{1}_{N+1}). \tag{9.22}$$

The following lemma gives a simple expression for $\Sigma_+^{-1}$; it is based on the
well-known formula for the inverse of a partitioned matrix.

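Lemma 9.3. Define $\Sigma_+$ by (9.21). Then

$$\Sigma_+^{-1} = \begin{pmatrix} \Sigma_\epsilon^{-1} & -\Sigma_\epsilon^{-1}\beta \\ -\beta^T \Sigma_\epsilon^{-1} & 1/\sigma_m^2 + \beta^T \Sigma_\epsilon^{-1}\beta \end{pmatrix}.$$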

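Proof. The result may be established by showing that

$$\Sigma_+^{-1}\Sigma_+ = \Sigma_+ \Sigma_+^{-1} = I_{N+1}.$$

Here, we consider $\Sigma_+^{-1}\Sigma_+$; the analysis of $\Sigma_+ \Sigma_+^{-1}$ follows along similar lines.
Write
$$\Sigma_+^{-1}\Sigma_+ = \begin{pmatrix} A & B \\ B^T & C \end{pmatrix}$$
for an N × N matrix A, an N × 1 matrix B, and a scalar C. Then $\Sigma_+^{-1}\Sigma_+ = I_{N+1}$
provided that $A = I_N$, B = 0, and C = 1.
Using the expression for $\Sigma_+^{-1}$ given in the statement of the lemma and the
expression for $\Sigma_+$ given in (9.21), it follows that

$$A = \Sigma_\epsilon^{-1}(\Sigma_\epsilon + \sigma_m^2 \beta\beta^T) - \sigma_m^2 \Sigma_\epsilon^{-1}\beta\beta^T = I_N,$$

$$B = \sigma_m^2 \Sigma_\epsilon^{-1}\beta - \sigma_m^2 \Sigma_\epsilon^{-1}\beta = 0,$$

and
$$C = -\sigma_m^2 \beta^T \Sigma_\epsilon^{-1}\beta + \left(1 - \sigma_m^2 \beta^T (\Sigma_\epsilon + \sigma_m^2 \beta\beta^T)^{-1}\beta\right)^{-1} = 1,$$
verifying that $\Sigma_+^{-1}\Sigma_+ = I_{N+1}$.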
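The lemma is easy to check numerically. The sketch below builds the extended covariance matrix from made-up values of beta, s2_eps, and s2_m (assumptions for illustration only), forms the claimed inverse, and confirms that their product is the identity up to rounding error.

```r
# Hypothetical parameters (illustrative only)
beta   <- c(0.8, 1.1, 1.3)
s2_eps <- c(0.004, 0.006, 0.010)   # diagonal of the residual covariance matrix
s2_m   <- 0.0016                   # variance of the market return
N      <- length(beta)

# Extended covariance matrix, as in (9.21)
Sigma_plus <- rbind(cbind(s2_m * beta %o% beta + diag(s2_eps), s2_m * beta),
                    c(s2_m * beta, s2_m))

# Inverse claimed in Lemma 9.3
Sig_eps_inv <- diag(1 / s2_eps)
b <- Sig_eps_inv %*% beta
Sigma_plus_inv <- rbind(cbind(Sig_eps_inv, -b),
                        c(-b, 1 / s2_m + sum(beta^2 / s2_eps)))

# Largest deviation of the product from the identity matrix
err <- max(abs(Sigma_plus_inv %*% Sigma_plus - diag(N + 1)))
err
```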

We can now use the result in Lemma 9.3, along with the expression for
the weights of the tangency portfolio given in (9.22), to derive the optimal
modiﬁcation to the market portfolio.

Proposition 9.3. Consider a set of N assets with return vector $R_t$ at time
t and suppose that $R_t$ follows the single-index model.
Let $(w_0, w_m)^T$ denote the weight vector of the tangency portfolio of these
N assets, together with the portfolio corresponding to the market index; here
$w_0 = (w_{0,1}, \ldots, w_{0,N})^T$ is the N × 1 weight vector for the N assets and $w_m$ is
the weight for the market index.
Then
$$w_{0,j} = \frac{1}{c}\,\frac{\alpha_j}{\sigma_{\epsilon,j}^2}, \quad j = 1, 2, \ldots, N$$
and
$$w_m = \frac{1}{c}\left(\frac{\mu_m - \mu_f}{\sigma_m^2} - \sum_{j=1}^N \alpha_j \beta_j/\sigma_{\epsilon,j}^2\right)$$

where c is chosen so that

$$\sum_{j=1}^N w_{0,j} + w_m = 1.$$

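Proof. The expression for the weights of the tangency portfolio of the N + 1
assets is given in (9.22). Write

$$\Sigma_+^{-1}(\mu_+ - \mu_f \mathbf{1}_{N+1}) = c \begin{pmatrix} w_0 \\ w_m \end{pmatrix}$$
where c is a constant, chosen so that the weights sum to 1.
Using the result in Lemma 9.3, along with the fact that

$$\mu_+ - \mu_f \mathbf{1}_{N+1} = \begin{pmatrix} \mu - \mu_f \mathbf{1} \\ \mu_m - \mu_f \end{pmatrix},$$
it follows that
$$\begin{aligned}
c\,w_0 &= \Sigma_\epsilon^{-1}(\mu - \mu_f \mathbf{1}) - \Sigma_\epsilon^{-1}\beta(\mu_m - \mu_f) \\
&= \Sigma_\epsilon^{-1}\left(\mu - \mu_f \mathbf{1} - \beta(\mu_m - \mu_f)\right).
\end{aligned}$$

Note that
$$\alpha = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_N \end{pmatrix} = \mu - \mu_f \mathbf{1} - \beta(\mu_m - \mu_f)$$
so that
$$w_0 = \frac{1}{c}\,\Sigma_\epsilon^{-1}\alpha = \frac{1}{c}\begin{pmatrix} \alpha_1/\sigma_{\epsilon,1}^2 \\ \alpha_2/\sigma_{\epsilon,2}^2 \\ \vdots \\ \alpha_N/\sigma_{\epsilon,N}^2 \end{pmatrix}.$$
The weight for the market index is given by
$$\begin{aligned}
c\,w_m &= -\beta^T \Sigma_\epsilon^{-1}(\mu - \mu_f \mathbf{1}) + \frac{1 + \sigma_m^2 \beta^T \Sigma_\epsilon^{-1}\beta}{\sigma_m^2}(\mu_m - \mu_f) \\
&= \frac{\mu_m - \mu_f}{\sigma_m^2} - \beta^T \Sigma_\epsilon^{-1}\left(\mu - \mu_f \mathbf{1} - \beta(\mu_m - \mu_f)\right) \\
&= \frac{\mu_m - \mu_f}{\sigma_m^2} - \beta^T \Sigma_\epsilon^{-1}\alpha = \frac{\mu_m - \mu_f}{\sigma_m^2} - \sum_{j=1}^N \alpha_j \beta_j/\sigma_{\epsilon,j}^2
\end{aligned}$$

as stated in the proposition.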
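The weights in Proposition 9.3 can be compared with the tangency weights obtained by solving the (N + 1)-asset problem directly. In the sketch below, tb_weights is a hypothetical helper and all parameter values are assumptions chosen for illustration.

```r
# Treynor-Black weights from Proposition 9.3: N asset weights, then the market
tb_weights <- function(alpha, beta, s2_eps, mu_m_ex, s2_m) {
  w0 <- alpha / s2_eps
  wm <- mu_m_ex / s2_m - sum(alpha * beta / s2_eps)
  c(w0, wm) / (sum(w0) + wm)       # dividing by c makes the weights sum to 1
}

# Hypothetical parameters (illustrative only)
alpha   <- c(0.002, -0.001, 0.004)
beta    <- c(0.8, 1.1, 1.3)
s2_eps  <- c(0.004, 0.006, 0.010)
mu_m_ex <- 0.006                   # mu_m - mu_f
s2_m    <- 0.0016

# Direct route: weights proportional to Sigma_plus^{-1} (mu_plus - mu_f 1)
Sigma_plus <- rbind(cbind(s2_m * beta %o% beta + diag(s2_eps), s2_m * beta),
                    c(s2_m * beta, s2_m))
mu_plus_ex <- c(alpha + mu_m_ex * beta, mu_m_ex)
w_direct   <- as.numeric(solve(Sigma_plus, mu_plus_ex))
w_direct   <- w_direct / sum(w_direct)

w_tb <- tb_weights(alpha, beta, s2_eps, mu_m_ex, s2_m)
all.equal(w_direct, w_tb)
```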


As in the case of a single asset, we will refer to the portfolio defined in
Proposition 9.3 as the Treynor–Black portfolio.
Note that if α1 = α2 = · · · = αN = 0, in agreement with the CAPM, then
w0 = 0; hence, the Treynor–Black portfolio reduces to the market portfolio.
If not all αj are zero, then we may improve on the market portfolio by including
more or less of some assets.
We may describe the Treynor–Black portfolio in terms of a portfolio
constructed from the N assets combined with the market portfolio. Define a
weight vector $\bar{w}_0 = (\bar{w}_{0,1}, \bar{w}_{0,2}, \ldots, \bar{w}_{0,N})^T$ by

$$\bar{w}_{0,j} = \frac{\alpha_j/\sigma_{\epsilon,j}^2}{\sum_{j=1}^N \alpha_j/\sigma_{\epsilon,j}^2}, \quad j = 1, 2, \ldots, N,$$

assuming that $\sum_{j=1}^N \alpha_j/\sigma_{\epsilon,j}^2 \neq 0$. Note that $\sum_{j=1}^N \bar{w}_{0,j} = 1$.
To form the Treynor–Black portfolio, the market index is given weight
$$w_m^* = \frac{(\mu_m - \mu_f)/\sigma_m^2 - \sum_{j=1}^N \alpha_j \beta_j/\sigma_{\epsilon,j}^2}{\sum_{j=1}^N \alpha_j/\sigma_{\epsilon,j}^2 + (\mu_m - \mu_f)/\sigma_m^2 - \sum_{j=1}^N \alpha_j \beta_j/\sigma_{\epsilon,j}^2} \tag{9.23}$$
and the portfolio with weight vector $\bar{w}_0$ is given weight $1 - w_m^*$.

Example 9.10 Consider the stocks of the five companies listed in Example
9.2 and analyzed in several examples in this chapter. Recall that the estimates
of $\alpha_j$ for these stocks are stored in the variable stks.alpha and estimates of
$\sigma_{\epsilon,j}$ are stored in the variable stks.s. The weights $\bar{w}_{0,j}$, j = 1, 2, 3, 4, 5 are
calculated as follows.

> wbar0<-(stks.alpha/stks.s^2)/sum(stks.alpha/stks.s^2)
> wbar0
CVC EIX EXPE HUM WMT
-0.0012 0.3587 0.0796 0.2752 0.2877

These results suggest that the best combination of the five stocks to combine
with the market index consists primarily of Edison International (EIX),
Humana, and Wal-Mart stocks, in roughly equal weights. The other stocks have
weights that are relatively small in magnitude.
The weight given to the market index using this approach is given by

> c1<-mean(sp500)/var(sp500) - sum(stks.alpha*stks.beta/stks.s^2)
> wm<-c1/(c1 + sum(stks.alpha/stks.s^2))
> wm
V1
V1 0.0781

with the remainder, 1 − 0.0781 = 0.922, invested in the portfolio of the ﬁve
stocks, with the weights given in the variable wbar0 calculated earlier.


Alternatively, the Treynor–Black portfolio may be described in terms of
the weight vector for the six assets (the five stocks plus the market portfolio);
such a vector is given by

> c((1-wm)*wbar0, wm)
[1] -0.0011 0.3307 0.0734 0.2537 0.2652 0.0781

Properties of the Treynor–Black Portfolio

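Proposition 9.4. Consider the framework of Proposition 9.3. Let $\mu_{TB}$ and
$\sigma_{TB}$ denote the mean and standard deviation, respectively, of the return on the
Treynor–Black portfolio, and let $\beta_{TB}$ denote the value of beta in the market
model for the Treynor–Black portfolio. Then
$$\mu_{TB} - \mu_f = \frac{1}{c}\left(\frac{(\mu_m - \mu_f)^2}{\sigma_m^2} + \sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}\right),$$

$$\sigma_{TB}^2 = \frac{1}{c^2}\left(\frac{(\mu_m - \mu_f)^2}{\sigma_m^2} + \sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}\right),$$

and
$$\beta_{TB} = \frac{1}{c}\,\frac{\mu_m - \mu_f}{\sigma_m^2}.$$
Here c is as defined in the statement of Proposition 9.3:

$$c = \sum_{j=1}^N \alpha_j/\sigma_{\epsilon,j}^2 + \frac{\mu_m - \mu_f}{\sigma_m^2} - \sum_{j=1}^N \alpha_j \beta_j/\sigma_{\epsilon,j}^2.$$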

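Proof. For j = 1, 2, . . . , N , let $\mu_j = E(R_{j,t})$. Then, using the expression for
the portfolio weights given in Proposition 9.3,

$$\begin{aligned}
\mu_{TB} - \mu_f &= \frac{1}{c}\sum_{j=1}^N \frac{\alpha_j}{\sigma_{\epsilon,j}^2}(\mu_j - \mu_f) + \frac{1}{c}\,\frac{\mu_m - \mu_f}{\sigma_m^2}(\mu_m - \mu_f) - \frac{1}{c}\sum_{j=1}^N \frac{\alpha_j \beta_j}{\sigma_{\epsilon,j}^2}(\mu_m - \mu_f) \\
&= \frac{1}{c}\,\frac{(\mu_m - \mu_f)^2}{\sigma_m^2} + \frac{1}{c}\sum_{j=1}^N \frac{\alpha_j}{\sigma_{\epsilon,j}^2}\left(\mu_j - \mu_f - \beta_j(\mu_m - \mu_f)\right) \\
&= \frac{1}{c}\left(\frac{(\mu_m - \mu_f)^2}{\sigma_m^2} + \sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}\right),
\end{aligned}$$

as given in the statement of the proposition.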


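Note that because the market index has a beta of 1,

$$\beta_{TB} = \frac{1}{c}\left(\frac{\mu_m - \mu_f}{\sigma_m^2} - \sum_{j=1}^N \frac{\alpha_j \beta_j}{\sigma_{\epsilon,j}^2}\right) + \frac{1}{c}\sum_{j=1}^N \frac{\alpha_j}{\sigma_{\epsilon,j}^2}\,\beta_j = \frac{1}{c}\,\frac{\mu_m - \mu_f}{\sigma_m^2}.$$

Now consider $\sigma_{TB}^2$. It is convenient to consider separately the market and
nonmarket components of the return variance. The market index has zero
nonmarket variance; therefore, under the single-index model, the nonmarket
component of $\sigma_{TB}^2$ is given by
$$\frac{1}{c^2}\sum_{j=1}^N \left(\frac{\alpha_j}{\sigma_{\epsilon,j}^2}\right)^2 \sigma_{\epsilon,j}^2 = \frac{1}{c^2}\left(\sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}\right) \tag{9.24}$$

and the market component of $\sigma_{TB}^2$ is $\beta_{TB}^2 \sigma_m^2$, where $\beta_{TB}$ is the value of beta
for the Treynor–Black portfolio.
Using the expression for $\beta_{TB}$ derived previously, the market component of
$\sigma_{TB}^2$ is given by
$$\frac{1}{c^2}\,\frac{(\mu_m - \mu_f)^2}{\sigma_m^2}. \tag{9.25}$$
Adding (9.24) and (9.25) shows that
$$\sigma_{TB}^2 = \frac{1}{c^2}\left(\frac{(\mu_m - \mu_f)^2}{\sigma_m^2} + \sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}\right).$$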

Let SRTB = (μTB − μf )/σTB denote the Sharpe ratio of the Treynor–Black
portfolio. By construction, it is at least as large as the Sharpe ratio of the
market index; an expression for the diﬀerence in the squared Sharpe ratios is
given in the following corollary to Proposition 9.4.

Corollary 9.2.

$$(SR_{TB})^2 - \frac{(\mu_m - \mu_f)^2}{\sigma_m^2} = \sum_{j=1}^N \frac{\alpha_j^2}{\sigma_{\epsilon,j}^2}. \tag{9.26}$$
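The corollary can be checked by computing the mean, beta, and variance of the Treynor–Black portfolio directly from its weights. The parameter values in this sketch are assumptions chosen only for illustration.

```r
# Hypothetical parameters (illustrative only)
alpha   <- c(0.002, -0.001, 0.004)
beta    <- c(0.8, 1.1, 1.3)
s2_eps  <- c(0.004, 0.006, 0.010)
mu_m_ex <- 0.006                   # mu_m - mu_f
s2_m    <- 0.0016

# Weights from Proposition 9.3
cc <- sum(alpha / s2_eps) + mu_m_ex / s2_m - sum(alpha * beta / s2_eps)
w0 <- (alpha / s2_eps) / cc
wm <- (mu_m_ex / s2_m - sum(alpha * beta / s2_eps)) / cc

# Mean excess return, beta, and variance computed from the weights
mean_tb <- sum(w0 * (alpha + beta * mu_m_ex)) + wm * mu_m_ex
beta_tb <- sum(w0 * beta) + wm
var_tb  <- beta_tb^2 * s2_m + sum(w0^2 * s2_eps)

sr2_tb  <- mean_tb^2 / var_tb      # squared Sharpe ratio, Treynor-Black
sr2_mkt <- mu_m_ex^2 / s2_m        # squared Sharpe ratio, market
c(difference = sr2_tb - sr2_mkt, sum_sq_appraisal = sum(alpha^2 / s2_eps))
```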

Example 9.11 The Sharpe ratio for the market index corresponding to the
S&P 500 index is given by

> mean(sp500)/sd(sp500)
[1] 0.290

For the stocks represented in the data matrix stks, the estimated difference
between the squared Sharpe ratio of the Treynor–Black portfolio described
in Example 9.10 and the squared Sharpe ratio of the market portfolio is
given by
> sum((stks.alpha^2)/stks.s^2)
[1] 0.125
Thus, the estimated Sharpe ratio of the Treynor–Black portfolio is
$$\left((0.290)^2 + 0.125\right)^{1/2} = 0.457.$$
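Bias in the Estimator of $\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2$

The quantity $\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2$ measures the difference between the squared Sharpe
ratio of the Treynor–Black portfolio and the squared Sharpe ratio of the market
portfolio; hence, it gives a measure of the possible improvement in the
market portfolio by combining it with a portfolio of the assets under consideration.
Of course, in practice, $\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2$ must be estimated using parameter
estimators from the market models for the assets under consideration.
However, such an estimator tends to overestimate the true value of
$\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2$, giving an overly optimistic assessment of the benefit from
modifying the market portfolio. The reason for this is that the estimator
$\sum_{j=1}^N \hat{\alpha}_j^2/\hat{\sigma}_{\epsilon,j}^2$ is a sum of squared random variables $\hat{\alpha}_j/\hat{\sigma}_{\epsilon,j}$, j = 1, 2, . . . , N .
Recall that, for a random variable X, $E(X^2) = E(X)^2 + \mathrm{Var}(X)$. Therefore,
even if each $\hat{\alpha}_j/\hat{\sigma}_{\epsilon,j}$ is an unbiased estimator of $\alpha_j/\sigma_{\epsilon,j}$, so that
$$E(\hat{\alpha}_j/\hat{\sigma}_{\epsilon,j}) = \alpha_j/\sigma_{\epsilon,j},$$
it follows that
$$\begin{aligned}
E\left(\sum_{j=1}^N \hat{\alpha}_j^2/\hat{\sigma}_{\epsilon,j}^2\right) &= \sum_{j=1}^N E\left(\hat{\alpha}_j^2/\hat{\sigma}_{\epsilon,j}^2\right) \\
&= \sum_{j=1}^N \left[E(\hat{\alpha}_j/\hat{\sigma}_{\epsilon,j})^2 + \mathrm{Var}(\hat{\alpha}_j/\hat{\sigma}_{\epsilon,j})\right] \\
&= \sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2 + \sum_{j=1}^N \mathrm{Var}(\hat{\alpha}_j/\hat{\sigma}_{\epsilon,j}) \\
&> \sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2 .
\end{aligned}$$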
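The upward bias is easy to see in a small simulation. The sketch below generates returns from a single-index model in which every true alpha is zero, so the true value of the sum is zero and any positive average of the estimates is bias. The sample size, betas, and standard deviations are assumptions chosen for illustration, not values from the text.

```r
set.seed(123)
n_obs <- 60; n_assets <- 5; n_rep <- 500
beta <- c(0.7, 0.9, 1.0, 1.1, 1.3)       # hypothetical betas; true alphas are 0

est <- replicate(n_rep, {
  mkt <- rnorm(n_obs, mean = 0.01, sd = 0.04)
  eps <- matrix(rnorm(n_obs * n_assets, sd = 0.06), n_obs, n_assets)
  ret <- outer(mkt, beta) + eps          # single-index returns with alpha_j = 0
  mm  <- lm(ret ~ mkt)                   # one regression per column of ret
  alpha_hat <- mm$coefficients[1, ]
  s2_hat    <- colSums(mm$residuals^2) / mm$df.residual
  sum(alpha_hat^2 / s2_hat)
})

mean(est)   # the true value is 0, so this average estimates the bias
```

Every replication yields a strictly positive estimate, which is why the estimator cannot be unbiased when the true value is zero.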

One way to correct for this bias is to use the bootstrap method, as we
did in Section 8.9 when estimating measures of portfolio performance. This is
illustrated in the following example.
Example 9.12 Consider estimating the bias in $\sum_{j=1}^N \hat{\alpha}_j^2/\hat{\sigma}_{\epsilon,j}^2$ as an estimator
of $\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2$ using the function boot in the package boot. Recall that a
function to be used in boot must take two arguments: the data, in the form
of a vector or matrix, and the indices of the data values to be used in the
computation. Define the function shrp.sq.diff

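> shrp.sq.diff<-function(x, ind){
+ # all columns except the last hold the asset returns
+ stks<-x[ind, -1*ncol(x)]
+ # the last column holds the market returns
+ mkt<-x[ind, ncol(x)]
+ mm<-lm(stks~mkt)
+ shrp.m<-mean(mkt)/sd(mkt)
+ alpha<-mm$coefficients[1,]
+ beta<-mm$coefficients[2, ]
+ # residual standard deviations, one per asset
+ s.hat<-(apply(mm$residuals^2, 2, sum)/mm$df.residual)^.5
+ sum((alpha^2)/(s.hat^2))
+ }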

This function assumes that the argument x is a matrix of returns, with the
last column corresponding to the market portfolio and the remaining columns
corresponding to the assets to be used in forming the Treynor–Black portfo-
lio. Calculation of s.hat, the vector of estimates of σ,j , is carried out using
the component $residuals of the output from the function lm; this compo-
nent consists of a matrix of residuals corresponding to the different response
variables used in the lm function, in this case, the returns on the different
assets. Adding the squares of these residuals for each asset and dividing by
the degrees of freedom, yields estimates of σ2,j .
Therefore, shrp.sq.diff takes the values in x corresponding to the indices
in the vector ind and uses those values to compute the difference in the estimated
squared Sharpe ratios of the Treynor–Black and market portfolios. For example,
taking x = cbind(stks, sp500) and ind = 1:60 returns the difference of
the estimated squared Sharpe ratios calculated in Example 9.11

> shrp.sq.diff(cbind(stks, sp500), 1:60)

[1] 0.125

that agrees with the value obtained earlier.

The estimated bias of the estimator of $\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2$ can now be obtained
using the function boot.

> boot(cbind(stks, sp500), shrp.sq.diff, 10000)

ORDINARY NONPARAMETRIC BOOTSTRAP

Bootstrap Statistics :
original bias std. error
t1* 0.125 0.111 0.129

Therefore, the estimated bias is 0.111 and the bias-corrected estimate is only
0.125 − 0.111 = 0.014, considerably smaller than the original estimate.


The output from boot also gives the standard error of the estimate.
Although this value is useful for getting a rough idea of the variability in
the estimates, in this case, it is not useful for constructing an approximate
confidence interval for the true value of $\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2$.
Note that the parameter $\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2$ is nonnegative and the corresponding
estimator is a nonnegative random variable. Hence, if the true value of
$\sum_{j=1}^N \alpha_j^2/\sigma_{\epsilon,j}^2$ is close to zero, then the distribution of the estimator is not well
approximated by a normal distribution; therefore, an approximate confidence
interval of the form of the estimate plus or minus 1.96 times the standard
error will not have the usual coverage property. One way to see that in the
present example is to note that such an interval will include negative values,
which are impossible. However, given that the standard error is 0.129 and the
bias-corrected estimate is 0.014, it is clear that there is little evidence that
the Treynor–Black portfolio has a larger Sharpe ratio than does the portfolio
based on the market index.

Numerical Computation of the Treynor–Black Portfolio Weights
In Sections 5.7 and 5.8, it was shown that the weights of the tangency portfolio
may be found using quadratic programming. Given that the Treynor–Black
portfolio is a type of tangency portfolio, it is clear that the same approach
can be used to calculate the weights of the Treynor–Black portfolio. One
advantage of calculating the weights in this way is that it is a simple matter
to impose certain types of constraints on the weights; see Section 5.8 for a
discussion of the type of constraints that may be handled in this manner. The
following example illustrates the calculation when the weights are constrained
to be nonnegative.
Example 9.13 Consider the stocks of the five companies listed in Example
9.2; the portfolio weights of the Treynor–Black portfolio were calculated in
Example 9.10; here we repeat this calculation using the function solve.QP in
the package quadprog.
Recall that the covariance matrix for these stocks based on the single-index
model is stored in the variable stks.Sig; to calculate the portfolio weights, we
need to extend this matrix to a covariance matrix for the returns on the five
stocks, along with the returns on the market portfolio, in this case taken to
be the returns on the S&P 500 index. Note that the covariance of the returns
on a stock and the returns on the market portfolio is given by $\beta_i \sigma_m^2$, where $\beta_i$
is the value of beta for the stock and $\sigma_m^2$ is the variance of the return on the
market portfolio. Here, the values of beta for the five stocks are stored in the
variable stks.beta. Hence, the extended covariance matrix is given by
> stks.Sig.plus<-rbind(cbind(stks.Sig, stks.beta*var(sp500)),
+ c( (stks.beta)*var(sp500), var(sp500)))
> stks.Sig.plus

CVC EIX EXPE HUM WMT SP500

CVC 0.00839 0.00072 0.00135 0.00099 0.00069 0.00151
EIX 0.00072 0.00241 0.00060 0.00044 0.00031 0.00067
EXPE 0.00135 0.00060 0.01226 0.00082 0.00057 0.00126
HUM 0.00099 0.00044 0.00082 0.00543 0.00042 0.00092
WMT 0.00069 0.00031 0.00057 0.00042 0.00198 0.00064
SP500 0.00151 0.00067 0.00126 0.00092 0.00064 0.00141

Recall that to compute the tangency portfolio weights numerically, we minimize
the variance of the portfolio, subject to the constraint that the portfolio
mean is one; the resulting weights are then normalized to sum to 1. Hence,
let A1 denote the 6 × 1 matrix of asset means, which may be calculated by
> A1<-as.matrix(c(apply(stks, 2, mean), mean(sp500)))
Then the weights of the Treynor–Black portfolio may be calculated using
> w_tb1<-solve.QP(Dmat=stks.Sig.plus, dvec=rep(0, 6), Amat=A1,
+ bvec=c(1), meq=1)$solution
> w_tb<-w_tb1/sum(w_tb1)
> w_tb
[1] -0.0011 0.3307 0.0734 0.2537 0.2653 0.0781
Note that these weights match those calculated in Example 9.10.
To include the condition that all weights are nonnegative, we modify the
arguments Amat and bvec.
> w_tb2<-solve.QP(Dmat=stks.Sig.plus, dvec=rep(0, 6),
+ Amat=cbind(A1,diag(6)),
+ bvec=c(1, rep(0, 6)), meq=1)$solution
> w_tb_nonneg<-w_tb2/sum(w_tb2)
> w_tb_nonneg
[1] 0.0000 0.3307 0.0734 0.2537 0.2653 0.0769
As expected, the constrained weights are very similar to the unconstrained
weights, with the weight given to CVC set to zero and slight modifications to
the other weights.

9.7 Suggestions for Further Reading

The single-index model is one of the most commonly used statistical models in
finance; it is also the starting point for many of the more sophisticated models
used in financial modeling. Good detailed discussions of the single-index model
are available from Francis and Kim (2013, Chapter 8) and Elton et al. (2007,
Chapter 7); see also Sharpe (1963). Partial correlation is discussed by Agresti
and Finlay (2009, Section 11.7). A number of useful results on matrix inverses,
including the one used in Lemma 9.1, are available from Henderson and Searle
(1981); the inverses of partitioned matrices are discussed by Lu and Shiou
(2002).
Active portfolio management uses a variety of methods in an attempt
to outperform the benchmark portfolio; see Grinold and Kahn (2000) and
Chincarini and Kim (2006) for book-length treatments of this area. The
Treynor–Black method is attributed to Treynor and Black (1973); see also
Francis and Kim (2013, Section 17.2) and Kane et al. (2012). Optimal active
portfolios based on the properties of their residual returns are considered by
Grinold and Kahn (2000, Chapter 5); see Qian et al. (2007, Section 2.2.4) for
an alternative approach.
Given that active portfolio management relies on the inefficiency of the
market portfolio, it is not surprising that some analysts are skeptical of its
benefits; see, for example, Samuelson (1974) and Sharpe (1991).

9.8 Exercises
1. Let $\beta$ denote a vector in $\mathbb{R}^N$, let $\Sigma$ denote an N × N matrix, and
suppose $\sigma_m^2 > 0$. Show that

$$\sigma_m^2 \beta\beta^T + \Sigma \qquad (9.27)$$

is a positive-definite matrix provided that $\Sigma$ is positive definite.

Suppose that $\Sigma$ is nonnegative definite but not positive definite.
Is it possible that (9.27) is positive definite? Why or why not?
2. Consider a set of four assets with betas given by 1.1, 0.9, 0.4, 1.5,
respectively, and error standard deviations $\sigma_{\epsilon,i}$ given by 0.2, 0.4, 0.5,
0.8, respectively; suppose that $\sigma_m = 0.1$. Find the covariance matrix
of the return vector for these assets under the assumption that
the single-index model holds and give the corresponding correlation
matrix.
3. Calculate the monthly excess returns for the period ending Decem-
ber 31, 2015, for three stocks, Papa John’s International, Inc.
(symbol PZZA), Bed Bath & Beyond, Inc. (BBBY), and Time
Warner, Inc. (TWX); for the risk-free rate, use the return on the
three-month Treasury Bill, available on the Federal Reserve website.
For each pair of two stocks, calculate the partial correlation coef-
ficient of the stock excess returns given the excess return on the S&P
500 index. Based on these results, does it appear that the single-index
model holds for the returns on these three stocks? Why or why not?
4. Consider five mutual funds, representing five different indus-
tries: Fidelity Select Semiconductors Portfolio (symbol FSELX),
Fidelity Select Energy Portfolio (FSENX), Fidelity Select Health
Care Services Portfolio (FSHCX), Fidelity Real Estate Investment
Portfolio (FRESX), and Fidelity Select Transportation Portfolio
(FSRFX). Mutual funds like these, which focus on a single industry
or sector of the market, are sometimes called sector funds.
For each fund, calculate the monthly excess returns for the
period ending December 31, 2015; for the risk-free rate, use the
return on the three-month Treasury Bill, available on the Federal
Reserve website.
For each pair of funds, calculate the partial correlation coefficient
of the fund excess returns given the excess return on the S&P 500
index. Based on these results, does it appear that the single-index
model holds for the returns on these assets? Why or why not?
5. Consider the three stocks analyzed in Exercise 3. Assume that the
single-index model holds for these stocks and estimate the vector
of betas, $\beta$, and the error covariance matrix, $\Sigma_\epsilon$. Take the market
index to be the S&P 500 index.
Based on these estimates, together with an estimate of $\sigma_m^2$, estimate
the stock return correlation matrix under the single-index
model assumption and compare it to the sample correlation matrix.
6. Consider the five mutual funds analyzed in Exercise 4. Assume that
the single-index model holds for these funds and estimate the vector
of betas, $\beta$, and the error covariance matrix, $\Sigma_\epsilon$. Take the market
index to be the S&P 500 index.
Based on these estimates, together with an estimate of $\sigma_m^2$, estimate
the fund return correlation matrix under the single-index
model assumption and compare it to the sample correlation matrix.
7. Consider the three stocks analyzed in Exercise 3. Using the func-
tion solve.QP in the quadprog package, calculate the weight vector
of the risk-averse portfolio with risk-aversion parameter λ=5, first
using the sample covariance matrix and then using the estimate of
the covariance matrix based on the single-index model. Compare
the results.
Repeat the calculations using the restriction that all asset
weights must be nonnegative. Compare the results.
8. For the five mutual funds analyzed in Exercise 4, estimate the
weights of the tangency portfolio using the estimate of the return
covariance matrix based on the single-index model. Compare the
results to the estimate based on the sample covariance matrix.
9. Estimate the appraisal ratios for the five mutual funds analyzed in
Exercise 4. Based on these results, if you were to modify your invest-
ment in the market portfolio, as represented by the S&P 500 index,
by increasing the weight given to one of these funds and decreasing
the weight given to the market portfolio, which one would you
choose? In this new portfolio, what weight would you give to the
market portfolio?
10. For the five mutual funds analyzed in Exercise 4, estimate the
weights of the Treynor–Black portfolio. Estimate the weight for
each of the five sector funds along with the weight given to the
portfolio corresponding to the S&P 500 index.
11. Using the result in Corollary 9.2, estimate the Sharpe ratio of the
Treynor–Black portfolio constructed from the five sector mutual
funds in Exercise 10. Compare the result to the estimated Sharpe
ratio of the S&P 500 index.
12. Consider the Treynor–Black portfolio for a set of N assets together
with the market portfolio. Let $\alpha_{TB}$ denote the value of alpha in the
market model for this portfolio. Show that $\alpha_{TB}^2$ may be written in
terms of $\alpha_1^2, \alpha_2^2, \ldots, \alpha_N^2$, the squared values of alpha for the N assets;
$\sigma_{\epsilon,1}^2, \sigma_{\epsilon,2}^2, \ldots, \sigma_{\epsilon,N}^2$, the error variances in the market models for the
N assets; and $\sigma_{\epsilon,TB}^2$, the error variance in the market model for the
Treynor–Black portfolio.
13. For the five mutual funds analyzed in Exercise 4, estimate
the weights of the Treynor–Black portfolio subject to the constraint
that the weights are nonnegative. Estimate the Sharpe ratio of the
resulting portfolio.

10
Factor Models

10.1 Introduction
Although there are many assets available to an investor, the returns on these
assets are often correlated—in some cases, highly correlated. One reason for
this correlation is that the returns on a set of assets may all be affected by cer-
tain changes in the economy; alternatively, the assets may correspond to firms
with similar properties. A factor model describes the returns on a set of assets
in terms of a few underlying “factors” potentially affecting all of the assets.
We have already seen one example of a factor model, the single-index
model discussed in Chapter 9, which describes the returns on a set of assets
in terms of the returns on a market index. An important implication of this
model is that the covariance between the returns on two assets arises from the
fact that both assets’ returns are related to the return on the market index.
Although it may be reasonable to assume that the behavior of the market as a
whole is the most important factor affecting asset returns, empirical research
has shown that there are other factors, in addition to the return on the market
index, that have important effects on asset returns. These factors are useful
for describing the correlation structure of a set of asset returns as well as for
describing the behavior of the mean returns of the assets, extending the type
of relationship described by the single-index model in Chapter 9.
The goal of this chapter is to present the statistical methodology under-
lying these factor models along with the implications of these models for
understanding the behavior of asset returns and for constructing and analyzing
portfolios.

10.2 Limitations of the Single-Index Model

One role of the single-index model is to model the covariance between the
returns on two assets in terms of the relationship between each asset’s returns
and the returns on a market index. However, as noted in the introduction, in
some cases there may be economic variables, in addition to the
return on the market, that have important effects on the relationship between
the returns on the assets. This is illustrated in the following example.

Example 10.1 Consider the returns on two stocks, JetBlue Airways Corp.
(symbol JBLU) and EV Energy Partners, L.P. (EVEP), an oil and natural
gas company. Suppose that the variables jblu and evep contain 5 years of monthly
excess returns on JBLU and EVEP stock, respectively, for the period ending
December 31, 2014, and that sp500 contains the corresponding
excess returns on the Standard & Poor’s (S&P) 500 index. Then the estimated
correlation of the returns on these stocks is given by

> cor(jblu, evep)

[1] -0.150

The estimated correlations of each return with the return on the S&P 500
index are given by
> cor(jblu, sp500)
[1] 0.311
> cor(evep, sp500)
[1] 0.268

Therefore, each stock’s returns are positively correlated with the return on
the market index, but the returns are negatively correlated with each other.
A relationship of this type is not possible under the single-index
model, which implies that two assets with positive betas have positively
correlated returns. The estimates of beta for the two stocks are given by

> lm(jblu~sp500)$coef
(Intercept) sp500
0.0137 0.8770
> lm(evep~sp500)$coef
(Intercept) sp500
-0.00437 0.74013
The estimated return variance for the S&P 500 index is

> var(sp500)
[1] 0.00141
According to the single-index model, the estimated covariance of the returns
on JBLU and EVEP stock is

(0.877)(0.740)(0.00141) = 0.000915,

corresponding to an estimated correlation of

> 0.000915/(sd(jblu)*sd(evep))
[1] 0.0832

Hence, although the sample correlation of returns on JBLU and EVEP stock is
negative (−0.150), the estimated correlation based on the single-index model
is positive (0.0832).
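The gap between the two correlations can be seen directly from the structure of the single-index model. As a short worked derivation (writing $\rho_{im}$ for the correlation of the return on asset i with the return on the market index and $\sigma_i$ for the standard deviation of the return on asset i; this notation is introduced here only for illustration), the model-implied correlation is the product of the two market correlations:

```latex
\operatorname{Cov}(R_i, R_j) = \beta_i \beta_j \sigma_m^2,
\qquad
\beta_i = \rho_{im} \frac{\sigma_i}{\sigma_m}
\quad\Longrightarrow\quad
\operatorname{Cor}(R_i, R_j)
  = \frac{\beta_i \beta_j \sigma_m^2}{\sigma_i \sigma_j}
  = \rho_{im}\, \rho_{jm}.
```

With the sample values above, (0.311)(0.268) is approximately 0.083, in agreement with the model-based estimate up to rounding. In particular, two assets that are both positively correlated with the index cannot have a negative model-implied correlation.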

One reason for this behavior may be the presence of other economic vari-
ables that are affecting the returns on JBLU and EVEP stock. For example,
JBLU, as an airline stock, is likely to be negatively affected by increasing
oil prices; EVEP, on the other hand, as a gas and oil stock, is likely to be
positively affected by increasing oil prices. Hence, oil prices might have an
important effect on the relationship between the returns on these two stocks.
One commonly used benchmark for crude oil prices is the price of West
Texas Intermediate (WTI) oil, which is generally refined in the United States.
Historical prices for WTI oil are available on the Federal Reserve Economic
Data (FRED) website at
https://research.stlouisfed.org/fred2/series/DCOILWTICO/downloaddata; like
stock prices, these data are available for different sampling frequencies,
such as daily or monthly prices. Let the variable oil denote the proportional
change in monthly prices of WTI oil for the 5-year period ending
December 31, 2014; thus, oil is calculated the same way that asset returns
are calculated, except that oil prices, rather than stock prices, are used.
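As a small illustration of this calculation, the following sketch computes proportional changes from a price series in base R. The vector price holds made-up values, not actual WTI prices, and the name prop_change is chosen here only for the example.

```r
# Hypothetical month-end prices (illustrative values only)
price <- c(100, 110, 99, 99)

# Proportional change: (P_t - P_{t-1}) / P_{t-1}
prop_change <- diff(price) / head(price, -1)
prop_change
```

A series of n prices yields n - 1 proportional changes, so one extra month of prices is needed to obtain a full 5 years of monthly changes.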
Note that, as expected, the returns on JBLU stock are negatively cor-
related with the change in oil prices, while the returns on EVEP stock are
positively correlated with the change in oil prices:
> cor(jblu, oil)
[1] -0.265
> cor(evep, oil)
[1] 0.528
Therefore, in modeling the relationship between JBLU and EVEP stock,
it may be important to take into account changes in oil prices. This is likely
to be true when analyzing the returns on other stocks thought to be related
to oil prices.
A second use of the single-index model is in understanding the role of an
asset’s relationship with the market index in the expected return on the asset.
According to the single-index model or, equivalently, the market model, the
expected excess return on asset i, μi − μf , is related to the expected excess
return on the market index, μm − μf , by
μi − μf = αi + βi (μm − μf ),
where αi and βi are the parameters in the market model for asset i.
If the asset is priced correctly, in the sense described in Section 8.4, then
αi = 0 and the expected excess return on an asset is proportional to its value
of beta; see Section 7.7 for further details. This fact suggests that assets with
greater values of beta will tend to have higher expected excess returns. The
following example shows that this interpretation of beta is not always useful
in practice.
Example 10.2 Consider stocks for firms represented in the S&P 500 index.
Five years of monthly returns for the period ending December 31, 2014, were
analyzed; 474 of the stocks had returns for that entire period.

For each stock, the parameters of the market model were estimated,
along with the sample mean excess return. These results are consistent with all
474 stocks being priced correctly; for instance, the minimum p-value for testing
αi = 0 is 0.00236. Because 474 × 0.00236 ≈ 1.12 exceeds one, using the
Bonferroni method, we fail to reject the
hypothesis that all αi are equal to zero at any level.
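The Bonferroni step can be reproduced with the base R function p.adjust. This is only a sketch using the single minimum p-value quoted in the example; the variable names are chosen here for illustration.

```r
p_min <- 0.00236   # smallest p-value reported in the example
n_tests <- 474     # number of stocks tested

# Bonferroni adjustment: multiply by the number of tests, capped at 1
p_adj <- p.adjust(p_min, method = "bonferroni", n = n_tests)
p_adj
```

Because the adjusted value for the smallest p-value is already capped at one, the adjusted values for all other stocks are capped at one as well.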
The estimates of beta for the 474 stocks are stored in the variable
sp474.mmbeta. Note that there is considerable variation in the estimates of
beta:

> summary(sp474.mmbeta)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.004 0.732 1.060 1.080 1.390 2.870

Hence, according to the CAPM, we expect that stocks with large estimates of
beta will tend to have higher sample mean excess returns.
Figure 10.1 contains a plot of the sample mean excess returns versus the
estimated value of beta for the 474 stocks. Note that there is, at most, a
very weak relationship between a stock’s sample mean excess return and its
estimate of beta. Furthermore, this plot does not support the idea that stocks
with larger values of beta tend to have large mean excess returns. For instance,
the sample correlation of the estimates of beta and the sample mean excess
returns based on these data is only 0.0359. The standard error of this estimate
when the true correlation is zero is $1/\sqrt{N}$, where N is the number of observations;
here N = 474. Hence, the standard error of the estimate is 0.0459 and
the correlation is not significantly different from zero. The squared sample
correlation is about 0.0013, so only about 0.13% of the variation in the sample
mean excess returns on the stocks may be explained by their estimates of
beta. Thus, the theoretical relationship expressed in Figure 7.1 does not hold
for these data.

FIGURE 10.1
Plot of sample mean excess returns (vertical axis) versus estimates of beta
(horizontal axis) for stocks in the S&P 500 index.
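The arithmetic in this example is easy to verify. The sketch below uses only the two numbers reported in the text, the sample correlation and the number of stocks; the variable names are chosen here for illustration.

```r
r <- 0.0359   # sample correlation reported in the example
N <- 474      # number of stocks

se0 <- 1 / sqrt(N)   # standard error when the true correlation is zero
z <- r / se0         # ratio of the estimate to its standard error
c(se0 = se0, z = z, r_squared = r^2)
```

The ratio z is well below 2 in absolute value, which is the sense in which the correlation is not significantly different from zero.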

The results in the previous example suggest that, at least in some cases, the
relationship between the returns on an asset and the returns on a market index
is not sufficient to effectively describe the mean excess returns on an asset.
That is, these results are consistent with the idea that it may be important to
include factors other than the return on a market index in a model for asset
returns.

10.3 The Model and Its Estimation

The single-index model relates the returns on a given asset, $R_{i,t}$, $t = 1, 2, \ldots, T$,
to the returns on a market index, $R_{m,t}$, $t = 1, 2, \ldots, T$, through
the model
$$R_{i,t} - R_{f,t} = \alpha_i + \beta_i (R_{m,t} - R_{f,t}) + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T.$$

Here Rf,t is the return on the risk-free asset at time t. The idea behind this
model is that all assets are related to “the market” and volatility in the market
induces volatility in the returns of individual assets. Furthermore, under this
model, the correlation between the returns of two assets is a result of the fact
that both assets are related to the market.
A general factor model extends the single-index model by including other
risk factors, in addition to the market return, in the model. These factors
may represent economic conditions that, like the return on a market index,
affect all assets. Or the factors might reflect properties of the assets under
consideration, such as the size of the company, in the case of a stock. This
flexibility in the factors, which may be chosen to represent the analyst’s beliefs
and goals, is one of the strengths of factor models. In this section, we consider
the form and properties of a factor model, along with parameter estimation;
selection of the factors is considered in the following section.
Let $F_{1,t}, F_{2,t}, \ldots, F_{K,t}$ denote the values of K factors at time t, $t = 1, 2, \ldots, T$.
For $i = 1, 2, \ldots, N$, let $R_{i,t}$ denote the return on asset i at time t.
Then a factor model that describes $R_{i,t}$ in terms of $F_{1,t}, F_{2,t}, \ldots, F_{K,t}$ has the
form
$$R_{i,t} - R_{f,t} = \alpha_i + \beta_{i,1} F_{1,t} + \beta_{i,2} F_{2,t} + \cdots + \beta_{i,K} F_{K,t} + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T,$$
where $\epsilon_{i,1}, \epsilon_{i,2}, \ldots, \epsilon_{i,T}$ are unobserved mean-zero random variables that are
uncorrelated with the factors. These terms represent the component of the
asset’s excess return not explained by the factors.
Note that the values of the factors F1,t , F2,t , . . . , FK,t are the same for
each asset and, hence, do not depend on i; in this sense, they may be viewed
as common factors. The parameters $\beta_{i,1}, \beta_{i,2}, \ldots, \beta_{i,K}$, known as the factor
sensitivities for asset i, measure how the factors affect a particular asset’s
returns. Hence, these parameters depend on i; however, they are assumed to
be constant over the observation period, so that they do not depend on t.
In the factor model, the factor sensitivities, like $\beta$ in the single-index model,
are unknown parameters that must be estimated.
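It is assumed that the same factor model applies to all assets under
consideration so that, for $t = 1, 2, \ldots, T$,

$$\begin{aligned}
R_{1,t} - R_{f,t} &= \alpha_1 + \beta_{1,1} F_{1,t} + \beta_{1,2} F_{2,t} + \cdots + \beta_{1,K} F_{K,t} + \epsilon_{1,t} \\
R_{2,t} - R_{f,t} &= \alpha_2 + \beta_{2,1} F_{1,t} + \beta_{2,2} F_{2,t} + \cdots + \beta_{2,K} F_{K,t} + \epsilon_{2,t} \\
&\;\;\vdots \\
R_{N,t} - R_{f,t} &= \alpha_N + \beta_{N,1} F_{1,t} + \beta_{N,2} F_{2,t} + \cdots + \beta_{N,K} F_{K,t} + \epsilon_{N,t}
\end{aligned}$$

or, in matrix notation,

$$R_t - R_{f,t} \mathbf{1} = \alpha + \beta F_t + \epsilon_t, \qquad (10.1)$$

where $\alpha = (\alpha_1, \ldots, \alpha_N)^T$, $\beta$ is the N × K matrix of factor sensitivities,

$$\beta = \begin{pmatrix}
\beta_{1,1} & \beta_{1,2} & \cdots & \beta_{1,K} \\
\beta_{2,1} & \beta_{2,2} & \cdots & \beta_{2,K} \\
\vdots & \vdots & \cdots & \vdots \\
\beta_{N,1} & \beta_{N,2} & \cdots & \beta_{N,K}
\end{pmatrix},$$

$F_t = (F_{1,t}, F_{2,t}, \ldots, F_{K,t})^T$ is the K × 1 vector of factor values at time t,
and $\epsilon_t = (\epsilon_{1,t}, \epsilon_{2,t}, \ldots, \epsilon_{N,t})^T$ is an N × 1 random vector of unobserved model
errors at time t.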
We assume that the stochastic process
$$\left\{ \left( (R_t - R_{f,t} \mathbf{1})^T, F_t^T \right)^T : t = 1, 2, \ldots \right\}$$
is a weakly stationary process, so any linear function of $R_t - R_{f,t} \mathbf{1}$ and $F_t$ is
a weakly stationary process; in particular, $\epsilon_1, \epsilon_2, \ldots$ is a weakly stationary
process. Furthermore, we assume that

$$\mathrm{Cov}(\epsilon_t, F_t) = 0, \quad t = 1, 2, \ldots, T,$$

so that $\Sigma$, the covariance matrix of $R_t$, may be written as

$$\Sigma = \beta \Sigma_F \beta^T + \Sigma_\epsilon, \qquad (10.2)$$

where $\Sigma_F$ denotes the covariance matrix of $F_t$ and $\Sigma_\epsilon$ denotes the covariance
matrix of $\epsilon_t$; by the weak stationarity assumption, these parameters do not
depend on t.
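The step from (10.1) to (10.2) can be written out as follows. This is a sketch that treats the risk-free return $R_{f,t}$ as nonrandom, an assumption made here for simplicity; the two cross terms vanish because the errors are uncorrelated with the factors.

```latex
\operatorname{Cov}(R_t)
  = \operatorname{Cov}(\beta F_t + \epsilon_t)
  = \beta \operatorname{Cov}(F_t) \beta^T
    + \beta \operatorname{Cov}(F_t, \epsilon_t)
    + \operatorname{Cov}(\epsilon_t, F_t) \beta^T
    + \operatorname{Cov}(\epsilon_t)
  = \beta \Sigma_F \beta^T + \Sigma_\epsilon .
```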
As with the single-index model, the errors for different assets are assumed
to be uncorrelated so that $\Sigma_\epsilon$ is a diagonal matrix of the form

$$\Sigma_\epsilon = \begin{pmatrix}
\sigma_{\epsilon,1}^2 & 0 & \cdots & 0 \\
0 & \sigma_{\epsilon,2}^2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \sigma_{\epsilon,N}^2
\end{pmatrix}$$

where $\sigma_{\epsilon,i}^2 = \mathrm{Var}(\epsilon_{i,t})$, $i = 1, 2, \ldots, N$. Under this assumption, any correlation
among asset returns for different assets is attributable to the common factors
that affect all assets.
Thus, the factor model may be viewed as an extension of the single-index
model in which the excess return on the market index, Rm,t − Rf,t , is replaced
by the factors F1,t , F2,t , . . . , FK,t and the vector β = (β1 , β2 , . . . , βN )T of the
single-index model, which gives the value of beta for each asset, is replaced by
the matrix β. The ith row of β, (βi,1 , βi,2 , . . . , βi,K ), gives the factor sensitivities
for asset i; the jth column of β, (β1,j , β2,j , . . . , βN,j )T gives the sensitivities to
factor j for each of the N assets.

Portfolios
Consider a portfolio of the N assets under consideration based on a weight
vector $w = (w_1, w_2, \ldots, w_N)^T$. Then the return on the portfolio at time t,
$R_{p,t}$, may be written as

$$R_{p,t} = w^T R_t, \quad t = 1, 2, \ldots, T.$$

It is straightforward to show that under the model (10.1)

$$R_{p,t} - R_{f,t} = \alpha_p + \beta_p^T F_t + \epsilon_{p,t}, \quad t = 1, 2, \ldots, T,$$

where $\alpha_p = w^T \alpha$ and $\beta_p$ denotes the K × 1 vector of factor sensitivities for the
portfolio,
$$\beta_p^T \equiv (\beta_{p,1}, \beta_{p,2}, \ldots, \beta_{p,K}) = w^T \beta,$$
and $\epsilon_{p,t} = w^T \epsilon_t$.
That is, the factor model applies to the portfolio as well, with the factor
sensitivities for the portfolio given by weighted sums of the asset factor
sensitivities:

$$\beta_{p,j} = \sum_{i=1}^{N} w_i \beta_{i,j}, \quad j = 1, 2, \ldots, K.$$

The expected excess return on the portfolio at time t is given by

$$E(R_{p,t}) - \mu_{f,t} = w^T \alpha + w^T \beta E(F_t) = \alpha_p + \beta_p^T E(F_t)$$

and the variance of the return at time t is given by

$$\mathrm{Var}(R_{p,t}) = \beta_p^T \Sigma_F \beta_p + w^T \Sigma_\epsilon w.$$

The first term in this expression, $\beta_p^T \Sigma_F \beta_p$, represents a measure of the
systematic risk of the portfolio, that is, the risk explained by the factors,
while the second term, $w^T \Sigma_\epsilon w$, is a measure of the portfolio’s specific risk.
The systematic risk depends on the variation in the factors, as measured by
$\Sigma_F$, along with the factor sensitivities of the portfolio, $\beta_p$.
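The decomposition of portfolio variance into systematic and specific risk can be illustrated numerically. The sketch below uses made-up values for the factor sensitivities, the factor covariance matrix, the error variances, and the weights (three assets and two factors); none of these numbers come from the text.

```r
# Hypothetical inputs: N = 3 assets, K = 2 factors (illustrative values only)
beta <- matrix(c(1.2, -0.6,
                 0.4,  0.8,
                 0.9,  0.1), nrow = 3, byrow = TRUE)   # N x K sensitivities
Sig_F <- matrix(c(0.0014, 0.0010,
                  0.0010, 0.0080), nrow = 2)           # factor covariance matrix
Sig_eps <- diag(c(0.009, 0.008, 0.004))                # diagonal error covariance
w <- c(0.5, 0.3, 0.2)                                  # portfolio weights

beta_p <- drop(t(beta) %*% w)                          # portfolio factor sensitivities
systematic <- drop(t(beta_p) %*% Sig_F %*% beta_p)     # risk explained by the factors
specific <- drop(t(w) %*% Sig_eps %*% w)               # asset-specific risk

# Full covariance matrix from (10.2); its quadratic form gives the same variance
Sigma <- beta %*% Sig_F %*% t(beta) + Sig_eps
c(systematic = systematic, specific = specific,
  total = drop(t(w) %*% Sigma %*% w))
```

The total equals the sum of the two components, which is the variance formula above applied to these inputs.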

Estimation
Given a set of factors thought to be relevant to the asset returns under consideration,
the factor sensitivities are estimated from the available data. Let
$F_{1,t}, F_{2,t}, \ldots, F_{K,t}$ denote the values of K factors at time t, $t = 1, 2, \ldots, T$, and
let $R_{i,t}$, $t = 1, 2, \ldots, T$, denote the returns on a given asset. Then, according
to the factor model,

$$R_{i,t} - R_{f,t} = \alpha_i + \beta_{i,1} F_{1,t} + \beta_{i,2} F_{2,t} + \cdots + \beta_{i,K} F_{K,t} + \epsilon_{i,t}, \quad t = 1, 2, \ldots, T,$$

where $\epsilon_{i,t}$ is uncorrelated with $F_{1,t}, F_{2,t}, \ldots, F_{K,t}$; here, $R_{f,t}$ is the return on
the risk-free asset at time t.
Hence, as in the case of the single-index model, the parameter estimates
for asset i may be obtained using least-squares regression based on the returns
on asset i; that is, all parameter estimates may be obtained from N regression
analyses, one for each asset. Specifically, the parameters $\alpha_i, \beta_{i,1}, \beta_{i,2}, \ldots, \beta_{i,K}$
may be estimated using least-squares regression with response variable
$R_{i,t} - R_{f,t}$ at time t and predictor variables $F_{1,t}, F_{2,t}, \ldots, F_{K,t}$ at time t. The
error variance for asset i, $\sigma_{\epsilon,i}^2 = \mathrm{Var}(\epsilon_{i,t})$, may be estimated using the usual
estimator from the regression analysis.
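The asset-by-asset estimation procedure can be tried on simulated data, where the true parameters are known. The following is a minimal sketch under stated assumptions: the factors and errors are drawn from normal distributions, and all parameter values and variable names are hypothetical choices made for this illustration.

```r
set.seed(1)
T_obs <- 2000; N <- 3; K <- 2

# Simulated factors and true parameters (hypothetical values)
Fmat <- matrix(rnorm(T_obs * K, sd = 0.05), T_obs, K,
               dimnames = list(NULL, c("F1", "F2")))
alpha <- c(0.001, 0.000, -0.002)
beta <- matrix(c(1.2, -0.6,
                 0.4,  0.8,
                 0.9,  0.1), nrow = N, byrow = TRUE)
eps <- matrix(rnorm(T_obs * N, sd = 0.02), T_obs, N)

# Excess returns generated from the factor model
R_excess <- sweep(Fmat %*% t(beta), 2, alpha, "+") + eps

# One least-squares regression per asset
fits <- lapply(1:N, function(i) lm(R_excess[, i] ~ Fmat))
beta_hat <- t(sapply(fits, function(f) coef(f)[-1]))         # N x K estimates
sigma_eps_hat <- sapply(fits, function(f) summary(f)$sigma)  # error SD estimates
round(beta_hat, 2)
round(sigma_eps_hat, 3)
```

With a long simulated sample, the estimated sensitivities and error standard deviations should be close to the values used to generate the data; with 60 monthly observations, as in the examples of this chapter, the estimates are considerably noisier.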

Example 10.3 Consider the returns on JetBlue and EV Energy Partners
stock, as discussed in Example 10.1. According to that example, oil prices
apparently have an important effect on the returns on these stocks. Consider
a factor model with two factors, the return on the S&P 500 index and the
change in oil prices; recall that these data are stored in the variables sp500
and oil, respectively. Thus, using the general notation of this section, K = 2,
with $F_{1,t}$ given by the excess return on the S&P 500 index at time t and $F_{2,t}$
given by the proportional change in the price of West Texas Intermediate oil at time t.

The excess returns on JetBlue stock are stored in the variable jblu.
To estimate the parameters of the factor model for JetBlue stock, we use
the lm function to fit a regression model with predictor variables sp500
and oil:
> summary(lm(jblu~sp500+oil))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.00965 0.01282 0.75 0.4547
sp500 1.15544 0.34078 3.39 0.0013 **
oil -0.60598 0.19626 -3.09 0.0031 **

Residual standard error: 0.0948 on 57 degrees of freedom

Multiple R-squared: 0.226, Adjusted R-squared: 0.199
F-statistic: 8.33 on 2 and 57 DF, p-value: 0.000672
Therefore, the parameter estimates for the factor model for JetBlue stock,
which we take to be asset 1, are $\hat{\alpha}_1 = 0.00965$, $\hat{\beta}_{1,1} = 1.155$, $\hat{\beta}_{1,2} = -0.606$,
and $\hat{\sigma}_{\epsilon,1} = 0.0948$.
These results may be compared to the estimates from the market model
for JetBlue stock:
> summary(lm(jblu~sp500))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0137 0.0137 1.00 0.321
sp500 0.8770 0.3520 2.49 0.016 *

Residual standard error: 0.102 on 58 degrees of freedom

Multiple R-squared: 0.0967, Adjusted R-squared: 0.0811
F-statistic: 6.21 on 1 and 58 DF, p-value: 0.0156
Note that the estimated coefficient of the market index is different in the
market model and the factor model. This is not unexpected, because the
coefficients have different interpretations. In the market model, the coefficient
represents the change in the expected excess return on JetBlue stock corresponding
to a change in the return on the market index, while in the factor
model, it represents the change in the expected excess return on JetBlue stock
corresponding to a change in the return on the market index, holding the change
in oil prices constant.
Similarly, the negative coefficient of oil in the factor model indicates that
an increase in oil prices corresponds to a decrease in the expected excess return
on JetBlue stock, holding the return on the S&P 500 constant.
Note that the value of R-squared in the factor model, 0.226, is larger
than the value in the market model, 0.0967. Again, this is to be expected:
the factors sp500 and oil explain more of the variation in the returns on
JetBlue stock than does sp500 alone. In general, when adding predictor
variables to a regression model, the R-squared value increases or stays the
same.
Therefore, when comparing R-squared values from regression models with
different numbers of predictors, it is generally preferable to use adjusted
R-squared. As the name suggests, adjusted R-squared includes an adjustment
for the number of predictors in the model. Adding a predictor to a model can
lead to a decrease in adjusted R-squared. In the present case, the adjusted
R-squared value for the market model is 0.0811, while for the factor model
it is 0.199, so the basic conclusion does not change: Including the change in
oil prices in the model explains much more of the variation in the returns on
JetBlue stock.
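The adjustment can be reproduced from the reported R-squared values using the standard formula, 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p is the number of predictors. The sketch below applies it to the JetBlue regressions, taking n = 60 from the 5 years of monthly data; the function name adj_r2 is chosen here for illustration.

```r
# Adjusted R-squared for a regression with n observations and p predictors
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

adj_r2(0.0967, n = 60, p = 1)   # market model for JetBlue
adj_r2(0.226,  n = 60, p = 2)   # two-factor model for JetBlue
```

Up to rounding of the inputs, these calls give the adjusted R-squared values shown in the lm output, 0.0811 and 0.199.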
The estimates of the factor model for EV Energy Partners stock are
given by

> summary(lm(evep~sp500+oil))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.000828 0.011979 0.07 0.95
sp500 0.380479 0.318424 1.19 0.24
oil 0.782726 0.183384 4.27 7.5e-05 ***

Residual standard error: 0.0886 on 57 degrees of freedom

Multiple R-squared: 0.297, Adjusted R-squared: 0.272
F-statistic: 12 on 2 and 57 DF, p-value: 4.43e-05

Here, the coefficient of oil in the estimated regression model is positive, so
that an increase in oil prices corresponds to an increase in the
expected excess return on EVEP stock, holding the return on the S&P 500
constant.
An estimate of the covariance matrix of the returns on JetBlue and EV
Energy Partners stock based on the factor model may be obtained using the
relationship given in (10.2). Note that ΣF , the covariance matrix of the fac-
tors, may be estimated by the sample covariance matrix of observed factor
values; the other parameters in (10.2) may be obtained from the factor model
regressions. The relevant R commands are

> jblu.fm<-lm(jblu~sp500+oil)
> evep.fm<-lm(evep~sp500+oil)
> betamat<-rbind(jblu.fm$coef[2:3], evep.fm$coef[2:3])
> Sig.eps<-diag(c(summary(jblu.fm)$sigma,
+ summary(evep.fm)$sigma)^2)
> cov.fm<-betamat%*%cov(cbind(sp500, oil))%*%t(betamat) +
+ Sig.eps

Here jblu.fm and evep.fm contain the results from the factor model regressions.
The matrix betamat is the matrix of coefficient estimates; note that
$coef extracts the coefficient estimates from the result of lm. Thus, here
betamat contains the estimated factor sensitivities,

> betamat
sp500 oil
[1,] 1.16 -0.606
[2,] 0.38 0.783

The matrix Sig.eps is the estimate of $\Sigma_\epsilon$ from the factor model regressions

> Sig.eps
[,1] [,2]
[1,] 0.00899 0.00000
[2,] 0.00000 0.00785

The matrix cov.fm is the estimate of Σ based on the factor model

> cov.fm
[,1] [,2]
[1,] 0.01153 -0.00096
[2,] -0.00096 0.01104

and the corresponding correlation matrix is given by

> cov2cor(cov.fm)
[,1] [,2]
[1,] 1.0000 -0.0851
[2,] -0.0851 1.0000

Thus, the factor model with two factors, the return on the S&P 500 index and
the change in the price of WTI oil, captures the negative correlation between
the returns on JetBlue and EV Energy Partners stock.

10.4 Factors
As noted in the previous section, there is considerable freedom in the selection
of the factors to include in a factor model. Factors are often divided into two
categories. Economic factors are variables measuring the general state of the
market or of the economy; for instance, the return on a market index and
the unemployment rate are two examples of economic factors. Fundamental
factors are based on the characteristics of a particular firm, such as the size
of the firm, as measured by its market capitalization; however, because the
factors in our model must apply to all assets, such characteristics must first
be converted to a common factor.
An economic factor may be based on any macroeconomic variable thought
to have an important influence on the asset returns under consideration.
As noted earlier, the return on a market index is one example of an economic
factor; another, used in the previous section, is the change in oil prices. Other
commonly used economic factors include the unemployment rate, measures
of industrial production, inflation rate, and measures of consumer sentiment
or confidence. The data needed to construct such factors are available from
a variety of sources, such as the FRED website and the Bureau of Labor
Statistics website.
Example 10.4 Consider the stocks of four companies: Caterpillar, Inc.
(symbol CAT), maker of construction equipment; Cintas Corp. (CTAS), which
provides uniforms and related services to businesses; Exxon Mobil Corp.
(XOM); and Reliance Steel and Aluminum Co. (RS). Five years of monthly
data for the period ending December 31, 2014, were analyzed using a factor
model. The excess return data for these stocks are stored in the
matrix stks4.
Three factors were used: the excess returns on the S&P 500 index, the
proportional change in the Industrial Production Index, and the unemployment
rate, expressed as a proportion rather than as a percentage. The Industrial
Production Index is a measure of the industrial output of U.S. facilities; it is
available on the FRED website at https://research.stlouisfed.org/fred2/series/INDPRO.
The series of proportional changes in the index is stored in the
variable indpro. The unemployment rate is also taken from the FRED website
at https://research.stlouisfed.org/fred2/series/UNRATE. These data are
stored in the variable unemp.
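As a minimal sketch of how such series might be prepared, assume a hypothetical vector indpro.level of monthly index levels and a hypothetical vector unemp.pct of unemployment rates in percent; the numbers below are made up for illustration and are not FRED data.

```r
# Hypothetical monthly index levels (illustrative values only, not FRED data)
indpro.level <- c(100.0, 100.5, 100.2, 101.0, 101.6)

# Proportional change: (x_t - x_{t-1}) / x_{t-1}; one fewer value than levels
prop.change <- diff(indpro.level) / head(indpro.level, -1)

# A rate quoted in percent is converted to a proportion by dividing by 100
unemp.pct  <- c(9.8, 9.7, 9.5, 9.4)   # hypothetical percentages
unemp.prop <- unemp.pct / 100

print(round(prop.change, 5))
print(unemp.prop)
```

Note that T proportional changes require T + 1 index levels, so 60 monthly changes would need 61 monthly levels.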
The parameter estimates for the factor model may be calculated by using
the lm function with the data matrix as the response.
> summary(lm(stks4~sp500+indpro+unemp))
Response CAT :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.1081 0.0471 -2.29 0.026 *
sp500 1.7498 0.1927 9.08 1.3e-12 ***
indpro 0.0290 0.0169 1.71 0.092 .
unemp 1.1782 0.5821 2.02 0.048 *

Residual standard error: 0.0549 on 56 degrees of freedom

Multiple R-squared: 0.605, Adjusted R-squared: 0.583
F-statistic: 28.5 on 3 and 56 DF, p-value: 2.49e-11
Response CTAS :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0517 0.0290 1.78 0.080 .
sp500 0.8775 0.1187 7.39 7.8e-10 ***
indpro 0.0203 0.0104 1.95 0.056 .
unemp -0.5635 0.3587 -1.57 0.122


Residual standard error: 0.0339 on 56 degrees of freedom

Multiple R-squared: 0.512, Adjusted R-squared: 0.485
F-statistic: 19.6 on 3 and 56 DF, p-value: 8.46e-09

Response RS :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0782 0.0416 -1.88 0.065 .
sp500 1.7668 0.1700 10.39 1.1e-14 ***
indpro 0.0304 0.0149 2.03 0.047 *
unemp 0.7603 0.5137 1.48 0.144

Residual standard error: 0.0485 on 56 degrees of freedom

Multiple R-squared: 0.662, Adjusted R-squared: 0.644
F-statistic: 36.5 on 3 and 56 DF, p-value: 3.27e-13

Response XOM :
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.03568 0.02590 -1.38 0.1738
sp500 0.84442 0.10591 7.97 8.6e-11 ***
indpro -0.02685 0.00931 -2.88 0.0056 **
unemp 0.52112 0.31996 1.63 0.1090

Residual standard error: 0.0302 on 56 degrees of freedom

Multiple R-squared: 0.59, Adjusted R-squared: 0.568
F-statistic: 26.8 on 3 and 56 DF, p-value: 6.97e-11

Note that the statistical significance of the different factors varies considerably
by stock. This is not surprising; although, in general, the factors are related to
the stocks’ returns, the nature of the relationship depends on the particular
features of the stock under consideration. It is interesting to note that, for
each stock considered, one of indpro and unemp is statistically significant, or
close to significant, at the 0.05 level.
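The estimate of the matrix of factor coefficients, β, may be extracted from
the result of the lm function and is stored in the R matrix betamat. Note that
the commands that follow use a variable named unem rather than unemp; judging
from the output, unem holds the unemployment rate as a percentage, because its
coefficients equal those of unemp reported above divided by 100.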

> betamat<- t(lm(stks4~sp500+indpro+unem)$coef[-1, ])

> betamat
sp500 indpro unem
CAT 1.750 0.0290 0.01178
CTAS 0.878 0.0203 -0.00564
RS 1.767 0.0304 0.00760
XOM 0.844 -0.0268 0.00521


The index [-1,] is used to drop the estimate of the intercept when
constructing betamat.
The estimates of $\Sigma_F$, the covariance matrix of the factors, and $\Sigma_\epsilon$, the error
covariance matrix, are stored in the matrices Sig.F and Sig.eps, respectively.
> Sig.F<-cov(cbind(sp500, indpro, unem))
> Sig.F
sp500 indpro unem
sp500 0.00141 -0.00235 -0.00283
indpro -0.00235 0.18510 0.06878
unem -0.00283 0.06878 1.53847
> f.sig<-function(y){summary(lm(y~sp500+indpro+unem))$sigma}
> Sig.eps<-diag(apply(stks4, 2, f.sig)^2)
> Sig.eps
CAT CTAS RS XOM
CAT 0.00302 0.00000 0.00000 0.000000
CTAS 0.00000 0.00115 0.00000 0.000000
RS 0.00000 0.00000 0.00235 0.000000
XOM 0.00000 0.00000 0.00000 0.000912
The estimate of the covariance matrix of the assets based on the factor
model, that is, using (10.2), is given by
> Sig<-betamat%*%Sig.F%*%t(betamat) + Sig.eps
and the corresponding correlation matrix is given by
> cov2cor(Sig)
CAT CTAS RS XOM
CAT 1.000 0.494 0.618 0.506
CTAS 0.494 1.000 0.535 0.420
RS 0.618 0.535 1.000 0.530
XOM 0.506 0.420 0.530 1.000
This can be compared to the estimate based on the single-index model
CAT CTAS RS XOM
CAT 1.000 0.499 0.578 0.528
CTAS 0.499 1.000 0.531 0.485
RS 0.578 0.531 1.000 0.561
XOM 0.528 0.485 0.561 1.000
and to the sample correlation matrix
> cor(stks4)
CAT CTAS RS XOM
CAT 1.000 0.432 0.729 0.484
CTAS 0.432 1.000 0.569 0.451
RS 0.729 0.569 1.000 0.579
XOM 0.484 0.451 0.579 1.000


The estimates based on the factor model and the single-index model are
generally similar, but there are some differences. For instance, the correlation of
the returns on XOM and the returns on the other stocks is smaller for the
factor model than for the single-index model. This is likely because
the estimate of the coefficient of indpro for XOM is negative, while the
estimates of this coefficient for the other stocks are positive, which reduces the
factor model estimate of correlation relative to the estimate based on the
single-index model.
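The calculation in this example can be sketched end to end on simulated data. The following is only an illustration under stated assumptions: the factor values, the "true" sensitivities in beta.true, and the asset names A to D are invented, and the residual variances are computed directly from the residual matrix rather than with apply.

```r
set.seed(1)
n <- 60; N <- 4; K <- 3                      # months, assets, factors

# Simulated factor values and excess returns (illustrative only)
Fmat <- matrix(rnorm(n * K, sd = 0.04), n, K,
               dimnames = list(NULL, c("f1", "f2", "f3")))
beta.true <- cbind(c(1.5, 0.8, 1.7, 0.9),
                   c(0.2, 0.1, 0.3, -0.3),
                   c(0.1, -0.1, 0.1, 0.1))
eps  <- matrix(rnorm(n * N, sd = 0.03), n, N)
rets <- Fmat %*% t(beta.true) + eps
colnames(rets) <- c("A", "B", "C", "D")

# Least-squares fit with the return matrix as the response
fit <- lm(rets ~ Fmat)
betamat <- t(coef(fit)[-1, ])                # N x K; intercept row dropped

# Factor covariance and diagonal residual covariance
Sig.F   <- cov(Fmat)
Sig.eps <- diag(colSums(resid(fit)^2) / fit$df.residual)

# Factor-model covariance: beta Sigma_F beta^T + Sigma_eps
Sig <- betamat %*% Sig.F %*% t(betamat) + Sig.eps

print(round(cov2cor(Sig), 3))                # factor-model correlations
print(round(cor(rets), 3))                   # sample correlations
```

The diagonal of the factor-model estimate is close to the sample variances by construction; the two matrices differ mainly in the off-diagonal elements, where the factor model forces all correlation to come through the factors.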

Using Fundamental Factors in a Factor Model

Variables measuring certain characteristics of the assets may also be used to
construct a factor. However, because factors must be common to all assets,
and not specific to individual assets, such variables must be converted to
common factors in order to use them in a factor model. The method used for
this is based on the pioneering work of Fama and French (1993).
For instance, Fama and French (1993) note that the size of a firm is related
to its profitability. Therefore, it may be useful to include a “size” factor in a
factor model, such as one based on a firm’s market capitalization, the price
per share of the firm’s stock times the number of shares outstanding.
To construct a common factor capturing a size effect, the following proce-
dure is used. A large set of stocks, all those traded on the New York Stock
Exchange in the case of Fama and French (1993), is divided into three groups,
“small” firms, “medium-sized” firms, and “big” firms. Stocks in the “small”
group correspond to firms whose size is in the bottom one-third of the market
capitalizations of the stocks under consideration; the “big” group corresponds
to firms whose size is in the top one-third. Two portfolios are formed, one using
stocks from the “small” group and one using stocks from the “big” group.
For instance, we might consider equally-weighted portfolios of small and big
stocks, respectively, although the method used by Fama and French (1993)
to construct their portfolios is more sophisticated. The Fama–French “small
minus big” (SMB) factor is the return on the “small” portfolio minus the return
on the “big” portfolio. Therefore, SMB is the return on a zero-investment
portfolio designed to capture risk factors related to the size of the firm.
The same basic approach may be used to convert any variable measuring
some characteristic of a firm into a common factor. Let X denote a char-
acteristic of a firm that can be measured for all assets under consideration.
Rank all assets by their value of X and form “low-X” and “high-X” groups.
For instance, when defining the factor SMB, the low-X group is taken to be
the one-third with the smallest X-values and the high-X group is taken to be
the one-third with the largest X-values; however, other criteria for choosing
the groups can be used. Then, form two portfolios, one from the low-X group
and one from the high-X group; equally weighted portfolios are often used
but, as in Fama and French (1993), other approaches might be considered.
The factor corresponding to X is then the difference in the returns of the


low-X portfolio and those of the high-X portfolio. This procedure yields a
common factor that applies to all assets; it may also be used in a factor model
for a portfolio, for which it may not be possible to measure X, such as in the
firm size example.
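The general procedure can be sketched in a few lines. This is a simplified illustration, not the Fama and French (1993) construction: the characteristic X and the returns are simulated, the groups are formed once rather than re-formed periodically, and equally weighted portfolios are assumed.

```r
set.seed(2)
n.months <- 12
n.stocks <- 30

# Simulated data (illustrative only): X is a firm characteristic such as
# market capitalization; rets is a months-by-stocks matrix of returns
X    <- rlnorm(n.stocks, meanlog = 8, sdlog = 1)
rets <- matrix(rnorm(n.months * n.stocks, mean = 0.01, sd = 0.05),
               n.months, n.stocks)

# Cutoffs for the bottom and top thirds of X
cuts <- quantile(X, probs = c(1/3, 2/3))
low  <- X <= cuts[1]
high <- X >= cuts[2]

# Equally weighted portfolios; the factor is the low-X portfolio return
# minus the high-X portfolio return in each month
low.ret  <- rowMeans(rets[, low,  drop = FALSE])
high.ret <- rowMeans(rets[, high, drop = FALSE])
x.factor <- low.ret - high.ret

print(round(x.factor, 4))
```

Because the long and short portfolios each have weights summing to one, the combined position has weights summing to zero, which is what makes x.factor the return on a zero-investment portfolio.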

Example 10.5 Consider the stocks of the five companies, Cablevision
Systems Corp. (symbol CVC), Edison International (EIX), Expedia, Inc.
(EXPE), Humana, Inc. (HUM), and Wal-Mart Stores, Inc. (WMT), that
were analyzed in Chapter 9, in Examples 9.2 and 9.3. The data matrix of
5 years of excess monthly returns for the period ending December 31, 2014,
is stored in the variable stks.
In Example 9.3, the parameters of the single-index model, using the S&P
500 index as the market index, were estimated. The estimates of α and β for
these stocks are given by

> stks.mm$coef
CVC EIX EXPE HUM WMT
(Intercept) -9.75e-05 0.00914 0.0108 0.0162 0.0059
sp500 1.07e+00 0.47407 0.8902 0.6516 0.4568

In this example, the returns on these stocks are analyzed using a factor
model based on three factors. The first factor is the return on the S&P 500
index and the second factor is the SMB factor described earlier.
For the third factor, we use a factor based on the book-to-market ratio
of the stock. The “book value” of a stock is the value of the stock based on
accounting information for the firm, while the “market value” of the stock
is based on the price of the stock. Thus, a large book-to-market ratio suggests
that the stock is undervalued by the market; such a stock is often referred to as
a value stock. Fama and French (1993) construct a factor based on the book-to-
market ratio of the stocks, using the same general procedure used to construct
the factor SMB. The factor based on the book-to-market ratio is known as
HML, for “high minus low”; thus, this factor is based on a zero-investment
portfolio contrasting stocks with high book-to-market ratios with stocks with
low book-to-market ratios. The model with factors SMB and HML, together
with the return on a market index, is known as the Fama–French three-factor
model.
Data on the factors SMB and HML are available from the Kenneth R.
French Data Library, on the website
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.
This site contains extensive data useful
in analyzing financial data. The values for the factors SMB and HML are given
in the file “Fama/French 3 Factors” found in the section on U.S. Research
Returns Data.
Five years of monthly data on SMB and HML for the period ending
December 31, 2014, are stored in the variables smb and hml, respectively.
Note that the data in the French Data Library are generally in the form of


percentage returns, while here we use proportional returns; hence, the vari-
ables smb and hml contain the values given in the French Data Library, divided
by 100.
To fit the factor model with these three factors to the stocks represented
in the data matrix stks, we may use the function lm:
> stks.fact<-lm(stks~sp500+smb+hml)
The coefficient estimates for these factors may be extracted by
> stks.fact$coef
CVC EIX EXPE HUM WMT
(Intercept) 0.0026 0.0094 0.0072 0.016 0.0044
sp500 0.8379 0.4461 1.1570 0.535 0.6333
smb 0.5268 0.0637 -0.4224 0.625 -0.5659
hml 0.9994 0.1158 -1.5607 -0.389 -0.3341
Wal-Mart (WMT) has the largest market capitalization of the five stocks con-
sidered; hence, it is not surprising that its sensitivity to SMB is negative. CVC
has the smallest market capitalization and its sensitivity to SMB is fairly large
and positive. However, the sensitivities are not a direct measure of the firm’s
size; they measure a property of the relationship between the stock’s returns
and those of the zero-investment portfolio on which SMB is based. Hence, it
is possible for the stock returns of a large firm to have a positive sensitivity
to SMB or the returns of a small company to have a negative sensitivity to
SMB; that is, in fact, the case for EXPE, which has a fairly small market
capitalization but a negative sensitivity to SMB.
The summary function may be used to obtain the standard errors of the esti-
mates along with other useful information such as the R-squared and adjusted
R-squared values for the regressions; see Example 10.4.
Estimates of the residual standard deviations may be obtained by defining
a function f.sighat by
> f.sighat<-function(y){summary(lm(y~sp500+smb+hml))$sigma}
and then using the apply function
> apply(stks, 2, f.sighat)
CVC EIX EXPE HUM WMT
0.0811 0.0465 0.1033 0.0692 0.0398

Note that the factors in a factor model are, in general, correlated. For
instance, for the three factors analyzed in the previous example, the estimated
correlation matrix of the factors is
> cor(cbind(sp500, smb, hml))
sp500 smb hml
sp500 1.000 0.4380 0.2122
smb 0.438 1.0000 0.0614
hml 0.212 0.0614 1.0000


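An important consequence of this is that the estimated sensitivity of an
asset’s returns to the excess returns on the S&P 500 index generally differs from
the asset’s estimated value of beta in the market model. For example, the
coefficient of sp500 for WMT in the three-factor model of Example 10.5 is 0.6333,
while the market-model estimate of beta is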

> lm(wmt~sp500)$coef
(Intercept) sp500
0.0059 0.4568
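The source of this difference can be checked numerically. The sketch below uses simulated data with invented coefficients; it relies on a standard least-squares identity: the slope from the regression on the index alone equals the index coefficient in the larger model plus the coefficient of the second factor times the slope from regressing that factor on the index.

```r
set.seed(3)
n   <- 60
mkt <- rnorm(n, sd = 0.04)
# Second factor constructed to be correlated with the market factor
f2  <- 0.5 * mkt + rnorm(n, sd = 0.03)
y   <- 0.6 * mkt - 0.5 * f2 + rnorm(n, sd = 0.04)   # simulated asset returns

b.multi  <- coef(lm(y ~ mkt + f2))    # two-factor model
b.single <- coef(lm(y ~ mkt))         # market model
delta    <- coef(lm(f2 ~ mkt))["mkt"] # slope of f2 on the market factor

# In-sample identity for least squares
implied <- b.multi["mkt"] + b.multi["f2"] * delta
print(c(single = unname(b.single["mkt"]), implied = unname(implied)))
```

When the factors are uncorrelated in the sample, delta is zero and the two estimates of the index coefficient coincide.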

10.5 Arbitrage Pricing Theory

Like the single-index model, a factor model may be used to model the risk of an
asset as well as the covariance structure of a set of asset returns. In addition,
it gives information regarding the relationship between the expected returns
on an asset and the factors used in the model.
In this section, we consider the implications of a factor model for modeling
the expected return on an asset using an approach known as arbitrage pricing
theory (APT). Because the ideas in this section apply to any single time
period, for notational convenience, the subscript t will be omitted from the
random variables representing returns, factors, and so on.
First consider the market model for a given asset, asset $i$,

$$R_i - R_f = \alpha_i + \beta_i (R_m - R_f) + \epsilon_i,$$

where $R_i$, $R_m$, and $R_f$ are the returns on asset $i$, the returns on a market
index, and the returns on the risk-free asset, respectively. The random variable
$\epsilon_i$ is assumed to be uncorrelated with $R_m$ and to have mean zero so that the
expected excess return on the asset may be written

$$E(R_i - R_f) = E(R_i) - \mu_f = \alpha_i + \beta_i E(R_m - R_f).$$

However, without any further assumptions, this relationship is not very
useful; in fact, it is always true by simply defining $\alpha_i$ to be

$$\alpha_i = E(R_i - R_f) - \beta_i E(R_m - R_f).$$

According to the CAPM, if $R_m$ is the return on a market portfolio that
is efficient in the sense that it has the maximum Sharpe ratio, then $\alpha_i = 0$.
Therefore, if the CAPM holds with respect to the market index used in the
market model, then

$$E(R_i) - \mu_f = \beta_i E(R_m - R_f).$$

That is, under the efficiency of the portfolio with return $R_m$, the expected
excess return on an asset is proportional to its value of beta, $\beta_i$. Thus, the


difference in the expected returns on two assets may be described in terms of
the difference in their values of beta:

$$E(R_i) - E(R_j) = (\beta_i - \beta_j) E(R_m - R_f).$$

See Section 7.3 for further discussion of the implications of the CAPM.
Now consider a factor model of the form

$$R_i - R_f = \alpha_i + \beta_{i,1} F_1 + \beta_{i,2} F_2 + \cdots + \beta_{i,K} F_K + \epsilon_i,$$

where $F_1, F_2, \ldots, F_K$ are the values of the factors. Then, under the usual
assumptions on $\epsilon_i$, the expected excess return on asset $i$ may be written as

$$E(R_i) - \mu_f = \alpha_i + \beta_{i,1} E(F_1) + \beta_{i,2} E(F_2) + \cdots + \beta_{i,K} E(F_K).$$

However, like the corresponding relationship based on the market model,
without some conditions on $\alpha_i$, this equation is not useful for describing the
expected return on an asset.
Thus, the goal of APT is to derive a result similar to the CAPM for a
general factor model. Note that such a result does not need to conclude that
$\alpha_i = 0$ for all $i$ in order to be useful; it is enough that the difference in the
expected returns on two assets can be described in terms of the differences in
their factor sensitivities, $\beta_{i,k} - \beta_{j,k}$, $k = 1, 2, \ldots, K$:

$$E(R_i) - E(R_j) = (\beta_{i,1} - \beta_{j,1})\Gamma_1 + (\beta_{i,2} - \beta_{j,2})\Gamma_2 + \cdots + (\beta_{i,K} - \beta_{j,K})\Gamma_K,$$

for some constants $\Gamma_1, \Gamma_2, \ldots, \Gamma_K$. Note that such a result holds under the
condition that all $\alpha_i$ are equal,

$$\alpha_1 = \alpha_2 = \cdots = \alpha_N \equiv \alpha,$$

for some constant $\alpha$ but, as we will see, it also holds under weaker conditions
on $\alpha_1, \alpha_2, \ldots, \alpha_N$.
The key assumption of the CAPM is the efficiency of the market portfolio.
Given the generality of a factor model—there is considerable flexibility in the
specific factors used in the model—a more general approach is needed for
factor models. Hence, we rely on a concept more fundamental than efficiency,
arbitrage.
Roughly speaking, an arbitrage opportunity is one in which an investor
makes no net investment, has no chance of losing money, and has at least
some chance of making money.
Let $R$ denote an $N \times 1$ vector of asset returns. Consider a zero-investment
portfolio, that is, one based on a weight vector $v$ satisfying $v^T \mathbf{1} = 0$, with
return $R_p = v^T R$. The portfolio corresponding to $v$ is said to be an arbitrage
portfolio if there is zero probability of a negative return, $P(R_p < 0) = 0$, and
there is positive probability of a positive return, $P(R_p > 0) > 0$. Thus, an
arbitrage portfolio requires no investment, has zero probability of a negative


return, and positive probability of a positive return. According to economic
theory, a market in equilibrium does not contain any arbitrage portfolios, a
condition that we will refer to as the no-arbitrage assumption.
Consider a zero-investment portfolio with return Rp and suppose
Var(Rp ) = 0. Then, under the no-arbitrage assumption, we must have
E(Rp ) ≤ 0, for if E(Rp ) > 0, then P(Rp > 0) = 1; in that case, there is zero
probability of a negative return and positive probability of a positive return,
violating the no-arbitrage assumption. That is, a zero-investment portfo-
lio with return Rp for which Var(Rp ) = 0 and E(Rp ) > 0 contradicts the
no-arbitrage assumption.
However, a zero-investment portfolio with return Rp for which Var(Rp ) = 0
and E(Rp ) < 0 also violates the no-arbitrage assumption because, if v denotes
the weight vector corresponding to Rp , then the portfolio with weight vec-
tor −v is a zero-investment portfolio with zero variance and positive expected
return. Therefore, we must also have E(Rp ) ≥ 0. Because E(Rp ) must satisfy
both E(Rp ) ≤ 0 and E(Rp ) ≥ 0, it must satisfy E(Rp ) = 0. That is, under the
no-arbitrage assumption, a zero-investment portfolio with return Rp such that
Var(Rp ) = 0 must also have E(Rp ) = 0.
Consider the factor model (10.1) for a set of $N$ assets; as noted previously,
because we are focusing on a single time period, the subscript $t$ is dropped.
Then we may write the vector of asset returns $R$ as

$$R - R_f \mathbf{1} = \boldsymbol{\alpha} + \beta F + \epsilon.$$

Taking expectations,

$$E(R) - \mu_f \mathbf{1} = \boldsymbol{\alpha} + \beta E(F).$$

Our goal is to show that, under the no-arbitrage assumption, the vector of
expected excess returns $E(R) - \mu_f \mathbf{1}$ is of the form

$$\alpha \mathbf{1} + \beta \Gamma$$

for some scalar $\alpha$ and vector $\Gamma = (\Gamma_1, \Gamma_2, \ldots, \Gamma_K)^T$.

We begin by considering the case in which the residual returns are all zero,
$\epsilon = 0$, so that the excess returns of the assets are completely described by the
factors; thus,

$$R - R_f \mathbf{1} = \boldsymbol{\alpha} + \beta F.$$
Consider the following simple result from linear algebra.
Lemma 10.1. Let $\mathcal{M}$ denote a linear subspace of $\mathbb{R}^N$ and let $u$ be an element
of $\mathbb{R}^N$. If every $v \in \mathbb{R}^N$ that is orthogonal to $\mathcal{M}$ is also orthogonal to $u$, then
$u \in \mathcal{M}$.
Proof. Let $\mathcal{M}^\perp$ denote the linear subspace of $\mathbb{R}^N$ consisting of vectors that
are orthogonal to $\mathcal{M}$. Note that $\mathcal{M}$ and $\mathcal{M}^\perp$ are linear subspaces that
have only the zero vector in common. Also, $(\mathcal{M}^\perp)^\perp = \mathcal{M}$; that is,
the set of vectors that are orthogonal to $\mathcal{M}^\perp$ is simply $\mathcal{M}$. If $v \in \mathcal{M}^\perp$ implies


that $u$ and $v$ are orthogonal, then $u$ must be orthogonal to all vectors that
are orthogonal to $\mathcal{M}$; that is,

$$u \in (\mathcal{M}^\perp)^\perp,$$

so that $u \in \mathcal{M}$, proving the result.

Lemma 10.1 may now be used to prove the following simple form of APT.
Proposition 10.1. Suppose that the following factor model holds:

$$R - R_f \mathbf{1} = \boldsymbol{\alpha} + \beta F \qquad (10.3)$$

for some vector of random variables $F$ representing the factors.
Then, if the no-arbitrage assumption holds,

$$E(R) - \mu_f \mathbf{1} = \alpha \mathbf{1} + \beta \Gamma \qquad (10.4)$$

for some scalar $\alpha$ and some vector $\Gamma \in \mathbb{R}^K$.

Proof. Let $\mathcal{M}$ denote the $(K + 1)$-dimensional subspace of $\mathbb{R}^N$ spanned by the
columns of $\beta$ together with $\mathbf{1}$. Let $v \in \mathcal{M}^\perp$. Then $v$ is orthogonal to $\mathbf{1}$ and to
each column of $\beta$ so that

$$\mathbf{1}^T v = 0 \quad \text{and} \quad \beta^T v = 0_K. \qquad (10.5)$$

Consider a portfolio with return $v^T R$; according to (10.5) and (10.3), this is
a zero-investment portfolio such that

$$v^T R = v^T \boldsymbol{\alpha} + v^T \beta F = v^T \boldsymbol{\alpha}.$$

It follows that the portfolio return has zero variance and an expected value
$v^T \boldsymbol{\alpha}$. Under the no-arbitrage assumption, the expected return must be 0 so
that $v^T \boldsymbol{\alpha} = 0$.
That is, $v$ orthogonal to $\mathcal{M}$ implies that $v$ is orthogonal to $\boldsymbol{\alpha}$. It now
follows from Lemma 10.1 that $\boldsymbol{\alpha}$ lies in $\mathcal{M}$, the space spanned by $\mathbf{1}$ and the
columns of $\beta$; hence, $\boldsymbol{\alpha}$ must be of the form

$$\boldsymbol{\alpha} = \alpha \mathbf{1} + \beta \tilde{\Gamma}$$

for some scalar $\alpha$ and vector $\tilde{\Gamma} \in \mathbb{R}^K$. Therefore,

$$E(R) - \mu_f \mathbf{1} = \alpha \mathbf{1} + \beta \tilde{\Gamma} + \beta E(F);$$

the result now follows by writing this equation in the form (10.4) by defining
$\Gamma = \tilde{\Gamma} + E(F)$.
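The argument of the proof can be illustrated numerically. The following sketch uses an invented matrix of sensitivities and invented intercept vectors (alpha.in and alpha.out are hypothetical names); the orthogonal complement of the space spanned by the vector of ones and the columns of beta is obtained from a QR decomposition.

```r
set.seed(4)
N <- 6; K <- 2
beta <- matrix(rnorm(N * K), N, K)   # invented factor sensitivities
B    <- cbind(1, beta)               # columns span the subspace M

# An intercept vector of the form a*1 + beta %*% Gamma.tilde, so it lies in M
alpha.in <- 0.01 * rep(1, N) + beta %*% c(0.02, -0.01)

# Orthonormal basis of the orthogonal complement of M: the last
# N - (K + 1) columns of the complete Q factor of B
Q <- qr.Q(qr(B), complete = TRUE)
V <- Q[, (K + 2):N, drop = FALSE]

# Each column v of V is a zero-investment weight vector with no factor exposure
print(max(abs(t(V) %*% B)))

# With alpha in M, every such portfolio has zero return
print(max(abs(t(V) %*% alpha.in)))

# With a generic alpha outside M, a riskless zero-investment portfolio
# earns a nonzero return, which the no-arbitrage assumption rules out
alpha.out <- rnorm(N, sd = 0.01)
ret <- sum(V[, 1] * alpha.out)
print(ret)
```

If ret is negative, the portfolio with weights -V[, 1] earns the same amount with a positive sign, which mirrors the sign-reversal step used earlier in the section.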


According to this result, the expected excess return on asset $i$ may be written as

$$E(R_i) - \mu_f = \alpha + \beta_{i,1}\Gamma_1 + \beta_{i,2}\Gamma_2 + \cdots + \beta_{i,K}\Gamma_K$$

for some constants $\alpha, \Gamma_1, \Gamma_2, \ldots, \Gamma_K$. Furthermore, the difference in the
expected returns of two assets, say asset $i$ and asset $j$, can be written

$$E(R_i) - E(R_j) = (\beta_{i,1} - \beta_{j,1})\Gamma_1 + (\beta_{i,2} - \beta_{j,2})\Gamma_2 + \cdots + (\beta_{i,K} - \beta_{j,K})\Gamma_K.$$

That is, the difference in the expected returns of two assets can be described
in terms of the differences in the factor sensitivities for the two assets.

Asymptotic Arbitrage
Of course, the assumption that the asset returns are completely determined
by the factors is an unrealistic one. More general versions of APT are based
on the idea that if a zero-investment portfolio has a “small” return variance,
then it must have a “small” expected return. Note that this is a type of
continuity assumption. Recall that a function $f(\cdot)$ is continuous at a point $x_0$ if
$|f(x) - f(x_0)|$ is “small” whenever $|x - x_0|$ is “small.” Definitions of continuity
are generally based on the concept of convergence; for instance, in the case of
a function, $f(\cdot)$ is continuous at $x_0$ if, for a sequence $x_1, x_2, \ldots$, the condition
that $x_n \to x_0$ implies that $f(x_n) \to f(x_0)$.
Therefore, a more general treatment of APT is based on assumptions
regarding sequences of portfolios. For $N = 1, 2, \ldots$, consider a set of $N$ assets
and let $R^{(N)}$ denote the corresponding $N \times 1$ vector of asset returns. Let
$v_N$, $N = 1, 2, \ldots$, denote a sequence of vectors such that $v_N$ is $N \times 1$ and
$v_N^T \mathbf{1}_N = 0$, $N = 1, 2, \ldots$. Thus, each $v_N$ defines a zero-investment portfolio
based on the set of $N$ assets.
Let $R_p^{(N)} = v_N^T R^{(N)}$, $N = 1, 2, \ldots$, so that, for each $N$, $R_p^{(N)}$ is the return
on a zero-investment portfolio of $N$ assets. We say that the
no-asymptotic-arbitrage assumption holds if

$$\mathrm{Var}(R_p^{(N)}) \to 0 \quad \text{as } N \to \infty$$

implies that

$$E(R_p^{(N)}) \to 0 \quad \text{as } N \to \infty.$$

Thus, under the no-asymptotic-arbitrage assumption, a zero-investment
portfolio with a small return variance must also have a small mean return.
Suppose that for each $N = 1, 2, \ldots$ the factor model

$$R^{(N)} - R_f \mathbf{1}_N = \boldsymbol{\alpha}^{(N)} + \beta^{(N)} F + \epsilon^{(N)}$$

holds, where $\boldsymbol{\alpha}^{(N)}$ is an $N \times 1$ vector of constants, $\beta^{(N)}$ is an $N \times K$ matrix
of factor sensitivities, $F$ is a $K \times 1$ vector of factors, and $\epsilon^{(N)}$ is an $N \times 1$
random vector with zero mean vector.


Then, under the no-asymptotic-arbitrage assumption, it may be shown
that

$$E(R^{(N)}) - \mu_f \mathbf{1}_N \doteq \alpha \mathbf{1}_N + \beta^{(N)} \Gamma \qquad (10.6)$$

where $\alpha$ is a scalar and $\Gamma$ is a $K \times 1$ vector, both of which depend on $N$; that
is, they depend on the assets under consideration. Here “$\doteq$” means that the
difference between $E(R^{(N)}) - \mu_f \mathbf{1}_N$ and an expression of the form

$$\alpha \mathbf{1}_N + \beta^{(N)} \Gamma$$

is small when $N$ is large.

A formal proof of this result is quite technical and will not be presented
here. The basic idea is that, by constructing a portfolio from a large number
of assets, the residual risk of the portfolio may be made small, so that the
situation is similar to the one in Proposition 10.1 in which there is no residual
risk. Thus, an important condition is that we may form portfolios such that
the residual returns on the portfolios have small standard deviation. If the
residual returns for different assets are uncorrelated, it is straightforward to
show that such a condition holds; however, it may also hold in cases in which
the residual returns are correlated.
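The uncorrelated case can be checked with a short calculation. The sketch below assumes, purely for illustration, that every asset has the same residual standard deviation sigma and that the zero-investment portfolio is long the first half of the assets and short the second half, with each side having total weight one; the function name resid.var is hypothetical.

```r
# Residual variance of a zero-investment portfolio when the residual
# returns of different assets are uncorrelated (diagonal covariance matrix)
resid.var <- function(N, sigma = 0.05) {
  v <- c(rep(2 / N, N / 2), rep(-2 / N, N / 2))  # weights sum to zero
  Sig.eps <- diag(sigma^2, N)                    # assumed residual covariance
  drop(t(v) %*% Sig.eps %*% v)                   # equals 4 * sigma^2 / N
}

Ns  <- c(10, 100, 1000)
out <- sapply(Ns, resid.var)
print(data.frame(N = Ns, resid.variance = out))
```

Under these assumptions the residual variance is 4 sigma^2 / N, so it decreases to zero as the number of assets grows even though the size of the long and short positions is held fixed.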
The interpretation of the result is the same as the interpretation of the
result given in Proposition 10.1. The difference in the expected returns of two
assets may be described in terms of the differences in their factor sensitivities.
That is, the factor sensitivities are useful in describing the expected returns
on the assets. In particular, the expected excess return on an asset may be
written as a sum of a constant, not depending on the asset, plus a weighted
sum of its factor sensitivities.

10.6 Factor Premiums

The APT discussed in the previous section shows that factor models can be
used to describe the vector of asset mean returns. Consider a factor model for
an asset return vector $R_t$ of the form

$$R_t - R_{f,t} \mathbf{1} = \boldsymbol{\alpha} + \beta F_t + \epsilon_t, \quad t = 1, 2, \ldots, T, \qquad (10.7)$$

where $F_t$ is a vector of factor values and the residual return vector $\epsilon_t$ satisfies
$E(\epsilon_t) = 0$, in addition to the other conditions described in Section 10.3.
According to APT, we may write

$$E(R_t - R_{f,t} \mathbf{1}) = \alpha \mathbf{1} + \beta \Gamma, \quad t = 1, 2, \ldots, T \qquad (10.8)$$

for a scalar $\alpha$ and a vector $\Gamma$. Thus, estimates of $\alpha$ and $\Gamma$ may be used to
estimate the expected excess returns on the assets.


More importantly, this expression gives a relationship between the factor
sensitivities of an asset and its expected excess return that is useful for
understanding the roles played by various economic variables in describing the
expected returns of the different assets. In particular, the factor premiums tell
us how an asset’s sensitivities to various factors are compensated by a higher
expected return.
For $i = 1, 2, \ldots, N$, let

$$\bar{R}_i - \bar{R}_f = \frac{1}{T} \sum_{t=1}^{T} (R_{i,t} - R_{f,t})$$

denote the sample mean excess return on asset $i$. It follows from (10.8) that

$$E(\bar{R}_i - \bar{R}_f) = \alpha + \Gamma_1 \beta_{i,1} + \Gamma_2 \beta_{i,2} + \cdots + \Gamma_K \beta_{i,K} \qquad (10.9)$$

where $\beta_{i,1}, \beta_{i,2}, \ldots, \beta_{i,K}$ are the factor sensitivities of asset $i$ and
$\alpha, \Gamma_1, \Gamma_2, \ldots, \Gamma_K$ are unknown parameters; $\Gamma_k$ is known as the risk premium
or, simply, the premium of factor $k$. The premium of a factor represents
the reward, in terms of a greater expected return, for assuming the risk
associated with a factor. Thus, our goal is to estimate the factor premiums
$\Gamma_1, \Gamma_2, \ldots, \Gamma_K$ or, equivalently, the vector $\Gamma = (\Gamma_1, \Gamma_2, \ldots, \Gamma_K)^T$ along with
the scalar parameter $\alpha$.

Role of Arbitrage Pricing Theory

Before considering the estimation of α and Γ, it is useful to reiterate some of
the important aspects of APT and how they relate to the estimation of factor
premiums.
Taking expectations in the model (10.7), the expected excess returns on
the assets may be written as

$$E(R_t - R_{f,t} \mathbf{1}) = \boldsymbol{\alpha} + \beta E(F_t), \quad t = 1, 2, \ldots, T. \qquad (10.10)$$

This equation gives an expression for the excess mean return vector
$E(R_t - R_{f,t} \mathbf{1})$ in terms of the factor sensitivities for the assets, as given by the
matrix $\beta$, the vector $\boldsymbol{\alpha}$, which is a property of the assets under consideration,
and the vector of factor means $E(F_t)$.
APT tells us that the vector $\boldsymbol{\alpha}$ is approximately of the form

$$\alpha \mathbf{1} + \beta \tilde{\Gamma} \qquad (10.11)$$

for a scalar $\alpha$ and a vector $\tilde{\Gamma}$. Using this fact in (10.10), and defining $\Gamma$ to
be $\tilde{\Gamma} + E(F_t)$, leads to (10.8). Thus, it is important to keep in mind that, in
the expression (10.8), the parameter $\Gamma$ is different from the vector of factor
means, $E(F_t)$. In particular, we cannot estimate $\Gamma$ by the sample means of
the factors.


A second important fact to keep in mind is that the result of APT is valid
only if the factor model correctly describes the asset returns under
consideration. There are at least two aspects to this. First, we assume that the error
term $\epsilon_t$ in (10.7) is uncorrelated with the factor vector $F_t$. In particular, if we
inadvertently omit an important factor from the factor model, and that factor
is correlated with the factors in the model, then the least-squares estimators
of the factor sensitivities will be biased, a case of omitted-variable bias.
A second aspect of model misspecification is its effect on the correlation
structure of $\epsilon_t$. A key part of the proof of (10.6) is that we can form a portfolio
in which the variance of residual returns of the portfolio is negligible. If the
covariance matrix of $\epsilon_t$ is diagonal, then this is generally possible when $N$
is large. However, if, after accounting for the factors, the asset returns are
still correlated, it might not be possible. Thus, it is important that the factor
model includes those factors that play an important role in describing the
correlation between the returns of different assets. That is, there is a second
potential problem related to an omitted factor, in addition to possible
omitted-variable bias: the omitted variable may lead to correlation in the residual
returns for different assets, so that the expression in (10.11) is not an accurate
approximation to $\boldsymbol{\alpha}$.
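The second problem is easy to see in a simulation. In the sketch below, which uses invented sensitivities b1 and b2 and independent simulated factors, the returns are generated from a two-factor model with uncorrelated residuals; the residual correlations are then compared for the correctly specified fit and for a fit that omits the second factor.

```r
set.seed(5)
n <- 600; N <- 5
f1 <- rnorm(n, sd = 0.04)
f2 <- rnorm(n, sd = 0.04)                 # factor that will be omitted
b1 <- c(1.0, 0.8, 1.2, 0.9, 1.1)          # invented sensitivities to f1
b2 <- c(0.9, 1.0, 0.8, 1.1, 1.0)          # invented sensitivities to f2
eps  <- matrix(rnorm(n * N, sd = 0.02), n, N)
rets <- outer(f1, b1) + outer(f2, b2) + eps

res.full <- resid(lm(rets ~ f1 + f2))     # both factors included
res.omit <- resid(lm(rets ~ f1))          # f2 omitted

# Average absolute off-diagonal residual correlation under each fit
offdiag  <- function(C) C[upper.tri(C)]
cor.full <- mean(abs(offdiag(cor(res.full))))
cor.omit <- mean(abs(offdiag(cor(res.omit))))
print(c(full = cor.full, omitted = cor.omit))
```

With the omitted factor, the residuals of all assets share the common component b2 * f2, so a diagonal residual covariance matrix is no longer a reasonable description.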

Two-Stage Least-Squares Estimation

To estimate the parameters $\alpha, \Gamma_1, \Gamma_2, \ldots, \Gamma_K$, we use a two-stage approach. In
the first stage, we estimate the factor sensitivities $\beta_{i,1}, \beta_{i,2}, \ldots, \beta_{i,K}$ for each
asset using the least-squares method described in Section 10.3. This procedure
yields estimates $\hat{\beta}_{i,1}, \hat{\beta}_{i,2}, \ldots, \hat{\beta}_{i,K}$ for each $i = 1, 2, \ldots, N$.
In the second stage, we estimate $\alpha, \Gamma_1, \Gamma_2, \ldots, \Gamma_K$ using the relationships

$$E(\bar{R}_i - \bar{R}_f) = \alpha + \beta_{i,1}\Gamma_1 + \beta_{i,2}\Gamma_2 + \cdots + \beta_{i,K}\Gamma_K$$

for $i = 1, 2, \ldots, N$ but with the estimates $\hat{\beta}_{i,k}$ replacing the parameters $\beta_{i,k}$.
Specifically, we use least-squares regression with the sample mean excess
returns, $\bar{R}_i - \bar{R}_f$, $i = 1, 2, \ldots, N$, as the observed response variable and the
first-stage estimates of the factor sensitivities, $\hat{\beta}_{i,1}, \hat{\beta}_{i,2}, \ldots, \hat{\beta}_{i,K}$, as the
predictor variables corresponding to $\bar{R}_i - \bar{R}_f$. The methodology is illustrated in
the following example.
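Before turning to real data, the two stages can be sketched on simulated excess returns. Everything in this sketch is an assumption made for illustration: the number of assets, the factor values, the sensitivities, and the values a and Gam used to generate expected excess returns of the form a + beta Gamma.

```r
set.seed(6)
n <- 60; N <- 50; K <- 2

# Simulated mean-zero factors and invented parameter values
Fmat <- matrix(rnorm(n * K, sd = 0.04), n, K,
               dimnames = list(NULL, c("f1", "f2")))
beta <- cbind(runif(N, 0.5, 1.5), runif(N, -0.5, 0.5))
a    <- 0.001
Gam  <- c(0.006, 0.003)
mu   <- as.vector(a + beta %*% Gam)       # expected excess returns

# Simulated excess returns: mean + factor exposure + residual
rets <- matrix(mu, n, N, byrow = TRUE) + Fmat %*% t(beta) +
  matrix(rnorm(n * N, sd = 0.03), n, N)

# Stage 1: time-series regressions give the estimated sensitivities
stage1   <- lm(rets ~ Fmat)
beta.hat <- t(coef(stage1))[, -1]         # N x K; intercept column dropped

# Stage 2: cross-sectional regression of mean excess returns on beta.hat
rbar   <- colMeans(rets)
stage2 <- lm(rbar ~ beta.hat)
print(coef(stage2))                       # estimates of alpha, Gamma_1, Gamma_2
```

With only 60 time periods the second-stage estimates can be quite noisy, so they should not be expected to reproduce a and Gam closely in any single simulated sample.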
Example 10.6 Recall that in Example 10.2, stock return data for firms rep-
resented in the S&P 500 index were analyzed; 474 of the stocks had returns
for the period under consideration. The data matrix for these 474 stocks is
stored in the variable sp474.data.
The first step is to estimate the parameters of a factor model for each
stock. Here, we use four factors: the return on the S&P 500 index (sp500),
the factors SMB and HML described in Example 10.5 (variables smb and hml,
respectively), and a “momentum” factor, denoted by MOM. Like SMB and
HML, the momentum factor is based on constructing portfolios from two sets
of stocks, those that have performed well in recent months and those that have
performed poorly; the factor MOM is the return on a zero-investment portfolio
based on the difference in the returns on these two portfolios. Thus, MOM is
different than SMB and HML in the sense that it is based on previous returns
of the assets rather than on properties of the firms issuing the stock. The
momentum factor is originally attributed to Carhart (1997); values of MOM
are available in the Kenneth R. French Data Library, in the section “Sorts
involving Prior Returns.” The variable mom contains the values of MOM from
the French Data Library, divided by 100.
The factor model estimates for the 474 stocks may be calculated using

> sp474.fm<-lm(sp474.data~sp500+smb+hml+mom)

The estimates of β1 , the coefficient of the S&P 500 index; β2 , the coefficient
of SMB; β3 , the coefficient of HML; and β4 , the coefficient of MOM, for each
of the 474 stocks are stored in the matrix sp474.beta, which has 474 rows
and 4 columns and may be obtained by the command

> sp474.beta<-t(sp474.fm$coef)[, -1]

> head(sp474.beta)
sp500 smb hml mom
MMM 1.01 0.3661 -0.0988 -0.0107
ABT 0.80 -0.6508 -0.0792 0.2731
ACN 1.27 -0.0683 -0.6131 -0.4408
ATVI 1.19 -0.4622 -0.3889 0.4143
AYI 1.28 1.6768 -0.5286 0.3750
ADBE 1.23 0.1768 0.0784 0.0318

Note that the index [, -1] is used in deﬁning sp474.beta in order to drop
the column of intercept estimates; the transpose function t is used so that
sp474.beta has the same form as the parameter β.
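The layout of the coefficient matrix can be seen in a small simulation. The following is only a sketch: the names Y, f1, and f2 and all simulated values are assumptions made for illustration and are not the data used in this example. When the response in lm is a matrix, the coefficients are returned with one row per model term and one column per response, so transposing and dropping the first column leaves one row of sensitivities per asset.

```r
# Illustrative sketch with simulated data (all names and values are assumptions).
set.seed(1)
n.periods <- 60
f1 <- rnorm(n.periods)                      # hypothetical factor 1
f2 <- rnorm(n.periods)                      # hypothetical factor 2
Y <- cbind(A = 1.0 * f1 + 0.5 * f2 + rnorm(n.periods, sd = 0.1),
           B = 0.8 * f1 - 0.3 * f2 + rnorm(n.periods, sd = 0.1),
           C = 1.2 * f1 + 0.1 * f2 + rnorm(n.periods, sd = 0.1))
fit <- lm(Y ~ f1 + f2)                      # one regression per column of Y
dim(coef(fit))                              # 3 model terms (rows) by 3 assets (columns)
beta.hat <- t(coef(fit))[, -1]              # assets in rows, intercept column dropped
beta.hat
```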
To estimate the factor premiums, we ﬁt a regression model with the sample
mean excess returns for each of the 474 stocks as the response variable and
beta estimates as the predictor variables. Note that, in the function lm, we
may specify the model in terms of a matrix of predictor variables; then the
columns of the matrix are taken as the predictors.
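A brief sketch of this use of a matrix of predictors is given below; the objects X and y and the column names a and b are hypothetical and the data are simulated. The coefficient names are formed by joining the name of the matrix to the names of its columns, which is the source of labels such as sp474.betasp500 in regression output.

```r
# Illustrative sketch (simulated data; X, y and the column names are assumptions).
set.seed(2)
X <- cbind(a = rnorm(50), b = rnorm(50))    # matrix holding two predictors
y <- 1 + 2 * X[, "a"] - X[, "b"] + rnorm(50, sd = 0.1)
fit <- lm(y ~ X)                            # each column of X enters as a predictor
names(coef(fit))                            # "(Intercept)" "Xa" "Xb"
```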

> sp474.mean<-apply(sp474.data, 2, mean)

> summary(lm(sp474.mean~sp474.beta))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.014464 0.000901 16.06 < 2e-16 ***
sp474.betasp500 0.000372 0.000825 0.45 0.65
sp474.betasmb 0.006519 0.000666 9.79 < 2e-16 ***
sp474.betahml -0.004380 0.000615 -7.12 4.1e-12 ***
sp474.betamom 0.006333 0.000886 7.15 3.3e-12 ***

Residual standard error: 0.00726 on 469 degrees of freedom

Multiple R-squared: 0.254, Adjusted R-squared: 0.248
F-statistic: 39.9 on 4 and 469 DF, p-value: <2e-16

Therefore, the estimated mean excess return on a stock with beta estimates
given by β̂1 , β̂2 , β̂3 , and β̂4 is

0.0145 + 0.000372β̂1 + 0.00652β̂2 − 0.00438β̂3 + 0.00633β̂4.

According to the R-squared value for the regression, about 25.4% of the vari-
ation in the sample mean excess returns on the 474 stocks can be explained
by their estimates of the factor sensitivities; a better estimate of this quantity
is the adjusted R-squared value, which, in this case, is essentially the same
at 24.8%.
For instance, for the stock of the 3M Company (symbol MMM), β̂1 = 1.008,
β̂2 = 0.366, β̂3 = −0.0988, and β̂4 = −0.0107. Therefore, the estimate
of the mean excess return on 3M stock is

0.0145 + 0.000372(1.008) + 0.00652(0.366) − 0.00438(−0.0988)

+ 0.00633(−0.0107) = 0.0176;

for comparison, the observed sample mean excess return for MMM is 0.0148.
For the 474 stocks, the estimated mean excess returns are given by fitted
values from the regression
> sp474.fit<-lm(sp474.mean~sp474.beta)$fitted.values
and the average absolute error in the fitted values as estimates of sample mean excess
return is
> mean(abs(sp474.fit-sp474.mean))
[1] 0.00535
The correlation of the fitted values and the sample mean excess returns is
> cor(sp474.fit, sp474.mean)
[1] 0.504
which is simply the square root of the R-squared value from the regression.
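This relationship between the correlation and the R-squared value holds for any least-squares regression that includes an intercept. It can be checked with a short sketch on simulated data; the objects X and y and all values below are assumptions used only for illustration.

```r
# Illustrative check on simulated data (names and values are assumptions).
set.seed(3)
X <- matrix(rnorm(200 * 4), 200, 4)
y <- drop(0.01 + X %*% c(0.5, -0.2, 0.1, 0)) + rnorm(200)   # drop() keeps y a vector
fit <- lm(y ~ X)
r.squared <- summary(fit)$r.squared
c(cor(fitted(fit), y), sqrt(r.squared))     # the two values agree
```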

Obtaining Standard Errors of the Premium Estimates
In the example, the factor premiums were estimated using ordinary least-
squares regression with R̄i − R̄f , i = 1, 2, . . . , N , as the response variable and
the estimated factor sensitivities β̂i,k , k = 1, 2, . . . , K, i = 1, 2, . . . , N as the
predictor variables.
The least-squares procedure used in the example gives valid estimates of
the factor premiums. However, the standard errors given by lm are based on
the assumption that observations on the response variable are uncorrelated.
In practice, the returns on diﬀerent assets in a given time period are correlated;
hence, R̄i and R̄j are correlated. It follows that the methods used by lm tend to
overstate the amount of information available to estimate the factor premiums,
and the reported standard errors are generally too small.
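The effect of correlation on the standard errors can be seen from a standard least-squares result, sketched here under the assumption that the matrix of predictors is treated as fixed. The symbols X (the matrix with a column of ones and the estimated factor sensitivities), Ȳ (the vector of sample mean excess returns), θ (the vector containing α and the factor premiums), and Ω are notation introduced only for this illustration.

```latex
\hat{\theta} = (X^{\top}X)^{-1}X^{\top}\bar{Y}, \qquad
\operatorname{Cov}(\hat{\theta}) = (X^{\top}X)^{-1}X^{\top}\,\Omega\,X\,(X^{\top}X)^{-1},
\qquad \Omega = \operatorname{Cov}(\bar{Y}).
```

The standard errors reported by lm correspond to the special case Ω = σ²I, for which the expression reduces to σ²(XᵀX)⁻¹. When the off-diagonal elements of Ω are not zero, that simplification is not available, and the true sampling variances can be considerably larger than the reported ones.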
To obtain valid standard errors of the estimated factor premiums, we may
use an alternative approach in which the factor premiums are estimated for
each time period, using

R1,t − Rf,t , R2,t − Rf,t , . . . , RN,t − Rf,t

as the dependent variable in the analysis for period t. Thus, if our data con-
sist of 5 years of monthly returns, we would obtain 60 estimates of each factor
premium—one estimate for each time period. These estimates are obtained
using least-squares regression with the estimated factor sensitivities as the
predictor variables; note that the same predictor variables are used in each
regression. The final estimates of the factor premiums are given by the sample
means of these 60 estimates, and the standard errors of the premium
estimates are given by the usual expression for the standard error of a sample
mean.
Example 10.7 Consider estimation of the factor premiums for the factor
model analyzed in Examples 10.2 and 10.6. Recall that the data matrix for
the 474 stocks under consideration is stored in the variable sp474.data and
the corresponding estimates of the factor sensitivities for the assets are stored
in the matrix sp474.beta.
To estimate the factor premiums for each time period, we may use the
command
> spcoef<-lm(t(sp474.data)~sp474.beta)$coef
Note that the data matrix must be transposed in order to use it as the response
variable in lm. The result of this command, spcoef, is a matrix with five rows,
one for the intercept of the model and one for each factor coefficient in the
model, and 60 columns, one for each time period. For instance, the first
column of the matrix
> spcoef[,1]
(Intercept) sp500 smb hml mom
0.0334 -0.0681 0.0118 0.0117 -0.0593
gives the estimated factor premiums based on the data in period 1.
To obtain the overall estimates of the factor premiums, we average the
estimates in each row of spcoef
> apply(spcoef, 1, mean)
(Intercept) sp500 smb hml mom
0.014464 0.000372 0.006519 -0.004380 0.006333

The standard errors of the estimates are given by the sample standard
deviations of each row, divided by √60
> apply(spcoef, 1, sd)/(60^.5)
(Intercept) sp500 smb hml mom
0.00276 0.00553 0.00324 0.00272 0.00366
These results may be compared to those based on the analysis in Example
10.6:
> summary(lm(sp474.mean~sp474.beta))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.014464 0.000901 16.06 < 2e-16 ***
sp474.betasp500 0.000372 0.000825 0.45 0.65
sp474.betasmb 0.006519 0.000666 9.79 < 2e-16 ***
sp474.betahml -0.004380 0.000615 -7.12 4.1e-12 ***
sp474.betamom 0.006333 0.000886 7.15 3.3e-12 ***
Note that the parameter estimates based on averaging the 60 estimates are
identical to the ones obtained in Example 10.6. This is a general property of
least-squares estimates; it follows from the facts that the least-squares esti-
mates of regression coeﬃcients are linear functions of the response data and
the same predictor variables are used in each of the 60 regressions. The stan-
dard errors of these estimates, using the sample standard deviations of the
60 estimates, are considerably larger than those obtained in Example 10.6.
For instance, for estimating the premium of HML, the least-squares standard
error is 0.000615, while the two-stage standard error is 0.00272.
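The equality of the averaged period-by-period estimates and the estimates from a single regression on the sample means can be checked with a sketch on simulated data. The objects beta.hat and R, their dimensions, and all simulated values are assumptions chosen only for illustration; periods are in the rows of R, as in sp474.data.

```r
# Illustrative check on simulated data (all names and values are assumptions).
set.seed(4)
n.assets <- 30
n.periods <- 12
beta.hat <- matrix(rnorm(n.assets * 2), n.assets, 2,
                   dimnames = list(NULL, c("f1", "f2")))   # same predictors in every period
R <- matrix(rnorm(n.periods * n.assets, sd = 0.05), n.periods, n.assets)
per.period <- coef(lm(t(R) ~ beta.hat))     # 3 rows, one column per period
avg.est <- apply(per.period, 1, mean)       # average of the period-by-period estimates
mean.est <- coef(lm(apply(R, 2, mean) ~ beta.hat))   # single regression on sample means
all.equal(avg.est, mean.est)                # TRUE
se.est <- apply(per.period, 1, sd) / sqrt(n.periods) # standard errors of the averages
```

The standard errors in se.est use only the variation of the estimates across periods, so they do not rely on the assumption that returns on different assets are uncorrelated.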

When interpreting estimates of the factor premiums, it is useful to keep in
mind that the estimated factor sensitivities tend to be correlated across assets.
For instance, in the previous example, the estimated correlation matrix of the
factor sensitivities for the 474 stocks analyzed is
> cor(sp474.beta)
sp500 smb hml mom
sp500 1.000 0.166 0.030 -0.279
smb 0.166 1.000 0.109 -0.451
hml 0.030 0.109 1.000 -0.152
mom -0.279 -0.451 -0.152 1.000
For example, stocks with a large sensitivity to the MOM factor tend to have
smaller sensitivities to the other factors.

Rolling Regressions
In the models used in this section to estimate factor premiums, the excess
returns on the stocks under consideration are related to those stocks’ estimated
factor sensitivities. An important aspect of these analyses is that those excess
returns used are from the same data used to estimate the factor sensitivi-
ties. However, in practice, a common goal of a factor-model analysis is to
use current data to study the properties of future returns. Thus, it may be
more appropriate to estimate factor premiums using a response variable based
on return data for periods following the ones used to estimate the factor
sensitivities.

Example 10.8 Consider estimation of the factor premiums for the factor
model analyzed in Examples 10.2 and 10.6. In Example 10.6, estimates of
the factor premiums were obtained using least-squares regression with the
response variable taken to be the sample mean excess returns for the 474
assets and the predictor variables given by the estimated factor sensitivities.
The response and predictor variables are all based on data in the matrix
sp474.data, which contains the monthly excess returns on the 474 stocks for
the 5-year period ending December 31, 2014.
Now suppose we are interested in using the estimated factor sensitivities
to describe future asset returns. The variable sp474.data.115 contains the
excess returns for the 474 stocks for January 2015. Therefore, we can estimate
the factor premiums for the four factors in the model using least-squares
regression with response variable sp474.data.115 and the predictor variables
given in the matrix sp474.beta.

> summary(lm(sp474.data.115~sp474.beta))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.04205 0.00821 5.12 4.5e-07 ***
sp474.betasp500 -0.06412 0.00752 -8.52 < 2e-16 ***
sp474.betasmb 0.00330 0.00607 0.54 0.59
sp474.betahml -0.02486 0.00561 -4.43 1.2e-05 ***
sp474.betamom 0.00248 0.00808 0.31 0.76

Residual standard error: 0.0662 on 469 degrees of freedom

Multiple R-squared: 0.178, Adjusted R-squared: 0.171
F-statistic: 25.4 on 4 and 469 DF, p-value: <2e-16

The estimates obtained in the previous example have the drawback that
the response variable in the regression is based on the stock returns from
only a single month. One way to include additional months of returns in
the analysis is to use rolling regressions. For instance, suppose that we have
m months of asset returns, along with the corresponding factor values, and
suppose that our goal is to estimate the factor premiums using estimates of
the factor sensitivities based on 60 months of data.
To do this, we ﬁrst estimate the factor sensitivities using data from months
1 through 60 and then use those results as the predictor variables in a regression
model, with the response variable taken to be the excess returns for
all stocks from month 61. This procedure may then be repeated, estimating
the factor sensitivities using data from months 2 through 61 and estimating
the factor premiums using a regression model with response variable based
on data from month 62. We may continue in this way until the ﬁnal regres-
sion with the returns from month m as the response variable and the factor
sensitivities estimated using data from months m − 60 through m − 1 as the
predictors. The result is m − 60 estimates of the factor premiums, which can
be averaged using an approach similar to the one used in Example 10.7 to
obtain the ﬁnal estimates.
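The procedure just described can be sketched as a small function. This is a minimal illustration under stated assumptions: the function name rolling.premiums, its arguments, and the simulated returns and factors are hypothetical and are not part of the analysis in the text.

```r
# Illustrative sketch with simulated data; the function name, its arguments,
# and the simulated values are assumptions.
rolling.premiums <- function(returns, factors, window) {
  m <- nrow(returns)
  out <- matrix(NA, ncol(factors) + 1, m - window)
  for (j in 1:(m - window)) {
    est <- j:(j + window - 1)                           # months used for the sensitivities
    bet <- t(coef(lm(returns[est, ] ~ factors[est, ])))[, -1]   # first stage
    out[, j] <- coef(lm(returns[j + window, ] ~ bet))   # second stage, following month
  }
  out
}

set.seed(5)
m <- 72
n.assets <- 40
factors <- matrix(rnorm(m * 2, sd = 0.04), m, 2)
loadings <- matrix(runif(2 * n.assets, 0, 1.5), 2, n.assets)
returns <- factors %*% loadings + matrix(rnorm(m * n.assets, sd = 0.05), m, n.assets)
pmat <- rolling.premiums(returns, factors, window = 60)
dim(pmat)                                   # 3 rows and m - 60 = 12 columns
premium.est <- apply(pmat, 1, mean)         # averages of the rolling estimates
premium.se <- apply(pmat, 1, sd) / sqrt(ncol(pmat))
```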

Example 10.9 Consider estimation of the factor premiums as in Example 10.8,
using rolling regressions as discussed previously. The data matrix sp474.data6
contains 6 years of monthly excess returns on the 474 stocks described in
Example 10.2, and the matrix factor6 contains 6 years of monthly factor values
for the factors described in Example 10.8. Hence, we will calculate 12 sets of
factor premium estimates, each obtained using the same basic procedure used
in Example 10.8. The ﬁrst set of estimates uses data from months 1 through 60
to estimate the factor sensitivities, which are then used as predictor variables in
a regression model for the returns in month 61. The second set of estimates uses
data from months 2 to 61 to estimate the factor sensitivities, which are then
used as predictor variables in a regression model for the returns in month 62,
and so on.
To perform these rolling regressions, we may use the following commands.

> pmat<-matrix(0, 5, 12)

> for (j in 1:12){
+ dat<-sp474.data6[j:(j+59), ]
+ x.fact<-factor6[j:(j+59), ]
+ bet<-t(lm(dat~x.fact)$coef)[, -1]
+ pmat[, j]<-lm(sp474.data6[(j+60), ]~bet)$coef
+ }

The matrix pmat stores the 12 sets of factor premiums, one in each column.
For a given value of the index j, dat contains the return data and x.fact
contains the factor data for months j to j + 59 that will be used to obtain the
estimates of the factor sensitivities; these estimates are calculated using the
command lm(dat~x.fact)$coef and are stored in the variable bet. Note that
the transpose function t is used to put the estimates in the correct format
to use in a subsequent function, and the index [, -1] is used to drop the
estimate of the intercept term from the results.
The command lm(sp474.data6[(j+60), ]~bet) runs the regression with
the returns in month j + 60 as the response variable and the estimated factor
sensitivities in bet as the predictor variables. The estimated coeﬃcients from
this regression are stored in the jth column of the matrix pmat. After the loop
is completed, pmat contains 12 estimates of each factor premium.

Using the approach from Example 10.7, the factor premium estimates
may then be obtained by averaging the 12 values in each row of the
matrix pmat
> apply(pmat, 1, mean)
[1] 0.00762 -0.00851 0.00235 -0.00878 0.00895
and the standard errors may be estimated using
> apply(pmat, 1, sd)/(12^.5)
[1] 0.00846 0.00889 0.00290 0.00391 0.00875
Of course, the first value in these vectors refers to the intercept, which is not
a factor.
These results may be compared to those obtained in Example 10.7, in
which the estimated factor premiums are given by
> apply(spcoef, 1, mean)
(Intercept) sp500 smb hml mom
0.014464 0.000372 0.006519 -0.004380 0.006333
with standard errors given by
> apply(spcoef, 1, sd)/(60^.5)
(Intercept) sp500 smb hml mom
0.00276 0.00553 0.00324 0.00272 0.00366
The two sets of premium estimates have several differences that are likely
to be important in practice. For instance, the estimate of the premium for the
return on the S&P 500 index was found to be 0.000372 in Example 10.7; using
future returns as the response variable, the estimate is −0.00851. However,
the standard errors of both estimates are relatively large so that the difference
is unlikely to be statistically significant.
The average R-squared value for the 12 regressions used to calculate
the estimates in pmat is roughly 0.11, which is considerably less than the
R-squared value in Example 10.6. This is not surprising; we expect that
estimates of the factor sensitivities will have a weaker relationship with
future returns than they have with returns for the time period used for their
estimation.
Note that the standard errors of the estimates based on the rolling
regressions are generally larger than those of the estimates calculated in
Example 10.7, which is not surprising since they are based on the averages
of only 12 rolling estimates as compared to the averages of 60 estimates used
in Example 10.7. Of course, by going back further in time, we could obtain
more premium estimates, thus reducing the standard errors; as always, such
an approach is only useful if the relationships among the variables do not
change in important ways over the time period considered. Thus, one draw-
back of the rolling-regression method is that the estimates must be based
on a relatively long series of data in order to achieve the standard errors
of the same general magnitude as those obtained by the method used in
Example 10.7.

10.7 Applications of Factor Models

The ultimate goal of analyzing return data using a factor model is to provide
information that is useful in the investment process. There are several ways in
which a factor model may be used to better understand the properties of assets
and to guide the selection of portfolio weights. The most straightforward of these
is to use the factor model to estimate parameters of the distribution of the return
vector on the assets under consideration, such as its covariance matrix.

Example 10.10 Consider stocks for ﬁrms represented in the S&P 100 index;
see Example 8.5. The data matrix sp96.data contains 5 years of monthly
returns for the period ending December 31, 2014, for each stock; only 96 of
the 100 stocks had 5 years of monthly returns available, so sp96.data has 60
rows and 96 columns.
Consider the Fama–French three-factor model used in Example 10.5; using
this model, the covariance matrix of the returns may be estimated using the
same approach used in Example 10.4.

> sp96.ff<-lm(sp96.data~sp500+smb+hml)
> sp96.ff.beta<-sp96.ff$coef[-1,]
> Sig.FF<-cov(cbind(sp500, smb, hml))
> f.sighat.ff<-function(y){summary(lm(y~sp500+smb+hml))$sigma}
> sp96.Sig.fact<-t(sp96.ff.beta)%*%Sig.FF%*%sp96.ff.beta +
+ diag(apply(sp96.data, 2, f.sighat.ff)^2)

Hence, the matrix sp96.Sig.fact contains an estimate of the return
covariance matrix for the 96 stocks represented in sp96.data.
Consider the minimum-variance portfolio of those stocks, subject to the
constraint that all portfolio weights are nonnegative. Estimates of the port-
folio weights may be obtained using the function solve.QP in the quadprog
package.

> library(quadprog)
> sp96.mv<-solve.QP(Dmat=2*sp96.Sig.fact, dvec=rep(0, 96),
+ Amat=cbind(rep(1,96), diag(96)), bvec=c(1, rep(0, 96)),
+ meq=1)$solution

Note that many of the estimated weights are close to zero, but not exactly
zero

> head(sp96.mv)
[1] -4.2e-18 -2.5e-17 -5.3e-17 -3.0e-17 -2.8e-17 1.0e-02

The function round may be used to round values to a specified number of
decimal places; hence, the command
> sp96.mv<-round(sp96.mv, 4)
rounds the elements of sp96.mv to four decimal places:
> head(sp96.mv)
[1] 0.000 0.000 0.000 0.000 0.000 0.010
After rounding, only 20 stocks have nonzero weights in the minimum-variance
portfolio using the constraint that all weights are nonnegative.
Note that the sample covariance matrix cannot be used in this context: with only
60 monthly observations on 96 stocks it is singular, so solve.QP reports an error.
> solve.QP(Dmat=2*cov(sp96.data), dvec=rep(0, 96),
+ Amat=cbind(rep(1, 96), diag(96)), bvec=c(1, rep(0, 96)),
+ meq=1)$solution
Error in solve.QP(Dmat = 2 * cov(sp96.data), dvec = rep(0, 96),
Amat = cbind(rep(1, : matrix D in quadratic function is not
positive definite!
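The reason for this error can be illustrated with simulated data having the same dimensions as sp96.data. The sketch below is not the analysis in the text: the objects f, B, and R, the three-factor structure, and all simulated values are assumptions. With fewer periods than assets, the sample covariance matrix has eigenvalues equal to zero up to rounding error, while the factor-model estimate, which adds a diagonal matrix of positive residual variances, is positive definite.

```r
# Illustrative sketch with simulated data (dimensions mirror the example;
# the factor structure and all values are assumptions).
set.seed(6)
n.periods <- 60
n.assets <- 96
f <- matrix(rnorm(n.periods * 3, sd = 0.04), n.periods, 3)   # three hypothetical factors
B <- matrix(runif(3 * n.assets, -0.5, 1.5), 3, n.assets)      # sensitivities, factors in rows
R <- f %*% B + matrix(rnorm(n.periods * n.assets, sd = 0.05), n.periods, n.assets)

S <- cov(R)                                  # sample covariance matrix, 96 x 96
min.eig.S <- min(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
min.eig.S                                    # zero up to rounding error

fit <- lm(R ~ f)
B.hat <- coef(fit)[-1, ]                     # 3 x 96, intercept row dropped
resid.var <- apply(resid(fit), 2, function(e) sum(e^2) / (n.periods - 4))
Sig.fact <- t(B.hat) %*% cov(f) %*% B.hat + diag(resid.var)
min.eig.fact <- min(eigen(Sig.fact, symmetric = TRUE, only.values = TRUE)$values)
min.eig.fact > 0                             # TRUE
```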

The mean excess return on an asset may be estimated using the same gen-
eral approach. Such an estimate is based on estimates of the factor premiums,
as discussed in Section 10.6, along with estimates of the factor sensitivities
for a given asset. The methodology is illustrated in the following example.
Example 10.11 Consider the factor model with factors SMB, HML, MOM,
along with the excess return on the S&P 500 index. Estimates of the factor
premiums were obtained in Example 10.6; they are stored in the variable
fact.prem
> fact.prem
sp500 smb hml mom
0.000372 0.006519 -0.004380 0.006333
The constant in the equation for an asset’s excess mean return in terms of its
factor sensitivities is 0.01446.
For example, for 3M Company stock (symbol MMM), the factor sensitiv-
ities are
> sp474.beta["MMM",]
sp500 smb hml mom
1.0080 0.3661 -0.0988 -0.0107
It follows that the estimated mean excess return for 3M stock is
> 0.01446 + sum(fact.prem*sp474.beta["MMM", ])
[1] 0.0176
This may be compared to the sample mean excess return of 0.0148.
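The same calculation may be carried out for several stocks at once using a matrix product. The sketch below uses the rounded values printed in Example 10.6 for the first six stocks; the object names beta6, alpha.hat, and est.mean are introduced here only for illustration.

```r
# Values copied from the output printed in Examples 10.6 and 10.11 (rounded as shown there).
fact.prem <- c(sp500 = 0.000372, smb = 0.006519, hml = -0.004380, mom = 0.006333)
alpha.hat <- 0.01446
beta6 <- rbind(MMM  = c(1.01,  0.3661, -0.0988, -0.0107),
               ABT  = c(0.80, -0.6508, -0.0792,  0.2731),
               ACN  = c(1.27, -0.0683, -0.6131, -0.4408),
               ATVI = c(1.19, -0.4622, -0.3889,  0.4143),
               AYI  = c(1.28,  1.6768, -0.5286,  0.3750),
               ADBE = c(1.23,  0.1768,  0.0784,  0.0318))
colnames(beta6) <- names(fact.prem)
est.mean <- drop(alpha.hat + beta6 %*% fact.prem)   # one estimate per stock
round(est.mean["MMM"], 4)                            # 0.0176, as in the text
```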

Using Factor Sensitivities to Describe a Portfolio

Although it may be more natural to describe a portfolio in terms of the weights
given to the assets in the portfolio, such an approach does not take into
account the fact that many assets are closely related, with returns that are,
in some cases, highly correlated. Hence, even a portfolio constructed from a
large number of assets may not be truly diversiﬁed. An alternative approach
is to describe a portfolio in terms of its sensitivities to the diﬀerent factors
used in a factor model. The estimated sensitivities of a portfolio’s returns to
the various factors give a succinct description of its properties.

Example 10.12 Consider the returns on three mutual funds, TIAA-CREF
Small-Cap Equity Fund (TISEX), T. Rowe Price Small-Cap Value Fund
(PRSVX), and Vanguard Value Index Fund (VIVAX). Like mutual funds in
general, these three funds have holdings in a large number of stocks, with
around 300 stocks in each fund; hence, it is difficult to obtain an under-
standing of a fund by analyzing the weights given to each stock. One way to
summarize this information is to look at the composition of the fund by mar-
ket sector, such as energy, financial services, health care, and so on; although
such an analysis is often useful, not all stocks in a given sector are affected
the same way by different economic conditions and returns for stocks in dif-
ferent sectors may still be correlated, making assessment of the diversification
of the portfolio difficult. Analyzing a portfolio’s returns using a factor model
summarizes the relationship between the returns and the various underlying
factors that affect their volatility.
Suppose that the returns on these funds are analyzed using the fac-
tor model described in Example 10.6, which has factors SMB, HML, and
MOM, along with the excess return on the S&P 500 index. The data matrix
funds3.data contains monthly excess returns on the three funds for the 5-
year period ending December 31, 2014. The estimated factor sensitivities for
the three mutual funds are given by

> funds3.fact<-lm(funds3.data~sp500+smb+hml+mom)
> t(funds3.fact$coef)[,-1]
sp500 smb hml mom
tisex 1.040 0.9498 -0.0427 -0.0278
prsvx 0.893 0.9412 0.2335 -0.0494
vivax 0.950 -0.0161 0.2578 -0.0483

The adjusted R-squared values for the four-factor model regressions are given
by

> f.rsq<-function(y)
+ {summary(lm(y~sp500+smb+hml+mom))$adj.r.squared}
> apply(funds3.data, 2, f.rsq)
tisex prsvx vivax
0.985 0.974 0.983

indicating that the returns of each of the funds are closely related to the four
factors.
The estimated coefficients of the return on the market index are simi-
lar for the three funds; however, their sensitivities to the other factors are
often quite different. For instance, TISEX and PRSVX both have a relatively
large sensitivity to SMB of about 0.95, indicating that the returns on both
funds have a positive relationship to the returns on small cap stocks, and
their sensitivities to MOM are similar; however, PRSVX has a relatively large
positive sensitivity to HML while TISEX has a negative sensitivity to this
factor. This suggests that PRSVX tends to invest in more value stocks than
does TISEX. The funds PRSVX and VIVAX have similar coefficients of the
market index, HML, and MOM; however, the coefficient of SMB for VIVAX
is small and negative, while for PRSVX it is large and positive. This sug-
gests that the returns on PRSVX are taking advantage of a “small-cap stock
effect” while the returns on VIVAX are approximately unrelated to the size
factor.

Imposing Factor-Sensitivity Constraints

As discussed in Section 10.3, the factor sensitivities of a portfolio are weighted
sums of the factor sensitivities of the individual assets,

\[ \beta_{p,j} = \sum_{i=1}^{N} w_i \beta_{i,j}, \qquad j = 1, 2, \ldots, K, \]

where w1 , w2 , . . . , wN are the portfolio weights. Hence, it is often possible to
choose the portfolio weights in order to achieve a desired factor sensitivity for
the portfolio. This is illustrated in the following example.

Example 10.13 Consider the stocks represented in the data matrix big8
and consider a factor model based on ﬁve economic factors: the return on the
S&P 500 index, stored in the variable sp500; the unemployment rate, stored
in the variable unemp; the change in the Industrial Production Index stored in
the variable indpro; the change in the Consumer Sentiment Index, stored
in the variable consum; and the change in the Consumer Price Index, stored
in the variable cpi.
The Industrial Production Index and the unemployment rate are described
in Example 10.4. The Consumer Sentiment Index is based on the University
of Michigan’s Surveys of Consumers; these data are available on the
FRED website at https://research.stlouisfed.org/fred2/series/UMCSENT.
The Consumer Price Index is the Consumer Price Index for All Urban
Consumers prepared by the Bureau of Labor Statistics; it is also available
on the FRED website at https://research.stlouisfed.org/fred2/series/CPIAUCSL.

Some summary statistics of the factors for the time period considered are
given by

> apply(cbind(sp500, unemp, indpro,consum, cpi), 2, summary)

sp500 unemp indpro consum cpi
Min. -0.082 0.056 -0.702 -0.1290 -0.00336
1st Qu. -0.014 0.073 0.012 -0.0132 0.00021
Median 0.018 0.082 0.263 0.0119 0.00161
Mean 0.011 0.080 0.266 0.0057 0.00141
3rd Qu. 0.033 0.090 0.485 0.0386 0.00256
Max. 0.108 0.099 1.560 0.1060 0.00599

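The estimated factor sensitivities for the eight stocks are stored in the
variable big8.fact.beta, which has one row for each factor and one column
for each stock, the form used in the calculations that follow; its transpose
is displayed here:

> big8.fact<-lm(big8~sp500+unemp+indpro+consum+cpi)
> big8.fact.beta<-big8.fact$coefficients[-1, ]
> t(big8.fact.beta)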
sp500 unemp indpro consum cpi
AAPL 0.940 0.282 0.011 0.000 0.988
BAX 0.678 -0.304 -0.013 0.017 2.888
KO 0.507 0.092 0.013 -0.119 -0.275
CVS 1.096 -0.446 0.015 0.036 -1.989
XOM 0.846 0.496 -0.022 0.142 2.298
IBM 0.617 0.950 -0.008 0.030 2.030
JNJ 0.504 -0.138 -0.022 -0.023 -1.107
DIS 1.196 0.083 -0.014 0.079 -5.806

The estimate of the return covariance matrix based on the factor model is
stored in the variable big8.Sig.fact

> f.sighat.fact<-function(y){summary(lm(y~sp500+unemp+indpro+
+ consum+cpi))$sigma^2}
> Sig.FF1<-cov(cbind(sp500, unemp, indpro, consum, cpi))
> big8.Sig.fact<-t(big8.fact.beta)%*%Sig.FF1%*%big8.fact.beta +
+ diag(apply(big8, 2, f.sighat.fact))

Using this estimate, along with the sample mean excess returns, the
estimated weight vector of the risk-averse portfolio with the risk-aversion
parameter taken to be λ = 10 is given by

> big8.mean<-apply(big8, 2, mean)

> solve.QP(Dmat=10*big8.Sig.fact, dvec=big8.mean,
+ A=cbind(rep(1,8)), bvec=c(1), meq=1)$solution
[1] 0.282 -0.078 0.180 0.337 -0.296 -0.105 0.359 0.320

If we enforce the restriction that all asset weights must be nonnegative, the
estimated weights are given by
> big8.ra10<-solve.QP(Dmat=10*big8.Sig.fact, dvec=big8.mean,
+ A=cbind(rep(1,8),diag(8)), bvec=c(1, rep(0, 8)),
+ meq=1)$solution
> big8.ra10
[1] 0.244 0.000 0.084 0.270 0.000 0.000 0.195 0.207
The estimated factor sensitivities for the portfolio with weight vector
big8.ra10 are given by
> big8.fact.beta%*%big8.ra10
[,1]
sp500 0.9144
unemp -0.0538
indpro 0.0004
consum 0.0119
cpi -1.7397
Suppose we wish to construct a portfolio that is insensitive to the factors
unemp and cpi. This may be done using the function solve.QP, including the
constraint that the portfolio sensitivities to those factors are both zero. Deﬁne
a matrix A by
> A<-cbind(rep(1, 8), t(big8.fact.beta[c(2, 5),]), diag(8))
Then the second and third rows of the transpose of A contain the factor
sensitivities for unemp and cpi, respectively. Thus, the command
> big8.ra10.con<-solve.QP(Dmat=10*big8.Sig.fact, dvec=big8.mean,
+ Amat=A, bvec=c(1, 0, 0, rep(0, 8)), meq=3)$solution
computes the weights of the risk-averse portfolio based on the risk-aversion
parameter λ = 10 subject to the constraints that all portfolio weights are
nonnegative and that the factor sensitivities for unemp and cpi are zero. The
weights are given by
> big8.ra10.con
[1] 0.317 0.080 0.139 0.226 0.028 0.035 0.175 0.000
and the estimated factor sensitivities are
> big8.fact.beta%*%big8.ra10.con
[,1]
sp500 0.8039
unemp 0.0000
indpro 0.0026
consum -0.0059
cpi 0.0000
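The role of the matrix A can be checked directly from the rounded values printed in this example. In solve.QP, the constraints have the form t(Amat) %*% w >= bvec, with the first meq of them treated as equalities. The sketch below rebuilds A from the printed sensitivities, arranged with one row per factor as the indexing big8.fact.beta[c(2, 5),] assumes; the object names beta, w.con, and lhs are introduced only for illustration.

```r
# Values copied (rounded) from the output printed in this example.
beta <- rbind(sp500  = c(0.940, 0.678, 0.507, 1.096, 0.846, 0.617, 0.504, 1.196),
              unemp  = c(0.282, -0.304, 0.092, -0.446, 0.496, 0.950, -0.138, 0.083),
              indpro = c(0.011, -0.013, 0.013, 0.015, -0.022, -0.008, -0.022, -0.014),
              consum = c(0.000, 0.017, -0.119, 0.036, 0.142, 0.030, -0.023, 0.079),
              cpi    = c(0.988, 2.888, -0.275, -1.989, 2.298, 2.030, -1.107, -5.806))
colnames(beta) <- c("AAPL", "BAX", "KO", "CVS", "XOM", "IBM", "JNJ", "DIS")
w.con <- c(0.317, 0.080, 0.139, 0.226, 0.028, 0.035, 0.175, 0.000)

A <- cbind(rep(1, 8), t(beta[c("unemp", "cpi"), ]), diag(8))   # 8 rows, 11 columns
bvec <- c(1, 0, 0, rep(0, 8))
lhs <- drop(t(A) %*% w.con)                  # left-hand sides of the 11 constraints
round(lhs[1:3] - bvec[1:3], 2)               # equality constraints: zero up to rounding
all(lhs[4:11] >= bvec[4:11])                 # nonnegativity constraints hold
```

Because the printed weights and sensitivities are rounded, the equality constraints are satisfied only approximately in this check.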

Thus, we expect that the returns on the portfolio with weight vector
big8.ra10.con will be less sensitive to the economic conditions reﬂected in
unemp and cpi than is the portfolio with weight vector big8.ra10.
The estimated mean excess returns and return standard deviations for the
two portfolios are given by

> sum(big8.ra10*big8.mean)
[1] 0.0193
> sum(big8.ra10.con*big8.mean)
[1] 0.0173
> (big8.ra10%*%big8.Sig.fact%*%big8.ra10)^.5
[,1]
[1,] 0.0411
> (big8.ra10.con%*%big8.Sig.fact%*%big8.ra10.con)^.5
[,1]
[1,] 0.0392

Thus, the constrained portfolio has a smaller estimated mean excess return
but, because its return does not depend on the factors unemp and cpi, its
estimated return standard deviation is smaller as well. In terms of the
risk-aversion criterion function, the value for the portfolio without the
factor-sensitivity constraints is 0.0109, while the value for the portfolio
with those constraints is slightly less, at 0.0096.
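These criterion values can be reproduced from the printed means and standard deviations, assuming the criterion has the form of the mean return minus λ/2 times the return variance; this is the objective implied by the arguments Dmat=10*big8.Sig.fact and dvec=big8.mean passed to solve.QP. The function name criterion is introduced here only for illustration.

```r
# Values copied (rounded) from the output printed above; the function name is an assumption.
criterion <- function(mean.ret, sd.ret, lambda = 10) {
  mean.ret - (lambda / 2) * sd.ret^2         # mean minus (lambda/2) times variance
}
crit.ra10 <- criterion(0.0193, 0.0411)       # portfolio with weight vector big8.ra10
crit.con <- criterion(0.0173, 0.0392)        # portfolio with weight vector big8.ra10.con
round(c(crit.ra10, crit.con), 4)             # 0.0109 0.0096
```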

10.8 Suggestions for Further Reading

Factor models are among the core methodologies for understanding the behav-
ior of asset returns; hence, there are many books and research papers covering
their properties. Chincarini and Kim (2006) present an in-depth discussion of
the application of factor models to portfolio management; this book is par-
ticularly useful for its detailed treatment of the possible factors to include in
such a model. Alexander (2008, Chapter II.1), Elton et al. (2007, Chapter 8),
Fabozzi et al. (2006, Chapter 8), Francis and Kim (2013, Chapter 8), and
Ruppert (2004, Section 7.8) provide introductions to a number of different
aspects of factor models. A more technical treatment of these topics is avail-
able in Campbell et al. (1997, Chapter 6). The momentum factor used in
Sections 10.6 and 10.7 is attributed to Carhart (1997).
The analysis in Example 10.2 of Section 10.2, showing that the market
model beta is not closely related to the average returns on an asset, is a sim-
plified version of the discussion in Fama and French (1993). The procedure,
discussed in Section 10.4, for constructing a factor based on the characteris-
tics of the various firms, is covered in detail by Chincarini and Kim (2006,
Section 7.4); a more comprehensive treatment is presented by Bali et al. (2016,
Chapter 5), where this methodology is called “portfolio analysis.”

The proposition on APT, along with the corresponding discussion, given
in Section 10.5 is based on the work of Francis and Kim (2013, Chapter 16),
which provides an excellent introduction to this topic; see also Reilly and Brown
(2009, Chapter 9). Shumway (2000) outlines further details on asymptotic
arbitrage, presenting important results without using extensive mathematics;
see also Huberman and Wang (2008).
The use of factor models to estimate expected returns and the estimation of
factor premiums are important topics in financial statistics, and the discussion
in Section 10.6 provides just an overview of this topic. The two-stage approach,
in which the factor sensitivities are first estimated and these estimates are then
the predictor variables in a model used to estimate the factor premiums, is gen-
erally known as “Fama–MacBeth regression” after Fama and MacBeth (1973).
See Cochrane (2000, Chapter 12) for a useful introduction to this method; a
more general treatment is presented by Bali et al. (2016, Chapter 6). In par-
ticular, Bali et al. (2016) discuss the use of “future” returns as the response
variable in the model used to estimate factor premiums. Because different
assets have different error variances, some analysts use weighted least-squares
when estimating factor premiums; see Draper and Smith (1981, Section 2.11)
for a general discussion of weighted least-squares estimation and Shanken and
Zhou (2007) for a comparison of weighted least-squares with other methods
of estimating factor premiums.
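The two-stage procedure can be illustrated on simulated data. The following is a minimal sketch, not the book's code: all inputs (the number of assets, the true sensitivities, the premiums, and the noise levels) are hypothetical, and the second stage is run period by period with the coefficients then averaged, which is one common form of Fama–MacBeth regression.

```r
set.seed(1)
n_assets <- 25; n_periods <- 120; n_factors <- 2

# Hypothetical "true" parameters, used only to simulate data
beta_true <- matrix(runif(n_assets * n_factors, 0.5, 1.5), n_assets, n_factors)
premium   <- c(0.005, 0.002)
factors   <- matrix(rnorm(n_periods * n_factors, sd = 0.04), n_periods, n_factors)
errors    <- matrix(rnorm(n_periods * n_assets, sd = 0.02), n_periods, n_assets)

# Simulated excess returns (periods x assets): beta * (premium + factor) + error
returns <- sweep(factors, 2, premium, "+") %*% t(beta_true) + errors

# Stage 1: time-series regression of each asset's returns on the factors
stage1   <- lm(returns ~ factors)
beta_hat <- t(coef(stage1)[-1, , drop = FALSE])   # assets x factors

# Stage 2: cross-sectional regression on the estimated sensitivities,
# one regression per period
gamma <- t(sapply(seq_len(n_periods),
                  function(s) coef(lm(returns[s, ] ~ beta_hat))))

# Premium estimates: averages of the period-by-period coefficients;
# standard errors from their variability across periods
premium_hat <- colMeans(gamma)[-1]
se_hat      <- apply(gamma, 2, sd)[-1] / sqrt(n_periods)
rbind(estimate = premium_hat, std.error = se_hat)
```

In this simulated setting the estimates track the premium plus the average factor value over the sample, so they need not be close to the hypothetical premiums in any single run. A weighted least-squares variant would replace the call to lm in the second stage with one that supplies weights based on the residual variances from the first stage.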
The application of factor models to the investment process is often
described by the term factor-based investing; useful nontechnical introduc-
tions to the ideas of factor-based investing are given by Bender et al. (2013)
and Pappas and Dickson (2015). The use of factor models in describing the
properties of a portfolio is described by Francis and Kim (2013, Chapter 19);
factor-sensitivity constraints are discussed by Chincarini and Kim (2006,
Section 9.7).
The type of factor model discussed in this chapter is often described
as an “economic factor model.” Two other types of factor models that are
often used are “fundamental factor models” and “statistical factor models.”
A fundamental factor model has the same basic form as the models consid-
ered in this chapter. However, in such models, the goal is to describe the
returns on an asset in terms of asset-specific variables, such as those related
to a firm’s value or its profitability, rather than in terms of common factors.
Because these variables depend on the properties of each asset, they corre-
spond to the factor sensitivities in the factor model, and the factor values
are treated as unknown parameters to be estimated. In a statistical factor
model, the goal is to model the covariance matrix of the asset return vector
directly by describing it in terms of the covariances of certain linear combina-
tions of the asset returns, using a technique known as principal components
analysis. See, for example, Chincarini and Kim (2006, Chapter 3) and Qian
et al. (2007, Chapter 3) for general discussions of the different types of factor
models.
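To make the idea of a statistical factor model concrete, here is a small sketch on simulated returns; the data-generating setup, the choice of a single component, and all object names are assumptions made for illustration. It extracts principal components from the sample covariance matrix using the base R function eigen and rebuilds an approximate covariance matrix from the leading component plus asset-specific residual variances.

```r
set.seed(2)
n_periods <- 250; n_assets <- 6

# Hypothetical returns: one common component plus independent noise
common  <- rnorm(n_periods, sd = 0.04)
loading <- seq(0.6, 1.4, length.out = n_assets)
noise   <- matrix(rnorm(n_periods * n_assets, sd = 0.02), n_periods, n_assets)
returns <- outer(common, loading) + noise

# Principal components of the sample covariance matrix
S        <- cov(returns)
eig      <- eigen(S, symmetric = TRUE)
prop_var <- eig$values / sum(eig$values)   # share of variance per component

# Keep the first k components as "statistical factors"
k <- 1
V <- eig$vectors[, 1:k, drop = FALSE]      # weights of the linear combinations
L <- diag(eig$values[1:k], nrow = k)       # variances of those combinations

# Covariance explained by the factors, plus residual variances on the diagonal
S_factor  <- V %*% L %*% t(V)
resid_var <- diag(S) - diag(S_factor)
S_approx  <- S_factor + diag(resid_var)

round(prop_var, 3)
```

The columns of V give the linear combinations of asset returns that play the role of factors, and returns %*% V gives their values in each period. The number of components to retain is a modeling choice; here it is fixed at one only because the simulated data contain a single common component.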

10.9 Exercises
1. Let $\beta$ denote an $N \times K$ matrix, let $\Sigma_F$ denote a $K \times K$ symmetric
matrix, and let $\Sigma$ denote an $N \times N$ symmetric matrix. Show that

$$\beta \Sigma_F \beta^T + \Sigma \tag{10.12}$$

is a positive-definite matrix provided that $\Sigma_F$ and $\Sigma$ are positive
definite.
Suppose that one of $\Sigma_F$ and $\Sigma$ is nonnegative definite but not
positive definite. Does it follow that (10.12) is positive definite?
Why or why not?
2. Consider a factor model with three factors. Suppose that, for a set
of four assets, the matrix of factor sensitivities is given by
$$\beta = \begin{pmatrix} 1.0 & 0.3 & -0.2 \\ 0.9 & 0.7 & 0.5 \\ 1.1 & -0.1 & 1.0 \\ 0.5 & 1.1 & 0 \end{pmatrix}$$

and that the residual standard deviations for these assets are given
by 0.4, 0.5, 0.2, 0.6, respectively. Let
$$\Sigma_F = \begin{pmatrix} 0.15 & 0.035 & 0.015 \\ 0.035 & 0.050 & 0.002 \\ 0.015 & 0.002 & 0.030 \end{pmatrix}.$$

Find the covariance matrix of the return vector for these assets
under the assumption that the factor model holds and give the
corresponding correlation matrix.
3. Consider a factor model with three factors applied to a set of four
assets. For the parameter values given in Exercise 2, find the fac-
tor sensitivities and residual standard deviation for the equally
weighted portfolio of the four assets.
4. Calculate 5 years of monthly excess returns for the period ending
December 31, 2015, for five stocks, Papa John’s International, Inc.
(symbol PZZA), Bed Bath & Beyond, Inc. (BBBY), Netflix, Inc.
(NFLX), Time Warner, Inc. (TWX), and Verizon Communications,
Inc. (VZ); for the risk-free rate, use the return on the 3-month
Treasury Bill, available on the Federal Reserve website.

Consider a factor model with three factors, the excess return on
the S&P 500 index (symbol ^GSPC) and the Fama–French factors
SMB and HML. Data on the factors SMB and HML are available
from the Kenneth R. French Data Library, available on the
website http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
in the file “Fama/French 3 Factors” found in the
section on U.S. Research Returns Data. Note that the data in the
French Data Library are generally in the form of percentage returns;
hence, you should divide the values by 100.
Estimate the factor sensitivities for these five stocks, along with
the residual standard deviations.
5. Is a portfolio consisting entirely of an investment in the risk-free
asset, which has a positive return with probability one, considered
an arbitrage portfolio? Why or why not?
6. Show that, under the no-arbitrage assumption, all risk-free assets
must have the same return.
7. Consider a factor model with three factors. Suppose that the factor
premiums for the factors are 0.002, 0.001, and 0.0025, respectively,
and that the value of α in the equation for expected return on an
asset is 0.005.
Suppose that a given asset has factor sensitivities β1 = 1.0,
β2 = 0.75, and β3 = −0.10. Find the expected return on the asset.
8. Consider the returns on the five stocks and the factor model
described in Exercise 4. Using these data, estimate the factor
premiums for the three factors using the procedure described in
Example 10.7. Provide standard errors of the premium estimates.
9. Consider the returns on the five stocks and the factor model
described in Exercise 4.

a. Using the estimated factor sensitivities for that model, estimate
the covariance matrix of the assets’ returns. Calculate
the corresponding correlation matrix.
b. Using the estimated covariance matrix based on the factor
model and the sample mean vector for the assets, estimate the
weights of the risk-averse portfolio with risk-aversion parameter
λ = 10.
10. Consider the returns on the five stocks and the factor model
described in Exercise 4 and analyzed in Exercise 9. Find the risk-
averse portfolio with risk-aversion parameter λ = 10 subject to the
following constraints: all portfolio weights are nonnegative; the sen-
sitivity of the portfolio to the factor based on the S&P 500 index is
one; the sensitivity of the portfolio to the factor SMB is at least 0.25.

Use the estimated covariance matrix of the assets’ returns based on
the factor model together with the sample mean vector of the assets.
11. Consider the following five large mutual funds: American Funds
Income Fund of America Class A (symbol AMECX), Dodge & Cox
Stock Fund (DODGX), Fidelity Contrafund (FCNTX), Franklin
Income Fund Class A (FKINX), and Vanguard Wellington Fund
(VWENX). For each fund, calculate 5 years of monthly excess
returns for the period ending December 31, 2015; for the risk-free
rate, use the return on the 3-month Treasury Bill, available on the
Federal Reserve website.
Consider a factor model with three factors, the excess return on
the S&P 500 index (symbol ^GSPC) and the Fama–French factors
SMB and HML. Data on the factors SMB and HML are avail-
able from the Kenneth R. French Data Library, available on the
website http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
in the file “Fama/French 3 Factors” found in the
section on U.S. Research Returns Data. Note that the data in the
French Data Library are generally in the form of percentage returns;
hence, you should divide the values by 100.
a. Estimate the factor sensitivities for these five funds. Based on
these results, identify the two funds that appear to be the most
similar and identify the two funds that appear to be the most
different.
b. Estimate the correlation matrix of the funds’ returns. Are the
returns of the “similar” funds identified in Part (a) highly corre-
lated compared to other pairs of funds? Do the “different” funds
identified in Part (a) have a low return correlation compared to
the other funds?
c. Compute the sample mean excess returns and sample return
standard deviations for the five funds. Are the sample mean
excess returns and the sample return standard deviations of the
“similar” funds identified in Part (a) generally similar, as com-
pared to the results for the other funds? Are the sample mean
excess returns and the sample return standard deviations of the
“different” funds identified in Part (a) generally different, as
compared to the results for the other funds?
d. Based on your results, would you conclude that the factor
sensitivities are useful for identifying portfolios with similar
properties? Why or why not?
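
Several of the exercises above rely on the covariance structure in (10.12) and on the factor sensitivities of a portfolio. The sketch below shows how such calculations might be set up in R. The parameter values are hypothetical (they are deliberately not those of Exercise 2), the residual covariance matrix is assumed to be diagonal, and the object names are illustrative.

```r
# Hypothetical inputs: 3 assets, 2 factors
beta <- matrix(c(1.0,  0.2,
                 0.8,  0.5,
                 1.2, -0.3), nrow = 3, byrow = TRUE)
Sigma_F <- matrix(c(0.04, 0.01,
                    0.01, 0.02), nrow = 2, byrow = TRUE)
resid_sd <- c(0.3, 0.2, 0.4)

# Covariance matrix of the returns under the factor model,
# assuming uncorrelated residuals: beta Sigma_F beta^T + diag(resid_sd^2)
Sigma_R <- beta %*% Sigma_F %*% t(beta) + diag(resid_sd^2)
Corr_R  <- cov2cor(Sigma_R)

# Equally weighted portfolio: its factor sensitivities are the weighted
# average of the assets' sensitivities
w      <- rep(1 / 3, 3)
beta_p <- drop(t(beta) %*% w)

# Residual standard deviation of the portfolio (uncorrelated residuals assumed)
resid_sd_p <- sqrt(sum(w^2 * resid_sd^2))

round(Corr_R, 3)
```

The same pattern applies with estimated quantities: replace beta, Sigma_F, and resid_sd with estimates from the fitted factor model.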

References

Agresti, A. and Finlay, B. (2009). Statistical Methods for the Social Sciences.
Pearson, Upper Saddle River, NJ, fourth edition.

Alexander, C. (2008). Practical Financial Econometrics. Wiley, Chichester.

Bali, T. G., Engle, R. F., and Murray, S. (2016). Empirical Asset Pricing:
The Cross Section of Stock Returns. Wiley, Hoboken, NJ.

Bender, J., Briand, R., Melas, D., and Subramanian, R. A. (2013). Foun-
dations of Factor Investing. Technical report. MSCI Index Research,
New York, NY.

Benninga, S. (2008). Financial Modeling. The MIT Press, Cambridge, MA,
third edition.

Bernstein, W. J. (2001). The Intelligent Asset Allocator: How to Build Your
Portfolio to Maximize Returns and Minimize Risk. McGraw-Hill, New York,
NY.

Best, M. J. and Grauer, R. R. (1991). On the Sensitivity of Mean-Variance-
Efficient Portfolios to Changes in Asset Means: Some Analytical and
Computational Results. The Review of Financial Studies, 4:315–342.

Black, F. (1972). Capital Market Equilibrium with Restricted Borrowing.
Journal of Business, 45:444–455.

Blattberg, R. C. and Gonedes, N. J. (1974). A Comparison of the Stable
and Student Distributions as Statistical Models for Stock Prices. Journal
of Business, 47:244–280.

Blitzstein, J. K. and Hwang, J. (2015). Introduction to Probability. CRC Press,
Boca Raton, FL.

Bouchaud, J.-P. and Potters, M. (2011). Financial Applications of Random
Matrix Theory: A Short Review. In Akemann, G., Baik, J., and
Francesco, P. D., editors, Oxford Handbook on Random Matrix Theory,
chapter 40, pages 824–850. Oxford University Press, Oxford, UK.

Bradfield, D. (2003). Investment Basics XLVI. On Estimating the Beta
Coefficient. Investment Analysts Journal, 57:47–53.

Caeiro, F. and Mateus, A. (2014). randtests: Testing randomness in R.
R package version 1.0.

Campbell, J. Y., Lo, A. W., and MacKinlay, A. C. (1997). The Econometrics
of Financial Markets. Princeton University Press, Princeton, NJ.

Canty, A. and Ripley, B. (2015). boot: Bootstrap R (S-plus) Functions.
R package version 1.3-17.

Carhart, M. M. (1997). On Persistence in Mutual Fund Performance. Journal
of Finance, 52:57–82.

Carlin, B. P. and Louis, T. A. (2000). Bayes and Empirical Bayes Methods
for Data Analysis. CRC Press, Boca Raton, FL, second edition.

Chambers, J. M., Cleveland, W. S., Kleiner, B., and Tukey, P. A. (1983).
Graphical Methods for Data Analysis. Wadsworth, Belmont, CA.

Chincarini, L. B. and Kim, D. (2006). Quantitative Equity Portfolio
Management. McGraw-Hill, New York, NY.

Cochrane, J. H. (2000). Asset Pricing. Princeton University Press,
Princeton, NJ.

Cowpertwait, P. S. P. and Metcalfe, A. V. (2009). Introductory Time Series
with R. Springer, New York, NY.

Dalgaard, P. (2008). Introductory Statistics with R. Springer, New York, NY,
second edition.

Damodaran, A. (1999). Estimating Risk Parameters. Report No.
S-CDM-99-02. Faculty Digital Archive, New York University.
http://hdl.handle.net/2451/26789.

Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and Their
Application. Cambridge University Press, Cambridge, UK.

DeMiguel, V., Garlappi, L., and Uppal, R. (2009). Optimal versus Naive
Diversification: How Inefficient Is the 1/N Portfolio Strategy? The Review
of Financial Studies, 22:1915–1953.

DeMiguel, V., Martin-Utrera, A., and Nogales, F. J. (2013). Size Matters:
Optimal Calibration of Shrinkage Estimators for Portfolio Selection.
Journal of Banking and Finance, 37:3018–3034.

Draper, N. R. and Smith, H. (1981). Applied Regression Analysis. Wiley,
New York, NY, second edition.

Efron, B. and Morris, C. (1970). Stein’s Paradox in Statistics. Scientific
American, 236:119–127.

Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap.
Chapman & Hall, New York, NY.

Elton, E. J., Gruber, M. J., Brown, S. J., and Goetzmann, W. N. (2007).
Modern Portfolio Theory and Investment Analysis. Wiley, Hoboken, NJ,
seventh edition.

Evans, M. J. and Rosenthal, J. S. (2004). Probability and Statistics: The
Science of Uncertainty. Freeman, New York, NY.

Fabozzi, F. J., Focardi, S. M., and Kolm, P. N. (2006). Financial Modeling of
the Equity Market: From CAPM to Cointegration. Wiley, Hoboken, NJ.

Fama, E. F. (1965). Random Walks in Stock Market Prices. Financial
Analysts Journal, 51:75–80.

Fama, E. F. (1970). Efficient Capital Markets: A Review of Theory and
Empirical Work. Journal of Finance, 25:383–417.

Fama, E. F. (1976). Foundations of Finance: Portfolio Decisions and
Securities Prices. Basic Books, New York, NY.

Fama, E. F. and French, K. R. (1993). Common Risk Factors in the Returns
on Stocks and Bonds. Journal of Financial Economics, 33:3–56.

Fama, E. F. and MacBeth, J. D. (1973). Risk, Return, and Equilibrium:
Empirical Tests. Journal of Political Economy, 81:607–636.

Foulkes, A. S. (2009). Applied Statistical Genetics with R. Springer, New York,
NY.

Francis, J. C. and Kim, D. (2013). Modern Portfolio Theory: Foundation,
Analysis, and New Developments. Wiley, Hoboken, NJ.

Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., and Hothorn, T.
(2016). mvtnorm: Multivariate Normal and t Distributions. R package
version 1.0-5.

Granger, C. W. J. (1963). A Quick Test for Serial Correlation Suitable for
Use with Non-Stationary Time Series. Journal of the American Statistical
Association, 58:728–736.

Gray, J. B. and French, D. W. (1990). Empirical Comparisons of Distributional
Models for Stock Index Returns. Journal of Business Finance &
Accounting, 17:451–459.

Grinold, R. C. and Kahn, R. N. (2000). Active Portfolio Management.
McGraw-Hill, New York, NY, second edition.

Hammersley, J. M. and Handscomb, D. C. (1964). Monte Carlo Methods.
Methuen, London.

Henderson, H. V. and Searle, S. R. (1981). On Deriving the Inverse of a Sum
of Matrices. SIAM Review, 23:53–60.

Hogg, R. V. and Tanis, E. A. (2006). Probability and Statistical Inference.
Pearson, Upper Saddle River, NJ, seventh edition.

Huberman, G. and Wang, Z. (2008). Arbitrage Pricing Theory. In
Durlauf, S. N. and Blume, L. E., editors, The New Palgrave Dictionary
of Economics, pages 197–205. Palgrave Macmillan, Basingstoke, UK.

Hult, H., Lindskog, F., Hammarlid, O., and Rehn, C. J. (2012). Risk and
Portfolio Analysis: Principles and Methods. Springer, New York, NY.

Jobson, J. D. and Korkie, B. (1980). Estimation for Markowitz Efficient
Portfolios. Journal of the American Statistical Association, 75:544–554.

Jobson, J. D. and Korkie, B. (1981). Putting Markowitz Theory to Work.
Journal of Portfolio Management, 7:70–74.

Johnson, R. A. and Wichern, D. W. (2007). Applied Multivariate Statistical
Analysis. Pearson, Upper Saddle River, NJ, sixth edition.

Jorion, P. (1986). Bayes-Stein Estimation for Portfolio Analysis. Journal of
Financial and Quantitative Analysis, 21:279–292.

Kane, A., Kim, T., and White, H. (2012). Active Portfolio Management: The
Power of the Treynor-Black Model. In Kyrtsou, C. and Vorlow, C., editors,
Progress in Financial Markets Research, pages 311–332. Nova Publishers,
New York, NY.

Larson, R. and Edwards, B. H. (2014). Calculus. Brooks Cole, Boston, MA,
tenth edition.

Ledoit, O. and Wolf, M. (2004). Honey, I Shrunk the Sample Covariance
Matrix. Journal of Portfolio Management, 30:110–119.

Lintner, J. (1952). The Valuation of Risk Assets and the Selection of Risky
Investments in Stock Portfolios and Capital Budget. Review of Economics
& Statistics, 47:13–37.

Lo, A. W. (1991). Long-Term Memory in Stock Market Prices. Econometrica,
59:1279–1313.

Lo, A. W. (1997). A Nonrandom Walk Down Wall Street: Recent Advances
in Financial Technology. TIAA-CREF Research Dialogues, 52:1–7.

Lo, A. W. and MacKinlay, A. C. (2002). A Non-Random Walk Down Wall
Street. Princeton University Press, Princeton, NJ.

Lu, T.-T. and Shiou, S.-H. (2002). Inverses of 2 × 2 Block Matrices. Computers
& Mathematics with Applications, 43:119–129.

Malkiel, B. G. (1973). A Random Walk Down Wall Street. W. W. Norton &
Company, New York, NY.

Malkiel, B. G. (2003). A Random Walk Down Wall Street: The Time-Tested
Strategy for Successful Investing. W. W. Norton, New York, NY.

Markowitz, H. (1952). Portfolio Selection. Journal of Finance, 7:77–91.

Markowitz, H. M. (1987). Mean-Variance Analysis in Portfolio Choice and
Capital Markets. Wiley, New York, NY.

Martin, R. D. and Simin, T. T. (2003). Outlier-Resistant Estimates of Beta.
Financial Analysts Journal, 59:56–69.

Merton, R. C. (1972). An Analytic Derivation of the Efficient Portfolio
Frontier. Journal of Financial and Quantitative Analysis, 7:1851–1872.

Michaud, R. O. (1989). The Markowitz Optimization Enigma: Is ‘Optimized’
Optimal? Financial Analysts Journal, 45:31–42.

Miller, M. B. (2012). Mathematics and Statistics for Financial Risk
Management. Wiley, Hoboken, NJ.

Modigliani, F. and Pogue, G. A. (1974). An Introduction to Risk and Return:
Concepts and Evidence. Financial Analysts Journal, 30:68–80.

Montgomery, D. C., Jennings, C. L., and Kulahci, M. (2008). Introduction to
Time Series Analysis and Forecasting. Wiley, Hoboken, NJ.

Mossin, J. (1966). Equilibrium in a Capital Asset Market. Econometrica,
34:768–783.

Newbold, P., Carlson, W. L., and Thorne, B. M. (2013). Statistics for Business
and Economics. Pearson, Upper Saddle River, NJ, eighth edition.

Pappas, S. N. and Dickson, J. M. (2015). Factor-Based Investing. Technical
report. Vanguard Research, Springfield, VA.

Praetz, P. D. (1972). The Distribution of Share Price Changes. Journal of
Business, 45:49–55.

Qian, E. E., Hua, R. H., and Sorensen, E. H. (2007). Quantitative Equity
Portfolio Management: Modern Techniques and Applications. CRC Press,
Boca Raton, FL.

Reilly, F. K. and Brown, K. C. (2009). Investment Analysis and Portfolio
Management. South-Western, Mason, OH, ninth edition.

Rice, J. A. (2007). Mathematical Statistics and Data Analysis. Brooks/Cole,
Belmont, CA, third edition.

Roll, R. (1977). A Critique of the Asset Pricing Theory’s Tests: Part I: On Past
and Potential Testability of the Theory. Journal of Financial Economics,
4:129–176.

Ross, S. (2006). A First Course in Probability. Pearson, Upper Saddle River,
NJ, seventh edition.

Ross, S. A. (1977). The Capital Asset Pricing Model (CAPM), Short-Sale
Restrictions and Related Issues. Journal of Finance, 32:177–183.

Ross, S. M. (2013). Simulation. Academic Press, San Diego, CA, fifth edition.

Ruppert, D. (2004). Statistics and Finance: An Introduction. Springer-Verlag,
New York, NY.

Samuelson, P. (1965). Proof that Properly Anticipated Prices Fluctuate
Randomly. Industrial Management Review, 6:41–49.

Samuelson, P. (1974). Challenge to Judgement. Journal of Portfolio
Management, 1:17–19.

Sclove, S. L. (2013). A Course on Statistics for Finance. CRC Press, Boca
Raton, FL.

Severini, T. A. (2005). Elements of Distribution Theory. Cambridge University
Press, Cambridge, UK.

Severini, T. A. (2016). A Nonparametric Approach to Measuring the
Sensitivity of an Asset’s Return to the Market. Annals of Finance, 12:179–199.

Shanken, J. and Zhou, G. (2007). Estimation and Testing Beta Pricing
Models: Alternative Methods and their Performance in Simulations. Journal
of Financial Economics, 84:40–86.

Sharpe, W. F. (1963). A Simplified Model for Portfolio Analysis. Management
Science, 9:277–293.

Sharpe, W. F. (1964). Capital Asset Prices: A Theory of Market Equilibrium
under Conditions of Risk. Journal of Finance, 19:425–442.

Sharpe, W. F. (1991). The Arithmetic of Active Management. Financial
Analysts Journal, 47:7–9.

Shumway, T. (2000). Course Notes for Bus Admin 855.
http://www-personal.umich.edu/~shumway/courses.dir/ba855.dir/Notes1.pdf.

Stewart, J. (2015). Calculus. Brooks Cole, Boston, MA, eighth edition.

Tamhane, A. C. and Dunlop, D. D. (2000). Statistics and Data Analysis: From
Elementary to Intermediate. Prentice-Hall, Upper Saddle River, NJ.

Touloumis, A. (2015). Nonparametric Stein-Type Shrinkage Covariance
Matrix Estimators in High-Dimensional Settings. Computational Statistics
and Data Analysis, 83:251–261.

Trapletti, A. and Hornik, K. (2016). tseries: Time Series Analysis and
Computational Finance. R package version 0.10-35.

Treynor, J. L. and Black, F. (1973). How to Use Security Analysis to Improve
Portfolio Selection. Journal of Business, 46:66–86.

Turlach, B. A. and Weingessel, A. (2013). quadprog: Functions to Solve
Quadratic Programming Problems. R package version 1.5-5.

Vasicek, O. A. (1973). A Note on Using Cross-Sectional Information in
Bayesian Estimation of Betas. Journal of Finance, 28:1233–1239.

Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S.
Springer, New York, fourth edition.

Warnes, G. R., Bolker, B., and Lumley, T. (2015). gtools: Various R
Programming Tools. R package version 3.5.0.

Wei, W. W. S. (2006). Time Series Analysis: Univariate and Multivariate
Methods. Pearson, Boston, MA, second edition.

Wooldridge, J. M. (2013). Introductory Econometrics: A Modern Approach.
South-Western, Mason, OH, fifth edition.

Index

A returns, negatively
Active portfolio management, correlated, 72
Treynor–Black method and, risk-free, 81–84
292–307 risky, 84, 85, 90
adding single asset to market volatility of, 20
portfolio, 293–298 Assets, N (portfolios of), 95–102
benchmark portfolio, 292 correlation matrix, 100
estimator bias 304–306 diversiﬁcation, 101–102
numerical computation of eigenvalues, 100
portfolio weights, 306–307 inner product, 97
portfolio of N assets together matrix notation, 96–101
with market portfolio, nonnegative deﬁnite matrix, 99
298–302 random vector, 96
properties of Treynor–Black Autocorrelation function, 16
portfolio, 302–304 Autocovariance function, 16
Adjusted beta, 243–244
Adjusted prices, 9–14 B
Adjusted R-squared, 320 Benchmark portfolio, 292
Appraisal ratio, 258–259, 295 Best linear predictor, 51
Arbitrage pricing theory (APT), Beta, adjusted, 243–244
328–333 Bias
arbitrage portfolio, 329 corrected estimate, 264
asymptotic arbitrage, omitted-variable, 335
332–333 “Bloomberg adjusted beta,” 243
factor premiums and, 334 Bonferroni inequality, 234
no-arbitrage assumption, 330 Bonferroni method, 235
no-asymptotic arbitrage Bootstrap method, 261, 304
assumption, 332 Box–Ljung test, 55
Assets
appraisal ratio of, 295 C
correctly priced, 232–237 Capital asset pricing model
excess return of, 83 (CAPM), 2, 197–220
investable weight factor of, 223 applying the CAPM to a
market capitalization of, 222 portfolio, 206–208
mispriced, 208–211 capital market line, 199
prices, random walk models for, CAPM without risk-free asset,
48–54 211–214

363

T&F Cat #K31368 — K31368 A001— page 363 — 6/14/2017 — 22:05

364 Index

describing the expected returns nonnegative deﬁnite matrix, 99

on a set of assets, 215–217 opportunity set, 103
efficiency of market portfolio, portfolio constraints, 133–139
role of, 210–211 portfolios of N assets, 95–102
implications of, 202–206 quadratic programming
linear regression analysis, problem, 110
relationship to, 201–202 random vector, 96
market capitalization, 209 risk-aversion criterion, 121–128
market portfolio, 197 Sharpe ratio, 137–139
mispriced assets, 208–211 tangency portfolio, 129–132
relationship between risk and two-fund theorem, 110
reward, 205–206 variance matrix, 101
security market line, 198–202 zero-investment portfolios, 106
tangency portfolio, 198
Eigenvalues, 100
zero-beta portfolio, 212, 214
Empirical Bayes estimation, 189
Cap-weighted indices, 221–224
Estimation, 145–195
Cauchy–Schwarz inequality, 105
basic sample statistics, 145–151
Central limit theorem (CLT),
central limit theorem, 149
149, 178
Conditional expectation, 41–45 data matrix, 152–155
Correlation matrix, 100 decay parameter, 157
Covariance function, 15 effective sampling size, 159
exponentially weighted moving
D average estimator, 157
Data matrix, 152–155 mean vector and covariance
Decay parameter, 157 matrix, 151–157
Diversification, 101–102 observation period, 145
Dividends, 5, 8–9 parametric bootstrap, 190
E plug-in estimator, 171
Effective sampling size, 159 portfolio weights, estimation of,
Efficient frontier, 105, 77, 118 171–174
Efficient market hypothesis, 2 return means and standard
Efficient portfolio theory, 76, 95–144 deviations, estimation of,
affine combinations, 109 146–148
Cauchy–Schwarz inequality, 105 sample covariance matrix,
correlation matrix, 100 properties of, 156–157
diversification, 101–102 sample covariances and
efficient frontier, 105, 118–120 correlations, 148–149
eigenvalues, 100 sampling horizon, 145
holding constraints, 136–137 shrinkage estimators, 163–171
inner product, 97 standard error, 150
matrix notation, 96–101 statistical properties of
minimum-risk frontier, 103–113 estimators, 149–151
minimum-variance portfolio, target matrix, 168
113–117 trace of a matrix, 156

T&F Cat #K31368 — K31368 A001— page 364 — 6/14/2017 — 22:05

Index 365

using Monte Carlo simulation to risk premium, 334

study the properties of role of arbitrage pricing theory,
estimators, 174–189 334–335
weighted estimators, 157–163 rolling regressions, 339–343
Excess return of assets, 83 “small minus big,” 325
Exponentially weighted moving two-stage least-squares
average (EWMA) estimation, 335–337
estimator, 157 using factor sensitivities to
Extractor functions, 231 describe a portfolio,
345–346
F
value stock, 326
Factor models, 3, 311–353
False discovery rate (FDR),
adjusted R-squared, 320
235–237
applications of, 343–349
Fama–French three-factor
arbitrage portfolio, 329
model, 326
arbitrage pricing theory,
Federal Reserve Economic Data
328–333
(FRED), 3
asymptotic arbitrage, 332–333
Financial engineering, 1
common factors, 316
Float-adjusted index, 223
economic factors, 321
Freedman–Diaconis rule, 32
estimation, 318–321
Fundamental analysis, 1
factor premiums, 333–343
factors, 321–328
G
factor sensitivities, 316
Geometric random walk, 52
Fama–French three-factor
Gross return, 6
model, 326
fundamental factors, 321, H
325–328 “High minus low” (HML), 326
“high minus low,” 326 Holding constraints, 136–137
imposing factor-sensitivity
constraints, 346–349 I
limitations of single-index Inner product, 97
model, 311–315 Inter-quartile range (IQR) of
model and its estimation, data, 32
315–321 Iterated conditional expectations, 46
no-arbitrage assumption, 330
no-asymptotic arbitrage J
assumption, 332 Jensen’s alpha, 257–258
obtaining standard errors of
premium estimates, K
337–339 K -period return, 6
omitted-variable bias, 335
portfolios, 317–318 L
principal components Linear regression analysis, 201–202
analysis, 350 Log-returns, 7–8

T&F Cat #K31368 — K31368 A001— page 365 — 6/14/2017 — 22:05

366 Index

M stock screening and multiple

Market capitalization, 209 testing, 233–235
Market model, 2, 221–272 time-dependent portfolio
adjusted beta, 243–244 weights, 246–247
appraisal ratio, 258–259 Treynor ratio, 254–257
bias-corrected estimate, 264 Market portfolio, 3, 197
“Bloomberg adjusted beta,” 243 Markowitz portfolio theory, 2, 91
Bonferroni inequality, 234 Martingale model, 46–48
Bonferroni method, 235 Mean function, 14
bootstrap procedure, 261 Mean squared error (MSE), 188
cap-weighted indices, 221–224 Mean-variance analysis, 1
comparison of portfolios, Minimum-risk frontier, 103–113
266–268 affine combinations, 109
correctly priced asset calculating the weight vector of
(hypothesis testing), a portfolio on, 110–113
232–237 Cauchy–Schwarz inequality, 105
decomposition of risk, 237–239 characterization, 106–109
diversification and, 247–254 efficient frontier, 105
estimation, 228–232 opportunity set, 103
extractor functions, 231 portfolios constructed from
false discovery rate, 235–237 portfolios on, 109–110
float-adjusted index, 223 quadratic programming
interpretation of βi , 228 problem, 110
investable weight factor of the zero-investment portfolios, 106
asset, 223 Minimum-variance portfolio, 110, 78,
Jensen’s alpha, 257–258 113–117
market capitalization, 222 Mispriced assets, 208–211
market indices, 221–226 Model formula, 230
model and its estimation, Modern portfolio theory, 1
226–232 Monte Carlo simulation, properties
model formula, 230 of estimators studied using,
portfolio performance, 174–189
measurement of, 254–259 comparison of estimators,
portfolio risk, 252–254 186–189
portfolios, application to, description of sampling
244–247 distribution of a statistic,
portfolios of several assets, 182–186
249–252 simulating a return vector,
price-weighted indices, 225–226 178–182
relationship to CAPM, 227–228 MSE, see Mean squared error
residual returns, 226
shrinkage estimation, 239–244 N
standard errors of estimated N assets, portfolios of, 95–102
performance measures, correlation matrix, 100
259–268 diversification, 101–102

T&F Cat #K31368 — K31368 A001— page 366 — 6/14/2017 — 22:05

Index 367

    eigenvalues, 100
    inner product, 97
    matrix notation, 96–101
    nonnegative definite matrix, 99
    random vector, 96
Net return, 5; see also Returns
No-arbitrage assumption, 330
Nonnegative definite matrix, 99
Normal probability plot, 33

O
Observation period, 145
Omitted-variable bias, 335
Opportunity set, 75, 103

P
Parametric bootstrap, 190
Passive investing, 292
Plug-in estimator, 171
Portfolio constraints, 133–139
    holding constraints, 136–137
    Sharpe ratio, 137–139
Portfolios, 1, 69–94; see also Efficient portfolio theory
    arbitrage, 329
    basic concepts, 69–72
    benchmark, 292
    comparison of, 266–268
    diversification, 71–72
    efficient frontier, 77
    efficient portfolios, 76–78
    excess return of the asset, 83
    “forced buy-in,” 74
    “margin call,” 74
    market, 197
    Markowitz portfolio theory, 91
    minimum-variance portfolio, 78–79
    negative portfolio weights (short sales), 73–74
    opportunity set, 75
    optimal portfolios of two assets, 74–81
    performance, measurement of, 254–259
    portfolio selection problem, 69
    portfolios of two risky assets and a risk-free asset, 84–91
    portfolio theory, 69
    portfolio weights, 69
    risk-aversion criterion, 79–81
    risk-free assets, 81–84
    Sharpe ratio, 87–88
    tangency, 88–91
    weights, estimation of, 171–174
    zero-beta, 212, 214
Portfolios of N assets, 95–102
    correlation matrix, 100
    diversification, 101–102
    eigenvalues, 100
    inner product, 97
    matrix notation, 96–101
    nonnegative definite matrix, 99
    random vector, 96
    variance matrix, 101
Prices, adjusted, 9–14
Price-weighted indices, 225–226
Principal components analysis, 350

Q
Quadratic programming, 110, 127–128
Quantile–quantile plot (Q–Q plot), 33
Quantitative finance, 1

R
Random matrix theory, 189
Random vector, 96
Random walk hypothesis, 2, 41–67
    application of random walk models to asset prices, 52–54
    asset prices, random walk models for, 48–54
    best linear predictor, 51
    Box–Ljung test, 55
    conditional expectation, 41–45
    definitions of random walk, 49–52

    drift, 50
    efficient markets, 45–48
    geometric random walk, 52
    increments of the process, 49
    iterated conditional expectations, 46
    martingale model, 46–48
    rescaled range test, 59–61
    runs test, 58–59
    sample autocorrelation function, test based on, 55
    stock returns, 61–63
    tests of, 54–61
    variance-ratio test, 56–58
    volatility, 50
Rescaled range test, 59–61
Residual returns, 226
Returns, 5–40
    adjusted prices, 9–14
    analyzing return data, 20–37
    application to asset returns, 20
    autocorrelation function, 16
    autocovariance function, 16
    basic concepts, 5–9
    covariance function, 15
    dividends, 5, 8–9
    Freedman–Diaconis rule, 32
    gross return, 6
    k-period return, 6
    log-returns, 7–8
    mean function, 14
    monthly returns, 24–26
    net return, 5
    normal probability plot, 33
    quantile–quantile plot, 33
    return interval, 21
    revenue, 6
    running means and standard deviations, 26–29
    sample autocorrelation function, 29–32
    sampling frequency of data, 21
    second-order properties, 16
    shape of return distribution, 32–37
    stationarity, 15
    statistical properties of, 14–20
    stochastic process, 14
    Sturges’ rule, 32
    time series, 14
    variance function, 14
    volatility of the asset, 20
    weak stationarity, 15–19
    weak white noise, 19–20
Revenue, 6
Risk-aversion criterion, 79–81, 121–128
    finding w_λ using quadratic programming, 127–128
    properties of risk-averse portfolios, 124–127
Risk-free assets, 81–84
Risk premium, 334
Rolling regressions, 340
Root mean squared error (RMSE), 188
R-squared, adjusted, 320
Running means, 27
Runs test, 58–59

S
Sample
    autocorrelation function, test based on, 55
    covariance matrix, properties of, 156–157
    covariances and correlations, 148–149
    statistics, 145–151
Sampling
    distribution of a statistic, 182–186
    frequency of data, 21
    horizon, 145
    size, effective, 159
Second-order properties (returns), 16
Sector funds, 309
Security market line (SML), 198–202
    capital market line, 199

    linear regression analysis, relationship to, 201–202
Sharpe ratio, 87–88, 129, 137
Shrinkage estimators, 163–171
Single-index model, 273–310
    adding single asset to market portfolio, 293–298
    applications to portfolio analysis, 286–291
    benchmark portfolio, 292
    correlation of asset returns under, 276–278
    covariance structure of returns under, 275–281
    estimation, 281–286
    estimator bias, 304–306
    limitations of, 311–315
    matrix inverses, preliminary results on, 288–289
    model, 273–275
    numerical computation of portfolio weights, 306–307
    partial correlation, 278–281
    passive investing, 292
    portfolio of N assets together with market portfolio, 298–302
    sector funds, 309
    Treynor–Black method, active portfolio management and, 292–307
    weight vector of tangency portfolio under, 290–291
“Small minus big” (SMB), 325
SML, see Security market line
Stationarity, 15
Statistical methods for financial models, introduction to, 1–4
    capital asset pricing model, 2
    data analysis and computing, 3–4
    efficient market hypothesis, 2
    factor model, 3
    financial engineering, 1
    fundamental analysis, 1
    market model, 2
    market portfolio, 3
    Markowitz portfolio theory, 2
    mean-variance analysis, 1
    modern portfolio theory, 1
    portfolio, 1
    quantitative finance, 1
    random walk hypothesis, 2
Stochastic process, 14
Stock returns, random walk model and, 61–63
Sturges’ rule, 32

T
Tangency portfolio, 88, 129–132
Target matrix, 168
Time series, 14
Treynor–Black method, active portfolio management and, 292–307
    adding single asset to market portfolio, 293–298
    benchmark portfolio, 292
    estimator bias, 304–306
    numerical computation of portfolio weights, 306–307
    portfolio of N assets together with market portfolio, 298–302
    properties of Treynor–Black portfolio, 302–304
Treynor ratio, 254–257
Two-fund theorem, 110

U
U.S. Treasury Bill, 82

V
Value stock, 326
Variance function, 14
Variance matrix, 101
Variance-ratio test, 56–58
Volatility of assets, 20

W
Weak white noise, 19–20
Weighted estimators, 157–163
    decay parameter, 157
    effective sampling size, 159
    exponentially weighted moving average estimator, 157
    of mean vector and covariance matrix, 160–163

Z
Zero-beta portfolio, 212, 214
Zero-investment portfolios, 106
