New Methods for the Cross-Section of Returns
G. Andrew Karolyi
Cornell University SC Johnson College of Business

Stijn Van Nieuwerburgh


Columbia University Graduate School of Business

Downloaded from https://academic.oup.com/rfs/article/33/5/1879/5758275 by guest on 11 May 2021


The cross-section and time series of stock returns contain a wealth of information
about the stochastic discount factor (SDF), the object that links cash flows to prices.
A large empirical literature has uncovered many candidate factors—many more than
seem plausible—to summarize the SDF. This special volume of the Review of Financial
Studies presents recent advances in extracting information from both the cross-section
and the time series, in dealing with issues of replication and false discoveries, and
in applying innovative machine-learning techniques to identify the most relevant asset
pricing factors. Our editorial summarizes what we learn and offers a few suggestions
to guide future work in this exciting new era of big data and empirical asset pricing.
(JEL G10, G12, G17)

1. Background
In the wake of the well-documented empirical failures of the static capital asset
pricing model (CAPM) and the consumption-based asset pricing model, research
in asset pricing over the past forty years has made great strides in documenting
the properties of the stochastic discount factor (SDF). The SDF is the object
that relates today’s price of a risky asset to its future cash flows. Most of this
literature has been focused on equity returns and disproportionately focused on
U.S. markets.
Starting with the pioneering work of Shiller (1981), Campbell and Shiller
(1988), and Fama and French (1988), we, as scholars in finance, have come
to understand that aggregate stock market returns are predictable over time.
Indeed, the Review published a special issue on stock return predictability

This editorial is written for a special issue of the Review of Financial Studies focused on new methods in the cross-
section of stock returns. The authors thank Tarun Chordia, Ken French, Itay Goldstein, Valentin Haddad, Bryan
Kelly, Ralph Koijen, Lira Mota, Alessio Saretto, Raman Uppal, Michael Weber, and Lu Zhang for comments.
Send correspondence to Van Nieuwerburgh, Columbia University Graduate School of Business, 3022 Broadway,
New York, NY 10027; (212) 854-2289. Email: [email protected].

The Review of Financial Studies 33 (2020) 1879–1890


© The Author(s) 2020. Published by Oxford University Press on behalf of The Society for Financial Studies.
All rights reserved. For permissions, please e-mail: [email protected].
doi:10.1093/rfs/hhaa019 Advance Access publication February 26, 2020

[17:41 2/4/2020 RFS-OP-REVF200019.tex] Page: 1879 1879–1890


The Review of Financial Studies / v 33 n 5 2020

over a decade ago (Spiegel, 2008). Koijen and van Nieuwerburgh (2011) and
Cochrane (2011) provide more recent updates.
From the pioneering work of Banz (1981), Basu (1983), Rosenberg, Reid, and
Lanstein (1985), Fama and French (1992), Carhart (1997), and beyond, we have
also learned that cross-sectional variation in average stock returns is not well
described by heterogeneous exposure to a single economically motivated factor,
be it the aggregate market factor or aggregate consumption growth (Hansen and
Singleton, 1983, Mehra and Prescott, 1985). A rich body of work has sorted
stocks on firm-level characteristics into portfolios, and calculated returns on
investment strategies that go long stocks at one end of the spectrum of a given
characteristic and short stocks at the opposite end of that spectrum. These long-
short portfolio returns capture new factors in stock returns associated with return
differences that are not accounted for by exposure to the market factor; they
generate CAPM alphas. Size, value, and momentum were the most prominent
examples of such return anomalies early on. Investment and profitability factors
have come to the fore more recently (Novy-Marx, 2013, Hou, Xue, and Zhang,
2015, Fama and French, 2015). In the past twenty years, hundreds of additional
portfolio sorts have been discovered that not only have positive CAPM alphas
(Harvey, Liu, and Zhu, 2016) but also have returns that cannot be explained by
the state-of-the-art four-factor (Carhart, 1997, Hou, Xue, and Zhang, 2015) or
five-factor (Fama and French, 2015) models.
The time-series and cross-sectional strands of the literature are not
independent. In most economic models, fluctuations in the investment
opportunity set are driven by persistent state variables, many of which are
linked to capital market or macroeconomic time series, which serve, in turn,
as predictors of returns (French, Schwert, and Stambaugh, 1987, Ferson and
Harvey, 1991). Their innovations are priced risk factors in the cross-section.
While the cross-sectional factors themselves may display predictability, just
like the aggregate market return, the predictors of the cross-sectional factors
may be different from those of the aggregate market return.
In short, there is a wealth of information in both the cross-section and the
time series of stock returns about the properties of the SDF.
What we now face is a bewildering “zoo” of new asset pricing factors. Not
only are there hundreds of new anomalies, but there are also thousands more new
candidate factors once we consider combinations with conditioning variables,
nonlinear functions of factors, and (linear and nonlinear) combinations of
multiple factors. This abundance of cross-sectional predictors leads Cochrane
(2011) to ask: “Which characteristics really provide independent information
about average returns? Which are subsumed by others?” Can they really all
be part of the true SDF? There must be some considerations that stem from
data dimensionality constraints and that lead us, as scholars, to false positives
among some of these potential characteristics as signals.
One important branch of the empirical literature has argued that some,
perhaps many, anomalies may not be robust features of the data. They may not
hold up outside the estimation sample, or they may be sensitive to empirical
design choices, even within sample. Researchers may have “mined” different
factors and explored empirical specifications before reporting the one finding
that came back with a t-statistic above 1.96 (Harvey, Liu, and Zhu, 2016,
Harvey, 2017). Indeed, McLean and Pontiff (2015) find that post publication,
the anomaly return spread decreases by two-thirds of its pre-publication
magnitude. While this suggests that a large number of return anomalies are
data-mined, a nontrivial fraction persists. Three of the papers in this special
issue contribute to this line of inquiry.
Another branch has tried to use holdings data—for example, on investor
flows—to ask which asset pricing model (and which factors in particular)
investors seem to consider when they make portfolio choice decisions (Berk
and van Binsbergen, 2016, Barber, Huang, and Odean, 2016, Jegadeesh and
Mangipudi, Forthcoming). Koijen and Yogo (2019) show that household and
institutional portfolios are not well explained by the characteristics of leading
factor models. The residual, which can be interpreted as a latent demand,
explains the majority of price variation.
A third line of inquiry has uncovered a lot of commonality among the
anomaly factors, so there are few (combinations of) factors that summarize
most of the information. In 2018, the editorial team at the Review took an
active interest in an emerging and still very active literature that uses tools
imported from machine learning (ML) in artificial intelligence and computer
science. In contrast with existing approaches in empirical finance and financial
econometrics, ML tools emphasize (i) prediction over inference and testing,
(ii) model selection from large model sets rather than preselection of a model
based on a hypothesis, (iii) regularization, or choosing a level of model
complexity (parameter richness) that minimizes out-of-sample error rather than
starting from a simple model, and (iv) flexible functional form rather than
imposing simple functional forms. While ML algorithms allow for complex,
nonlinear models that can predict the data well, they also risk overfitting the
data, which is why the regularization element is so important.
When faced with a limited time series of (stock) returns, a large number
of potential predictors, and linear or nonlinear combinations of predictors,
ordinary least squares becomes inefficient or even infeasible. ML techniques
designed to get around this issue become a necessity. Since (stock) return
predictability is a low signal-to-noise ratio problem, bringing more time-series
and cross-sectional information to bear on the problem in a disciplined way
should hold great promise in improving our measurement of expected returns
and the SDF.
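
To illustrate why ordinary least squares breaks down in this setting, the following minimal sketch simulates a short return history with more predictors than observations. It is not taken from any paper in this volume: the dimensions, the simulated data, and the ridge penalty `lam` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T, P = 60, 200                      # hypothetical: 60 months, 200 predictors
X = rng.standard_normal((T, P))
beta = np.zeros(P)
beta[:5] = 0.1                      # only a few predictors carry signal
y = X @ beta + rng.standard_normal(T)

# With P > T the Gram matrix X'X is rank deficient, so the OLS normal
# equations do not have a unique solution.
gram = X.T @ X
rank = np.linalg.matrix_rank(gram)

# Ridge adds lam * I, which makes the system invertible for any lam > 0.
lam = 10.0                          # assumed penalty; tuned on a validation sample in practice
ridge = np.linalg.solve(gram + lam * np.eye(P), X.T @ y)

print(rank, P, ridge.shape)
```

The rank of the Gram matrix cannot exceed the number of observations, which is why a penalty (or some other form of regularization) is needed before the coefficients are even defined.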
In the context of empirical asset pricing, ML algorithms search for the
(nonlinear) combination of conditioning variables and cross-sectional factors
that best describes the returns in a panel of individual stocks or portfolios
of stocks. While some models, such as neural networks, potentially result in
“dimension expansion,” a hope with ML algorithms is that a modest number
of factors is shown to be sufficient to accurately summarize the information in
the cross-section and time series of stock returns. In other words, a substantial
“dimension reduction” of the factor zoo ensues.
In what follows, we describe the contributions in this volume in some detail
in an effort to guide the reader through this exciting new area of exploration
and to inspire more.

2. Dual Submission Conference


The Review of Financial Studies co-organized a dual submission conference
with the University of Chicago Booth School of Business on September 28,
2018. The conference was supported by the Fama-Miller Center for Research
in Finance, EDHEC Business School, and the Society for Financial Studies.
A call for paper submissions was distributed widely in early 2018, with a
submission deadline on June 15, 2018. We received 136 submissions, of which
88 (65%) were dually submitted to the Review of Financial Studies. The large
dual-submission volume confirmed to us, as the Review’s sponsoring editors,
the vibrant new research activity that was taking place on this topic in the asset
pricing community.
The program chairs for the conference were EDHEC’s Raman Uppal
and Booth’s Michael Weber. They assembled a seventeen-person program
committee of renowned scholars in empirical asset pricing. After prescreening
for thematic relevance and quality, 43 papers were sent out for review, including
26 dually submitted papers. The conference program contained nine papers, of
which seven were dually submitted.
Based on program reviews and our own reading, we, as sponsoring editors,
invited these seven papers as well as several more for formal submission to the
Review for what we believed could be an impactful special volume. All went
through the regular review process. In the end, five of the 88 dually submitted
papers found their way into this special issue. We added four more papers that
were submitted as part of the regular paper submission flow for the Review
because of their thematic relevance and fit. Those four include a paper by one
of the conference organizers, Raman Uppal, which was already under review
prior to the conference and not included in the conference. The paper by the
other conference organizer, Michael Weber, was included in the conference but
was already at the revise-and-resubmit stage before the conference.

3. What Is Included in This Special Issue?


3.1 Extracting information from both the cross-section and time series
Fama and French (2020) lead off the volume by acknowledging how the
history of scholarship in the development of asset-pricing methods has led
us to an interesting fork in the road on how we harvest the information in
the cross-section and time series of returns to perform testing. Factors that
have been developed in time-series asset pricing models (typically, long-short
portfolios related to size, value, profitability, investment, or momentum) were
all motivated by evidence from cross-sectional regressions inspired by methods
in Fama and MacBeth (1973). Fama and French refine the approach by noting
that the slope coefficients estimated in these cross-sectional regressions are, in
fact, portfolio returns that can be interpreted as factors. The authors propose
to extract information from the cross-sectional regression approach directly
in order to construct cross-sectional factors that correspond to the time-series
factors that were built directly from the characteristics. It turns out that such
cross-sectional factors capture the rich variation in returns in time-series tests
better than those that use traditional characteristics-based time-series factors.
The findings are particularly intriguing with respect to when and how these
time-series-based tests are implemented with time-varying loadings, and to the
form that these loadings can take.
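
The observation that cross-sectional regression slopes are portfolio returns can be checked numerically. The sketch below uses simulated data for a single month and one hypothetical characteristic; it illustrates the algebra only and is not the authors' empirical procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500                                   # hypothetical number of stocks
char = rng.standard_normal(N)             # one standardized characteristic
ret = 0.01 * char + 0.05 * rng.standard_normal(N)   # simulated one-month returns

# Cross-sectional regression for a single month: ret = a + b * char + e
X = np.column_stack([np.ones(N), char])
coef = np.linalg.solve(X.T @ X, X.T @ ret)

# The same slope written as a portfolio: row 1 of (X'X)^{-1} X' is a weight vector.
W = np.linalg.solve(X.T @ X, X.T)         # shape (2, N)
w_slope = W[1]
slope_as_portfolio_return = w_slope @ ret

print(coef[1], slope_as_portfolio_return, w_slope.sum(), w_slope @ char)
```

Because (X'X)^{-1} X' X is the identity matrix, the slope weights sum to zero (a zero-investment portfolio) and have unit exposure to the characteristic.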
Another paper in this volume by Daniel, Mota, Rottke, and Santos (2020)
picks up on the same theme as Fama and French by questioning the standard
procedures that asset pricing scholars employ to form characteristics-based
time-series factors for testing. In fact, the first step in the paper is to show how
the standard procedure will not produce a mean-variance efficient portfolio.
This arises because the characteristic picks up not only the variation in the
loading with respect to the priced risk factor, but also that in the loadings with
respect to unpriced sources of common variation in returns. The Sharpe ratio of
the characteristics-based time-series factor is necessarily lower than the Sharpe
ratio of the projection of the risk factor on the variation in returns, leaving
it no longer mean-variance efficient. What the paper then proceeds to do is
propose a means by which to purge the unpriced sources of common variation
in returns. How they proceed to purge them is where creative elements arise.
If characteristics are a good proxy for expected returns, then a characteristic-
balanced portfolio will have a zero expected return, and thus a zero loading on
priced factors. Variation in the loadings across these characteristics-balanced
portfolios must then be capturing variation in the loadings on unpriced factors.
Hedge portfolios are long-short portfolios with zero expected returns but highly
correlated with the original characteristics-based factors. The authors show
how an optimal combination of the original characteristics-based factors and
the hedge portfolios delivers more efficient factors. Daniel, Mota, Rottke, and
Santos are agnostic about what are possible sources of unpriced common
variation, but they propose industry as one candidate. When they hedge out
the industry component of, say, the five Fama-French characteristics-based
portfolios (Fama and French, 2015), the (squared) Sharpe ratio is nearly 75%
higher.
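
The mechanism can be illustrated with a small simulation. In the sketch below, a factor is the sum of a priced component and an unpriced common component, and a noisy zero-mean hedge portfolio tracks the unpriced part. The data-generating process, the parameter values, and the simple minimum-variance hedge ratio are assumptions made here for illustration; they are not the authors' construction.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 100_000
priced = 0.5 + rng.standard_normal(T)             # priced component, positive mean
unpriced = 2.0 * rng.standard_normal(T)           # unpriced common variation, zero mean
factor = priced + unpriced                        # characteristics-based factor
hedge = unpriced + 0.5 * rng.standard_normal(T)   # noisy hedge portfolio, zero mean

def sharpe(x):
    """Sample Sharpe ratio of a series of excess returns."""
    return x.mean() / x.std()

# Minimum-variance hedge ratio: regression slope of the factor on the hedge portfolio.
h = np.cov(factor, hedge)[0, 1] / hedge.var(ddof=1)
hedged = factor - h * hedge

sr_raw, sr_hedged = sharpe(factor), sharpe(hedged)
print(sr_raw, sr_hedged)
```

Removing variance that carries no expected return leaves the mean roughly unchanged while shrinking the denominator of the Sharpe ratio, which is the source of the efficiency gain.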
Haddad, Kozak, and Santosh (2020) focus on time-series predictability of
cross-sectional return factors. They show that to obtain a plausible SDF—which
does not imply too high Sharpe ratios—one can focus on predicting the largest
principal components (PCs) of returns, and they find that several of the largest
PCs of fifty anomaly portfolio returns are predictable. Each market-neutral
anomaly return is predicted by its own lagged book-to-market ratio; two out
of the largest five PCs display strong predictability. This finding has important
implications for the SDF: it is twice as variable as it would be in a model in
which only the aggregate stock market return, but not the cross-sectional return
factors, is predictable. Factor timing is very valuable, raising the Sharpe ratio
beyond that implied by market timing and static factor investing separately.
The cross-sectional factor risk premia are less persistent than the aggregate
market risk premium and covary with different macroeconomic variables than does
the aggregate equity risk premium. The paper offers important, yet straightforward,
implications for the design of new asset pricing theories. Those models need to
feature multiple sources of variation in risk premia and generate quite volatile
SDFs, an echo of early seminal work by Hansen and Jagannathan (1991).
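
The link between high Sharpe ratios and volatile SDFs follows from the Hansen and Jagannathan (1991) bound. As a reminder, here is a short derivation for a single excess return R^e priced by an SDF m; the notation is chosen here for illustration and does not appear in the original text.

```latex
0 = E[m R^e] = E[m]\,E[R^e] + \rho_{m,R^e}\,\sigma(m)\,\sigma(R^e)
\quad\Longrightarrow\quad
\frac{|E[R^e]|}{\sigma(R^e)}
  = |\rho_{m,R^e}|\,\frac{\sigma(m)}{E[m]}
  \le \frac{\sigma(m)}{E[m]}
```

Since the correlation is at most one in absolute value, any strategy that raises the attainable Sharpe ratio, such as factor timing, also raises the lower bound on the volatility of the SDF.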

3.2 Replicating anomalies, multiple hypothesis testing, and transaction costs
Researchers in finance, and in many other disciplines, have long worried about
the possibility that a substantial number of reported discoveries might be
false. Three of the papers in this special volume seek to understand just how
severe the problem is for empirical asset pricing studies of cross-sectional
predictability in stock returns.
Hou, Xue, and Zhang (2020) compile a data library of 452 anomaly variables,
taking stock of the vast cross-sectional literature that precedes this special issue.
They employ a common set of replication procedures, extend the sample until
the end of 2016, and then find that many anomalies fail to survive. One important
reason is that small stocks, which represent only about 3% of aggregate market
capitalization, drive many anomalies. Once returns are value-weighted, 65%
of anomalies disappear. Employing a higher test hurdle to account for multiple
testing, as advocated by Harvey, Liu, and Zhu (2016), further reduces replication
rates to a mere 18%. Similar findings apply even when the original, rather than
the updated, sample is used. One of their conclusions is that theory can help
restore the credibility of the anomalies literature by raising the ratio of ex ante
true relations to false relations tested.
When many hypotheses are tested, some will be rejected at conventional
levels in single-hypothesis tests even if the null is true. Such
false rejections from single-hypothesis tests belie the fact that many researchers
independently carry out tests on a common question and a common data set.
Multiple hypothesis testing is one way in which we can understand just how
severe the false-rejection problem is. This is the goal of the paper in this special
volume by Chordia, Goyal, and Saretto (2020). Of course, theirs is not the
first paper to propose multiple hypothesis testing (MHT) methods (Harvey,
Liu, and Zhu, 2016, Harvey, 2017, among others), and there are many one
can employ. Chordia, Goyal, and Saretto recommend one of the many MHT
methods for finance applications. One complicating aspect of these methods,
whichever one employs, is deciding which set of trading strategies to study in
the first place. Studying only those that are publicly known will leave out a
large number of strategies that have possibly been tried by researchers. By
using the information from the publicly known strategies and a larger set
generated by the combination of the three firm-level variables, Chordia, Goyal,
and Saretto obtain the properties of the set of strategies that the researchers could
have tested. The application of the recommended MHT method results in
higher thresholds for t-statistics of time-series alphas and cross-sectional
Fama-MacBeth regression slopes: 3.8 and 3.4, respectively. The authors explain
that the more-restrictive, dual thresholds they recommend are important in
decreasing the probability of false rejections.
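
To see why thresholds rise with the number of tests, consider the simplest possible adjustment, a Bonferroni correction under a normal approximation. This is a deliberately conservative textbook device and not the method recommended by Chordia, Goyal, and Saretto; the function names below are hypothetical.

```python
from statistics import NormalDist

def bonferroni_t_threshold(n_tests, alpha=0.05):
    """Two-sided critical value when alpha is split equally over n_tests."""
    return NormalDist().inv_cdf(1 - alpha / (2 * n_tests))

def expected_false_rejections(n_tests, t_cut=1.96):
    """Expected number of |t| > t_cut outcomes if every null hypothesis is true."""
    p_tail = 2 * (1 - NormalDist().cdf(t_cut))
    return n_tests * p_tail

for n in (1, 100, 452):   # 452 matches the size of the anomaly library discussed above
    print(n, round(bonferroni_t_threshold(n), 2), round(expected_false_rejections(n), 1))
```

With a single test the threshold is the familiar 1.96. As the number of tests grows, the conventional cutoff admits roughly 5% of true nulls as discoveries, while the adjusted cutoff moves well above it.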
Characterizing the dimension of the cross-section of stock returns by
identifying a small set of characteristics that subsume the rest is already a
hard enough problem. It gets harder. Most papers in this volume, and in the
broader cross-sectional asset pricing literature, ignore transaction costs
(Novy-Marx and Velikov, 2016). Transaction costs matter for the dimension of the
cross-section because they can affect the number of characteristics that are
jointly significant for an investor’s optimal portfolio. An important exception
among the papers in this volume is that of DeMiguel, Martín-Utrera, Nogales,
and Uppal (2020), which offers the valuable observation that the presence of
transaction costs could lead researchers to consider more, rather than fewer,
firm characteristics. After all, combining characteristics allows one to diversify
trading, just as combining them allows one to diversify risk. Trades in the
underlying stocks required to rebalance different characteristics often cancel
out. This paper flips our existing inferences about the effect of transaction costs
on anomaly returns when they are considered in isolation. What these authors
show is that transaction costs provide an economic rationale for considering
a larger number of characteristics than that in prominent asset pricing models
because of the benefits from trading diversification.
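
Trading diversification can be illustrated with a stylized calculation. The sketch below assumes proportional transaction costs and two hypothetical characteristic portfolios whose rebalancing trades are independent draws; neither the cost level nor the trade distribution comes from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 1000                                   # hypothetical number of stocks
trade_a = rng.standard_normal(N)           # rebalancing trades for characteristic A
trade_b = rng.standard_normal(N)           # rebalancing trades for characteristic B
cost_per_unit = 0.001                      # assumed proportional cost (10 basis points)

# Trading each characteristic portfolio separately pays costs on both sets of trades.
cost_separate = cost_per_unit * (np.abs(trade_a).sum() + np.abs(trade_b).sum())

# Trading the combined portfolio nets offsetting trades stock by stock first.
cost_combined = cost_per_unit * np.abs(trade_a + trade_b).sum()

saving = 1 - cost_combined / cost_separate
print(cost_separate, cost_combined, saving)
```

By the triangle inequality the combined cost can never exceed the sum of the separate costs, and it is strictly lower whenever some trades have opposite signs.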

3.3 Machine-learning tools


One of the advantages of ML techniques is that the high-dimensional nature of
the methods enhances their flexibility relative to more traditional econometric
prediction techniques. The flexibility comes from better approximating the
unknown and likely complex data-generating processes underlying returns.
The paper by Gu, Kelly, and Xiu (2020) provides a comparison of the main
machine-learning tools available to finance researchers interested in modeling
the conditional mean function of stock returns. It applies the techniques to a
common big-data set of 30,000 stocks observed over the 1957–2016 period, with
94 cross-sectional characteristics, each of which is interacted with 8 conditioning
variables, and 74 industry dummy variables, for a total of over 900 predictors.
The paper compares: (i) linear models, including ordinary least squares (OLS);
(ii) generalized linear models with penalization, including elastic net, LASSO
(least absolute shrinkage and selection operator), and ridge regression penalty
functions; (iii) dimension reduction techniques such as principal components
regression and partial least squares; (iv) regression trees, including boosted
regression trees and random forests, which allow for nonlinear interactions of
predictors; and (v) neural networks. The paper makes for a great introduction
to a wide variety of ML techniques, each with strengths and weaknesses. All
ML methods provide substantial improvements in out-of-sample predictability
over the traditional approach. Neural network models with a small number of
layers perform the best, but are also the hardest to interpret.
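
The predictor count quoted above can be reproduced with a short sketch. It assumes that the interactions are formed as a Kronecker product of the 94 characteristics with a constant plus the 8 conditioning variables, which is consistent with the stated total; the tiny cross-section and the random data are hypothetical.

```python
import numpy as np

n_chars, n_macro, n_industries = 94, 8, 74   # counts quoted in the text
n_stocks = 5                                  # tiny hypothetical cross-section

rng = np.random.default_rng(4)
chars = rng.standard_normal((n_stocks, n_chars))     # stock-level characteristics
macro = rng.standard_normal(n_macro)                 # conditioning variables at one date
industry = np.eye(n_industries)[rng.integers(0, n_industries, n_stocks)]  # dummies

# Interact every characteristic with a constant and with each conditioning variable.
interactions = np.kron(np.concatenate(([1.0], macro)), chars)

features = np.hstack([interactions, industry])
print(features.shape)   # (5, 920), since 94 * (8 + 1) + 74 = 920
```

The first 94 columns are the characteristics themselves (the interaction with the constant), followed by one block of 94 columns per conditioning variable and then the industry dummies.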
Principal component analysis is a dimension-reduction technique that finds
a few linear combinations of factors that explain most of the overall covariance
structure of the factors. But those combinations are not necessarily good return
predictors since PCA uses no information on mean returns. Lettau and Pelger
(2020) propose a risk-premium PCA estimator that adds to the traditional PCA
objective function a no-arbitrage penalty term that helps price the cross-section
of average returns. The method can be combined with existing factor-reduction
techniques such as those in Kozak, Nagel, and Santosh (2020) and Kelly, Pruitt, and
Su (2019). In this paper, the risk-premium PCA estimator detects high-Sharpe-ratio
factors that affect only a subset of underlying assets. The authors find that five
factors are sufficient to fit the first and second moments among 37 anomaly
strategies. One factor is the market, two high Sharpe-ratio factors matter for
the time-series variation and the SDF, while two other factors are important for
capturing the cross-section of returns.
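
A minimal sketch of the idea follows. It assumes that the penalized objective can be solved as an eigenvalue problem on the second-moment matrix of returns plus a multiple `gamma` of the outer product of mean returns; the function name, the simulated data, and the value of `gamma` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def rp_pca(returns, n_factors, gamma):
    """Eigenvectors of (1/T) X'X + gamma * mean(X) mean(X)'."""
    T = returns.shape[0]
    mean = returns.mean(axis=0)
    target = returns.T @ returns / T + gamma * np.outer(mean, mean)
    eigval, eigvec = np.linalg.eigh(target)          # eigenvalues in ascending order
    loadings = eigvec[:, ::-1][:, :n_factors]        # keep the largest ones
    factors = returns @ loadings
    return loadings, factors, target

rng = np.random.default_rng(5)
X = 0.01 + 0.05 * rng.standard_normal((240, 37))     # hypothetical 37 portfolios
loadings, factors, target = rp_pca(X, n_factors=5, gamma=10.0)
print(loadings.shape, factors.shape)
```

Under this formulation, setting `gamma = -1` recovers standard PCA on the sample covariance matrix, while larger values put more weight on fitting mean returns.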
One approach to deal with the zoo of new characteristics in the cross-
section of returns is to determine which firm characteristics provide incremental
information when one seeks to limit just how strong the assumptions need to
be about the functional form for characteristics and returns. One study in this
volume by Freyberger, Neuhierl, and Weber (2020) proposes a nonparametric
method invoking an adaptive (two-stage) group LASSO procedure, developed
by Huang, Horowitz, and Wei (2010), for model selection (which characteristics
matter) and for nonparametric estimation (no strong functional form). They
advance several applications to showcase LASSO, but the most intriguing is
that of 62 familiar characteristics for a common sample of U.S. returns. They
show that only 13 of them have incremental explanatory power for the full
sample period and all stocks. However, only 11 have predictive power for the
first half alone, and only nine are resilient for stocks with market capitalization
above the 20th percentile of NYSE stocks. Of special interest is the out-of-
sample performance of the nonparametric models relative to traditional linear
models that most of the literature adopts as well as the predictive power of
characteristics and how it varies over time, reminiscent of this volume’s opening
contribution by Fama and French (2020). This paper is not the first to invoke
adaptive group LASSO methods in finance, but it offers a novel application
for model selection that lays the groundwork for more in the dimensionality
challenge of the factor zoo.
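
To give a flavor of how a LASSO penalty performs model selection, here is a sketch of the plain linear LASSO solved by coordinate descent. It is a simpler relative of the adaptive group LASSO used in the paper, not that procedure itself; the simulated data, the penalty `lam`, and the function names are assumptions.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam and set it to zero if |z| <= lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Residual with predictor j's current contribution added back.
            partial_resid = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ partial_resid / n
            b[j] = soft_threshold(rho, lam) / col_ss[j]
    return b

rng = np.random.default_rng(6)
n, p = 400, 20
X = rng.standard_normal((n, p))
true_b = np.zeros(p)
true_b[:3] = [0.5, -0.4, 0.3]            # only three predictors matter
y = X @ true_b + 0.5 * rng.standard_normal(n)

b_hat = lasso_cd(X, y, lam=0.1)
selected = np.flatnonzero(b_hat)
print(selected)
```

The soft-thresholding step sets weak coefficients exactly to zero, so the set of nonzero entries in `b_hat` doubles as the selected model. A group version applies the same logic to blocks of coefficients, for example all basis terms belonging to one characteristic.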


The SDF that is at the heart of asset pricing is a complex, nonlinear function
of many cross-sectional factors and conditioning variables. The ML papers in
this volume show great promise in reducing that complexity to a manageable
number of factors. The new approach also results in models with substantially
higher and more volatile Sharpe ratios than those obtained with traditional
approaches.

4. Lessons for the Future


4.1 Heading off model and method mining
The multiple testing bias of Harvey (2017) requires the researcher to make
multiple implementation decisions. Decisions include, among others: (i) how
to transform the data (what scale to use, whether to group into quintiles or
deciles, among others); (ii) what sample to use; (iii) what hyper-parameters
to choose; and (iv) what starting values to impose. While the specification
may work well for the sample at hand, the fear is that there is overfitting in
the testing sample. The same structure may not work well in other settings,
such as when using data from other countries (Hou, Karolyi, and Kho, 2011,
among others). Providing Monte Carlo simulations, allowing for a larger set of
hyper-parameters, broadening the sample of securities, and proving statistical
properties of the estimator are all steps researchers can take. We believe a set
of new best practices needs to develop to assuage concerns that results are
sensitive to implementation choices.

4.2 Seeking common ground across methods


The different dimension-reduction techniques often yield a modest number of
factors (typically, below ten). Several papers published in this volume confirm
this fact. But different methods appear to result in different factors. We offer
that there would be great value for scholars in a better understanding of how
the factors that result from the various methods differ from each other and why.
For example, one useful diagnostic would be to compute the (conditional)
correlation of the candidate SDFs that result from the different methods.
Isolating their common component may also be fruitful.
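
Such a diagnostic could be sketched as follows. The example assumes that each method delivers a small set of excess-return factors, builds an in-sample SDF that is linear in those factors, and reports the unconditional (not conditional) correlation. The function name and the simulated factor sets are hypothetical.

```python
import numpy as np

def sdf_from_factors(factors):
    """In-sample SDF m_t = 1 - b'(f_t - mean), with b = Cov(f)^{-1} mean."""
    mu = factors.mean(axis=0)
    cov = np.atleast_2d(np.cov(factors, rowvar=False, ddof=0))
    b = np.linalg.solve(cov, mu)
    return 1.0 - (factors - mu) @ b

rng = np.random.default_rng(7)
T = 600
common = rng.standard_normal((T, 2))
# Two hypothetical factor sets produced by different methods, sharing a common part.
factors_a = 0.005 + 0.04 * (common + 0.5 * rng.standard_normal((T, 2)))
factors_b = 0.005 + 0.04 * (common + 0.5 * rng.standard_normal((T, 2)))

m_a = sdf_from_factors(factors_a)
m_b = sdf_from_factors(factors_b)
corr = np.corrcoef(m_a, m_b)[0, 1]
print(corr)
```

By construction each SDF prices its own factors in sample (the sample mean of m times f is zero), so a low correlation between the two series signals that the methods have identified genuinely different sources of priced risk.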

4.3 The need for economic interpretability


The goal of ending up with a small number of important cross-sectional asset
pricing factors and conditioning variables is still only an intermediate step in
the holy-grail quest for the SDF. The end goal should be that we arrive at an
SDF that comes closer to achieving the mean-variance efficient portfolio and
that the SDF be economically interpretable. Economic interpretability remains
a challenge for all ML approaches, and may be particularly acute for the more
black-box-like approaches such as neural networks. Only with solid economic
intuition can such a synthesis of empirical evidence serve as the basis for more
realistic asset pricing theories. Understanding the economic mechanisms at
play is essential, especially if the goal is robust and credible out-of-sample
predictability. While all papers evaluate the approach based on its out-of-sample
predictability prowess, we have only one out-of-sample sample. The risk is that
the most advanced prediction model will fail in the future unless we have a good
grasp of the economic mechanisms behind the factors it has selected. We believe
much work remains to be done toward achieving this goal, but our hope is that
this special issue represents an important milestone.
The natural next step for this literature is to build in economic mechanisms
from equilibrium asset pricing into the estimation process, and to figure out
how to best use ML techniques in this context. Kelly, Pruitt, and Su (2019),
Feng, Giglio, and Xiu (2019), Kozak, Nagel, and Santosh (2020), and Haddad,
Kozak, and Santosh (2020) are examples of good first steps.
Another natural next step is to incorporate holdings data and to use ML
to better understand what factors (and what information) drive individual and
institutional portfolio choices. It strikes us as important to better understand how
prices and expected returns reflect the information actual investors use to form
their portfolios. The increasing availability of large-scale, higher-frequency, and
globally accessible holdings data sets would seem to provide fertile ground for
new applications of ML tools.

4.4 Other fertile ground for new methods


Many of the tools proposed in this special issue would be valuable in other
finance settings, whether for predicting bond returns, analyzing corporate decisions,
evaluating investment managers, predicting real economic quantities or goods
prices, or analyzing the information contained in text data. Indeed, exciting
research in several of these areas has begun, and the Review looks forward to
supporting the most innovative work in these areas.

References

Banz, R. 1981. The relation between return and market value of common stocks. Journal of Financial Economics
9:3–18.

Barber, B., X. Huang, and T. Odean. 2016. Which factors matter to investors? Evidence from mutual fund flows.
Review of Financial Studies 29:2600–42.

Basu, S. 1983. The relationship between earnings’ yield, market value and return for NYSE common stocks:
Further evidence. Journal of Financial Economics 12:129–56.

Berk, J., and J. H. van Binsbergen. 2016. Assessing asset pricing models using revealed preference. Journal of
Financial Economics 119:1–23.

Campbell, J. Y., and R. J. Shiller. 1988. The dividend-price ratio and expectations of future dividends and discount
factors. Review of Financial Studies 1:195–227.

Carhart, M. M. 1997. On the persistence of mutual fund performance. Journal of Finance 52:57–82.

Chordia, T., A. Goyal, and A. Saretto. 2020. Anomalies and false rejections. Review of Financial Studies
33:2134–79.

Cochrane, J. 2011. Discount rates. Journal of Finance 66:1047–108.

Daniel, K., L. Mota, S. Rottke, and T. Santos. 2020. The cross-section of risk and return. Review of Financial
Studies 33:1927–79.

DeMiguel, V., A. Martín-Utrera, F. J. Nogales, and R. Uppal. 2020. A transaction-cost perspective on the multitude
of firm characteristics. Review of Financial Studies 33:2180–222.

Fama, E. F., and K. R. French. 1988. Dividend yields and expected stock returns. Journal of Financial Economics
22:3–27.

———. 1992. The cross-section of expected stock returns. Journal of Finance 47:427–65.

———. 2015. A five-factor asset pricing model. Journal of Financial Economics 116:1–22.

———. 2020. Comparing cross-section and time-series factor models. Review of Financial Studies 33:1891–926.

Fama, E. F., and J. MacBeth. 1973. Risk, return and equilibrium: Empirical tests. Journal of Political Economy
81:607–36.

Feng, G., S. Giglio, and D. Xiu. 2019. Taming the factor zoo: A test of new factors. NBER Working Paper 25481.

Ferson, W. E., and C. R. Harvey. 1991. The variation of economic risk premiums. Journal of Political Economy
99:385–415.

French, K., G. W. Schwert, and R. Stambaugh. 1987. Expected stock returns and volatility. Journal of Financial
Economics 19:3–30.

Freyberger, J., A. Neuhierl, and M. Weber. 2020. Dissecting characteristics nonparametrically. Review of
Financial Studies 33:2326–77.

Gu, S., B. Kelly, and D. Xiu. 2020. Empirical asset pricing via machine learning. Review of Financial Studies
33:2223–73.

Haddad, V., S. Kozak, and S. Santosh. 2020. Factor timing. Review of Financial Studies 33:1980–2018.

Hansen, L. P., and R. Jagannathan. 1991. Restrictions on intertemporal marginal rates of substitution implied by
asset returns. Journal of Political Economy 99:225–62.

Hansen, L. P., and K. Singleton. 1983. Stochastic consumption, risk aversion, and the temporal behavior of asset
returns. Journal of Political Economy 91:249–65.

Harvey, C. R. 2017. The scientific outlook in financial economics. Journal of Finance 72:1399–440.

Harvey, C. R., Y. Liu, and H. Zhu. 2016. ... and the cross-section of expected returns. Review of Financial Studies
29:5–68.

Hou, K., G. A. Karolyi, and B. C. Kho. 2011. What factors drive global stock returns? Review of Financial
Studies 24:2527–74.

Hou, K., C. Xue, and L. Zhang. 2015. Digesting anomalies: An investment approach. Review of Financial Studies
28:650–705.

———. 2020. Replicating anomalies. Review of Financial Studies 33:2019–133.

Huang, J., J. L. Horowitz, and F. Wei. 2010. Variable selection in nonparametric additive models. Annals of
Statistics 38:2282–313.

Jegadeesh, N., and C. S. Mangipudi. Forthcoming. What do fund flows reveal about asset pricing models and
investor sophistication? Review of Financial Studies.

Kelly, B., S. Pruitt, and Y. Su. 2019. Characteristics are covariances: A unified model of risk and return. Journal
of Financial Economics 134:501–24.

Koijen, R., and S. van Nieuwerburgh. 2011. Predictability of stock returns and cash flows. Annual Review of
Financial Economics 3:467–91.

Koijen, R., and M. Yogo. 2019. A demand system approach to asset pricing. Journal of Political Economy
127:1475–515.

Kozak, S., S. Nagel, and S. Santosh. 2020. Shrinking the cross section. Journal of Financial Economics
135:271–92.

Lettau, M., and M. Pelger. 2020. Factors that fit the time series and cross-section of stock returns. Review of
Financial Studies 33:2274–325.

McLean, R. D., and J. Pontiff. 2015. Does academic research destroy stock return predictability? Journal of
Finance 71:5–32.

Mehra, R., and E. Prescott. 1985. The equity premium: A puzzle. Journal of Monetary Economics 15:145–61.

Novy-Marx, R. 2013. The other side of value: The gross profitability premium. Journal of Financial Economics
108:1–28.

Novy-Marx, R., and M. Velikov. 2016. A taxonomy of anomalies and their trading costs. Review of Financial
Studies 29:104–47.

Rosenberg, B., K. Reid, and R. Lanstein. 1985. Persuasive evidence of market inefficiency. Journal of Portfolio
Management Spring:9–16.

Shiller, R. J. 1981. The use of volatility measures in assessing market efficiency. Journal of Finance 36:291–304.

Spiegel, M. 2008. Forecasting the equity premium: Where we stand today. Review of Financial Studies 21:1453–4.
