Journal of Financial Economics - Charting by Machines
Charting by machines
Scott Murray*, Yusen Xia, Houping Xiao
Georgia State University, Robinson College of Business, 35 Broad Street, Atlanta, GA, 30303, United States
ARTICLE INFO

Dataset link: https://dx.doi.org/10.17632/x63r376783.2

JEL classification: G11, G12

Keywords: Efficient market hypothesis; Machine learning; Deep learning; Charting; Technical analysis; Cross-section of stock returns

ABSTRACT

We test the efficient market hypothesis by using machine learning to forecast stock returns from historical performance. These forecasts strongly predict the cross-section of future stock returns. The predictive power holds in most subperiods and is strong among the largest 500 stocks. The forecasting function has important nonlinearities and interactions, is remarkably stable through time, and captures effects distinct from momentum, reversal, and extant technical signals. These findings question the efficient market hypothesis and indicate that technical analysis and charting have merit. We also demonstrate that machine learning models that perform well in optimization continue to perform well out-of-sample.
1. Introduction

The weak form of the efficient market hypothesis (EMH hereafter) stipulates that the construction of a profitable portfolio based only on information discernable from plots depicting the historical performance of stocks (price plots hereafter) should not be possible. As such, technical analysis, or charting, should be a fruitless investment technique. Academic research on technical analysis has broadly supported this prediction. Despite the broad dismissal of technical analysis in the academic literature, it remains widely used by investment managers. The continued widespread use of technical analysis suggests that its merit may not be fully discovered in academic research, and that further investigation is warranted.

In this paper, we test the weak form of the EMH by examining whether forecasts produced by machine learning (ML hereafter) predict the cross-section of future stock returns. The forecasts are based only on data that are easily discernable from historical price plots, specifically the cumulative stock returns over the past 12 months. We find strong evidence that the ML-based forecasts have economically important and highly statistically significant predictive power. This predictive power prevails in most subperiods of our focal 196307-202212 test period, including the most recent 201501-202212 subperiod, and remains strong among the largest 500 stocks. The forecasting function is remarkably stable through time and highly complex, with substantial nonlinear and interaction components that are important for prediction. Finally, the predictive power of our ML-based forecasts is not explained by the well-known momentum (Jegadeesh and Titman (1993)) and reversal (Jegadeesh (1990)) effects, nor by previously studied technical or ML-based signals.

Execution of the ML process requires us to make several implementation decisions. To alleviate concerns related to data mining (Harvey et al. (2016)) or out-of-sample forecasting power (McLean and Pontiff (2016), Green et al. (2017)), we use data from 192701-196306, which we refer to as the “optimization period”, to select our ML model.1 Our analyses lead us to use a convolutional neural network with long short-term memory as the ML architecture, mean-squared error as the loss function, a weighting scheme that assigns the same total weight to observations from each month and equal weight to each stock within a
* Corresponding author.
E-mail addresses: [email protected] (S. Murray), [email protected] (Y. Xia), [email protected] (H. Xiao).
1 Lo and MacKinlay (1990) and Fama (1991) also raise concerns about falsely significant relations between predictive variables and the cross-section of future stock returns. Giglio et al. (2021) propose a methodology to address the issue of multiple testing, raised by Harvey et al. (2016), in the context of linear asset pricing models. Schwert (2003) shows that the magnitudes of the size and value effects decrease after the period examined by the initial research on these effects.
https://doi.org/10.1016/j.jfineco.2024.103791
Received 27 May 2022; Received in revised form 12 January 2024; Accepted 13 January 2024
Available online 24 January 2024
0304-405X/© 2024 Elsevier B.V. All rights reserved.
S. Murray, Y. Xia and H. Xiao Journal of Financial Economics 153 (2024) 103791
month, and a normalized measure of future return as the dependent variable.2 Our use of data from only the optimization period to make these ML model decisions ensures that our main test results truly reflect out-of-sample predictive power.

We use the optimized ML model to generate stock return forecasts during the 196307-202212 period, which we refer to as the “test period”. Our main analyses examine the performance of portfolios formed by sorting stocks on the ML-based forecasts, which we denote 𝑀𝐿𝐸𝑅. We find that 𝑀𝐿𝐸𝑅 is a strong predictor of the cross-section of future stock returns. The average excess returns of value-weighted decile portfolios constructed using breakpoints calculated from only NYSE-listed stocks increase from −0.14% per month for the decile one portfolio to 0.93% per month for the decile 10 portfolio. The 1.08% per month generated by the portfolio that is long the decile 10 portfolio and short the decile one portfolio is not only economically large, but also highly statistically significant, with a 𝑡-statistic of 5.51. While our methodology is designed to ensure that our results reflect out-of-sample performance, this and other 𝑡-statistics from our asset pricing tests far surpass the benchmark 𝑡-statistics proposed by Harvey et al. (2016). We find no evidence that variation in the average returns of the 𝑀𝐿𝐸𝑅-sorted portfolios is attributable to risk. Alphas of the decile portfolios with respect to several established factor models exhibit patterns similar to those of the average excess returns. Risk metrics such as volatility, skewness, value at risk, and expected shortfall do not support a risk-based explanation for the patterns in average returns.

The predictive power of 𝑀𝐿𝐸𝑅 is strong throughout most of our 196307-202212 sample period. The portfolio that is long (short) stocks with high (low) values of 𝑀𝐿𝐸𝑅 generates a large and highly significant average excess return of more than 1% per month during most subperiods. The exception is the 200501-201412 subperiod, during which the average excess return is close to zero due to a small number of very large monthly losses in 2009. These losses are attributable to the portfolio’s unusually large momentum tilt, combined with large momentum strategy losses, in 200904 and 200908. During the most recent 201501-202212 subperiod, the portfolio generates 1.20% per month with an associated 𝑡-statistic of 2.13. The predictive power of 𝑀𝐿𝐸𝑅 remains remarkably strong when the sample is restricted to only large stocks. Most notably, when using only the top 500 stocks by market capitalization, the average excess return of the long-short portfolio is 0.72% per month (𝑡-statistic = 4.37).

The objective of our next tests is to characterize the predictive power of the ML-based forecasts. The main challenge in this is that the ML process is a black box methodology and the forecasting function generated by the learning process is difficult to interpret.

The first such tests examine whether the relation between past price patterns and future stock returns is stable through time. We find that this is indeed the case. Forecasts based on fits of subsets of the data that are separated by long periods of time are highly correlated, and the associated long-short portfolios have substantial common holdings and high return correlation. Furthermore, the performance of the ML-based forecasts is slightly better when the ML model is fit to all available past data compared to when the fitting process is applied to data from rolling-window subperiods.

We then investigate the importance of nonlinearities and interaction terms in the forecasting function, and find them to be responsible for a substantial portion of the variation in forecasts and important for prediction. Nearly two-thirds of variation in the forecasts is attributable to nonlinearities and interactions in the forecasting function, and nearly half of this variation is driven by interactions. Regressions of future returns on 𝑀𝐿𝐸𝑅, historical cumulative returns, and terms capturing nonlinearities in the forecasts indicate that the predictive power of 𝑀𝐿𝐸𝑅 remains strong after controlling for these terms.

Our next tests examine whether the predictive power of the ML-based forecasts is subsumed by previously documented relations between variables calculated from historical market data and future stock returns. First, we find that while the ML-based forecasts do include components related to the momentum (Jegadeesh and Titman (1993)) and reversal (Jegadeesh (1990), Lehmann (1990)) effects, a large portion of the ML-based forecasts’ predictive power is unrelated to these phenomena. We next show that our ML-based forecasts have a positive but relatively weak relation with the image-based ML forecasts of Jiang et al. (2022) and that the predictive power of our forecast remains strong after controlling for the image-based forecasts. We also find that none of the 14 technical signals examined by Neely et al. (2014) or 14 trading friction variables examined by Freyberger et al. (2020) explain the relation between our ML-based forecasts and future stock returns. Finally, we visually examine plots associated with high and low future returns and find that even among stocks with similar values of momentum and reversal, there are differences in such charts that can easily be discerned by a human chart reader.

Our last tests examine the effectiveness of our optimization procedure. Specifically, we investigate whether ML models that performed well during the 192701-196306 optimization period continued to perform well during the 196307-202212 test period. To assess whether this is the case, we construct long-short portfolios by sorting stocks based on forecasts generated from each of the candidate ML models considered in our optimization exercise. We find that the correlation between values of the metric used to assess model performance in our optimization procedure and the associated Sharpe ratios of the long-short portfolios during the test period is 0.69. This strong positive correlation indicates that the optimization process successfully identified which ML models would produce better out-of-sample forecasts.

Our work contributes to three broad strands of research. First, we add to the literature examining whether past returns contain information useful for predicting future returns, which is tantamount to the literature on the weak form of the EMH. The most prominent findings in this literature are the aforementioned momentum and reversal effects. A subset of this literature explicitly investigates the validity of technical analysis and charting. Several papers examine the ability of technical signals to predict the performance of broad market indices or diversified portfolios. Brock et al. (1992) find that simple technical signals predict the future returns of the Dow Jones Index, but subsequent work attributes this finding to data snooping (Sullivan et al. (1999), Ready (2002), and Bajgrowicz and Scaillet (2012)) and nonsynchronous trading (Bessembinder and Chan (1998)). Allen and Karjalainen (1999) use genetic algorithms (a form of ML) to identify technical rules for trading the S&P 500 index and find that they do not work out of sample. More recently, Zhu and Zhou (2009) find that technical analysis can be useful for making asset allocation decisions, Moskowitz et al. (2012) find time-series momentum in a large number of asset classes, Neely et al. (2014) show that technical indicators are useful for predicting the market risk premium, and Han et al. (2013) find evidence that moving average strategies work well for timing investment in volatility-sorted portfolios.3

Despite the widespread use of technical analysis in practice (Menkhoff (2010), Lo and Hasanhodzic (2010)), research on the predictive power of technical signals is sparse. Most recently, Jiang et al.
2 We rely on the machine learning literature (see e.g. Goodfellow et al. (2016)) to motivate our choices of hyperparameters, such as the number of layers in the neural network, the number of epochs used for early stopping to prevent overfitting, etc. The details of the configurations of our neural networks are described in Section I and Figure A1 of the Internet Appendix.

3 Sweeney (1986), Levich and Thomas (1993), Neely et al. (1997), Chang and Osler (1999), and Gehrig and Menkhoff (2006) find evidence that investment strategies based on technical analysis are profitable in foreign exchange markets, a finding that Osler (2003) attributes to the clustering of stop-loss and take-profit orders. LeBaron (1999) argues that the profitability of such strategies is due to central bank intervention in currency markets, but Neely (2002) concludes the opposite.
(2022) use ML to predict whether a stock will generate a positive or negative future return based on image representations of past price and volume data.4 Their forecasts differ from ours in many ways. First, the input to their ML-based forecasts is an image of a chart that depicts daily open, close, high, and low prices, as well as trading volume and moving average prices over a past period of between five days and three months. Our approach is simpler in that we use only 12 past monthly cumulative return observations as the basis for our forecasts. Furthermore, Jiang et al. (2022) scale their images to reflect only the shape, but not the magnitude, of patterns in historical price and volume plots. We do not scale the data upon which our forecasts are based, thereby allowing our ML process to learn patterns based on both the shape and magnitude of past returns. Not surprisingly given the difference in our approaches, the predictive power of our ML-based forecasts is not subsumed by that of the forecasts generated by Jiang et al. (2022). Additionally, while Jiang et al. (2022) find that the predictive patterns are context-independent, meaning that the same patterns hold in different markets and using different time scales, we find that the predictive patterns we detect are persistent through time, suggesting that it is possible for a chartist to learn the patterns over a relatively long period. Our paper is most similar to Lo et al. (2000), who use smoothing estimators to extract nonlinear relations between historical price patterns and future stock returns. This finding has been the subject of much scrutiny, most notably by Jegadeesh (2000), who argues that Lo et al. (2000)’s findings are not robust to variation in the bandwidth parameter used for smoothing, a subjective empirical decision, and that the profitability of trading strategies based on the patterns detected by Lo et al. (2000) is close to zero. Our paper is similar in spirit to Lo et al. (2000), except we use a more modern technology (ML) than they do (smoothing estimators) to detect the relations between past price patterns and future returns. Our paper overcomes the robustness critique of Jegadeesh (2000) by implementing a systematic procedure to optimize our ML model using only data from the period prior to our main test period. Our use of portfolio analyses as our main empirical methodology addresses Jegadeesh (2000)’s profitability critique. The main contribution of our paper, therefore, is to demonstrate the merit of technical analysis, and thus provide evidence contradicting the EMH, in a manner that overcomes the critiques of previous work in this area.

Second, we add to the growing literature using ML to understand the cross-section of stock returns.5 Messmer (2017), Messmer and Audrino (2020), and Freyberger et al. (2020) use ML to identify relations between stock-level characteristics and expected stock returns. Kelly et al. (2019), Gu et al. (2020, 2021), Kozak et al. (2020), Lettau and Pelger (2020), Bryzgalova et al. (2023), Chen et al. (2023), and Feng et al. (2023) use ML to extract latent factors, factor exposures, and risk premia from characteristics. Guijarro-Ordonez et al. (2023) use deep learning to construct optimal arbitrage portfolios.6 A consistent theme in this work is the use of ML to synthesize the information in a broad set of stock-level variables that previous work has already found to be related to expected stock returns. Our work differs in that our forecasts are based only on 12 past monthly cumulative returns. The main benefit of ML in our setting is its ability to detect highly complex relations between past and expected future returns.7 This contrasts with previous work that harnesses ML’s ability to generate meaningful forecasts from a large number of predictive variables. Unlike the work that identifies factors and arbitrage portfolios, which uses ML to identify optimal tradeoffs between risk and expected return, our objective of assessing the efficacy of technical analysis for predicting future stock returns leads us to use ML to forecast only the expected return.

Finally, we contribute to the broader literature that uses ML for asset pricing applications by demonstrating the effectiveness of model selection via ex-ante optimization. Most research using ML to predict returns uses some form of model optimization procedure to select model parameters. Gu et al. (2020) compare the performance of different ML models in generating estimates of risk premia from a large number of stock-level characteristics but do not examine whether out-of-sample performance correlates with performance in the optimization process. Our finding of a strong correlation between optimization period and test period performance of different ML models shows that such optimization processes are effective.

The remainder of this paper is organized as follows. Section 2 describes the construction of our sample. Section 3 describes how we optimize our ML model and the calculation of the ML-based return forecasts. Section 4 presents our main evidence that the ML-based forecasts successfully predict the cross-section of future stock returns. Section 5 characterizes the nature of the predictive power of the ML-based forecasts. In Section 6 we conduct an evaluation of our optimization procedure. Section 7 concludes.

2. Sample and variables

In this section we describe our sample and the variables used in our tests.

2.1. Sample

The stock data used in our study come from CRSP. Our sample contains stock-month observations for months 𝑡 from 192701-202212 (inclusive). In each month 𝑡, the sample contains all stocks that, on the last trading day of month 𝑡 − 1, are common shares of US-based firms listed on the NYSE, AMEX, or NASDAQ. To ensure that the construction of a one-year historical price plot for each stock in the sample would have been feasible, we require that each included stock has a non-missing return for each of months 𝑡 − 12 through 𝑡 − 1, inclusive. Finally, because our focal analyses weight stocks by market capitalization, we require that the market capitalization of each stock in the sample as of the end of month 𝑡 − 1, which we define as the number of shares outstanding times the price of each share, can be calculated. Our focal tests examine the ability of ML-based forecasts calculated as of the end of month 𝑡 − 1 to predict the cross-section of month 𝑡 stock returns.

To ensure that the results of our focal tests reflect out-of-sample predictive power, we use sample months 𝑡 from 192701-196306, inclusive, to optimize our ML model, and refer to this period as the “optimization period”. The first month of the optimization period is 192701 because return data in CRSP begin in 192601, making 192701 the first month for which a full year of prior return data are available. Our focal asset pricing tests cover months 𝑡 from 196307 through 202212, which we refer to as the “test period”. We begin the test period in 196307 (and end the optimization period in 196306) to conform with the start date of the sample in Fama and French (1992, 1993) and several subsequent asset pricing studies.

4 Moritz and Zimmermann (2016) apply tree-based conditional portfolio sorts to past return deciles and find that the associated forecasts predict the cross-section of future stock returns. Their use of past return deciles, instead of the actual cumulative returns, makes it difficult to reach a conclusion about the efficacy of traditional charting from their results.

5 Earlier work includes Allen and Karjalainen (1999), who use genetic programming to search for profitable technical patterns in the S&P 500 index, and find none. In related work, Feng et al. (2018) and Rossi (2018) use ML to predict S&P 500 index returns, and Bianchi et al. (2021) use ML to predict bond returns.

6 Rapach et al. (2013) use ML to generate forecasts of international stock returns based on past US market returns. Chinco et al. (2019) use ML to generate forecasts of one-minute-ahead future stock returns based on returns in the past three minutes. Bali et al. (2022) use ML to generate forecasts of corporate bond and stock returns.

7 We find similar results when using 60 months of past cumulative returns to generate the forecasts.
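The sample rules of Section 2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: months are integers in YYYYMM form as in the text, and the function names (`shift_month`, `period_of`, `is_eligible`) and the dictionary-based data layout are hypothetical, not taken from the paper or from CRSP. The exchange and share-type screens are omitted.

```python
from typing import Mapping, Optional

# Period boundaries as stated in Section 2.1 (inclusive, YYYYMM).
OPT_START, OPT_END = 192701, 196306
TEST_START, TEST_END = 196307, 202212


def shift_month(yyyymm: int, k: int) -> int:
    """Return the YYYYMM month that is k months after yyyymm (k may be negative)."""
    year, month = divmod(yyyymm, 100)
    idx = year * 12 + (month - 1) + k
    return (idx // 12) * 100 + (idx % 12) + 1


def period_of(yyyymm: int) -> Optional[str]:
    """Label a sample month as 'optimization', 'test', or None if outside the sample."""
    if OPT_START <= yyyymm <= OPT_END:
        return "optimization"
    if TEST_START <= yyyymm <= TEST_END:
        return "test"
    return None


def is_eligible(returns: Mapping[int, Optional[float]],
                shares: Optional[float],
                price: Optional[float],
                t: int) -> bool:
    """Check the two data requirements for a stock in month t.

    returns maps YYYYMM -> monthly return (None or absent if missing).
    shares and price are the month t-1 values used for market capitalization.
    """
    # Non-missing return in each of months t-12 through t-1.
    has_history = all(returns.get(shift_month(t, -k)) is not None
                      for k in range(1, 13))
    # Market capitalization (shares times price) must be computable.
    has_market_cap = shares is not None and price is not None
    return has_history and has_market_cap
```

Because CRSP returns begin in 192601, `shift_month(192701, -12)` evaluates to 192601, which is why 192701 is the first month that can satisfy the history requirement.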
2.2. Variables

For each stock 𝑖 in each month 𝑡, we calculate several variables. The focal dependent variable is the month 𝑡 excess stock return, calculated as the delisting-adjusted stock return minus the return of the risk-free security in month 𝑡.8 The independent variables, which are used to predict the cross-section of month 𝑡 returns, are calculated from data available as of the end of month 𝑡 − 1. The focal independent variable is the ML-based return forecast, 𝑀𝐿𝐸𝑅. Section 3 describes in detail how we calculate 𝑀𝐿𝐸𝑅. For now, we simply note that the forecast of stock 𝑖’s month 𝑡 return is calculated by applying the function generated by the ML process to the cumulative returns of stock 𝑖 over months 𝑡 − 12 through 𝑡 − 1, which we denote 𝐶𝑅₁, …, 𝐶𝑅₁₂. Specifically, 𝐶𝑅ₖ is the cumulative stock return over the 𝑘-month period covering months 𝑡 − 12 through 𝑡 − 12 + 𝑘 − 1. This notation is motivated from the point of view of an investor looking at a one-year price plot created at the end of month 𝑡 − 1. 𝐶𝑅₁ is the stock return in the first month that appears on the plot, and more generally 𝐶𝑅ₖ is the cumulative return of the stock during the first 𝑘 months that appear on the plot.

3. ML model and forecasts

In this section we describe how we select our ML model using data from the optimization period and how we use the selected ML model to generate forecasts of future stock returns.

3.1. Optimizing the ML model

Our objective in using ML is to generate a function 𝑓 that, given values of input variables discernable from a historical price plot of a stock, will produce a forecast of the stock’s future return. Very generally, the ML process takes a panel data set that includes one or more input variables, a dependent variable, and a weight variable, which combined we refer to as the fitting data, and attempts to “learn” the function 𝑓 that describes the relation between the input variables and the expectation of the dependent variable. The ML process attempts to learn 𝑓 by minimizing the value of a loss function. Once the function 𝑓 has been learned, it can be applied to data not included in the fitting data to generate forecasts.

ML is a broad term that encapsulates several different empirical methods, including linear dimension reduction techniques such as principal components regression and partial least squares, nonlinear methods such as group LASSO that both perform dimension reduction and allow the forecast to be nonlinear in the inputs, nonparametric methods such as boosted regression trees and random forests that account for interactions between input variables, and neural networks, which accommodate most features of all other models and are therefore the most complex but most general and flexible learning tool.9 We choose to use neural networks to generate our ML-based forecasts for two reasons. First, the learning process underlying neural networks is intended to mimic that of the human brain. We want our empirical design to be well-suited for capturing any predictive power discernable by the type of human chartist described in Lo et al. (2000), who learns complex patterns in stock returns from observation of historical price plots. Second, since our forecasts are based on only 12 input variables, the main benefit of ML in our context is its ability to account for both nonlinearities and interactions in the forecasting function. Methods whose main benefit is dimension reduction or variable selection, therefore, are likely to be less useful in our context.

As input variables, we use the monthly cumulative stock returns over the 12 months prior to the month whose return is to be forecasted, 𝐶𝑅₁, …, 𝐶𝑅₁₂. While return predictability based on past returns of any frequency or horizon violates the EMH, we use one year of past monthly cumulative returns for several reasons. Our use of one year of past data follows previous technical analysis research, which typically uses at most one year of data to generate signals.10 The use of a year’s worth of past returns also ensures that our input variables contain the same information as the standard measure of momentum, which is the cumulative return over months 𝑡 − 12 through 𝑡 − 2. We use monthly (instead of daily or some other shorter frequency) cumulative returns to ensure that any patterns detected by our ML process could plausibly be detected by a human chartist, who may only be able to discern relatively coarse data from a one-year price plot. This decision follows previous ML research that uses data measured at the monthly frequency when calculating most historical return-based variables (see e.g. Neely et al. (2014), Gu et al. (2020), and Freyberger et al. (2020)). The use of monthly data also alleviates the need to address challenges associated with different years having different numbers of days when implementing our ML model. In Section III and Table A1 of the Internet Appendix, we show that the results of tests that use ML-based forecasts based on 60 months of past return data are qualitatively similar to those of our main measure. While our focal tests use cum-dividend returns to generate forecasts of future returns, traditional chartists frequently work with price histories that do not account for the impact of dividends on prices. Therefore, in Section IV and Table A2 of the Internet Appendix, we show that our results hold when using price-based cumulative returns, instead of cum-dividend cumulative returns, to forecast future returns.

The remaining decisions related to selection of our ML model are based on analysis of data from the optimization period, which covers sample months 𝑡 from 192701-196306. We aim to find the combination of neural network architecture, loss function, loss function weight variable, and dependent variable that generates the best forecasts of future stock returns.

Following LeCun et al. (2015) and Goodfellow et al. (2016), we consider four neural network architectures: a feed-forward neural network (FNN), a convolutional neural network (CNN), a long short-term memory network (LSTM), and a convolutional neural network with long short-term memory (CNNLSTM). To avoid the curse of dimensionality, we rely on the ML literature to motivate our choices of hyperparameters, such as the number of layers in the neural network, etc. The neural networks’ configuration details, along with graphical depictions of each neural network architecture, are provided in Section I and Figure A1 of the Internet Appendix. All four architectures are widely used representation-based deep learning methods designed to learn complex relations in the data. Here, we give a very brief comparison of the different architectures. We refer the reader to LeCun et al. (2015) and Goodfellow et al. (2016) for more technical discussions.

FNN is the most general of the four architectures, but it does not explicitly consider the grid topology or variable dependency that frequently characterize sequential and time-series data. The CNN and LSTM architectures are designed to overcome this shortfall of FNN. CNN is a specialized type of neural network designed to process data with a grid-like topology, such as our cumulative return data, which can be interpreted as a one-dimensional grid. Previous work (Hoseinzade and Haratizadeh (2019)) has shown CNNs to be highly successful at time-series prediction. LSTM is a form of recurrent neural network (RNN). RNNs are explicitly designed to process one-dimensional sequential data. However, conventional RNNs have been shown to be

8 The month 𝑡 delisting-adjusted stock return is calculated following Shumway (1997). Details of the calculation are in Section II of the Internet Appendix. Monthly risk-free security returns are from Ken French’s website: https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.

9 Gu et al. (2020) provide an excellent overview of the different machine learning models.

10 Brock et al. (1992), Zhu and Zhou (2009), and Han et al. (2013) consider moving average prices of up to 200 days, whereas Neely et al. (2014) use data from the past year to calculate many technical signals. Allen and Karjalainen (1999) use a 250-day moving average to scale S&P 500 index levels.
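As a worked illustration of the 12 cumulative-return inputs defined in Section 2.2, the following Python sketch builds them from a stock's monthly returns. It assumes that "cumulative return" means the compounded return (the text does not spell out the formula), and the function names are hypothetical. The momentum helper relies on the observation in Section 3.1 that momentum is the cumulative return over months t − 12 through t − 2, which is the eleventh input.

```python
from typing import List, Sequence


def cumulative_returns(monthly_returns: Sequence[float]) -> List[float]:
    """Build the 12 cumulative-return inputs from monthly returns.

    monthly_returns[0] is the return in month t-12 and monthly_returns[11]
    is the return in month t-1. Element k-1 of the result is the return
    compounded over the first k months shown on a one-year price plot.
    Compounding is an assumption made for this sketch.
    """
    if len(monthly_returns) != 12:
        raise ValueError("expected exactly 12 monthly returns")
    out: List[float] = []
    growth = 1.0  # value of one unit invested at the start of month t-12
    for r in monthly_returns:
        growth *= 1.0 + r
        out.append(growth - 1.0)
    return out


def momentum_from_inputs(cr: Sequence[float]) -> float:
    """Cumulative return over months t-12 through t-2, i.e. the 11th input."""
    return cr[10]


def last_month_return_from_inputs(cr: Sequence[float]) -> float:
    """Month t-1 return implied by the 11th and 12th inputs."""
    return (1.0 + cr[11]) / (1.0 + cr[10]) - 1.0
```

For example, a stock that gains 10% in month t − 12 and loses 10% in month t − 11 has a first input of 0.10 and a second input of about −0.01, since 1.10 × 0.90 = 0.99.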
Table 1
ML Model Optimization. This table presents the results of tests used to determine the optimal ML model. The tests are run
using only data from the 192701-196306 optimization period. We examine all combinations of four ML architectures, two
loss functions, three loss function weighting methodologies, and four dependent variables. The ML architectures are feed-
forward neural network (FNN), convolutional neural network (CNN), long short-term memory (LSTM), and convolutional
neural network with long short-term memory (CNNLSTM). The loss functions are the mean squared error (MSE) and the
mean absolute error (MAE). The loss function weighting methodologies are to equal-weight each observation (EW), to
equal-weight each month and within each month give equal weight to each observation (EWPM), and to equal-weight
each month and within each month weight each observation according to its market capitalization (EWPMVW). The four
dependent variables we examine are the excess stock return (𝑟), the standardized excess stock return (𝑟𝑆𝑡𝑑), the normalized
excess stock return (𝑟𝑁𝑜𝑟𝑚), and the percentile of the stock return (𝑟𝑃𝑐𝑡𝑙). Using each of the 96 possible combinations of
implementation choices, we train each model 30 times using data from the even months in even years and odd months
in odd years, to generate 30 forecasting functions. We then apply each of the forecasting functions to each observation
in odd months in even years and even months in odd years, and for each such observation, take the average of the 30
resulting values to be the forecast. The table shows the time-series averages of the monthly cross-sectional Spearman rank
correlations between the forecasts and the actual excess return. The column labeled “Dependent Variable” indicates the
dependent variable. The column labeled “Weighting Methodology” indicates the weighting methodology. The remaining
column headers indicate the ML architecture and loss function. The Spearman rank correlations are shown in percent.
Dependent Variable Weighting Methodology FNN FNN CNN CNN LSTM LSTM CNNLSTM CNNLSTM
MAE MSE MAE MSE MAE MSE MAE MSE
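The sample split and the model grid described in the caption of Table 1 can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: months are encoded as YYYYMM integers, and the names `MODEL_GRID`, `is_fitting_month`, and `split_months` are hypothetical.

```python
from itertools import product

ARCHITECTURES = ["FNN", "CNN", "LSTM", "CNNLSTM"]
LOSSES = ["MSE", "MAE"]
WEIGHTINGS = ["EW", "EWPM", "EWPMVW"]
TARGETS = ["r", "rStd", "rNorm", "rPctl"]

# All 4 x 2 x 3 x 4 = 96 combinations of implementation choices.
MODEL_GRID = list(product(ARCHITECTURES, LOSSES, WEIGHTINGS, TARGETS))


def is_fitting_month(yyyymm: int) -> bool:
    """True for even months in even years and odd months in odd years."""
    year, month = divmod(yyyymm, 100)
    return year % 2 == month % 2


def split_months(first: int, last: int):
    """Split every month in [first, last] into fitting and non-fitting lists."""
    fitting, non_fitting = [], []
    year, month = divmod(first, 100)
    while year * 100 + month <= last:
        current = year * 100 + month
        if is_fitting_month(current):
            fitting.append(current)
        else:
            non_fitting.append(current)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return fitting, non_fitting


# Optimization period used in the paper: 192701 through 196306.
fitting, non_fitting = split_months(192701, 196306)
```

Under this parity rule the optimization period divides evenly, so each model is trained and evaluated on the same number of months.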
Specifically, the chartist would observe a plot of past prices (not transformed prices) for different stocks and from those charts assess which stocks are likely to outperform. The commonality in individual stock returns described above may therefore cause the ML-based forecasts in any given month to be either predominantly high or predominantly low based on whether the market portfolio as a whole has a past return profile that leads to a high or low forecast, respectively. A benefit of our approach, however, is that it may help the learning process detect patterns that are distinct from traditional momentum and reversal signals, which are based on relative past performance. Since all of our tests are cross-sectional in nature, even in a month where all stocks have a high (or low) forecast future return, our tests will still assess whether the forecasts have the ability to discern which stocks are likely to have relatively high (or low) future returns.

In sum, we consider 96 different ML models, found by taking all combinations of four architectures (FNN, CNN, LSTM, CNNLSTM), two loss functions (MSE, MAE), three weighting schemes (EW, EWPM, EWPMVW), and four dependent variables (𝑟, $r^{Std}$, $r^{Norm}$, $r^{Pctl}$).

To assess the effectiveness of each model, we first apply each model to a subset of the data in the 192701-196306 optimization period. Specifically, we fit each model 30 times on data from optimization period sample months 𝑡 corresponding to even months from even years and odd months from odd years (fitting months hereafter). Each model is fit 30 times because the ML process is random. If we run the same model on the same data twice, the two resulting functions will not be the same. Each fitting uses 70% of the observations for training and the other 30% for validation. Because each fitting selects a different 30% of observations for validation, our methodology essentially uses cross validation. However, since the 30% of observations used for validation are randomly selected, they come from the same months as the observations used for training and may not be considered independent of the training month observations. For this reason, as described in the next paragraph, we evaluate the out-of-sample performance of each model using non-fitting month observations. Each of the 30 fittings produces a different forecasting function 𝑓. We then apply each of these 30 forecasting functions 𝑓 to each non-fitting month observation in the optimization period. Finally, for each such observation, we take the average of those 30 forecasts to be the forecast based on the given ML model.¹² This approach of averaging several forecasts is referred to as “ensemble averaging.”¹³ In Section VI and Figure A3 of the Internet Appendix, we show that using an average of 30 forecasts removes almost all randomness from the ensemble forecast. Removing this randomness is important to ensure that our results can be replicated by future researchers. In the end, for each non-fitting month observation in the optimization period, we have 96 different ensemble forecasts, one corresponding to each different ML model.

We evaluate the models using the time-series average of the monthly cross-sectional Spearman rank correlations between forecasts and actual excess returns, calculated from non-fitting month observations. We use Spearman rank, instead of Pearson product-moment, correlations because our main tests are portfolio analyses that rely solely on the ordering of stocks with respect to the forecasts. Furthermore, because several of the forecasts are for transformed versions of the future excess stock return, the linearity assumption inherent in the Pearson correlation is unlikely to hold. Table 1 presents the average Spearman rank correlations for each model. The highest average correlation is generated by the CNNLSTM architecture using the MSE loss function with EWPM weights and $r^{Norm}$ as the dependent variable. We therefore choose this model (CNNLSTM, MSE, EWPM, and $r^{Norm}$) to generate

12 Krizhevsky et al. (2012), Sutskever et al. (2014), and Krogh and Vedelsby (1995) discuss the benefits of taking the average of multiple forecasts when using neural networks.
13 Other papers that use ensemble averaging in asset pricing applications are Moritz and Zimmermann (2016), Gu et al. (2020, 2021), Bianchi et al. (2021), Bali et al. (2022), and Jiang et al. (2022).
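The ensemble averaging and the evaluation metric described in this section can be illustrated with a short NumPy sketch. It assumes the 30 fitted forecasting functions have already been applied, so the input is simply an array of forecasts. The function names and the average-rank treatment of ties are assumptions made for this example, and the toy data are invented.

```python
import numpy as np


def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]


def spearman(a, b):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])


def ensemble_forecast(forecasts):
    """Average an array of shape (n_fits, n_obs) across the fits."""
    return np.asarray(forecasts, dtype=float).mean(axis=0)


def mean_monthly_spearman(months, forecast, realized):
    """Time-series average of monthly cross-sectional Spearman correlations."""
    months = np.asarray(months)
    forecast = np.asarray(forecast, dtype=float)
    realized = np.asarray(realized, dtype=float)
    monthly = [
        spearman(forecast[months == m], realized[months == m])
        for m in np.unique(months)
    ]
    return float(np.mean(monthly))


# Toy data: 3 fits (the paper uses 30), 2 months, 3 stocks per month.
fits = [
    [0.1, 0.2, 0.3, 0.3, 0.2, 0.1],
    [0.2, 0.1, 0.4, 0.5, 0.1, 0.2],
    [0.0, 0.3, 0.5, 0.1, 0.3, 0.0],
]
months = [1, 1, 1, 2, 2, 2]
realized = [-0.01, 0.02, 0.05, 0.03, 0.04, -0.02]
score = mean_monthly_spearman(months, ensemble_forecast(fits), realized)
```

Because only ranks enter the metric, any monotone transformation of the dependent variable leaves the score unchanged, which is consistent with the motivation given for preferring Spearman to Pearson correlations.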
Table 2
ML-Based Forecast Summary Statistics. This table presents summary statistics for the ML-based return forecasts, 𝑀𝐿𝐸𝑅.
𝑀𝐿𝐸𝑅 is a forecast of the normalized excess stock return. The column labeled “Period” indicates the period used
to calculate the summary statistics in the given row. The columns labeled “Mean”, “S.D.”, “Min.”, “𝑃1 ”, “𝑃5 ”, “𝑃25 ”,
“Median”, “𝑃75 ”, “𝑃95 ”, “𝑃99 ”, and “Max.” present the time-series averages of the monthly cross-sectional mean, standard
deviation, and the minimum, first percentile, fifth percentile, 25th percentile, median, 75th percentile, 95th percentile,
99th percentile, and maximum, respectively, values of 𝑀𝐿𝐸𝑅. The column labeled “n” shows the time-series average
of the number of observations in each month. The set of stocks included in the month 𝑡 sample is all common shares
of US-based firms that are listed on the NYSE, AMEX, or NASDAQ as of the end of month 𝑡 − 1. We also require that a
return is available for each stock in each of months 𝑡 − 12 through 𝑡 − 1, and that the stock’s market capitalization as of
the end of month 𝑡 − 1 is available. Values of 𝑀𝐿𝐸𝑅 are calculated as of the end of month 𝑡 − 1.
Period Mean S.D. Min. 𝑃1 𝑃5 𝑃25 Median 𝑃75 𝑃95 𝑃99 Max. n
196307-202212 −0.02 0.11 −0.61 −0.33 −0.22 −0.07 −0.00 0.05 0.12 0.19 0.38 4166
196307-197412 −0.01 0.09 −0.53 −0.28 −0.17 −0.06 −0.00 0.04 0.12 0.20 0.41 2245
197501-198412 −0.01 0.11 −0.75 −0.35 −0.21 −0.06 0.01 0.06 0.13 0.20 0.40 4332
198501-199412 −0.02 0.12 −0.66 −0.38 −0.24 −0.07 −0.00 0.05 0.15 0.26 0.57 5337
199501-200412 −0.04 0.12 −0.61 −0.36 −0.26 −0.11 −0.02 0.04 0.12 0.20 0.38 5792
200501-201412 −0.01 0.10 −0.53 −0.33 −0.21 −0.06 0.00 0.05 0.11 0.15 0.26 3956
201501-202212 −0.02 0.10 −0.57 −0.32 −0.22 −0.07 0.00 0.05 0.10 0.14 0.21 3486
our focal ML-based forecasts, 𝑀𝐿𝐸𝑅. Notably, small deviations from this model, such as using the MAE loss function, EW weights, or either $r^{Std}$ or $r^{Pctl}$ as the dependent variable, also perform well. This suggests that variation in performance across models is not spurious and that our optimization procedure is likely to have merit. Ex-post tests of the effectiveness of our optimization procedure, discussed in Section 6, confirm this hypothesis.

3.2. ML-based forecasts

We generate our focal ML-based forecasts, 𝑀𝐿𝐸𝑅, by applying the ML model to expanding-window past subsets of our sample, and using the functions generated by these expanding window fits to produce forecasts for subsequent periods.¹⁴ In general, we define $MLER_{i,t}^{t_1,t_2}$ to be the forecast of the normalized excess return of stock 𝑖 in month 𝑡 that results from fitting our ML model on data from the period between months 𝑡₁ and 𝑡₂, inclusive. Specifically, for any 𝑡₁, 𝑡₂ > 𝑡₁, 𝑖, and 𝑡, we calculate $MLER_{i,t}^{t_1,t_2}$ as follows. First, we apply the ML process 30 times to observations for all stocks from sample months between 𝑡₁ and 𝑡₂, inclusive. The result is 30 different forecasting functions. We then apply each of these 30 functions to the cumulative returns of stock 𝑖 in months 𝑡 − 12 through 𝑡 − 1. The result is 30 different forecasts of the normalized excess return of stock 𝑖 in month 𝑡. We then take $MLER_{i,t}^{t_1,t_2}$ to be the average of these 30 forecasts. For example, to calculate $MLER_{X,200809}^{192701,196306}$, we first run the ML process 30 times using all observations in our sample from months 𝑡 between 192701 and 196306, inclusive. We then apply each of the resulting forecasting functions to the monthly cumulative returns of stock 𝑋 that would be observed in stock 𝑋’s price plot covering the period from 200709 through 200808. The result of applying these functions to these cumulative returns is 30 forecasts of the normalized excess return of stock 𝑋 in 200809. We then take $MLER_{X,200809}^{192701,196306}$ to be the average of these 30 forecast values.

The expanding windows to which we apply the ML process cover sample months 𝑡 between 192701 and each of 196306, 197412, 198412, 199412, 200412, and 201412. As a result, we calculate $MLER_{i,t}^{192701,196306}$, $MLER_{i,t}^{192701,197412}$, $MLER_{i,t}^{192701,198412}$, $MLER_{i,t}^{192701,199412}$, $MLER_{i,t}^{192701,200412}$, and $MLER_{i,t}^{192701,201412}$ for each stock 𝑖 and month 𝑡 observation in our sample. The forecast normalized excess return of stock 𝑖 in month 𝑡 that we use in our focal empirical tests, $MLER_{i,t}$, is the forecast based on the most recent past execution of the fitting process. We therefore have:

$$
MLER_{i,t} =
\begin{cases}
MLER_{i,t}^{192701,196306}, & \text{if } 196307 \le t \le 197412; \\
MLER_{i,t}^{192701,197412}, & \text{if } 197501 \le t \le 198412; \\
MLER_{i,t}^{192701,198412}, & \text{if } 198501 \le t \le 199412; \\
MLER_{i,t}^{192701,199412}, & \text{if } 199501 \le t \le 200412; \\
MLER_{i,t}^{192701,200412}, & \text{if } 200501 \le t \le 201412; \\
MLER_{i,t}^{192701,201412}, & \text{if } 201501 \le t \le 202212.
\end{cases}
$$

3.3. Summary statistics

Table 2 presents the time-series averages of monthly cross-sectional summary statistics for 𝑀𝐿𝐸𝑅 for the entire 196307-202212 test period and for subperiods of the test period. In interpreting the forecasts, recall that the forecasts are for the normalized future excess return, not the excess return itself. In the average month, 𝑀𝐿𝐸𝑅 has a mean of −0.02, a median very close to zero, and a cross-sectional standard deviation of 0.11. Extreme negative values of 𝑀𝐿𝐸𝑅 occur more frequently than extreme positive values, since the minimum, first percentile, and fifth percentile values are further below the mean than the 95th percentile, 99th percentile, and maximum values, respectively, are above it. The prevalence of large negative values of 𝑀𝐿𝐸𝑅 compared to large positive values should not impact most of our asset pricing tests, since we focus on portfolio analyses, which rely on the ordering (but not the magnitude or distribution) of 𝑀𝐿𝐸𝑅 across stocks. In the average test period month, our sample contains 4,166 stocks. The subperiod results indicate that the cross-sectional distribution of 𝑀𝐿𝐸𝑅 is highly consistent through time, since the salient characteristics of the cross-sectional distribution of 𝑀𝐿𝐸𝑅 do not change much across the different subperiods. The number of stocks in the sample ranges from an average of 2,245 stocks per month for the 196307-197412 subperiod to 5,792 stocks per month for the 199501-200412 subperiod.
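The expanding-window rule in Section 3.2, under which each test month uses the most recent fit that ended before it, can be expressed as a small lookup. The sketch below assumes YYYYMM integer months; the names `fitting_window` and `lookback_months` are hypothetical and chosen only for this illustration.

```python
from bisect import bisect_left

FIT_START = 192701
FIT_ENDS = [196306, 197412, 198412, 199412, 200412, 201412]
TEST_START, TEST_END = 196307, 202212


def fitting_window(t: int):
    """Return (t1, t2) of the most recent fit that ends before test month t."""
    if not TEST_START <= t <= TEST_END:
        raise ValueError(f"{t} is outside the {TEST_START}-{TEST_END} test period")
    # bisect_left counts the fit end dates that are strictly earlier than t.
    return FIT_START, FIT_ENDS[bisect_left(FIT_ENDS, t) - 1]


def lookback_months(t: int, n: int = 12):
    """Months t-n through t-1, the window whose cumulative returns feed the forecast."""
    year, month = divmod(t, 100)
    index = year * 12 + (month - 1)
    months = []
    for k in range(index - n, index):
        y, m = divmod(k, 12)
        months.append(y * 100 + m + 1)
    return months
```

For the focal forecast of month 200809, `fitting_window(200809)` returns the 192701-200412 window, and `lookback_months(200809)` spans 200709 through 200808, the same price-plot window used in the stock 𝑋 example.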
Table 3
Portfolio Analysis. This table presents the results of a portfolio analysis examining the ability of 𝑀𝐿𝐸𝑅 to predict the cross-
section of future stock returns. At the end of each month 𝑡 − 1, all stocks in the month 𝑡 sample are sorted into 10 portfolios
based on an ascending ordering of 𝑀𝐿𝐸𝑅. The breakpoints used to determine which stocks are in which portfolio are the deciles
of 𝑀𝐿𝐸𝑅 calculated using only stocks listed on the NYSE. The month 𝑡 excess return of each portfolio is then taken to be the
market capitalization-weighted average month 𝑡 excess return of all stocks in the portfolio, with market capitalization calculated
as of the end of month 𝑡 − 1. We also calculate the excess return of a zero-cost portfolio that is long portfolio 10 and short
portfolio 1. The columns labeled “𝑀𝐿𝐸𝑅 1” through “𝑀𝐿𝐸𝑅 10” present results for decile portfolios 1 through 10. The column
labeled “𝑀𝐿𝐸𝑅 10 − 1” presents results for the zero-cost long-short portfolio that is long the decile 10 portfolio and short the
decile 1 portfolio. The rows labeled “𝑟” and “SD” present the time-series averages and standard deviations, respectively, of the
monthly portfolio excess returns for each of the portfolios, reported in percent per month. The row labeled “Sharpe” presents the
annualized Sharpe ratio of each portfolio. The values in parentheses are 𝑡-statistics, calculated following Newey and West (1987)
using 12 lags, testing the null hypothesis that the average monthly excess return is equal to zero. The analysis covers return
months 𝑡 from July 1963 through December 2022, inclusive.
𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅
Value 1 2 3 4 5 6 7 8 9 10 10 − 1
𝑟 −0.14 0.32 0.39 0.55 0.54 0.65 0.71 0.66 0.77 0.93 1.08
(−0.55) (1.39) (1.89) (2.77) (2.91) (3.94) (4.10) (3.99) (4.36) (5.18) (5.51)
SD 6.46 5.66 5.26 5.11 4.71 4.63 4.50 4.38 4.47 5.03 4.79
Sharpe −0.08 0.20 0.26 0.37 0.40 0.49 0.55 0.52 0.59 0.64 0.78
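The portfolio construction described in the caption of Table 3 can be sketched for a single month as follows. This is an illustrative NumPy sketch rather than the authors' implementation. The function names are hypothetical, and two details are assumptions: breakpoints use NumPy's default linear-interpolation percentiles, and a stock exactly equal to a breakpoint is placed in the lower decile.

```python
import numpy as np


def decile_assignments(signal, is_nyse):
    """Assign deciles 1..10 using breakpoints computed from NYSE stocks only."""
    signal = np.asarray(signal, dtype=float)
    is_nyse = np.asarray(is_nyse, dtype=bool)
    breakpoints = np.percentile(signal[is_nyse], np.arange(10, 100, 10))
    return np.searchsorted(breakpoints, signal, side="left") + 1


def value_weighted_returns(deciles, excess_ret, mktcap):
    """Value-weighted excess return of each decile (NaN if a decile is empty)."""
    deciles = np.asarray(deciles)
    excess_ret = np.asarray(excess_ret, dtype=float)
    mktcap = np.asarray(mktcap, dtype=float)
    out = np.full(10, np.nan)
    for d in range(1, 11):
        members = deciles == d
        if members.any():
            out[d - 1] = np.average(excess_ret[members], weights=mktcap[members])
    return out


def annualized_sharpe(monthly_excess):
    """Mean over standard deviation of monthly excess returns, times sqrt(12)."""
    monthly_excess = np.asarray(monthly_excess, dtype=float)
    return float(monthly_excess.mean() / monthly_excess.std(ddof=1) * np.sqrt(12))


# Toy month with 20 stocks, all NYSE-listed; stock 20 has triple the market cap.
signal = np.arange(1, 21, dtype=float)
mktcap = np.ones(20)
mktcap[-1] = 3.0
deciles = decile_assignments(signal, np.ones(20, dtype=bool))
vw = value_weighted_returns(deciles, signal / 100.0, mktcap)
long_short = vw[9] - vw[0]  # the 10 - 1 portfolio return for this month
```

As a check on the annualization, the tabulated 10 − 1 values give 1.08 / 4.79 × √12 ≈ 0.78, matching the Sharpe ratio reported in the table.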
4.1. 𝑀𝐿𝐸𝑅-sorted portfolio returns

We test the EMH by examining the performance of portfolios formed by sorting stocks based on 𝑀𝐿𝐸𝑅. Each month 𝑡, we sort all stocks into decile portfolios based on an ascending ordering of 𝑀𝐿𝐸𝑅, which is calculated from data available at the end of month 𝑡 − 1. The breakpoints determining which stocks are in which portfolios are the deciles of 𝑀𝐿𝐸𝑅 calculated from the subset of stocks that are listed on the NYSE at the end of month 𝑡 − 1. The month 𝑡 excess return of each portfolio is calculated as the weighted average excess return of all stocks in the portfolio, with weights proportional to market capitalization at the end of month 𝑡 − 1 (value-weighted hereafter).¹⁵ We also calculate the excess return of a zero-cost portfolio that is long the decile 10 portfolio and short the decile one portfolio (10−1 portfolio hereafter). Our portfolio construction methodology follows Hou et al. (2020) and is intended to limit the impact of small-capitalization stocks on our analysis.

Table 3 presents the time-series averages of the monthly portfolio excess returns during the 196307-202212 test period. The average monthly excess returns increase nearly monotonically from −0.14% for decile portfolio one to 0.93% for portfolio 10. The average excess return of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio of 1.08% per month is highly statistically significant, with a Newey and West (1987) 𝑡-statistic of 5.51. This portfolio’s annualized Sharpe ratio is 0.78, which is higher than that of the market factor (0.42), the factors in the Fama and French (2015) factor model (between 0.26 and 0.51), the momentum factor (0.53), and the reversal factor (0.51) during the same 196307-202212 period.¹⁶ The patterns in average excess returns contradict the prediction of the EMH that technical analysis should be fruitless.

4.2. Risk-adjusted returns of 𝑀𝐿𝐸𝑅-sorted portfolios

A refined version of the EMH allows for profitable technical strategies if the associated average returns are compensation for risk. Asset pricing theory predicts that expected stock returns are a function of covariances between individual stock returns and innovations in priced risk factors, or betas. Stocks with similar betas, therefore, are likely to have both similar expected returns and similar past return patterns due to their covariances with factor innovations. However, by definition, factor innovations are unpredictable and serially uncorrelated, making it unlikely that covariances with factor innovations produce return patterns that repeat through time. For the ML-based forecasts to predict the cross-section of future stock returns, there must be repeated patterns in past returns that are informative about future expected returns. There is good economic reason, therefore, to think that the differences in average excess returns between the 𝑀𝐿𝐸𝑅-sorted portfolios are not a manifestation of exposure to systematic risk factors. Nonetheless, we investigate whether exposure to systematic risk factors can explain the performance of the 𝑀𝐿𝐸𝑅-sorted portfolios.

4.2.1. Full sample factor model regressions

We begin by using factor analysis to estimate the average risk-adjusted excess return (alpha) of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio.¹⁷ The portfolio’s alpha is calculated as the intercept coefficient from a time-series regression of excess portfolio returns on excess factor returns. A non-zero (zero) alpha indicates that exposures to the systematic risk factors captured by the factor model do not explain (explain) the portfolio’s average excess return. Since the true factor model that prices securities is not known, we conduct the analyses using six previously-established empirical models: a one-factor market model (CAPM), the three-factor model of Fama and French (1993, FF), the four-factor model of Carhart (1997, FFC), the FFC model augmented with a short-term reversal factor (FFC+REV), the five-factor model of Fama and French (2015, FF5), and the four-factor model of Hou et al. (2015, Q).¹⁸

Table 4 Panel A shows that regardless of which model is used, the alpha of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio is positive, economically large, and statistically significant. Because the momentum and reversal factors are constructed by sorting stocks on variables calculated only from past returns, it is not surprising that the 𝑀𝐿𝐸𝑅 10 − 1 portfolio’s alpha is smaller when using the FFC and FFC+REV models than when using the CAPM, FF, and FF5 models. That the alpha with respect to the Q model is similar to that of the FFC model is consistent with Hou et al. (2015)’s finding that the Q model explains the momentum effect. However, even when using the FFC+REV model, which includes

15 In Section VII and Table A3 (Section VIII and Table A4) of the Internet Appendix, we show that our results hold when using equal-weighted portfolios (breakpoints calculated from all stocks).
16 Monthly excess factor returns are from Ken French’s website. Factor Sharpe ratios are not tabulated.
17 In Section IX and Table A5 of the Internet Appendix, we present the results of analyses examining risk-adjusted returns for each of the individual decile portfolios.
18 Monthly excess returns for factors in the CAPM, FF, FFC, FFC+REV, and FF5 models are from Ken French’s website. Monthly excess returns for the factors in the Q model are from Chen Xue’s website: http://global-q.org/factors.html. Q model factor excess returns are available for the period from 196701-202112, thus all analyses using the Q factor model are subject to this data constraint.
Table 4
Risk-Adjusted Returns. This table presents the results of analyses examining whether the average re-
turns of the portfolios formed by sorting on ML-based forecasts are compensation for risk. Panel A
presents the results of factor model regressions of the excess returns of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio on
the excess returns of the factors in different factor models. The 𝑀𝐿𝐸𝑅 10 − 1 portfolio is the same
as that whose average excess returns are shown in Table 3. The rows labeled “𝛼”, “𝑆𝐷(𝜖)”, and “Adj. 𝑅²”
present the intercept, residual standard deviation, and adjusted 𝑅², respectively, of the regression.
The row labeled “Sharpe(𝛼 + 𝜖)” presents the annualized Sharpe ratio of the portfolio that is long
the 𝑀𝐿𝐸𝑅 10 − 1 portfolio and short the factor portfolios in amounts dictated by the slope coefficients
estimated by the regression, calculated as the alpha divided by the residual standard deviation,
times √12. Panel B presents results from 6-month non-overlapping subperiod factor model regressions
estimated using daily excess return data. The row labeled “𝛼” presents the average monthly alphas
(intercept coefficients multiplied by 21) from these regressions. The row labeled “𝑆𝐷(𝜖)” presents the
standard deviation of the residuals from these regressions times √21. The standard deviation of the
residuals from these regressions is calculated by using the residuals from the individual short subperiod
regressions to generate a full time-series of daily residuals for the entire 196301-202212 period and
calculating the standard deviation of this full time-series of residuals. The row labeled “Sharpe(𝛼 + 𝜖)”
presents the annualized Sharpe ratio of the portfolio, calculated by dividing the average monthly alpha
by the monthly standard deviation of the residuals and multiplying by √12. The row labeled “Adj. 𝑅²”
presents the average adjusted 𝑅² value from the short subperiod regressions. Except for the column
labeled “Value”, the column headers indicate the factor model used to generate the results in the given
column. All excess returns, standard deviations, and alphas are reported in percent per month. The val-
ues in parentheses are 𝑡-statistics testing the null hypothesis that the average monthly alpha is equal
to zero. The 𝑡-statistics in Panel A are calculated following Newey and West (1987) using 12 lags. The
analyses cover return months 𝑡 from July 1963 through December 2022, inclusive.
Panel A: 𝑀𝐿𝐸𝑅 10 − 1 Portfolio Full Sample Factor Regressions
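The Panel A quantities defined in the caption (alpha, residual standard deviation, adjusted 𝑅², and Sharpe(𝛼 + 𝜖)) can be illustrated with an ordinary least squares sketch. The function name `factor_regression` and the degrees-of-freedom correction in the residual standard deviation are assumptions for this example; the paper does not state the latter.

```python
import numpy as np


def factor_regression(portfolio_excess, factor_excess):
    """Time-series OLS of monthly portfolio excess returns on factor excess returns."""
    y = np.asarray(portfolio_excess, dtype=float)
    f = np.asarray(factor_excess, dtype=float).reshape(len(y), -1)
    X = np.column_stack([np.ones(len(y)), f])  # intercept column first
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    n, k = X.shape
    ssr = resid @ resid
    sd_eps = np.sqrt(ssr / (n - k))  # assumed degrees-of-freedom correction
    r2 = 1.0 - ssr / np.sum((y - y.mean()) ** 2)
    return {
        "alpha": float(coef[0]),
        "betas": coef[1:],
        "sd_eps": float(sd_eps),
        "adj_r2": float(1.0 - (1.0 - r2) * (n - 1) / (n - k)),
        # Annualized Sharpe ratio of the factor-hedged portfolio.
        "sharpe_alpha_eps": float(coef[0] / sd_eps * np.sqrt(12)),
    }


# Toy one-factor example: the noise is orthogonal to the factor and sums to zero,
# so the regression recovers alpha = 0.5 and beta = 2 exactly.
factor = np.array([1.0, -1.0, 1.0, -1.0, 2.0, -2.0])
noise = np.array([0.1, 0.1, -0.1, -0.1, 0.0, 0.0])
result = factor_regression(0.5 + 2.0 * factor + noise, factor)
```

For a multi-factor model such as FFC+REV, `factor_excess` would be a two-dimensional array with one column per factor.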
both the momentum and reversal factors, the 𝑀𝐿𝐸𝑅 10 − 1 portfolio’s alpha of 0.45% per month (𝑡-statistic = 3.58) is economically large and highly significant. Furthermore, while the adjusted 𝑅² values from the FFC and FFC+REV model regressions of 29.90% and 49.05%, respectively, are higher than those of other models, even the model that includes both the momentum and reversal factors (FFC+REV) explains less than half of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio’s variance. Thus, while there is overlap between the ML-based forecasts and each of momentum and reversal, the 𝑀𝐿𝐸𝑅 10 − 1 portfolio’s returns also contain a substantial component that is unrelated to these effects. We present further evidence supporting this conclusion in Section 5.3.

turns of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio on the daily factor excess returns.¹⁹ We multiply the intercept coefficient by 21 because there are approximately 21 days in the average month, and take this value to be the monthly alpha of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio during the given subperiod. The average alphas from these subperiod regressions, shown in Panel B of Table 4, are economically large and highly statistically significant for all factor models, ranging from 0.56% per month (𝑡-statistic = 3.77) for the FFC+REV model to 1.48% per month (𝑡-statistic = 11.57) for the FFC model. These results provide no indication that time-varying factor exposures explain the performance of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio.
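The subperiod procedure can be sketched from the description here and in the caption of Table 4: run a daily regression within each non-overlapping subperiod, scale each intercept by 21, and pool the daily residuals before scaling their standard deviation by √21. The function name and the `period_ids` input, which labels each trading day with its 6-month subperiod, are hypothetical, and the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np


def subperiod_daily_alpha(daily_excess, daily_factors, period_ids):
    """Average monthly alpha and pooled residual SD from subperiod daily regressions."""
    y = np.asarray(daily_excess, dtype=float)
    f = np.asarray(daily_factors, dtype=float).reshape(len(y), -1)
    period_ids = np.asarray(period_ids)
    monthly_alphas, residuals = [], []
    for p in np.unique(period_ids):
        rows = period_ids == p
        X = np.column_stack([np.ones(rows.sum()), f[rows]])
        coef, *_ = np.linalg.lstsq(X, y[rows], rcond=None)
        monthly_alphas.append(21.0 * coef[0])  # daily intercept scaled to a month
        residuals.append(y[rows] - X @ coef)
    avg_alpha = float(np.mean(monthly_alphas))
    # Pool all daily residuals into one series, then scale the SD to monthly.
    sd_eps = float(np.concatenate(residuals).std(ddof=1) * np.sqrt(21.0))
    sharpe = avg_alpha / sd_eps * np.sqrt(12.0)
    return avg_alpha, sd_eps, sharpe


# Toy example with two subperiods that have different intercepts and betas.
f = np.array([1.0, -1.0, 1.0, -1.0] * 2)
eps = np.array([0.01, 0.01, -0.01, -0.01] * 2)
ids = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = np.where(ids == 0, 0.001 + 1.0 * f, 0.003 + 0.5 * f) + eps
avg_alpha, sd_eps, sharpe = subperiod_daily_alpha(y, f, ids)
```

Estimating betas separately in each subperiod lets the factor exposures vary over time, which is the point of the test.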
Table 5
Portfolio Subperiod Analysis. This table describes the performance of the 𝑀𝐿𝐸𝑅 decile portfolios during different subperiods. The portfolios are
the exact same portfolios whose performances are examined in Table 3. The column labeled “Subperiod” indicates the subperiod covered by each
analysis. The columns labeled “𝑀𝐿𝐸𝑅 1” through “𝑀𝐿𝐸𝑅 10” and “𝑀𝐿𝐸𝑅 10 − 1” present results for the portfolio indicated in the column
header. The rows labeled “𝑟” and “Sharpe” present the time-series averages of the monthly portfolio excess returns, reported in percent per month,
and the annualized Sharpe ratios, respectively, for each of the portfolios in the given period. The values in parentheses are 𝑡-statistics, calculated
following Newey and West (1987) using 12 lags, testing the null hypothesis that the average monthly excess return is equal to zero.
𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅
Subperiod Value 1 2 3 4 5 6 7 8 9 10 10 − 1
196307-197412 𝑟 −0.90 −0.47 −0.26 −0.14 −0.20 −0.03 0.15 0.07 −0.03 0.25 1.15
(−1.39) (−0.84) (−0.49) (−0.32) (−0.50) (−0.09) (0.33) (0.17) (−0.07) (0.62) (3.10)
Sharpe −0.56 −0.32 −0.19 −0.10 −0.15 −0.03 0.12 0.06 −0.02 0.17 1.16
197501-198412 𝑟 0.16 0.27 0.42 0.65 0.65 0.66 0.85 0.81 1.03 1.47 1.32
(0.31) (0.58) (0.89) (1.38) (1.46) (1.60) (1.90) (2.00) (2.28) (3.13) (4.41)
Sharpe 0.09 0.17 0.30 0.47 0.50 0.50 0.63 0.62 0.73 0.93 1.15
198501-199412 𝑟 −0.17 0.49 0.44 0.63 0.85 0.78 0.79 0.84 0.98 1.00 1.17
(−0.57) (1.51) (1.54) (2.08) (2.49) (2.49) (2.67) (2.68) (3.19) (2.77) (4.07)
Sharpe −0.13 0.37 0.34 0.51 0.65 0.59 0.59 0.63 0.71 0.67 1.16
199501-200412 𝑟 −0.14 0.66 0.50 0.68 0.55 0.93 0.72 0.94 1.10 1.57 1.71
(−0.18) (1.15) (0.96) (1.17) (1.14) (2.28) (1.52) (2.60) (2.51) (4.06) (3.21)
Sharpe −0.06 0.35 0.30 0.41 0.40 0.67 0.58 0.83 0.96 1.16 0.90
200501-201412 𝑟 0.48 0.49 0.91 0.62 0.59 0.97 0.88 0.64 0.56 0.40 −0.08
(0.67) (0.67) (1.51) (1.09) (1.19) (2.35) (2.13) (1.28) (1.18) (0.71) (−0.15)
Sharpe 0.24 0.27 0.57 0.42 0.45 0.76 0.72 0.52 0.46 0.27 −0.05
201501-202212 𝑟 −0.19 0.66 0.43 1.07 1.00 0.70 1.03 0.77 1.17 1.01 1.20
(−0.23) (0.99) (0.71) (1.94) (1.83) (1.40) (2.16) (2.04) (2.99) (3.16) (2.13)
Sharpe −0.09 0.38 0.23 0.62 0.61 0.46 0.73 0.58 0.85 0.78 0.75
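The 𝑡-statistics in parentheses test whether an average monthly excess return is zero, using Newey and West (1987) standard errors with 12 lags. A minimal sketch of that statistic for the mean of a single return series is shown below. It uses Bartlett kernel weights and no small-sample adjustment; statistical packages differ in such details, so this should not be read as the authors' exact estimator.

```python
import numpy as np


def newey_west_tstat(x, lags=12):
    """t-statistic for H0: mean(x) = 0 using a Newey-West long-run variance."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mean = x.mean()
    e = x - mean
    long_run_var = e @ e / n  # lag-0 autocovariance
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)  # Bartlett kernel weight
        long_run_var += 2.0 * weight * (e[j:] @ e[:-j]) / n
    return float(mean / np.sqrt(long_run_var / n))
```

With `lags=0` the statistic reduces to the usual mean divided by its standard error (using the population variance), and positive autocorrelation in the series lowers the 𝑡-statistic relative to that benchmark.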
the 𝑀𝐿𝐸𝑅 portfolios. In these tests, the factor hedge ratios of the
portfolios are estimated from regressions that use data from prior to
portfolio formation. Because the hedge ratios are known at the time of
portfolio formation and updated monthly, this approach ensures that
the resulting alphas are those of a tradable portfolio while also accom-
modating time-varying betas. The results of these tests, described in
Section X and Tables A6 and A7 of the Internet Appendix, indicate that
the alpha of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio is positive, economically large,
and highly statistically significant for all factor models.
Second, we examine the total risk of the 𝑀𝐿𝐸𝑅 decile portfolios.
If the dispersion in average excess returns of the 𝑀𝐿𝐸𝑅 decile portfo-
lios is due to risk, we would expect a positive, negative, negative, and
negative relation between the average excess returns and the standard
deviation, skewness, value at risk, and expected shortfall, respectively,
of these portfolios. Analyses of the total risk of the 𝑀𝐿𝐸𝑅 decile port-
folios, shown in Section XI and Table A8 of the Internet Appendix, find
no evidence of these patterns.
In sum, our results suggest that the patterns in the average returns
of the portfolios formed by sorting on the ML-based forecasts do not
have a risk-based explanation.
Table 6
Predictive Power Among Large Stocks. This table presents the results of portfolio analyses examining the ability of the ML-based forecasts to predict the
cross-section of future stock returns among large stocks. The construction of the portfolios is exactly the same as was used to construct the portfolios whose
performances are examined in Table 3, except that we vary the set of stocks used in the calculation of the portfolio breakpoints and the set of stocks that
are sorted into the portfolios. The column labeled “Breakpoints” indicates the set of stocks used to calculate the breakpoints. The column labeled “Holdings”
indicates the set of stocks sorted into the portfolios. The $Size > P_{20}^{NYSE}$ (NYSE/$Size > P_{20}^{NYSE}$) and $Size > P_{50}^{NYSE}$ (NYSE/$Size > P_{50}^{NYSE}$) subsets include all
(NYSE-listed) stocks with market capitalizations greater than the 20th and 50th percentile values, respectively, among NYSE-listed stocks as of the end of
month 𝑡 − 1. The Top 500 subset includes the top 500 stocks by market capitalization at the end of month 𝑡 − 1. The columns labeled “𝑀𝐿𝐸𝑅 1” through
“𝑀𝐿𝐸𝑅 10” and “𝑀𝐿𝐸𝑅 10 − 1” present results for the portfolio indicated in the column header. The table shows the average excess return of each
portfolio. All excess returns are reported in percent per month. The values in parentheses are 𝑡-statistics, calculated following Newey and West (1987) using
12 lags, testing the null hypothesis that the average monthly excess return is equal to zero. The analyses cover return months 𝑡 from July 1963 through
December 2022, inclusive.
𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅 𝑀𝐿𝐸𝑅
Breakpoints Holdings 1 2 3 4 5 6 7 8 9 10 10 − 1
$Size > P_{20}^{NYSE}$ $Size > P_{20}^{NYSE}$ −0.13 0.34 0.39 0.57 0.53 0.60 0.75 0.62 0.75 0.92 1.06
(−0.52) (1.43) (1.92) (2.90) (2.86) (3.53) (4.37) (3.78) (4.26) (5.14) (5.68)
Top 500 Top 500 0.10 0.43 0.37 0.58 0.48 0.64 0.58 0.66 0.73 0.82 0.72
(0.41) (1.93) (1.88) (3.10) (2.95) (3.82) (3.26) (3.91) (4.14) (4.56) (4.37)
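The stock subsets named in the caption of Table 6 can be sketched as boolean masks over one month's cross-section. The function name `size_subsets` and the dictionary keys are hypothetical, and the percentile cutoffs use NumPy's default linear interpolation, which is an assumption.

```python
import numpy as np


def size_subsets(mktcap, is_nyse, top_n=500):
    """Boolean masks for the size-based subsets, for one month's cross-section."""
    mktcap = np.asarray(mktcap, dtype=float)
    is_nyse = np.asarray(is_nyse, dtype=bool)
    # Size cutoffs come from NYSE-listed stocks only.
    p20, p50 = np.percentile(mktcap[is_nyse], [20, 50])
    top = np.zeros(len(mktcap), dtype=bool)
    top[np.argsort(-mktcap, kind="mergesort")[:top_n]] = True
    return {
        "Size>P20": mktcap > p20,
        "NYSE/Size>P20": is_nyse & (mktcap > p20),
        "Size>P50": mktcap > p50,
        "NYSE/Size>P50": is_nyse & (mktcap > p50),
        "Top": top,
    }


# Toy cross-section: market caps 1..10, with even-valued caps listed on the NYSE.
caps = np.arange(1, 11, dtype=float)
masks = size_subsets(caps, caps % 2 == 0, top_n=3)
```

Each mask would be applied both to the stocks that define the breakpoints and to the stocks held in the portfolios, as the "Breakpoints" and "Holdings" columns indicate.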
4.4. Predictive power for large stocks the predictive power of the ML-based forecasts is strong among large
stocks.
Recent work shows that many return anomalies are concentrated
among small stocks (Fama and French (2008, 2018), Hou et al. (2020)) 5. Characterization of ML-based forecasts
and that trading costs make such anomalies unprofitable to trade (Novy-
Marx and Velikov (2016), Detzel et al. (2023)). The portfolios examined Having demonstrated that the ML-based forecasts have strong ability
to this point are value-weighted and constructed using breakpoints cal- to predict the cross-section of future stock returns, we proceed now to
culated from only NYSE-listed stocks. Hou et al. (2020) show that this further characterize this predictive power.
approach minimizes the influence of small stocks on the results of the
analyses. Nonetheless, to ensure that our results persist when small 5.1. Stability of forecasting relation
stocks are excluded, we repeat the portfolio analysis using subsets of
our sample that exclude small stocks from both the set of stocks held Our first tests characterizing the ML-based forecasts examine the
in the portfolios and the set of stocks used to calculate the breakpoints. stability of the relation between past price patterns and future stock
We define the 𝑆𝑖𝑧𝑒 > 𝑃20 NYSE
(NYSE/𝑆𝑖𝑧𝑒 > 𝑃20 NYSE
) and 𝑆𝑖𝑧𝑒 > 𝑃50NYSE
returns (forecasting relation hereafter). Specifically, we investigate
(NYSE/𝑆𝑖𝑧𝑒 > 𝑃50NYSE ) subsets to be the sets of (NYSE-listed) stocks with market capitalizations greater than the 20th and 50th percentile values, respectively, among NYSE-listed stocks, and the Top 500 subset to include only the top 500 stocks by market capitalization. Each of the market capitalization-based criteria is evaluated monthly using all stocks in the sample for the given month. We continue to use value-weighted portfolios for all of these tests.

The results of these analyses are shown in Table 6. When we use the 𝑆𝑖𝑧𝑒 > 𝑃20NYSE subset both to calculate the breakpoints and as the set of stocks held in the portfolios, the 𝑀𝐿𝐸𝑅 10 − 1 portfolio generates an average excess return of 1.06% per month (𝑡-statistic = 5.68). This result is extremely similar to that of our main test whose results are shown in Table 3. Indeed, the correlation between the excess returns of these two portfolios is 0.96 (untabulated). These results strongly suggest that the influence of small stocks on our focal tests is minimal. As we impose more stringent restrictions on the size of the stocks used in the analyses, the average excess returns of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio and associated 𝑡-statistics become slightly smaller. However, even when we use only the largest 500 stocks both to calculate the breakpoints and as the set of stocks held in the portfolios, the average excess return of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio of 0.72% per month (𝑡-statistic = 4.37) remains economically large and highly statistically significant. Furthermore, regardless of the set of stocks used to calculate the breakpoints or held in the portfolios, the decile one (10) portfolio generates the lowest (highest) average excess return of all portfolios. These tests show that

whether price patterns associated with high (low) future stock returns in the early part of our test period continue to be associated with high (low) stock returns for the duration of our test period. If the forecasting relation is stable, these relations could be learned over a prolonged period of time, thereby further suggesting the viability of charting.

To assess the stability of the forecasting relation, we define 𝑀𝐿𝐸𝑅^{192701,196306}_{𝑖,𝑡}, 𝑀𝐿𝐸𝑅^{196307,197412}_{𝑖,𝑡}, 𝑀𝐿𝐸𝑅^{197501,198412}_{𝑖,𝑡}, 𝑀𝐿𝐸𝑅^{198501,199412}_{𝑖,𝑡}, 𝑀𝐿𝐸𝑅^{199501,200412}_{𝑖,𝑡}, and 𝑀𝐿𝐸𝑅^{200501,201412}_{𝑖,𝑡} to be the forecasts for stock 𝑖 in month 𝑡 based on the forecasting function generated by running the ML process on data from the months between and inclusive of those in the superscripts. If the forecasting functions learned using data from two subperiods are similar and both predict the cross-section of returns, it would indicate stability in the forecasting relation.

We test whether the forecasting functions learned using data from two different subperiods are similar in three ways. First, we calculate the monthly cross-sectional Spearman rank correlations between the forecasts for all months in the test period that are not included in the data used to generate either forecasting function. For example, we calculate the correlations between 𝑀𝐿𝐸𝑅^{196307,197412} and 𝑀𝐿𝐸𝑅^{199501,200412} using forecasts from all months in 196307-202212 excluding 196307-197412 and 199501-200412. The time-series averages of these correlations, shown in Table 7 under “Forecast Correlations”, are all positive and of substantial magnitude, but decrease as the time between the fitting periods increases.
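As an illustration of the first comparison, the following Python sketch computes the time-series average of monthly cross-sectional Spearman rank correlations between two sets of forecasts, skipping months that fall in either fit period. The function names, the dictionary layout (month in yyyymm form mapped to an array of forecasts, with stocks in the same order in both dictionaries), and the no-ties assumption are illustrative choices, not details taken from the paper.

```python
import numpy as np

def spearman(x, y):
    # Rank both vectors (assumes no ties), then take the Pearson correlation of the ranks.
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return float(np.corrcoef(rx, ry)[0, 1])

def average_forecast_correlation(fa, fb, fit_ranges):
    """fa, fb: dict yyyymm -> forecasts for the same stocks in the same order.
    fit_ranges: list of (first_month, last_month) pairs to exclude, inclusive."""
    def excluded(m):
        return any(lo <= m <= hi for lo, hi in fit_ranges)
    months = [m for m in sorted(fa) if m in fb and not excluded(m)]
    return float(np.mean([spearman(fa[m], fb[m]) for m in months]))
```

For the example in the text, `fit_ranges` would be `[(196307, 197412), (199501, 200412)]`.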
S. Murray, Y. Xia and H. Xiao Journal of Financial Economics 153 (2024) 103791
Table 7
Stability of Forecasting Relation. This table presents the results of analyses examining the stability of the forecasting relation. We define 𝑀𝐿𝐸𝑅𝑡1 ,𝑡2
as the ML-based forecast generated by applying the ML process to data from months 𝑡 between 𝑡1 and 𝑡2 , inclusive. The columns under the “Forecast
Correlations” heading present the time-series averages of monthly cross-sectional Spearman rank correlations between pairs of forecasts. The columns
under the “10 − 1 Common Holdings” heading present the time-series averages of the monthly percentages, in decimal format (i.e. 0.01 = 1%) of holdings
in the 10 − 1 portfolios that are common to both portfolios. The percentage of common holdings in any pair of 10 − 1 portfolios in any month is the
number of stocks held in the same direction (both long or both short) in both portfolios divided by the average of the number of stocks in the portfolios.
The columns under the “10 − 1 Return Correlations” heading present the time-series Pearson product-moment correlations between the excess returns
of the 10 − 1 portfolios. With the exception of the sort variable, the 10 − 1 portfolios are constructed in exactly the same manner as those whose returns
are examined in Table 3. All analyses use data from all months in the 196307-202212 test period except those months whose data are used by the ML
process to generate the forecasting function upon which either forecast is based.
Rows and columns are labeled by the fit period (superscript) of the corresponding 𝑀𝐿𝐸𝑅 forecast.

Forecast Correlations
Fit period       196307,197412  197501,198412  198501,199412  199501,200412  200501,201412
192701,196306        0.62           0.56           0.50           0.49           0.31
196307,197412                       0.74           0.73           0.71           0.57
197501,198412                                      0.80           0.78           0.68
198501,199412                                                     0.84           0.77
199501,200412                                                                    0.75

10 − 1 Common Holdings
Fit period       196307,197412  197501,198412  198501,199412  199501,200412  200501,201412
192701,196306        0.50           0.48           0.41           0.41           0.30
196307,197412                       0.60           0.58           0.57           0.45
197501,198412                                      0.64           0.62           0.52
198501,199412                                                     0.65           0.58
199501,200412                                                                    0.56

10 − 1 Return Correlations
Fit period       196307,197412  197501,198412  198501,199412  199501,200412  200501,201412
192701,196306        0.60           0.54           0.37           0.41           0.30
196307,197412                       0.84           0.79           0.80           0.66
197501,198412                                      0.84           0.88           0.81
198501,199412                                                     0.91           0.85
199501,200412                                                                    0.83
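The common-holdings measure defined in the caption of Table 7 can be sketched for a single month as below. The function name and the representation of each 10 − 1 portfolio as a pair of sets (long names, short names) are assumptions made for illustration; in particular, "the number of stocks in the portfolios" is read here as long plus short positions.

```python
def common_holdings(long_a, short_a, long_b, short_b):
    """Share of holdings common to two long-short portfolios in one month.
    Each argument is a set of stock identifiers."""
    # Stocks held in the same direction in both portfolios.
    same_direction = len(long_a & long_b) + len(short_a & short_b)
    # Average number of stocks across the two portfolios (assumed: long + short names).
    avg_size = ((len(long_a) + len(short_a)) + (len(long_b) + len(short_b))) / 2
    return same_direction / avg_size

# Toy example: one shared long and one shared short out of four names per portfolio.
share = common_holdings({"A", "B"}, {"C", "D"}, {"A", "E"}, {"C", "F"})
```

The monthly values would then be averaged through time to obtain figures comparable to those under “10 − 1 Common Holdings”.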
Table 8
Columns are labeled by the fit period (superscript) of the 𝑀𝐿𝐸𝑅 sort variable; 𝑡-statistics are in parentheses. Blank cells correspond to the return period that coincides with the fit period.

Return Period    192701,196306  196307,197412  197501,198412  198501,199412  199501,200412  200501,201412
196307-197412        1.15                          1.35           1.46           1.43           0.82
                    (3.93)                        (3.49)         (3.57)         (3.62)         (1.85)
197501-198412        1.23           1.43                          1.19           1.32           0.58
                    (3.01)         (4.09)                        (3.18)         (3.47)         (1.41)
198501-199412        0.86           1.27           1.28                          1.21           1.05
                    (2.94)         (3.79)         (3.83)                        (3.16)         (2.57)
199501-200412        1.57           1.44           1.08           1.31                          0.52
                    (3.12)         (2.42)         (1.44)         (1.95)                        (0.62)
200501-201412        0.02          −0.18           0.02           0.20          −0.02
                    (0.06)        (−0.45)         (0.05)         (0.35)        (−0.04)
201501-202212        0.90           0.88           1.03           1.38           1.27           0.51
                    (2.38)         (1.73)         (1.73)         (2.00)         (1.82)         (0.70)

tween the monthly excess returns of each pair of 10 − 1 portfolios using only excess returns from months in the test period not used in the fitting process upon which either forecast is based. These correlations, shown under “10 − 1 Return Correlations” in Table 7, are once again large for all pairs of forecasts, but decrease as the time between the fitting periods increases.

The results of all three of these analyses suggest that while the forecasting function changes through time, this change is slow. Furthermore, the similarities between the forecasts generated by applying the ML process to data from 192701-196306 and 200501-201412 indicate that there is a substantial time-invariant component of the forecasting function.

To test whether the ML-based forecasts generated by fitting data from each given subperiod predict the cross-section of returns in different subperiods, in Table 8 we present the average excess returns of the 10 − 1 portfolios formed by sorting on each of 𝑀𝐿𝐸𝑅^{192701,196306}_{𝑖,𝑡}, 𝑀𝐿𝐸𝑅^{196307,197412}_{𝑖,𝑡}, 𝑀𝐿𝐸𝑅^{197501,198412}_{𝑖,𝑡}, 𝑀𝐿𝐸𝑅^{198501,199412}_{𝑖,𝑡}, 𝑀𝐿𝐸𝑅^{199501,200412}_{𝑖,𝑡}, and 𝑀𝐿𝐸𝑅^{200501,201412}_{𝑖,𝑡} in all subperiods excluding that used by the ML process to learn the forecasting function. While analyses that use a forecast based on a fit period subsequent to the period whose returns are examined are not reflective of obtainable trading profits, such analyses are valid statistical tests of the stability of the forecasting relation. The results show that the 10 − 1 portfolios formed by sorting on each of the ML-based forecasts generate economically large and in most cases statistically significant average excess returns in all subperiods except for 200501-201412, during which, as shown in Section 4.3, the 𝑀𝐿𝐸𝑅 10 − 1 portfolio performs poorly. Notably, the 𝑀𝐿𝐸𝑅^{192701,196306} 10 − 1 portfolio earns an average excess return of 0.90% per month (𝑡-statistic = 2.38) during the 201501-202212 subperiod. Thus, forecasts based on a fit of the data ending in 196306 contain strong predictive power more than 50 years later. Taken together, the
Table 9
Expanding-Window and Rolling-Window Forecast Portfolios. This table presents the results of portfolio analyses examining the
ability of ML-based forecasts based on expanding window (𝑀𝐿𝐸𝑅) and rolling-window (𝑀𝐿𝐸𝑅𝑅𝑜𝑙𝑙 ) fit periods to predict the
cross-section of future stock returns. With the exception of the sort variable for the portfolios formed by sorting on 𝑀𝐿𝐸𝑅𝑅𝑜𝑙𝑙 ,
the portfolio construction methodology is identical to that used to construct the portfolios examined in Table 3. The column
labeled “Sort Variable” indicates the variable used to sort stocks into portfolios. The columns labeled “1” through “10” present
results for decile portfolios 1 through 10 formed by sorting on the given variable. The column labeled “10 − 1” presents
results for the zero-cost long-short portfolio that is long the decile 10 portfolio and short the decile 1 portfolio. The rows
with “𝑟” and “SD” in the column labeled “Value” present the time-series averages and standard deviations, respectively, of
the monthly portfolio excess returns for each of the portfolios, reported in percent per month. The rows with “Sharpe” in the
“Value” column present the annualized Sharpe ratio of the given portfolio. The values in parentheses are 𝑡-statistics, calculated
following Newey and West (1987) using 12 lags, testing the null hypothesis that the average monthly excess return is equal to
zero. The analyses cover return months 𝑡 from July 1963 through December 2022, inclusive.
Sort Variable  Value      1       2       3       4       5       6       7       8       9      10    10 − 1
𝑀𝐿𝐸𝑅𝑅𝑜𝑙𝑙       𝑟       −0.13    0.30    0.43    0.56    0.60    0.63    0.72    0.65    0.76    0.84    0.96
                        (−0.43)  (1.21)  (1.83)  (2.68)  (3.32)  (3.46)  (4.43)  (4.01)  (4.60)  (4.96)  (4.24)
               SD        7.17    6.40    5.76    5.26    4.89    4.66    4.50    4.45    4.28    4.61    5.37
               Sharpe   −0.06    0.16    0.26    0.37    0.43    0.47    0.55    0.50    0.61    0.63    0.62
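The summary statistics in Table 9 can be related to one another with a short Python sketch. Scaling the monthly mean-to-SD ratio by the square root of 12 reproduces the tabulated Sharpe ratios to two decimals, which is consistent with (though not confirmed as) the annualization used in the table. The Newey and West (1987) 𝑡-statistic below is a generic Bartlett-kernel implementation for the mean of a series; the function names are hypothetical and the paper's exact computation may differ in small-sample details.

```python
import math
import numpy as np

def annualized_sharpe(mean_monthly, sd_monthly):
    # Monthly mean excess return over monthly SD, scaled by sqrt(12).
    return mean_monthly / sd_monthly * math.sqrt(12)

def newey_west_tstat(x, lags=12):
    """t-statistic for H0: mean(x) = 0 with a Bartlett-kernel long-run variance."""
    x = np.asarray(x, dtype=float)
    T = x.size
    e = x - x.mean()
    lrv = e @ e / T                       # lag-0 autocovariance
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)        # Bartlett weight
        lrv += 2.0 * w * (e[l:] @ e[:-l]) / T
    return x.mean() / math.sqrt(lrv / T)

# 10 - 1 column of Table 9: mean 0.96, SD 5.37.
sharpe_10_minus_1 = round(annualized_sharpe(0.96, 5.37), 2)
```

With the tabulated inputs, `sharpe_10_minus_1` evaluates to 0.62, matching the table.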
results in Tables 7 and 8 indicate a high degree of stability in the forecasting function, although it may also contain a slowly time-varying component.

The combination of time-invariant and slowly time-varying components of the forecasting relation suggests that there is a tradeoff to be made when deciding what data to use in the ML process. The time-invariant component will be better estimated when more data are used in the learning process. The time-varying component will be better estimated by using only more recent data. We investigate this tradeoff by comparing the performance of portfolios formed by sorting on our focal forecasts (𝑀𝐿𝐸𝑅), which use expanding-window fit periods, and forecasts based on forecasting functions generated from rolling-window fit periods (𝑀𝐿𝐸𝑅𝑅𝑜𝑙𝑙 ). 𝑀𝐿𝐸𝑅𝑅𝑜𝑙𝑙 for stock 𝑖 in month 𝑡 is defined as 𝑀𝐿𝐸𝑅^{192701,196306} for months 𝑡 in 196307-197412, 𝑀𝐿𝐸𝑅^{196307,197412} for months 𝑡 in 197501-198412, 𝑀𝐿𝐸𝑅^{197501,198412} for months 𝑡 in 198501-199412, 𝑀𝐿𝐸𝑅^{198501,199412} for months 𝑡 in 199501-200412, 𝑀𝐿𝐸𝑅^{199501,200412} for months 𝑡 in 200501-201412, and 𝑀𝐿𝐸𝑅^{200501,201412} for months 𝑡 in 201501-202212. If the tradeoff between the use of more data from a prolonged period and less data from a shorter but more recent time period favors the former, then the 𝑀𝐿𝐸𝑅 10 − 1 portfolio should outperform the 𝑀𝐿𝐸𝑅𝑅𝑜𝑙𝑙 10 − 1 portfolio, and vice versa.

Table 9 shows that the average monthly excess return and Sharpe ratio of the 𝑀𝐿𝐸𝑅𝑅𝑜𝑙𝑙 10 − 1 portfolio of 0.96% (𝑡-statistic = 4.24) and 0.62, respectively, are lower, but only by a small amount, than the corresponding values of 1.08% (𝑡-statistic = 5.51) and 0.78, respectively, for the 𝑀𝐿𝐸𝑅 10 − 1 portfolio. The results suggest that the tradeoff between using more data from a longer time period and less data from a shorter time period in the ML process favors, but only slightly, the use of more data from a longer time period.

5.2. Nonlinearity and interactions in the forecasting function

The results in Section 5.1 demonstrate that the forecasting relation is highly stable through time, but give no indication of its functional form. The main benefit of using neural networks to generate forecasts is that they are capable of discerning highly complex relations that may include nonlinear components and components based on interactions of the input variables. The challenge that comes with this benefit is that the complexities of the forecasting function are difficult, if not impossible, to succinctly describe. If the nonlinear and interaction components are important for prediction, it would indicate that traditional techniques such as linear regression, as well as ML methodologies that do not capture interaction effects, are suboptimal for return forecasting. If nonlinearities are important but interactions are unimportant, it would indicate that the components of the forecasting function relevant for prediction can be described as the summation of components related to each of the historical cumulative returns, which would substantially facilitate description of the forecasts. If neither nonlinearities nor interactions are important, then the components of the forecasting function that are relevant for cross-sectional prediction are simply the slope coefficients associated with each past return. Our next tests, therefore, examine the prevalence of nonlinear and interaction components in the forecasting function, and the degree to which such components are important for predicting the cross-section of future stock returns.

We begin by graphically examining the forecasting function for evidence of nonlinearities and interactions. Fig. 3 presents univariate partial-dependence plots of the relation between 𝑀𝐿𝐸𝑅 and 𝐶𝑅𝑘 , for 𝑘 ∈ {1, 2, … , 12}, based on a fit of the data from 192701-196306. The plotted values of 𝑀𝐿𝐸𝑅 are calculated by setting 𝐶𝑅𝑗 for each stock and each 𝑗 ∈ {1, 2, … , 12}, 𝑗 ≠ 𝑘, to its mean among all observations in the sample from 196307-202212. We focus on plots based on the fit using data ending in 196306 because these data are included in each of the expanding windows used to calculate 𝑀𝐿𝐸𝑅, and thus affect the forecasting function for all fit periods.20 The plots indicate that the forecasting function is highly nonlinear, and in most cases nonmonotonic, in the input variables. The plots for 𝐶𝑅2 , … , 𝐶𝑅10 are all highly nonmonotonic, and several of the plots indicate multiple sign changes for the slope of the relation between the given input variable and the forecast. Furthermore, the plots show that these instances of nonmonotonicity are not isolated to extreme values of the focal input variable, where the forecasting function may be unreliable due to a small number of observations with similar values in the data used in the learning process.

20 In Section XIII and Figure A4 of the Internet Appendix, we present similar plots for other fit periods.

To graphically illustrate interactions in the forecasting function, Fig. 4 presents examples of bivariate partial-dependence plots depicting the relation between 𝑀𝐿𝐸𝑅 and 𝐶𝑅𝑘 calculated using the 10th, 30th, 50th, 70th, and 90th percentile values of one other input variable, 𝐶𝑅𝑗 , with 𝐶𝑅𝑙 , 𝑙 ∈ {1, 2, … , 12}, 𝑙 ≠ 𝑗 and 𝑙 ≠ 𝑘, set to its mean, using the forecasting function based on a fit of the data from 192701-196306. The percentiles and mean values are based on all observations from 196307-
Fig. 3. Univariate Partial-Dependence Plots for ML-Based Forecasts. This figure presents partial dependence plots for the ML-based forecasting function. The function
whose partial dependence plots are shown is based on a fit of the data from 192701 through 196306. Each plot depicts the relation between one of 𝐶𝑅1 , … , 𝐶𝑅12 ,
indicated on the 𝑥-axis, and the value of 𝑀𝐿𝐸𝑅, indicated on the 𝑦-axis, with all other input variables set to their sample means based on all observations in the
sample from 196307 through 202212. The range of values included on the 𝑥-axis in each plot is −100 to the 99th percentile value of the variable shown on the
𝑥-axis, calculated using observations in the sample from 196307 through 202212.
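A curve of the kind described in this caption could be traced with the following Python sketch. The fitted forecasting function is represented by a hypothetical callable `f` that maps an array of input rows to forecasts; the paper's neural network is not reproduced here. Note that this follows the construction stated in the text (other inputs fixed at their sample means) rather than the alternative definition of partial dependence that averages predictions over the sample.

```python
import numpy as np

def univariate_partial_dependence(f, X, k, grid):
    """Evaluate f along `grid` for input column k, holding every other
    input at its sample mean computed from X (rows = observations)."""
    base = X.mean(axis=0)                       # sample means of all inputs
    points = np.tile(base, (len(grid), 1))      # one row per grid value
    points[:, k] = grid                         # vary only the focal input
    return f(points)

def caption_grid(X, k, n_points=50):
    # x-axis range used in the caption: -100 up to the 99th percentile of the focal input.
    return np.linspace(-100.0, np.percentile(X[:, k], 99), n_points)
```

A toy `f` with an interaction term, such as `lambda Z: Z[:, 0] ** 2 + Z[:, 0] * Z[:, 1]`, shows how the plotted curve depends on the values at which the other inputs are held.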
202212. If there are no interactions between 𝐶𝑅𝑗 and 𝐶𝑅𝑘 , the lines corresponding to the different percentiles of 𝐶𝑅𝑗 will be parallel.21

21 Plots for all pairs of input variables 𝐶𝑅𝑗 and 𝐶𝑅𝑘 are presented in Section XIV and Figure A5 of the Internet Appendix.

The left panel in Fig. 4 indicates strong interactions between 𝐶𝑅4 and 𝐶𝑅12 in the forecasting function. For high values of 𝐶𝑅12 (the 70th and 90th percentiles), 𝑀𝐿𝐸𝑅 is decreasing in 𝐶𝑅4 for values of 𝐶𝑅4 below and slightly above zero, and then increasing for higher values of 𝐶𝑅4 . When 𝐶𝑅12 is taken at its 30th or 50th percentile values, the slopes of 𝑀𝐿𝐸𝑅 take the opposite signs. The middle panel illustrates substantial interactions between 𝐶𝑅9 and 𝐶𝑅10 . When 𝐶𝑅10 is taken at its 30th (70th) percentile value, for values of 𝐶𝑅9 between approximately −50 and 0, 𝑀𝐿𝐸𝑅 is decreasing (increasing). Finally, the right panel shows an example where interactions are much less severe. The slopes of the relation between 𝑀𝐿𝐸𝑅 and 𝐶𝑅10 are quite similar for all depicted percentiles of 𝐶𝑅4 .

To formally assess the amount of cross-sectional variation in 𝑀𝐿𝐸𝑅 that is driven by the nonlinear and interaction components, we conduct a Fama and MacBeth (1973, FM hereafter) regression analysis of the relation between 𝑀𝐿𝐸𝑅 and 𝐶𝑅1 , … , 𝐶𝑅12 . Specifically, each month 𝑡, we run a cross-sectional OLS regression of 𝑀𝐿𝐸𝑅 on 𝐶𝑅1 , … , 𝐶𝑅12 . Panel A of Table 10 presents the time-series averages of the monthly cross-sectional coefficients and adjusted 𝑅2 values. The average adjusted 𝑅2 of these regressions is 37.55%, indicating that only slightly more than a third of the total variation in 𝑀𝐿𝐸𝑅 is explained by a linear combination of 𝐶𝑅1 , … , 𝐶𝑅12 , and therefore that nonlinear and interaction components of the forecasting function account for most of the cross-sectional variation in the forecasts.

To test the importance of the nonlinearities and interactions for predicting future stock returns, we once again conduct FM regressions, this time using the future excess stock return as the dependent variable and 𝑀𝐿𝐸𝑅 and 𝐶𝑅1 , … , 𝐶𝑅12 as independent variables. The slope coefficient on 𝑀𝐿𝐸𝑅 measures the relation between future stock returns and the component of 𝑀𝐿𝐸𝑅 that is linearly orthogonal to the other independent variables, namely 𝐶𝑅1 , … , 𝐶𝑅12 , and thus can be used to assess the importance of the nonlinear and interaction components of 𝑀𝐿𝐸𝑅 in prediction. We note, however, that our use of linear regression here assumes a linear relation between the component of 𝑀𝐿𝐸𝑅 that is orthogonal to 𝐶𝑅1 , … , 𝐶𝑅12 and future stock returns. Since our ML model uses the normalized excess stock return as the dependent variable, there is no strong reason to believe that the relation between this orthogonal component of 𝑀𝐿𝐸𝑅 and future (not normalized) excess stock returns is linear. While this may make economic interpretation of the magnitude of the slope coefficient difficult, a statistically significant coefficient on 𝑀𝐿𝐸𝑅 would be strong evidence that
Fig. 3. (continued)
the nonlinear and/or interaction components of the ML-based forecasts are important for prediction. In addition to equal-weighted (i.e. OLS) regressions, for consistency with our value-weighted portfolio analyses, we also run the FM regression analysis using value-weighted (i.e. WLS) regressions with market capitalization as the weight variable. The results in Panel B of Table 10 show that the average coefficient on 𝑀𝐿𝐸𝑅 of 5.95 (𝑡-statistic = 9.06) in the equal-weighted regressions and 2.81 (𝑡-statistic = 5.47) in the value-weighted regressions are both highly statistically significant, indicating that nonlinearities and/or interactions in the forecasting function are important for future stock return prediction.

We next endeavor to discern between nonlinear and interaction components of the ML-based forecasts. To investigate the amount of cross-sectional variation in 𝑀𝐿𝐸𝑅 that is associated with the interaction component of the forecasts, we run an FM regression analysis using 𝑀𝐿𝐸𝑅 as the dependent variable and 𝐶𝑅1 , … , 𝐶𝑅12 as well as terms that capture the nonlinear components of the forecasts as independent variables. To capture the nonlinear components, we define 𝑀𝐿𝐸𝑅𝐶𝑅𝑘 to be the value of the forecasting function evaluated using the true value of 𝐶𝑅𝑘 with values of 𝐶𝑅𝑗 , for 𝑗 ∈ {1, … , 12} and 𝑗 ≠ 𝑘, set to their mean values among all observations in the given month.22 𝑀𝐿𝐸𝑅𝐶𝑅𝑘 captures the nonlinear component of the forecasting function with respect to 𝐶𝑅𝑘 . The component of 𝑀𝐿𝐸𝑅 that cannot be captured by a linear combination of the 𝐶𝑅𝑘 and 𝑀𝐿𝐸𝑅𝐶𝑅𝑘 , 𝑘 ∈ {1, 2, … , 12}, therefore, must be driven by interactions in the forecasting function. Panel C of Table 10 shows that the average adjusted 𝑅2 from these cross-sectional regressions is 55.36%, which is substantially higher than the 37.55% adjusted 𝑅2 from the regressions that used only 𝐶𝑅1 , … , 𝐶𝑅12 as independent variables, but still far from a perfect 100%. This indicates that while a non-trivial portion of variation in 𝑀𝐿𝐸𝑅 is driven by the linear and nonlinear components of the forecasting function, there is also a substantial component, represented by the unexplained variation in the regression, that is attributable to interaction effects.

22 If the forecasting function has no interaction components, then the choice of values for 𝐶𝑅𝑗 will have no impact on our results. Nonetheless, in Section XV and Table A11 of the Internet Appendix, we conduct similar analyses using zero, instead of the monthly mean values, for the 𝐶𝑅𝑗 when calculating 𝑀𝐿𝐸𝑅𝐶𝑅𝑘 , and find that this has no qualitative impact on our results.

To investigate the importance of the interaction component of the ML-based forecasts for predicting the cross-section of future stock returns, we run equal-weighted and value-weighted FM regression analyses using the future excess stock return as the dependent variable and 𝐶𝑅1 , … , 𝐶𝑅12 , 𝑀𝐿𝐸𝑅𝐶𝑅1 , … , 𝑀𝐿𝐸𝑅𝐶𝑅12 , and 𝑀𝐿𝐸𝑅 as independent variables. In these regressions, the coefficient on 𝑀𝐿𝐸𝑅 measures the relation between the interaction component of the ML-based forecasts and future stock returns. Panel D of Table 10 shows that for both equal-weighted and value-weighted regressions, the average slope coefficient of 𝑀𝐿𝐸𝑅 is positive and highly significant, indicating that the interaction components of the forecasting function play an important role in prediction.
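The equal-weighted FM procedure used throughout this section can be sketched in Python as follows: one cross-sectional OLS regression per month, followed by time-series averages of the coefficients and adjusted 𝑅2 values. The function name and data layout (one array pair per month) are assumptions for illustration, and the value-weighted (WLS) variant and Newey and West (1987) 𝑡-statistics reported in Table 10 are omitted.

```python
import numpy as np

def fama_macbeth(y_by_month, X_by_month):
    """y_by_month: list of (n_t,) arrays; X_by_month: list of (n_t, p) arrays.
    Returns (average coefficients incl. intercept, average adjusted R^2)."""
    coefs, adj_r2 = [], []
    for y, X in zip(y_by_month, X_by_month):
        n, p = X.shape
        Z = np.column_stack([np.ones(n), X])          # add intercept
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)  # cross-sectional OLS
        resid = y - Z @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        adj_r2.append(1.0 - (1.0 - r2) * (n - 1) / (n - p - 1))
        coefs.append(beta)
    return np.mean(coefs, axis=0), float(np.mean(adj_r2))
```

For a Panel A style regression, `y_by_month` would hold the forecasts and `X_by_month` the twelve cumulative returns; for Panel C, the columns of `X_by_month` would be extended with the twelve single-variable forecasts.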
Our next tests examine the extent to which the predictive power of
the ML-based forecasts is related to the momentum and reversal effects.
Our measure of momentum, 𝑀𝑜𝑚, is the cumulative stock return dur-
ing the 11-month period covering months 𝑡 − 12 through 𝑡 − 2, inclusive
(skipping month 𝑡 − 1). We measure reversal with 𝑅𝑒𝑣, defined as the
stock return in month 𝑡 − 1. While the asset pricing literature has doc-
umented hundreds of variables that predict future stock returns, these
two effects are of particular relevance here because both 𝑀𝑜𝑚 and 𝑅𝑒𝑣
can be discerned from historical price plots.23 In fact, both 𝑀𝑜𝑚 and
𝑅𝑒𝑣 are very simple functions of the cumulative returns 𝐶𝑅1 , … , 𝐶𝑅12
that are the inputs to the forecasting function. Specifically, 𝑀𝑜𝑚 =
𝐶𝑅11 and 𝑅𝑒𝑣 = 100[(𝐶𝑅12 ∕100 + 1)∕(𝐶𝑅11 ∕100 + 1) − 1].24 It is there-
fore highly plausible that the predictive power of 𝑀𝐿𝐸𝑅 is, at least in
part, driven by the predictive power of 𝑀𝑜𝑚 and 𝑅𝑒𝑣.
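The two identities above translate directly into code. The sketch below takes 𝐶𝑅11 and 𝐶𝑅12 in percent, as in the formula; the function name and the toy numbers are illustrative only.

```python
def mom_rev(cr11, cr12):
    """Momentum and reversal (in percent) from cumulative returns CR11 and CR12 (in percent)."""
    mom = cr11
    # Back out the month t-1 return from the ratio of the two gross cumulative returns.
    rev = 100.0 * ((cr12 / 100.0 + 1.0) / (cr11 / 100.0 + 1.0) - 1.0)
    return mom, rev

# Toy example: a 10% cumulative return through month t-2 followed by a 5% return in month t-1
# gives a cumulative return of 100 * (1.10 * 1.05 - 1) = 15.5% through month t-1.
mom, rev = mom_rev(10.0, 15.5)
```

Here `mom` is 10.0 and `rev` is 5.0 up to floating-point rounding.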
In addition to 𝑀𝑜𝑚 and 𝑅𝑒𝑣, we calculate ML-based forecasts,
which we denote 𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 , that use only 𝑀𝑜𝑚 and 𝑅𝑒𝑣 as input
variables. The ML model used to calculate 𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 is identical
to that used to calculate 𝑀𝐿𝐸𝑅, except for the use of 𝑀𝑜𝑚 and 𝑅𝑒𝑣,
instead of 𝐶𝑅1 , … , 𝐶𝑅12 , as input variables. Our objective in calcu-
lating 𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 is to capture any nonlinearities and interaction
effects in the predictive power of 𝑀𝑜𝑚 and 𝑅𝑒𝑣. In Section XVI and
Table A12 of the Internet Appendix, we construct decile portfolios by
sorting on each of 𝑀𝑜𝑚, 𝑅𝑒𝑣, and 𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 , and find that all
three variables predict the cross-section of future stock returns in our
test period sample.
Table 10
Nonlinearities and Interactions in the Forecasting Function. This table presents the results of Fama and MacBeth (1973)
regression analyses examining nonlinearity and interactions of the ML-based forecasts 𝑀𝐿𝐸𝑅. Each month 𝑡 we run a
cross-sectional regression of the dependent variable on one or more independent variables. The results in Panel A are for
regressions with 𝑀𝐿𝐸𝑅 as the dependent variable and 𝐶𝑅1 , … , 𝐶𝑅12 as independent variables. The results in Panel B
are for regressions with the future excess stock return as the dependent variable and 𝑀𝐿𝐸𝑅 and 𝐶𝑅1 , … , 𝐶𝑅12 as inde-
pendent variables. The results in Panel C are for regressions with 𝑀𝐿𝐸𝑅 as the dependent variable and 𝐶𝑅1 , … , 𝐶𝑅12 ,
and 𝑀𝐿𝐸𝑅𝐶𝑅1 , … , 𝑀𝐿𝐸𝑅𝐶𝑅12 as independent variables. The results in Panel D are for regressions with the future ex-
cess stock return as the dependent variable and 𝑀𝐿𝐸𝑅, 𝐶𝑅1 , … , 𝐶𝑅12 , and 𝑀𝐿𝐸𝑅𝐶𝑅1 , … , 𝑀𝐿𝐸𝑅𝐶𝑅12 as independent
variables. 𝑀𝐿𝐸𝑅𝐶𝑅𝑘 is the value of the ML-based forecast calculated using 𝐶𝑅𝑘 and the within-month means of 𝐶𝑅𝑗 for
𝑗 ∈ {1, … , 12} and 𝑗 ≠ 𝑘, as the values of the input variables, and therefore captures the component of the ML-based forecast
that is nonlinear in 𝐶𝑅𝑘 . In Panels B and D, the sections with “EW” and “VW” in the “Weight” column present results for
equal-weighted and market capitalization-weighted regressions, respectively. The table presents the time-series averages of
the monthly cross-sectional coefficients, along with 𝑡-statistics, calculated following Newey and West (1987) using 12 lags,
testing the null hypothesis that the average coefficient is equal to zero (in parentheses). The column labeled “Adj. 𝑅2 ” in
Panels A and C presents the time-series average of adjusted 𝑅2 values from the monthly cross-sectional regressions. In Panels
A and C, we report the estimated slope coefficients times 100. The analyses cover sample months 𝑡 from July 1963 through
December 2022, inclusive.
Panel A: FM Regressions of 𝑀𝐿𝐸𝑅 on 𝐶𝑅1 , … , 𝐶𝑅12
To construct bivariate portfolios, each month 𝑡 we sort stocks into five groups based on an ascending ordering of the control variable. We then sort all stocks within each control variable-based group into five portfolios based on an ascending ordering of 𝑀𝐿𝐸𝑅. We take the month 𝑡 excess return for each of the resulting 25 portfolios to be the value-weighted average month 𝑡 excess return of all stocks in the portfolio. To create a single portfolio for each 𝑀𝐿𝐸𝑅 quintile, we define the month 𝑡 excess return of the bivariate 𝑀𝐿𝐸𝑅 quintile 𝑘 portfolio to be the equal-weighted average of the month 𝑡 excess returns of the five 𝑀𝐿𝐸𝑅 quintile 𝑘 portfolios (one for each control variable quintile). Finally, we define the month 𝑡 bivariate 𝑀𝐿𝐸𝑅 5 − 1 portfolio excess return to be the month 𝑡 excess return of the bivariate 𝑀𝐿𝐸𝑅 quintile five portfolio minus that of the bivariate 𝑀𝐿𝐸𝑅 quintile one portfolio.

We construct the trivariate portfolios in a similar manner, except before sorting on 𝑀𝐿𝐸𝑅 we sort on both 𝑀𝑜𝑚 and 𝑅𝑒𝑣. Specifically, each month 𝑡 we sort the stocks in our sample into 25 groups based on the intersections of five 𝑀𝑜𝑚 groups and five 𝑅𝑒𝑣 groups con-
Table 10 (continued)
Panel D: FM Regressions of Future Excess Returns on 𝑀𝐿𝐸𝑅, 𝐶𝑅1 , … , 𝐶𝑅12 and 𝑀𝐿𝐸𝑅𝐶𝑅1 , … , 𝑀𝐿𝐸𝑅𝐶𝑅12
Table 12
Multivariate Portfolio Analysis - Control for Momentum and Reversal. This table presents the
results of univariate, bivariate, and trivariate portfolio analyses examining the ability of the
ML-based forecasts to predict the cross-section of future stock returns after controlling for mo-
mentum and reversal. The procedure used to generate the univariate portfolios is identical to
that used to generate the portfolios in Table 3, except we create five quintile portfolios instead
of 10 decile portfolios. The bivariate portfolios are constructed by sorting all stocks into quin-
tiles based on a control variable, either 𝑀𝑜𝑚, 𝑅𝑒𝑣, or 𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 , and then within each
control variable quintile, into five 𝑀𝐿𝐸𝑅 portfolios. The trivariate portfolios are formed by
independently sorting stocks into quintiles of 𝑀𝑜𝑚 and 𝑅𝑒𝑣, and then sorting stocks in each
of the 25 groups formed by the intersections of the 𝑀𝑜𝑚 and 𝑅𝑒𝑣 quintiles, into 𝑀𝐿𝐸𝑅 quin-
tiles. Breakpoints for all sorts are calculated using only NYSE-listed stocks. Values of 𝑀𝐿𝐸𝑅,
𝑀𝑜𝑚, 𝑅𝑒𝑣, and 𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 are calculated as of the end of month 𝑡 − 1. The month 𝑡 ex-
cess return of each of the resulting portfolios is taken to be the market capitalization-weighted
average month 𝑡 excess return of all stocks in the given portfolio, with market capitalization
calculated as of the end of month 𝑡 − 1. For the bivariate (trivariate) portfolios, the month
𝑡 excess return of the 𝑀𝐿𝐸𝑅 quintile 𝑘 portfolio is taken to be the equal-weighted aver-
age, across all quintiles of the control variable (across all 25 𝑀𝑜𝑚 and 𝑅𝑒𝑣 groups), of the
𝑀𝐿𝐸𝑅 quintile 𝑘 portfolio. Finally, the 𝑀𝐿𝐸𝑅 5 − 1 portfolio excess return is taken to be
the difference between the 𝑀𝐿𝐸𝑅 quintile five and quintile one portfolio excess returns.
The table presents the time-series averages of the monthly portfolio excess returns for each of
the 𝑀𝐿𝐸𝑅 quintile portfolios. The column labeled “Control Variable(s)” indicates the con-
trol variable(s) used to construct the portfolios. The columns labeled “𝑀𝐿𝐸𝑅 1” through
“𝑀𝐿𝐸𝑅 5” and “𝑀𝐿𝐸𝑅 5 − 1” present the average monthly excess portfolio returns along
with 𝑡-statistics, calculated following Newey and West (1987) using 12 lags, testing the null
hypothesis that the average monthly excess return of the given portfolio is equal to zero. All
excess returns are reported in percent per month. The analyses cover return months 𝑡 from
July 1963 through December 2022, inclusive.
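The bivariate procedure described in this caption can be sketched for a single month in Python. The function names and array-based inputs are hypothetical, breakpoints are taken at the 20th, 40th, 60th, and 80th percentiles of NYSE-listed stocks using NumPy's default percentile interpolation, and the sketch assumes every control-variable group contains NYSE-listed stocks; none of these implementation details is confirmed by the paper.

```python
import numpy as np

def quintile(values, breakpoint_sample):
    # Assign quintiles 0..4 using breakpoints computed from the breakpoint sample.
    cuts = np.percentile(breakpoint_sample, [20, 40, 60, 80])
    return np.searchsorted(cuts, values, side="left")

def bivariate_5_minus_1(control, mler, ret, mktcap, nyse):
    """One month of the dependent-sort analysis. All inputs are 1-D arrays over
    stocks; `nyse` is a boolean mask; `mktcap` is lagged market capitalization."""
    cq = quintile(control, control[nyse])             # sort on the control variable first
    port = np.full((5, 5), np.nan)
    for c in range(5):
        in_c = cq == c
        mq = quintile(mler[in_c], mler[in_c & nyse])  # then on MLER within the group
        for k in range(5):
            sel = mq == k
            if sel.any():
                # Value-weighted excess return of the (control c, MLER k) portfolio.
                port[c, k] = np.average(ret[in_c][sel], weights=mktcap[in_c][sel])
    by_mler = np.nanmean(port, axis=0)                # equal-weighted across control groups
    return by_mler[4] - by_mler[0]
```

Repeating this each month and averaging the resulting series would give a figure comparable to the “𝑀𝐿𝐸𝑅 5 − 1” column for a given control variable.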
𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 and the trivariate portfolio analysis that controls for 𝑀𝑜𝑚 and 𝑅𝑒𝑣 is not surprising because the trivariate portfolio analysis captures both nonlinearities and interaction effects in the ability of 𝑀𝑜𝑚 and 𝑅𝑒𝑣 to predict the cross-section of future stock returns. As with the univariate portfolios, the bivariate and trivariate portfolios exhibit monotonically increasing average excess returns across the 𝑀𝐿𝐸𝑅 quintiles.

The results of the multivariate portfolio analyses show that while momentum and reversal contribute to 𝑀𝐿𝐸𝑅’s predictive power, 𝑀𝐿𝐸𝑅 also has substantial predictive power that is unrelated to these effects. In Section XVII and Table A13 of the Internet Appendix we show that FM regression analyses examining the ability of 𝑀𝐿𝐸𝑅 to predict future stock returns after controlling for 𝑀𝑜𝑚, 𝑅𝑒𝑣, and 𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 lead to the same conclusion.

5.4. Image-based forecasts

In addition to momentum and reversal, there is some evidence of other technical signals that predict future stock returns. Most notably, Jiang et al. (2022, JKX hereafter) show that forecasts generated by applying a convolutional neural network to images depicting past price and volume data are informative about future returns. We therefore examine whether controlling for JKX’s forecasts explains the predictive power of 𝑀𝐿𝐸𝑅. Specifically, we control for each of JKX’s forecasts that are designed to predict one-month-ahead stock returns. These forecasts are based on charts of 60 days (𝐼60∕𝑅20), 20 days (𝐼20∕𝑅20), and 5 days (𝐼5∕𝑅20) of past daily return and volume data.25 The forecasts are available for sample months 𝑡 from 200102-202001. All tests that use these variables cover this period.

25 We are very grateful to Bryan Kelly and Dacheng Xiu for sharing their stock, month-level forecasts.

5.4.1. Relations between ML-based forecasts and image-based forecasts

We begin by examining the strength of the relation between the JKX variables and 𝑀𝐿𝐸𝑅 by running FM regression analyses with 𝑀𝐿𝐸𝑅 as the dependent variable and combinations of 𝐼60∕𝑅20, 𝐼20∕𝑅20, and 𝐼5∕𝑅20 as independent variables. Table 13 shows that the average adjusted 𝑅2 s from these cross-sectional regressions are 7.23%, 2.43%, and 1.07% when only 𝐼60∕𝑅20, 𝐼20∕𝑅20, and 𝐼5∕𝑅20, respectively, are included as independent variables. When all three JKX variables are included in the regression specification, the average adjusted 𝑅2 is 7.86%. In all specifications, the average slope coefficient on each of the JKX variables is significantly positive. The results therefore indicate that while there is some commonality in the forecasts, most of the variation in 𝑀𝐿𝐸𝑅 is unexplained by the JKX variables. The stronger relation between 𝑀𝐿𝐸𝑅 and 𝐼60∕𝑅20 than between 𝑀𝐿𝐸𝑅 and either 𝐼20∕𝑅20 or 𝐼5∕𝑅20 is likely due to the greater overlap in the past data used to generate these forecasts.
26 In Section XVIII and Table A14 of the Internet Appendix, we repeat our tests using factors defined as the excess return of the long-short portfolio constructed from decile portfolios formed by sorting on the JKX variables and find no qualitative change in the results of our analyses.
27 Other papers with short sample periods (e.g. Martin (2017)) face similar issues of low statistical power.
28 The 𝑀𝐿𝐸𝑅 10 − 1 portfolio produces an average excess return greater than 0.82% per month in 397 of 487 consecutive 19-year subperiods during the full 196307-202212 sample period.
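The subperiod count in footnote 28 can be checked with simple month arithmetic. The sketch below is illustrative only; the helper name `months_between` is hypothetical, and it assumes that "consecutive 19-year subperiods" means every 228-month window that starts in a different month of the sample.

```python
def months_between(start_yyyymm, end_yyyymm):
    """Number of months from start to end, inclusive, for YYYYMM integers."""
    start_year, start_month = divmod(start_yyyymm, 100)
    end_year, end_month = divmod(end_yyyymm, 100)
    return (end_year - start_year) * 12 + (end_month - start_month) + 1

total_months = months_between(196307, 202212)  # full test period
window = 19 * 12                               # one 19-year subperiod in months
n_windows = total_months - window + 1          # rolling windows that fit

print(total_months, window, n_windows)  # 714 228 487
```

Under this reading the count of 487 in footnote 28 follows directly, and the 200102-202001 period examined in Table 14 is itself one such 228-month window.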
Table 14
Portfolio Analysis - 200102-202001. This table presents the results of a portfolio analysis examining the ability of 𝑀𝐿𝐸𝑅 to
predict the cross-section of future stock returns during the 200102-202001 period. The analysis is identical to that whose results
are shown in Table 3 except that the results presented in this table are for return months 𝑡 from February 2001 through January
2020, inclusive.
         𝑀𝐿𝐸𝑅 portfolio
Value    1        2       3       4       5       6       7       8       9       10      10 − 1
𝑟        −0.02    0.41    0.50    0.56    0.52    0.71    0.68    0.58    0.68    0.80    0.82
         (−0.03)  (0.86)  (1.22)  (1.46)  (1.56)  (2.58)  (2.37)  (1.90)  (2.36)  (2.32)  (1.72)
SD       7.52     6.21    5.57    5.30    4.44    4.22    4.00    3.98    3.88    4.50    5.75
Sharpe   −0.01    0.23    0.31    0.36    0.40    0.59    0.59    0.51    0.61    0.62    0.49
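The Sharpe ratios reported in Table 14 appear to be consistent, up to rounding of the inputs, with annualizing the monthly mean-to-standard-deviation ratio by the square root of 12. This convention is inferred from the reported numbers rather than stated in the table, and the function name below is hypothetical.

```python
import math

def annualized_sharpe(mean_monthly_pct, sd_monthly_pct, periods_per_year=12):
    """Scale a per-period Sharpe ratio by sqrt(periods per year)."""
    return mean_monthly_pct / sd_monthly_pct * math.sqrt(periods_per_year)

# Inputs taken from the 10 - 1 column of Table 14 (percent per month).
sharpe_long_short = annualized_sharpe(0.82, 5.75)
print(round(sharpe_long_short, 2))  # 0.49
```

Because the tabulated means and standard deviations are themselves rounded, a few columns may differ from the reported Sharpe ratio by 0.01 when recomputed this way.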
Table 15
Alphas Using Image-Based Forecast Factors. This table presents the results of factor model regressions of the excess
returns of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio on the excess returns of factors in different factor models. The 𝑀𝐿𝐸𝑅 10 − 1
portfolio is the same as that whose average excess returns are shown in Table 3. The excess return of the 𝐹𝑋 factor, for
𝑋 ∈ {𝐼60∕𝑅20, 𝐼20∕𝑅20, 𝐼5∕𝑅20}, is that of the long-short portfolio formed by sorting on both market capitalization
and 𝑋 , constructed using a methodology, described in Section 5.4.2, similar to that of Fama and French (1993). The
column labeled “Model” indicates the factor model, where “Base Model” refers to the model indicated in the header
of columns other than the “Model” and “Value” columns and “Base Model+ 𝐹𝐼60∕𝑅20 + 𝐹𝐼20∕𝑅20 + 𝐹𝐼5∕𝑅20 ” refers to a
model that augments the factors in the model shown in the column header with 𝐹𝐼60∕𝑅20 , 𝐹𝐼20∕𝑅20 , and 𝐹𝐼5∕𝑅20 . The rows
with “𝛼 ” and “Adj. 𝑅2 ” in the “Value” column present the intercept and adjusted 𝑅2 , respectively, of the regression.
The rows with “𝛽𝐹𝑋 ” in the “Value” column present the slope coefficient on 𝐹𝑋 from the regression. Alphas are reported
in percent per month. The values in parentheses are 𝑡-statistics, calculated following Newey and West (1987) using 12
lags, testing the null hypothesis that the average monthly alpha or slope coefficient is equal to zero. The analyses cover
return months 𝑡 from February 2001 through January 2020, inclusive.
Base Model+ 𝐹𝐼60∕𝑅20 + 𝐹𝐼20∕𝑅20 + 𝐹𝐼5∕𝑅20 𝛼 1.01 1.10 0.67 0.61 0.72 0.47
(2.79) (3.09) (2.30) (2.44) (2.64) (1.36)
𝛽𝐹𝐼60∕𝑅20 1.20 1.20 0.18 0.31 0.83 0.38
(3.29) (3.31) (0.67) (1.41) (1.83) (1.39)
𝛽𝐹𝐼20∕𝑅20 −0.33 −0.39 0.18 0.19 −0.31 −0.11
(−1.10) (−1.32) (0.70) (0.73) (−1.08) (−0.42)
𝛽𝐹𝐼5∕𝑅20 −0.23 −0.28 −0.12 −0.28 −0.22 −0.23
(−0.78) (−0.93) (−0.51) (−1.47) (−0.68) (−0.98)
Adj. 𝑅2 27.80% 29.82% 56.94% 65.61% 33.70% 50.70%
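The 𝑡-statistics in these tables follow Newey and West (1987) with 12 lags. As a minimal sketch of that adjustment for the simplest case, a test that a time-series mean is zero, the function below uses Bartlett weights on the sample autocovariances. The function name and the choice to divide autocovariances by the sample size are assumptions made for illustration; the regression case in Table 15 applies the same idea to the coefficient covariance matrix.

```python
import math

def newey_west_tstat(x, lags=12):
    """t-statistic for H0: mean(x) == 0 with a Newey-West long-run variance."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    # Lag-0 autocovariance, divided by n as is common for HAC estimators.
    long_run_var = sum(d * d for d in dev) / n
    for lag in range(1, min(lags, n - 1) + 1):
        gamma = sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n
        weight = 1.0 - lag / (lags + 1.0)  # Bartlett kernel weight
        long_run_var += 2.0 * weight * gamma
    return mean / math.sqrt(long_run_var / n)

# Hypothetical, positively autocorrelated monthly returns (percent).
returns = [2.0, 2.0, 2.0, 0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 0.0, 0.0, 0.0]
t_plain = newey_west_tstat(returns, lags=0)     # no autocorrelation adjustment
t_adjusted = newey_west_tstat(returns, lags=1)  # one-lag adjustment
print(t_plain > t_adjusted)  # True
```

With positive autocorrelation the adjusted 𝑡-statistic is smaller than the unadjusted one, which is why the adjustment matters for overlapping or persistent return series.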
each of 𝐹𝐼60∕𝑅20, 𝐹𝐼20∕𝑅20, and 𝐹𝐼5∕𝑅20 are also insignificant in the augmented FF5 and Q factor models.
In sum, the results in Table 15 support the conclusion that the predictive power of 𝑀𝐿𝐸𝑅 remains strong after controlling for the image-based forecasts of JKX.

5.4.3. Additional tests that control for image-based forecasts
We conduct several additional tests to further investigate whether the predictive power of 𝑀𝐿𝐸𝑅 persists after controlling for the image-based forecasts of JKX. Our first such tests are bivariate portfolio analyses. The results of these tests, presented in Section XIX and Table A15 of the Internet Appendix, demonstrate that portfolios constructed to be neutral to one of the JKX variables while taking long (short) positions in stocks with high (low) values of 𝑀𝐿𝐸𝑅 generate economically large and highly statistically significant average excess returns. These tests provide further evidence that the predictive power of 𝑀𝐿𝐸𝑅 is distinct from that of 𝐼60∕𝑅20, 𝐼20∕𝑅20, and 𝐼5∕𝑅20.
We next investigate the ability of the JKX variables to explain the predictive power of 𝑀𝐿𝐸𝑅 using FM regression analyses with the one-month-ahead excess stock return as the dependent variable and combinations of 𝑀𝐿𝐸𝑅, 𝐼60∕𝑅20, 𝐼20∕𝑅20, 𝐼5∕𝑅20, 𝑀𝑜𝑚, 𝑅𝑒𝑣, and 𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 as independent variables. These analyses allow us to simultaneously control for all of the JKX variables, as well as momentum and reversal. The results of these tests, presented in Section XX and Table A16 of the Internet Appendix, show that regardless of which combination of the JKX variables, 𝑀𝑜𝑚, 𝑅𝑒𝑣, and 𝑀𝐿𝐸𝑅𝑀𝑜𝑚,𝑅𝑒𝑣 is included as controls in the regression specification, the average coefficient on 𝑀𝐿𝐸𝑅 remains positive and highly statistically significant.
Our next tests investigate whether the predictive power of 𝑀𝐿𝐸𝑅 persists after accounting for any potential nonlinearity in the relation between the JKX variables and 𝑀𝐿𝐸𝑅. Including the JKX variables as controls in FM regression analyses may not effectively control for the relation between 𝑀𝐿𝐸𝑅 and these variables if the relation between the JKX variables and 𝑀𝐿𝐸𝑅 is not linear. To assess the degree of nonlinearity in the relations between these variables, we begin by generating plots of 𝑀𝐿𝐸𝑅 and the JKX variables. These plots, shown in Figure A6 of the Internet Appendix, provide no evidence of nonlinear relations between 𝑀𝐿𝐸𝑅 and the JKX variables. We then run FM regression analyses that include, in addition to the untransformed JKX variables, the natural log, square root, and square of 𝐼60∕𝑅20, 𝐼20∕𝑅20, and 𝐼5∕𝑅20 as independent variables in the regressions.29 High correlations between the untransformed and transformed versions of the JKX variables cause multicollinearity challenges with these regressions that make it difficult to interpret the slope coefficients on the JKX variables. These issues do not affect the average slope coefficients on 𝑀𝐿𝐸𝑅, which remain positive and highly statistically significant in all such tests (untabulated). Finally, to address the concern of nonlinearity while overcoming the challenges associated with multicollinearity in the FM regression analyses, we use feed-forward neural networks to create forecasts of 𝑀𝐿𝐸𝑅 based on the JKX variables, and include these forecasts, which we denote 𝐸[𝑀𝐿𝐸𝑅|JKX], as a control in FM regression analyses. The calculation of 𝐸[𝑀𝐿𝐸𝑅|JKX] is described in detail in Section XX of the Internet Appendix. The results of tests using 𝐸[𝑀𝐿𝐸𝑅|JKX] are shown in Tables A17 and A18 of the Internet Appendix. In all specifications, the average slope coefficient on 𝑀𝐿𝐸𝑅 remains positive and highly statistically significant after controlling for 𝐸[𝑀𝐿𝐸𝑅|JKX]. These tests, therefore, provide additional evidence that the predictive power of 𝑀𝐿𝐸𝑅 is distinct from that of the image-based forecasts of JKX.
Our final tests investigating whether the predictive power of 𝑀𝐿𝐸𝑅 persists after controlling for the JKX forecasts use the omitted factor methodology of Giglio and Xiu (2021, GX hereafter). As described in Section XXI and Table A19 of the Internet Appendix, we use the GX methodology to estimate the risk premium of the univariate 𝑀𝐿𝐸𝑅 10 − 1 portfolio based on a large set of base test assets that does not include portfolios formed by sorting on the JKX variables (JKX portfolios hereafter), and then again with an augmented set of test assets that includes JKX portfolios. We find that adding the JKX portfolios to the set of test assets has almost no effect on the estimate of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio’s risk premium, indicating that the JKX variables do not provide incremental ability to explain the predictive power of 𝑀𝐿𝐸𝑅 relative to the variables underlying the base set of test assets. Furthermore, regardless of whether the set of assets used in the GX methodology includes or excludes the JKX portfolios, the estimates of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio’s risk premium are much smaller than the average excess return of the 𝑀𝐿𝐸𝑅 10 − 1 portfolio and statistically insignificant, indicating that the predictive power of 𝑀𝐿𝐸𝑅 is not subsumed by a combination of the variables used in the construction of the test assets.
Summarily, the results of these portfolio analyses, FM regression analyses, and analyses using the GX methodology, indicate that the ability of 𝑀𝐿𝐸𝑅 to predict the cross-section of future stock returns remains strong after controlling for the image-based forecasts of JKX, momentum, and reversal.

5.5. Technical signals
We next examine whether the predictive power of our ML-based forecasts is subsumed by previously-studied technical trading signals. The technical signals we use come from Neely et al. (2014, NRTZ hereafter) and Freyberger et al. (2020, FNW hereafter). NRTZ study 14 technical signals, each of which is an indicator set to one (zero) for stocks with positive (negative) forecasts. These signals are well-suited for our purposes because, like 𝑀𝐿𝐸𝑅, they are based on monthly return data. Six of these signals are based on moving average prices, two are momentum-based signals, and the remaining six are derived from on-balance volume (OBV). The moving average signals (𝐼𝑀𝐴(𝑠)≥𝑀𝐴(𝑙)) are set to one if the moving average price over the past 𝑠 months (𝑀𝐴(𝑠)) is greater than or equal to the moving average price over the past 𝑙 months (𝑀𝐴(𝑙)), and zero otherwise, where 𝑠 ∈ {1, 2, 3} and 𝑙 ∈ {9, 12}. The momentum signals (𝐼𝑃𝑡≥𝑃𝑡−𝑚) are set to one if the current price is greater than or equal to the price 𝑚 months ago, and zero otherwise, for 𝑚 ∈ {9, 12}. The OBV signals (𝐼𝑀𝐴𝑂𝐵𝑉(𝑠)≥𝑀𝐴𝑂𝐵𝑉(𝑙)) take the value one if the moving average OBV over the past 𝑠 months (𝑀𝐴𝑂𝐵𝑉(𝑠)) is greater than or equal to the moving average OBV over the past 𝑙 months (𝑀𝐴𝑂𝐵𝑉(𝑙)), for 𝑠 ∈ {1, 2, 3} and 𝑙 ∈ {9, 12}. The OBV of any stock 𝑖 in any month 𝑡 is the sum over all months 𝑘 beginning when the stock is listed and ending in month 𝑡, of the stock’s month 𝑘 signed trading volume, where the signed trading volume is the stock’s trading volume (the negative of the stock’s trading volume) if the stock’s price at the end of month 𝑘 is greater than or equal to (less than) its price at the end of month 𝑘 − 1.30
FNW study the joint predictive power of 62 stock-level characteristic variables, many of which they characterize as “trading friction” variables. We use 14 trading friction variables examined in FNW, most of which are calculated from past market data and thus may be discernable from plots depicting such data.31 These variables include total volatility (𝑇𝑜𝑡𝑉𝑜𝑙) and idiosyncratic volatility (𝐼𝑑𝑖𝑜𝑉𝑜𝑙) calculated following Ang et al. (2006), price relative to its 52-week high (𝑃𝑟𝑖𝑐𝑒∕𝐻𝑖𝑔ℎ, see George and Hwang (2004)), the maximum daily return in the past month (𝑀𝑎𝑥, see Bali et al. (2011)), the standard deviation of daily turnover (𝜎𝑇𝑢𝑟𝑛) and the standard deviation of daily volume (𝜎𝑉𝑜𝑙𝑢𝑚𝑒) of Chordia et al. (2001), beta (𝛽𝐹𝑃) calculated following Frazzini and Pedersen (2014), beta (𝛽𝐿𝑁) calculated following Lewellen and Nagel (2006), detrended market-adjusted turnover (𝐷𝑇𝑂) and standardized unexpected volume (𝑆𝑈𝑉) of Garfinkel (2009), turnover (𝑇𝑢𝑟𝑛, see Datar et al. (1998)), total assets (𝐴𝑇), market capitalization (𝑀𝑘𝑡𝐶𝑎𝑝), and industry-demeaned market capitalization (𝑀𝑘𝑡𝐶𝑎𝑝𝐼𝑛𝑑𝐴𝑑𝑗). Detailed descriptions of the calculations of these variables are in Section XXII of the Internet Appendix. While 𝐴𝑇, 𝑀𝑘𝑡𝐶𝑎𝑝, and 𝑀𝑘𝑡𝐶𝑎𝑝𝐼𝑛𝑑𝐴𝑑𝑗 are not calculated from histories of market data, we include them because they are plausibly related to market frictions that may manifest in patterns evident in historical price plots.
In Panel A of Table 16 we present the average excess returns of long-short portfolios formed by sorting on 𝑀𝐿𝐸𝑅 that are neutral to one of the NRTZ technical signals. For each NRTZ technical signal, we sort stocks with each value of the signal (zero or one) into decile 𝑀𝐿𝐸𝑅 portfolios using NYSE breakpoints. We calculate the excess returns of each of the resulting value-weighted portfolios, and take the 𝑀𝐿𝐸𝑅 10 − 1 portfolio excess return to be the equal-weighted average excess return of the two 𝑀𝐿𝐸𝑅 decile 10 portfolios minus that of the two 𝑀𝐿𝐸𝑅 decile one portfolios. The results show that the average excess return of each of these 𝑀𝐿𝐸𝑅 10 − 1 portfolios is positive, large in magnitude, and highly statistically significant, with all associated 𝑡-statistics being 5.67 or higher.
Table 16 Panel B presents the average monthly excess returns of 𝑀𝐿𝐸𝑅 5 − 1 portfolios that are constructed to be neutral to an FNW technical variable. The methodology used to construct these portfolios is identical to that used to construct the 𝑀𝐿𝐸𝑅 5 − 1 portfolios that control for momentum and reversal, discussed in Section 5.3.2, except we use one of the FNW technical signals as the first sort variable. The average monthly excess returns of the 𝑀𝐿𝐸𝑅 5 − 1 portfolios that are neutral to the FNW technical signals are all positive, economically large, and highly statistically significant, ranging from 0.70% (𝑡-statistic = 6.26) when controlling for 𝛽𝐹𝑃 to 1.02% (𝑡-statistic = 6.20) when controlling for 𝑀𝑘𝑡𝐶𝑎𝑝.
In Section XXIII and Table A20 of the Internet Appendix we use equal-weighted and value-weighted FM regression analyses to further investigate whether the predictive power of 𝑀𝐿𝐸𝑅 persists after controlling for the technical signals. Regardless of which set of technical variables we include as controls in the regression specification, the average coefficient on 𝑀𝐿𝐸𝑅 is positive and highly statistically significant.
Finally, we use the excess returns of a long-short portfolio generated by FNW as a factor in factor model regressions.32 FNW construct this portfolio by sorting stocks based on a composite forecast calculated by

29 Our approach is similar to that of Kirby (2020), who includes the squared and cubed values, as well as pairwise interactions, of stock-level characteristics in cross-sectional regressions designed to estimate the returns associated with pure play factor portfolios when the relation between the characteristics and expected returns is nonlinear.
30 Our definitions follow those of NRTZ. Prices and volumes are adjusted for splits and stock dividends.
31 FNW use 15 trading friction variables in their analyses. We use the 14 variables that are available for the entirety of our sample period. Due to data constraints discussed in Chung and Zhang (2014), the average daily bid-ask spread variable (denoted “Spread” in FNW, see their Table 1) cannot be calculated for a large portion of our sample period.
32 We thank Michael Weber for sharing the excess returns of this portfolio.
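To make the three families of NRTZ indicators described in Section 5.5 concrete, the following sketch computes them for a single stock from month-end prices and monthly volumes. All function names are hypothetical, the price and volume series are made up, and starting the OBV sum in the second listed month (because the first month has no prior price to compare against) is an assumption.

```python
def moving_average(values, window):
    """Mean of the most recent `window` entries (values ordered oldest first)."""
    return sum(values[-window:]) / window

def ma_signal(prices, s, l):
    """1 if the s-month moving average price >= the l-month one, else 0."""
    return int(moving_average(prices, s) >= moving_average(prices, l))

def momentum_signal(prices, m):
    """1 if the current price >= the price m months ago, else 0."""
    return int(prices[-1] >= prices[-1 - m])

def on_balance_volume(prices, volumes):
    """Running sum of monthly volume, signed by the month-over-month price change."""
    obv, total = [], 0.0
    for k in range(1, len(prices)):
        sign = 1.0 if prices[k] >= prices[k - 1] else -1.0
        total += sign * volumes[k]
        obv.append(total)
    return obv

def obv_signal(prices, volumes, s, l):
    """1 if the s-month moving average OBV >= the l-month one, else 0."""
    obv = on_balance_volume(prices, volumes)
    return int(moving_average(obv, s) >= moving_average(obv, l))

# Parameter grid from the text: 6 moving average, 2 momentum, 6 OBV signals.
signal_grid = [(s, l) for s in (1, 2, 3) for l in (9, 12)]
n_signals = len(signal_grid) + len((9, 12)) + len(signal_grid)

# Made-up 13-month history for one stock.
prices = [10, 11, 12, 11, 13, 14, 13, 15, 16, 15, 17, 18, 19]
volumes = [100] * 13
print(n_signals, ma_signal(prices, 3, 12), momentum_signal(prices, 12),
      obv_signal(prices, volumes, 3, 12))
```

Each function needs at least as many observations as its longest lookback; a production version would also apply the split and dividend adjustments mentioned in footnote 30.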
Table 16
Multivariate Portfolio Analysis - Control for Technical Signals. This table presents the results of bivariate portfolio analyses examining the ability
of the ML-based forecasts to predict the cross-section of future stock returns after controlling for other technical signals. The technical signals
are those used by Neely et al. (2014, NRTZ) and Freyberger et al. (2020, FNW), described in Section 5.5. Panel A presents the average excess
returns of 𝑀𝐿𝐸𝑅 10 − 1 portfolios designed to be neutral to different NRTZ technical signals (indicated in the column headers). At the end of
each month 𝑡 − 1, stocks with each value of the given technical signal (either 0 or 1), are sorted into decile portfolios based on an ascending
ordering of 𝑀𝐿𝐸𝑅 using breakpoints calculated from NYSE-listed stocks. The month 𝑡 excess return of each portfolio is then taken to be the
market capitalization-weighted average month 𝑡 excess return of all stocks in the portfolio, with market capitalization calculated as of the end
of month 𝑡 − 1. The 𝑀𝐿𝐸𝑅 10 − 1 portfolio excess return is taken to be the equal-weighted average excess return of the two decile 10 𝑀𝐿𝐸𝑅
portfolios minus that of the two decile 1 portfolios. Panel B presents the average excess returns of 𝑀𝐿𝐸𝑅 5 − 1 portfolios designed to be
neutral to different FNW technical signals (indicated in the column headers). The methodology used to construct these 𝑀𝐿𝐸𝑅 5 − 1 portfolios
is identical to that used to construct the bivariate portfolios examined in Table 12 except that one of the FNW technical signals is used as the first
sort variable. The values in parentheses are 𝑡-statistics, calculated following Newey and West (1987) using 12 lags, testing the null hypothesis
that the average monthly excess return is equal to zero. The analyses cover return months 𝑡 from July 1963 through December 2022, inclusive.
Panel A: 𝑀𝐿𝐸𝑅 10 − 1 Port. Avg. Excess Return - Control for NRTZ Variables
𝐼𝑀𝐴(2)≥𝑀𝐴(12)
𝐼𝑀𝐴(3)≥𝑀𝐴(12)
𝐼𝑀𝐴(1)≥𝑀𝐴(9)
𝐼𝑀𝐴(2)≥𝑀𝐴(9)
𝐼𝑀𝐴(3)≥𝑀𝐴(9)
𝐼𝑃𝑡 ≥𝑃𝑡−12
𝐼𝑃𝑡 ≥𝑃𝑡−9
1.13 1.19 1.15 1.18 1.07 1.08 1.15 1.10 1.17 1.15 1.15 1.11 1.10 1.09
(5.67) (7.01) (6.70) (7.63) (6.50) (6.68) (6.73) (6.77) (6.07) (6.06) (6.10) (6.03) (5.77) (6.07)
Panel B: 𝑀𝐿𝐸𝑅 5 − 1 Port. Avg. Excess Return - Control for FNW Variables
𝑀𝑘𝑡𝐶𝑎𝑝𝐼𝑛𝑑𝐴𝑑𝑗
𝑃 𝑟𝑖𝑐𝑒∕𝐻𝑖𝑔ℎ
𝑀𝑘𝑡𝐶𝑎𝑝
𝐼𝑑𝑖𝑜𝑉 𝑜𝑙
𝑇 𝑜𝑡𝑉 𝑜𝑙
𝜎𝑉 𝑜𝑙𝑢𝑚𝑒
𝐷𝑇 𝑂
𝑀𝑎𝑥
𝑆𝑈 𝑉
𝑇 𝑢𝑟𝑛
𝜎𝑇 𝑢𝑟𝑛
𝛽𝐿𝑁
𝛽𝐹 𝑃
𝐴𝑇
0.75 0.83 0.75 0.80 0.83 0.94 0.70 0.76 0.80 0.79 0.81 0.97 1.02 0.93
(6.85) (6.88) (4.66) (6.87) (6.17) (5.76) (6.26) (6.59) (6.02) (5.11) (6.16) (6.55) (6.20) (5.60)
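The Panel A construction can be illustrated for a single month as follows. This is a sketch under stated assumptions: decile assignments are taken as given (the NYSE-breakpoint sort is not reproduced), the field names `signal`, `decile`, `mktcap`, and `ret` are hypothetical, and the five stocks are made-up data.

```python
from collections import defaultdict

def value_weighted_returns(stocks):
    """Map (signal, decile) to the market-cap-weighted excess return."""
    weighted_sum, total_cap = defaultdict(float), defaultdict(float)
    for s in stocks:
        key = (s["signal"], s["decile"])
        weighted_sum[key] += s["mktcap"] * s["ret"]
        total_cap[key] += s["mktcap"]
    return {key: weighted_sum[key] / total_cap[key] for key in weighted_sum}

def neutral_long_short(stocks, high=10, low=1):
    """Average the high-decile return across signal groups, minus the same for low."""
    vw = value_weighted_returns(stocks)
    signals = sorted({signal for signal, _ in vw})
    long_leg = sum(vw[(sig, high)] for sig in signals) / len(signals)
    short_leg = sum(vw[(sig, low)] for sig in signals) / len(signals)
    return long_leg - short_leg

# Made-up cross-section for one month (returns in percent).
stocks = [
    {"signal": 0, "decile": 10, "mktcap": 100.0, "ret": 2.0},
    {"signal": 0, "decile": 10, "mktcap": 300.0, "ret": 1.0},
    {"signal": 0, "decile": 1, "mktcap": 50.0, "ret": -1.0},
    {"signal": 1, "decile": 10, "mktcap": 200.0, "ret": 3.0},
    {"signal": 1, "decile": 1, "mktcap": 100.0, "ret": 0.5},
    {"signal": 1, "decile": 1, "mktcap": 100.0, "ret": 1.5},
]
print(neutral_long_short(stocks))  # 2.125
```

Weighting the two signal groups equally, rather than by the number or size of stocks in each, is what makes the resulting long-short return neutral to the signal. The sketch raises a `KeyError` if a signal group has no stocks in one of the extreme deciles.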
To discern the features of charts that are important for return prediction but distinct from momentum or reversal, we examine a subsample of stocks in which the momentum and reversal effects do not exist. The exact process and empirical results that we use to identify such a subsample are described in detail in Section XXVI and Table A22 of the Internet Appendix. Here, we summarize. We begin by taking only stocks with moderate values of 𝑀𝑜𝑚 and 𝑅𝑒𝑣. Specifically, each month 𝑡, we take only stocks with values of 𝑀𝑜𝑚 and 𝑅𝑒𝑣 that are between the 20th and 80th percentile values among NYSE-listed stocks of both of these variables in the given month. We find that while the momentum effect does not exist in this subsample, the reversal effect remains strong among these stocks. Thus, we incrementally shrink the range of values of 𝑅𝑒𝑣 that we include in the subsample until we find a subsample in which the reversal effect does not exist. The subsample selected using this process includes the intersection of stocks with 𝑀𝑜𝑚 between the 20th and 80th NYSE percentile values of 𝑀𝑜𝑚 and stocks with 𝑅𝑒𝑣 between the 40th and 60th NYSE percentile values of 𝑅𝑒𝑣. All percentiles in these analyses are calculated using only NYSE-listed stocks. In this subsample, the average excess return of the long-short quintile portfolio formed by sorting on 𝑀𝑜𝑚 is 0.04% per month (𝑡-statistic = 0.31) and that of 𝑅𝑒𝑣 is −0.06% per month (𝑡-statistic = −0.52), both of which are economically small and statistically insignificant, indicating that the momentum and reversal effects are non-existent in this subsample of stocks. The long-short portfolio formed by sorting on 𝑀𝐿𝐸𝑅, however, produces an economically large and highly statistically significant average excess return of 0.37% per month (𝑡-statistic = 2.23). Since the momentum and reversal effects are non-existent, the results demonstrate that the chart patterns picked up by the ML-based forecasts that are relevant for return prediction in this subsample are something other than momentum and reversal.
To visualize the features of the charts associated with the predictive power of the ML-based forecasts in this subsample, we create charts of the average values of 𝐶𝑅𝑘 among stocks with different values of 𝑀𝐿𝐸𝑅. Since this subsample does have substantial variation in 𝑀𝑜𝑚, to clearly expose features of the charts that are distinct from 𝑀𝑜𝑚, we separate this subsample into six different 𝑀𝑜𝑚-based groups. Specifically, we create separate charts for stocks with values of 𝑀𝑜𝑚 between the 20th and 30th (decile 3), 30th and 40th (decile 4), 40th and 50th (decile 5), 50th and 60th (decile 6), 60th and 70th (decile 7), and 70th and 80th (decile 8) NYSE percentiles of 𝑀𝑜𝑚. We then create 𝑀𝐿𝐸𝑅-sorted quintile portfolios using stocks in each of these six groups and construct the chart of the average value-weighted means of 𝐶𝑅𝑘 for each of these portfolios.34 These charts, shown in Fig. 6, exhibit a few striking patterns. First, in all six plots, the average stock in the 𝑀𝐿𝐸𝑅 quintile 5 (1) portfolio has a substantial increase (decrease) in value in month 10, followed by a substantial decrease (increase) in value in month 11. Furthermore, among stocks in 𝑀𝑜𝑚 deciles 3-5, and to a lesser degree decile 6, we see that stocks with low values of 𝑀𝐿𝐸𝑅, and thus low forecast returns, actually tend to have the highest average cumulative returns in months 1-6 in the chart, and then tend to underperform in months 7-10.
The charts demonstrate that, even among stocks with similar values of 𝑀𝑜𝑚 and 𝑅𝑒𝑣, stocks with high and low ML-based forecasts tend to have sufficiently different charts for the patterns to be recognized by a human chart reader. This, combined with the fact that these ML-based forecasts predict future stock returns, albeit not through the momentum and/or reversal channels, suggests that human chartists can produce forecasts that form the basis of successful investment strategies that are distinct from the momentum and reversal effects.

6. Ex-post evaluation of optimization procedure

Our final tests evaluate the success of the optimization procedure that determined which ML model would be used to generate our focal forecasts. Specifically, we evaluate whether ML models that the optimization process, which used data from the 192701-196306 optimization period, identified as performing well, also produced superior out-of-sample forecasts during the 196307-202212 test period. To do so, we calculate forecasts using each of the 95 alternative ML models that were not selected by the optimization process. With the exception of the ML model, all other aspects of calculating these alternative ML-based forecasts are identical to those used to calculate 𝑀𝐿𝐸𝑅.
To evaluate the ability of the different forecasts to predict future stock returns, we repeat the decile portfolio analysis whose results are shown in Table 3 using each of these different forecasts. Table 17 presents the average monthly excess return of the 10 − 1 portfolio generated from each of these analyses. All but one of the 96 ML models examined (95 alternative models plus the selected model used to calculate 𝑀𝐿𝐸𝑅) produce 10 − 1 portfolios that have positive average excess returns, 90 of which are statistically significant at the 5% level. The fact that most models perform well during the test period suggests that our results are quite robust and not highly sensitive to the choice of ML model. Furthermore, while the model used to calculate 𝑀𝐿𝐸𝑅 performs well, there are 30 (32) alternative models whose 10 − 1 portfolios produce average monthly excess returns (𝑡-statistics) that are greater than the 1.08% per month (𝑡-statistic of 5.51) generated by the 𝑀𝐿𝐸𝑅 10 − 1 portfolio. Thus, the model selected by the optimization process performs well relative to most other candidate models in out-of-sample tests, but as would be expected based on noise in the optimization process, there are models that performed better than that selected as our focal model.
In Fig. 7 we graphically depict the relation between performance of the forecasts as measured by the optimization process, and the out-of-sample test period performance of the associated 10 − 1 portfolio. Specifically, we plot on the 𝑥-axis the Spearman rank correlations between the forecasts and future stock returns that were used to evaluate model performance in the optimization period (reported in Table 1), and on the 𝑦-axis the annualized Sharpe ratios of the 10 − 1 portfolios associated with the given model during the test period. The plot illustrates a clear positive relation between the optimization period Spearman rank correlations and test period Sharpe ratios. The correlation between these values is 0.69.35
Taken together, the results in Table 17 and Fig. 7 demonstrate that the optimization process works as expected. The strong positive relation between performance during the optimization period and performance during the test period shows that ex-ante optimization is useful for determining which ML model to employ for out-of-sample use. However, the optimization process is noisy, and thus is unlikely to select the single model that will produce the best out-of-sample performance. Indeed, in our tests, approximately one third of the alternative models outperformed the model selected by the optimization process.

7. Conclusion

We test the weak form of the EMH by examining whether ML-based forecasts generated from past return data easily observable in historical price plots can predict the cross-section of future stock returns. We begin by using the 192701-196306 period to determine the optimal ML model to use to generate our ML-based forecasts. We find that using a convolutional neural network with long short-term memory as the ML architecture, the mean-squared-error as the loss function, weighting observations in the loss function in a manner that gives the same total

34 In Section XXVII and Figure A8 of the Internet Appendix, we show that charts created from equal-weighted values of 𝐶𝑅𝑘 are qualitatively the same as those in Fig. 6.
35 The correlation between the optimization period Spearman rank correlations and the test period average excess returns is 0.68.
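Section 6 evaluates each ML model by the Spearman rank correlation between its forecasts and subsequent returns, and then relates that score to test-period performance across models. The sketch below shows one standard way to compute a Spearman rank correlation (the Pearson correlation of average ranks). The text does not state which correlation measure produces the reported 0.69 and 0.68, so the Pearson correlation in the last lines is an assumption, and the per-model numbers are invented purely for illustration.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented per-model scores: optimization-period rank correlation, test-period Sharpe.
optimization_scores = [0.010, 0.020, 0.015, 0.030]
test_sharpes = [0.4, 0.9, 0.7, 0.8]
print(pearson(optimization_scores, test_sharpes))
```

A rank-based measure is a natural scoring rule for forecasts that are only used to sort stocks into portfolios, since it is insensitive to the scale of the forecasts.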
Fig. 6. Average Charts for 𝑀𝐿𝐸𝑅 Quintile Portfolios in Subsample without Momentum and Reversal Effects. This figure shows plots that depict the average chart
for stocks within different subsets of our sample. The portfolios whose charts are depicted in Panels A, B, C, D, E, and F contain only stocks with values of 𝑅𝑒𝑣
between the 40th and 60th percentile values of 𝑅𝑒𝑣 and values of 𝑀𝑜𝑚 between the 20th and 30th, 30th and 40th, 40th and 50th, 50th and 60th, 60th and 70th,
and 70th and 80th percentile values of 𝑀𝑜𝑚, respectively. All percentiles are calculated on a monthly basis using only stocks listed on the NYSE. In each panel, all
stocks in the given 𝑅𝑒𝑣- and 𝑀𝑜𝑚-based subset are sorted into 5 quintile portfolios based on 𝑀𝐿𝐸𝑅 using breakpoints calculated from only NYSE-listed stocks.
For each portfolio in each month, we calculate the value-weighted mean value of the 𝑘th monthly cumulative return that would appear in charts of the stocks’
cumulative monthly returns over the past year, 𝐶𝑅𝑘 , 𝑘 ∈ {1, 2, … , 12}. We then calculate the time-series averages of these monthly value-weighted means for each
quintile portfolio and each value of 𝑘. The 𝑥-axis depicts the month 𝑘 in the plot. The 𝑦-axis depicts the time-series average of the monthly mean cumulative return.
Values for different 𝑀𝐿𝐸𝑅 quintile portfolios are shown in different colors, as indicated in the legends. For interpretation of the colors in the figure(s), the reader
is referred to the web version of this article. The analysis covers sample months 𝑡 from July 1963 through December 2022, inclusive.
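The subsample underlying Fig. 6 is defined by percentile bands whose breakpoints come from NYSE-listed stocks only but are applied to all stocks. A minimal sketch for one month follows; the field names, the linear-interpolation percentile convention, and the inclusive band boundaries are assumptions, and the cross-section is made up.

```python
def percentile(sorted_values, q):
    """Linearly interpolated q-th percentile (0 to 100) of an ascending list."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction

def nyse_band(stocks, key, low_q, high_q):
    """Breakpoints for `key` computed from NYSE-listed stocks only."""
    nyse_values = sorted(s[key] for s in stocks if s["nyse"])
    return percentile(nyse_values, low_q), percentile(nyse_values, high_q)

def moderate_subsample(stocks):
    """Stocks with Mom in the 20th-80th and Rev in the 40th-60th NYSE percentile bands."""
    mom_low, mom_high = nyse_band(stocks, "mom", 20, 80)
    rev_low, rev_high = nyse_band(stocks, "rev", 40, 60)
    return [s for s in stocks
            if mom_low <= s["mom"] <= mom_high and rev_low <= s["rev"] <= rev_high]

# Made-up month: 11 NYSE stocks with Mom = Rev = 0..10, plus one non-NYSE stock.
stocks = [{"nyse": True, "mom": float(i), "rev": float(i)} for i in range(11)]
stocks.append({"nyse": False, "mom": 5.0, "rev": 5.0})
selected = moderate_subsample(stocks)
print(len(selected))  # 4
```

The non-NYSE stock is retained because it falls inside both bands even though it plays no role in setting the breakpoints. Splitting the retained stocks further into the six 𝑀𝑜𝑚 groups of Fig. 6 would reuse `nyse_band` with narrower percentile ranges.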
Table 17
Alternative ML Models. This table presents the average monthly excess return for the long-short portfolio formed by sorting on
ML-based forecasts generated using different ML models. We use each ML model described in Table 1 to generate forecasts of
future stock returns using, with the exception of the ML model, the exact same procedure as was used to calculate 𝑀𝐿𝐸𝑅.
We then construct decile portfolios formed by sorting on each of the different forecasts using the exact same decile portfolio
construction methodology as was used in Table 3. The table presents the average monthly excess returns for the 10 − 1 portfolio
formed using each of the forecasts. 𝑡-statistics, calculated following Newey and West (1987) using 12 lags, testing the null
hypothesis that the average monthly excess return is equal to zero, are shown in parentheses. The column labeled “Dependent
Variable” indicates the dependent variable. The column labeled “Weighting Methodology” indicates the weighting methodology.
The remaining column headers indicate the ML architecture and loss function. The analyses cover return months 𝑡 from July
1963 through December 2022, inclusive.
Dependent Variable Weighting Methodology FNN FNN CNN CNN LSTM LSTM CNNLSTM CNNLSTM
MAE MSE MAE MSE MAE MSE MAE MSE
weight to observations in each month and equal-weight to each stock within each month, and a normalized measure of the future stock return as the dependent variable, optimizes the ML model performance during the 192701-196306 optimization period. Our use of data from the early part of our sample period to determine the optimal ML model overcomes concerns about data mining and the out-of-sample validity of our results.
We use the selected ML model to generate out-of-sample forecasts of future stock returns during our 196307-202212 test period. Portfolio analyses demonstrate that the ML-based forecasts are a strong predictor of the cross-section of future stock returns. Factor analyses and other risk metrics provide no indication that the variation in average returns associated with the ML-based forecasts reflects compensation for risk. Further tests demonstrate that the predictive power of the ML-based forecasts is strong during most subperiods of our main test period, including the most recent 201501-202212 period. We also find that the forecasts are effective among the largest 500 stocks in our sample, indicating that the predictive power is not limited to small and illiquid stocks.
The forecasting function generated by the ML process is highly stable through time. Portfolios formed by sorting on forecasts based on fitting data from nonoverlapping subperiods have similar holdings and exhibit similar performance. Furthermore, we find only a small improvement in performance when using forecasts based on applying the learning process to data from an expanding window compared to a rolling window. The results of regression analyses show that the forecasting function has substantial nonlinear and interaction components, and that these components are important for prediction. While the ML-based forecasts contain nonnegligible components related to the momentum and reversal effects, the majority of the predictive power is unrelated to these phenomena. Similarly, ML-based forecasts based on images depicting historical price and volume data and previously-studied technical signals fail to explain the predictive power of our ML-based forecasts. Visual examination reveals differences in price charts associated with high and low future returns that are easily discernable by the human eye and distinct from momentum and reversal, suggesting the viability of charting by humans as a fruitful investment technique.
Finally, we conduct an ex-post analysis of the effectiveness of the optimization procedure we undertake to select the ML model used in our focal tests. Variation in the performance of different ML models during the 192701-196306 optimization period is strongly correlated with the out-of-sample performance of the associated forecasts during the 196307-202212 test period, indicating that the optimization procedure has substantial ability to discern the effectiveness of different ML models. However, approximately one third of the ML models examined have better out-of-sample performance than the selected model, suggesting that there is substantial noise in the optimization process. Long-short portfolios based on forecasts produced by all but one of the 96 ML models we examine generate positive out-of-sample average excess returns, almost all of which are statistically significant, indicating that our findings are robust to the use of different ML models.
Our results are strong evidence contrary to the main prediction of the EMH that profitable portfolios cannot be constructed from only information contained in historical returns. While one might reasonably argue that the momentum and reversal effects are already strong evidence contradicting the EMH, the complexity of the relations between
References
Allen, F., Karjalainen, R., 1999. Using genetic algorithms to find technical trading rules.
J. Financ. Econ. 51, 245–271.
Ang, A., Hodrick, R.J., Xing, Y., Zhang, X., 2006. The cross-section of volatility and ex-
pected returns. J. Finance 61, 259–299.
Bajgrowicz, P., Scaillet, O., 2012. Technical trading revisited: false discoveries, persis-
tence tests, and transaction costs. J. Financ. Econ. 106, 473–491.
Bali, T.G., Cakici, N., Whitelaw, R.F., 2011. Maxing out: stocks as lotteries and the cross-
section of expected returns. J. Financ. Econ. 99, 427–446.
Bali, T.G., Engle, R.F., Murray, S., 2016. Empirical Asset Pricing: The Cross Section of
Stock Returns. John Wiley & Sons.
past price patterns and future returns indicate that violations of the EMH are much more intricate than previously understood. Our findings also suggest that technical analysis, or charting, has greater merit than acknowledged in academic work, and shed light on why this investment technique remains prevalent among investment practitioners.
Fig. 7. Relation Between Optimization Period and Test Period Performance. This figure depicts the relation between the 192701-196306 optimization period performance and the 196307-202212 test period performance of ML-based forecasts calculated using different ML models. The 𝑥-axis depicts the Spearman rank correlations used to evaluate the ML model during the 192701-196306 optimization period, which are shown in Table 1. The 𝑦-axis depicts the annualized Sharpe ratios during the 196307-202212 test period of the 10 − 1 portfolios formed by sorting stocks based on the forecasts generated by different ML models. Each point represents the results for a different ML model.
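The two quantities plotted in Fig. 7 can be illustrated with a minimal Python sketch. This is not the authors' code: the helper names (`ranks`, `spearman`, `long_short_return`, `annualized_sharpe`) are hypothetical, the 10 − 1 portfolio is assumed here to be equal-weighted, and the Sharpe ratio is assumed to be the mean monthly long-short return divided by its sample standard deviation, scaled by the square root of 12.

```python
import math
import statistics


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def long_short_return(forecasts, realized, n_groups=10):
    """Return of the top-forecast group minus the bottom-forecast group.

    Assumption: equal weights within each group (a simplification).
    """
    size = len(forecasts) // n_groups
    if size == 0:
        raise ValueError("need at least n_groups stocks")
    order = sorted(range(len(forecasts)), key=lambda i: forecasts[i])
    bottom, top = order[:size], order[-size:]
    top_mean = statistics.fmean([realized[i] for i in top])
    bottom_mean = statistics.fmean([realized[i] for i in bottom])
    return top_mean - bottom_mean


def annualized_sharpe(monthly_returns):
    """Mean over sample standard deviation of monthly returns, times sqrt(12)."""
    mean = statistics.fmean(monthly_returns)
    return mean / statistics.stdev(monthly_returns) * math.sqrt(12)


if __name__ == "__main__":
    # Toy data for one month: 20 stocks with made-up forecasts and returns.
    forecasts = [0.1 * i for i in range(20)]
    realized = [0.01 * i for i in range(20)]
    print(spearman(forecasts, realized))
    print(long_short_return(forecasts, realized))
    # Toy series of monthly long-short returns.
    print(annualized_sharpe([0.02, -0.01, 0.03, 0.01]))
```

In this sketch, `spearman` would be applied to forecasts and realized returns to score a model, while `long_short_return` would be evaluated month by month and the resulting series passed to `annualized_sharpe`.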
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data availability
The authors do not have permission to share data. However, the code to extract the required data from Wharton Research Data Services, along with all code required to reproduce the paper and an example of the sample we use with mostly randomized data, can be found at Mendeley Data under DOI https://dx.doi.org/10.17632/x63r376783.2.
Acknowledgements
Nikolai Roussanov was the editor for this article. We thank Nikolai Roussanov (the editor), an anonymous referee, Vikas Agarwal, Turan Bali, Bryan Kelly, Markus Pelger, Chip Ryan, Baozhong Yang, Guofu Zhou, and seminar participants at the 2020 Australasian Finance and Banking Conference, 2020 Conference on Asia-Pacific Financial Markets, 2020 International Risk Management Conference, 2020 Shanghai-Edinburgh Fintech Conference, 2020 World Finance & Banking Symposium, the 2021 Southwestern Finance Association Conference, 2021 University of Miami Research Conference on Machine Learning and Business, the University of Alabama, and Georgia State University for feedback that has substantially improved this paper. This work used the Extreme Science and Engineering Discovery Environment (XSEDE),
Giglio, S., Xiu, D., 2021. Asset pricing with omitted factors. J. Polit. Econ. 129, Lo, A.W., Hasanhodzic, J., 2010. The Heretics of Finance: Conversations with Leading
1947–1990. Practitioners of Technical Analysis, vol. 16. John Wiley and Sons.
Giglio, S., Liao, Y., Xiu, D., 2021. Thousands of alpha tests. Rev. Financ. Stud. 34, Lo, A.W., MacKinlay, A.C., 1990. Data-snooping biases in tests of financial asset pricing
3456–3496. models. Rev. Financ. Stud. 3, 431–467.
Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep Learning. MIT Press. Lo, A.W., Mamaysky, H., Wang, J., 2000. Foundations of technical analysis: computa-
Green, J., Hand, J.R., Zhang, X.F., 2017. The characteristics that provide independent in- tional algorithms, statistical inference, and empirical implementation. J. Finance 55,
formation about average us monthly stock returns. Rev. Financ. Stud. 30, 4389–4436. 1705–1765.
Gu, S., Kelly, B., Xiu, D., 2020. Empirical asset pricing via machine learning. Rev. Financ. Martin, I., 2017. What is the expected return on the market? Q. J. Econ. 132, 367–433.
Stud. 33, 2223–2273. McLean, R.D., Pontiff, J., 2016. Does academic research destroy stock return predictabil-
Gu, S., Kelly, B., Xiu, D., 2021. Autoencoder asset pricing models. J. Econom. 222, ity? J. Finance 71, 5–32.
429–450. Menkhoff, L., 2010. The use of technical analysis by fund managers: international evi-
Guijarro-Ordonez, J., Pelger, M., Zanotti, G., 2023. Deep learning statistical arbitrage. dence. J. Bank. Finance 34, 2573–2586.
Available at SSRN: https://ssrn.com/abstract=3862004. Messmer, M., 2017. Deep learning and the cross-section of expected stock returns. Avail-
Han, Y., Yang, K., Zhou, G., 2013. A new anomaly: the cross-sectional profitability of able at SSRN: https://ssrn.com/abstract=3081555.
technical analysis. J. Financ. Quant. Anal. 48, 1433–1461. Messmer, M., Audrino, F., 2020. The lasso and the factor zoo - expected returns in the
Harvey, C.R., Liu, Y., Zhu, H., 2016. ...and the cross-section of expected returns. Rev. cross-section. Available at SSRN: https://ssrn.com/abstract=2930436.
Financ. Stud. 29, 5–68. Moritz, B., Zimmermann, T., 2016. Tree-based conditional portfolio sorts: the relation be-
Hoseinzade, E., Haratizadeh, S., 2019. Cnnpred: CNN-based stock market prediction using tween past and future stock returns. Available at SSRN: https://ssrn.com/abstract=
a diverse set of variables. Expert Syst. Appl. 129, 273–285. 2740751.
Hou, K., Xue, C., Zhang, L., 2015. Digesting anomalies: an investment approach. Rev. Moskowitz, T.J., Ooi, Y.H., Pedersen, L.H., 2012. Time series momentum. J. Financ.
Financ. Stud. 28, 650–705. Econ. 104, 228–250.
Hou, K., Xue, C., Zhang, L., 2020. Replicating anomalies. Rev. Financ. Stud. 33, Neely, C., Weller, P., Dittmar, R., 1997. Is technical analysis in the foreign exchange
2019–2133. market profitable? A genetic programming approach. J. Financ. Quant. Anal. 32,
Jegadeesh, N., 1990. Evidence of predictable behavior of security returns. J. Finance 45, 405–426.
881–898. Neely, C.J., 2002. The temporal pattern of trading rule returns and exchange rate in-
Jegadeesh, N., 2000. Foundations of technical analysis: computational algorithms, statis- tervention: intervention does not generate technical trading profits. J. Int. Econ. 58,
tical inference, and empirical implementation: discussion. J. Finance 55, 1765–1770. 211–232.
Jegadeesh, N., Titman, S., 1993. Returns to buying winners and selling losers: implications Neely, C.J., Rapach, D.E., Tu, J., Zhou, G., 2014. Forecasting the equity risk premium: the
for stock market efficiency. J. Finance 48, 65–91. role of technical indicators. Manag. Sci. 60, 1772–1791.
Jiang, J., Kelly, B.T., Xiu, D., 2022. (Re-)imag(in)ing price trend. J. Finance. https:// Newey, W.K., West, K.D., 1987. A simple, positive semi-definite, heteroskedasticity and
doi.org/10.1111/jofi.13268. Forthcoming. autocorrelation consistent covariance matrix. Econometrica 55, 703–708.
Kelly, B.T., Pruitt, S., Su, Y., 2019. Characteristics are covariances: a unified model of risk Novy-Marx, R., Velikov, M., 2016. A taxonomy of anomalies and their trading costs. Rev.
and return. J. Financ. Econ. 134, 501–524. Financ. Stud. 29, 104–147.
Kirby, C., 2020. Firm characteristics, cross-sectional regression estimates, and asset pric- Osler, C.L., 2003. Currency orders and exchange rate dynamics: an explanation for the
ing tests. Rev. Asset Pricing Stud. 10, 290–334. predictive success of technical analysis. J. Finance 58, 1791–1819.
Kozak, S., Nagel, S., Santosh, S., 2020. Shrinking the cross-section. J. Financ. Econ. 135, Rapach, D.E., Strauss, J.K., Zhou, G., 2013. International stock return predictability: what
271–292. is the role of the United States? J. Finance 68, 1633–1662.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep con- Ready, M.J., 2002. Profits from technical trading rules. Financ. Manag. 31, 43–61.
volutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105. Rossi, A.G., 2018. Predicting stock market returns with machine learning. Working Paper.
Krogh, A., Vedelsby, J., 1995. Validation, and active learning. Adv. Neural Inf. Process. Schwert, G.W., 2003. Anomalies and market efficiency. In: Constantinides, G., Harris, M.,
Syst. 7 (7), 231–238. Stulz, R.M. (Eds.), Handbook of the Economics of Finance. Elsevier Science, B.V.,
LeBaron, B., 1999. Technical trading rule profitability and foreign exchange intervention. Netherlands, pp. 939–974. Chapter 15.
J. Int. Econ. 49, 125–143. Shumway, T., 1997. The delisting bias in CRSP data. J. Finance 52, 327–340.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, 436–444. Sullivan, R., Timmermann, A., White, H., 1999. Data-snooping, technical trading rule
Lehmann, B.N., 1990. Fads, martingales, and market efficiency. Q. J. Econ. 105, 1–28. performance, and the bootstrap. J. Finance 54, 1647–1691.
Lettau, M., Pelger, M., 2020. Factors that fit the time series and cross-section of stock Sutskever, I., Vinyals, O., Le, Q.V., 2014. Sequence to sequence learning with neural
returns. Rev. Financ. Stud. 33, 2274–2325. networks. Adv. Neural Inf. Process. Syst. 27.
Levich, R.M., Thomas III, L.R., 1993. The significance of technical trading-rule profits Sweeney, R.J., 1986. Beating the foreign exchange market. J. Finance 41, 163–182.
in the foreign exchange market: a bootstrap approach. J. Int. Money Financ. 12, Towns, J., Cockerill, T., Dahan, M., Foster, I., Gaither, K., Grimshaw, A., Hazlewood,
451–474. V., Lathrop, S., Lifka, D., Peterson, G.D., et al., 2014. XSEDE: accelerating scientific
Lewellen, J., Nagel, S., 2006. The conditional CAPM does not explain asset-pricing anoma- discovery. Comput. Sci. Eng. 16, 62–74.
lies. J. Financ. Econ. 82, 289–314. Zhu, Y., Zhou, G., 2009. Technical analysis: an asset allocation perspective on the use of
Linnainmaa, J.T., Roberts, M.R., 2018. The history of the cross-section of stock returns. moving averages. J. Financ. Econ. 92, 519–544.
Rev. Financ. Stud. 31, 2606–2649.
28