Journal Pre-proof

Nonlinearity matters: The stock price – trading volume relation revisited

Simon Behrendt, Alexander Schmidt

PII: S0264-9993(20)31240-2
DOI: https://doi.org/10.1016/j.econmod.2020.11.004
Reference: ECMODE 5391

To appear in: Economic Modelling

Received Date: 14 November 2019


Revised Date: 14 November 2020
Accepted Date: 15 November 2020

Please cite this article as: Behrendt, S., Schmidt, A., Nonlinearity matters: The stock price – trading
volume relation revisited, Economic Modelling, https://doi.org/10.1016/j.econmod.2020.11.004.

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition
of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of
record. This version will undergo additional copyediting, typesetting and review before it is published
in its final form, but we are providing this version to give early visibility of the article. Please note that,
during the production process, errors may be discovered which could affect the content, and all legal
disclaimers that apply to the journal pertain.

© 2020 Elsevier B.V. All rights reserved.


Abstract

The purpose of this paper is to investigate the information transfer in the relation between
stock prices and trading volume. While several theoretical models establish this relation,
determining its direction remains an empirical question. Conventional linear approaches, such
as Granger causality, provide only limited insights. Importantly, they do not take into account
the nonlinear nature of this relation which is advocated by theoretical models of
noninformational trading. Moreover, they cannot deduce the dominant direction of the
information transfer. Both shortcomings can be addressed by relying upon the concept of
Shannon transfer entropy. In an empirical application to a large sample of stocks, we employ
this model-free measure and find: (i) A substantial amount of nonlinear information transfer
across stocks, and (ii) this information predominantly flows from returns to trading volume
growth. Thus, we present empirical evidence that the relation between these financial
variables is in fact likely to be nonlinear.

Keywords: stock returns; trading volume; nonlinear dynamics; information transfer

JEL: C14, C58, G14

1 Introduction

A thorough understanding of the relationship between stock prices and trading volume has

important implications for our insights into the structure of financial markets, the dissemination

of related information, and subsequent actions of the various market participants. As a

consequence, the analysis of the stock price – trading volume relationship has drawn a

considerable amount of attention in the finance literature (for an early summary of empirical

evidence investigating contemporaneous effects, see Karpoff, 1987). Of special interest to market

participants has been the question whether or not there exists a directional information transfer

between stock prices and trading volume, i.e., whether knowledge of past movements in stock

prices leads to improved predictions not only of current but also future movements in trading
volume, and vice versa. For instance, the former direction may be derived from a sequential

arrival of information argument in the sense that stock prices incorporate the latest information
to the market earlier than trading volume.

In order to illustrate the relation between stock price and trading volume, we make use of tick-

by-tick Trade and Quote (TAQ) data for one constituent of the Dow Jones Industrial Average

(DJIA).1 These data are obtained from the Wharton Research Data Services at the frequency of

milliseconds. To be more precise, we consider, as an example, the stock of Procter & Gamble (PG) on

May 6, 2010. This trading day is special since it is characterized by a significant flash crash,

marked by a large and temporary selling pressure (e.g., Kirilenko et al., 2017; Easley et al., 2011).

Figure 1 depicts the tick-by-tick transaction prices and trading volumes for PG on May 6, 2010,

throughout the trading day, providing an impression of the sudden drop in prices and increase in

traded shares affecting many constituents of the DJIA. Although it is immediately clear from the

figure that some relation exists between a stock’s price and its trading volume, the direction of

the information transfer is not readily deduced. That is, it is not clear what the dominant

direction in the predictive relation between stock prices and trading volume is.

1
Given that the econometric analysis of such high-frequency data comes with its very own considerations, we focus
on the analysis of daily data in the empirical application below.
Figure 1: Tick-by-tick transaction prices and trading volume of the Procter & Gamble
stock on May 6, 2010
Considering the stock of Procter & Gamble, the plots show (a) tick-by-tick transaction prices and (b) tick-by-tick
trading volumes for the trading day May 6, 2010.
[Panel (a): PG intraday transaction prices against time of day. Panel (b): PG intraday trading volume against time of day.]

Regarding theoretical considerations behind the stock price – trading volume relation, early

models concerned with sequential information arrival are put forward by Copeland (1976) and

Jennings et al. (1981) and indicate the presence of a bidirectional predictive relationship between

absolute stock returns and trading volume. Moreover, different tax- and non-tax-related trading

motives such as the timing of realized capital gains and losses as well as portfolio rebalancing and

contrarian strategies have also been proposed to induce a stock price – trading volume relation

(e.g., Lakonishok and Smidt, 1989). Other potential explanations involve mixture of distributions

models (e.g., Clark, 1973; Epps and Epps, 1976) or the positive-feedback trading strategies of

noise traders, which induce a positive directional relation from stock returns to trading volume

(e.g., De Long et al., 1990b).

In order to assure stationary time series, empirical studies usually consider stock return and

transformed trading volume (e.g., trading volume growth) series instead of the price and trading

volume series in levels. Moreover, most of the empirical evidence for the relation of stock returns

and trading volume is based on linear models. For example, Smirlock and Starks (1988) and Lee

and Rui (2002) rely on bivariate vector autoregressive (VAR) models and Granger causality

(Granger, 1969) tests. While the former find a positive relation between absolute price changes

(a measure for stock return volatility) and trading volume for individual-level stocks, the latter

discover that trading volume does not Granger-cause stock returns using daily index data from

the stock exchanges in New York, Tokyo and London.2 Chordia and Swaminathan (2000) provide
evidence that trading volume determines the lead-lag cross-correlations in stock returns, Gagnon

and Karolyi (2009) find further support for a large sample of internationally cross-listed stocks

and Chuang et al. (2009) use quantile regressions to show that the directional relation of trading

volume on stock returns is more heterogeneous across quantiles than the directional relation of

stock returns on trading volume. On a different note, Gallant et al. (1992) stress that large price

movements are followed by high trading volume and Chen (2012) finds that returns of the S&P

500 predict trading volume in both bear and bull markets, while trading volume predicts returns

only in bear markets.

By contrast, considerations of nonlinearities have less often been a central part of the discussion,

even though empirical evidence of nonlinear dependencies in stock returns is ample (for an early

reference, see Hinich and Patterson, 1985). More recently, the relevance of nonlinear predictions

of end-of-day traded volume has also been shown in Sancetta (2019). Turning to the directional

relation between stock returns and trading volume, Campbell et al. (1993) develop a model in

which expected future stock returns evolve as a nonlinear function of current and past returns as

well as trading volume and document corresponding empirical evidence. Subsequently, Hiemstra

2
Note that the concept of Granger causality is concerned with one time series’ ability to forecast present and
future values of another time series and is thus not to be confused with a true causal relationship.
and Jones (1994) propose nonlinear Granger causality tests to analyze the relation between daily

returns of the DJIA and trading volume growth on the New York Stock Exchange. They indeed find

a significant bidirectional nonlinear relationship between stock returns and trading volume. In

addition, McMillan (2007) shows that lagged trading volume can be used to improve forecast

accuracy of nonlinear stock return forecasting models.

In this paper, we add to the previous literature in two ways: (i) We quantify and test for the

nonlinear directional information transfer between stock returns and trading volume growth by

drawing upon a practical two-step procedure, and (ii) we obtain novel empirical results from daily

calendar-adjusted log-returns and volume growth for more than 400 stocks over an 18-year

period. The idea of the two-step procedure follows the same reasoning as the nonlinear Granger

causality test of Hiemstra and Jones (1994). First, we remove all linear auto- and cross-

correlations from the bivariate system by applying a linear filter in the form of a VAR model. Any

remaining residual information may then be attributed to nonlinearities in the bivariate system of
stock returns and trading volume growth. The second step is different from the nonlinear

Granger causality test proposed by Hiemstra and Jones (1994) in that we make use of Shannon

transfer entropy, as initially introduced by Schreiber (2000), to quantify the remaining nonlinear

residual information transfer. Derived from information theory, Shannon transfer entropy is a

nonparametric measure that is sensitive to any statistical dependencies between two time series.

Moreover, we are not only able to infer the existence of nonlinear residual information transfer

in the bivariate system of stock returns and trading volume growth, but also the dominant

direction of that information transfer, which constitutes an improvement over the usual

(non-)linear Granger causality tests. Thus, even in cases where the transfer entropy estimates point to

a bidirectional nonlinear residual information transfer, we can determine whether or not most of

the information flows from trading volume growth to stock returns or vice versa. In the appendix,

we provide several simulation experiments to demonstrate the problem that nonlinearities

induce in a bivariate system with regard to measuring information transfer and to illustrate the

applicability of the two-step procedure, described in detail below, in such settings.

Overall, our results indicate that after accounting for all linear auto- and cross-correlations in the

bivariate system of stock returns and trading volume growth there still exists a statistically

significant nonlinear residual information transfer in at least one direction for a large number of
stocks. While the nonlinear information transfer is bidirectional in many cases and no overall

pattern for all stocks emerges, information predominantly flows from stock returns to trading

volume growth for most of the stocks that we consider. This finding is not specific to stocks

pertaining to a certain industry. As can be expected, the magnitude of the information transfer is

slightly affected when we additionally account for volatility persistence, which could otherwise

be a potential driver of the information transfer (e.g., Clark, 1973; Hiemstra and Jones, 1994), and

when we separately consider the time periods before and after the financial crisis unfolding in

2008. However, our main finding of significant nonlinear information transfer between the VAR

residuals remains robust across all specifications. Therefore, linear empirical models do not seem

to suffice to adequately assess the theoretical stock price – trading volume relation.

The remainder of the paper is structured as follows: Section 2 briefly discusses models of

noninformational trading as a source for a nonlinear directional relation between stock prices and

trading volume. Next, Section 3 introduces the concept of Shannon transfer entropy and outlines
the idea of the two-step procedure used to test for nonlinear residual information. In Section 4,

the data set used in the empirical analysis is described and Section 5 presents the results of the

empirical analysis for the calendar-adjusted bivariate system of stock returns and volume growth

(Section 5.1) as well as when we additionally account for volatility persistence (Section 5.2). Lastly,

Section 6 offers some concluding remarks. Additional results and the simulation experiments are

relegated to the appendix.



2 Models of noninformational trading

Although many channels have been put forward to describe the relation between stock prices

and trading volume, as outlined above, we want to briefly consider one strand of the finance

literature in more detail: Models of noninformational trading. These models are especially

interesting since while the underlying reasons for trading may be different, they make similar

predictions. For example, models along the lines of De Long et al. (1990a) shed light on the effect

of investor sentiment (i.e., subjective beliefs about investment risks and future payoffs) on stock

prices and trading volume. Here, noninformational trading is based on shifting misperceptions of

noise traders related to future payoffs from a risky asset. In these behavioral models, noise

traders’ random beliefs influence prices since rational arbitrageurs, who also have a downward-

sloping demand for risky assets, do not aggressively push back prices to fundamentals when noise

traders experience a belief shock. These models predict that, on the one hand, downward price

pressure is generated by low investor sentiment and, on the other hand, high trading volume by

exceptionally high or low investor sentiment.

On a different note, the model of Campbell et al. (1993) turns to the empirical phenomenon that

the first-order daily autocorrelation of stock returns tends to decrease when trading volume is

high. Their model also builds on two types of traders, namely liquidity, or noninformational,

traders and other risk-averse traders that can be thought of as “market makers”.3 Contrary to

models in the spirit of De Long et al. (1990a), the model of Campbell et al. (1993) derives

noninformational trading from random shifts in the level of risk aversion related to the large

subset of traders that is made up of liquidity traders. In equilibrium, this change in the level of

risk aversion leads to an increase in the expected return of the risky asset (i.e., stock), which
compensates the risk-averse market makers for bearing the risk of holding that asset. Risk is

reallocated from the noninformational traders to the rest of the market since the former are

selling the stock, whereas the market makers accommodate this selling pressure. As a result,

trading volume increases, the current stock price falls, inducing a negative current return and

thus a rise in expected future returns.



Even though the differences between investor sentiment and risk aversion are of a rather

philosophical nature with regard to noise trader and liquidity trader theories (Tetlock, 2007), the

latter allows Campbell et al. (1993) to establish a direct link between trading volume and

expected future returns. Moreover, this link is nonlinear since, in their model, the expected

future stock return is related to the lagged stock return and an interaction term of lagged stock

return and trading volume. Such nonlinearities may pose a problem for econometricians when

trying to capture the relation between stock returns and trading volume with simple linear

models.
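
As a hedged illustration of the interaction term described above, the relation can be written as a conditional expectation. The coefficient names $\beta_0$, $\beta_1$ and $\beta_2$ are notational assumptions introduced here for illustration, and $V_t$ stands for a suitably transformed trading volume measure:

$$\mathbb{E}[r_{t+1} \mid r_t, V_t] = \beta_0 + (\beta_1 + \beta_2 V_t)\, r_t = \beta_0 + \beta_1 r_t + \beta_2 V_t r_t .$$

Under this sketch, the first-order autocorrelation of returns depends on volume through $\beta_1 + \beta_2 V_t$, so a negative $\beta_2$ would correspond to the autocorrelation decreasing when trading volume is high. The product $V_t r_t$ is the nonlinear term that a VAR with only lagged levels of the two variables cannot represent.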

3
Market makers in the sense of Grossman and Miller (1988), even though they may not be specialists on the
exchange and potentially hold positions for longer periods of time.
Coming back to the example illustrated in Figure 1, in a highly electronic market, a large shift in

the demand for risky assets may also originate with “noninformed” algorithms trading on a signal

and ultimately amplifying the selling pressure in that market (e.g., Kirilenko et al., 2017). While

this discussion only serves to provide a brief outline of one channel that may explain the

relationship between stock prices and trading volume, it is by no means exhaustive. For further

details, we refer the interested reader to the references listed in the previous section.

3 Information transfer and nonlinear dynamics

We draw upon the concept of Shannon transfer entropy to quantify empirically the residual

information transfer between stock returns and trading volume growth. Therefore, let us begin

by elaborating on this measure of information transfer in greater detail, before introducing our

two-step testing procedure.


3.1 Transfer entropy

Shannon transfer entropy is a model-free measure that characterizes the randomness of draws

from a specific probability distribution (for some general remarks, see Bossomaier et al., 2016;

Dehmer et al., 2017).4 Denote by I a discrete random variable that can take on n distinct values

and let i indicate one such outcome of I. Then, taking the probability mass function (pmf) of I,

$p(i)$, Shannon (1948) defines a measure of information content:5

$$h(i) = -\log p(i), \tag{1}$$

with $0 \le p(i) \le 1$ and $h(i) \ge 0$. It follows that $h(i)$ is large for small $p(i)$, i.e. unlikely outcomes

convey more information. By taking the expectation of Equation (1) over all possible outcomes i,

we obtain the Shannon entropy:

$$H_I = \mathbb{E}[h(i)] = -\sum_{i} p(i) \cdot \log p(i). \tag{2}$$

4
Entropy is also defined as the expected log-likelihood ratio in some contexts (Hansen and Sargent, 2008, p. 55)
5
The base of the logarithm only affects the unit of measurement. By taking the base 2 logarithm, the gain in
information when observing i is measured in bits.
Since this entropy is a univariate measure for uncertainty, it is maximized when I follows a

uniform distribution and decreases as the probability distribution of I becomes more concentrated.

Shannon entropy can, however, be extended to a bivariate measure of uncertainty by utilizing

the concept of mutual information, which is in turn based on the Kullback-Leibler divergence

(Kullback and Leibler, 1951). Considering two different pmf’s $p(i)$ and $q(i)$ for the same random

variable I, the Kullback-Leibler divergence is a measure for the difference between these two

pmf’s, as given by:

$$D_{KL}(p \,\|\, q) = \sum_{i} p(i) \cdot \log \frac{p(i)}{q(i)}. \tag{3}$$

Instead of considering two different pmf’s for the same random variable, we can also rewrite the

Kullback-Leibler divergence in Equation (3) in terms of the difference between the joint and the

marginal pmf’s of two discrete random variables I and J, which is then called mutual information:

$$M_{IJ} = \sum_{i} \sum_{j} p(i,j) \cdot \log \frac{p(i,j)}{p(i)\,p(j)}, \tag{4}$$

with $p(i)$ and $p(j)$ denoting the marginal pmf’s of I and J, respectively, and $p(i,j)$ denoting the

joint pmf. Comparing this equation with Equation (3), the intuition behind mutual information

$M_{IJ}$ can be stated analogously: It measures the difference between the joint distribution of the

random variables I and J, with pmf $p(i,j)$, and the product of their marginal distributions, with

pmf’s $p(i)$ and $p(j)$. Therefore, mutual information is an informational measure of any form of

statistical dependence among the two discrete random variables I and J. Note that $M_{IJ} = 0$ is

achieved only in case of statistical independence and $M_{IJ} > 0$ otherwise. While mutual

information is a symmetric measure, quantifying the decrease in uncertainty compared to the

case of statistical independence of I and J, for our intended empirical application to the bivariate

system of stock returns and trading volume growth, we need an asymmetric measure in order to

identify a dominant direction of the information transfer.
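
The quantities in Equations (2) to (4) can be illustrated with a minimal sketch in plain Python. The function names and the two toy joint distributions are assumptions made for this example only; base-2 logarithms are used so that results are in bits.

```python
import math


def entropy(pmf, base=2.0):
    """Shannon entropy of a pmf given as a list of probabilities (Equation 2)."""
    return -sum(p * math.log(p, base) for p in pmf if p > 0)


def kl_divergence(p, q, base=2.0):
    """Kullback-Leibler divergence between two pmf's of the same variable (Equation 3)."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)


def mutual_information(joint, base=2.0):
    """Mutual information from a joint pmf, joint[i][j] = p(i, j) (Equation 4)."""
    p_i = [sum(row) for row in joint]          # marginal pmf of I
    p_j = [sum(col) for col in zip(*joint)]    # marginal pmf of J
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p_ij in enumerate(row):
            if p_ij > 0:
                mi += p_ij * math.log(p_ij / (p_i[i] * p_j[j]), base)
    return mi


# Hypothetical toy distributions: independent vs. perfectly dependent binary variables.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]

h_uniform = entropy([0.5, 0.5])
mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

In this toy setting the mutual information is zero for the independent pair and one bit for the perfectly dependent pair, and it takes the same value when I and J are swapped, which is the symmetry that motivates the move to an asymmetric measure.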

To this end, Schreiber (2000) considers transition probabilities in order to introduce time series

dynamics into the framework of mutual information. Let I and J now denote stationary Markov

processes of order l and h, respectively. By measuring the deviation from the generalized Markov

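property $p(i_{t+1} \mid i_t^{(l)}, j_t^{(h)}) = p(i_{t+1} \mid i_t^{(l)})$, where $i_t^{(l)} = (i_t, \dots, i_{t-l+1})$ and

$j_t^{(h)} = (j_t, \dots, j_{t-h+1})$, Schreiber (2000) proposes to quantify the information transfer from

process J to I in the form of the Shannon transfer entropy, which is given by:

$$T_{J \to I} = \sum_{i} \sum_{j} p\big(i_{t+1}, i_t^{(l)}, j_t^{(h)}\big) \cdot \log \frac{p\big(i_{t+1} \mid i_t^{(l)}, j_t^{(h)}\big)}{p\big(i_{t+1} \mid i_t^{(l)}\big)}. \tag{5}$$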

Again, comparing Equation (5) with the previous equation for mutual information, one sees that

the two look similar. Both equations measure deviations from statistical independence. The

important change that enables one to go from mutual information to Shannon transfer entropy is

the introduction of the generalized Markov property. Thereby, this measure now allows us to

quantify the informational gain that we can achieve in predicting the future value of process I by

observing past values of J instead of only observing past values of I itself. The information flow

from I to J is quantified analogously by calculating $T_{I \to J}$, which allows us to infer the dominant direction

of the information transfer between the two processes by taking the difference of $T_{J \to I}$ and $T_{I \to J}$.

This step constitutes an improvement over (non-)linear Granger causality tests, which generally

do not permit deducing the dominant direction of the information transfer.
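
A minimal sketch of a plug-in estimator for Equation (5) may help to fix ideas. It assumes Markov orders l = h = 1, already discretized series, and relative frequencies as probability estimates; the function name `transfer_entropy` and the simulated example are assumptions for illustration, not the estimator used in the empirical analysis.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target, base=2.0):
    """Plug-in estimate of the transfer entropy from source to target with l = h = 1."""
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))  # (i_{t+1}, i_t, j_t)
    pairs_ij = Counter(zip(target[:-1], source[:-1]))             # (i_t, j_t)
    pairs_ii = Counter(zip(target[1:], target[:-1]))              # (i_{t+1}, i_t)
    singles = Counter(target[:-1])                                # i_t
    n = len(target) - 1
    te = 0.0
    for (i_next, i_now, j_now), count in triples.items():
        p_joint = count / n                                  # p(i_{t+1}, i_t, j_t)
        p_full = count / pairs_ij[(i_now, j_now)]            # p(i_{t+1} | i_t, j_t)
        p_own = pairs_ii[(i_next, i_now)] / singles[i_now]   # p(i_{t+1} | i_t)
        te += p_joint * math.log(p_full / p_own, base)
    return te


# Hypothetical example: process I copies the previous value of process J.
random.seed(0)
j_series = [random.randint(0, 1) for _ in range(5000)]
i_series = [0] + j_series[:-1]

te_j_to_i = transfer_entropy(j_series, i_series)
te_i_to_j = transfer_entropy(i_series, j_series)
net_flow = te_j_to_i - te_i_to_j  # a positive value points to J -> I as the dominant direction
```

In this constructed example the estimate from J to I is close to one bit, while the estimate in the opposite direction is close to zero, so the difference identifies J as the dominant source of information.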

Given that we work with financial time series, we have to discretize the time series before

estimating Shannon transfer entropy. Following Dimpfl and Peter (2013), we subsequently

partition our time series into three bins and use an encoded time series for estimation. Since we

are interested in general tail events, i.e., extreme residual values that potentially contain
nonlinear information, three bins should suffice for the application we have in mind. Specifying

two quantiles of the empirical distribution of the observed time series $\{y_t\}_{t=1}^{T}$, $q_1$ and $q_2$, such an

encoded time series $\{S_t\}_{t=1}^{T}$ is obtained as follows:

$$S_t = \begin{cases} 1 & \text{for } y_t \le q_1 \\ 2 & \text{for } q_1 < y_t < q_2 \\ 3 & \text{for } y_t \ge q_2 \end{cases} \tag{6}$$

The encoding replaces each value in $\{y_t\}_{t=1}^{T}$ by one of the three integers {1,2,3}, in line with

Equation (6). Lastly, in order to provide statistical inference, we follow Horowitz (2003) and

bootstrap the underlying Markov processes I and J based on the calculated transition

probabilities (see also the outline in Behrendt et al., 2019). While the statistical dependencies

between I and J are removed, dynamics of each process are preserved. Thus, a bootstrap sample

under the null hypothesis of independence, i.e., $T_{J \to I} = 0$ and $T_{I \to J} = 0$, can be obtained by repeated

estimation of the transfer entropy measure. We use this bootstrap sample to test for statistical

significance of the estimated Shannon transfer entropies by means of a one-sided test.
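
The three-bin encoding of Equation (6) can be sketched as follows. The 5% and 95% quantile levels and the simple order-statistic quantile rule are assumptions chosen for illustration; the text above only specifies that two quantiles of the empirical distribution are used.

```python
def encode_three_bins(series, lower=0.05, upper=0.95):
    """Replace each observation by 1, 2 or 3 according to two empirical quantiles."""
    ordered = sorted(series)
    n = len(ordered)
    q1 = ordered[int(lower * (n - 1))]  # lower quantile (simple order-statistic rule)
    q2 = ordered[int(upper * (n - 1))]  # upper quantile
    return [1 if y <= q1 else 3 if y >= q2 else 2 for y in series]


# Hypothetical example: the integers 1 to 100 in their natural order.
values = list(range(1, 101))
encoded = encode_three_bins(values)
n_lower_tail = encoded.count(1)
n_center = encoded.count(2)
n_upper_tail = encoded.count(3)
```

With this choice, bins 1 and 3 collect the lower and upper tail observations, and the large middle bin absorbs the bulk of the distribution, which matches the focus on tail events described above.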

3.2 Testing for residual information

Our approach to test for nonlinear residual information in the bivariate system of stock returns

and trading volume growth is straightforward and follows the general idea of Hiemstra and Jones

(1994): If we use a VAR model to act as a linear filter on a bivariate system in order to purge the

residual series of any linear auto- and cross-correlations, then all remaining predictive power of

one residual series for the other may be attributed to nonlinear information. Instead of the

nonlinear Granger causality test proposed by Hiemstra and Jones (1994), we draw upon the

concept of Shannon transfer entropy, as outlined above, to test for such information transfer

from one residual series to the other. There are two reasons for this choice: Firstly, the nonlinear
Granger causality test proposed by Hiemstra and Jones (1994) is based on the correlation integral,
which has initially been introduced by Grassberger and Procaccia (1983) to measure the similarity

of time series obtained from dissipative dynamical systems exhibiting chaotic behavior. However,

their test does not account for temporal correlation in the formulation of the correlation integrals

to be estimated and thus it is not clear that these estimators indeed suffice as measures of spatial

correlation (for a detailed discussion, see ch. 6 in Kantz and Schreiber, 2004). As a result, the

correlation integrals used to obtain estimates for the joint probabilities in their formulation of the

nonlinear Granger causality test are generally biased if some temporal correlation is still present

in the residual series. Secondly, and more importantly, in addition to detecting nonlinear residual

information, relying on Shannon transfer entropy allows us to infer the dominant direction of

the residual information transfer, which constitutes an improvement over Granger-type causality

tests. Moreover, bootstrap inference for these estimates is readily available.6

Let us be more precise and describe the idea of the two-step testing procedure applied in the

empirical analysis below in more detail. The first step involves the estimation of a bivariate

VAR model for the two appropriately standardized series $\{x_{1,t}\}_{t=1}^{T}$ and $\{x_{2,t}\}_{t=1}^{T}$:7

6
For a recent contribution to the literature, see Camacho et al. (2020), who develop a nonlinear Granger causality test
based on Shannon entropy for longitudinal data.
7
For more information on the general structure of a VAR model, please see, for example, Lütkepohl (2013).
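$$\begin{pmatrix} x_{1,t} \\ x_{2,t} \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} + \sum_{j=1}^{p} \begin{pmatrix} a_{11}^{(j)} & a_{12}^{(j)} \\ a_{21}^{(j)} & a_{22}^{(j)} \end{pmatrix} \begin{pmatrix} x_{1,t-j} \\ x_{2,t-j} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{pmatrix}, \tag{7}$$

where $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ and $\begin{pmatrix} a_{11}^{(j)} & a_{12}^{(j)} \\ a_{21}^{(j)} & a_{22}^{(j)} \end{pmatrix}$, $j = 1, \dots, p$, denote the vector of constants and the autoregressive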

coefficient matrices, respectively. The VAR-order p is chosen large enough to account for any

linear auto- and cross-correlations in the system. In a practical application, the lag order can be

determined with the usual information criteria. If there is any auto- or cross-correlation left, an

overparameterized specification should be estimated to purge the series of all remaining linear

correlation. Obtaining residual series free of any linear correlation is a crucial step in the context of

our investigation into the directional information transfer between stock returns and trading

volume growth. Only then are we able to look beyond linear dependencies into potentially

remaining nonlinear relations present in the residual series of the VAR model, $\{\hat{\varepsilon}_{1,t}\}_{t=1}^{T}$ and

$\{\hat{\varepsilon}_{2,t}\}_{t=1}^{T}$. For the innovations in Equation (7), we may relax the Gaussian white noise assumption,

which is usually made in the VAR context, since our main goal is not statistical inference related

to the VAR coefficients. Accordingly, in the second step, $T_{\hat{\varepsilon}_{1,t} \to \hat{\varepsilon}_{2,t}}$ and $T_{\hat{\varepsilon}_{2,t} \to \hat{\varepsilon}_{1,t}}$ are estimated for

the residual series and used to determine whether or not there is still a statistically significant

information transfer from one residual series to the other and, if this is the case, the dominant

direction of that information transfer. Note that the above two-step procedure can easily be

extended to more system variables by relying on group transfer entropy, as described by Dimpfl

and Peter (2018) in the context of volatility transmission. See Appendix 7.2 for some simulation

experiments underlining the usefulness of the two-step procedure in the case of nonlinear

relations between variables of interest.
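
The first step, the linear filter, can be sketched in plain Python as equation-by-equation ordinary least squares on a bivariate VAR(p). The helper names `solve` and `var_residuals`, the simulated data, and the use of the normal equations are assumptions for this illustration; standardization of the input series and lag selection via information criteria are omitted.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def var_residuals(x1, x2, p=1):
    """OLS residuals of a bivariate VAR(p) with constant, estimated equation by equation."""
    # Each regressor row holds a constant and p lags of both series.
    rows = [[1.0] + [s[t - j] for j in range(1, p + 1) for s in (x1, x2)]
            for t in range(p, len(x1))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    residuals = []
    for series in (x1, x2):
        y = series[p:]
        xty = [sum(r[a] * y_t for r, y_t in zip(rows, y)) for a in range(k)]
        beta = solve(xtx, xty)
        residuals.append([y_t - sum(b * v for b, v in zip(beta, r))
                          for r, y_t in zip(rows, y)])
    return residuals


# Hypothetical data: a stationary bivariate VAR(1) with Gaussian innovations.
random.seed(1)
x1, x2 = [0.0], [0.0]
for _ in range(499):
    prev1, prev2 = x1[-1], x2[-1]
    x1.append(0.5 * prev1 + 0.2 * prev2 + random.gauss(0, 1))
    x2.append(0.3 * prev2 + random.gauss(0, 1))

res1, res2 = var_residuals(x1, x2, p=1)
```

By construction, the OLS residuals are orthogonal to the constant and to all included lags, so any predictive content that one residual series still has for the other cannot stem from the linear terms of the fitted VAR. The two residual series would then be encoded and passed to a transfer entropy estimator in the second step.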

In finite samples, Shannon transfer entropy estimates are likely to be biased upwards. This is why

we follow Marschinski and Kantz (2002) and calculate a bias corrected, or effective, transfer

entropy through shuffling:

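$$ET_{\hat{\varepsilon}_i \to \hat{\varepsilon}_j} = T_{\hat{\varepsilon}_i \to \hat{\varepsilon}_j} - T_{\hat{\varepsilon}_i^{\text{shuffled}} \to \hat{\varepsilon}_j}, \quad i, j \in \{1,2\} \text{ and } i \ne j, \tag{8}$$

where $T_{\hat{\varepsilon}_i^{\text{shuffled}} \to \hat{\varepsilon}_j}$ denotes the estimated Shannon transfer entropy with a shuffled time series of

$\{\hat{\varepsilon}_{i,t}\}_{t=1}^{T}$. A shuffled time series is obtained from the encoded time series of $\{\hat{\varepsilon}_{i,t}\}_{t=1}^{T}$ by randomly

drawing realizations from it and generating a new time series from these draws. Doing so

destroys any statistical dependence between $\hat{\varepsilon}_i^{\text{shuffled}}$ and $\hat{\varepsilon}_j$, and $T_{\hat{\varepsilon}_i^{\text{shuffled}} \to \hat{\varepsilon}_j}$ converges to zero

for $T \to \infty$, while a non-zero value of $T_{\hat{\varepsilon}_i^{\text{shuffled}} \to \hat{\varepsilon}_j}$ is an indication of finite sample bias. We calculate

$ET_{\hat{\varepsilon}_i \to \hat{\varepsilon}_j}$ for residual information transfer in both directions by shuffling a sufficient number of

times and subtracting the mean over all shuffles from the Shannon transfer entropy that is

calculated as given by Equation (5).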
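
The bias correction in Equation (8) can be sketched as follows. The plug-in estimator with Markov orders l = h = 1, the use of a random permutation as the shuffle, the number of shuffles, and all function names are assumptions made for this illustration.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target, base=2.0):
    """Plug-in transfer entropy from source to target with l = h = 1."""
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_ij = Counter(zip(target[:-1], source[:-1]))
    pairs_ii = Counter(zip(target[1:], target[:-1]))
    singles = Counter(target[:-1])
    n = len(target) - 1
    te = 0.0
    for (i_next, i_now, j_now), count in triples.items():
        p_full = count / pairs_ij[(i_now, j_now)]
        p_own = pairs_ii[(i_next, i_now)] / singles[i_now]
        te += (count / n) * math.log(p_full / p_own, base)
    return te


def effective_transfer_entropy(source, target, n_shuffles=50, seed=0):
    """Raw transfer entropy minus the mean estimate over shuffled source series."""
    rng = random.Random(seed)
    raw = transfer_entropy(source, target)
    shuffled_estimates = []
    for _ in range(n_shuffles):
        shuffled = rng.sample(source, len(source))  # permutation destroys the dependence
        shuffled_estimates.append(transfer_entropy(shuffled, target))
    return raw - sum(shuffled_estimates) / n_shuffles


# Hypothetical encoded series with three symbols, as in Equation (6).
rng = random.Random(42)
source = [rng.choice([1, 2, 3]) for _ in range(500)]
noise = [rng.choice([1, 2, 3]) for _ in range(500)]   # unrelated to source
driven = [1] + source[:-1]                            # copies the lagged source

raw_noise = transfer_entropy(source, noise)
ete_noise = effective_transfer_entropy(source, noise)
ete_driven = effective_transfer_entropy(source, driven)
```

For the unrelated pair, the raw estimate is positive purely because of the finite sample, and the correction pulls it toward zero; for the driven pair, the effective estimate remains large because shuffling removes only the spurious component.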

4 Data

After the above introduction of the two-step procedure used to test for nonlinear residual

information transfer in the bivariate system of stock returns and trading volume growth, we now
turn to the description of the data set that is used in the subsequent empirical analysis. We
gather daily observations of stock prices and trading volumes for all constituents of the S&P 500

and DJIA between January 3, 2000, and December 29, 2017, from the Thomson Reuters database.

This amounts to 417 individual-level stocks for which we can test for nonlinear residual

information transfer.8 Since the time span of our data includes the global financial crisis in 2008

and 2009, we conduct our analysis not only on the full sample from 2000 to 2017 but also on two

subsamples, representing the pre-crisis period (2000 - 2007) and the post-crisis period (2010 -

2017). This leads to a maximum of 2263 observations for each stock in the pre-crisis and 2013

observations in the post-crisis sample, while in the overall sample we have a maximum of 4528

observations for each of the 417 stocks.9 For all of these stocks, daily closing prices are used to

calculate log-returns as given by $r_t = \log(P_t) - \log(P_{t-1})$, where $P_t$ denotes the respective

closing price on trading day t. We note that augmented Dickey-Fuller (ADF) tests reveal the

presence of unit roots, a sign of nonstationary behavior, in the daily trading volume series,

denoted by $V_t$, for some of the 417 individual-level stocks in the pre- and post-crisis samples. In

order to obtain stationary time series for our analysis, we calculate volume growth for each stock

and, to stay consistent across (sub)samples, proceed with the volume growth series $v_t$ instead of

8
The complete list of stocks and detailed descriptive statistics are available from the authors upon request.
9
Due to discrepancies in recorded trading days among the stocks in our sample, the actual number of observations
per stock and sample may differ. However, we only include stocks that have at least 4000 observations in the full
sample, i.e. during the period from 2000 to 2017.
trading volume in levels in all three samples. This is consistent with the application in Hiemstra

and Jones (1994) as explained in Section 1.
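As an illustration of this data preparation step, the following minimal Python sketch computes log-returns and volume growth from daily closing prices and trading volumes. It is a sketch under stated assumptions rather than the code used for the reported results: the function names and the five-day toy data are hypothetical, and volume growth is taken to be the log-difference of trading volume, which is one common definition and an assumption here.

```python
import numpy as np

def log_returns(prices):
    """Daily log-returns: ln(P_t) - ln(P_{t-1})."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))

def volume_growth(volumes):
    """Volume growth, taken here as the log-difference of volume (an assumption)."""
    v = np.asarray(volumes, dtype=float)
    return np.diff(np.log(v))

# Hypothetical closing prices and trading volumes for five trading days.
prices = [100.0, 101.0, 99.5, 100.2, 102.0]
volumes = [1.2e6, 1.5e6, 1.1e6, 1.3e6, 1.25e6]

r = log_returns(prices)     # four returns from five prices
v = volume_growth(volumes)  # four growth rates from five volume observations
```

Differencing shortens each series by one observation, so both series stay aligned on the same trading days.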

5 The relation between stock returns and trading volume growth

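For our empirical application, we first use a linear VAR model to filter out all linear auto- and
cross-correlations in the bivariate system of stock returns and trading volume growth, as given in
Equation (7). The VAR model with daily log-returns $r_t$ and volume growth $v_t$ reads
as follows:

$$
\begin{pmatrix} r_t \\ v_t \end{pmatrix}
= \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
+ \sum_{i=1}^{p} \begin{pmatrix} a_{11}^{(i)} & a_{12}^{(i)} \\ a_{21}^{(i)} & a_{22}^{(i)} \end{pmatrix}
\begin{pmatrix} r_{t-i} \\ v_{t-i} \end{pmatrix}
+ \begin{pmatrix} \varepsilon_{r,t} \\ \varepsilon_{v,t} \end{pmatrix}. \qquad (9)
$$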

As above, we relax the Gaussian white noise assumption for the innovations
$\{\varepsilon_{r,t}\}_{t=1}^{T}$ and $\{\varepsilon_{v,t}\}_{t=1}^{T}$ since our main objective is
not statistical inference within the VAR framework, but the application of a suitable linear
filter. Note that the two system variables $r_t$ and $v_t$ are replaced by calendar- and
volatility-adjusted series, indicated by superscripts (∗) and (∗∗), respectively, in the
following empirical analysis. We use the Bayesian information criterion (BIC) to determine the lag
length $p$ of the VAR for each stock individually. Even though the autocorrelation functions for
most stocks do not suggest any form of strong autocorrelation in the VAR residuals, we also
specify a heavily overparameterized version of each VAR model. These VAR models with lag
length equal to $p = 20$ serve to eliminate any potentially remaining linear dependencies not
captured by the VAR models with shorter BIC-determined lag lengths.10 In our discussion below,
we only report results for the VAR models whose lag length is determined via the BIC since
results for the overparameterized VAR models are qualitatively similar to the reported results.
Detailed results for these overparameterized models are relegated to Appendix 7.3. The residual
series from the VAR model in Equation (9) are then used to estimate Shannon transfer entropy as
a measure that quantifies any remaining nonlinear residual information. Considering that all
residual series of the different VAR models appear to exhibit a pronounced excess kurtosis, a
thorough investigation of the information contained in the tails of the residual distributions via
Shannon transfer entropy and the aforementioned discretization into three bins appears to be a
reasonable approach. We use 200 shuffles to compute the effective transfer entropies, and
statistical inference is based on 500 Markov block-bootstrap replications.

10 Since the highest lag length chosen by the BIC across all stocks is equal to $p = 15$, we have chosen $p = 20$ for
the overparameterized models.
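The linear filtering step can be sketched as follows. This is a minimal NumPy illustration, not the implementation behind the reported results: the function names `fit_var` and `select_lag_bic`, the simulated bivariate series and the maximum lag of 8 are hypothetical choices. The lag length is selected with the multivariate BIC on a common estimation sample, and the residuals of the selected model are the series that would be passed on to the second step.

```python
import numpy as np

def fit_var(y, p, start=None):
    """OLS fit of a VAR(p) with intercept; rows before `start` serve only as lags."""
    start = p if start is None else start
    n_obs = y.shape[0]
    lags = [y[start - i:n_obs - i] for i in range(1, p + 1)]
    X = np.hstack([np.ones((n_obs - start, 1))] + lags)
    Y = y[start:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef, Y - X @ coef

def select_lag_bic(y, max_lag):
    """Pick the lag length that minimises ln det(Sigma) + ln(n)/n * p * k^2."""
    k = y.shape[1]
    scores = {}
    for p in range(1, max_lag + 1):
        # Use the same estimation sample for every candidate lag length.
        _, resid = fit_var(y, p, start=max_lag)
        n = resid.shape[0]
        sigma = resid.T @ resid / n
        scores[p] = np.log(np.linalg.det(sigma)) + np.log(n) / n * p * k * k
    return min(scores, key=scores.get)

# Hypothetical bivariate series simulated from a VAR(1).
rng = np.random.default_rng(0)
A = np.array([[0.3, 0.0], [0.4, -0.2]])
y = np.zeros((2000, 2))
for t in range(1, 2000):
    y[t] = A @ y[t - 1] + rng.standard_normal(2)

p_hat = select_lag_bic(y, max_lag=8)
coef, resid = fit_var(y, p_hat)   # resid: linearly filtered series for step two
```

An overparameterized variant in the spirit of the text would simply call `fit_var` with a fixed, large lag length instead of `p_hat`.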

5.1 Empirical results for the calendar-adjusted bivariate system

Following the general arguments and references in Hiemstra and Jones (1994), we do not use the
raw stock return and trading volume growth time series as input to the bivariate VAR in Equation
(9), but apply a generic filter to the data beforehand. This filter corrects the mean and variance of
both the log-return and trading volume growth time series for the potential systematic influence
that a certain day of the week or month of the year can have on the observations. In order to do
so, we use the approach by Gallant et al. (1992) and create dummy variables for each day of the
week and each month of the year, collected in the matrix $D_t$. These serve as independent
variables for two linear regressions, which are summarized as follows:

$$
\begin{aligned}
\text{mean equation:} \quad & y_t = \beta \cdot D_t + \varepsilon_t \\
\text{variance equation:} \quad & \ln(\hat\varepsilon_t^2) = \gamma \cdot D_t + \eta_t,
\end{aligned} \qquad (10)
$$

where $y_t$ stands for either one of the system variables $r_t$ and $v_t$. For the mean equation,
the log-return and trading volume growth series, respectively, are regressed on the dummy
variables in $D_t$. Thus, the residuals of the mean equation $\hat\varepsilon_t$ should capture the
part of $y_t$ that is not explained by calendar effects. Using the logarithm of the squared
residuals of this first regression, $\ln(\hat\varepsilon_t^2)$, as dependent variable, the same
dummy variables in $D_t$ also serve as regressors for the variance equation. The idea behind this
second regression is essentially the same as for the mean equation. Lastly, the final
calendar-adjusted time series are then obtained by standardizing the residuals of the mean
equation:

$$
y_t^{*} = \frac{\hat\varepsilon_t}{\exp(\hat\gamma \cdot D_t / 2)}. \qquad (11)
$$

Applied to our two system variables, this results in calendar-adjusted log-returns $r_t^{*}$ and
trading volume growth $v_t^{*}$.
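The two regressions and the final standardization can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used for the reported results: the function names, the simulated series with a Monday effect and the stylized 21-day months are hypothetical. Because the weekday and month dummies are jointly collinear, the sketch relies on `numpy.linalg.lstsq`, whose minimum-norm solution still yields unique fitted values.

```python
import numpy as np

def calendar_dummies(weekday, month):
    """Dummy matrix with one column per weekday (0..4) and per month (1..12)."""
    wd = (np.asarray(weekday)[:, None] == np.arange(5)).astype(float)
    mo = (np.asarray(month)[:, None] == np.arange(1, 13)).astype(float)
    return np.hstack([wd, mo])

def calendar_adjust(y, D):
    """Calendar adjustment of the mean and variance of a series."""
    # Mean equation: regress the series on the calendar dummies.
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    eps = y - D @ beta
    # Variance equation: regress log squared residuals on the same dummies.
    # (Assumes no residual is exactly zero, otherwise the logarithm is undefined.)
    gamma, *_ = np.linalg.lstsq(D, np.log(eps ** 2), rcond=None)
    # Standardise the mean-equation residuals by the fitted calendar scale.
    return eps / np.exp(D @ gamma / 2.0)

# Hypothetical series with a higher mean and volatility on Mondays (weekday 0).
rng = np.random.default_rng(1)
n_days = 1000
weekday = np.arange(n_days) % 5
month = (np.arange(n_days) // 21) % 12 + 1
scale = np.where(weekday == 0, 2.0, 1.0)
y = 0.5 * (weekday == 0) + scale * rng.standard_normal(n_days)

D = calendar_dummies(weekday, month)
y_star = calendar_adjust(y, D)
```

By construction, the log squared values of the adjusted series are orthogonal to every calendar dummy, which is the sense in which the calendar pattern in the variance has been removed.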

Table 1 summarizes the results of our two-step procedure using these two calendar-adjusted
series as input variables to the bivariate system for the complete time period from 2000 to 2017
(Full), the time period from 2000 to 2007 (Pre-Crisis) and the time period from 2010 to 2017
(Post-Crisis). Columns denoted by “# stocks” show the number of stocks for which the respective
test yields estimates that are statistically significant at the 10% significance level.11 Moreover,
columns denoted by “+” (“−”) in Panels B and C of Table 1 report the number of stocks that show
a positive (negative) sign of the difference between the effective transfer entropy estimates,
$T_{\hat\varepsilon_{r^*} \to \hat\varepsilon_{v^*}} - T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^*}}$.
Thus, a positive difference indicates a dominant nonlinear residual
information transfer from calendar-adjusted log-returns to calendar-adjusted trading volume
growth and a negative difference a dominant information transfer in the opposite direction. In
order to speak of a dominant direction of the information transfer, we require at least one of the
effective transfer entropy estimates to be statistically significant.12
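The classification rule described above can be written down compactly. The following Python sketch is illustrative only: the function name, its arguments and the example numbers are hypothetical, and the 10% threshold mirrors the significance level used in the tables.

```python
def dominant_direction(te_r_to_v, te_v_to_r, p_r_to_v, p_v_to_r, alpha=0.10):
    """Classify the dominant direction of information transfer for one stock.

    A direction is assigned only if at least one of the two effective transfer
    entropy estimates is significant at level `alpha`.
    """
    if min(p_r_to_v, p_v_to_r) >= alpha:
        return None  # neither estimate is significant
    diff = te_r_to_v - te_v_to_r
    if diff > 0:
        return "returns -> volume growth"   # counted in column "+"
    if diff < 0:
        return "volume growth -> returns"   # counted in column "−"
    return None

# Hypothetical estimates and bootstrap p-values for three stocks.
print(dominant_direction(0.020, 0.005, 0.01, 0.40))  # returns -> volume growth
print(dominant_direction(0.020, 0.005, 0.30, 0.40))  # None
print(dominant_direction(0.001, 0.010, 0.50, 0.03))  # volume growth -> returns
```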



Panel A of Table 1 reports the results of the linear Granger causality tests on the original time
series of calendar-adjusted stock returns and trading volume growth. Importantly, note that we
are not proposing the linear Granger causality test as a benchmark for Shannon transfer entropy.
We rather use Granger causality as a first step in our analysis to capture the linear dependencies
present between the two time series under investigation. As Panel A shows, for all three samples,
we find a considerable number of stocks for which either of the two tested Granger causality
hypotheses can be rejected. It also seems that the first tested hypothesis (calendar-adjusted
log-returns do not Granger cause calendar-adjusted trading volume growth) is rejected for more
stocks than the second Granger hypothesis (calendar-adjusted trading volume
growth does not Granger cause calendar-adjusted log-returns). In fact, we find that the former
hypothesis is rejected for at least twice as many stocks as the latter.
This finding is consistent across all three samples and indicates that,
on average, calendar-adjusted stock returns would be a more reliable predictor of trading volume
growth than vice versa. However, Panel A only informs about linear dependencies. The Granger
causality tests also cannot inform about the dominant direction of the information transfer. If, for
example, for a given stock, we can reject both Granger hypotheses, we would still not know
whether the information transfer from stock returns to trading volume growth is dominating or
vice versa. The following Panels B and C can help to shed light on this question. They furthermore
illustrate clearly that important information would remain inaccessible if one considered only
the linear Granger causality tests. As elaborated above and illustrated by the simulation results in
the appendix, any kind of nonlinear information left in the residuals is not captured by this testing
procedure. Thus, finding a statistically significant information flow among the VAR residuals,
which, by construction, are purged of linear dependencies, would imply that a linear model such
as a VAR is not suitable to fully assess the directional information transfer between stock returns
and trading volume growth.

11 Our main findings are robust to an alternative choice of the significance level of 5%. Results are available upon
request.
12 While it is generally possible to use the bootstrap results to test for statistical significance of the differences as
well, we have chosen this pragmatic approach. The reason is that it is possible to obtain a statistically significant
difference, even though the transfer entropy estimates themselves are statistically insignificant. Obviously, such an
“information transfer” does not make sense economically. On the other hand, if one of the estimates is significant and
the other is not, the latter is fairly small and statistically indistinguishable from zero. Hence, it seems reasonable to still
refer to the difference of these estimates as an indicator of the dominant direction of the nonlinear residual
information transfer.

For estimation of the effective transfer entropies, we use a Markov order of $k = l = 1$. Given
that the first step removes all linear auto- and cross-correlations in the bivariate system, we
consider Markov processes of small order in the transfer entropy calculation.13 Moreover, we
choose $q_1 = 0.05$ and $q_2 = 0.95$ as the first pair of quantiles used for discretization and
$q_1 = 0.1$ and $q_2 = 0.9$ as the second pair. In our specific application, the quantile choice of
$q_1 = 0.01$ and $q_2 = 0.99$ would imply relying on about 20 observations for inference in our
pre- and post-crisis samples, which is why we do not report these results here. They are, however,
available upon request. As one can see from the first row of the column “# stocks” in Panel B for
the full time period, we find significant estimates for the residual information transfer from
$r_t^{*}$ to $v_t^{*}$ for 226 of the 417 stocks. This implies that there appears to exist an
informational gain in predicting future values of trading volume growth when utilizing past values
of stock returns for more than half of the stocks of our sample. For 91 out of 417 stocks, there
also appears to exist an informational gain using past observations of trading volume growth to
predict stock returns, as indicated by the second row of that column. Similar findings can be
observed for the pre-crisis and post-crisis time periods, with the number of significant estimates
being the lowest in the post-crisis time period. However, even here, we still find significant,
bidirectional residual information transfer for more than a quarter of all stocks.

13 Results for Markov order $k = l = 2$ can be found in Table 7.4 in the appendix. Generally, for the other
specifications below, additional results for Markov order $k = l = 2$ are reported in Appendix 7.3.
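To make the second step concrete, the following NumPy sketch discretizes two series into three bins using the 5% and 95% quantiles, estimates Shannon transfer entropy for Markov order one, and subtracts the average over shuffled source series to obtain an effective transfer entropy. It is a simplified illustration under stated assumptions: all function names and the simulated series are hypothetical, and the Markov block bootstrap used for inference in the paper is not implemented here.

```python
import numpy as np

def discretize(x, q_low=0.05, q_high=0.95):
    """Map a series to symbols 0, 1, 2 using two empirical quantiles."""
    lo, hi = np.quantile(x, [q_low, q_high])
    return np.digitize(x, [lo, hi])

def transfer_entropy(source, target):
    """Shannon transfer entropy source -> target in bits, Markov order one."""
    x_next, x_prev, y_prev = target[1:], target[:-1], source[:-1]
    joint = np.zeros((3, 3, 3))
    np.add.at(joint, (x_next, x_prev, y_prev), 1.0)
    joint /= len(x_next)               # p(x_{t+1}, x_t, y_t)
    p_xy = joint.sum(axis=0)           # p(x_t, y_t)
    p_xx = joint.sum(axis=2)           # p(x_{t+1}, x_t)
    p_x = joint.sum(axis=(0, 2))       # p(x_t)
    te = 0.0
    for a, b, c in zip(*np.nonzero(joint)):
        ratio = joint[a, b, c] * p_x[b] / (p_xy[b, c] * p_xx[a, b])
        te += joint[a, b, c] * np.log2(ratio)
    return te

def effective_transfer_entropy(source, target, n_shuffles=200, seed=0):
    """Transfer entropy minus its average over shuffled source series."""
    rng = np.random.default_rng(seed)
    shuffled = [transfer_entropy(rng.permutation(source), target)
                for _ in range(n_shuffles)]
    return transfer_entropy(source, target) - float(np.mean(shuffled))

# Hypothetical example: y drives the scale of x one period later, a purely
# nonlinear dependence that leaves the two series linearly uncorrelated.
rng = np.random.default_rng(3)
n = 3000
y = rng.standard_normal(n)
x = rng.standard_normal(n)
x[1:] *= 1.0 + 2.0 * np.abs(y[:-1])

x_d, y_d = discretize(x), discretize(y)
ete_y_to_x = effective_transfer_entropy(y_d, x_d)
ete_x_to_y = effective_transfer_entropy(x_d, y_d)
```

Shuffling the source series destroys its temporal link to the target while keeping its marginal distribution, so the shuffled average approximates the small-sample bias of the estimator. In this example the estimate from y to x should clearly exceed the one in the opposite direction.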

Another insight we can gain from Panel B is that for most of the stocks the dominant direction of
information transfer in all samples appears to point from stock returns to trading volume growth
since we find more stocks for which the difference between the effective transfer entropy
estimates $T_{\hat\varepsilon_{r^*} \to \hat\varepsilon_{v^*}} - T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^*}}$
is positive than we observe negative differences (columns “+”
and “−”, respectively). This observation of a dominant direction of information flow from returns
to trading volume growth is even more pronounced for the 10% and 90% quantile choices in
Panel C. Especially for the post-crisis sample, we now find that for 221 stocks the information
predominantly flows from returns to trading volume growth, while for only 63 stocks information
flows in the opposite direction. Thus, the results of both quantile choices indicate that the
informational gain we can achieve by using past observations of $r_t^{*}$ for predicting future
values of $v_t^{*}$ is larger than the informational gain in predicting future values of $r_t^{*}$
using past observations of $v_t^{*}$. While these observations are in line with the Granger
causality tests, their informational content is richer by revealing the direction in which the
information predominantly flows. They also show that even after the two series have been purged of
any linear dependencies, there still is important information available in the residual series.
Moreover, with respect to these findings, no particular industry clustering can be found. Thus,
while we discover no overall pattern that holds for all stocks in our sample, these mixed findings
are well in line with the previous empirical literature. A further disentanglement of the potential
reasons why the dominant direction of the information transfer is different for some stocks than
for others is beyond the scope of this paper.

In summary, our findings indicate that a linear model such as a VAR model does not provide an
accurate assessment of the directional relation between stock returns and trading volume growth
since it does not account for all information in the bivariate system. We find evidence for
statistically significant nonlinear residual information transfer from calendar-adjusted log-returns
to calendar-adjusted trading volume growth and vice versa in our sample of stocks and across the
three time periods considered. In terms of magnitude of the transfer entropy estimates, the
dominant direction of information transfer points from returns towards volume for a majority of
stocks. This nonlinear residual information transfer in the bivariate system is not detected by the
usual linear Granger causality test.14

14 By construction, the Granger causality test does not reveal any statistically significant information transfer
between the residual series obtained from the VAR model.

Table 1: Calendar-adjusted time series (Markov order 1, BIC)

This table summarizes the number of stocks for which the two-step procedure detects significant
nonlinear residual information. $T_{\hat\varepsilon_{r^*} \to \hat\varepsilon_{v^*}}$ in Panels B and C
refers to the effective transfer entropy from $\{\hat\varepsilon_{r^*,t}\}$ to
$\{\hat\varepsilon_{v^*,t}\}$, while $T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^*}}$ refers to the
effective transfer entropy from $\{\hat\varepsilon_{v^*,t}\}$ to $\{\hat\varepsilon_{r^*,t}\}$, where
$r^*$ and $v^*$ stand for the calendar-adjusted log-return and trading volume growth series,
respectively. The results of the linear Granger causality tests performed on the calendar-adjusted
series of stock returns and trading volume growth are given in Panel A. For each of the tests, the
column “# stocks” reports the number of stocks that show a p-value of that respective test of < 0.1.
The columns “+” and “−”, respectively, in Panels B and C count for how many of these stocks the
difference in effective transfer entropies is positive or negative. For the estimation of the
effective transfer entropy, the Markov order is set to $k = l = 1$. For discretization, the 5% and
95% (Panel B) and the 10% and 90% (Panel C) quantiles of the respective empirical distribution of
the residuals are chosen. 200 shuffles are used in the computation of
$T_{\hat\varepsilon_{r^*} \to \hat\varepsilon_{v^*}}$ and $T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^*}}$
and statistical inference is based on 500 bootstrap replications. The number of lags used in the
VAR model, from which the residual series are obtained, is determined by the BIC. The “Full”
period ranges from January 2000 to December 2017, while “Pre-Crisis” ranges from January 2000 to
December 2007 and “Post-Crisis” ranges from January 2010 to December 2017.

Panel A
                                                         Full                 Pre-Crisis           Post-Crisis
                                                         # stocks             # stocks             # stocks
Granger $r^* \to v^*$                                    322                  221                  198
Granger $v^* \to r^*$                                    92                   100                  72

Panel B ($q_1 = 0.05$, $q_2 = 0.95$)
                                                         Full                 Pre-Crisis           Post-Crisis
                                                         # stocks    +    −   # stocks    +    −   # stocks    +    −
$T_{\hat\varepsilon_{r^*} \to \hat\varepsilon_{v^*}}$    226       205   21   159       154    5   134       123   11
$T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^*}}$    91         35   56   53         15   38   50         10   40

Panel C ($q_1 = 0.10$, $q_2 = 0.90$)
                                                         Full                 Pre-Crisis           Post-Crisis
                                                         # stocks    +    −   # stocks    +    −   # stocks    +    −
$T_{\hat\varepsilon_{r^*} \to \hat\varepsilon_{v^*}}$    278       259   19   168       161    7   221       211   10
$T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^*}}$    122        70   52   66         15   51   63         30   33

5.2 Empirical results accounting for persistence in return volatility

While accounting for calendar effects in the stock return and trading volume growth time series
seems to be a reasonable approach, persistence in stock return volatility could be responsible for
(part of) the detected nonlinear residual information transfer in the bivariate system and thus
influence the results. Therefore, we now further investigate whether the findings in the previous
section prove robust when additionally accounting for persistence in return volatility. Hiemstra
and Jones (1994) argue along the lines of the common factor model proposed by Clark (1973), in
which daily stock returns $r_t$ and trading volume $V_t$ are both modeled as a function of the
latent speed of information flow $I_t$ at time $t$ in the following manner:

$$
\begin{aligned}
r_t &= f(I_t) \cdot \varepsilon_t \\
V_t &= g(I_t),
\end{aligned} \qquad (12)
$$

where $f(\cdot)$ and $g(\cdot)$ are some unknown functions and $\varepsilon_t$ denotes i.i.d. noise.
It follows from Equation (12) that return variance, which can be expressed as
$\mathrm{Var}(r_t) = f^2(I_t) \cdot \mathrm{Var}(\varepsilon_t)$, is
influenced by the latent speed of information flow. Since trading volume is a function of the
latter, both volume and returns are driven by the same factor. As a result, lagged values of
trading volume, $V_t$, capturing temporal dependence of the latent speed of information flow,
might induce a spurious correlation between trading volume and stock returns that may
erroneously be detected as nonlinear residual information. This correlation is rooted in the
persistence of return volatility, which is induced by the dependence of $\mathrm{Var}(r_t)$ on
$I_t$. Therefore, we can eliminate this spurious correlation by accounting for the persistence in
return volatility, as explained by Andersen (1996). In order to do so, we follow Hiemstra and Jones
(1994) and use Nelson’s (1991) exponential GARCH (EGARCH(p,q)) to model return volatility
persistence. The EGARCH(p,q) model is especially appealing since it allows for a leverage effect in
the volatility equation:

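$$
\begin{aligned}
r_t^{*} &= \varepsilon_t, \qquad \varepsilon_t \mid \mathcal{F}_{t-1} \sim N(0, \sigma_t^2) \\
\ln(\sigma_t^2) &= \alpha_0 + \sum_{j=1}^{p} \alpha_j \ln(\sigma_{t-j}^2)
+ \sum_{i=1}^{q} \beta_i \left[ \phi \frac{\varepsilon_{t-i}}{\sigma_{t-i}}
+ \gamma \left( \left| \frac{\varepsilon_{t-i}}{\sigma_{t-i}} \right| - \sqrt{2/\pi} \right) \right],
\end{aligned} \qquad (13)
$$

where $t = 1, 2, \ldots, T$, $\mathcal{F}_{t-1}$ denotes the information set in period $t - 1$,
$\alpha_0, \ldots, \alpha_p$ are the parameters for the GARCH effects, and
$\beta_1, \ldots, \beta_q$ are those for the ARCH effects. While we fix the lag length of the latter
to be $q = 1$, we let the lag length $p$ for the GARCH effects vary between 1 and 6 and use the BIC
to determine the optimal model fit for each stock individually. As a result, the calendar- and
volatility-adjusted stock return time series $r_t^{**}$ are then given as:

$$
r_t^{**} = r_t^{*} / \tilde\sigma_t, \qquad (14)
$$

where the conditional standard deviations $\tilde\sigma_t$ are obtained via the EGARCH(p,q) model
for each stock.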
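The volatility adjustment can be illustrated with a short NumPy sketch of the EGARCH(1,1) recursion. It is a sketch under stated assumptions and not the estimation procedure used in the paper: the parameter values are hypothetical and treated as known, whereas in practice they would be estimated by maximum likelihood for each stock with the lag length chosen via the BIC. The function name `egarch_filter` and the initialization at the sample variance are likewise assumptions.

```python
import numpy as np

def egarch_filter(r, alpha0, alpha1, beta1, phi, gamma):
    """Conditional standard deviations from an EGARCH(1,1) recursion."""
    r = np.asarray(r, dtype=float)
    log_var = np.empty(len(r))
    log_var[0] = np.log(np.var(r))   # start at the sample variance (assumption)
    for t in range(1, len(r)):
        z = r[t - 1] / np.exp(log_var[t - 1] / 2.0)   # standardised lagged return
        news = phi * z + gamma * (abs(z) - np.sqrt(2.0 / np.pi))
        log_var[t] = alpha0 + alpha1 * log_var[t - 1] + beta1 * news
    return np.exp(log_var / 2.0)

# Hypothetical parameters; phi < 0 produces a leverage effect.
params = dict(alpha0=0.0, alpha1=0.95, beta1=1.0, phi=-0.1, gamma=0.2)

# Simulate a return series from the same recursion.
rng = np.random.default_rng(2)
n = 2000
z = rng.standard_normal(n)
log_var = np.zeros(n)
for t in range(1, n):
    news = params["phi"] * z[t - 1] + params["gamma"] * (abs(z[t - 1]) - np.sqrt(2.0 / np.pi))
    log_var[t] = params["alpha0"] + params["alpha1"] * log_var[t - 1] + params["beta1"] * news
r_star = np.exp(log_var / 2.0) * z

# Volatility-adjusted returns: divide by the filtered standard deviation.
sigma = egarch_filter(r_star, **params)
r_star_star = r_star / sigma
```

If the volatility model is adequate, the adjusted series has a standard deviation close to one and no remaining volatility clustering.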

Table 2 summarizes the results for the linear Granger causality tests and the two-step procedure
for the residuals of the VAR in Equation (9) using calendar- and volatility-adjusted stock returns as
the first system variable and calendar-adjusted trading volume growth as the second system
variable. The Granger causality tests performed on the two adjusted time series, as they are
summarized in Panel A, reveal linear dependencies between stock returns and trading volume
growth for between 12% and 72% of the 417 stocks of the sample. As in the case of the
calendar-adjusted stock returns, we find for the volatility-adjusted time series that these
adjusted stock returns on average constitute a more reliable predictor of trading volume growth
than vice versa. As Table 2 reveals, we find at least three times as many stocks for which we
reject the first Granger hypothesis (calendar- and volatility-adjusted log-returns do not Granger
cause calendar-adjusted trading volume growth) as stocks for which we can reject the
second (calendar-adjusted trading volume growth does not Granger cause calendar- and
volatility-adjusted log-returns).

Turning to the effective transfer entropy estimates and using Markov orders of $k = l = 1$ and
quantiles $q_1 = 0.05$ and $q_2 = 0.95$ in Panel B and $q_1 = 0.1$ and $q_2 = 0.9$ in Panel C, the
results resemble those found in Panels B and C of Table 1. We find a substantial number of stocks
for which the effective transfer entropy estimates are significantly different from zero. While the
total number of stocks with significant effective transfer entropy estimates decreases in
comparison with Table 1, the dominant direction of nonlinear residual information transfer for
the majority of stocks remains unchanged, pointing from stock returns to trading volume
growth. This pattern becomes more visible when widening the quantiles used for discretization
from 5% and 95% to 10% and 90%: In both pre- and post-crisis samples, the number of stocks for
which we detect a dominant direction of information flow from returns to trading volume growth
increases by 30% to 70%, while for the opposite direction of information flow, the number of
significant stocks either increases by only 1% (post-crisis) or even decreases (pre-crisis). Again,
these findings are generally in line with the Granger causality test results in Panel A. However,
the latter do not reveal any dominant direction of the information transfer nor do they capture
these nonlinear dependencies. Summarizing the findings in Table 2, accounting for volatility
persistence in stock returns holds two important implications: Firstly, it seems that some of the
nonlinear residual information transfer detected in Section 5.1 is due to volatility effects,
similar to the findings in Hiemstra and Jones (1994). Secondly, the main conclusions drawn in the
previous section (the failure of a linear model to accurately assess the (nonlinear) relation
between stock returns and trading volume growth and a dominant direction of information
transfer from returns to trading volume growth for the majority of stocks) still appear to be valid
after accounting for volatility persistence in stock returns.
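The percentages quoted in this discussion follow directly from the “# stocks” counts in Table 2. The short Python check below reproduces them; the variable and function names are hypothetical, and the counts are copied from Panels A to C.

```python
n_stocks = 417

# Panel A: number of stocks with significant Granger causality (both hypotheses).
granger_counts = [299, 211, 175, 65, 65, 52]
shares = [count / n_stocks for count in granger_counts]
print(f"{min(shares):.1%} to {max(shares):.1%}")   # 12.5% to 71.7%

def change(panel_b, panel_c):
    """Relative change in the number of significant stocks from Panel B to Panel C."""
    return (panel_c - panel_b) / panel_b

# Returns -> volume growth: pre-crisis and post-crisis.
print(f"{change(115, 151):+.1%}, {change(105, 180):+.1%}")   # +31.3%, +71.4%
# Volume growth -> returns: pre-crisis and post-crisis.
print(f"{change(53, 42):+.1%}, {change(79, 80):+.1%}")       # -20.8%, +1.3%
```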


Table 2: Volatility-adjusted time series (Markov order 1, BIC)

This table summarizes the number of stocks for which the two-step procedure detects significant
nonlinear residual information. $T_{\hat\varepsilon_{r^{**}} \to \hat\varepsilon_{v^*}}$ in Panels B and C
refers to the effective transfer entropy from $\{\hat\varepsilon_{r^{**},t}\}$ to
$\{\hat\varepsilon_{v^*,t}\}$, while $T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^{**}}}$ refers to
the effective transfer entropy from $\{\hat\varepsilon_{v^*,t}\}$ to $\{\hat\varepsilon_{r^{**},t}\}$,
where $r^{**}$ and $v^*$ stand for the calendar- and volatility-adjusted log-return and
calendar-adjusted trading volume growth series, respectively. The results of the linear Granger
causality tests performed on the volatility-adjusted series of stock returns and trading volume
growth are given in Panel A. For each of the tests, the column “# stocks” reports the number of
stocks that show a p-value of that respective test of < 0.1. The columns “+” and “−”, respectively,
in Panels B and C count for how many of these stocks the difference in effective transfer entropies
is positive or negative. For the estimation of the effective transfer entropy, the Markov order is
set to $k = l = 1$. For discretization, the 5% and 95% (Panel B) and the 10% and 90% (Panel C)
quantiles of the respective empirical distribution of the residuals are chosen. 200 shuffles are
used in the computation of $T_{\hat\varepsilon_{r^{**}} \to \hat\varepsilon_{v^*}}$ and
$T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^{**}}}$ and statistical inference is based on 500
bootstrap replications. The number of lags used in the VAR model, from which the residual series
are obtained, is determined by the BIC. The “Full” period ranges from January 2000 to December
2017, while “Pre-Crisis” ranges from January 2000 to December 2007 and “Post-Crisis” ranges from
January 2010 to December 2017.

Panel A
                                                            Full                 Pre-Crisis           Post-Crisis
                                                            # stocks             # stocks             # stocks
Granger $r^{**} \to v^*$                                    299                  211                  175
Granger $v^* \to r^{**}$                                    65                   65                   52

Panel B ($q_1 = 0.05$, $q_2 = 0.95$)
                                                            Full                 Pre-Crisis           Post-Crisis
                                                            # stocks    +    −   # stocks    +    −   # stocks    +    −
$T_{\hat\varepsilon_{r^{**}} \to \hat\varepsilon_{v^*}}$    186       174   12   115       111    4   105        93   12
$T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^{**}}}$    72         21   51   53         13   40   79          8   71

Panel C ($q_1 = 0.10$, $q_2 = 0.90$)
                                                            Full                 Pre-Crisis           Post-Crisis
                                                            # stocks    +    −   # stocks    +    −   # stocks    +    −
$T_{\hat\varepsilon_{r^{**}} \to \hat\varepsilon_{v^*}}$    249       229   20   151       144    7   180       165   15
$T_{\hat\varepsilon_{v^*} \to \hat\varepsilon_{r^{**}}}$    122        58   64   42          9   33   80         24   56

6 Concluding remarks

In this paper, we have applied a practical two-step procedure to test for nonlinear residual
information in the bivariate system of calendar-adjusted log-returns and calendar-adjusted
trading volume growth for a sample of 417 individual-level stocks over an 18-year period.
The procedure draws upon the concept of Shannon transfer entropy in the second step, which is
not only a highly versatile nonparametric measure to quantify any kind of statistical dependence
between two time series but also allows us to infer the dominant direction of the information
transfer. In our application to the bivariate system of stock returns and trading volume growth,
this constitutes a clear improvement over the usual (non-)linear Granger causality test.

Our main results are robust when considering three different time periods and when additionally
accounting for volatility persistence in stock returns: We find evidence for statistically significant
nonlinear residual information transfer from stock returns to trading volume growth and vice
versa, where for most stocks this nonlinear residual information predominantly flows from
returns to trading volume growth. Importantly, the question of why the dominant direction of
the nonlinear information transfer is different for some stocks than for others merits a closer
investigation and is left for future research. From a technical point of view, given the large
fraction of stocks for which we find such nonlinear residual information transfer, it may be
argued that the widespread use of linear models to assess this directional relation does not
constitute a sufficient approach. Thus, empirical applications such as forecasting and trading are
likely to benefit from incorporating nonlinear dynamics in their modeling approaches, for which
our practical procedure may provide helpful guidance.

We can think of several extensions to further validate our findings. For example, while our sample
already includes a large number of individual-level stocks, it would be interesting to look
specifically at small- to mid-cap stocks, which are not part of the S&P 500 or DJIA. Moreover,
considering specific industries that, on the one hand, have a high impact on real economic activity
or, on the other hand, are important from a systemic risk point of view might be a worthwhile
endeavor. The two-step procedure could also be used to study the directional relation of returns and
trading volume for cryptocurrencies, which regularly display highly volatile periods. It should also
be emphasized that the two-step procedure is not restricted to our specific application but may
be used to shed light on other important questions involving financial time series. From a more
technical point of view, it is also possible to extend the second-step Shannon transfer entropy
measure to more than two variables, possibly extending the testing procedure to
higher-dimensional systems (for a group transfer entropy measure see, for example, Dimpfl and Peter,
2018).
References

Andersen, T. G. (1996) Return volatility and trading volume: An information flow interpretation of

stochastic volatility, Journal of Finance, 51, 169–204.

Behrendt, S., Dimpfl, T., Peter, F. J. and Zimmermann, D. J. (2019) RTransferEntropy – Quantifying

information flow between different time series using effective transfer entropy, SoftwareX, 10,

100265.

Bossomaier, T., Barnett, L., Harré, M. and Lizier, J. T. (2016) An introduction to transfer entropy:

Information flow in complex systems, Springer International Publishing, Basel.

of
Brock, W. (1991) Causality, chaos, explanation and prediction in economics and finance, in

ro
Beyond belief: Randomness, prediction and explanation in science (Eds.) J. Casti and A. Karlqvist,

CRC Press, Boca Raton, Fla. -p


re
Camacho, M., Romeu, A. and Ruiz-Marin, M. (2020) Symbolic transfer entropy test for causality in

longitudinal data, Economic Modelling, in press.


lP

Campbell, J. Y., Grossman, S. J. and Wang, J. (1993) Trading volume and serial correlation in stock
na

returns, Quarterly Journal of Economics, 108, 905–939.


ur

Chen, S.-S. (2012) Revisiting the empirical linkages between stock returns and trading volume,

Journal of Banking and Finance, 36, 1781–1788.


Jo

Chordia, T. and Swaminathan, B. (2000) Trading volume and cross-autocorrelations in stock

returns, Journal of Finance, 55, 913–935.

Chuang, C.-C., Kuan, C.-M. and Lin, H.-Y. (2009) Causality in quantiles and dynamic stock return -

volume relations, Journal of Banking and Finance, 33, 1351–1360.

Clark, P. K. (1973) A subordinated stochastic process model with finite variance for speculative

prices, Econometrica, 41, 135–155.

Copeland, T. E. (1976) A model of asset trading under the assumption of sequential information

arrival, Journal of Finance, 31, 1149–1168.

De Long, J. B., Shleifer, A., Summers, L. H. and Waldmann, R. J. (1990a) Noise trader risk in

financial markets, Journal of Political Economy, 98, 703–738.


24
De Long, J. B., Shleifer, A., Summers, L. H. and Waldmann, R. J. (1990b) Positive feedback

investment strategies and destabilizing rational speculation, Journal of Finance, 45, 379– 395.

Dehmer, M., Emmert-Streib, F., Chen, Z., Li, X. and Shi, Y. (Eds.) (2017) Mathematical foundations

and applications of graph entropy, John Wiley & Sons, Weinheim.

Dimpfl, T. and Peter, F. J. (2013) Using transfer entropy to measure information flows between

financial markets, Studies in Nonlinear Dynamics and Econometrics, 17, 85–102.

Dimpfl, T. and Peter, F. J. (2018) Analyzing volatility transmission using group transfer entropy,

Energy Economics, 75, 368–376.

of
Easley, D., De Prado, M. M. L. and O’Hara, M. (2011) The microstructure of the flash crash: Flow

ro
toxicity, liquidity crashes and the probability of informed trading, Journal of Portfolio

Management, 37, 118–128. -p


re
Epps, T. W. and Epps, M. L. (1976) The stochastic dependence of security price changes and

transaction volumes: Implications for the mixture-of-distributions hypothesis, Econometrica, 44,


lP

305–321.
na

Gagnon, L. and Karolyi, G. A. (2009) Information, trading volume, and international stock return

comovements: Evidence from cross-listed stocks, Journal of Financial and Quantitative Analysis,
ur

44, 953–986.
Jo

Gallant, A. R., Rossi, P. E. and Tauchen, G. E. (1992) Stock prices and volume, Review of Financial

Studies, 5, 199–242.

Granger, C. W. J. (1969) Investigating causal relations by econometric modles and crossspectral

methods, Econometrica, 37, 424–438.

Grassberger, P. and Procaccia, I. (1983) Characterization of strange attractors, Physical Review

Letters, 50, 346–349.

Grossman, S. J. and Miller, M. H. (1988) Liquidity and market structure, Journal of Finance, 43,

617–633.

Hansen, L. P. and Sargent, T. J. (2008) Robustness, Princeton university press.

25
Hiemstra, C. and Jones, J. D. (1994) Testing for linear and nonlinear Granger causality in the stock price-volume relation, Journal of Finance, 49, 1639–1664.

Hinich, M. J. and Patterson, D. M. (1985) Evidence of nonlinearity in daily stock returns, Journal of Business and Economic Statistics, 3, 69–77.

Horowitz, J. L. (2003) Bootstrap methods for Markov processes, Econometrica, 71, 1049–1082.

Jennings, R. H., Starks, L. T. and Fellingham, J. C. (1981) An equilibrium model of asset trading with sequential information arrival, Journal of Finance, 36, 143–161.

Kantz, H. and Schreiber, T. (2004) Nonlinear time series analysis, Cambridge University Press, New York, second edn.

Karpoff, J. M. (1987) The relation between price changes and trading volume: A survey, Journal of Financial and Quantitative Analysis, 22, 109–126.

Kirilenko, A., Kyle, A. S., Samadi, M. and Tuzun, T. (2017) The Flash Crash: High-frequency trading in an electronic market, Journal of Finance, 72, 967–998.

Kullback, S. and Leibler, R. A. (1951) On information and sufficiency, Annals of Mathematical Statistics, 22, 79–86.

Lakonishok, J. and Smidt, S. (1989) Past price changes and current trading volume, Journal of Portfolio Management, 15, 18.

Lee, B.-S. and Rui, O. M. (2002) The dynamic relationship between stock returns and trading volume: Domestic and cross-country evidence, Journal of Banking and Finance, 26, 51–78.

Lütkepohl, H. (2013) Vector autoregressive models, in Handbook of Research Methods and Applications in Empirical Macroeconomics, Edward Elgar Publishing.

Marschinski, R. and Kantz, H. (2002) Analysing the information flow between financial time series: An improved estimator for transfer entropy, European Physical Journal B: Condensed Matter and Complex Systems, 30, 275–281.

McMillan, D. G. (2007) Non-linear forecasting of stock returns: Does volume help?, International Journal of Forecasting, 23, 115–126.

Nelson, D. B. (1991) Conditional heteroskedasticity in asset returns: A new approach, Econometrica, 59, 347–370.

Sancetta, A. (2019) Intraday end-of-day volume prediction, Journal of Financial Econometrics (Advance Article).

Schreiber, T. (2000) Measuring information transfer, Physical Review Letters, 85, 461–464.

Shannon, C. E. (1948) A mathematical theory of communication, Bell System Technical Journal, 27, 379–423.

Smirlock, M. and Starks, L. (1988) An empirical analysis of the stock price - volume relationship, Journal of Banking and Finance, 12, 31–41.

Tetlock, P. C. (2007) Giving content to investor sentiment: The role of media in the stock market, Journal of Finance, 62, 1139–1168.

7 Appendix

7.1 Application to simulated time series

In the following, we illustrate how conventional approaches to measuring information transfer fail when nonlinear information is present in a bivariate system. Furthermore, we demonstrate the applicability of the two-step procedure outlined in Section 3.2 to detect the nonlinear information transfer in such scenarios using three instructive simulation experiments. Note that for each bivariate system, we report detailed results for one simulation experiment only. Qualitatively, however, the results do not change significantly for larger numbers of replications. Monte Carlo results with 200 replications of each simulation and different values for $T$ can be found in Section 7.2. In most cases, the two-step procedure correctly identifies the residual information transfer. For each of the models, we take $\{\varepsilon_{i,t}\}_{t=1}^{T}$, $i, j \in \{1, 2\}$ and $i \neq j$, to be Gaussian white noise with zero mean and unit variance. Moreover, for every univariate time series in each of the bivariate systems, we simulate 4000 observations, i.e., $\{x_{1,t}\}_{t=1}^{4000}$ and $\{x_{2,t}\}_{t=1}^{4000}$. The first model is the same as in the simulation experiment of Dimpfl and Peter (2013), who make use of Shannon transfer entropy to subsequently measure the information transfer between the markets for credit default swaps and corporate bonds as well as the information transfer between market risk and credit risk.


Model (1)

$$
\begin{aligned}
x_{1,t} &= 0.2\,x_{1,t-1} + \varepsilon_{1,t}, \\
x_{2,t} &= x_{1,t-1} + \varepsilon_{2,t}.
\end{aligned}
\tag{7.1}
$$

In this model, $\{x_{1,t}\}$ follows a stationary autoregressive (AR) process of order one. While the model does not allow $\{x_{2,t}\}$ to have any effect on $\{x_{1,t}\}$, $\{x_{2,t}\}$ itself depends linearly on the lagged value of the first system variable, superimposed with Gaussian white noise. The second model is similar to the first; however, the effect of the lagged first system variable on the second system variable is now nonlinear. To avoid negative values of the lagged first system variable, its absolute value is taken (see Behrendt et al., 2019).

Model (2)

$$
\begin{aligned}
x_{1,t} &= 0.2\,x_{1,t-1} + \varepsilon_{1,t}, \\
x_{2,t} &= \sqrt{|x_{1,t-1}|} + \varepsilon_{2,t}.
\end{aligned}
\tag{7.2}
$$

In the third model, $\{x_{2,t}\}$ now follows a stationary AR process of order one, whereas $\{x_{1,t}\}$ depends on an interaction term of both lagged system variables, which is superimposed with Gaussian white noise. This model is related to an example mentioned by Hiemstra and Jones (1994). It dates back to Brock (1991), who uses a similar setup to illustrate how a linear Granger causality test may fail to reveal an existing nonlinear dynamic relationship.

Model (3)

$$
\begin{aligned}
x_{1,t} &= 0.4\,x_{1,t-1} \cdot x_{2,t-1} + \varepsilon_{1,t}, \\
x_{2,t} &= 0.2\,x_{2,t-1} + \varepsilon_{2,t}.
\end{aligned}
\tag{7.3}
$$
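
As a minimal sketch of the three data-generating processes (7.1) to (7.3), the following Python function simulates one realization of each bivariate system. The function name `simulate_model`, the burn-in period and the seed handling are assumptions made for illustration and are not taken from the original study. The nonlinearity in Model (2) is read here as the square root of the absolute lagged value.

```python
import numpy as np


def simulate_model(model, T=4000, burn_in=200, seed=0):
    """Simulate one realization of Model (1), (2) or (3).

    Returns two arrays of length T. The burn-in observations are discarded
    so that the arbitrary zero starting values do not matter (an assumption
    of this sketch, not a detail reported in the text).
    """
    rng = np.random.default_rng(seed)
    n = T + burn_in
    # Gaussian white noise with zero mean and unit variance
    eps = rng.standard_normal((n, 2))
    x1 = np.zeros(n)
    x2 = np.zeros(n)
    for t in range(1, n):
        if model == 1:    # linear dependence of x2 on lagged x1
            x1[t] = 0.2 * x1[t - 1] + eps[t, 0]
            x2[t] = x1[t - 1] + eps[t, 1]
        elif model == 2:  # nonlinear dependence of x2 on lagged x1
            x1[t] = 0.2 * x1[t - 1] + eps[t, 0]
            x2[t] = np.sqrt(abs(x1[t - 1])) + eps[t, 1]
        elif model == 3:  # x1 driven by an interaction of both lagged variables
            x1[t] = 0.4 * x1[t - 1] * x2[t - 1] + eps[t, 0]
            x2[t] = 0.2 * x2[t - 1] + eps[t, 1]
        else:
            raise ValueError("model must be 1, 2 or 3")
    return x1[burn_in:], x2[burn_in:]
```

For example, `x1, x2 = simulate_model(2, T=4000)` yields a pair of series in which the lagged first variable is linearly almost uncorrelated with the second variable although the second variable depends on it.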

Given the true lag structure, for each of the three bivariate systems, we first estimate a VAR(1) model to purge the residual series of any linear auto- and cross-correlations. In the second step, after obtaining $\{\hat{\varepsilon}_{1,t}\}_{t=1}^{4000}$ and $\{\hat{\varepsilon}_{2,t}\}_{t=1}^{4000}$, the effective transfer entropy for both possible directions of the residual information transfer is estimated. To this end, we set the Markov order for both processes to $l = h = 1$ and choose the 5% as well as the 95% quantile of the respective empirical distribution for discretization. It should be noted that the Markov orders $l$ and $h$ should be set to the same order to facilitate interpretation of the results (Schreiber, 2000). Setting $l = h = 1$ is appropriate given the true lag structure of the three models and the fact that only Gaussian white noise is considered in these simulation experiments; it is also more efficient from a computational point of view. Setting $l > 1$ and $h > 1$ might nevertheless be reasonable if dynamical residual dependencies lasting longer than one time lag can be expected. Moreover, estimation of the effective transfer entropy is based on 150 shuffles and statistical inference rests on 400 bootstrap replications of the underlying Markov processes. For comparison, we also compute the F-statistic and corresponding p-value of the usual linear Granger causality test for both directions of the residual information transfer. Similar to the Markov order, we consider the case where the respective other system variable is included with one lag in the unrestricted model (see footnote 15). The results are reported in Table 7.1.
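
The two-step procedure described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the VAR(1) is fitted by ordinary least squares with an intercept, entropy is measured in bits, and the shuffle correction subtracts the average transfer entropy obtained from randomly permuted source series. The Markov-process bootstrap used for the p-values is omitted.

```python
import numpy as np


def var1_residuals(x1, x2):
    """Step 1: OLS residuals of a VAR(1) with intercept (one column per equation)."""
    y = np.column_stack([x1, x2])
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return y[1:] - X @ beta


def discretize(z, lower=0.05, upper=0.95):
    """Map values to the symbols 0, 1, 2 using two empirical quantiles as bin edges."""
    return np.digitize(z, np.quantile(z, [lower, upper]))


def transfer_entropy(source, target, n_symbols=3):
    """Shannon transfer entropy (bits) from source to target with l = h = 1."""
    counts = np.zeros((n_symbols,) * 3)
    # joint frequencies of (target_{t+1}, target_t, source_t)
    np.add.at(counts, (target[1:], target[:-1], source[:-1]), 1)
    p_xyz = counts / counts.sum()
    p_yz = p_xyz.sum(axis=0, keepdims=True)      # p(target_t, source_t)
    p_xy = p_xyz.sum(axis=2, keepdims=True)      # p(target_{t+1}, target_t)
    p_y = p_xyz.sum(axis=(0, 2), keepdims=True)  # p(target_t)
    mask = p_xyz > 0
    # ratio p(x | y, z) / p(x | y), evaluated only on cells with positive mass
    ratio = (p_xyz * p_y)[mask] / (p_xy * p_yz)[mask]
    return float(np.sum(p_xyz[mask] * np.log2(ratio)))


def effective_transfer_entropy(source, target, n_shuffles=150, seed=0):
    """Step 2: transfer entropy minus its average over shuffled source series."""
    rng = np.random.default_rng(seed)
    shuffled = [transfer_entropy(rng.permutation(source), target)
                for _ in range(n_shuffles)]
    return transfer_entropy(source, target) - float(np.mean(shuffled))
```

Given two simulated series `x1` and `x2`, the residual symbols are obtained with `res = var1_residuals(x1, x2)` followed by `discretize(res[:, 0])` and `discretize(res[:, 1])`, and the two directions are compared by calling `effective_transfer_entropy` with the arguments swapped. Shuffling destroys the temporal link between source and target, so the subtracted term approximates the small-sample bias of the estimator. In this sketch the estimate can be slightly negative when no information transfer is present.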

As can be seen, for the first model, the effective transfer entropy estimates are statistically insignificant in both directions, correctly indicating that there is no nonlinear residual information transfer present after having applied a linear filter to the time series. On the other hand, $ET_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$ for the second model and $ET_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$ for the third model are statistically significant. Thus, for these models, the effective transfer entropy correctly detects the nonlinear residual information transfer, which the VAR(1) model fails to remove in the first step. By contrast, the linear Granger causality tests fail to reject the null hypothesis of no residual information transfer for all three models. While this is not an issue for the first model, relying on linear models would leave out potentially important nonlinear information when the underlying bivariate system entails nonlinearities, as is the case in the latter two models.
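
The linear benchmark can be sketched as a one-lag Granger causality F-test. The function name `granger_f_test` is hypothetical, and the p-value uses the large-sample chi-square approximation with one degree of freedom instead of the exact F distribution. This simplification is made here to keep the sketch free of extra dependencies and makes little difference for sample sizes such as T = 4000.

```python
import math

import numpy as np


def granger_f_test(cause, effect):
    """Test H0: one lag of `cause` does not help to predict `effect`.

    Restricted model:   effect_t = a + b * effect_{t-1} + u_t
    Unrestricted model: effect_t = a + b * effect_{t-1} + c * cause_{t-1} + u_t
    Returns the F-statistic and an approximate (large-sample) p-value.
    """
    y = effect[1:]
    ones = np.ones(len(y))
    X_r = np.column_stack([ones, effect[:-1]])
    X_u = np.column_stack([ones, effect[:-1], cause[:-1]])

    def rss(X):
        # residual sum of squares of the OLS fit of y on X
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return float(np.sum((y - X @ beta) ** 2))

    rss_r, rss_u = rss(X_r), rss(X_u)
    dof = len(y) - X_u.shape[1]
    f_stat = (rss_r - rss_u) / (rss_u / dof)  # one restriction in the numerator
    # P(chi2_1 > f) = erfc(sqrt(f / 2)); approximation to the F(1, dof) tail
    p_value = math.erfc(math.sqrt(max(f_stat, 0.0) / 2.0))
    return f_stat, p_value
```

Applied to the raw series of Model (1), this test yields a very large F-statistic for the direction from the first to the second variable. Applied to VAR-filtered residuals, it has little left to detect by construction, which is the point made in footnote 15.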

Table 7.1: Results for simulated time series

The table depicts for each of the three models (i) the results of the two-step estimation procedure in the form of effective transfer entropy estimates and, for comparison, (ii) the results of the usual linear Granger causality test. In the first step, each model is filtered by a VAR(1) model. $ET_{\hat{\varepsilon}_i \to \hat{\varepsilon}_j}$ in the first two rows refers to the effective transfer entropy from $\{\hat{\varepsilon}_{i,t}\}$ to $\{\hat{\varepsilon}_{j,t}\}$, where $i, j \in \{1, 2\}$ and $i \neq j$. Analogously, $\mathrm{Granger}_{\hat{\varepsilon}_i \to \hat{\varepsilon}_j}$ in the last two rows refers to the respective linear Granger causality test. While for the effective transfer entropy both the estimate and corresponding bootstrapped p-value are given, for the Granger causality test only the p-value of the F-statistic is reported. For the estimation of the effective transfer entropy, the Markov order of both processes is set to $l = h = 1$. For discretization, the 5% as well as 95% quantiles of the respective empirical distribution are chosen. 150 shuffles are used and statistical inference is based on 400 bootstrap replications. The linear Granger causality test assumes one lag of the respective other system variable in the unrestricted model.

                                                                  Model (1)           Model (2)           Model (3)
                                                                  Estimate  p-value   Estimate  p-value   Estimate  p-value
$ET_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$                0.0008    0.1125    0.0054    0.0000    0.0000    0.3925
$ET_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$                0.0000    0.4175    0.0000    0.3150    0.0061    0.0000
$\mathrm{Granger}_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$  -         0.9822    -         0.8154    -         0.9939
$\mathrm{Granger}_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$  -         0.9665    -         0.5571    -         0.8348

15 Obviously, the results of the linear Granger causality test follow, by construction, from the first step of the procedure. We report these results merely as an illustration of the problem that might be encountered in a practical application, not as a benchmark.

7.2 Monte Carlo results

In Tables 7.2 and 7.3, we present Monte Carlo results for the simulation experiments outlined above. While the true lag structure is known in the simulation experiments, the BIC is used to empirically determine the lag length of the VAR model in the first step. Thus, the setup is the same as in the empirical applications. For each of the three models, Tables 7.2 and 7.3 report the percent chosen, i.e., the relative number of statistically significant occurrences of measured information transfer. In Table 7.2, statistical significance is determined at the 10% significance level, whereas the 5% significance level is used in Table 7.3. The number of replications is set to $R = 200$. As can be seen, the two-step procedure selects the true nonlinear residual information transfer in almost all cases. Only in very few cases is an information transfer that should not be present discovered by the Shannon transfer entropy measure or even the linear Granger causality test.
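
The "percent chosen" statistic reported in Tables 7.2 and 7.3 can be sketched as follows. The helper names `percent_chosen` and `monte_carlo`, as well as the way the random number generators are seeded, are assumptions for illustration. Any simulation routine and any test that returns a p-value can be plugged in through the `simulate` and `test` arguments.

```python
import numpy as np


def percent_chosen(p_values, alpha):
    """Share of replications whose p-value falls below the significance level."""
    p = np.asarray(p_values, dtype=float)
    return float(np.mean(p < alpha))


def monte_carlo(simulate, test, n_replications=200, alpha=0.10, seed=0):
    """Apply `test` to independent draws from `simulate` and report the rejection share.

    `simulate(rng)` must return a tuple of arguments for `test`,
    and `test(*args)` must return a p-value.
    """
    # one independent random stream per replication
    streams = np.random.SeedSequence(seed).spawn(n_replications)
    p_values = [test(*simulate(np.random.default_rng(s))) for s in streams]
    return percent_chosen(p_values, alpha)
```

For a direction without information transfer, a well-calibrated test should produce a share close to the chosen significance level, while for a direction with true information transfer the share should approach one as the sample size grows.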

Table 7.2: Extensive results for simulated time series, 10% significance level

The table depicts for each of the three models described in Section 7.1 (i) the results of the two-step estimation procedure and (ii) the results of the linear Granger causality test for the simulated series. Sample sizes vary from $T = 2000$ to $T = 12000$ as indicated in each respective column. For each of these sample sizes, we report the percent chosen, which gives the relative number of statistically significant occurrences of the respective test procedure over all $R = 200$ replications. Statistical significance is determined with the 10% significance level.

                                                                  T = 2000    T = 4000    T = 8000    T = 12000
Model (1)
$ET_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$                0.09        0.09        0.07        0.10
$ET_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$                0.07        0.08        0.08        0.10
$\mathrm{Granger}_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$  0.00        0.00        0.00        0.00
$\mathrm{Granger}_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$  0.02        0.02        0.02        0.02
Model (2)
$ET_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$                0.99        1.00        1.00        1.00
$ET_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$                0.12        0.10        0.10        0.09
$\mathrm{Granger}_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$  0.00        0.02        0.00        0.02
$\mathrm{Granger}_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$  0.02        0.02        0.01        0.01
Model (3)
$ET_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$                0.09        0.07        0.10        0.12
$ET_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$                0.98        1.00        1.00        1.00
$\mathrm{Granger}_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$  0.00        0.00        0.00        0.00
$\mathrm{Granger}_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$  0.00        0.00        0.00        0.00

Table 7.3: Extensive results for simulated time series, 5% significance level

The table depicts for each of the three models described in Section 7.1 (i) the results of the two-step estimation procedure and (ii) the results of the linear Granger causality test for the simulated series. Sample sizes vary from $T = 2000$ to $T = 12000$ as indicated in each respective column. For each of these sample sizes, we report the percent chosen, which gives the relative number of statistically significant occurrences of the respective test procedure over all $R = 200$ replications. Statistical significance is determined with the 5% significance level.

                                                                  T = 2000    T = 4000    T = 8000    T = 12000
Model (1)
$ET_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$                0.04        0.05        0.04        0.04
$ET_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$                0.04        0.02        0.04        0.04
$\mathrm{Granger}_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$  0.00        0.00        0.00        0.00
$\mathrm{Granger}_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$  0.01        0.00        0.00        0.01
Model (2)
$ET_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$                0.97        1.00        1.00        1.00
$ET_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$                0.08        0.06        0.06        0.05
$\mathrm{Granger}_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$  0.00        0.00        0.00        0.01
$\mathrm{Granger}_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$  0.00        0.00        0.00        0.00
Model (3)
$ET_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$                0.05        0.02        0.06        0.06
$ET_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$                0.94        1.00        1.00        1.00
$\mathrm{Granger}_{\hat{\varepsilon}_1 \to \hat{\varepsilon}_2}$  0.00        0.00        0.00        0.00
$\mathrm{Granger}_{\hat{\varepsilon}_2 \to \hat{\varepsilon}_1}$  0.00        0.00        0.00        0.00

7.3 Supplementary tables

Table 7.4: Calendar-adjusted time series (Markov order 2, BIC)

This table summarizes the number of stocks for which the two-step procedure detects significant nonlinear residual information. $ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$ in Panels B and C refers to the effective transfer entropy from $\{\hat{\varepsilon}_{r^{*},t}\}$ to $\{\hat{\varepsilon}_{v^{*},t}\}$, while $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$ refers to the effective transfer entropy from $\{\hat{\varepsilon}_{v^{*},t}\}$ to $\{\hat{\varepsilon}_{r^{*},t}\}$, where $r^{*}$ and $v^{*}$ stand for the calendar-adjusted log-return and trading volume growth series, respectively. The results of the linear Granger causality tests performed on the calendar-adjusted series of stock returns and trading volume growth are given in Panel A. For each of the tests, the column “# stocks” reports the number of stocks with a p-value below 0.1 in the respective test. The columns “+” and “−”, respectively, in Panels B and C count for how many of these stocks the difference in effective transfer entropies is positive or negative. For the estimation of the effective transfer entropy, the Markov order is set to $l = h = 2$. For discretization, the 5% and 95% (Panel B) and the 10% and 90% (Panel C) quantiles of the respective empirical distribution of the residuals are chosen. 200 shuffles are used in the computation of $ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$ and $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$ and statistical inference is based on 500 bootstrap replications. The number of lags used in the VAR model, from which the residual series are obtained, is determined by the BIC. The “Full” period ranges from January 2000 to December 2017, while “Pre-Crisis” ranges from January 2000 to December 2007 and “Post-Crisis” ranges from January 2010 to December 2017.

Panel A
                                                                  Full          Pre-Crisis    Post-Crisis
                                                                  # stocks      # stocks      # stocks
$\mathrm{Granger}_{r^{*} \to v^{*}}$                              322           221           198
$\mathrm{Granger}_{v^{*} \to r^{*}}$                              92            100           72

Panel B: 5% and 95% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$    234       218   16    197       179   18    132       127   5
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$    209       139   70    188       108   80    81        38    43

Panel C: 10% and 90% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$    258       240   18    164       153   11    115       110   5
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$    130       86    44    116       72    44    43        19    24

Table 7.5: Volatility-adjusted time series (Markov order 2, BIC)

This table summarizes the number of stocks for which the two-step procedure detects significant nonlinear residual information. $ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$ in Panels B and C refers to the effective transfer entropy from $\{\hat{\varepsilon}_{r^{**},t}\}$ to $\{\hat{\varepsilon}_{v^{*},t}\}$, while $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$ refers to the effective transfer entropy from $\{\hat{\varepsilon}_{v^{*},t}\}$ to $\{\hat{\varepsilon}_{r^{**},t}\}$, where $r^{**}$ and $v^{*}$ stand for the calendar- and volatility-adjusted log-return and calendar-adjusted trading volume growth series, respectively. The results of the linear Granger causality tests performed on the volatility-adjusted series of stock returns and trading volume growth are given in Panel A. For each of the tests, the column “# stocks” reports the number of stocks with a p-value below 0.1 in the respective test. The columns “+” and “−”, respectively, in Panels B and C count for how many of these stocks the difference in effective transfer entropies is positive or negative. For the estimation of the effective transfer entropy, the Markov order is set to $l = h = 2$. For discretization, the 5% and 95% (Panel B) and the 10% and 90% (Panel C) quantiles of the respective empirical distribution of the residuals are chosen. 200 shuffles are used in the computation of $ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$ and $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$ and statistical inference is based on 500 bootstrap replications. The number of lags used in the VAR model, from which the residual series are obtained, is determined by the BIC. The “Full” period ranges from January 2000 to December 2017, while “Pre-Crisis” ranges from January 2000 to December 2007 and “Post-Crisis” ranges from January 2010 to December 2017.

Panel A
                                                                  Full          Pre-Crisis    Post-Crisis
                                                                  # stocks      # stocks      # stocks
$\mathrm{Granger}_{r^{**} \to v^{*}}$                             299           211           175
$\mathrm{Granger}_{v^{*} \to r^{**}}$                             65            65            52

Panel B: 5% and 95% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$   196       184   12    132       124   8     128       115   13
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$   77        42    35    72        23    49    66        23    43

Panel C: 10% and 90% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$   211       203   8     104       102   2     99        93    6
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$   49        25    24    33        14    19    41        14    27

Table 7.6: Calendar-adjusted time series (Markov order 1, p = 20)

This table summarizes the number of stocks for which the two-step procedure detects significant nonlinear residual information. $ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$ in Panels B and C refers to the effective transfer entropy from $\{\hat{\varepsilon}_{r^{*},t}\}$ to $\{\hat{\varepsilon}_{v^{*},t}\}$, while $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$ refers to the effective transfer entropy from $\{\hat{\varepsilon}_{v^{*},t}\}$ to $\{\hat{\varepsilon}_{r^{*},t}\}$, where $r^{*}$ and $v^{*}$ stand for the calendar-adjusted log-return and trading volume growth series, respectively. The results of the linear Granger causality tests performed on the calendar-adjusted series of stock returns and trading volume growth are given in Panel A. For each of the tests, the column “# stocks” reports the number of stocks with a p-value below 0.1 in the respective test. The columns “+” and “−”, respectively, in Panels B and C count for how many of these stocks the difference in effective transfer entropies is positive or negative. For the estimation of the effective transfer entropy, the Markov order is set to $l = h = 1$. For discretization, the 5% and 95% (Panel B) and the 10% and 90% (Panel C) quantiles of the respective empirical distribution of the residuals are chosen. 200 shuffles are used in the computation of $ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$ and $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$ and statistical inference is based on 500 bootstrap replications. The number of lags used in the VAR model, from which the residual series are obtained, is set to $p = 20$. The “Full” period ranges from January 2000 to December 2017, while “Pre-Crisis” ranges from January 2000 to December 2007 and “Post-Crisis” ranges from January 2010 to December 2017.

Panel A
                                                                  Full          Pre-Crisis    Post-Crisis
                                                                  # stocks      # stocks      # stocks
$\mathrm{Granger}_{r^{*} \to v^{*}}$                              322           221           198
$\mathrm{Granger}_{v^{*} \to r^{*}}$                              92            100           72

Panel B: 5% and 95% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$    204       184   20    122       110   12    119       112   7
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$    104       36    68    57        10    47    54        8     46

Panel C: 10% and 90% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$    255       230   25    124       113   11    166       150   16
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$    124       59    65    57        12    45    75        17    58

Table 7.7: Volatility-adjusted time series (Markov order 1, p = 20)

This table summarizes the number of stocks for which the two-step procedure detects significant nonlinear residual information. $ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$ in Panels B and C refers to the effective transfer entropy from $\{\hat{\varepsilon}_{r^{**},t}\}$ to $\{\hat{\varepsilon}_{v^{*},t}\}$, while $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$ refers to the effective transfer entropy from $\{\hat{\varepsilon}_{v^{*},t}\}$ to $\{\hat{\varepsilon}_{r^{**},t}\}$, where $r^{**}$ and $v^{*}$ stand for the calendar- and volatility-adjusted log-return and calendar-adjusted trading volume growth series, respectively. The results of the linear Granger causality tests performed on the volatility-adjusted series of stock returns and trading volume growth are given in Panel A. For each of the tests, the column “# stocks” reports the number of stocks with a p-value below 0.1 in the respective test. The columns “+” and “−”, respectively, in Panels B and C count for how many of these stocks the difference in effective transfer entropies is positive or negative. For the estimation of the effective transfer entropy, the Markov order is set to $l = h = 1$. For discretization, the 5% and 95% (Panel B) and the 10% and 90% (Panel C) quantiles of the respective empirical distribution of the residuals are chosen. 200 shuffles are used in the computation of $ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$ and $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$ and statistical inference is based on 500 bootstrap replications. The number of lags used in the VAR model, from which the residual series are obtained, is set to $p = 20$. The “Full” period ranges from January 2000 to December 2017, while “Pre-Crisis” ranges from January 2000 to December 2007 and “Post-Crisis” ranges from January 2010 to December 2017.

Panel A
                                                                  Full          Pre-Crisis    Post-Crisis
                                                                  # stocks      # stocks      # stocks
$\mathrm{Granger}_{r^{**} \to v^{*}}$                             299           211           175
$\mathrm{Granger}_{v^{*} \to r^{**}}$                             65            65            52

Panel B: 5% and 95% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$   156       144   12    88        85    3     84        70    14
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$   66        16    50    58        11    47    82        9     73

Panel C: 10% and 90% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$   236       212   24    119       116   3     140       130   10
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$   108       45    63    39        8     31    81        16    65

Table 7.8: Calendar-adjusted time series (Markov order 2, p = 20)

This table summarizes the number of stocks for which the two-step procedure detects significant nonlinear residual information. $ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$ in Panels B and C refers to the effective transfer entropy from $\{\hat{\varepsilon}_{r^{*},t}\}$ to $\{\hat{\varepsilon}_{v^{*},t}\}$, while $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$ refers to the effective transfer entropy from $\{\hat{\varepsilon}_{v^{*},t}\}$ to $\{\hat{\varepsilon}_{r^{*},t}\}$, where $r^{*}$ and $v^{*}$ stand for the calendar-adjusted log-return and trading volume growth series, respectively. The results of the linear Granger causality tests performed on the calendar-adjusted series of stock returns and trading volume growth are given in Panel A. For each of the tests, the column “# stocks” reports the number of stocks with a p-value below 0.1 in the respective test. The columns “+” and “−”, respectively, in Panels B and C count for how many of these stocks the difference in effective transfer entropies is positive or negative. For the estimation of the effective transfer entropy, the Markov order is set to $l = h = 2$. For discretization, the 5% and 95% (Panel B) and the 10% and 90% (Panel C) quantiles of the respective empirical distribution of the residuals are chosen. 200 shuffles are used in the computation of $ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$ and $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$ and statistical inference is based on 500 bootstrap replications. The number of lags used in the VAR model, from which the residual series are obtained, is set to $p = 20$. The “Full” period ranges from January 2000 to December 2017, while “Pre-Crisis” ranges from January 2000 to December 2007 and “Post-Crisis” ranges from January 2010 to December 2017.

Panel A
                                                                  Full          Pre-Crisis    Post-Crisis
                                                                  # stocks      # stocks      # stocks
$\mathrm{Granger}_{r^{*} \to v^{*}}$                              322           221           198
$\mathrm{Granger}_{v^{*} \to r^{*}}$                              92            100           72

Panel B: 5% and 95% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$    206       178   28    135       116   19    77        73    4
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$    219       127   92    190       84    106   80        17    63

Panel C: 10% and 90% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{*}} \to \hat{\varepsilon}_{v^{*}}}$    211       197   14    115       108   7     54        53    1
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{*}}}$    126       78    48    116       59    57    45        14    31

Table 7.9: Volatility-adjusted time series (Markov order 2, p = 20)

This table summarizes the number of stocks for which the two-step procedure detects significant nonlinear residual information. $ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$ in Panels B and C refers to the effective transfer entropy from $\{\hat{\varepsilon}_{r^{**},t}\}$ to $\{\hat{\varepsilon}_{v^{*},t}\}$, while $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$ refers to the effective transfer entropy from $\{\hat{\varepsilon}_{v^{*},t}\}$ to $\{\hat{\varepsilon}_{r^{**},t}\}$, where $r^{**}$ and $v^{*}$ stand for the calendar- and volatility-adjusted log-return and calendar-adjusted trading volume growth series, respectively. The results of the linear Granger causality tests performed on the volatility-adjusted series of stock returns and trading volume growth are given in Panel A. For each of the tests, the column “# stocks” reports the number of stocks with a p-value below 0.1 in the respective test. The columns “+” and “−”, respectively, in Panels B and C count for how many of these stocks the difference in effective transfer entropies is positive or negative. For the estimation of the effective transfer entropy, the Markov order is set to $l = h = 2$. For discretization, the 5% and 95% (Panel B) and the 10% and 90% (Panel C) quantiles of the respective empirical distribution of the residuals are chosen. 200 shuffles are used in the computation of $ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$ and $ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$ and statistical inference is based on 500 bootstrap replications. The number of lags used in the VAR model, from which the residual series are obtained, is set to $p = 20$. The “Full” period ranges from January 2000 to December 2017, while “Pre-Crisis” ranges from January 2000 to December 2007 and “Post-Crisis” ranges from January 2010 to December 2017.

Panel A
                                                                  Full          Pre-Crisis    Post-Crisis
                                                                  # stocks      # stocks      # stocks
$\mathrm{Granger}_{r^{**} \to v^{*}}$                             299           211           175
$\mathrm{Granger}_{v^{*} \to r^{**}}$                             65            65            52

Panel B: 5% and 95% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$   153       143   10    92        76    16    64        58    6
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$   80        31    49    72        9     63    71        7     64

Panel C: 10% and 90% quantiles
                                                                  Full                  Pre-Crisis            Post-Crisis
                                                                  # stocks  +     −     # stocks  +     −     # stocks  +     −
$ET_{\hat{\varepsilon}_{r^{**}} \to \hat{\varepsilon}_{v^{*}}}$   171       162   9     72        67    5     49        47    2
$ET_{\hat{\varepsilon}_{v^{*}} \to \hat{\varepsilon}_{r^{**}}}$   46        18    28    41        9     32    38        7     31

Nonlinearity matters:
The stock price – trading volume relation revisited

Highlights:

• We investigate the stock price – trading volume relation found in theoretical models

• Most empirical investigations of this relation are restricted by linear models

• We test for nonlinearities in the system of stock returns and trading volume growth

• The dominant nonlinear information flow is from returns to trading volume growth

• Our empirical evidence highlights the nonlinear nature of this relation