High Frequency Trading Illiquidity Patterns

Finance
Master Thesis in Empirical Finance

Supervisor: Prof. Paolo Santucci De Magistris
Co-Supervisor: Prof. Stefano Grassi
Candidate: Andrea D'Amato (694321)

Academic Year 2018-2019
Index

1 Introduction
2 (Il-)Liquidity measures
2.1 Low Frequency measures
2.1.1 Roll covariance measure
2.1.2 Relative Bid-Ask spread
2.1.3 Spread estimate from High and Low prices
2.1.4 Amihud Illiquidity measure
2.1.4.1 Mid-range (il-)liquidity measure
2.1.5 Other illiquidity measures
2.1.5.1 Effective Tick
2.1.5.2 High-low Spread Estimator
2.2 High frequency measures
2.2.1 Liquidity volatility
2.2.2 Estimation models
2.2.3 Realized Measure
2.2.4 High-frequency data flaws
2.2.5 Other high frequency quantities
2.2.5.1 High frequency volume
2.2.5.2 High frequency relative bid-ask spread realized variation
6 Empirical analysis
6.1 Stationarity tests
6.2 Cointegration test
6.3 Best Liquidity measure proxies
6.4 Stock illiquidity premium
7 Conclusions
References
Summary
Chapter 1
Introduction
Since the end of the Second World War, the growth rate of the economy has increased year after year, largely thanks to the globalization of economies. Previously, transactions required hours or even days to be fulfilled, which meant that it was not possible to dis-invest in a short time frame in order to satisfy momentary needs. Over the years, investors' liquidity needs have taken on a primary role in any investment opportunity. At present, the need to divest a position can arise within seconds, or fractions of a second, and the fear of holding an illiquid instrument has grown after the most recent crisis.
High frequency trading (HFT) has clearly stressed the speed at which a transaction is executed, but focusing only on the speed aspect leads one to underestimate the implications that it carries.
Liquidity and trading activity are important features of present financial markets, but little is known about their evolution over time or about their time-series determinants. This is caused by the limited data provision of U.S. stock markets, which have supplied this information only for some assets and for roughly four decades. Their fundamental importance is reflected in the influence that they have on trading costs and hence on returns (Amihud and Mendelson, 1986; Jacoby, Fowler, and Gottesman, 2000), which implies a direct link between liquidity and corporate costs of capital. More generally, exchange organization, regulation, and investment management could all be improved by a better knowledge of the factors that influence liquidity and trading activity. Understanding unobservable features of the market, such as the magnitude of returns to liquidity provision (Nagel, 2012), the impact that asymmetric information has on prices (Llorente et al., 2002), and whether trades are more likely to have a permanent or transitory impact on prices (Wang, 1994; Vayanos and Wang, 2012), should increase investor confidence in financial markets and thereby enhance the efficacy of corporate resource allocation.
Market microstructure is the study of the trading mechanisms applied in financial securities transactions.
There are few historical studies in this field: some instances can be found as mere "drafts" in the literature of the second half of the past century, but over the years it has acquired importance for understanding financial market dynamics.
Microstructure analyses typically touch on one or more of the following aspects of trades: i) sources of value and reasons for trade; ii) economic setting and mechanisms; iii) several prices for the same instrument.
Market microstructure has grown during the past four decades since many orders are executed, for the same instrument, hundreds of times during a normal trading day. Trading at this frequency is known as High Frequency Trading (HFT). These trading data are also more likely to be influenced by irregular price formation or by anomalous trading activity. These issues are characterized, mostly, by irregular time intervals between trades, which reflect the ease of liquidating an instrument within
a short-time interval.
Indeed, liquidity is one of the most important subjects in secondary markets. Accurate definitions exist only in the context of particular models, but the term is widely accepted and understood in practical and academic discourse. Liquidity embodies the economic concept of elasticity: in a liquid market, a small shift in demand or supply does not result in a large price change. Liquidity also relates to the cost of trading, something distinct from the price of the security being bought or sold. Liquid markets are characterized by low trading costs. Liquidity has dynamic traits as well, since accomplishing a purchase or sale over a short horizon does not cost more than spreading the trades over a longer time interval.
Liquidity is usually measured by "depth, breadth, and resiliency". In a deep market there is a large incremental quantity available for sale (purchase) just above (below) the current market price. In a broad market there are many participants, none of whom is presumed to have significant market power. In a resilient market, the price effects that are associated with the trading process are small and last only for short periods.
With the rise of the global economy, the role of liquidity demand and supply has grown through a "liquidity externality", which can be identified as a network externality: individual agents can trade at lower cost when the number of participants increases. This force favors market consolidation, the concentration of trading activity in a single mechanism or venue. However, differences in market participants (e.g., retail versus institutional investors) and innovations by market designers militate in favor of market segmentation (fragmentation). While liquidity is viewed kindly, its counterpart, illiquidity, arises from adverse selection and inventory costs. The more liquid an instrument, the higher the chances of dis-investing it while giving up only a "modest" premium; the more illiquid an instrument, the more sensitive it is to sudden changes that may impair the investor. The more illiquid an instrument, the more an investor may move its price by entering with a large position in order to satisfy his needs. To study in which instruments an investor should invest, a new high frequency field has arisen to understand the liquidity patterns of any instrument.
Size, or the market value of the stock, has been found to be related to liquidity, since a larger stock issue has a smaller price impact for a given order flow and a smaller bid-ask spread. Stock expected returns are negatively related to size, which is considered a liquidity driver. The negative return-size relationship may also result from the size variable being related to a function of the reciprocal of expected return.
Despite the stress that investors put on this concept, there had been only a scarce quantity of studies on the subject, and at present the concept is analyzed by many academics. Roll (1984) was one of the first pioneers in this field, developing a spread measure that has been used over and over, with its own limitations. Many scholars built on this finding in order to overcome the measure's limits. Hasbrouck (2004, 2009) proposes a Gibbs sampler Bayesian estimation of the Roll model; Lesmond, Ogden, and Trzcinka (1999) introduce an estimator based on zero returns. Following the same line of reasoning, Fong, Holden, and Trzcinka (2017) formulate a new estimator that simplifies the existing measures. Holden (2009), jointly with Goyenko, Holden, and Trzcinka (2009), introduces the Effective Tick measure based on the concept of price clustering. High and low prices have usually been used to proxy volatility (Garman and Klass, 1980; Parkinson, 1980; Beckers, 1983). Corwin and Schultz (2012) use them to put forward an original estimation method for transaction
costs in which they assume that the high (low) price is buyer- (seller-) initiated, and their ratio can be disentangled into the efficient price volatility and the bid-ask spread. Abdi and Ranaldo (2017) overcome the limits of Corwin and Schultz's measure, which implicitly relies on the same assumptions made by Roll, by introducing a mid-range price defined as the simple average of the daily high and low prices. There is, in addition, also the Amihud (2002) illiquidity measure, formulated as the daily ratio of absolute stock return to its dollar volume, averaged over some period. This can be viewed as the daily price response associated with one dollar of trading volume, which roughly reflects the price impact.
However, while these measures need "only" lower frequency data, usually daily, there are others that require microstructure data. Market microstructure has been analyzed for almost four decades since it has been a game changer in the trading industry. Usual measures like variance and correlation were not directly applicable in this context, so new measures had to be modeled. Jacod (1994) and Jacod and Protter (1998) were among the first studies in this new field, and they introduced the concept of realized variance, which is the realized counterpart of the conceptual variance. Andersen, Bollerslev, Diebold and Ebens (2001) and Andersen, Bollerslev, Diebold and Labys (2001 and 2002) re-arrange the realized measure and model it from short-term price changes, which treats volatility as observable rather than as a latent variable. This has been proved to be a useful estimate of the fundamental integrated volatility by Hasbrouck (2018), Hwang and Satchell (2000), and Zhang (2010). However, Hansen and Lunde (2006) pointed out that the presence of market microstructure noise in high frequency data complicates the estimation of financial volatility and makes standard estimators, such as the realized variance, unreliable. Market microstructure noise thus challenges the reliability of theoretical results that rely on the absence of noise. The best remedy for market microstructure noise depends on the properties of the noise. The time dependence in the noise and the correlation between noise and efficient price arise naturally in some models of market microstructure effects, including (a generalized version of) the bid-ask model by Roll (1984) (Hasbrouck, 2004) and models where agents have asymmetric information (Glosten and Milgrom, 1985; Easley and O'Hara, 1987, 1992). Market microstructure noise has many sources, including the discreteness of the data (see Harris 1990, 1991) and properties of the trading mechanism (Black, 1976; Amihud and Mendelson, 1987; O'Hara, 1995). Other studies have focused on one of the most used quantities that plainly represents illiquidity: the bid-ask spread. In particular, starting from Amihud and Mendelson's (1986) liquidity study, several researches have examined the evolution of the volatility of this quantity. Studies on liquidity volatility are found mostly on foreign exchange bid-ask spreads. Glassman (1987) and Boothe (1988) study the statistical properties of the bid-ask spread; Bollerslev and Melvin (1994) evaluate the distribution of bid-ask spreads and their ability to explain foreign exchange rate volatility based on a tick-by-tick data set. Furthermore, Andersen et al. (2001) and Cuonz (2018) model a realized variance measure on the intra-day bid-ask spread which satisfies the realized variation assumptions and gives consistent results, even if stock returns are not accounted for.
Considering all these measures, it is advisable to check that they do not follow a random walk process or, worse, an explosive behavior. Many unit-root tests have been established, but the most widely used and reliable ones are the Augmented Dickey-Fuller test (Dickey and Fuller, 1981) and the Variance Ratio test (Campbell and Mankiw, 1987; Cochrane, 1988; Cogley, 1990). If a process has a unit root, i.e. it follows a random walk, it is referred to as integrated of order one, denoted I(1). Many academics have tried to solve the issues that a non-stationary process brings with it. Applying usual regression concepts and de-trending techniques to non-stationary processes does
not solve the problem; spurious correlations show up (Granger, 1983). Later, Engle and Granger (1987) coined the term cointegration and proposed a two-step approach: given two non-stationary I(1) processes, a linear combination of them may be a stationary I(0) process. Johansen (1991) developed an augmented version of the Engle-Granger two-step procedure, based on the Dickey-Fuller test for unit roots in the residuals, which allows testing for more than one cointegrating relationship. The Johansen test relies on asymptotic properties, i.e. large samples. Indeed, if the sample size is too small, results will not be reliable and Auto Regressive Distributed Lags (ARDL) models should be used.
Studies have been carried out across academia to identify the best proxies that explain the reward that an investor should obtain in exchange for the illiquidity risk that he sustains. This study investigates the relation between the illiquidity premium and low frequency liquidity measures.
The measures used in this work are classified as low (daily) frequency and high frequency measures. The daily measures cover the Amihud illiquidity measure, the relative bid-ask spread and a "freshly formulated" liquidity measure, the mid-range illiquidity price, which is the ratio of the daily mid-price, developed by Abdi and Ranaldo (2017), over the related daily volume. Their high frequency counterparts, instead, comprise the Realized Variance of the stock returns, the Realized liquidity variation and the intra-day log volume.
Firstly, stationarity tests computed on the measures confirm that they follow neither a random walk nor an explosive process. However, both the bid and ask prices, as noted in the literature, are non-stationary. Nevertheless, their linear combination turns out to be stationary, which allows stating that they are cointegrated. Despite this, their mid-prices are not cointegrated, and their combination does not satisfy the stationarity requirements. Every possible combination across the stocks has been tried, but these do not satisfy the stationarity conditions either. This result should be expected, given the weak relationship across the firms' industries.
Then, tests on the relation between the bid-ask spread, one of the most reliable illiquidity measures, and the low and high frequency measures show that the low frequency measures are good estimates, while the high frequency ones give good proxies but with less explanatory power than the daily ones. A "comprehensive" test with all variables shows slight significance for the previous ones, but stresses the role of the newly developed liquidity measure. Lastly, the core test of this study, the relation between the daily liquidity measures and the illiquidity risk premium, gives consistent results that confirm Amihud's finding. Indeed, the more illiquid a stock is, the higher the premium it must pay to its owner. This result highlights the positive relation with the illiquidity measures, the Amihud and the relative bid-ask spread measures, and the negative relation with the liquidity one, the mid-range liquidity price.
The paper proceeds as follows. Chapter 2 briefly discusses the liquidity-based measures used in this work and other measures that have been used in the literature. Chapter 3 exposes the theory behind the realized measure and the microstructure effect on it. Chapter 4 reports a short explanation of stationarity tests and the role of cointegration for non-stationary processes. Chapter 5 describes the database and the descriptive statistics of the sample and the measures. Chapter 6 embodies the analysis run over the measures and the illiquidity reward. Chapter 7 concludes the paper.
Chapter 2
(Il-)Liquidity measures
Liquidity is a subtle concept. It cannot be observed directly; rather, it has different features which cannot be condensed into a single measure. The definition of illiquidity concerns an asset that is difficult to sell because of its expense, lack of interested buyers, or some other reason, which generates adverse selection costs and inventory costs that affect the stock price impact (Amihud and Mendelson, 1980; Glosten and Milgrom, 1985). For standard-size transactions, the price impact is the bid-ask spread, whereas larger excess demand induces a greater impact on prices (Kraus and Stoll, 1972; Keim and Madhavan, 1996), which may reflect some informed trading actions (Easley and O'Hara, 1987). Since it is difficult to disentangle order flows generated by informed traders from those generated by liquidity (noise) traders, market makers set prices that are an increasing function of the asymmetries in the order flow and transaction volume, which may indicate market manipulation or informed trading. This buyer/seller imbalance has been found to be positively related to price changes, i.e. to price impact (Kyle, 1985).
It is known that the higher the volatility, the less liquid the market. This suggests that the market can face three different, but sometimes complementary, situations: a) there are low levels of trading volume, b) the current volume is buyer/seller-skewed, and c) the market is characterized by intermittent transactions, usually characterized by heavy volume positions. The focus of this work is to analyze the last of these aspects and to analyze the evolution of its variation.
In the literature, (il-)liquidity measures can be categorized into high frequency and low (daily) frequency measures.
The first are characterized by the wealth of information that resides in the price process, and some of the main measures are the liquidity second moment measure and the average intra-day bid-ask spread. Even though the first attempts used raw liquidity measures, liquidity data are still difficult to extract in a clean and useful way, even nowadays. Many procedures measure the liquidity deviation without considering the link that liquidity risk has with it. However, these approaches are often very complex to analyze and manage. Furthermore, since the volume of such data is huge, the price process is usually affected by microstructure noise, which must be accounted for when estimating the measures.
The latter, instead, use daily frequency data that are easily available in any market for every stock, sometimes even for unlisted ones. While the former give consistent and accurate estimates of market and stock liquidity, they also suffer from limited data availability, since such data are not provided for many instruments; the latter can also give consistent measures which can be calculated easily, even if they are not as accurate as the others.
This chapter is organized as follows: at first, the main low frequency measures will be exposed and their possible advantages and disadvantages will be listed. Lastly, high frequency measures will
be introduced: the liquidity variation, which is based on the concept of realized variance for a liquid
instrument, the log of the total number of intra-day trades and the intra-day bid-ask spread.
Roll's (1984) covariance measure rests on two assumptions: that the asset trades in an informationally efficient market, and that the probability distribution of observed price changes is stationary, at least in the short run.
The Roll model illustrates a dichotomy fundamental to many microstructure models: the distinction between price components due to fundamental security value and those attributable to the market organization and trading process. If the market is informationally efficient and trading costs are assumed to be zero, then the market price should reflect, and contain, all the relevant information about the traded security. A price change occurs only if unanticipated information is disclosed to market participants.
Since transaction costs are not actually zero, the market maker (dealer) is compensated by, usually, establishing a bid-ask spread, within which the true market value of the asset lies. If the value of the security fluctuates randomly within this region, then the true value may be seen as the "middle point" of the spread, computed by taking the average of the bid and the ask price.
In the Roll model the fundamental price follows a geometric Brownian motion (GBM), and the observed (log) asset price at time t evolves according to:

p_t = p_t^* + I_t \frac{s}{2}, \qquad p_t^* = p_{t-1}^* + \epsilon_t,    (2.1)

where p_t^* is the underlying fundamental log price with innovation \epsilon_t, and the trade direction indicators {I_t} are i.i.d. and take the value ±1 with equal probability of one half: if the transaction is buyer-initiated (at the ask) I_t = 1, if it is seller-initiated (at the bid) I_t = -1. Only the price p_t is observed, while all the other variables are not. The innovation term \epsilon_t is serially uncorrelated and uncorrelated with the trade direction indicator.
Since recorded transactions occur at the boundaries of the spread (the bid and ask prices), and not at the "middle point", observed market price changes cannot be supposed to be independent
anymore.
If there is no new information about the asset, transactions should arrive in the market randomly on the buy or sell side. Since each side is equally likely, the distribution of the next price change (\Delta p_t \equiv p_t - p_{t-1} = \epsilon_t + \Delta I_t \, s/2) depends on the side of the previous trade, i.e. whether it was at the bid or at the ask. As Roll showed, if the transaction at t - 1 is at the bid (ask) price, the next price change cannot be negative (positive) since there is no new information. So, it is not possible to see two successive price increases (declines).
Since these transactions occur on one side of the spread or the other, Roll computed the covariance between consecutive price changes:

Cov(\Delta p_t, \Delta p_{t+1}) = -\frac{s^2}{4},    (2.2)

By rearranging the equation in order to obtain the spread, it follows that:

Spread = 2\sqrt{-Cov},    (2.3)

To gain a deeper understanding of the relation between price changes, Roll showed that the variance of the price change is equal to one half of the squared spread (i.e. Var(\Delta p) = s^2/2). From this, the first-order autocorrelation coefficient of price changes equals minus one half.
At first sight, the autocorrelation coefficient may seem wrong, but it must be noted that the covariance is divided by the variance of the unconditional price changes. Under the efficient market hypothesis, the variance of observed changes is likely to be influenced by the new set of information, while the covariance between successive price changes cannot be explained by new information if markets are efficient.
Roll's measure is intuitive and easy to compute, and it provides estimates without requiring intra-day data. However, even with a long time series of daily observations, the covariance of price changes is often positive, which would force researchers to plug an imaginary number into the spread estimate. Indeed, since Roll found that cross-sectional average covariances are positive in some years, for these cases researchers usually do one of three things:
1. treat the observation as missing;
2. set the spread estimate to zero;
3. multiply the covariance by negative one and multiply the spread by negative one, in order to produce a negative spread.
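As a minimal illustration, the following Python sketch estimates the Roll spread from a series of daily prices and applies the second convention above whenever the estimated serial covariance is positive; the simulated data, variable names and parameter values are assumptions, not part of the original study.

import numpy as np

def roll_spread(prices) -> float:
    """Roll (1984) spread estimator from a series of closing prices.

    Returns 2 * sqrt(-Cov(dp_t, dp_{t+1})); if the serial covariance is
    positive, the estimate is set to zero (convention 2 in the text).
    """
    dp = np.diff(np.log(np.asarray(prices, dtype=float)))  # log price changes
    cov = np.cov(dp[:-1], dp[1:])[0, 1]                     # first-order serial covariance
    return 2.0 * np.sqrt(-cov) if cov < 0 else 0.0

# Toy example: efficient random walk plus bid-ask bounce
rng = np.random.default_rng(0)
true_spread = 0.02                                  # assumed proportional spread
m = np.cumsum(rng.normal(0, 0.01, 1000))            # efficient log price
q = rng.choice([-1, 1], size=1000)                  # trade direction indicators
obs = np.exp(m + q * true_spread / 2)               # observed transaction prices
print(roll_spread(obs))                             # roughly 0.02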
This measure has helped the literature to deal with the price movement patterns of trades. However, as Abdi and Ranaldo noticed, Roll's assumptions can also be seen as its limits, and they highlight two issues:
1. it relies on the occurrence of bid-ask bounces;
2. it assumes serial independence of trade directions, with buy- and sell-initiated trades equally likely.
The bid-ask spread is one of the most direct illiquidity measures that affect the market. It is the difference that an investor would earn by executing an immediate sale (at the ask, or offer, price) and an immediate purchase (at the bid price) on the same instrument. To study the development of market liquidity, Andersen, Bollerslev, Diebold and Labys (2001) and Cuonz (2018) noticed that the concept of realized liquidity variation shares the same empirical assumptions as realized volatility.
The bid-ask spread (BAS) at time t is defined as

BAS_t = P_{ASK,t} - P_{BID,t},    (2.4)

where P_{ASK} and P_{BID} are respectively the ask and bid prices. To overcome the dimension (currency) limit that may affect this index in markets with different currencies, the relative spread is defined as the ratio of the bid-ask spread over the mid price. Working with relative bid-ask spreads has various advantages: although the bid-ask spread itself is measured in terms of the underlying currency, the relative bid-ask spread is dimensionless. Hence, relative spreads from different markets (respectively currencies) can be compared directly with each other. The relative bid-ask spread (RBA) is then computed as

RBA_t = \frac{BAS_t}{\log(\eta_t)},    (2.5)
where \eta_t is the mid price of the respective stock at time t. The relative spread is thus a unit-free estimate that, given its simplicity, allows inference about the liquidity of the analyzed stock. Nonetheless, in a high-frequency market, market makers can slightly move the price in their favor if they are sufficiently fast, so that the spread would be slightly biased in favor of the informed party.
Abdi and Ranaldo (2017) model the observed daily close, high and low log-prices as the efficient ones shifted by half the spread:

c_t = c_t^e + q_t \frac{s}{2}, \qquad q_t = \pm 1,    (2.6)

h_t = h_t^e + \frac{s}{2}, \qquad l_t = l_t^e - \frac{s}{2},    (2.7)
where the superscript e indicates the efficient price, so c_t^e refers to the efficient log-price at the closing time, and the indicator q_t is similar to the {I_t} introduced by Roll: it takes the value +1 if the trade is buyer initiated and -1 if it is seller initiated.
The mid-range is defined as the average of the daily high and low log-prices:

\eta_t \equiv \frac{l_t + h_t}{2},    (2.8)
As the authors show, the mid-range has the following properties:
1. The mid-range of observed prices matches the mid-range of efficient prices:

\eta_t = \frac{l_t^e + h_t^e}{2},    (2.9)

and the efficient price hits \eta_t at least once during the day;
2. The mid-range of the current day and the following one is an unbiased estimator for the closing
mid-quote:
E\left[c_t^e - \frac{\eta_t + \eta_{t+1}}{2}\right] = 0,    (2.10)

3. The expected squared deviation of the close log-price from the average of two consecutive mid-ranges depends on both the squared spread and the efficient price variance:

E\left[\left(c_t - \frac{\eta_t + \eta_{t+1}}{2}\right)^2\right] = \frac{s^2}{4} + \sigma_e^2\left(\frac{1}{2} - \frac{k_1}{8}\right), \qquad k_1 \equiv 4\ln(2),    (2.11)

4. The variance of changes in mid-ranges is a linear function of the efficient price variance:

E\left[\left(\eta_{t+1} - \eta_t\right)^2\right] = \sigma_e^2\left(2 - \frac{k_1}{2}\right).    (2.12)
Since the mid-ranges are independent of the spread, their difference reflects volatility. The estimated efficient price volatility tracks the "true" efficient price volatility, and this estimate is less biased than other high-low volatility estimates.
The squared spread can then be written as

s^2 = 4E\left[\left(c_t - \frac{\eta_t + \eta_{t+1}}{2}\right)^2\right] - E\left[\left(\eta_{t+1} - \eta_t\right)^2\right] = 4E\left[(c_t - \eta_t)(c_t - \eta_{t+1})\right].    (2.13)

At first sight the squared spread looks like Roll's. However, this measure does not depend on restrictive assumptions about the serial independence of trades or an equal likelihood of the close price falling on the buy and sell side.
Nevertheless, this model may suffer from estimation errors that leave the right-hand side with a negative value. To overcome this issue, as already stated in the literature, three approaches can be adopted:
1. treat negative two-day estimates as missing;
2. set negative estimates to zero and then take the average of the two-day spread estimates;
3. remove negative estimates, compute the spread only from the positive estimates and take their average.
TAQ data simulations and comparisons indicate that the first two approaches provide better outcomes, both in terms of bias correction and estimation errors.
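A minimal sketch of this estimator under the assumptions above (daily high, low and close prices supplied as arrays, and the zero-truncation convention of point 2); the function and variable names are illustrative.

import numpy as np

def abdi_ranaldo_spread(high, low, close):
    """Close-high-low spread estimator in the spirit of Abdi and Ranaldo (2017).

    Two-day estimates s2_t = 4 * (c_t - eta_t) * (c_t - eta_{t+1}) are
    truncated at zero before averaging (convention 2 in the text).
    """
    h, l, c = (np.log(np.asarray(x, dtype=float)) for x in (high, low, close))
    eta = (h + l) / 2.0                                   # daily mid-range of log prices
    s2 = 4.0 * (c[:-1] - eta[:-1]) * (c[:-1] - eta[1:])   # two-day squared-spread estimates
    s2 = np.clip(s2, 0.0, None)                           # set negative estimates to zero
    return float(np.sqrt(s2.mean()))                      # average, then back to a spread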
Amihud (2002) argues that part of the expected excess return on stocks represents compensation for illiquidity: bid-ask spreads and brokerage fees are higher on stocks, i.e. illiquidity costs have a greater influence on stocks. So, Amihud found that the expected stock excess return is an increasing function of expected market illiquidity. This result has been found to be stronger for small firm stocks than for larger firms. The effect is enhanced during periods in which liquidity is stressed, such that there is a "flight to liquidity" that makes large stocks more attractive. Small stocks' sensitivity to illiquidity indicates that these stocks are subject to higher illiquidity risk, which translates into a higher illiquidity risk premium.
To put further stress on the illiquidity angle in reading market returns, a new measure is obtained by combining the mid-range price measure of Abdi and Ranaldo with the Amihud one. The main goal of this reformulation is to detect whether an instrument is relatively illiquid, while having at one's disposal only "trivial" statistics of any listed stock (daily
high and low prices and volume). This reinforcement has been devised since results with just the mid-range price were not decisive in the estimation of returns. As with the AI, this measure is estimated as:

MI = \frac{1}{N}\sum_{t=1}^{N} \frac{\eta_t}{Vol_t}.    (2.15)

As already mentioned, the larger the measure, the more sensitive the stock price is to the entry of a large position in the market, and vice versa.
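As an illustration, a minimal Python sketch of the daily measures used in this work (the Amihud ratio described in Chapter 1, the mid-range variant above, and the relative bid-ask spread of Section 2.1.2) might look as follows. The column names of the hypothetical daily `data` DataFrame are assumptions, and the relative spread is scaled here by the quote midpoint, a common convention.

import numpy as np
import pandas as pd

def daily_liquidity_measures(data: pd.DataFrame) -> pd.Series:
    """Average daily (il-)liquidity proxies over the sample period.

    Expects columns: 'close', 'high', 'low', 'bid', 'ask', 'volume'
    (volume in currency units for the Amihud ratio).
    """
    ret = np.log(data["close"]).diff()                              # daily log returns
    eta = (np.log(data["high"]) + np.log(data["low"])) / 2.0        # daily mid-range
    mid = (data["bid"] + data["ask"]) / 2.0                         # quote midpoint

    ai = (ret.abs() / data["volume"]).mean()            # Amihud illiquidity (AI)
    mi = (eta / data["volume"]).mean()                  # mid-range illiquidity (MI)
    rba = ((data["ask"] - data["bid"]) / mid).mean()    # relative bid-ask spread (RBA)
    return pd.Series({"AI": ai, "MI": mi, "RBA": rba})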
Holden (2009) and Goyenko, Holden, and Trzcinka (2009) develop the effective tick estimator, which is based on the idea that wider spreads are associated with larger effective tick sizes. Their model assumes that, when both the tick size and the bid-ask spread are one-eighth, all possible prices are used, but when the tick size is one-eighth and the spread is one quarter, only prices ending on even-eighths are used.
Goyenko, Holden, and Trzcinka (2009) show that their assumed relation between spreads and the
effective tick size allows researchers to use price clustering to infer spreads. Let us suppose that there are four possible bid-ask spreads for a stock: $1/8, $1/4, $1/2, and $1. The number of quotes with odd-eighth price fractions, associated only with $1/8 spreads, is given by N_1. The number of quotes with odd-quarter fractions, which occur with spreads of either $1/8 or $1/4, is given by N_2. The number of quotes with odd-half fractions, which can be due to spreads of $1/8, $1/4 or $1/2, is N_3. Lastly, the number of whole-dollar quotes, which can occur with any spread width, is given by N_4.
To calculate an effective spread, the proportion of prices observed at each price fraction is calculated as

F_j = \frac{N_j}{\sum_{j=1}^{J} N_j}, \qquad for \; j = 1, \ldots, J.    (2.16)
The effective tick model directly assumes price clustering (i.e., a higher frequency on rounder increments). However, in small samples it is possible that reverse price clustering is realized (i.e., a lower frequency on rounder increments). Reverse price clustering may cause the unconstrained probability of one or more effective spread sizes to go above one or below zero. So, constraints are added to generate proper probabilities. Let \hat{\gamma}_j be the constrained probability of the j-th spread. It is computed in order, from the smallest to the largest spread, as follows:

\hat{\gamma}_j = Min[Max(U_j, 0), 1], \qquad for \; j = 1,
\hat{\gamma}_j = Min\left[Max(U_j, 0),\; 1 - \sum_{k=1}^{j-1}\hat{\gamma}_k\right], \qquad for \; j = 2, \ldots, J.    (2.18)
Lastly, the effective tick measure is a simple probability-weighted average of each effective spread size, divided by \bar{P}_i, which is the average price at time i:

EffectiveTick = \frac{\sum_{j=1}^{J}\hat{\gamma}_j s_j}{\bar{P}_i}.    (2.19)
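A small sketch of the constraining step (2.18) and the weighted average (2.19); the unconstrained probabilities U_j are taken as given, since their definition (the document's equation 2.17) is not reproduced above, and the numbers in the example call are purely hypothetical.

import numpy as np

def effective_tick(U, spreads, avg_price):
    """Effective tick from unconstrained probabilities U_j (eqs. 2.18-2.19).

    U         : unconstrained probabilities for each spread size (smallest first)
    spreads   : corresponding spread sizes s_j, e.g. [1/8, 1/4, 1/2, 1.0]
    avg_price : average price over the period, P_bar
    """
    gamma = np.zeros(len(U))
    used = 0.0
    for j, u in enumerate(U):
        cap = 1.0 if j == 0 else 1.0 - used      # remaining probability mass
        gamma[j] = min(max(u, 0.0), cap)         # constrain into [0, cap]
        used += gamma[j]
    return float(np.dot(gamma, spreads) / avg_price)

# Illustrative call with hypothetical numbers
print(effective_tick([0.55, 0.25, 0.15, 0.05], [0.125, 0.25, 0.5, 1.0], 40.0))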
Corwin and Schultz (2012) develop an estimator based on daily high and low prices. The high-low (HL) price ratio reflects both the true variance of the stock price and the bid-ask spread. While the variance component grows proportionately with the length of the time period, the spread component does not. This allows solving for both the spread and the variance by deriving two equations, the first a function of the high-low ratios on two consecutive single days and the second a function of the high-low ratio over a single two-day period.
They assume that the true, or actual, value of the stock price follows a diffusion process and that there is a spread of S%, constant over the two-day estimation period. Because of the spread, observed prices for buys are higher than the actual values by S/2, while observed prices for sells are lower than the actual values by S/2. They hypothesize that the daily high price is a buyer-initiated
trade and is therefore grossed up by half of the spread, while the daily low price is a seller-initiated trade and is discounted by half of the spread. Hence, the observed high-low price range contains both the range of the actual prices and the bid-ask spread. Let H_t^A (L_t^A) denote the actual high (low) stock price on day t and H_t^o (L_t^o) the observed high (low) stock price for day t, so the HL relation is given by

\left[\ln\left(\frac{H_t^o}{L_t^o}\right)\right]^2 = \left[\ln\left(\frac{H_t^A (1 + S/2)}{L_t^A (1 - S/2)}\right)\right]^2.    (2.20)
The HL estimator captures transitory volatility at the daily level and closely approximates the effective spread, or the cost of immediacy. It also does not require data on trading volume; it can therefore be applied in settings such as emerging markets, where the quality or availability of volume data may be suspect. Furthermore, its power to predict cross-sectional differences in returns is very similar to that of the AI.
However, this measure has been found to suffer from two drawbacks: it is not robust to price movements in non-trading periods, such as weekends, holidays, and overnight price changes, and therefore needs some ad-hoc overnight price adjustments; and, being based on the price range, it is sensitive to the number of observed trades per day, since it adjusts the estimate by including the gap between observable prices.
[Table 2.1 omitted: summary of the main low frequency (il-)liquidity measures and their required daily inputs (high, low and close prices, quotes and volume). All uppercase variables denote market observations, while all lowercase variables denote their log transformations.]
Table 2.1 summarizes all the main measures that had significant results. It must be clarified that all
of these are merely bid-ask spread estimators, except the AI. Nonetheless, among these measures, the ones used in this work are the AI, the RBA and the MI.
The excess return required for holding illiquid assets is generally considered a liquidity premium that compensates for price impact or transaction cost.
Other approaches used to measure and model volatility are treated in the GARCH and stochastic volatility literature. However, both GARCH and stochastic volatility measures depend on parametric models that are rather sophisticated and, sometimes, even self-limiting. The multitude of parameters that must be estimated may cause issues during model selection and computation. Therefore, volatility estimates based on these models can be applied only under several restrictive assumptions.
The (implied) asset volatility can also be estimated from options written on the underlying asset. Still, this approach is used to profit from a position in the asset and has a limited link to the liquidity concept itself. Liquidity cannot be traded directly; therefore, there are no option contracts written on liquidity.
Admati and Pfleiderer (1988) provide one of the first studies that investigate intraday volume patterns as a standard liquidity measure. Trading volume, however, can be measured along different
dimensions. The most popular measurements of trading volume are: the number of exchanged shares, their monetary value and the number of share transactions. These measures are, however, closely interrelated, and total volume in a particular period is the sum of the sizes of the individual trades. The main measure used in this work is the last one, the number of share transactions, which is easily obtained as

lVol_t = \log(Vol_t) = \log\left(\sum_{i=1}^{m_T} NST_i\right),    (2.21)

where NST_i denotes the number of share transactions in the i-th intra-day interval and m_T the number of intra-day observations in the day.
To complete the analysis, the relative intra-day bid-ask spread, the high frequency counterpart of the measure already treated in Section 2.1.2, is used to study illiquidity patterns.
To operate with high-frequency data, a rearrangement must be made. Given m intra-day spread observations per day for T days, the time series consists of m × T quotes. To make the example practical, the variation is assessed within the day. The change in the relative bid-ask spread over [t - 1/m, t] is given by

\Delta BA_{(m)}(t) = BA(t) - BA\left(t - \frac{1}{m}\right), \qquad for \; t = \frac{1}{m}, \frac{2}{m}, \ldots, T.    (2.22)
The realized measure of the relative spread gives information about the liquidity of the analyzed object. It is obtained by summing the squares of the instantaneous changes in the relative bid-ask spreads, as for the RV. However, while for realized volatility the changes are expressed as logarithmic or percentage returns, this measure uses absolute changes in the dimensionless relative bid-ask spreads. The time-t, h-period realized liquidity variation is dimensionless and is given by

lRBA_h(t; m) = \sum_{i=1}^{mh} \Delta BA^2_{(m)}\left(t - h + \frac{i}{m}\right),    (2.23)
for t = h, 2h, \ldots, T. Regardless of the sampling frequency, realized liquidity variation is directly observable, which is not true for its theoretical counterpart, the quadratic variation. As for the RV, an appropriate sampling frequency must be chosen when dealing with high-frequency observations. Indeed, for sufficiently large m, the realized liquidity variation converges in probability towards the quadratic variation of the relative bid-ask spreads.
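A compact sketch of equations (2.22)-(2.23) for a single trading day (h = 1), assuming numpy arrays of intra-day bid and ask quotes sampled on a regular grid; the relative spread here is scaled by the quote midpoint, and all names are illustrative.

import numpy as np

def realized_liquidity_variation(bid, ask):
    """Daily realized liquidity variation from intra-day quotes (eqs. 2.22-2.23).

    The relative spread is computed per observation, then its squared changes
    are summed over the day, mirroring the realized variance construction.
    """
    bid = np.asarray(bid, dtype=float)
    ask = np.asarray(ask, dtype=float)
    rel_spread = (ask - bid) / ((ask + bid) / 2.0)   # relative bid-ask spread
    dba = np.diff(rel_spread)                        # eq. (2.22): spread changes
    return float(np.sum(dba ** 2))                   # eq. (2.23): sum of squares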
Chapter 3
Realized volatility theory
The notion of realized variance (realized volatility) can be traced back to Andersen, Bollerslev, Diebold and Labys. This measure is typically discussed in a continuous time framework where logarithmic prices are characterized by a semi-martingale. More restrictive specifications have been considered by Barndorff-Nielsen and Shephard (2001). Under their assumptions, the quadratic variation (QV) of the return process can be consistently estimated as the sum of squared intra-period returns. When studies talk about realized variance, the notion of QV is associated with it. Importantly, Andersen et al. show that QV is the crucial determinant of the conditional return (co-)variance, thereby establishing the relevance of the realized variance measure. If the conditional mean of the return process is deterministic, or a function of variables contained in the information set, the QV is equal to the conditional return variance, which can be estimated consistently as the sum of squared returns. This exposition, however, rules out a random intra-period evolution of the instantaneous mean; but Andersen et al. showed that such effects are likely to be trivial in magnitude, so that the QV can be considered the main determinant of the conditional return variance.
In the academic literature, the link between this measure and illiquidity has received some attention. Not surprisingly, the larger the realized variance, the more illiquid an instrument is during the event window (Engler and Jeleskovic, 2016; Amiram, Cserna and Levy, 2016).
This chapter focuses on the exposition of the realized variance and of the market microstructure patterns that affect HFT. Lastly, the role that these play in liquidity measurement and how they will be used in the subsequent tests is discussed.
For any n-dimensional arbitrage-free vector price process with finite mean, the logarithmic vector price process, p, may be written as the sum of a finite-variation and predictable mean component, A = (A_1, \ldots, A_n), and a local martingale, M = (M_1, \ldots, M_n). These can in turn be decomposed into continuous sample-path and jump processes:

p(t) = p(0) + A(t) + M(t) = p(0) + A^c(t) + \Delta A(t) + M^c(t) + \Delta M(t),

where A^c and \Delta A are respectively the continuous and pure jump parts of the finite-variation predictable component, while M^c and \Delta M are respectively the continuous sample-path and compensated jump parts of the martingale; by definition, M(0) = A(0) \equiv 0.
Denote the (continuously compounded) return over the interval [t - h, t] by r(t, h) = p(t) - p(t - h). The cumulative return process from t = 0 onward, r = (r(t))_{t \in [0,T]}, can be derived as r(t) \equiv r(t, t) = p(t) - p(0) = A(t) + M(t). Obviously, r(t) inherits all the main properties of p(t) and may be decomposed into the predictable and integrable mean component, A, and the local martingale, M. The predictability of A still allows the (instantaneous) mean process considerable flexibility (i.e. it may evolve stochastically and display jumps). Nonetheless, the continuous component of the mean return must have smooth sample paths compared to those of a non-constant continuous martingale, such as a Brownian motion, and any jump in the mean must be accompanied by a corresponding predictable jump in the compensated jump martingale, \Delta M. Consequently, there are two types of jumps in the return process:
i. predictable jumps, where \Delta A(t) \neq 0;
ii. purely unpredictable (martingale) jumps, where \Delta M(t) \neq 0.
A key property is that the quadratic variation of continuous finite-variation processes is zero. If this holds, then the mean component becomes irrelevant for the quadratic variation. Moreover, jump components only contribute to the quadratic covariation if there are simultaneous jumps in the price paths of the i-th and j-th assets, whereas the squared jump size contributes directly to the quadratic variation. The quadratic variation process measures the realized sample-path variation of the squared return process; under this property, this variation is caused exclusively by the innovations to the return process. So, the quadratic covariation constitutes a unique and invariant ex-post realized volatility measure that is essentially model free.
Beyond this, the quadratic variation has another useful property that helps it deal with high-frequency data.
Let us consider an increasing sequence of random partitions of [0, T], 0 = \tau_{m,0} \leq \tau_{m,1} \leq \ldots, such that \sup_{j \geq 1}(\tau_{m,j+1} - \tau_{m,j}) \to 0 and \sup_{j \geq 1}\tau_{m,j} \to T for m \to \infty with probability one; then

\lim_{m \to \infty} \left\{ \sum_{j \geq 1} [r(t \wedge \tau_{m,j}) - r(t \wedge \tau_{m,j-1})]\,[r(t \wedge \tau_{m,j}) - r(t \wedge \tau_{m,j-1})]' \right\} \to [r, r]_t,    (3.4)
for t = 1, \ldots, T. By sampling with a frequency f, N_f = N/f intra-day returns r_{t,f,i} can be obtained, for i = 1, \ldots, N_f, with S_{t,0} = S_{t-1,N} (the scheme generalizes to irregularly time-spaced returns). It is important to notice that this model also relies on a continuous time formulation, but, as mentioned above, here it is treated in its discrete aspect.
The first two moments of the return are assumed to exist and can be denoted as \mu_t = E[r_t | F_{t-1}] and \sigma_t^2 = var[r_t | F_{t-1}], where F_t denotes the information set available at the beginning of day t, which includes all the previous information.
The information set contains many observations relating to prices, quotes, trading volume and
market depth. It is also assumed that the log (excess) return of the security follows:

r_t = \sigma_t \epsilon_t,    (3.7)

where \epsilon_t \sim N(0, 1) and \sigma_t^2 is the return variance of day t. By the additive property of returns, the t-th day return is given by:

r_t = \sum_{i=1}^{N_f} r_{t,f,i}.    (3.8)
To give an idea of how these measures work, their behavior is summarized here. The return process can be written as a standard Ito process:

dp(t) = \mu(t)\,dt + \sigma(t)\,dW(t),    (3.9)

where \mu(t) is the process drift, \sigma(t) is the spot volatility and W(t) is a standard Brownian motion. This model can also be seen as a stochastic volatility (SV) model. This representation follows a special semi-martingale model, which enjoys useful properties. One of these states that, when \mu(t) and \sigma(t) are jointly independent of W(t), then

r_t \mid \{\mu(s), \sigma(s)\}_{s=t-1}^{t} \sim N(\mu_t, IV_t),    (3.10)

where

\mu_t = \int_{t-1}^{t} \mu(s)\,ds \qquad and \qquad IV_t = \int_{t-1}^{t} \sigma^2(s)\,ds,

and IV_t denotes the integrated variance, which is the return variance.
The quadratic variation of the process over the interval [t - 1, t] is equal to:

QV_t \equiv \int_{t-1}^{t} \sigma_s^2\,ds \equiv \mathrm{plim}_{N \to \infty} \sum_{j=1}^{N} [S_{t,j} - S_{t,j-1}]^2.    (3.11)
As previously mentioned, the empirical counterpart of QV_t is the realized variance, defined as the sum of the squared intra-period returns at frequency f,

RV_t \equiv \sum_{j=1}^{N_f} r_{t,f,j}^2.    (3.12)

The conditional variance of the daily return can be written as

var[r_t \mid F_{t-1}] = E[r_t^2 \mid F_{t-1}] = E\left[\left(\sum_{i=1}^{N_f} r_{t,f,i}\right)^2 \Big| F_{t-1}\right],    (3.13)

where the cross-product terms vanish because the intra-day returns form a martingale difference sequence.
The RV_t can be related to the conditional variance only if, for instance, \{\mu(s), \sigma(s)\}_{s=t-1}^{t} is F_{t-1}-measurable, in which case IV_t = \sigma_t^2. If there is no microstructure noise, or measurement error, the realized volatility can be approximated as the second uncentered sample moment of the return process. As shown by Barndorff-Nielsen and Shephard (2002), if there are no price jumps and microstructure noise is not impounded in the process, the realized volatility is a consistent non-parametric measure of the notional volatility:

RV^{(n)} \xrightarrow{p} IV \quad as \; n \to \infty.    (3.14)
Liquidity factors have also been introduced in asset pricing; what these liquidity factors capture, and how to even measure them, is problematic.
In a friction-less market, microstructure does not matter and profits come solely from longer term views on the market.
Transaction costs can be added, leading to the clearing equation with transaction costs:
\delta_n W = L_n \delta_n p \pm \frac{s_n}{2} |\delta_n L|,    (3.15)
where W is the trader's wealth before n, L is his net position in the traded asset, \delta_n represents the difference between the successive value of the target variable and the current one (i.e. \delta_n L = L_{n+1} - L_n), s_n is the bid-ask spread at time n, p is the mid-range price before the execution of the trade, and the \pm is "+" if the trade is made with limit orders and "-" if made with market orders. This equation, even though it considers the spread impact, ignores the price impact. Ignoring such a component characterizes low frequency traders, who do not optimize their execution and use just long-term views to trade. By incorporating the price impact, high frequency traders can model their views as:
\delta_n W = L_n \delta_n p \pm \frac{s_n}{2} |\delta_n L| + \delta_n L\, \delta_n p,    (3.16)
where this expression represents the notion of microstructure noise, since wealth can be tracked by
using measurable market quantities.
The observed price can be written as \tilde{p}_{ih} = p_{ih}\,\xi_{ih}, where p_{ih} is the friction-less equilibrium price (the price that would prevail in the absence of market microstructure frictions) and \xi_{ih} denotes microstructure noise. By taking the difference with the previous value and applying logarithmic properties, the process becomes

\tilde{r}_{ih} = r_{ih} + \vartheta_{ih} - \vartheta_{(i-1)h},

where \tilde{r}_{ih} = \ln(\tilde{p}_{ih}) - \ln(\tilde{p}_{(i-1)h}), r_{ih} = \ln(p_{ih}) - \ln(p_{(i-1)h}) and \vartheta_{ih} - \vartheta_{(i-1)h} = \ln(\xi_{ih}) - \ln(\xi_{(i-1)h}). Each period can be divided into M sub-periods, and the observed high-frequency continuously compounded return is given by

\tilde{r}_{j,i} = \ln(\tilde{p}_{(i-1)h + jh\Delta}) - \ln(\tilde{p}_{(i-1)h + (j-1)h\Delta}),

where \Delta = 1/M is the horizon over which the continuously compounded returns are computed. Hence, \tilde{r}_{j,i} is the j-th intra-period return over the i-th period. So

\tilde{r}_{j,i} = r_{j,i} + \varepsilon_{j,i},

where \varepsilon_{j,i} = \vartheta_{(i-1)h + jh\Delta} - \vartheta_{(i-1)h + (j-1)h\Delta} is the microstructure noise that affects the data. However, both r_{j,i} (the equilibrium return) and \varepsilon_{j,i} are unobservable.
To simplify the exposition, only the case i = 1 is treated. As Bandi and Russell (2003) outlined, under the previous setting the realized variance of the observed returns is made up of three components:

\hat{RV} = \sum_{j=1}^{M} r_j^2 + \sum_{j=1}^{M} \varepsilon_j^2 + 2\sum_{j=1}^{M} r_j \varepsilon_j.    (3.21)
If the true price process were directly observable, only the first term (\sum_{j=1}^{M} r_j^2) would determine \hat{RV}. However, microstructure noise introduces two additional terms. The first added term (\sum_{j=1}^{M} \varepsilon_j^2) diverges to infinity almost surely as the number of observations increases asymptotically (or, equivalently, as the sampling frequency increases in the limit), since more and more noise is accumulated over a fixed period of time h.
Selecting \Delta as small as possible (i.e., M as large as possible) would therefore seem optimal. But ignoring market microstructure noise leads to an even more dangerous situation than assuming constant volatility and T \to \infty. After suitable scaling, RV based on the observed log-returns is a consistent and asymptotically normal estimator, but of the quantity 2M E[\varepsilon^2] rather than \sigma^2. Said differently, in the high frequency limit, market microstructure noise totally swamps the variance of the price signal at the level of the realized volatility. Since the expressions above are exact small-sample ones, they can, in particular, be specialized to analyze samples at increasingly higher frequency (\Delta \to 0, say, sampling every minute) over a fixed time period (T fixed, say, a day).
In the absence of market microstructure noise (i.e. \varepsilon_j = 0, \forall j), the estimation error between the realized variance estimator and the integrated variance converges weakly to a mean-zero mixed Gaussian distribution, at speed \sqrt{M}:

\sqrt{\frac{M}{h}}\,(\hat{RV} - RV) \sim MN(0, 2Q), \quad as \; M \to \infty,
When microstructure noise plays a role, the realized variance estimator does not consistently estimate
the integrated variance over any given period. Intuitively, the summing of an increasing number of
contaminated return data entails infinite accumulation of noise as the frequency increases. Specifically,
while the first sum term converges to the integrated variance over the period, the second sum term
diverges to infinity, almost surely, while the third sum is stochastically dominated by the second.
\hat{RV} \xrightarrow{a.s.} \infty, \quad as \; M \to \infty,
This limiting result is an asymptotic approximation suggesting that for large M, as is the case for high-frequency data, the researcher must be wary of microstructure contamination, as the effect of the noise can be substantial. Hence, any statement about the informational content of the conventional realized variance estimator as a measurement of the integrated variance of the underlying logarithmic price process ought to be a finite sample statement.
Nevertheless, sample moments of the observed return series can be used to learn about population moments of the unobserved noise at high frequencies. Indeed, if the noise variance has a finite fourth moment, E(\varepsilon^8) < \infty, then

\frac{1}{M}\sum_{j=1}^{M} \tilde{r}_j^{\,q} \xrightarrow{p} E(\vartheta^q),
for large M and any fixed period h; that is, moments of the unobserved noise process can be estimated at any frequency by using data sampled at the highest frequency.
With N = T/\Delta, it has been found that the variance of the noise is essentially unrelated to \sigma^2. It has long been known that sampling at very high frequencies (1, 5, 10 seconds) is not a good idea. The recommendation in the literature has therefore been to sample sparsely at some lower frequency, by using a realized volatility estimator constructed by summing squared log-returns at, e.g., 5, 10, 15 or 30 minutes (Andersen et al., 2001; Barndorff-Nielsen and Shephard, 2002; Gençay et al., 2002). Reducing the value of n from 23,400 (1-second sampling) to 78 (5-minute sampling over the same trading day) has the advantage of reducing the magnitude of the bias term 2M E[\varepsilon^2] (Aït-Sahalia and Yu, 2009).
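To illustrate the effect discussed above, here is a small simulation sketch (all parameter values are assumptions) that contaminates an efficient price with i.i.d. noise and compares the realized variance computed at 1-second and 5-minute sampling with the true integrated variance.

import numpy as np

def realized_variance(log_prices, step=1):
    """Sum of squared log-returns sampled every `step` observations."""
    sampled = log_prices[::step]
    return float(np.sum(np.diff(sampled) ** 2))

rng = np.random.default_rng(42)
n = 23_400                               # one trading day at 1-second frequency
sigma_day = 0.02                         # assumed daily volatility
noise_sd = 0.0005                        # assumed microstructure noise std dev

efficient = np.cumsum(rng.normal(0, sigma_day / np.sqrt(n), n))  # efficient log price
observed = efficient + rng.normal(0, noise_sd, n)                # add i.i.d. noise

print("true IV      :", sigma_day ** 2)
print("RV, 1 second :", realized_variance(observed, step=1))     # heavily biased upward
print("RV, 5 minutes:", realized_variance(observed, step=300))   # much closer to IV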
Chapter 4
Stationary and integrated process
This chapter introduces the econometric methods used in the analysis of the volatility and liquidity time series from the previous chapters. Firstly, unit root and stationarity tests are outlined, since time series stationarity is required before testing more complex patterns. Then the concept of cointegration is exposed, since it is a useful tool to apply if the time series are found to be non-stationary. Both the Engle-Granger and Johansen methods are presented: the former is more suitable and easier to apply in this analysis, while the latter is used mainly as an input into further, more articulated, econometric methods.
Strict stationarity implies strong assumptions on the moments of the process, i.e. it requires all moments to be constant over time, which is difficult to satisfy; a weaker form, covariance stationarity, is usually satisfactory. Covariance stationarity only requires the first and second moments not to change over time. Even though it is a weaker assumption than the previous one, it is a useful characterization since it implies that the stochastic process' moments satisfy E(y_t) = \mu and Cov(y_t, y_{t+h}) = \gamma_h. The first moment is assumed constant, and the second condition implies that the autocovariance depends only on h \geq 0, \forall t, so that in particular the variance is constant.
A weakly dependent stationary process is said to be integrated of order zero, I(0), while a non-stationary process which can be transformed into a stationary process by taking its first difference is said to be integrated of order one, I(1). More generally, a stochastic process is said to be integrated of order d, I(d), if differencing the process d times yields a stationary process.
Applying standard regression analysis to non-stationary time series is generally incorrect, mainly because of inflated t-statistics (and therefore lower p-values) and high R². There are many unit root tests, but the Augmented Dickey-Fuller (ADF) test is the one used in this work: it is easy to compute and gives useful results for this kind of testing.
The original unit root test was developed by Dickey and Fuller (1979); the augmented version of their test is used more often nowadays, since it allows testing larger and more complex time series models. The testing procedure is the same as in the "classical" Dickey-Fuller test, but applied to a richer autoregressive specification. Let y_t be an autoregressive process of order p, AR(p):

y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i y_{t-i} + \epsilon_t,    (4.2)
An AR(p) process has a unit root if \sum_{i=1}^{p} \alpha_i = 1. A transformation of equation 4.2 can easily be made by subtracting y_{t-1} and rescaling, which yields the ADF regression of order p:

\Delta y_t = \alpha + \beta y_{t-1} + \sum_{i=1}^{p-1} \gamma_i \Delta y_{t-i} + \epsilon_t,    (4.3)
The ADF test statistic is trivially obtained as a "classical" t-statistic on the coefficient \beta, i.e. \beta / SE(\beta). However, this statistic has neither a t-distribution nor an asymptotically standard normal distribution; this is the reason why critical values have to be obtained by simulation. If the null hypothesis H_0: \beta = 0 is rejected in favor of the alternative H_1: \beta < 0, the time series does not have a unit root, while if H_0 is not rejected, the time series has a unit root, or at least there is not enough evidence that it does not have one.
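A minimal sketch of how such a test could be run in Python with statsmodels; the series here is simulated and purely illustrative.

import numpy as np
from statsmodels.tsa.stattools import adfuller

# Hypothetical liquidity measure series; replace with the actual data.
rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=500))        # a random walk, for illustration

stat, pvalue, usedlag, nobs, crit, icbest = adfuller(series, autolag="AIC")
print(f"ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}")
print("Reject unit root" if pvalue < 0.05 else "Cannot reject unit root")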
Testing for the presence of a unit root is the counterpart of testing the random walk (RW) hypothesis, which provides a means to test the weak-form efficiency of financial markets (Fama, 1970, 1991). It also helps to measure the long-run effects of shocks on the path of real output in macroeconomics (Campbell and Mankiw, 1987; Cochrane, 1988; Cogley, 1990). Indeed, a RW is the simplest example of a non-stationary process.
Given a time series y_t, the process follows a RW if the autoregressive coefficient is equal to one, \psi = 1, in

y_t = \mu + \psi y_{t-1} + \varepsilon_t,    (4.4)

where \mu is an unknown drift parameter and the error terms \varepsilon_t are usually neither independent nor identically distributed (i.i.d.).
Many statistical tests have been designed to test the RW hypothesis, but a popular class of tests is based on the variance-ratio (VR) methodology (see, e.g., Campbell and Mankiw, 1987; Cochrane, 1988; Lo and MacKinlay, 1988; Poterba and Summers, 1988; Charles and Darné, 2009). The VR methodology consists of testing the RW against stationary alternatives by exploiting the fact that the variance of random walk increments is linear in the sampling interval, i.e., the sample variance of the k-period return (or k-period difference), y_t - y_{t-k}, of the time series y_t is k times the sample variance of the one-period return (or first difference), y_t - y_{t-1}.
The VR at lag k is then defined as the ratio of (1/k) times the variance of the k-period return (or the k-th difference) to the variance of the one-period return (or the first difference). The k-period VR is then computed as

VR_k \equiv \frac{1}{k}\,\frac{Var(r_t(k))}{Var(r_t(1))},    (4.5)

where r_t(k) denotes the k-period return.
Hence, for a RW process, the variance computed at each individual lag interval k (k = 2, 3, . . . ) should
be equal to unity. If the VR is larger than 1, the price series shows a tendency to form trends, i.e.
changes in one direction are more often followed by changes in the same direction. Instead, if the VR is
less than 1, the price series shows some degree of mean reversion, which is the equivalent of a stationary
process. Changes in one direction are more often followed by changes in the opposite direction.
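Purely as an illustrative sketch (not the exact code used in this work), the VR statistic of equation (4.5) can be computed from a vector of log prices as follows; the variable names are assumptions.

```python
# Minimal sketch of the k-period variance ratio of equation (4.5).
# `y` is assumed to hold log prices; under a RW, VR_k should be close to 1.
import numpy as np

def variance_ratio(y, k: int) -> float:
    y = np.asarray(y, dtype=float)
    one_period = np.diff(y)          # y_t - y_{t-1}
    k_period = y[k:] - y[:-k]        # y_t - y_{t-k}
    return k_period.var(ddof=1) / (k * one_period.var(ddof=1))

# VR_k > 1 suggests trending behavior, VR_k < 1 suggests mean reversion.
```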
4.2 Cointegration
Once variables have been classified as integrated of order 0, 1, etc., it is possible to set up models that lead to stationary relations among the variables, and where standard inference is possible. The requirement for a stationary relation among non-stationary variables is called cointegration. Testing for cointegration is a necessary step to check whether the modelled relationships are empirically meaningful. If variables have different trend processes, they cannot stay in a fixed long-run relation to each other, implying that long-run models cannot be forecast and that there is usually no valid basis for inference based on standard distributions. If no cointegration is found, then it is necessary to continue working with the differenced variables.
Standard econometric analysis cannot be used whenever two or more time series are integrated. Moreover, any linear combination of series which are integrated of the same order is generally integrated of the same order as the individual series (Engle and Granger, 1987). Namely, if xt ∼ I(1) and yt ∼ I(1), usually also xt + βyt ∼ I(1). However, it is possible that both processes are driven by the same deterministic or stochastic trend; integration and cointegration are the concepts that allow some combination of such variables to be stationary. Integrated variables, identified by unit root and stationarity tests, can be differenced in order to obtain stationarity. Cointegrated variables, identified by cointegration tests, can be combined to form new, stationary variables: in other words, they evolve in a similar way. In such a case, the linear combination xt + βyt is integrated of a lower order than the individual time series, i.e. the combination is I(0), and xt and yt are said to be cointegrated.
In the presence of cointegration, simple differencing is a model mis-specification, since long-term information appears in the levels. The cointegrated VAR model provides an intermediate option between differences and levels, by mixing them together through the cointegrating relations. Since all terms of the cointegrated VAR model are stationary, unit root problems are eliminated in the process.
Cointegration modeling is often adopted in economic theory. Some variables that are typically described with a cointegrated vector autoregressive (VAR) model are:
ii. Spot and forward currency exchange rates and interest rates;
These macroeconomic models must consider the possibility of structural changes in the underlying data-generating process during the sample period.
Financial data, however, as mentioned, are available at high intra-day frequency. The Law of One Price suggests cointegration among closely related groups of financial variables, since otherwise arbitrage opportunities may arise.
Financial markets are characterized by "price multiplicity". In particular, different investors can provide
different valuations and attach different prices to the same asset. Also, different market venues can be
available for the same asset. Following Hasbrouck (2002), we can posit a statistical model for the joint behavior of two prices for the same asset linked together by a no-arbitrage or equilibrium relationship.
Basically, the two prices incorporate a single long-term component that takes the form of a cointegrating
relation. Cointegration involves restrictions stronger than those implied by correlation. Two stock prices
can be positively correlated but not cointegrated. If stock A is cointegrated with stock B, there exists an
arbitrage relationship that ties together the two stocks. In addition, the ask and bid quotes for stock A
are also cointegrated. The reason is that the difference between the quotes can often be characterized as
a stationary variable, meaning that it cannot explode in an unbounded way. The price of a stock on two
different exchanges can be different at any point in time, but it is natural to assume that this difference
reverts to its mean over time. There are several cointegration tests that can be used; the Johansen test is the most general and widely used one. Engle and Granger (1987) developed the first cointegration test, based on common stochastic trends, which is also known as the two-step procedure.
The first step consists in estimating the cointegrating regression of x_{1,t} on a constant and the remaining variables, where p is the number of variables in the equation. In this regression we assume that all variables are I(1) and might cointegrate to form a stationary relationship, and thus a stationary residual term û_t = x_{1,t} − β̂_1 − Σ_{i=2}^{p} β̂_i x_{i,t}. This regression represents the assumed economically meaningful (or interpretable) steady-state or equilibrium relationship among the variables. If the variables are cointegrated, they will share a common trend and form a stationary relationship in the long run. Furthermore, under cointegration the estimated parameters can be seen as the correct estimates of the long-run steady-state parameters, and the residual (lagged once) can be used as an error correction term in an error correction model.
The second step is to test for a unit root in the residual process of the cointegrating regression above.
Δû_t = α + π û_{t−1} + Σ_{i=1}^{k} γ_i Δû_{t−i} + v_t ,    (4.7)
where the constant term α can often be left out to improve the efficiency of the estimate. Under the null
hypothesis of no cointegration, the estimated residual is I(1) because x1,t is I(1), and all parameters
are zero in the long run. The empirical t-distribution is not identical to the Dickey-Fuller one, although
the tests are similar. The reason is that the unit root test is applied to a derived variable, the estimated
residual from a regression. Thus, new critical values must be tabulated through simulation. The null
hypothesis is still to check that the process has no cointegration, while the alternative hypothesis is
that the equation is a cointegrating equation, meaning that the integrated variable x1,t cointegrates at
least with one of the variables on the right hand side. A significant π value would imply co-integration.
If the dependent variable is integrated of order d > 0, and at least one regressor is integrated of the same order, cointegration leads to a stationary I(0) residual. However, the test does not tell us whether x1,t cointegrates with all, some or only one of the variables on the right hand side. Lack of cointegration means that the residual has the same stochastic trend as the dependent variable: if there is no cointegration, the integrated properties of the dependent variable pass through the equation to the residual. The test statistic for H0: π = 0 (no cointegration) against H1: π < 0 (cointegration) changes with the number of variables in the cointegrating equation and, in a limited sample, also with the number of lags in the augmentation (k > 0).
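As a purely illustrative sketch of the two-step procedure just described (and not the exact code used here), the cointegrating regression and the residual test can be combined as follows; statsmodels' coint() applies the Engle-Granger critical values, which differ from the plain ADF ones.

```python
# Minimal sketch of the Engle-Granger two-step procedure.
# x1, x2 are assumed to be aligned I(1) series (e.g. two price series).
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint

def engle_granger(x1, x2):
    # Step 1: cointegrating regression x1_t = b0 + b1*x2_t + u_t
    step1 = sm.OLS(x1, sm.add_constant(x2)).fit()
    residuals = step1.resid
    # Step 2: unit root test on the residuals, with Engle-Granger critical
    # values tabulated by simulation (handled internally by coint).
    stat, pvalue, crit = coint(x1, x2, trend="c")
    return step1.params, residuals, stat, pvalue
```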
Asymptotically, the test is independent of which variable occurs on the left hand side of the cointegrating regression. By choosing one variable for the left hand side, the cointegrating vector is said to be normalized on that variable; implicitly it is assumed that the normalization corresponds to some economically meaningful long-run relationship. However, this is not always the case in limited samples, and there is evidence that the normalization matters (Ng and Perron, 1995). If the variables in the cointegrating vector have large differences in variances, or some are near-integrated (having a large negative MA(1) component), such factors might affect the outcome of the cointegration test. If testing in different ways gives different conclusions, it is necessary to use more elaborate cointegration tests.
There are three main problems with the two-step procedure. First, since it involves an ADF test in the second step, all the problems of ADF tests apply here as well, especially the choice of the number of lags in the augmentation. Second, the test is based on the assumption of one cointegrating vector, captured by the cointegrating regression; thus, care must be taken when applying the test to models with more than two variables. If two variables cointegrate, adding a third integrated variable to the model will not change the outcome of the test: if the third variable does not belong in the cointegrating vector, OLS estimation will simply set its parameter to zero, leaving the error process unchanged. Logical chains of bi-variate testing are often necessary (or sufficient) to get around this problem. Third, the test assumes a common factor in the dynamics of the system, which can be seen by rewriting the simplest two-variable version of the test; if this common factor restriction does not hold, the test performs badly.
The advantage of the procedure is that it is easy and relatively costless to apply compared with other approaches. Especially for two variables it can work quite well, but it should be remembered that the common factor restriction is a severe one, since all short-run dynamics are forced into the residual process.
Consider a VAR of order p in levels, y_t = µ + A_1 y_{t−1} + · · · + A_p y_{t−p} + ε_t, where yt is an n × 1 vector of I(1) variables and εt is an n × 1 vector of innovations. By using the difference operator Δ = 1 − L, where L is the lag operator, the VAR in levels can be transformed into a vector error correction model (VECM); the transformation "uses up" one lag, leading to p − 1 lagged differences in the VECM

Δy_t = µ + Π y_{t−1} + Σ_{i=1}^{p−1} Γ_i Δy_{t−i} + ε_t ,    (4.10)
The test statistic varies depending on the inclusion of constants and trends in the model. The standard model can be said to include an unrestricted constant but no specific trend term. For a p-dimensional vector of variables yt, since it has been shown that Π can be decomposed as αβ′, the estimated model becomes
Δy_t = µ + αβ′ y_{t−1} + Σ_{i=1}^{k−1} Γ_i Δy_{t−i} + Φ d_t + ε_t ,    (4.11)
where µ is an unrestricted constant term and ε_t ∼ N_p(0, Σ). The number of cointegrating relationships is denoted by r, the adjustment parameters are the elements of α, and each column of β is a cointegrating vector. It can be shown that, for a given r, the maximum likelihood estimator of β defines the combination of y_{t−1} that yields the r largest canonical correlations of Δy_t with y_{t−1}, after correcting for lagged differences and deterministic variables when present.
Johansen proposes two different likelihood ratio tests to test the significance of these canonical correla-
tions and also the reduced rank of the Π matrix: the trace test and maximum eigenvalue test.
J_TRACE = −T Σ_{i=r+1}^{n} ln(1 − λ̂_i)    (4.12)
where T is the sample size and λ̂_i is the i-th largest canonical correlation. The trace test tests the null hypothesis that the coefficient matrix has rank less than or equal to r, which corresponds to at most r cointegrating vectors, against the alternative hypothesis that the matrix has full rank n, which corresponds to n cointegrating vectors.
The maximum eigenvalue test tests the null hypothesis of r cointegrating vectors against the alternative hypothesis of r + 1 cointegrating vectors. Neither the trace nor the maximum eigenvalue statistic follows a chi-square distribution. Although both statistics are useful to test the strict unit-root assumption, they perform poorly on systems characterized by near-unit-root processes.
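A minimal, purely illustrative sketch of the Johansen trace test with statsmodels is given below; det_order=0 corresponds to the unrestricted constant of the standard model above, and the input array is an assumption rather than the actual data set used here.

```python
# Minimal sketch of the Johansen trace test of equation (4.12).
# `levels` is assumed to be a (T x n) array of I(1) series in levels.
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

def johansen_trace(levels: np.ndarray, lags: int = 1):
    res = coint_johansen(levels, det_order=0, k_ar_diff=lags)
    for r, (stat, cvs) in enumerate(zip(res.lr1, res.cvt)):
        # res.lr1 holds the trace statistics, res.cvt the 90%/95%/99% critical values
        print(f"H0: rank <= {r}: trace = {stat:.2f}, 95% c.v. = {cvs[1]:.2f}")
    return res
```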
Johansen's test is used when all variables are I(1); however, as he states, there is little need to pre-test the variables in the system to establish their order of integration. If a single variable is I(0) instead of I(1), this will reveal itself through a cointegrating vector whose space is spanned by the only stationary variable in the model.
Let us consider a model in which y_t = (y_{1,t}, y_{2,t})′, where y_{1,t} is an I(1) and y_{2,t} an I(0) process. It is then expected that there should be a cointegrating vector β = (0, 1)′. If the coefficient matrix Π has full rank n, all the n variables are stationary.
The assumption that any variable that is not I(1), or a pure unit-root process, is a stationary I(0) process allows one to avoid preliminary tests on the classification of the variables as I(1) or I(0) processes. Nevertheless, this does not make the method robust to near-integrated variables, since these do not fall into either of the above-mentioned classifications. However, the above specification tests on the cointegrating vector suggest a way of making inference more robust in the potential presence of near-unit-root variables. For instance, considering the bivariate case described above, explicitly testing whether β = (0, 1)′ helps to rule out spurious relationships that are not rejected by the initial maximum eigenvalue or trace test. Although such specification tests should arguably be performed in almost every kind of application, they are likely to be especially useful in cases where the variables are likely to have near-unit roots and the initial test of the cointegration rank is biased.
As mentioned in the literature, the results should prove that the quotes are cointegrated by rejecting the null hypothesis of no cointegration. Nevertheless, rejecting the null hypothesis does not necessarily mean that the two variables are cointegrated. Indeed, rejecting the hypothesis of r = 0 would not imply that no cointegration is rejected if:
i. r = 1 is also rejected, which implies that the matrix Π has full rank because the variables are stationary, which means that it cannot be decomposed as Π = αβ′;
ii. r = 1 cannot be rejected; in this case the restrictions i. β′ = (1, 0) and ii. β′ = (0, 1) must also be accounted for in the possible results. If one of these holds, it would mean that there is no cointegration between x1t and x2t. If i) holds, the result indicates that x1t is stationary and does not have a long-run relationship with x2t; if ii) cannot be rejected, the case is symmetric.
Chapter 5
Data set and preliminary study
Intra-day data are collected from the NYSE Trade and Quote (TAQ) database for the Walmart (WMT), The Coca-Cola Company (KO), JPMorgan (JPM) and Caterpillar (CAT) stocks. The event window runs from January 29, 2001 to June 29, 2018: the sample starts right after the burst of the Dot-com bubble, in order to observe the evolution of these firms until the end of the first half of 2018. These stocks have been selected because their daily trading volume is large enough to test the idea behind this work.
The raw entries have been cleaned with the following filters (a sketch of their application follows the list):
• Delete entries whenever the price is outside the interval [Bid − 2·Spread ; Ask + 2·Spread].
• Delete all entries with a spread greater than or equal to 15 times the median spread of that day.
• Delete all entries with a price greater than or equal to 5 times the median mid-quote of that day.
• Delete all entries with a mid-quote deviating by 10 or more mean absolute deviations from the local median mid-quote.
• Delete all entries with a price deviating by 10 or more mean absolute deviations from the local median mid-quote.
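The following pandas sketch shows one possible, simplified implementation of these filters; the column names ("price", "bid", "ask"), the DatetimeIndex and the 50-observation window for the "local" median are assumptions, not the exact choices made in this work.

```python
# Minimal sketch of the TAQ cleaning filters listed above (simplified).
import pandas as pd

def clean_taq(df: pd.DataFrame, window: int = 50) -> pd.DataFrame:
    df = df.copy()
    spread = df["ask"] - df["bid"]
    mid = (df["ask"] + df["bid"]) / 2
    day = df.index.normalize()                      # group key: trading day

    # 1. price inside [bid - 2*spread, ask + 2*spread]
    keep = (df["price"] >= df["bid"] - 2 * spread) & (df["price"] <= df["ask"] + 2 * spread)
    # 2. spread below 15 times the daily median spread
    keep &= spread < 15 * spread.groupby(day).transform("median")
    # 3. price below 5 times the daily median mid-quote
    keep &= df["price"] < 5 * mid.groupby(day).transform("median")
    # 4./5. mid-quote and price within 10 mean absolute deviations
    #        of a centred local median of the mid-quote
    local_med = mid.rolling(window, center=True, min_periods=1).median()
    mad = (mid - local_med).abs().rolling(window, center=True, min_periods=1).mean()
    keep &= (mid - local_med).abs() < 10 * mad
    keep &= (df["price"] - local_med).abs() < 10 * mad
    return df[keep]
```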
Table 5.1 reports descriptive statistics of the returns of each stock, sampled at a one-second frequency. The results are not statistically significant, but show a link between their return processes.
As Tables 5.3, 5.4 and 5.5 exhibit, all stocks present low values for the low frequency measures, which do not seem to be significant, except for MI. However, the main purpose of these statistics is to understand the boundaries of each measure, in order to gain a deeper comprehension of this work's topic.
These values may imply that, on average, the firms that had a lower spread and AI were the ones with the most liquid behavior in the sample. Rather than merely "observing" the values, it is useful to verify that the measures reporting the liquidity status do not follow a Gaussian distribution, while only MI gets close to one.
Figure 5.1 shows how the log-volume of the four stocks evolved during the event window and how it is distributed across intra-day transactions. The daily trading volume increased steadily over the years until 2010, after which it started to decay sharply before recovering its previous values during the following year. While KO's trading volume was fairly stable, the other three stocks had a peak around 2009. This evidence may be related to the US real estate problems that led to the 2007 financial crisis, which then spread worldwide. The developments in the real estate market allowed the three companies to ride the event and satisfy their clients' needs, namely household (WMT), credit (JPM) and real estate (CAT) needs. Concerning the intra-day volume, all instruments behave like typical listed securities, displaying peaks at the open and close of each trading day.
Having described the trading volume, returns are analyzed to study their dynamics and their high frequency variation. Figure 5.2 shows the evolution of the stocks' RV over the period. At first glance, it can be seen that illiquidity periods arise when several peaks show up, particularly during 2008-2010 and during 2003. Since the measure is made up of two components, it is useful to take a brief look at their individual patterns, so that their comprehension is as clear as possible.
The correlation among these firms' assets is also briefly discussed, to allow for any possible portfolio-building idea between these instruments and to show how the portfolio composition may change according to the prospect of a long-term or a shorter-term investment.
Figure 5.2 displays the RV of the instruments along the event window. It is not a surprise to see peaks around years like 2002 (Dot-com bubble) and 2008 (subprime crisis and Lehman Brothers bankruptcy). While during most of the period the RV stayed fairly flat and stable, in 2016 WMT and KO had a lift due to industry-related problems: KO faced a production issue in several nations all over the world, which reduced production temporarily, while WMT's problem was, and is, related to poor customer service and employee relations caused by its policy of always offering the lowest possible price on household goods.
Figure 5.3 exhibits the intra-day RV of the four shares, computed with a sampling frequency of 5 minutes (Δ = 1/300) within each trading day. The market opening and closing effect has already been analyzed in the literature. As for any listed security, price variance is higher during the first trading minutes after the market opens; after this time frame, the variation narrows and stays fairly stable during the day. However, as the day gets near its close, price changes occur in the last minutes, since every trader wants to close out his positions.
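As a purely illustrative sketch of the realized variance computation at this frequency (the column names and the resampling rule are assumptions):

```python
# Minimal sketch of daily realized variance from 5-minute sampled prices.
# `prices` is assumed to be a pandas Series of trade prices with a DatetimeIndex.
import numpy as np
import pandas as pd

def realized_variance(prices: pd.Series, freq: str = "5min") -> pd.Series:
    p = prices.resample(freq).last().dropna()      # 5-minute sampled prices
    r = np.log(p).diff().dropna()                  # 5-minute log returns
    return (r ** 2).groupby(r.index.date).sum()    # one RV value per trading day
```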
For completeness, the relation between the RV of the illiquidity measures, namely the AI and the Bid-Ask spread (BAS) measures, has been estimated. Through the concept of RV, the realized covariance between these factors has been estimated, to check whether they can be considered interchangeable in the following study. As expected, the results show that the two measures are characterized by a perfectly positive correlation.
Since the two measures are correlated, to save space and tests AI is intentionally selected as the comparison variable, since this further stresses the topic of this work, illiquidity. To provide deeper insight, a comparison has been made between the (usual) descriptive correlation of the two stock measures and the "realized correlation" (Rρ).
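For illustration only, daily realized covariance and realized correlation between two series can be sketched from aligned 5-minute log returns as follows; the series names are placeholders.

```python
# Minimal sketch of daily realized covariance and realized correlation.
# r_a, r_b: aligned pandas Series of 5-minute log returns with a DatetimeIndex.
import numpy as np
import pandas as pd

def realized_corr(r_a: pd.Series, r_b: pd.Series):
    day = r_a.index.date
    rv_a = (r_a ** 2).groupby(day).sum()           # daily realized variance of A
    rv_b = (r_b ** 2).groupby(day).sum()           # daily realized variance of B
    rcov = (r_a * r_b).groupby(day).sum()          # daily realized covariance
    rcorr = rcov / np.sqrt(rv_a * rv_b)            # daily realized correlation
    return rcov, rcorr
```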
As can be seen, the correlation measures are quite different, due to the frequency at which the stocks are traded during the day. Andersen, Bollerslev, Diebold and Labys (1999) found that realized variances and covariances are highly skewed, while the realized standard deviation (volatility) is nearly symmetric and its log transformation behaves almost as a Gaussian. Realized correlation is usually positive, sometimes even strongly so, and it displays substantial variation over time. It is also highly correlated with realized volatility; in particular, return correlations tend to rise on high-volatility days.
Figure 5.4: WMT-KO Realized Correlation, Realized Covariance and Realized log-Covariance.
Figure 5.5: WMT-JPM Realized Correlation, Realized Covariance and Realized log-Covariance.
Figure 5.6: WMT-CAT Realized Correlation, Realized Covariance and Realized log-Covariance.
Figure 5.7: KO-JPM Realized Correlation, Realized Covariance and Realized log-Covariance.
Figure 5.8: KO-CAT Realized Correlation, Realized Covariance and Realized log-Covariance.
Figure 5.9: JPM-CAT Realized Correlation, Realized Covariance and Realized log-Covariance.
As the graphs in Figures 5.4 to 5.9 show, the realized measures behave exactly as Andersen et al. describe. Considering the idea of building a portfolio out of all these stocks, the result changes considerably depending on the investment horizon one wants to enter. The shares are, more or less, correlated, with significant values both for a long-term investment and for a shorter one.
Figure 5.10 shows pairwise combinations of the mid-values of the different shares. In some instances, they seem to follow a pattern that relates the movement of one share to the other.
To further investigate the behavior of the two measures, the above-stated variance decomposition has been used. To ascertain that these fits are consistent and capture any possible dependence, the variance decomposition between two instruments is used, by checking whether the equality Var(A − B) = Var(A) + Var(B) − 2·Cov(A, B) holds. Let Var_{A−B} be the LHS of the stated equation and Var_{AB} the RHS (obtained by summing the two variances and subtracting twice their covariance). The measures used for this test are, for the discrete statistic, the daily mid-range prices, while for the realized one the mid-price variation at a 5-minute frequency is used; a sketch of the check follows.
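A minimal sketch of this check, assuming two aligned series of (mid-range or realized) values:

```python
# Minimal sketch of the variance decomposition check behind Tables 5.8-5.11:
# Var(A - B) should equal Var(A) + Var(B) - 2*Cov(A, B).
import numpy as np

def variance_decomposition(a, b):
    a, b = np.asarray(a, float), np.asarray(b, float)
    lhs = np.var(a - b, ddof=1)                                                    # Var_{A-B}
    rhs = np.var(a, ddof=1) + np.var(b, ddof=1) - 2 * np.cov(a, b, ddof=1)[0, 1]   # Var_{AB}
    return lhs, rhs   # the two values should coincide (up to rounding)
```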
σ²_{WMT−Y}     Var_{A−B}   Var_{AB}        RV_{WMT−Y}   Var_{A−B}   Var_{AB}
KO             36.04%      36.04%          KO           5.39%       5.39%
JPM            25.64%      25.64%          JPM          14.20%      14.20%
CAT            26.62%      26.62%          CAT          8.48%       8.48%

σ²_{KO−Y}      Var_{A−B}   Var_{AB}        RV_{KO−Y}    Var_{A−B}   Var_{AB}
WMT            36.04%      36.04%          WMT          5.39%       5.39%
JPM            71.75%      71.75%          JPM          14.55%      14.55%
CAT            53.61%      53.61%          CAT          8.38%       8.38%

σ²_{JPM−Y}     Var_{A−B}   Var_{AB}        RV_{JPM−Y}   Var_{A−B}   Var_{AB}
WMT            25.64%      25.64%          WMT          14.20%      14.20%
KO             71.75%      71.75%          KO           14.55%      14.55%
CAT            29.12%      29.12%          CAT          14.59%      14.59%

σ²_{CAT−Y}     Var_{A−B}   Var_{AB}        RV_{CAT−Y}   Var_{A−B}   Var_{AB}
WMT            26.62%      26.62%          WMT          8.48%       8.48%
KO             53.61%      53.61%          KO           8.38%       8.38%
JPM            29.12%      29.12%          JPM          14.59%      14.59%
As Tables 5.8 to 5.11 show, the variance decomposition holds, both for the descriptive variance and for the RV, for all possible pairwise combinations of the stocks. With a 5-minute frequency, i.e. dividing the trading day into 78 sub-periods, the realized measure is consistent and accurate and can hence be relied upon when operating on short time frames of hours, or even seconds, of trading. These outcomes suggest that these shares co-move in an observable, accountable and measurable way, also under the "usual" conditional variance concept.
Concerning the ex-post measure, it is clear at first sight (as observed in the literature) that the price discrepancy shrinks as the intra-day sampling frequency narrows.
Therefore, because both the conditional descriptive measure and the realized one verify the variance equality, anyone who intends to open a position in a combination of these instruments should not face sudden, unexplained price variations when accounting for these measures.
Chapter 6
Empirical analysis
In this chapter, the event studies and the measure analysis are summarized. In order to give a more comprehensive understanding of the study, this chapter is divided as follows:
iv. the measures' explanatory power on the market illiquidity risk premium.
At first sight, Table 6.1 gives significant results for all the measures, although the RV and MI ones exhibit a near-unit-root result. Since this might lead us to doubt the reliability of the ADF test, the VR test confirms the rejection of the unit root by giving very low values, which indicate that all processes are stationary. Therefore, Tables 6.1 and 6.2 show that both the ADF and the VR tests give significant results for all measures, which allows us to reject the hypothesis of a unit root in the time series process of each measure.
Since all the "old" measures have been proved in the literature to be stationary, it is worth to have a
further study of the behavior of the new-developed measure, MI.
Figure 6.1 shows the evolution of MI during the event period. Even through rough illiquid
sub-periods, the index stayed pretty stable for all the instrument within the boundary [0.2; 0.35],
which may result in a difference stationary process.
Usually bid and ask prices are cointegrated, so it is beneficial to study their evolution and variation over the event window.
Figure 6.2 shows the respective bid and ask mid-prices of the firms' shares. At first sight, as already mentioned, these are likely to follow a similar price process. Considering this, the previously stated variance decomposition has been re-arranged, as before, in terms of RV, to observe whether there is an observable dependence between the quotes.
Because of the emphasis intentionally given to the decomposition of the coefficient matrix Π, a Johansen test has been run that also includes intercepts, linear trends (in the cointegrated series) and deterministic quadratic trends. The decomposition of the cointegrated process follows the VECM of equation (4.11). Nevertheless, here the focus is on the coefficients of the decomposed matrices, αβ′, because they are what allows verifying whether a process is stationary or not. The statistic used for all the Johansen tests is the trace test.
                    BA_WMT                    BA_KO
Rank            Stat          p-value     Stat          p-value
r = 0           601.2412^a    0.0010      480.3424^a    0.0010
r = 1           9.0393^a      0.0034      8.8578^a      0.0037
Stationary process test
ADF             -6.4634       0.0010      -22.7678      0.0010
|1 + α′β|       0.7473                    0.7963

                    BA_JPM                    BA_CAT
Rank            Stat          p-value     Stat          p-value
r = 0           412.4965^a    0.0010      898.0167^a    0.0010
r = 1           11.8115^a     0.0010      6.8392^a      0.0092
Stationary process test
ADF             -19.7918      0.0010      -23.2897      0.0010
|1 + α′β|       0.8257                    0.6320
Table 6.3 reports the Johansen cointegration test for all the stocks. The results show strong evidence in favor of rejecting the null hypothesis (cointegrating rank equal to zero) in favor of the alternative. Although the null hypothesis is rejected, the coefficient vectors of the decomposed matrix have also been analyzed to verify that the alternative hypothesis holds. To verify the stationarity of the process, a unit root test, the Augmented Dickey-Fuller (ADF), has been used, together with the check that |1 + α′β| lies inside the unit disk. The outcomes validate the alternative hypothesis, so it can be stated that the bid and ask prices of these stocks are cointegrated.
All variables are taken in logs. Superscripts a, b and c denote, respectively, significance at the 1%, 5% and 10% level.
Table 6.4 reports the Johansen trace statistics on the stocks' mid-range bid-ask prices, to check whether there is a cointegration relation between the instruments and whether they respect the stationarity condition. These statistics provide some evidence for rejecting the null hypothesis of no cointegration for the pairs' average prices. In any case, the stationarity check discloses a near-unit-root outcome. As Duffee and Stanton (2008) outlined, a process with a root very close to the unit boundary would let shocks last, on average, one year. This is not an issue given the purpose here, but it is important to mention that these estimates are heavily asymmetric when the data are persistent.
LF = [AI, RBA, MI, D_MON, D_FRI],
where two further dummy variables, D_MON and D_FRI, are added to the measures. These take the value of one if the trading day is, respectively, a Monday or a Friday, in order to also study any possible weekly seasonality.
Thereafter, a regression is run to establish which frequency best estimates the illiquidity size. The estimates are obtained by running an OLS regression of the bid-ask spread on the liquidity measures (a sketch is given below).
Table 6.6 reports the results of the individual regressions run for both sampling-frequency classes of measures. Table 6.5 exposes the single-regression results, run to investigate the common bid-ask spread measure, used as the main illiquidity proxy.
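As a purely illustrative sketch of these regressions (the column names in the data frame are assumptions, not the actual variable names used in this work):

```python
# Minimal sketch of an OLS regression of the bid-ask spread on liquidity measures.
import statsmodels.api as sm

def spread_regression(data, regressors):
    X = sm.add_constant(data[list(regressors)])
    return sm.OLS(data["spread"], X, missing="drop").fit()

# e.g. the low-frequency specification with the weekday dummies:
# spread_regression(data, ["AI", "RBA", "MI", "D_MON", "D_FRI"]).summary()
```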
While the high frequency measures report coefficients with the expected sign, the low frequency liquidity measures report a mixture of results in this test. While AI and RBA are found to confirm their previous characterization as illiquidity measures (since they load positively on the spread), MI presents a mix of positive and negative weights. Since this measure may also be affected by other variables, its factor loading should be interpreted carefully. In both regressions, the dummy variables are not significant at any confidence level, which might suggest that these stocks are not affected by weekly seasonality.
To further investigate the relation between these frequency classes of measures, an additional regression is run, in which the usual spread measure is regressed on all the liquidity measures; the regressor vector then contains both the LF and the HF measures.
As Table 6.6 displays, the factor loadings of all measures keep their sign, confirming that MI can be interpreted as a pure liquidity measure, in contrast to the above-mentioned AI and RBA. However, running this regression does not substantially improve the explanatory power, R². Such a result may imply that using low frequency estimates is a good way to analyze market behavior, with an affordable loss of accuracy.
The market excess return, rM − rf, is then regressed on the liquidity measures, where rM and rf are, respectively, the stock market return and the risk-free rate. Since the firms are all US companies, rM is the daily return of the S&P 500 and the risk-free rate is the three-month Treasury bill rate. It should be recalled that the coefficients should be positive for the first two measures, since illiquidity measures should reflect the higher risk, and negative for the liquid one. The goal is to run an OLS regression to quantify the weights that the market return gives to the variables (see the sketch below). Considering that the error may be correlated with one, or all, of these variables, the White test allows verifying that heteroscedasticity would not impair or bias the estimators. Through the test, it has been found that the variance of the estimators behaves homoscedastically.
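A minimal illustrative sketch of this step, assuming the data frame holds the market return, the risk-free rate and the three measures under the hypothetical column names used below:

```python
# Minimal sketch of the excess-return regression and the White heteroscedasticity test.
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

def premium_regression(data, measures=("AI", "RBA", "MI")):
    y = data["r_market"] - data["r_free"]              # daily excess market return
    X = sm.add_constant(data[list(measures)])
    fit = sm.OLS(y, X, missing="drop").fit()
    lm_stat, lm_pvalue, _, _ = het_white(fit.resid, fit.model.exog)
    return fit, lm_pvalue    # a large p-value is consistent with homoscedastic errors
```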
Table 6.7 reports the explanatory weight of each measure on the illiquidity risk premium.
The AI measure is the one with the largest weight, with very high values that are also highly significant at any level. This result renews Amihud's theory that, even under HFT, stocks can be priced differently due to their illiquid behavior. Obviously, the lower the value, the more liquid the instrument. This measure reflects that the firms' shares were often considered liquid instruments along the time window. However, even highly traded securities like these cannot always be considered liquid, since exogenous events, which may affect the market or the industry in which a firm operates, can change their usual patterns.
Since RBA is a dimensionless measure, it represents the pure illiquidity aspect that characterizes each firm's instrument. All the coefficients are positive, except for CAT. The positive relation is expected, since the wider the bid-ask spread, the more volatile (illiquid) the stock; however, the negative outcome presented by CAT may imply that this firm's price is inversely related to the usual bid-ask spread-volatility relation. This feature may be due to the industry in which it operates, or to the fact that this has been, and is, the cheapest of the four stocks.
Liquidity measures are commonly negatively related to the premium that an investor would like to earn for holding an illiquid instrument. MI re-validates this link, with significant outcomes, and also carries a modest weight in the determination of the illiquidity risk premium. The link is an inverse one: the more illiquid (volatile) the instrument, the lower, even in negative terms, the factor loading on this measure.
Chapter 7
Conclusions
This work presents new tests on HFT asset returns and liquidity measures. It is known in the literature that illiquidity explains differences in expected returns across stocks, and this holds also in the present case. As in Amihud (2002), this work shows that stock illiquidity affects the stock excess return. The "compensation" for the liquidity risk that these stocks generate, also known as the "illiquidity risk premium", also accounts for the lower liquidity with respect to Treasuries. The premiums tend to vary over time as a function of liquidity.
The illiquidity measures employed in this study are divided into two classes: high frequency (HF) and low frequency (LF) measures. While HF measures are more consistent and accurate, LF measures are easier to compute and give similar results with a minimal loss of accuracy. The HF measures are: the Realized Variance, i.e. the intra-day variation of the stock return; the Realized Liquidity Variation, i.e. the realized variance of the high frequency relative spread; and the log intra-day high frequency volume. The LF measures comprise: the Amihud Illiquidity (AI) measure, the ratio of a stock's absolute daily return to its daily dollar volume, averaged over the corresponding period, which can be interpreted as the daily stock price reaction to a dollar of trading volume; the relative bid-ask spread (RBA) measure, the ratio of the common bid-ask spread to the stock mid-range price, where a wider value indicates a more illiquid stock; and a newly developed liquidity measure based on the mid-range price, the Mid-range liquidity (MI) measure, which is the ratio of the daily mid-range price to the corresponding daily trading volume.
Before computing the relevant liquidity tests, some stationarity tests have been applied to the bid and ask prices, which follow a cointegrated process, such that their combination appears to be stationary. All the measures satisfy the hypothesis of a stationary process, while the combinations of price processes across stocks are not likely to be cointegrated.
Tests on the reliability of the two classes of frequency measures have been run. LF measures outperform HF measures in explaining the illiquidity measure, the bid-ask spread, indicating that both AI and RBA have a positive impact on the spread while MI has a negative one. However, adding the HF measures to the regression does not substantially improve the explanatory power, but it further stresses the negative impact that MI has on the illiquidity measure.
Furthermore, since it has been verified that LF measures give consistent and mostly reliable measures of illiquidity, the impact of the above-stated measures on the illiquidity risk premium has been analyzed, along with how they help to explain the "illiquidity reward". It is not new that illiquidity measures show a positive relation while the liquid one has a negative link. The simplicity of these measures should not be an argument against their effectiveness. The greater AI is, the more the stock will be subject to market manipulation by someone entering the market with a large position in the referred instrument. This is due to the illiquidity behavior of the stock itself, i.e. the larger the ratio, the lower the respective trading volume, which implies greater illiquidity. While AI reflects illiquidity directly, MI implies it in a reverse sense: the latter indicates that a liquid instrument should be characterized by a larger (i.e. smaller in absolute value) factor weight.
The illiquidity premium also takes into account how easily a position can be liquidated in a short amount of time. This is stressed by the fact that the stock excess return differs across stocks according to their liquidity or size, due to their illiquidity behavior over time. This illiquidity feature has been found to be particularly relevant for small stocks.
Hence, the results suggest that the stock excess return, usually referred to as the "risk premium" (in this instance the "illiquidity risk premium"), is partially a compensation for stock illiquidity as measured with low frequency measures. This explains why some stocks are characterized by high equity premiums. The results mean that stock excess returns reflect not only the higher risk but also the lower liquidity of stocks compared to Treasury securities.
References
[1] Abdi, F., Ranaldo, A. (2017). A Simple Estimation of Bid-Ask Spreads from Daily Close, High,
and Low Prices. The Review of Financial Studies, 30: 4437-4480.
[2] Acharya, V., Pedersen, L. (2005). Asset pricing with liquidity risk. Journal of Financial Economics,
77(2): 375-410.
[3] Admati, A.R. Pffeiderer, P. (1988). A Theory of Intraday Trading Patterns. Review of Financial
Studies, 1: 3-40.
[4] Aït-Sahalia, Y., Yu, J. (2009). High Frequency Market Microstructure Noise Estimates and Liq-
uidity Measures. The Annals of Applied Statistics, 3(1): 422-457.
[5] Alabed, M., Al-Khouri, R. (2009). The pattern of intraday liquidity in emerging markets: The case
of the Amman Stock Exchange. Journal of Derivatives & Hedge Funds. 14: 265-284.
[6] Amiram, D., Cserna, B., Levy, A. (2016). Volatility, Liquidity, and Liquidity Risk. Columbia
Business School, University of Frankfurt, Ben-Gurion University.
[7] Amihud, Y., Mendelson, H. (1986). Asset pricing and the bid-ask spread. Journal of Financial
Economics, 17: 223-249
[8] Amihud, Y., Mendelson, H., Murgia, M. (1990) Stock market microstructure and return volatility:
Evidence from Italy. Journal of Banking and Finance, 14: 423-440.
[9] Amihud, Y. (2002). Illiquidity and stock returns: cross-section and time-series effects. Journal of
Financial Markets, 5: 31-56.
[10] Amihud, Y., Mendelson, H., Pedersen, L. (2005) Market microstructure and asset pricing. Stern
School, New York University.
[11] Andersen, T. G., Bollerslev, T., Diebold, F. X., Ebens H. (2001). The distribution of realized stock
return volatility. Journal of Financial Economics, 61: 43-76.
[12] Andersen, T. G., Bollerslev, T., Diebold, F. X., Labys, P. (2001). (Understanding, Optimizing,
Using and Forecasting) Realized Volatility and Correlation. RISK, 11.
[13] Andersen, T. G., Bollerslev, T., Diebold, F. X., Labys, P. (2000b). Exchange Rate Returns Stan-
dardized by Realized Volatility are (nearly) Gaussian. Multinational Finance Journal, 4: 159-179.
[14] Andersen, T. G., Bollerslev, T., Diebold, F. X., Labys, P. (2001). The Distribution of Realized
Exchange Rate Volatility. Journal of the American Statistical Association, 96: 42-55.
[15] Andersen, T. G., Bollerslev, T., Diebold, F. X., Labys, P. (2002). Modeling and Forecasting Real-
ized Volatility. Econometrica, 71: 529-626.
[16] Andersen, T. G., Bollerslev, T., Meddahi, N. (2005). Correcting the Errors: Volatility Forecast
Evaluation Using High-Frequency Data and Realized Volatilities. Econometrica, 73: 279-296.
[17] Andersen, T. G., Bollerslev, T., Meddahi, N. (2011). Realized volatility forecasting and market
microstructure noise. Journal of Econometrics, 160: 220-234.
[18] Anderson, T.W. (1951). Estimating linear restrictions on regression coefficients for multivariate
normal distributions. Annals of Mathematical Statistics 22: 327-351.
[19] Anderson, T.W., Rubin, H. (1949). Estimation of the parameters of a single equation in a complete
system of stochastic equations. Annals of Mathematical Statistics 20: 46-63.
[20] Anderson, T.W. (2001). Reduced rank regression in cointegrated models. Journal of Econometrics,
106: 203-216.
[21] Bandi, F.M., Perron, B. (2006). Long memory and the relation between realized and implied
volatility. Journal of Financial Econometrics, 4: 636-670.
[22] Bandi, F.M., Phillips, P.C.B. (2007). A simple approach to the parametric estimation of potentially-
nonstationary diffusion processes. Journal of Econometrics, 137: 354-395.
[23] Bandi, F.M., Russell, J.R. (2008). Microstructure Noise, Realized Variance, and Optimal Sampling.
The Review of Economic Studies, 75(2): 339-369
[24] Bandi, F.M., Russell, J.R. (2006a). Separating microstructure noise from volatility. Journal of
Financial Economics, 79: 655-692.
[25] Bandi, F.M., Russell, J.R. (2006b). Comment on Hansen and Lunde. Journal of Business and
Economic Statistics, 24: 167-173.
[26] Bandi, F.M., Russell, J.R. (2006c). Volatility. In J.R. Birge and V. Linetski (Eds.) Handbook of
Financial Engineering. Elsevier North-Holland. Forthcoming.
[27] Bandi, F.M., Russell, J.R. (2006c). Market microstructure noise, integrated variance estimators,
and the accuracy of asymptotic approximations. Working paper.
[28] Bandi, F.M., Russell, J.R., Yang, C. (2007). Realized volatility forecasting in the presence of time-
varying noise. Working paper.
[29] Barndorff-Nielsen, O.E., Shephard N. (2002). Econometric analysis of realized volatility and its
use in estimating stochastic volatility models. Journal of the Royal Statistical Society, Series B, 64:
253-280.
[30] Barndorff-Nielsen, O.E., Shephard N. (2003). How accurate is the asymptotic approximation to the
distribution of realized variance?. In Donald W.F. Andrews, James L. Powel, Paul L. Ruud, and
James H. Stock (Eds.) Identification and Inference for Econometric Models (festschrift for Thomas
J. Rothenberg). Cambridge University Press. Forthcoming.
[31] Barndorff-Nielsen, O.E., Shephard N. (2004). Econometric analysis of realized covariation: high
frequency based covariance, regression, and correlation in financial economics. Econometrica, 72:
885-925.
[32] Barndorff-Nielsen, O.E., Shephard N. (2006). Variation, jumps, market frictions and high-frequency
data in financial econometrics. In R. Blundell, P. Torsten and W. K. Newey 39 (Eds.) Advances
in Economics and Econometrics. Theory and Applications. Ninth World Congress. Econometric
Society Monographs, Cambridge University Press.
[33] Beckers, S. (1983). Variance of security price returns based on high, low, and closing prices. Journal
of Business, 56: 97-112.
[34] Berkman, H., Eleswarapu, V.R. (1998). Short-term traders and liquidity: a test using Bombay
stock exchange data. Journal of Financial Economics 47: 339-355.
[35] Bessembinder, H. (1994). Bid-ask spreads in the interbank exchange markets. Journal of Financial
Economics, 35: 317-348.
[36] Bissoondoyal-Bheenick, E., Brooks, R., Treepongkaruna, S., Wee, M. (2016). Realized Volatility of
the Spread: An Analysis in the Foreign Exchange Market, Sabri Boubaker, Bonnie Buchanan, and
Duc Khuong Nguyen, eds., Risk Management in Emerging Markets (Emerald Group Publishing),
3-35.
[37] Black, S. W. (1991). Transactions costs and vehicle currencies. Journal of International Money and
Finance, 10: 512-526.
[38] Bollerslev, T., Domowitz, I. (1993). Trading Patterns and Prices in the Interbank Foreign Exchange
Market. Journal of Finance, 48: 1421-1443.
[39] Bollerslev, T., Melvin, M. (1994). Bid-ask spreads and volatility in the foreign exchange market.
Journal of International Economics, 36: 355-372.
[40] Bollerslev, T., Domowitz, I., Wang, J. (1997). Order flow and the bid-ask spread: An empirical probability model of screen-based trading. Journal of Economic Dynamics and Control, 21(8-9): 1471-1491.
[41] Boothe, P. (1988). Exchange rate risk and the bid-ask spread: A seven country comparison. Eco-
nomic Inquiry. 26(3): 485-492.
[42] Bossaerts, P., Hillion, P. (1991). Market Microstructure Effects of Government Intervention in the
Foreign Exchange Market. Review of Financial Studies, 4: 513-541.
[43] Breen, W.J., Hodrick, L.S., Korajczyk, R.A. (2002). Predicting equity liquidity. Management Sci-
ence, 48: 470-483.
[44] Breedon, F., Ranaldo, A. (2013). Intraday patterns in FX returns and order flow. Journal of Money,
Credit and Banking, 45: 953-965.
[45] Calamia, A. (1999). Market Microstructure: Theory and Empirics. LEM Working Paper, Pisa, 19
[46] Campbell, J.Y., Lo, A.W., MacKinlay A.C. (1997). The Econometrics of Financial Markets. Prince-
ton University Press.
[47] Campbell, J.Y., Mankiw, N.G. (1987). Permanent and transitory components in macroeconomic
fluctuations. American Economic Review, 77: 111-117.
[48] Carmona, R., Webster, K. (2017). The microstructure of high frequency markets.
[49] Charles, A., Darné, O. (2009). Variance ratio tests of random walk: An overview. Journal of
Economic Surveys, Wiley, 23(3): 503-527.
[50] Chordia, T., Roll, R., Subrahmanyam, A. (2001). Market liquidity and trading activity. Journal of
Finance, 56: 501-530.
[51] Cochrane, J. H. (1988). How big is the random walk in GNP?. Journal of Political Economy, 96:
893-920.
[52] Cogley, T. (1990). International evidence on the size of the random walk in output. Journal of
Political Economy, 98: 501-518.
[53] Corsi, F. (2005). Measuring and Modeling Realized Volatility: from Tick-by-tick to Long Memory,
Ph.D. thesis, University of Lugano.
[54] Corwin, S. A., Schultz, P. (2012). A simple way to estimate bid-ask spreads from daily high and
low prices. Journal of Finance, 67: 719-759.
[55] Cuonz, J. (2018). Essays on the Dynamics of Market and Liquidity Risk. Dissertation of the
University of St. Gallen
[56] Danyliv, O., Bland, B., Nicholass, D. (2014). Convenient liquidity measure for Financial markets.
Papers, 1412.5072
[57] Dickey, D. A., W. A. Fuller. (1979). Distribution of the Estimators for Autoregressive Time Series
with a Unit Root. Journal of the American Statistical Association, 74: 427-431.
[58] Dickey, D. A., W. A. Fuller. (1981). Likelihood Ratio Statistics for Autoregressive Time Series with
a Unit Root. Econometrica, 49: 1057-1072.
[59] Domowitz, I., and El-Gamal, M. A. (1999). Financial market liquidity and the distribution of
prices. Social Sciences Research Network.
[60] Duffee, G. R., Stanton, R. H. (2008). Evidence on Simulation Inference for Near Unit-Root Processes with Implications for Term Structure Estimation. Journal of Financial Econometrics, 6: 108-142.
[61] Enders, W. (1995). Applied Econometric Time Series. Hoboken, NJ: John Wiley & Sons, Inc..
[62] Engle, R. F., Granger, C. W. J. (1987). Co-integration and error correction: Representation,
estimation and testing. Econometrica, 55: 251-276.
[63] Engle, R.F., Kozicki, S. (1993). Testing for common factors (with comments). Journal of Business
Economics & Statistics, 11: 278-369.
[64] Engler, M., Jeleskovic, V. (2016). Intraday volatility, trading volume and trading intensity in the
interbank market e-MID. Philipps-Universität Marburg.
[65] Evans, M. D. D., Lyons, R. K. (2002). Order flow and exchange rate dynamics. Journal of Political
Economy, 110: 170-180.
[66] Evans, M. D. D., Rime, D. (2016). Order flow information and spot rate dynamics. Journal of
International Money and Finance, 69: 45-68.
[67] Fama, E. F. (1968). Risk, Return and Equilibrium: Some Clarifying Comments. Journal of Finance,
23(1): 29-40.
[68] Fama E. F. (1970). Efficient capital markets: A review of theory and empirical work. Journal of
Finance, 25: 383-417.
[69] Fama, E. F. (1990). Stock returns, expected returns, and real activity. Journal of Finance, 45:
1089-1108.
[70] Fama E. F. (1991). Efficient capital markets: II. Journal of Finance, 46: 1575- 1617.
[71] Fama, E.F., French, K.R. (1989). Business conditions and expected returns on stocks and bonds.
Journal of Financial Economics, 25: 23-49.
[72] Fama, E.F., French, K.R. (1992). The cross section of expected stock returns. Journal of Finance,
47: 427-465.
[73] Fama, E.F., MacBeth, J.D. (1973). Risk, return and equilibrium: empirical tests. Journal of Po-
litical Economy, 81: 607-636.
[74] Fong, K., Holden, C. W., Trzcinka., C. A. (2017). What are the best liquidity proxies for global
research?. Review of Finance, 21(4): 1355-1401.
[75] Harris, F. H. deB., McInish, T., Shoesmith, G., Wood, R. (1995). Cointegration, Error Correction, and Price Discovery on Informationally Linked Security Markets. The Journal of Financial and Quantitative Analysis, 30(4): 563-579.
[76] French, K.R., Schwert, G.W., Stambaugh, R.F. (1987). Expected stock returns and volatility.
Journal of Financial Economics, 19: 3-29.
[77] Foucault, T., Pagano, M., Roell, A. (2013). Market Liquidity: Theory, Evidence, and Policy. Oxford
University Press.
[78] Gargano, A., Riddiough, S.J., Sarno, L. (2017). The value of volume in foreign exchange. Working
paper.
[79] Gargano, A., Riddiough, S.J., Sarno, L. (2018). Volume and Excess Returns in Foreign Exchange.
Working paper.
[81] Garman, M. B., Klass, M. J. (1980). On the estimation of security price volatilities from historical data. Journal of Business, 53: 67-78.
[82] Gençay, R., Ballocchi, G., Dacorogna, M. Olsen, R. and Pictet, O. (2002). Real-Time Trading
Models and the Statistical Properties of Foreign Exchange Rates. International Economic Review,
43(2): 463-492.
[83] Glassman, D. (1987). Exchange rate risk and transactions costs: Evidence from bid-ask spreads.
Journal of International Money and Finance, 6: 479-490.
[84] Goyenko, R. Y., Holden, C. W., Trzcinka., C. A. (2009). Do liquidity measures measure liquidity?.
Journal of Financial Economics, 92: 153-81.
[85] Granger, C. W. J. (1983). Cointegrated variables and error correction models. UCSD Discussion
paper, 83-13a.
[86] Guloglu, Z. C., Ekinci, C. (2016). A comparison of bid-ask spread proxies: evidence from Borsa Istanbul futures. Journal of Economics, Finance and Accounting, 3(3): 244-254.
[87] Haldane, A. (2011). The race to zero. Bank of England speeches, given to the International Eco-
nomic Association 16th World Congress, July 8.
[88] Hamilton, J.D. (1994) Time Series Analysis. Princeton University Press.
[89] Hansen, P. R., Lunde, A. (2006). Realized variance and market microstructure noise. Journal of
Business & Economic Statistics, 24: 127-161.
[90] Hansen, B. E. (2018). Johansen’s Reduced Rank Estimator Is GMM. Department of Economics,
University of Wisconsin, Madison
[91] Hasbrouck, J. (1999). The Dynamics of Discrete Bid and Ask Quotes. The Journal of Finance,
54(6): 2109-2142.
[92] Hasbrouck, J. (2004). Liquidity in the futures pits: Inferring market dynamics from incomplete
data. Journal of Financial and Quantitative Analysis, 39: 305-26.
[93] Hasbrouck, J. (2007). Empirical Market Microstructure: The Institutions, Economics, and Econo-
metrics of Securities Trading. Oxford University Press, 1
[94] Hasbrouck, J. (2009). Trading costs and returns for US equities: the evidence from daily data.
Journal of Finance, 64: 1445-77.
[95] Hasbrouck, J. (2018). High-Frequency Quoting: Short-Term Volatility in Bids and Offers. Journal
of Financial and Quantitative Analysis, 53(2): 613-641.
[96] Hasbrouck, J., Sofianos, G. (1993). The trades of market makers: an empirical analysis of NYSE
specialists. Journal of Finance 48/5: 1565-1593.
[97] Hendry, D. F., Juselius, K. (2001). Explaining Cointegration Analysis: Part II. The Energy Journal,
22(1): 75-120.
[98] Hjalmarsson, E., Osterholm, P. (2007). Testing for Co-integration Using the Johansen Methodology
when Variables are Near-Integrated. IMF Working Paper, 07(141): 22-27.
[99] Holden, C.W., Jacobsen, S., Subrahmanyam, A. (2014). The Empirical Analysis of Liquidity.
Foundations and Trends, 8(4): 1-102.
[100] Hwang, S., Satchell, S. (2000). Market Risk and the Concept of Fundamental Volatility: Measuring
Volatility Across Asset and Derivative Markets and Testing for the Impact of Derivatives Markets
on Financial Markets. Journal of Banking & Finance, 24(5): 759-785.
[101] Jacoby, G., Fowler, D. J., Gottesman, A. A. (2000) The capital asset pricing model and the
liquidity effect: A theoretical approach. Journal of Financial Markets, 3: 69-81.
[102] Jacod, J. (1994). Limit of random measures associated with the increments of a Brownian semi-
martingale. Working paper.
[103] Jacod, J. and P. Protter (1998). Asymptotic error distributions for the Euler method for stochastic
differential equations. Annals of Probability, 26: 267-307.
[104] Johansen, S. (1996). Likelihood-based inference in cointegrated vector autoregressive models. Ox-
ford University Press, Oxford.
[105] Johansen, S. (2002a). A small sample correction of the test for cointegrating rank in the vector
autoregressive model. Econometrica, 70: 1929-1961.
[106] Johansen, S. (2002b). A small sample correction for tests of hypotheses on the cointegrating vectors. Journal of Econometrics, 111: 195-221.
[107] Johansen, S. (2004a). The interpretation of cointegrating coefficients in the cointegrated vector
autoregressive model. Forthcoming Oxford Bulletin of Economics and Statistics.
[108] Johansen, S. (2009). Cointegration: Overview and Development. Financial Time Series, Springer.
[109] Johansen, S. (2014). Times Series: Cointegration. CREATES Research Paper, 2014-2038.
[110] Karnaukh, N., Ranaldo, A., Söderlind, P. (2015). Understanding FX liquidity. Review of Financial
Studies, 28(11): 3073-3108.
[111] Khan, W.A., Baker, H.K. (1993). Unlisted trading privileges, liquidity and stock returns. Journal
of Financial Research 16: 221-236.
[112] Kyle, A. S. (1985). Continuous Auctions and Insider Trading. Econometrica 53: 1315-1335.
[113] Lesmond, D. A., Ogden, J. P., Trzcinka, C. A. (1999). A new estimate of transaction costs. Review
of Financial Studies, 12: 1113-41
[114] Llorente, G., Michaely, R., Saar, G., Jiang, W. (2002). Dynamic volume-return relation of indi-
vidual stocks. Review of Financial Studies, 15: 1005-1047.
[115] Lou, X. (2017). Price Impact or Trading Volume: Why is the Amihud (2002) Illiquidity Measure
Priced?. The Review of Financial Studies, 30(12): 4481-4520.
[116] Mancini, L., Ranaldo, A., Wrampelmeyer, J. (2013). Liquidity in the foreign exchange market:
Measurement, commonality, and risk premiums. Journal of Financial Markets, 68: 1805-41.
[117] Melvin, M., Tan, K.-H. (1996). Foreign exchange market Bid-Ask spreads and the market price
of social unrest. Oxford Economic Papers, 48: 329-341.
[118] Mullins Jr., D.W. (1982). Does the capital asset pricing model work?. Harvard Business Review:
1-2: 105-113.
[119] Muranaga, J., Shimizu, T. (1999). Market Microstructure and Market Liquidity. Bank for Inter-
national Settlements, Market Liquidity: Research Findings and Selected Policy Implications, 11:
1-28.
[120] Nagel, S. (2012). Evaporating liquidity. Review of Financial Studies, 25: 2005-2039.
[121] Newey, W. K., West. K. D. (1987). A Simple Positive Semidefinite, Heteroskedasticity and Auto-
correlation Consistent Covariance Matrix. Econometrica, 55: 703-708.
[122] Newey, W. K., West, K. D. (1994). Automatic Lag Selection in Covariance Matrix Estimation. The Review of Economic Studies, 61: 631-653.
[124] O’Hara, M. (2015). High frequency market microstructure. Journal of Financial Economics,
116(2): 257-270.
[125] Oomen, R. A. C. (2001). Using high-frequency data stock market index data calculate, model and
forecast realized return variance. European University Institute, Economics Discussion Paper, No.
2001/6.
[126] Oomen, R. A. C. (2002). Modeling Realized Variance when Returns are Serially Correlated.
Manuscript, Warwick Business School.
[127] Parkinson, M. (1980). The extreme value method for estimating the variance of the rate of return.
Journal of Business, 53: 61-65.
[128] Ranaldo, A. (2000). Intraday Trading Activity on Financial Markets: The Swiss Evidence. PhD
thesis, University of Fribourg.
[129] Ranaldo, A., Santucci de Magistris, P. (2019). Trading Volume, Illiquidity and Commonalities in
FX Markets.
[130] Roch, A., Soner, H.M. (2013). Resilient Price Impact of Trading and the Cost of Illiquidity.
International Journal of Theoretical and Applied Finance, 16: 1-27.
[131] Roll, R. (1984). A simple implicit measure of the effective bid-ask spread in an efficient market.
Journal of Finance, 39: 1127-1139.
[132] Said, S. E., Dickey D. A. (1984). Testing for Unit Roots in Autoregressive Moving Average Models
of Unknown Order. Biometrika, 71(3): 599-607.
[133] Schwartz, R.A., Francioni, R. (2004). Equity Markets in Action: The Fundamentals of Liquidity,
Market Structure & Trading. John Wiley & Sons.
[134] Sims, C., Stock, J., Watson, M. (1990). Inference in Linear Time Series Models with Some Unit
Roots. Econometrica, 58: 113-144.
[135] Sjö, B. (2010). Testing for Unit Roots and Cointegration.
[136] Stock, J. H. (1987). Asymptotic properties of least squares estimates of cointegration vectors.
Econometrica, 55: 1035-1056.
[137] Treynor, Jack L. (1962). Toward a Theory of Market Value of Risky Assets.
[138] Vayanos, D., Wang, J. (2012). Liquidity and asset prices under asymmetric information and
imperfect competition. Review of Financial Studies, 25: 1339-1365.
[139] Wang, J. (1994). A model of competitive stock trading volume. Journal of Political Economy,
102: 127-168.
[141] Zhang, F. (2010). High-Frequency Trading, Stock Volatility, and Price Discovery. SSRN Electronic
Journal.
Summary
Liquidity and trading activity are important features of present-day financial markets, but little is known
about their evolution over time or about their time-series determinants. This is largely due to the limited
data coverage of U.S. stock markets, which have provided this information only for some assets and for
roughly four decades. Their fundamental importance is reflected in the influence that trading costs have
on required returns (Amihud and Mendelson, 1986; Jacoby, Fowler, and Gottesman, 2000), which implies a
direct link between liquidity and corporate costs of capital. More generally, exchange organization,
regulation, and investment management could all benefit from a better knowledge of the factors that
influence liquidity and trading activity. Understanding unobservable features of the market, such as the
magnitude of returns to liquidity provision (Nagel, 2012), the impact that asymmetric information has on
prices (Llorente et al., 2002), and whether trades are more likely to have a permanent or a transitory
impact on price (Wang, 1994; Vayanos and Wang, 2012), should increase investor confidence in financial
markets and thereby enhance the efficacy of corporate resource allocation.
At present, many measures serve as good proxies for understanding the liquidity behavior of stocks.
However, these are based on microstructure data, that is, high frequency trading (HFT) data. HFT data are
characterized by short time horizons, in which positions are opened and closed even within fractions of a
second. These estimates are usually very accurate in capturing contemporaneous trading activity.
Nevertheless, they involve complex computations, in some instances require assumptions on prices, and,
above all, demand a huge amount of data per stock that is not available for many securities, even listed
ones.
Conversely, low frequency measures, which are based on a longer horizon than the previous ones, rely on
data sampled, usually, at a daily frequency. They are easier to compute and require less information,
which is moreover often readily available for many instruments, listed or not. However, the smaller amount
of information embedded in each observation may reduce their explanatory power for the topic examined
here.
Even so, the liquidity literature has developed many low frequency measures that give estimates consistent
with their high frequency counterparts. The aim of this work is to rely on low frequency measures,
computable with daily data that can easily be obtained for almost all stocks, to explain the illiquidity
risk premium, that is, the premium an illiquid stock should pay its holder in exchange for the risk borne
by holding that instrument. Illiquid stocks carry larger illiquidity premiums, which are incorporated in
the usual concept of market risk premium. Hence, riskier (more illiquid) stocks yield higher premiums.
The hypothesis of this work is tested with a regression model that relates the stock excess return to the
market and to three daily liquidity measures: rM and rf are, respectively, the stock market return and the
risk-free rate, AI is the Amihud illiquidity measure, RBA is the relative bid-ask spread, and MI is a
newly developed measure that accounts for stock liquidity. Since the firms' stocks are all from US
companies, rM is the daily return on the S&P 500 and the three-month Treasury bill rate is used as the
risk-free rate.
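The exact specification is reported in the main text; a plausible form, under the assumption that the
daily stock excess return is regressed on the market excess return together with the three measures (the
coefficient symbols here are purely illustrative), is

r_{i,t} - r_{f,t} = \alpha + \beta\,(r_{M,t} - r_{f,t}) + \gamma_{AI}\,AI_t + \gamma_{RBA}\,RBA_t + \gamma_{MI}\,MI_t + \varepsilon_t .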
Before dealing with the core hypothesis of this work, stationarity tests are run on the measures, and the
explanatory power of both frequency classes is assessed, in order to establish whether the daily frequency
can provide reliable estimators of illiquidity.
Liquidity measures are computed for four HFT stocks from different industries: Walmart (WMT), The
Coca-Cola Company (KO), JPMorgan (JPM) and Caterpillar (CAT). The data window runs from January 29, 2001
to June 29, 2018. This period, which starts right after the burst of the Dot-com bubble, was chosen in
order to observe the evolution of these measures, and of the respective firms, up to the end of the first
half of 2018.
Firstly, the measures are briefly presented. The exposition starts with the Roll (1984) measure, one of
the pioneering contributions in this field: a spread estimator that has been used over and over, despite
its limitations. Many scholars have built on this result in order to overcome those limits. Abdi and
Ranaldo craft a measure, implicitly relying on the same assumptions made by Roll, based on the mid-range
price, defined as the simple average of the high and low prices. There is, in addition, the Amihud (2002)
illiquidity measure, formulated as the daily ratio of the absolute stock return to its dollar volume,
averaged over some period. It can be viewed as the daily price response associated with one dollar of
trading volume, and thus roughly reflects price impact. This measure has received considerable attention
in the literature, and the final test of this work builds on the same idea. Other measures have been
established in the literature by Hasbrouck (2004, 2009), who proposes a Gibbs sampler Bayesian estimation
of the Roll model, and by Lesmond, Ogden, and Trzcinka (1999), who introduce an estimator based on zero
returns. Following the same line of reasoning, Fong, Holden, and Trzcinka (2017) formulate a new estimator
that simplifies the previous measures. Holden (2009), jointly with Goyenko, Holden, and Trzcinka (2009),
introduces the Effective Tick measure based on the concept of price clustering. High and low prices have
usually been used to proxy volatility (Garman and Klass, 1980; Parkinson, 1980; Beckers, 1983). Corwin and
Schultz (2012) use them to put forward an original estimation method for transaction costs, in which the
high (low) price is assumed to be buyer- (seller-) initiated, so that their ratio can be disentangled into
efficient price volatility and the bid-ask spread. All of these measures, however, come with flaws that
are discussed in the literature.
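As an illustration only (not the code used in this thesis), a minimal Python sketch of the Roll estimator
from a series of daily closing prices might look as follows; the price list is a made-up placeholder.

```python
import numpy as np

def roll_spread(close_prices):
    """Roll (1984) implicit effective spread: twice the square root of the
    negative first-order autocovariance of price changes. Undefined (NaN)
    when the autocovariance is positive."""
    dp = np.diff(np.asarray(close_prices, dtype=float))
    autocov = np.cov(dp[1:], dp[:-1])[0, 1]
    return 2.0 * np.sqrt(-autocov) if autocov < 0 else np.nan

# Made-up daily closing prices, purely for illustration
prices = [100.00, 100.05, 99.98, 100.04, 99.97, 100.03, 99.99, 100.02]
print(round(roll_spread(prices), 4))
```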
Secondly, high frequency measures are presented together with the related literature. These require the
gathering of microstructure data. Market microstructure has been analyzed for almost four decades, since
it has been a game changer in the trading industry. Jacod (1994) and Jacod and Protter (1998) were among
the first studies in this field and introduced the concept of realized variance, the observable
counterpart of the latent variance. Andersen, Bollerslev, Diebold and Ebens (2001) and Andersen,
Bollerslev, Diebold and Labys (2001 and 2002) refine the realized measure and construct it from short-term
price changes, thereby treating volatility as observable rather than as a latent variable. It has been
shown to be a useful estimate of the fundamental integrated volatility by Hasbrouck (2018), Hwang and
Satchell (2000) and Zhang (2010). However, Hansen and Lunde (2006) pointed out that the presence of market
microstructure noise in high frequency data complicates the estimation of financial volatility and makes
standard estimators, such as the realized variance, unreliable. Market microstructure noise thus
challenges the applicability of theoretical results that rely on the absence of noise. The best remedy for
market microstructure noise depends on the properties of the noise. The time dependence in the noise and
the correlation between noise and the efficient price arise naturally in some models of market
microstructure effects, including (a generalized version of) the bid-ask model by Roll (1984) (Hasbrouck,
2004) and models where agents have asymmetric information (Glosten and Milgrom, 1985; Easley and O'Hara,
1987, 1992). Market microstructure noise has many sources, including the discreteness of the data (Harris,
1990, 1991) and properties of the trading mechanism (Black, 1976; Amihud and Mendelson, 1987; O'Hara,
1995). Other studies have focused on one of the most widely used quantities that plainly represents
illiquidity: the bid-ask spread. In particular, starting from Amihud and Mendelson's (1986) liquidity
study, several lines of research have examined the evolution of the volatility of this quantity. Studies
on liquidity volatility are found mostly for foreign exchange bid-ask spreads: Glassman (1987) and Boothe
(1988) study the statistical properties of the bid-ask spread, while Bollerslev and Melvin (1994) evaluate
the distribution of bid-ask spreads and their ability to explain foreign exchange rate volatility on a
tick-by-tick data set. Furthermore, Andersen et al. (2001) and Cuonz (2018) model a realized variance
measure of the intra-day bid-ask spread which satisfies the realized variation assumptions and gives
consistent results, even though stock returns are not involved.
Thereafter, the theory behind realized variance, its link with market microstructure, and stationarity
tests are presented, in order to provide the key concepts needed to understand the behavior of these
quantities.
On the one hand, low frequency measures are computed as follows:

RBA_t = \frac{BAS_t}{\eta_t} ,    (7.1)

where BAS_t is the bid-ask spread and \eta_t is the log mid price of the respective stock at time t;

AI_t = \frac{1}{N} \sum_{t=1}^{N} \frac{|r_t|}{Vol_t} ,    (7.2)

where r_t is the stock return over a day in a year, Vol_t is the respective daily volume and N is the
number of days for which trading data are available;

MI_t = \frac{1}{N} \sum_{t=1}^{N} \frac{\eta_t}{Vol_t} .    (7.3)

On the other hand, the high frequency measures are:

RV_t \equiv \sum_{j=1}^{N_f} r_{t,f,j}^{2} ,    (7.4)

where r_{t,f,j} is the j-th intra-day return over the N_f time horizons;

lVol_t = \log(Vol_t) = \log\left( \sum_{i=1}^{m_T} NST_i \right) ,    (7.5)

lRBA_h(t; m) = \sum_{i=1}^{mh} \Delta BA^{2}_{(m)}\left( t - h + \frac{i}{m} \right) .    (7.6)
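A minimal sketch of how equations (7.1)-(7.3) could be computed from daily data, assuming bid, ask and
volume series in a pandas DataFrame with illustrative column names (this is not the code used in the
thesis):

```python
import numpy as np
import pandas as pd

# Made-up daily bid, ask and volume series for one stock; column names are assumptions
data = pd.DataFrame({
    "bid":    [ 99.98, 100.01,  99.95, 100.03, 100.00],
    "ask":    [100.02, 100.05, 100.01, 100.07, 100.06],
    "volume": [1.2e6,  0.9e6,  1.5e6,  1.1e6,  1.3e6],
})

mid = (data["bid"] + data["ask"]) / 2.0
eta = np.log(mid)                    # log mid price, eta_t in (7.1) and (7.3)
bas = data["ask"] - data["bid"]      # bid-ask spread, BAS_t

rba = bas / eta                                           # relative bid-ask spread, eq. (7.1)
ai  = (mid.pct_change().abs() / data["volume"]).mean()    # Amihud illiquidity, eq. (7.2)
mi  = (eta / data["volume"]).mean()                       # mid-range measure, eq. (7.3)

print(rba.round(6).tolist())
print(ai, mi)
```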
Firstly, tests for non-stationarity computed on the measures confirm that they follow neither a random
walk nor an explosive process.
ADF test (Table 6.1):

                  WMT      KO       JPM      CAT
Low frequency
AI                0.5886   0.5906   0.6107   0.5764
RBA               0.7991   0.8104   0.7887   0.8428
MI                0.9998   0.9998   0.9998   0.9998
High frequency
RV                0.9998   0.9998   0.9998   0.9998
lRBA              0.9227   0.9300   0.9152   0.9328
lVol              0.9227   0.9300   0.7652   0.8189

VR test (Table 6.2):

                  WMT      KO       JPM      CAT
Low frequency
AI                0.5078   0.4887   0.5105   0.4671
RBA               0.5304   0.5286   0.5615   0.5521
MI                0.6912   0.6781   0.7476   0.6876
High frequency
RV                0.5218   0.4993   0.6943   0.5534
HFRBA             0.5444   0.5382   0.5325   0.5803
lVol              0.6758   0.6552   0.7014   0.6708
The outputs give significant results for all the measures, although the RV and MI series exhibit
near-unit-root behavior. Since this might cast doubt on the reliability of the ADF test, the VR test is
used as a check: it confirms the rejection of the unit root by giving very low values, indicating that all
processes are stationary. Therefore, Tables 6.1 and 6.2 show that both the ADF and the VR tests give
significant results for all measures, which allows the hypothesis of a unit root in the time series of
each quantity to be rejected.
The time series path of MI over the event period shows that, even through rough illiquid sub-periods, the
index stayed fairly stable for all the instruments within the interval [0.2, 0.35], which may point to a
difference stationary process.
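As a sketch of the two tests, and not the exact procedure behind Tables 6.1 and 6.2, the ADF test can be
run with statsmodels and a plain (uncorrected) Lo-MacKinlay variance ratio can be computed by hand; the
series below is a simulated placeholder.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def adf_report(series, name):
    """Augmented Dickey-Fuller test; the null hypothesis is a unit root."""
    stat, pvalue, *_ = adfuller(np.asarray(series, dtype=float), autolag="AIC")
    print(f"{name}: ADF stat = {stat:.4f}, p-value = {pvalue:.4f}")

def variance_ratio(series, q=5):
    """Plain variance ratio: variance of q-period changes over q times the
    variance of one-period changes; values far below one speak against a
    random walk (no small-sample or overlap corrections applied here)."""
    x = np.asarray(series, dtype=float)
    var_1 = np.var(np.diff(x), ddof=1)
    var_q = np.var(x[q:] - x[:-q], ddof=1)
    return var_q / (q * var_1)

# Made-up stationary series standing in for a daily illiquidity measure
rng = np.random.default_rng(0)
toy_ai = 0.55 + rng.normal(0.0, 0.05, size=1000)

adf_report(toy_ai, "AI (toy)")
print("VR(q=5) =", round(variance_ratio(toy_ai, q=5), 4))
```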
As Frederick (1995) and Bollerslev (1997) found, bid and ask prices are often cointegrated, so it is
worthwhile to study their evolution and variation over the event window. To analyze their behavior, and
building on the previous tests on the data set, the daily mid-range bid and ask prices of each stock are
used to examine their movement pattern.
At first sight, the bid and ask prices of the firms' shares appear to follow a similar price process,
which seems to be non-stationary. The Johansen test is then applied to verify this assessment and to find
a combination of the two series that is stationary.
Johansen cointegration test:

                      BA_WMT                       BA_KO
Rank             Stat          p-value        Stat          p-value
r = 0            601.2412a     0.0010         480.3424a     0.0010
r = 1            9.0393a       0.0034         8.8578a       0.0037
Stationary process test
ADF              -6.4634       0.0010         -22.7678      0.0010
|1 + α'β|        0.7473                       0.7963

                      BA_JPM                       BA_CAT
Rank             Stat          p-value        Stat          p-value
r = 0            412.4965a     0.0010         898.0167a     0.0010
r = 1            11.8115a      0.0010         6.8392a       0.0092
Stationary process test
ADF              -19.7918      0.0010         -23.2897      0.0010
|1 + α'β|        0.8257                       0.6320
The Johansen cointegration test provides, for all the stocks, strong evidence to reject the null
hypothesis (cointegrating rank equal to zero) in favor of the alternative. Beyond the rejection of the
null, the coefficient vectors of the decomposed matrix have been analyzed to verify that the alternative
hypothesis holds. To verify the stationarity of the cointegrating relation, a unit-root test, the
Augmented Dickey-Fuller (ADF) test, is used together with the check that |1 + α'β| lies inside the unit
circle. The outcomes validate the alternative hypothesis, so it can be stated that the bid and ask prices
of these stocks are cointegrated.
However, both the bid and the ask prices, as noted in the literature, are non-stationary; nevertheless,
their linear combination turns out to be stationary, which allows one to state that they are cointegrated.
By contrast, the mid-prices across stocks are not cointegrated, and their combinations do not satisfy the
stationarity requirements. Every possible combination among the stocks has been tried, but none satisfies
the stationarity conditions. This result is to be expected, given the weak relationship among the firms'
industries.
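Purely as an illustration of the trace test, and not the procedure used to produce the table above, the
Johansen test is available in statsmodels; the bid and ask series below are simulated placeholders sharing
one common stochastic trend.

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

# Simulated bid and ask series sharing one stochastic trend (placeholders, not thesis data)
rng = np.random.default_rng(1)
efficient = 100.0 + np.cumsum(rng.normal(0.0, 0.1, size=2000))
bid = efficient - 0.02 + rng.normal(0.0, 0.005, size=2000)
ask = efficient + 0.02 + rng.normal(0.0, 0.005, size=2000)

res = coint_johansen(np.column_stack([bid, ask]), det_order=0, k_ar_diff=1)

# Trace statistics compared with their 95% critical values (res.cvt columns are 90/95/99%)
for r, (stat, cvals) in enumerate(zip(res.lr1, res.cvt)):
    print(f"rank <= {r}: trace = {stat:.2f}, 95% critical value = {cvals[1]:.2f}")
```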
Then, tests of the relation between the bid-ask spread, one of the most reliable illiquidity measures, and
the low and high frequency measures show that the low frequency ones are good estimates, while the latter
are also good proxies but with less explanatory power than the daily ones. Moreover, a "comprehensive"
test with all variables adds only slight significance relative to the previous ones, but stresses the role
of the newly developed liquidity measure. Measures are first grouped into low and high frequency vectors,

LF = [AI, RBA, MI, D_MON, D_FRI],    HF = [RV, lRBA, lVol, D_MON, D_FRI],

where two dummy variables, D_MON and D_FRI, are added to the measures. These take the value of one if the
trading day is, respectively, a Monday or a Friday, in order to also study possible weekly seasonality.
Thereafter, a regression is run to establish which frequency best estimates the illiquidity measure; the
estimates are obtained by OLS.
Low frequency
                  WMT          KO           JPM          CAT
AI                13.4344a     5.4209c      2.7611c      -2.7682b
                  (4.45)       (2.76)       (1.68)       (-2.01)
RBA               2.7841a      2.1671a      2.7461a      2.0728a
                  (17.81)      (21.94)      (12.01)      (26.86)
MI                0.0692b      -0.2839a     0.1495a      -0.3224a
                  (2.31)       (-7.62)      (3.73)       (-7.86)
D_MON             0.0001       0.0020       0.0013       0.0020
                  (0.02)       (0.97)       (0.35)       (0.75)
D_FRI             -0.0004      -0.0002      0.0009       0.0007
                  (-0.16)      (-0.11)      (0.35)       (0.27)
Const.            -0.058       0.0728a      -0.0232b     0.1289a
                  (-0.68)      (8.22)       (-2.22)      (10.43)

High frequency
                  WMT          KO           JPM          CAT
RV                0.0943a      17.8142a     5.0102a      5.4451a
                  (3.92)       (4.11)       (4.90)       (6.46)
lRBA              0.0123a      0.0072a      0.0090a      0.0234a
                  (22.46)      (13.51)      (10.80)      (35.54)
lVol              -0.0073a     -0.0119a     -0.0087a     -0.0019b
                  (-8.05)      (-13.38)     (-7.13)      (-2.47)
D_MON             -0.0005      0.0005       -0.0002      0.0002
                  (-0.68)      (0.68)       (-0.16)      (0.29)
D_FRI             -0.0006      -0.0002      -0.0002      0.0004
                  (-0.85)      (-0.31)      (-0.15)      (0.46)
Const.            0.0943a      0.1904a      0.1332a      -0.0315b
                  (6.31)       (12.48)      (6.09)       (-2.49)
While the high frequency measures report the expected signs, the low frequency liquidity measures deliver
mixed results in this test. AI and RBA behave in line with their interpretation as illiquidity measures,
since they load positively on the spread, whereas MI presents a mix of positive and negative weights.
Since this measure may also be affected by other variables, its factor loading should be interpreted
carefully. In both regressions the dummy variables are not significant at any confidence level, which
suggests that these stocks are not affected by weekly seasonality.
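One way such a regression could be implemented is sketched below, with simulated placeholder data and
Newey-West (HAC) standard errors assumed as a guard against serially correlated errors; this is not the
thesis code, and the variable names simply mirror the LF vector above.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated placeholder data for one stock
n = 750
rng = np.random.default_rng(2)
dates = pd.bdate_range("2001-01-29", periods=n)
df = pd.DataFrame({
    "spread": rng.normal(0.05, 0.010, n),
    "AI":     rng.normal(0.60, 0.100, n),
    "RBA":    rng.normal(0.02, 0.005, n),
    "MI":     rng.normal(0.30, 0.050, n),
}, index=dates)
df["D_MON"] = (df.index.dayofweek == 0).astype(float)  # Monday dummy
df["D_FRI"] = (df.index.dayofweek == 4).astype(float)  # Friday dummy

# OLS of the spread on the low frequency vector, with Newey-West (HAC) covariance
X = sm.add_constant(df[["AI", "RBA", "MI", "D_MON", "D_FRI"]])
fit = sm.OLS(df["spread"], X).fit(cov_type="HAC", cov_kwds={"maxlags": 5})
print(fit.params.round(4))
print(fit.tvalues.round(2))
```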
To further investigate the relation between these frequency measures, an additional regression is run in
which the usual spread measure is regressed jointly on all the liquidity measures, low and high frequency.
The factor weights of all measures keep their signs, confirming that MI can be interpreted as a pure
liquidity measure, in contrast to the above-mentioned AI and RBA. However, running this joint regression
does not appreciably improve the explanatory power (R2) of the regression. This result suggests that using
low frequency estimates is a good way to analyze market behavior, at an affordable loss of precision.
Lastly, the core test of this study, the relation between the daily liquidity measures and the illiquidity
risk premium, gives consistent results that confirm Amihud's findings. The test is based on a
re-arrangement of the CAPM equation, already treated by authors such as Fama and French (1992) and by
Amihud himself, in order to price the illiquidity risk of these HFT shares. Hence, more illiquid stocks
should carry a higher illiquidity risk premium, which is measured through the regression model described
above.
It should be recalled that the coefficients are expected to be positive for the first two measures, since
illiquidity measures should reflect the higher risk, and negative for the liquidity one. The goal is to
run an OLS regression to quantify the weights that the market return assigns to these variables. Since the
error variance may be related to one, or all, of these variables, the White test is used to verify that
heteroskedasticity does not impair or bias the estimators; the test indicates that the residuals can be
treated as homoskedastic.
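As a sketch only, the White test can be applied to the residuals of such an OLS regression with
statsmodels; the excess returns and measures below are simulated placeholders, not the thesis data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

# Simulated placeholder data: daily excess returns and the three liquidity measures
n = 750
rng = np.random.default_rng(3)
df = pd.DataFrame({
    "excess_ret": rng.normal(0.0003, 0.010, n),
    "AI":         rng.normal(0.60, 0.100, n),
    "RBA":        rng.normal(0.02, 0.005, n),
    "MI":         rng.normal(0.30, 0.050, n),
})

X = sm.add_constant(df[["AI", "RBA", "MI"]])
ols = sm.OLS(df["excess_ret"], X).fit()

# White test: the null hypothesis is homoskedastic residuals
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(ols.resid, X)
print(f"White LM statistic = {lm_stat:.2f}, p-value = {lm_pvalue:.4f}")
print(ols.params.round(4))
```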
The AI measure is the one with the largest weight among the regressors, with very high values that are
also highly significant at any level. This result supports Amihud's theory that, even among HFT stocks,
securities can be priced differently according to their illiquidity. Naturally, the lower the value, the
more liquid the instrument. The estimates indicate that the firms' shares were often regarded as liquid
instruments over the sample window. However, even highly traded securities like these cannot always be
considered liquid, since exogenous events affecting the market, or the industry in which a firm operates,
can alter their usual patterns.
Since RBA is a dimensionless measure, it captures the pure illiquidity component of each firm's
instrument. All the coefficients are positive, except for CAT. The positive relation is expected, since
the larger the bid-ask spread, the more volatile (illiquid) the stock; the negative outcome for CAT,
however, may imply that this firm's price is inversely related to the usual bid-ask spread-volatility
relation. This feature may be due to the industry in which it operates, or to the fact that CAT has been,
and still is, the lowest-priced stock in the sample.
Liquidity measures are commonly negatively related to the premium that an investor requires for holding an
illiquid instrument. MI re-validates this link, with significant outcomes, and also carries a small weight
in the determination of the illiquidity risk premium. The inverse relation implies that the more illiquid
(volatile) the instrument, the lower, even negative, the loading on this measure.
Hence, the results suggest that the stock excess return, usually referred to as the "risk premium" (in
this instance the "illiquidity risk premium"), is partially a compensation for stock illiquidity as
measured with low frequency measures. This explains why some stocks are characterized by high equity
premiums. The results imply that stock excess returns reflect not only the higher risk but also the lower
liquidity of stocks compared to Treasury securities.