Autocorrelation I PG
Autocorrelation arises when the effects of the disturbance (error) terms in a time series transfer from one period to another. In other words, the error for one time period a is correlated with the error for a subsequent time period b.
Symbolically,
E(ui uj) = 0,  i ≠ j   (no autocorrelation)
E(ui uj) ≠ 0,  i ≠ j   (presence of autocorrelation)
Autocorrelation is a standardised measure, lying between −1 and +1, of the extent to which the current value of a series is related to its own previous value. It can be defined as the correlation between observations on a variable at different points of time in the case of time series data, or between observations at different points in space in the case of cross-sectional data.
Autocorrelation measures the degree of similarity between a given time series and a lagged version of that time series over successive time periods. It is similar to calculating the correlation between two different variables, except that in autocorrelation we calculate the correlation between two versions, Xt and Xt−k, of the same time series.
Autocorrelation refers to the degree of correlation of the same variable between two successive time intervals. It measures how the lagged values of a variable are related to its current values in a time series.
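As a minimal sketch of this calculation (the series name sales and all numbers below are assumptions made purely for illustration), the lag-k correlation between Xt and Xt−k can be computed directly with pandas:

```python
import numpy as np
import pandas as pd

# Hypothetical series (assumed for illustration): 120 observations of a slowly drifting variable
rng = np.random.default_rng(0)
sales = pd.Series(50 + np.cumsum(rng.normal(size=120)))

# Correlation between the series and its own lag-k version, X_t vs X_{t-k} (here k = 1)
k = 1
lag_k_corr = sales.autocorr(lag=k)                      # pandas built-in lag-k autocorrelation

# The same quantity "by hand": correlate X_t with X_{t-k}
manual = np.corrcoef(sales[k:], sales.shift(k)[k:])[0, 1]
print(f"lag-{k} autocorrelation: {lag_k_corr:.3f} (by hand: {manual:.3f})")
```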
Types of Autocorrelation
The most common form of autocorrelation is first-order serial correlation, which can be either positive or negative.
Positive serial correlation is where a positive error in one period carries over into a positive error in the following period, so errors of the same sign tend to follow one another.
Negative serial correlation is where a positive error in one period tends to be followed by a negative error in the next period, so the errors tend to alternate in sign.
Use of Autocorrelation
- Detects repeating patterns and trends in time series data; positive autocorrelation at specific lags may indicate the presence of seasonality (a sketch of this check is given after this list).
- Guides the determination of the order of the AR and MA terms in ARIMA models by providing insight into the number of lag terms to include.
- Helps to check whether a time series is stationary or exhibits trends and non-stationary behaviour.
- Flags possible anomalies and outliers: sudden spikes or drops in autocorrelation at certain lags may indicate their presence.
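For example, the sample ACF can be inspected at a few candidate lags. The sketch below assumes a monthly series with a 12-period seasonal component; the series, its length, and the chosen lags are all illustrative assumptions:

```python
import numpy as np
from statsmodels.tsa.stattools import acf

# Hypothetical monthly series with a 12-period seasonal component (assumed for illustration)
rng = np.random.default_rng(1)
t = np.arange(240)
y = 10 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=2, size=t.size)

# Sample autocorrelations up to lag 24
acf_vals = acf(y, nlags=24)

# Large positive spikes at lags 12 and 24 point to a seasonal (12-period) pattern
for lag in (1, 6, 12, 24):
    print(f"lag {lag:2d}: ACF = {acf_vals[lag]:+.2f}")
```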
Autocorrelation in Time Series
Autocorrelation is important in time series analysis for the following reasons:
Autocorrelation helps reveal repeating patterns or trends within a time series. By analyzing
how a variable correlates with its past values at different lags, analysts can identify the presence
of cyclic or seasonal patterns in the data. For example, in economic data, autocorrelation may
reveal whether certain economic indicators exhibit regular patterns over specific time intervals,
such as monthly or quarterly cycles.
Financial analysts and traders often use autocorrelation to analyse historical price movements
in financial markets. By identifying autocorrelation patterns in past price changes, they may
attempt to predict future price movements. For instance, if there is a positive autocorrelation at
a specific lag, indicating a trend in price movements, traders might use this information to
inform their predictions and trading strategies.
The Autocorrelation Function (ACF) is a crucial tool for modelling time series data. ACF helps
identify which lags have significant correlations with the current observation. In time series
modelling, understanding the autocorrelation structure is essential for selecting appropriate
models. For instance, if there is a significant autocorrelation at a particular lag, it may suggest
the presence of an autoregressive (AR) component in the model, influencing the current value
based on past values. The ACF plot allows analysts to observe the decay of autocorrelation
over lags, guiding the choice of lag values to include in autoregressive models.
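A brief sketch of this workflow with statsmodels follows; the simulated AR(1) series, its coefficient (0.7), and the number of lags shown are assumptions made for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Hypothetical AR(1) series (assumed for illustration): y_t = 0.7 * y_{t-1} + e_t
rng = np.random.default_rng(2)
e = rng.normal(size=500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.7 * y[t - 1] + e[t]

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(y, lags=20, ax=axes[0])    # geometric decay of the ACF suggests an AR component
plot_pacf(y, lags=20, ax=axes[1])   # sharp cut-off of the PACF after lag 1 suggests AR(1)
plt.tight_layout()
plt.show()
```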
REASONS FOR AUTOCORRELATION
Inertia or Sluggishness
Most economic time series display inertia or sluggishness. For instance, gross domestic product (GDP), production, employment, money supply, etc. reflect recurring and self-sustaining fluctuations in economic activity. When an economy is recovering from a recession, most of these series will be moving upwards, so the value of a series at one point in time tends to be greater than its previous value, and successive observations are likely to be interdependent.
Specification Error in the Model: Excluded Variable Case
Through incorrect specification of the model, certain important variables that should be included may be left out (a case of under-specification). If such model misspecification occurs, the residuals from the incorrect model will exhibit a systematic pattern, and a distinct pattern in the residuals gives rise to serial correlation.
Specification Bias – Excluded Variable: example of the quantity of beef demanded and the price of pork
Appropriate equation:
Yt = β1 + β2 X2t + β3 X3t + β4 X4t + ut        (1)
Estimated equation:
Yt = β1 + β2 X2t + β3 X3t + vt        (2)
Estimating the second equation implies
vt = β4 X4t + ut
so the error term of the estimated equation carries the systematic influence of the excluded variable X4t (the price of pork), and the residuals will therefore display a pattern that looks like autocorrelation.
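To make this concrete, here is a small simulation sketch; all coefficients and variable names are assumptions, not taken from the text. Omitting a slowly evolving regressor leaves its systematic influence in the residuals, which the Durbin-Watson statistic then flags as autocorrelation:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
n = 200
x2 = rng.normal(size=n)                       # included regressor
x4 = np.cumsum(rng.normal(size=n))            # omitted regressor, slowly evolving over time
y = 1.0 + 0.5 * x2 + 0.8 * x4 + rng.normal(scale=0.5, size=n)

# Mis-specified model: x4 is left out, so v_t = 0.8 * x4_t + u_t ends up in the residuals
res_bad = sm.OLS(y, sm.add_constant(x2)).fit()
# Correctly specified model
res_ok = sm.OLS(y, sm.add_constant(np.column_stack([x2, x4]))).fit()

# Durbin-Watson near 2 means no first-order autocorrelation; near 0 means strong positive autocorrelation
print("DW, x4 omitted :", round(durbin_watson(res_bad.resid), 2))
print("DW, x4 included:", round(durbin_watson(res_ok.resid), 2))
```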
Specification Bias: Incorrect Functional Form
Suppose the true marginal cost function is quadratic in output, but a linear cost function is fitted instead. In the corresponding figure, between points A and B the linear marginal cost curve will consistently overestimate the true marginal cost, whereas beyond these points it will consistently underestimate it. This result is to be expected, because the disturbance term vi of the linear model is, in fact, equal to output² + ui and hence will catch the systematic effect of the squared-output term on marginal cost. In this case, vi will reflect autocorrelation because of the use of an incorrect functional form.
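A comparable sketch for the functional-form case (all numbers are assumed, purely for illustration): fitting a straight line when the true marginal cost curve is quadratic in output produces residuals whose sign changes systematically with output.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
output = np.linspace(1, 10, 100)
# True relationship is quadratic in output
mc = 5 + 2 * output + 0.8 * output**2 + rng.normal(scale=2, size=output.size)

# Incorrect functional form: linear in output, so the omitted quadratic term ends up in the residuals
res_linear = sm.OLS(mc, sm.add_constant(output)).fit()

# Average residuals are positive at low and high output and negative in between
# (the region between points A and B): a systematic pattern caused purely by the wrong functional form
thirds = np.array_split(res_linear.resid, 3)
print([round(seg.mean(), 2) for seg in thirds])
```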
The Cobweb Phenomenon
Many agricultural commodities reflect what is called the 'cobweb phenomenon', in which supply reacts to price with a lag of one time period. This is mainly because supply decisions take time to implement; in other words, there is a gestation period involved. For instance, farmers' decisions about how much crop to plant may depend on the price prevailing in the previous year.
At the beginning of this year's planting, farmers are influenced by the price prevailing last year, so their supply function is
Supplyt = β1 + β2 Pt−1 + ut
Suppose at the end of period t, price Pt turns out to be lower than Pt−1. Therefore, in period t + 1, farmers may very well decide to produce less than they did in period t. Obviously, in this situation the disturbances ut are not expected to be random, because if the farmers overproduce in year t, they are likely to reduce their production in t + 1, and so on, leading to a cobweb pattern.
Lags
In a time series regression of consumption expenditure on income, it is not uncommon to find that consumption expenditure in the current period depends, among other things, on the consumption expenditure of the previous period. Consumers do not change their consumption habits readily for psychological, technological, or institutional reasons. If we neglect the lagged term in the equation below, the resulting error term will reflect a systematic pattern due to the influence of lagged consumption on current consumption.
Consumptiont = β1 + β2 Incomet + β3 Consumptiont−1 + ut
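A sketch of the corresponding dynamic regression, in which the lagged consumption term is included explicitly rather than being pushed into the error term; the simulated data, column names and coefficients are all assumptions for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical quarterly data (assumed for illustration)
rng = np.random.default_rng(5)
n = 120
income = 100 + np.cumsum(rng.normal(size=n))
cons = np.zeros(n)
for t in range(1, n):
    cons[t] = 5 + 0.4 * income[t] + 0.5 * cons[t - 1] + rng.normal()

df = pd.DataFrame({"cons": cons, "income": income})
df["cons_lag1"] = df["cons"].shift(1)          # Consumption_{t-1}

# Dynamic regression: current consumption on income and lagged consumption
model = smf.ols("cons ~ income + cons_lag1", data=df.dropna()).fit()
print(model.params)
```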
Data Smoothing & Data Transformation
Autocorrelation may be positive or negative depending on the data. Economic data generally exhibit positive autocorrelation, because most series move upwards or downwards over time and such a trend persists for at least some months or quarters. This means they are not generally expected to show a sudden upward or downward movement unless there is a reason or a shock. Consider, for example, the regression
Yt = β1 + β2 Xt + ut        (1)
where Y = consumption expenditure and X = income. Since Eq. (1) holds true at every time period, it also holds true for the previous time period, (t − 1), so we can write it as
Yt−1 = β1 + β2 Xt−1 + ut−1
where Yt−1, Xt−1 and ut−1 are the values of Y, X and u lagged by one period. Subtracting the lagged equation from Eq. (1) gives
∆Yt = β2 ∆Xt + vt
where vt = ∆ut = (ut − ut−1) and ∆ is known as the first difference operator.
This equation is known as the first difference form (a dynamic regression model), while Eq. (1) is known as the level form. Note that even if the original disturbances ut are serially uncorrelated, the transformed disturbances vt = ut − ut−1 are serially correlated, because consecutive values of vt share a common u term; likewise, smoothing procedures such as moving averages average out short-run fluctuations and impose a systematic pattern on the data. Transformation and smoothing of the data can therefore themselves be a source of autocorrelation.
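In practice the first-difference form can be computed with a single pandas .diff() call. The sketch below uses simulated level-form data; the DataFrame, its columns and all numbers are assumptions for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical level-form data (assumed for illustration)
rng = np.random.default_rng(6)
df = pd.DataFrame({"X": 100 + np.cumsum(rng.normal(size=80))})
df["Y"] = 20 + 0.6 * df["X"] + rng.normal(scale=2, size=80)

# First differences: dY_t = Y_t - Y_{t-1}, dX_t = X_t - X_{t-1}
d = df.diff().dropna()

# First-difference form has no intercept: dY_t = b2 * dX_t + v_t
fd_model = sm.OLS(d["Y"], d["X"]).fit()     # note: no constant added
print("estimate of b2 from the first-difference form:", round(fd_model.params["X"], 3))
```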
Nonstationarity
When dealing with time series data, we should check whether the given time series is stationary. A time series is stationary if its characteristics (e.g. mean, variance, and covariance) are time-invariant, i.e. they do not change over time.
If that is not the case, we have a non-stationary time series. When Y and X in a regression are non-stationary, the error term may also be non-stationary, in which case it will exhibit autocorrelation.
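One common way to carry out this check is the augmented Dickey-Fuller test in statsmodels; the two simulated series below (one stationary, one a random walk) are assumptions for illustration:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(7)
stationary = rng.normal(size=300)                 # white noise: mean and variance constant over time
random_walk = np.cumsum(rng.normal(size=300))     # non-stationary: variance grows with time

for name, series in [("stationary", stationary), ("random walk", random_walk)]:
    stat, pvalue, *_ = adfuller(series)
    # Small p-value: reject the unit-root null, i.e. the series looks stationary
    print(f"{name:12s}  ADF statistic = {stat:6.2f},  p-value = {pvalue:.3f}")
```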
White Noise Error Term
The scheme
ut = ρ ut−1 + εt,   −1 < ρ < 1
is known as a Markov first-order autoregressive scheme, or simply a first-order autoregressive scheme, usually denoted AR(1). The name autoregressive is appropriate because this equation can be interpreted as the regression of ut on itself lagged one period. It is first order because ut and its immediate past value are involved; that is, the maximum lag is 1. If the model were ut = ρ1 ut−1 + ρ2 ut−2 + εt, it would be an AR(2), or second-order, autoregressive scheme, and so on.
Here ρ (rho) is known as the coefficient of autocovariance and εt is a stochastic disturbance term that satisfies the standard OLS assumptions, namely
E(εt) = 0,  var(εt) = σ²,  cov(εt, εt+s) = 0 for s ≠ 0.
An error term with the preceding properties is often called a white noise error term.
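A short simulation sketch of the AR(1) scheme ut = ρ ut−1 + εt with a white noise εt; the value ρ = 0.8 and the series length are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(8)
n, rho = 500, 0.8                      # assumed autocorrelation coefficient, |rho| < 1

eps = rng.normal(size=n)               # white noise: zero mean, constant variance, no serial correlation
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + eps[t]     # Markov first-order autoregressive scheme, AR(1)

# Sample lag-1 autocorrelations: u is strongly correlated with its own past, eps is not
r_u = np.corrcoef(u[1:], u[:-1])[0, 1]
r_eps = np.corrcoef(eps[1:], eps[:-1])[0, 1]
print(f"lag-1 autocorrelation:  u = {r_u:.2f},  eps = {r_eps:.2f}")
```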
Difference Between Autocorrelation and Multicollinearity
- Definition: Autocorrelation is correlation between a variable and its own lagged values; multicollinearity is correlation between the independent variables in a model.
- Scope: Autocorrelation is a relationship within a single variable over time; multicollinearity is a relationship among multiple independent variables.
- Purpose of analysis: Autocorrelation analysis identifies temporal patterns in time series data; multicollinearity analysis detects interdependence among predictor variables.
- What is examined: Autocorrelation examines the correlation between a variable and its past values; multicollinearity concerns the correlation between independent variables.
- Consequences: Autocorrelation leads to inefficient estimates and unreliable standard errors in time series models (and biased estimates when lagged dependent variables are used as regressors); multicollinearity leads to inflated standard errors and difficulty in isolating the effects of individual variables.
- Detection: for autocorrelation, the Ljung-Box test and the Durbin-Watson statistic; for multicollinearity, the Variance Inflation Factor (VIF), the correlation matrix, and condition indices (see the sketch below).
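As a closing sketch, the diagnostics named in the last row can be computed with statsmodels on a deliberately constructed example; all data, names and numbers below are assumptions for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(9)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)                      # deliberately collinear with x1
y = 1 + x1 + x2 + np.cumsum(rng.normal(scale=0.3, size=n))    # slowly drifting error -> autocorrelation

X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2}))
res = sm.OLS(y, X).fit()

# Autocorrelation diagnostics on the residuals
print("Durbin-Watson:", round(durbin_watson(res.resid), 2))
print(acorr_ljungbox(res.resid, lags=[5]))                    # Ljung-Box statistic and p-value

# Multicollinearity diagnostics on the regressors
vif = {col: variance_inflation_factor(X.values, i)
       for i, col in enumerate(X.columns) if col != "const"}
print("VIF:", {k: round(v, 1) for k, v in vif.items()})
```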