Time Series Analysis
Time Series
A time series is a set of observations, each one being recorded at a specified time t. In other words, an arrangement of statistical data in accordance with its time of occurrence is known as a time series.
According to Ya-lun Chou, “A Time Series may be defined as a collection of readings belonging
to different time periods, of some economic variable or composite of variables”.
Symbolically, a time series is given by

Y_t = f(t),

where Y_t is the value of the phenomenon (variable) under consideration at time t. Thus a time series invariably gives a bivariate distribution, one of the two variables being time t and the other being the value of the phenomenon at different points of time.
Example:
1. Many time series arise in economics, such as share prices on successive days, average
incomes in successive months, company profits in successive years etc.
2. Many types of time series occur in the physical sciences, particularly in meteorology, marine science and geophysics. Examples are rainfall on successive days and air temperature measured in successive hours or months.
3. The analysis of sales figures in successive weeks or months is an important problem in commerce. Marketing data have many similarities to economic data. It is important to forecast future sales so as to plan production. It is also of interest to examine the relationship between sales and other time series, such as advertising expenditure, in successive time periods.
4. The population of Bangladesh measured every ten years is an example of a demographic time series.
5. A special type of time series arises when observations can take one of only two values, usually denoted by 0 and 1. Time series of this type, called binary processes, occur particularly in communication theory. For example, the position of a switch, either 'on' or 'off', could be recorded as 1 or 0 respectively.
6. A different type of time series occurs when we consider a series of events occurring 'randomly' in time, for example the record of the dates of major railway disasters. A series of events of this type is known as a 'point process'.
Components of a Time Series:
i) Trend:
The general tendency of time series data to increase or decrease during a long period of time is called the secular trend, long-term trend or simply trend. The concept of trend does not include short-range oscillations. This is true of most business and economic time series. Trend may have either an upward or a downward movement: series such as production, prices and income show an upward trend, while a downward trend is noticed in time series relating to deaths, epidemics, etc. In short, trend is the general, smooth, long-term, average tendency of the data.
a) Linear Trend:
If the plotted time series values cluster around a straight line, the trend is called a linear trend.
b) Non-Linear Trend:
If we get a curve after plotting the time series values, the trend is called non-linear or curvilinear.
Cyclic Variation:
The oscillatory movements in a time series with a period of oscillation of more than one year are termed cyclic variations. One complete period is called a cycle. The cyclic movements in a time series are generally attributed to the so-called "Business Cycle", which may also be referred
to as the four-phase cycle of
1) Prosperity
2) Recession
3) Depression
4) Recovery
What are the Different Types of Models Used in Time Series Analysis?
Answer:
The following are the models commonly used for the decomposition of a time series into its
components.
i) Decomposition by additive hypothesis
ii) Decomposition by multiplicative hypothesis
iii) Mixed models.
i) Additive Model:
The additive model assumes that the time series value is the sum of its four components:

Y_t = T_t + S_t + C_t + R_t,

where Y_t is the time series value at time t, T_t is the trend value at time t, S_t is the seasonal variation at time t, C_t is the cyclic variation at time t and R_t is the random variation at time t.
Assumptions:
i) S_t, C_t and R_t operate with equal absolute effect irrespective of the trend.
ii) S_t and C_t will have positive or negative values, and the total of the positive and negative values for any cycle (and any year) will be zero.
iii) R_t will also have positive or negative values, and in the long run its total will be zero.
ii) Multiplicative Model:
The multiplicative model assumes that the time series value is the product of its four components:

Y_t = T_t × S_t × C_t × R_t.

Assumptions:
i) All the values of S_t, C_t and R_t are positive.
ii) The geometric means of S_t in a year, of C_t in a cycle and of R_t in a long-term period are unity.
iii) Mixed Model:
The different combinations of the additive and multiplicative models are named mixed models, defined, for example, as

Y_t = T_t × S_t × C_t + R_t.
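The three decomposition hypotheses can be illustrated numerically. The sketch below uses made-up component values at a single time point (all numbers are purely illustrative, not from the text):

```python
# Illustrative (made-up) component values at a single time point t.
T_t = 100.0   # trend value
S_t = -5.0    # seasonal variation (additive scale)
C_t = 2.0     # cyclic variation (additive scale)
R_t = 0.5     # random variation (additive scale)

# i) Additive model: Y_t = T_t + S_t + C_t + R_t
y_additive = T_t + S_t + C_t + R_t

# ii) Multiplicative model: Y_t = T_t * S_t * C_t * R_t, where the
#     seasonal, cyclic and random components are indices around 1.
s_i, c_i, r_i = 0.95, 1.02, 1.005
y_multiplicative = T_t * s_i * c_i * r_i

# iii) One possible mixed model: Y_t = T_t * S_t * C_t + R_t
y_mixed = T_t * s_i * c_i + R_t

print(y_additive, y_multiplicative, y_mixed)
```

In the multiplicative and mixed forms the seasonal, cyclic and random components act as percentage adjustments of the trend, which is why they are written as indices near 1.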
Time Plot:
The first step in analyzing a time series is to plot the observations against time. This plot is known as a time plot. It is similar to a scatter diagram. The plot will show up important features such as trend, seasonality, discontinuities and outliers.
Time Series Model:
A time series model for the observed data {x_t} is a specification of the joint distributions (or possibly only the means and co-variances) of a sequence of random variables {X_t} of which {x_t} is postulated to be a realization.
IID Noise:
The simplest model for a time series is one in which the observations are independent and identically distributed (iid) random variables. By definition we can write, for any positive integer n and real numbers x_1, ..., x_n,

P[X_1 ≤ x_1, ..., X_n ≤ x_n] = F(x_1) ··· F(x_n),

where F(·) is the cumulative distribution function of each of the identically distributed random variables X_1, X_2, .... In this model there is no dependence between observations. In particular, for all h ≥ 1 and all x, x_1, ..., x_n,

P[X_{n+h} ≤ x | X_1 = x_1, ..., X_n = x_n] = P[X_{n+h} ≤ x].

Although this means that IID noise is a rather uninteresting process for forecasters, it plays an important role as a building block for more complicated time series models.
Binary Process:
As an example of IID noise, consider the sequence of iid random variables {X_t} with

P[X_t = 1] = p,  P[X_t = −1] = 1 − p,

where p = 1/2. The time series obtained by tossing a penny repeatedly and scoring +1 for each head and −1 for each tail is usually modeled as a realization of this process.
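The coin-tossing model can be simulated directly; a minimal sketch (the sample size is an arbitrary choice):

```python
import random

random.seed(0)

# Binary process: X_t = +1 with probability p, -1 with probability 1 - p.
# Tossing a fair coin corresponds to p = 1/2.
p = 0.5
n = 10_000
x = [1 if random.random() < p else -1 for _ in range(n)]

# For this process the mean is 0 and the variance is 1; the sample
# moments should be close to these theoretical values.
mean = sum(x) / n
var = sum(v * v for v in x) / n - mean ** 2
print(round(mean, 3), round(var, 3))
```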
Random Walk:
The random walk {S_t, t = 0, 1, 2, ...} is obtained by cumulatively summing (or 'integrating') iid random variables. Thus a random walk with zero mean is obtained by defining S_0 = 0 and

S_t = X_1 + X_2 + ··· + X_t,  for t = 1, 2, ...,

where {X_t} is IID noise.
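A random walk is easily generated from this definition; a minimal sketch with Gaussian iid steps (the step distribution is an arbitrary choice):

```python
import random

random.seed(1)

# Random walk S_t = X_1 + ... + X_t with S_0 = 0, built from iid steps.
n = 5
steps = [random.gauss(0, 1) for _ in range(n)]

s = [0.0]                  # S_0 = 0
for x in steps:
    s.append(s[-1] + x)    # S_t = S_{t-1} + X_t

# S_t equals the cumulative sum of the first t steps.
assert abs(s[3] - sum(steps[:3])) < 1e-12
print(s)
```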
Least Squares Estimation of Trend:
In the least squares procedure we attempt to fit a parametric family of functions to the data, e.g.,

m_t = a_0 + a_1 t + a_2 t²,

by minimizing Σ_t (x_t − m_t)² with respect to the parameters a_0, a_1 and a_2.
For example, if the magnitude of the fluctuations appears to grow roughly linearly with the level of the series, then the transformed series {ln X_t} will have fluctuations of more constant magnitude. There are several ways in which trend and seasonality can be removed, some involving estimating the components and subtracting them from the data, and others depending on differencing the data, i.e., replacing the original series {X_t} by {Y_t}, where Y_t = X_t − X_{t−d}, for some positive integer d. Whichever method is used, the aim is to produce a stationary series.
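Both devices can be sketched in a few lines. The synthetic series below is made up so that its fluctuations grow with its level and the log-differenced series comes out exactly constant:

```python
import math

# Toy series whose values (roughly) double every 3 steps, so the
# fluctuations grow with the level (hypothetical data).
x = [10, 12, 11, 20, 24, 22, 40, 48, 44]

# Variance-stabilizing transform: work with V_t = ln X_t.
v = [math.log(val) for val in x]

# Differencing: Y_t = V_t - V_{t-d}; here d = 3 matches the doubling period.
d = 3
y = [v[t] - v[t - d] for t in range(d, len(v))]
print([round(val, 3) for val in y])   # every difference equals ln 2
```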
3) Choose a model to fit the residuals, making use of various sample statistics including the sample auto-correlation function.
4) Forecasting will be achieved by forecasting the residuals and then inverting the transformations described above to arrive at forecasts of the original series {X_t}.
Stochastic Process:
A stochastic process may be defined as a collection of random variables {X_t, t ∈ T}, where T denotes the set of time points. We denote the random variable at time t by X_t if T is discrete and by X(t) if T is continuous. Thus a stochastic process is a collection of random variables which are ordered in time.
Stationary Process:
A time series {X_t} is said to be stationary if it has statistical properties similar to those of the time-shifted series {X_{t+h}}, for each integer h.
Non-Stationary:
A time series {X_t} is said to be non-stationary if its statistical properties are dependent on time t.
Weakly Stationary:
A time series {X_t} is said to be weakly stationary if its mean function and covariance function are independent of time t, i.e.,
i) μ_X(t) = E(X_t) is independent of t, and
ii) γ_X(t + h, t) = Cov(X_{t+h}, X_t) is independent of t for each h.
Usually, weak stationarity is termed simply stationarity.
Strictly Stationary:
A time series {X_t} is said to be strictly stationary if the joint distribution of (X_1, ..., X_n) is the same as the joint distribution of (X_{1+h}, ..., X_{n+h}) for all integers h and n ≥ 1, i.e.,

(X_1, ..., X_n) and (X_{1+h}, ..., X_{n+h}) have identical joint distributions.
Covariance Function:
The covariance function of {X_t} is defined as

γ_X(r, s) = Cov(X_r, X_s) = E[(X_r − μ_X(r))(X_s − μ_X(s))],

for all integers r and s.
Properties of a Strictly Stationary Series:
If {X_t} is strictly stationary, then its properties are:
i) the random variables X_t are identically distributed, and
ii) (X_t, X_{t+h}) has the same joint distribution as (X_1, X_{1+h}) for all integers t and h.
Consequently, if E(X_t²) < ∞, a strictly stationary series is also weakly stationary.
Auto-Covariance Function:
Let {X_t} be any stationary time series. Then the auto-covariance function of {X_t} is defined as

γ_X(h) = Cov(X_{t+h}, X_t),  for all integers h.

Example (IID Noise):
Let {X_t} ~ IID(0, σ²); this notation indicates that the random variables X_t are independent and identically distributed with mean 0 and variance σ². Then

γ_X(t + h, t) = σ² if h = 0, and γ_X(t + h, t) = 0 if h ≠ 0,

which does not depend on t. Hence IID noise with finite second moment is stationary.
White Noise:
If {X_t} is a sequence of uncorrelated random variables, each with zero mean and variance σ², then by definition we have E(X_t) = 0 and

γ_X(t + h, t) = σ² if h = 0, and γ_X(t + h, t) = 0 if h ≠ 0,

so {X_t} is clearly stationary with the same covariance function as IID noise, and the covariance does not depend on t. Such a sequence is referred to as white noise. This is indicated by the notation

{X_t} ~ WN(0, σ²).
Example (Random Walk):
Let {S_t} be the random walk S_t = X_1 + ··· + X_t, where {X_t} is iid noise with E(X_t) = 0 and Var(X_t) = σ² for all t. Obviously E(S_t) = 0 and E(S_t²) = tσ² < ∞ for all t.
We have, for h ≥ 0,

γ_S(t + h, t) = Cov(S_{t+h}, S_t) = Cov(S_t + X_{t+1} + ··· + X_{t+h}, S_t) = Cov(S_t, S_t) = tσ².

Finally, since γ_S(t + h, t) depends on t, the random walk {S_t} is not stationary.
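The property Var(S_t) = tσ² can be checked by a quick Monte Carlo experiment (sample sizes are arbitrary):

```python
import random

# Monte Carlo check (illustrative) that Var(S_t) is close to t * sigma^2
# for the random walk S_t = X_1 + ... + X_t with iid steps.
random.seed(4)
sigma2 = 1.0
t, reps = 20, 10_000

samples = []
for _ in range(reps):
    s = sum(random.gauss(0, sigma2 ** 0.5) for _ in range(t))
    samples.append(s)

mean = sum(samples) / reps
var = sum((s - mean) ** 2 for s in samples) / reps
print(round(var, 2))   # should be close to t * sigma2 = 20
```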
Example (First-Order Moving Average, MA(1)):
Consider the series defined by the equation

X_t = Z_t + θZ_{t−1},  t = 0, ±1, ...,

where {Z_t} ~ WN(0, σ²) and θ is a real-valued constant.
Solution:
We have E(X_t) = E(Z_t) + θE(Z_{t−1}) = 0. Similarly we have E(X_t²) = σ²(1 + θ²) < ∞. Now,

γ_X(t + h, t) = σ²(1 + θ²) if h = 0,  σ²θ if h = ±1,  0 if |h| > 1.

Since γ_X(t + h, t) does not depend on t, {X_t} is stationary.
Sample Auto-Correlation Function:
If we believe the observed data x_1, ..., x_n to be a realization of a stationary time series {X_t}, then the sample ACF will provide an estimate of the ACF of {X_t}. The sample auto-correlation function is given by

ρ̂(h) = γ̂(h)/γ̂(0),  where  γ̂(h) = n⁻¹ Σ_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄),  −n < h < n.
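The sample ACF formula translates directly into code; a sketch applied to simulated Gaussian noise:

```python
import random

def sample_acf(x, max_lag):
    """Sample ACF rho_hat(h) = gamma_hat(h)/gamma_hat(0), with
    gamma_hat(h) = (1/n) * sum_{t} (x_{t+h} - xbar)(x_t - xbar)."""
    n = len(x)
    xbar = sum(x) / n

    def gamma(h):
        return sum((x[t + h] - xbar) * (x[t] - xbar) for t in range(n - h)) / n

    g0 = gamma(0)
    return [gamma(h) / g0 for h in range(max_lag + 1)]

random.seed(2)
noise = [random.gauss(0, 1) for _ in range(500)]
acf = sample_acf(noise, 5)
print([round(r, 3) for r in acf])   # lag 0 is exactly 1; other lags near 0
```

For IID noise the lag-h values should mostly lie inside ±1.96/√n ≈ ±0.088 for n = 500, as discussed later in the notes.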
Non-Negative Definite:
A real-valued function κ defined on the integers is non-negative definite if

Σ_{i=1}^{n} Σ_{j=1}^{n} a_i κ(i − j) a_j ≥ 0

for all positive integers n and vectors a = (a_1, ..., a_n)′ with real-valued components a_i.
What are the Basic Properties of the Auto-Covariance and Auto-Correlation Functions?
Answer:
The basic properties of the auto-covariance function γ(·) of a stationary time series are:
1. γ(0) ≥ 0,
2. |γ(h)| ≤ γ(0) for all h,
3. γ(·) is an even function, i.e., γ(h) = γ(−h) for all h.
The corresponding properties of the auto-correlation function ρ(h) = γ(h)/γ(0) are:
1. ρ(0) = 1,
2. |ρ(h)| ≤ 1 for all h,
3. ρ(·) is an even function, i.e., ρ(h) = ρ(−h) for all h.
Example:
Suppose we have n simulated values of normally distributed IID noise. Since the auto-correlation function ρ(h) is zero for h > 0, we expect the corresponding sample auto-correlations to be near 0. It can be shown, in fact, that for IID noise with finite variance the sample auto-correlations ρ̂(h), h > 0, are approximately IID N(0, 1/n) for n large. Hence approximately 95% of the sample auto-correlations should fall between the bounds ±1.96/√n (since 1.96 is the 0.975 quantile of the standard normal distribution).
If the seasonal and noise fluctuations appear to increase with the level of the process, then a
preliminary transformation of the data is often used to make the transformed data more
compatible with the model.
Our aim is to estimate and extract the deterministic components m̂_t and ŝ_t in the hope that the residual or noise component Y_t will turn out to be a stationary time series. We can then use the theory of such processes to find a satisfactory probabilistic model for the process {Y_t}, to analyze its properties, and to use it in conjunction with m̂_t and ŝ_t for purposes of prediction and simulation of {X_t}.
Another approach, developed by Box and Jenkins, is to apply differencing operators repeatedly to the series {X_t} until the differenced observations resemble a realization of some stationary time series {W_t}. We can then use the theory of stationary processes for the modeling, analysis and prediction of {W_t}, and hence of the original process.
Smoothing with a Finite Moving Average Filter:
Let q be a nonnegative integer and consider the two-sided moving average

m̂_t = (2q + 1)⁻¹ Σ_{j=−q}^{q} X_{t−j}

of the process X_t = m_t + Y_t. Assume that m_t is approximately linear over the interval [t − q, t + q] and that the average of the error terms over this interval is close to zero. Then the moving average provides us with the estimates

m̂_t ≈ m_t,  q + 1 ≤ t ≤ n − q.

This particular filter is a low-pass filter in the sense that it takes the data {x_t} and removes the rapidly fluctuating (high frequency) component to leave the slowly varying estimated trend term {m̂_t}. This particular filter is only one of many that could be used for smoothing. For large q, provided (2q + 1)⁻¹ Σ_{j=−q}^{q} Y_{t−j} ≈ 0, it will not only attenuate noise but also allow linear trend functions to pass without distortion.
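A sketch of the filter; the example confirms that a linear trend passes through undistorted:

```python
def moving_average(x, q):
    """Two-sided moving average m_hat_t = (2q+1)^{-1} * sum_{j=-q}^{q} x_{t-j},
    defined for the interior points t = q, ..., n-1-q (0-based indexing)."""
    w = 2 * q + 1
    return [sum(x[t - q:t + q + 1]) / w for t in range(q, len(x) - q)]

# For a noise-free linear trend the smoothed values reproduce the trend
# exactly at the interior points.
trend = [2.0 * t + 1.0 for t in range(11)]
print(moving_average(trend, 2))
```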
Exponential Smoothing:
For any fixed α ∈ [0, 1], the one-sided moving averages m̂_t, t = 1, ..., n, are defined by the recursions

m̂_t = αX_t + (1 − α)m̂_{t−1},  t = 2, ..., n,

and we let m̂_1 equal the actual value of the time series in period 1, i.e.,

m̂_1 = X_1.

Application of these recursions is often referred to as exponential smoothing, since they imply that, for t ≥ 2,

m̂_t = Σ_{j=0}^{t−2} α(1 − α)^j X_{t−j} + (1 − α)^{t−1} X_1,

a weighted moving average of X_t, X_{t−1}, ..., with weights decreasing exponentially.
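The recursions can be coded directly (the data values below are made up):

```python
def exponential_smoothing(x, alpha):
    """One-sided smoother: m_1 = X_1, m_t = alpha*X_t + (1-alpha)*m_{t-1}."""
    m = [x[0]]                              # m_1 = X_1
    for val in x[1:]:
        m.append(alpha * val + (1 - alpha) * m[-1])
    return m

data = [3.0, 5.0, 4.0, 6.0, 7.0]
print(exponential_smoothing(data, 0.5))    # -> [3.0, 4.0, 4.0, 5.0, 6.0]
```

A larger α makes the smoother track the most recent observations more closely; a smaller α averages further into the past.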
The method of least squares estimation can also be used to estimate higher order polynomial
trends in the same way.
Differencing to Remove Trend:
Define the backward shift operator B by BX_t = X_{t−1} and the difference operator ∇ by ∇X_t = X_t − X_{t−1} = (1 − B)X_t. Polynomials in B and ∇ are manipulated in precisely the same way as polynomial functions of real variables; for example, ∇²X_t = ∇(∇X_t) = X_t − 2X_{t−1} + X_{t−2}.
If the operator ∇ is applied to a linear trend function m_t = c_0 + c_1 t, then we obtain the constant function

∇m_t = m_t − m_{t−1} = c_1.

In the same way, any polynomial trend of degree k can be reduced to a constant by application of the operator ∇^k.
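A short sketch confirming that ∇ reduces a linear trend to the constant c_1 and ∇² reduces a quadratic trend to a constant:

```python
def diff(x):
    """Apply the difference operator once: (del x)_t = x_t - x_{t-1}."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

# Linear trend m_t = 4 + 3t: one application of del gives the constant 3.
linear = [4.0 + 3.0 * t for t in range(8)]
print(diff(linear))           # -> [3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]

# Quadratic trend m_t = t^2 (degree 2): del^2 reduces it to a constant.
quadratic = [float(t * t) for t in range(8)]
print(diff(diff(quadratic)))  # -> [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
```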
These considerations suggest the possibility, given any sequence {x_t} of data, of applying the operator ∇ repeatedly until we find a sequence {∇^k x_t} that has a trend-free appearance.
Advantage of Differencing:
This method has the advantage that it usually requires the estimation of fewer parameters and
does not rest on the assumption of a trend that remains fixed throughout the observation period.
The second step is to estimate the seasonal component. For each k = 1, ..., d, we compute the average w_k of the deviations {(x_{k+jd} − m̂_{k+jd})} and then, so that the estimated seasonal components sum to zero, set

ŝ_k = w_k − d⁻¹ Σ_{i=1}^{d} w_i,  k = 1, ..., d,

with ŝ_k = ŝ_{k−d} for k > d. The deseasonalized (seasonally adjusted) data are then defined to be the original series with the estimated seasonal component removed:

d_t = x_t − ŝ_t,  t = 1, ..., n.
Finally, we re-estimate the trend from the deseasonalized data {d_t}. The estimated noise series is then given by

Ŷ_t = x_t − m̂_t − ŝ_t,  t = 1, ..., n,

obtained by subtracting the estimated seasonal and trend components from the original data.
Differencing at Lag d:
The first step is to eliminate the seasonal component of period d from {X_t} by introducing the lag-d difference operator ∇_d defined by

∇_d X_t = X_t − X_{t−d} = (1 − B^d)X_t.
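A sketch of lag-d differencing on a synthetic series with a period-4 seasonal pattern on top of a linear trend (the data are constructed so the result is exactly constant):

```python
def seasonal_diff(x, d):
    """Lag-d difference operator: (del_d x)_t = x_t - x_{t-d}."""
    return [x[t] - x[t - d] for t in range(d, len(x))]

# Synthetic series = linear trend 0.5*t + period-4 seasonal pattern.
season = [0.0, 2.0, -1.0, -1.0]
x = [0.5 * t + season[t % 4] for t in range(12)]

# del_4 removes the seasonal component, leaving the constant 0.5 * 4 = 2.
print(seasonal_diff(x, 4))   # -> [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
```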
Now we examine the following simple tests for checking the hypothesis that the residuals are independent and identically distributed random variables.
The Sample ACF:
Let y_1, ..., y_n be a realization of an iid sequence. For large n, about 95% of the sample auto-correlations ρ̂(h), h > 0, should fall between the bounds ±1.96/√n.
Decision Rule:
If more than two or three out of 40 values of ρ̂(h) fall outside of the bounds ±1.96/√n, or if one value falls far outside the bounds, we reject the null hypothesis that the sequence is iid; otherwise we do not reject it.
a) Portmanteau Test:
Instead of checking the sample auto-correlations at each lag separately, it is also possible to consider the single statistic

Q = n Σ_{j=1}^{h} ρ̂²(j).

We know that, if y_1, ..., y_n is a realization of an iid sequence with finite variance, Q is approximately distributed as chi-squared with h degrees of freedom. A large value of Q suggests that the sample auto-correlations of the data are too large for an iid sequence. We therefore reject the iid hypothesis at level α if Q > χ²_{1−α}(h), where χ²_{1−α}(h) is the 1 − α quantile of the chi-squared distribution with h degrees of freedom.
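A sketch of the test on simulated data (the chi-squared quantile is hard-coded to keep the example dependency-free):

```python
import random

def sample_acf(x, max_lag):
    """Sample auto-correlations rho_hat(0..max_lag)."""
    n, xbar = len(x), sum(x) / len(x)
    g = [sum((x[t + h] - xbar) * (x[t] - xbar) for t in range(n - h)) / n
         for h in range(max_lag + 1)]
    return [gh / g[0] for gh in g]

random.seed(3)
x = [random.gauss(0, 1) for _ in range(200)]   # simulated iid data
h = 10
rho = sample_acf(x, h)

# Portmanteau statistic Q = n * sum_{j=1}^{h} rho_hat(j)^2.
Q = len(x) * sum(r * r for r in rho[1:])

CHI2_95_DF10 = 18.307   # 0.95 quantile of chi-squared with 10 d.f.
print(round(Q, 2), "reject iid" if Q > CHI2_95_DF10 else "do not reject iid")
```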
b) Another portmanteau test, formulated by McLeod and Li (1983), can be used for testing whether the data are an iid sequence of normally distributed random variables. They defined the test statistic as

Q_{WW} = n(n + 2) Σ_{k=1}^{h} ρ̂²_{WW}(k)/(n − k),

where ρ̂_{WW}(k) is the sample auto-correlation at lag k of the squared data. The hypothesis of iid normal data is then rejected at level α if the observed value of Q_{WW} is larger than the 1 − α quantile of the χ²(h) distribution.
c) The Turning Point Test:
Let T denote the number of turning points of the sequence y_1, ..., y_n. For an iid sequence with n large, T is approximately N(μ_T, σ_T²), where μ_T = 2(n − 2)/3 and σ_T² = (16n − 29)/90. A large value of T − μ_T indicates that the series is fluctuating more rapidly than expected for an iid sequence. On the other hand, a value of T − μ_T much smaller than zero indicates a positive correlation between neighboring observations. The test statistic is

|T − μ_T|/σ_T.

This means we can carry out a test of the iid hypothesis, rejecting it at level α if |T − μ_T|/σ_T > Φ_{1−α/2}, where Φ_{1−α/2} is the 1 − α/2 quantile of the standard normal distribution.
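A sketch of the turning point test; applying it to a monotone series, which has no turning points at all, makes it reject decisively:

```python
import math

def turning_point_test(y):
    """Turning point test: under iid, T is approx N(mu_T, sigma_T^2)
    with mu_T = 2(n-2)/3 and sigma_T^2 = (16n - 29)/90."""
    n = len(y)
    T = sum(1 for i in range(1, n - 1)
            if (y[i] > y[i - 1] and y[i] > y[i + 1])
            or (y[i] < y[i - 1] and y[i] < y[i + 1]))
    mu = 2 * (n - 2) / 3
    sigma = math.sqrt((16 * n - 29) / 90)
    return T, abs(T - mu) / sigma   # reject iid at 5% level if > 1.96

# A strictly increasing series has T = 0 turning points.
T, z = turning_point_test(list(range(50)))
print(T, round(z, 2))
```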
d) The Difference-Sign Test:
For this test we let S be the total count of values of i such that y_i − y_{i−1} > 0, i = 2, ..., n. Assume that there is no zero difference in the sequence; if a zero difference occurs, it is dropped from the analysis and n is reduced accordingly. For an iid sequence we have

μ_S = (n − 1)/2  and  σ_S² = (n + 1)/12,

and for n large, S is approximately N(μ_S, σ_S²). A large positive (or negative) value of S − μ_S indicates the presence of an increasing (or decreasing) trend in the data. We therefore reject the assumption of no trend in the data if |S − μ_S|/σ_S > Φ_{1−α/2}, the 1 − α/2 quantile of the standard normal distribution.
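A sketch of the difference-sign test; on a strictly increasing series every difference is positive, so the test signals a trend:

```python
import math

def difference_sign_test(y):
    """Difference-sign test for trend: S = #{i : y_i > y_{i-1}}.
    Under iid, mu_S = (n - 1)/2 and sigma_S^2 = (n + 1)/12."""
    n = len(y)
    S = sum(1 for i in range(1, n) if y[i] > y[i - 1])
    mu = (n - 1) / 2
    sigma = math.sqrt((n + 1) / 12)
    return S, abs(S - mu) / sigma   # large statistic suggests a trend

# An increasing series of length 40: all n - 1 = 39 differences are positive.
S, z = difference_sign_test([0.1 * t for t in range(40)])
print(S, round(z, 2))
```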
Checking for Normality:
The squared correlation between the ordered sample values and the corresponding quantiles of the standard normal distribution provides a check of normality; percentage points for its distribution, based on the sample values, are given by Shapiro and Francia (1972).