Module Introduction To Time Series Analysis

Time series

Subject: Statistics

Paper: Stochastic Processes and Time Series Analysis

Module: Introduction to Time Series Analysis

Development Team

Principal investigator: Prof. Bhaswati Ganguli,
Department of Statistics, University of Calcutta
Paper co-ordinator: Dr. Arindam Sengupta,
Department of Statistics, University of Calcutta
Content writer: Mr. Samopriya Basu & Prof. Sugata Sen Roy,
Department of Statistics, University of Calcutta
Content reviewer: Dr. Indranil Mukhopadhyay,
Indian Statistical Institute, Kolkata

What is a Time Series?

A time series is a collection of observations recorded sequentially over time.
Let $n$ be the number of time points at which observations are taken.
$Y_t$: the observation at time point $t$, $t = 1, 2, \ldots, n$.

Note: Observations are generally taken at equidistant time points. Unequal time intervals are often encountered, but are more difficult (although not impossible) to handle.
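
As a small illustration of the note on equidistant time points, the following Python sketch checks whether a list of observation times is equally spaced. The function name `is_equidistant`, the tolerance and the sample time points are hypothetical and chosen only for illustration.

```python
def is_equidistant(times, tol=1e-9):
    """Return True if consecutive time points are equally spaced (within tol)."""
    if len(times) < 3:
        return True  # at most one gap, so the spacing is trivially constant
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return all(abs(gap - gaps[0]) <= tol for gap in gaps)


# Hypothetical observation times (in hours).
hourly = [0, 1, 2, 3, 4]
irregular = [0, 1, 2.5, 3, 7]

print(is_equidistant(hourly))
print(is_equidistant(irregular))
```

The first call reports equal spacing and the second does not; a series of the second kind would need methods that handle unequal intervals.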

Examples of time series

- The price of a stock, recorded every hour of every working day of the week.
- The air temperature, which changes continuously over time but is measured only at certain hours of the day.
- The heart rate of a patient, minute by minute.
- The annual number of deaths from a specific disease.
- Monthly tonnes of fish caught in the Bay of Bengal.
- Daily measurements of CO2 concentration (ppm) in Mumbai.
- Sunspot numbers as observed by the Royal Observatory of Belgium.

Bronchitis Death Data

[Figure: plot of the bronchitis death data. Vertical axis: 0 to 3000, in steps of 500. Horizontal axis: time, with tick labels J, A, J, O within each of the years 05 to 10.]

Types of Studies

The types of study are best seen through an example.

Example: The annual production of coal at 20 different collieries is observed for 10 years.
Depending on the nature of the inference required, three different types of studies can be conducted.

Panel/longitudinal analysis
Here the whole data set is studied, i.e. we have cross-sectional data for each of the 10 years, and the focus is generally on the interplay between time and the cross-sectional factors.

Cross-sectional analysis
Here we are interested only in the distribution of coal production in a given year (say, the 4th). The study thus focuses on that year alone, without reference to the other nine years.

Time series analysis
The primary interest here is in the production of a single colliery over the years. This gives an idea of the growth pattern of that colliery in terms of production, irrespective of production elsewhere.
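
The three types of study can be pictured as three ways of slicing one data table. The sketch below assumes the colliery example is stored as a 20 × 10 NumPy array (rows are collieries, columns are years); the production figures are randomly generated placeholders, not real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_collieries, n_years = 20, 10

# Hypothetical annual production figures (arbitrary units), one row per colliery.
production = rng.uniform(50.0, 150.0, size=(n_collieries, n_years))

panel = production                # panel/longitudinal: all collieries, all years
cross_section = production[:, 3]  # cross-sectional: every colliery in the 4th year
time_series = production[0, :]    # time series: the first colliery over all 10 years

print(panel.shape, cross_section.shape, time_series.shape)
```

The printed shapes are `(20, 10)`, `(20,)` and `(10,)`: the panel keeps both dimensions, while the cross-section and the time series each keep only one.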

Comparison

- Longitudinal data are usually more informative, since they take both the time and the cross-sectional factors into account.
- However, such data are typically harder to obtain and are therefore short in both the time dimension and the cross-section. They consequently fail to capture the features that a long time series or a large cross-sectional data set would display.
- Thus, if the interplay between time and cross-section is not of importance, then, depending on the purpose of the study, either a cross-sectional or a time series analysis is carried out.

Mathematical Framework

- Let $(\Omega, \mathcal{F}, P)$ be a probability space and $T$ an index set (some subset of $\mathbb{R}$).
- Let $X_t(\omega)$ be a real-valued function defined on $T \times \Omega$, with $t \in T$ and $\omega \in \Omega$.
- For fixed $t = t_0$, $X_{t_0}(\omega)$ is a random variable defined on $(\Omega, \mathcal{F}, P)$.
- Thus $\{X_t(\omega);\ t \in T\}$ is a collection of random variables.
- For fixed $\omega = \omega_0$, $X_t(\omega_0)$ is a real-valued function of $t$. This defines a realization of the time series.
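
The two ways of "fixing one argument" can be made concrete with a simulation. The sketch below is only an illustration under assumed choices: the process is taken to be a Gaussian random walk, and a finite array `paths` stands in for $X_t(\omega)$, with one row per simulated outcome $\omega$ and one column per time point $t$.

```python
import numpy as np

rng = np.random.default_rng(1)
n_outcomes, n_times = 1000, 50

# Row i corresponds to one outcome omega_i, column j to one time point t_j.
# Each row is a random walk (cumulative sum of independent N(0, 1) steps).
paths = np.cumsum(rng.normal(size=(n_outcomes, n_times)), axis=1)

x_at_t0 = paths[:, 10]      # fixed t = t0: draws of the random variable X_{t0}
realization = paths[0, :]   # fixed omega = omega_0: one realization, a function of t

print(x_at_t0.shape, realization.shape)
print(float(x_at_t0.mean()), float(x_at_t0.var()))
```

A column of `paths` behaves like a sample from a single random variable, so its mean and variance can be estimated in the usual way; a row is one path through time, which is all that is available in practice.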

A Realization

- The collection of all realizations (as $\omega$ varies in $\Omega$) is referred to as an ensemble.
- A time series is a single realization from the ensemble of all possible realizations.

The major problem with time series studies is that only one realization is available. If several realizations could have been obtained, it would have been possible to apply the usual statistical techniques.

Cross-sectional versus Time Series Data

Cross-sectional study
Here there are n observations from a single population.
E.g. observations from the 20 different collieries for a given year.

Time series study
Here the n observations come from n different populations, one observation per time point.
E.g. 20 observations on production from 20 different years.

What’s Special about a Time Series?

The observations in a time series are generally correlated. This is because we are looking at a single realization, so the inherent features of the individual unit are carried into all its values.
E.g. a particular colliery may consistently return higher production figures than others because of the abundance of coal in that area.

The relationship is structured; i.e., there is a pattern in the relationship. This pattern is what we wish to capture when analysing time series data.

Usually an observation is affected more by recent time points than by remote ones.
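
One way to see this structured dependence is to measure the correlation between observations a fixed number of steps apart. The sketch below is an illustration under assumptions not made in the module: the series is simulated so that each value is 0.8 times the previous value plus fresh noise, and `sample_autocorr` is a hypothetical helper implementing the usual sample autocorrelation.

```python
import numpy as np


def sample_autocorr(y, lag):
    """Sample autocorrelation of y at a positive integer lag."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return float(np.sum(d[lag:] * d[:-lag]) / np.sum(d * d))


rng = np.random.default_rng(2)
n, phi = 2000, 0.8  # phi is an assumed strength of dependence on the previous value
eps = rng.normal(size=n)

y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + eps[t]  # each value depends on the one before it

for lag in (1, 5, 20):
    print(lag, round(sample_autocorr(y, lag), 2))
```

In this simulated series the correlation is strong between neighbouring observations and fades as the lag grows, which matches the remark that recent time points matter more than remote ones.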

Types of Time Series

Continuous Time Series
$T$ is an uncountable subset of $\mathbb{R}$,
e.g., $T = \mathbb{R}^+$ or some interval in $\mathbb{R}$.

Observations on an electrocardiogram machine are recorded over a continuous interval of time.

Discrete Time Series
$T$ is a countable subset of $\mathbb{R}$.
Usually $T = \{0, 1, 2, \ldots\}$ or $\{1, 2, \ldots\}$ or $\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$.

In practice, most observed time series are discrete.

Types of Analysis

There are two major branches in time series analysis:

Analysis in the time domain

Here the current value is viewed as a manifestation of all past happenings as reflected by the data.

Analysis in the frequency domain

Here the series is thought of as the result of superimposing a number of sinusoidal curves of different frequencies and amplitudes.

Analysis in the Time Domain

- In time domain analysis, the time series $Y_t$ is regarded as a function of
  - either its own past values

    $Y_t = f(Y_{t-1}, Y_{t-2}, \ldots)$

  - or the present and past values of some i.i.d. random variables

    $Y_t = f(\varepsilon_t, \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots)$

  - or both

    $Y_t = f(Y_{t-1}, Y_{t-2}, \ldots, \varepsilon_t, \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots)$.
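
To make the three forms concrete, one common choice is to take $f$ linear. This is an assumption made here for illustration only; the module leaves $f$ general. The coefficients $\phi_i$ and $\psi_j$ and the orders $p$ and $q$ are illustrative symbols not used elsewhere in the module, and the first line adds a current shock $\varepsilon_t$ to the own-past form so that the series is not fully determined by its history.

```latex
\begin{align*}
Y_t &= \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + \varepsilon_t
  && \text{(own past values, plus a current shock)} \\
Y_t &= \varepsilon_t + \psi_1 \varepsilon_{t-1} + \cdots + \psi_q \varepsilon_{t-q}
  && \text{(present and past i.i.d. variables)} \\
Y_t &= \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p}
      + \varepsilon_t + \psi_1 \varepsilon_{t-1} + \cdots + \psi_q \varepsilon_{t-q}
  && \text{(both)}
\end{align*}
```

These linear versions are usually called autoregressive, moving average and autoregressive moving average models, respectively.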

Analysis in the Frequency Domain

- In the frequency domain, the series $Y_t$ is assumed to be composed of a finite number of sinusoidal curves with different amplitudes ($a_k$) and frequencies ($\theta_k$); i.e.,

  $Y_t = f\big(C_1(a_1, \theta_1), C_2(a_2, \theta_2), \ldots, C_r(a_r, \theta_r)\big)$.

- Generally, the function $f$ is simply the sum, or a weighted sum, of these curves; i.e., the curves are superimposed on one another to represent $Y_t$.
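
The superposition idea can be sketched numerically. The example below assumes that $f$ is a plain sum and that each curve is a cosine, $C_k = a_k \cos(2\pi\theta_k t)$, with two made-up components; it then uses NumPy's discrete Fourier transform to recover the amplitudes and frequencies from the summed series. The specific amplitudes, frequencies and series length are hypothetical.

```python
import numpy as np

n = 200
t = np.arange(n)

# Two hypothetical components: (amplitude a_k, frequency theta_k in cycles per observation).
components = [(3.0, 0.05), (1.5, 0.20)]

# Superimpose the curves: Y_t is the sum of the individual sinusoids.
y = sum(a * np.cos(2 * np.pi * theta * t) for a, theta in components)

# The discrete Fourier transform separates the series back into its frequencies.
amplitudes = np.abs(np.fft.rfft(y)) * 2 / n
freqs = np.fft.rfftfreq(n, d=1.0)

top = np.argsort(amplitudes)[-2:][::-1]  # indices of the two largest peaks
for k in top:
    print(round(float(freqs[k]), 2), round(float(amplitudes[k]), 2))
```

Because both frequencies complete a whole number of cycles in 200 observations, the two largest peaks fall at frequencies 0.05 and 0.20 with heights 3.0 and 1.5, matching the components that were put in. With real data the peaks are blurred by noise and by frequencies that do not fit the sample length exactly.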

The Problem

- In the time domain, the problem is to identify the best set of variables and the best relationship, i.e. those that lead to the closest representation of $Y_t$.
- In the frequency domain, the problem is to identify the primary curves $C_k(a_k, \theta_k)$ which, superimposed on each other, give the best fit to the $Y_t$ series.

Reasons for Time Series Studies

Smoothing
An irregular series needs to be smoothed so that random disturbances (noise) are eliminated and the inherent structure of the series is retrieved, allowing the nature of this structure to be studied.

Forecasting
Based on the smoothed series, future values can be predicted, assuming that the current situation continues to hold.
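
A minimal sketch of both purposes follows. The smoother is a simple moving average and the forecast merely extends the average slope of the smoothed series by one step. Both are stand-ins chosen for brevity, since the module does not prescribe a particular method, and the linear trend, noise level and window length are invented for the example.

```python
import numpy as np


def moving_average(y, window):
    """Simple moving average: mean of each run of `window` consecutive values."""
    y = np.asarray(y, dtype=float)
    return np.convolve(y, np.ones(window) / window, mode="valid")


rng = np.random.default_rng(3)
t = np.arange(100)
signal = 10 + 0.5 * t                              # hypothetical underlying structure
y = signal + rng.normal(scale=5.0, size=t.size)    # structure plus random noise

smooth = moving_average(y, window=9)               # 100 - 9 + 1 = 92 smoothed values

# Naive forecast: assume the current situation holds, i.e. the smoothed
# series keeps rising at its average slope for one more step.
slope = (smooth[-1] - smooth[0]) / (len(smooth) - 1)
forecast = smooth[-1] + slope

print(len(smooth), float(slope), float(forecast))
```

Averaging nine neighbouring values shrinks the noise while leaving the linear structure intact, so the estimated slope lands close to the 0.5 that was built into the simulated series.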

Summary

- What a time series is, and how it differs from other data structures, has been discussed.
- The distinction between the time domain and frequency domain approaches has been made.
- The purposes of time series studies have been listed.

Thank You
