Chapter 1 - Lecture

This document provides an overview of time series analysis, focusing on data measured over time in various fields such as business, agriculture, and meteorology. It discusses the importance of time series plots, correlation, and seasonality, illustrated with examples like annual rainfall in Los Angeles and global temperature data. The goal of time series analysis is to model data and predict future values, highlighting the challenges posed by correlated observations and the need for appropriate statistical methods.

Chapter 1 : Examples of Time Series

A time series is a sequence of ordered data. The “ordering” refers generally to time, but other orderings
could be envisioned (e.g., over space, etc.). In this class, we will be concerned exclusively with time series
that are

• measured on a single continuous random variable Y


• equally spaced in discrete time; that is, we will have a single realization of Y at each second, hour,
day, month, year, etc.

Time series data arise in a variety of fields. Here are just a few examples.

• In business, we observe daily closing stock prices, weekly interest rates, quarterly sales, monthly price
indices, etc.
• In agriculture, we observe annual yields (e.g., crop production), daily crop prices, annual livestock
production, etc.
• In engineering, we observe electric signals, voltage measurements, etc.
• In natural sciences, we observe chemical yields, turbulence in ocean waves, earth tectonic plate
positions, etc.
• In medicine, we observe EKG measurements on patients, drug concentrations, blood pressure readings,
etc.
• In epidemiology, we observe the number of flu cases per day, the number of health-care clinic visits
per week, annual tuberculosis counts, etc.
• In meteorology, we observe hourly wind speeds, daily high temperatures, annual rainfall, earthquake
frequency, etc.
• In social sciences, we observe annual birth and death rates, accident frequencies, crime rates, school
enrollments, etc.

The time series plot is the most basic graphical display in the analysis of time series data. The plot is
basically a scatterplot of $Y_t$ versus $t$, with straight lines connecting the points. Notationally, $Y_t$ is the
value of the variable $Y$ at time $t$, for $t = 1, 2, ..., n$; the subscript $t$ tells us to which time point the
measurement $Y_t$ corresponds. In the sequence $Y_1, Y_2, ..., Y_n$, the subscripts are very important because
they correspond to a particular ordering of the data. This is perhaps a change in mindset from other methods
courses, where the time element is ignored.

1. Annual Rainfall in Los Angeles


Los Angeles averages 15 inches of precipitation annually, most of it during the winter and spring
(November through April), usually as light rain showers but sometimes as heavy rainfall and
thunderstorms. The data are annual rainfall totals for Los Angeles, California, from 1878 to 1992.

• Data file: larain (TSA)


• There are n = 115 observations.
• Measurements are taken each year.

• What are the noticeable patterns?
– a typical year had about 15 inches
– considerable variation over the years, i.e., some years are low, some high, many are in between
∗ The year 1884 was an exceptionally wet year
∗ The year 1989 was quite dry
• Prediction?

– last year's rainfall amount carries little information about this year's amount, i.e., the plot
shows no trend
– there is little correlation between last year's rainfall amount and this year's amount

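library(TSA)   # provides the larain data set and zlag()
data(larain)
# Time series plot: annual rainfall against year, points joined by lines
plot(larain, ylab="Inches", xlab="Year", type="o",
     main="Time Series Plot of Los Angeles Annual Rainfall")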

[Figure: Time Series Plot of Los Angeles Annual Rainfall. x-axis: Year (1880 to 1980); y-axis: Inches (10 to 40).]

# Lag-1 scatterplot: this year's rainfall against last year's rainfall
plot(y=larain, x=zlag(larain,1), ylab="Inches", xlab="Previous Year Inches",
     main="Scatterplot of LA Rainfall versus Last Year's LA Rainfall")

[Figure: Scatterplot of LA Rainfall versus Last Year's LA Rainfall. x-axis: Previous Year Inches (10 to 40); y-axis: Inches (10 to 40).]

This plot is called a lag-1 scatterplot. It displays the observed data plotted against the lag-1 series, i.e.,
the scatterplot of the data points $(Y_1, Y_2), (Y_2, Y_3), ..., (Y_{n-1}, Y_n)$, and so graphically describes the
degree of correlation between rainfall from one year to the next.
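
To make the pairing explicit, here is a minimal base-R sketch that builds the lag-1 pairs by hand and computes their correlation. The vector `y` holds made-up values, not the `larain` data, and `lag1_pairs` is a hypothetical helper name. The last two lines assume that `zlag(y, 1)` shifts the series by one position and pads the start with `NA`; under that assumption, `cor(y, zlag(y, 1), use = "complete.obs")` should give the same number.

```r
# Made-up values standing in for Y_1, ..., Y_n (illustration only)
y <- c(14.2, 9.6, 21.3, 12.8, 17.5, 11.1, 30.4, 13.9)
n <- length(y)

# Hypothetical helper: returns the pairs (Y_{t-1}, Y_t) for t = 2, ..., n
lag1_pairs <- function(y) {
  n <- length(y)
  data.frame(previous = y[1:(n - 1)], current = y[2:n])
}

pairs_df <- lag1_pairs(y)
print(pairs_df)   # first row is (Y_1, Y_2), last row is (Y_{n-1}, Y_n)

# Lag-1 scatterplot: current value against the previous value
plot(pairs_df$previous, pairs_df$current,
     xlab = "Previous value", ylab = "Current value",
     main = "Lag-1 scatterplot (illustrative data)")

# Correlation of the n - 1 complete pairs
r1 <- cor(pairs_df$previous, pairs_df$current)

# Same quantity via an NA-padded shift, mimicking what zlag(y, 1)
# is assumed to return
y_lag1 <- c(NA, y[-n])
r1_padded <- cor(y, y_lag1, use = "complete.obs")
```

Both routes use the same n - 1 pairs, so `r1` and `r1_padded` agree; the padded form is convenient because the shifted series keeps the same length as the original.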

2. Monthly Average Temperature in Dubuque, Iowa


The data are monthly average temperatures in Dubuque, Iowa, from January 1964 to December 1975.

• Data file: tempdub (TSA)

• There are n = 144 observations.


• Measurements are taken each month.
• What are the noticeable patterns?
– Seasonality
∗ all Januaries and Februaries are quite cold; they are similar in value to one another and
different from the temperatures of the warmer months of June, July, and August.
– small variation for the same month over the years
• Prediction?

– the correlation between the same month in consecutive years is strong, about 0.97 in the
lag-12 scatterplot
– seasonality can be used for prediction.

library(TSA)
data(tempdub)
plot(tempdub, ylab="Temperature", xlab="Time", type="o",
     main="Average Monthly Temperatures, Dubuque, Iowa")

[Figure: Average Monthly Temperatures, Dubuque, Iowa. x-axis: Time (1964 to 1976); y-axis: Temperature (10 to 70).]

# Lag-12 scatterplot: each month against the same month one year earlier
plot(y=tempdub, x=zlag(tempdub,12), ylab="Temperature", xlab="Previous Year Temperature",
     main="Ave Monthly Temp versus Previous Year's Ave Monthly Tempe")

[Figure: Ave Monthly Temp versus Previous Year's Ave Monthly Tempe. x-axis: Previous Year Temperature (10 to 70); y-axis: Temperature (10 to 70).]

cor(tempdub,zlag(tempdub,12),use="complete.obs")

## [1] 0.9702201
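
One simple way to use seasonality for prediction is to forecast each month by the average of that month in earlier years. The sketch below illustrates the idea with `nottem`, a monthly temperature series shipped with base R (Nottingham, 1920 to 1939), used as a stand-in for `tempdub` so that it runs without the TSA package. The monthly-mean forecast is an illustrative assumption, not a method prescribed by these notes.

```r
# nottem: monthly average air temperatures at Nottingham Castle, 1920-1939
# (built into R; used here only as a stand-in for tempdub)
y <- nottem
n <- length(y)

# cycle() gives the position within the year (1 = January, ..., 12 = December)
month_index <- as.integer(cycle(y))

# Average temperature for each calendar month across all years
monthly_means <- as.numeric(tapply(as.numeric(y), month_index, mean))
names(monthly_means) <- month.abb
print(round(monthly_means, 1))

# Lag-12 correlation: each month against the same month one year earlier
r12 <- cor(y[13:n], y[1:(n - 12)])
print(r12)

# Naive seasonal forecast for the next 12 months: repeat the monthly means
forecast_next_year <- ts(as.numeric(monthly_means),
                         start = c(end(y)[1] + 1, 1), frequency = 12)
print(forecast_next_year)
```

When the same month varies little from year to year, as in the Dubuque series, a forecast of this kind can already be fairly close; the models developed later in the course refine it.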

3. Monthly Oil Filter Sales


The data are monthly sales of a specialty oil filter for construction equipment manufactured by John Deere,
from July 1983 to June 1987.

• Data file: oilfilters (TSA)


• There are n = 48 observations.
• Measurements are taken each month.
• What are the noticeable patterns?
– Seasonality
– larger variation for the same month over the years than in the Dubuque temperature series
• Prediction?
– sales in the winter months of January and February tend to be high
– sales in September, October, November, and December are generally quite low

library(TSA)
data(oilfilters)
# Draw the series as a line, then overlay the first letter of each month
plot(oilfilters, ylab="Sales", xlab="Time", type="l",
     main="Monthly Oil Filter Sales with Month Symbol")
points(y=oilfilters, x=time(oilfilters), pch=as.vector(season(oilfilters)))

[Figure: Monthly Oil Filter Sales with Month Symbol. x-axis: Time (1984 to 1987); y-axis: Sales (2000 to 6000); each point is marked with the first letter of its month.]

# Lag-12 scatterplot: the x-axis is sales 12 months earlier, not time
plot(y=oilfilters, x=zlag(oilfilters,12), ylab="Sales", xlab="Sales 12 Months Earlier",
     main="Lag-12 scatterplot")

[Figure: Lag-12 scatterplot of oil filter sales. x-axis: sales 12 months earlier (2000 to 6000); y-axis: Sales (2000 to 6000).]

cor(oilfilters,zlag(oilfilters,12),use="complete.obs")

## [1] 0.8084015

# Lag-1 scatterplot: the x-axis is sales in the previous month, not time
plot(y=oilfilters, x=zlag(oilfilters,1), ylab="Sales", xlab="Previous Month Sales",
     main="Lag-1 scatterplot")

[Figure: Lag-1 scatterplot of oil filter sales. x-axis: sales in the previous month (2000 to 6000); y-axis: Sales (2000 to 6000).]

cor(oilfilters,zlag(oilfilters,1),use="complete.obs")

## [1] 0.3142145
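
The lag-1 and lag-12 correlations above are the same calculation applied at different lags. A minimal base-R sketch of that calculation follows. Here `lag_cor` is a hypothetical helper (not a TSA function), and the perfectly periodic series `x` is made up so that the results can be checked by hand; it is not the oil filter data.

```r
# Hypothetical helper: correlation between Y_t and Y_{t-k} over the
# n - k pairs for which both values are observed
lag_cor <- function(y, k) {
  y <- as.numeric(y)
  n <- length(y)
  stopifnot(k >= 1, k < n - 1)
  cor(y[(k + 1):n], y[1:(n - k)])
}

# Made-up monthly series: the same 12-value pattern repeated for 4 years
pattern <- c(5, 5, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2)
x <- ts(rep(pattern, times = 4), frequency = 12)

r12 <- lag_cor(x, 12)   # identical pattern every year, so this equals 1
r1  <- lag_cor(x, 1)    # neighboring months are not perfectly related

# Correlations at lags 1 through 12 in one call
all_lags <- sapply(1:12, function(k) lag_cor(x, k))
print(round(all_lags, 3))
```

Comparing the values across lags shows where the dependence sits: for a seasonal monthly series the correlation at lag 12 can be much larger than at lag 1, which matches the oil filter results above.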

4. Global temperature data


“Global warming” refers to an increase in the average temperature of the Earth’s near-surface air and oceans
since the mid-20th century and its projected continuation. The data are annual temperature deviations
(1856-1997) from a combination of land-air average temperature anomalies, measured in degrees Centigrade.

• Data file: globaltemps


• There are n = 142 observations.
• Measurements are taken each year.

• What are the noticeable patterns?
– the deviations vary over the years
– neighboring values are very closely related
– large changes in deviation do not occur from one year to the next
– global temperatures increase over the years
• Prediction?
– the correlation between neighboring years is strong, about 0.84, so the previous year's deviation
is informative about the current year's

# Download "globaltemps.txt" and save it in your current working directory
globaltemps = ts(read.table(file = "globaltemps.txt"), start=1856)
plot(globaltemps, ylab="Global temperature deviations", xlab="Year", type="o",
     main="Global Temperature Data")

[Figure: Global Temperature Data. x-axis: Year (1860 to 2000); y-axis: Global temperature deviations (−0.4 to 0.4).]

plot(y=globaltemps, x=zlag(globaltemps,1), ylab="Global temperature deviations",
     xlab="Previous Year Temperature deviation",
     main="Lag-1 Scatterplot")

[Figure: Lag-1 Scatterplot of global temperature deviations. x-axis: Previous Year Temperature deviation (−0.4 to 0.4); y-axis: Global temperature deviations (−0.4 to 0.4).]

cor(globaltemps,zlag(globaltemps,1),use="complete.obs")

## [,1]
## V1 0.8421212

The purpose of time series analysis is twofold:

1. to model the stochastic (random) mechanism that gives rise to the series of data

2. to predict (forecast) the future values of the series based on the previous history.

NOTES: For time series data, we get to see only a single measurement from a population (at time t) instead
of a sample of measurements at a fixed point in time (cross-sectional data).

1. The special feature of time series data is that they are not independent! Instead, observations are
correlated through time.
• Correlated data are generally more difficult to analyze.
• Statistical theory in the absence of independence becomes markedly more difficult.

2. Most classical statistical methods (e.g., regression, analysis of variance, etc.) assume that observations
are statistically independent. For example, in the simple linear regression model
$Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$
or an ANOVA model like
$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$,
we typically assume that the error terms $\epsilon$ are independent and identically distributed (iid) normal
random variables with mean 0 and constant variance.
3. There can be additional trends or seasonal variation patterns (seasonality) that may be difficult to
identify and model.

4. The data may be highly non-normal in appearance and be possibly contaminated by outliers.
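
One informal check of the independence assumption is to fit the regression and then look at the lag-1 correlation of its residuals. The sketch below does this for a straight-line trend fitted to `LakeHuron`, an annual series shipped with base R that stands in for the course data. The choice of data set and of a linear trend are assumptions made for illustration only.

```r
# LakeHuron: annual lake level in feet, 1875-1972 (built into R)
y <- LakeHuron
t_index <- as.numeric(time(y))

# Simple linear regression Y_t = beta0 + beta1 * t + e_t
fit <- lm(as.numeric(y) ~ t_index)
e <- as.numeric(residuals(fit))
n <- length(e)

# Lag-1 correlation of the residuals: pairs (e_{t-1}, e_t)
r1_resid <- cor(e[2:n], e[1:(n - 1)])
print(r1_resid)

# Lag-1 scatterplot of residuals; a clear linear pattern suggests the
# errors are correlated through time rather than independent
plot(e[1:(n - 1)], e[2:n],
     xlab = "Previous residual", ylab = "Current residual",
     main = "Lag-1 scatterplot of regression residuals")
```

If the errors were independent, `r1_resid` would be expected to lie near 0. A value well away from 0 is a sign that the usual regression standard errors and tests should not be trusted for the series.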

Our overarching goal in this course is to build (and use) time series models for data. This breaks down into
different parts.

1. Model specification (identification)


• Consider different classes of time series models for stationary processes.
• Use descriptive statistics, graphical displays, subject matter knowledge, etc. to make sensible
candidate selections.
• Abide by the Principle of Parsimony.
2. Model fitting
• Once a candidate model is chosen, estimate the parameters in the model.
• We will use least squares and/or maximum likelihood to do this.

3. Model diagnostics
• Use statistical inference and graphical displays to check how well the model fits the data.
• This part of the analysis may suggest the candidate model is inadequate and may point to more
appropriate models.
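
The three steps above can be sketched in base R as follows. This is only an outline under stated assumptions: `LakeHuron` (built into R) stands in for the course data, the first-order autoregressive candidate passed to `arima()` is picked purely as an example of a simple model, and the diagnostics shown are a small subset of what a full analysis would use.

```r
# LakeHuron: annual lake level in feet, 1875-1972 (built into R)
y <- LakeHuron

# 1. Model specification: inspect the series and its sample autocorrelations,
#    then pick a simple candidate (an assumed example, not a recommendation)
plot(y, type = "o", ylab = "Level (feet)", main = "Lake Huron level")
acf(y, main = "Sample autocorrelation of Lake Huron level")

# 2. Model fitting: estimate the parameters by maximum likelihood
fit <- arima(y, order = c(1, 0, 0), method = "ML")
print(coef(fit))   # "ar1" and the estimated mean level (labelled "intercept")

# 3. Model diagnostics: residuals from an adequate model should look
#    like uncorrelated noise
res <- residuals(fit)
plot(res, type = "o", ylab = "Residual",
     main = "Residuals from the fitted model")
acf(res, main = "Sample autocorrelation of residuals")
```

If the residual plots still show clear structure, the candidate is inadequate and the cycle returns to step 1 with a different model, keeping the Principle of Parsimony in mind.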
