DASC6510/DASC4990

Unit 2b: Time series decomposition

Erfanul Hoque, PhD


Thompson Rivers University
These notes are strongly inspired by the book:
Hyndman, R. J. & Athanasopoulos, G. (2021) Forecasting:
principles and practice, 3rd edition. https://otexts.com/fpp3/

Time series decomposition

• We can think of a time series as comprising three
components: a trend-cycle component, a seasonal component,
and a remainder component (containing anything else in the
time series).
• Here, we consider the most common methods for extracting
these components from a time series. Often this is done to
help improve understanding of the time series, but it can also
be used to improve forecast accuracy.
• When decomposing a time series, it is sometimes helpful to
first transform or adjust the series in order to make the
decomposition (and later analysis) as simple as possible.

Transformations and adjustments

Population adjustments / Per capita adjustments

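# Load the fpp3 meta-package (tsibble, feasts, fable, ggplot2, dplyr and the example data)
library(fpp3)

global_economy %>%
  filter(Country == "Australia") %>%
  autoplot(GDP)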

[Figure: GDP of Australia over time. x-axis: Year [1Y]; y-axis: GDP]

Per capita adjustments

global_economy %>%
filter(Country == "Australia") %>%
autoplot(GDP / Population)

[Figure: GDP per capita of Australia over time. x-axis: Year [1Y]; y-axis: GDP/Population]

Exercise

Consider the GDP information in global_economy. Plot the GDP
per capita for each country over time. Which country has the
highest GDP per capita? How has this changed over time?

Inflation adjustments

• Data which are affected by the value of money need to be adjusted
before modelling.
• For example, the average cost of a new house will have
increased over the last few decades due to inflation. A
$200,000 house this year is not the same as a $200,000 house
twenty years ago. For this reason, financial time series are
usually adjusted so that all values are stated in dollar values
from a particular year.
• To make these adjustments, a price index is used. If \(z_t\) denotes the
price index and \(y_t\) denotes the original house price in year t, then
\(x_t = y_t / z_t \times z_{2000}\) gives the adjusted house price in
year 2000 dollar values.
• For consumer goods, a common price index is the Consumer Price
Index (or CPI).
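
As a rough illustration of the price-index adjustment described above, the sketch below rescales a nominal series to year-2000 dollars in base R. The prices and index values are invented for illustration only, and the names `prices` and `adjust_to_base` are hypothetical helpers, not part of any package.

```r
# Hypothetical data: nominal house prices and a price index by year.
# All numbers are invented for illustration.
prices <- data.frame(
  year    = c(1990, 2000, 2010, 2020),
  nominal = c(100000, 150000, 220000, 300000),
  index   = c(60, 80, 100, 120)
)

# x_t = y_t / z_t * z_base: express every value in base-year dollars.
adjust_to_base <- function(y, z, years, base_year) {
  z_base <- z[years == base_year]
  stopifnot(length(z_base) == 1)
  y / z * z_base
}

prices$real_2000 <- adjust_to_base(prices$nominal, prices$index,
                                   prices$year, base_year = 2000)
print(prices)
```

By construction the base-year value is unchanged by the adjustment, which makes a quick sanity check.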

Inflation adjustments

# Annual turnover for newspaper and book retailing
print_retail <- aus_retail %>%
  filter(Industry == "Newspaper and book retailing") %>%
  group_by(Industry) %>%
  index_by(Year = year(Month)) %>%
  summarise(Turnover = sum(Turnover))
# Australian annual economic indicators, including CPI
aus_economy <- global_economy %>%
  filter(Code == "AUS")
# Deflate turnover by CPI and plot nominal and adjusted series
print_retail %>%
  left_join(aus_economy, by = "Year") %>%
  mutate(Adjusted_turnover = Turnover / CPI * 100) %>%
  pivot_longer(c(Turnover, Adjusted_turnover),
               values_to = "Turnover") %>%
  mutate(name = factor(name,
                       levels = c("Turnover", "Adjusted_turnover"))) %>%
  ggplot(aes(x = Year, y = Turnover)) +
  geom_line() +
  facet_grid(name ~ ., scales = "free_y") +
  labs(title = "Turnover: Australian print media industry",
       y = "$AU")

Inflation adjustments

[Figure: Turnover: Australian print media industry. Two panels: Turnover and Adjusted_turnover; x-axis: Year; y-axis: $AU]

Mathematical transformations

If the data show different variation at different levels of
the series, then a transformation can be useful.
Denote original observations as \(y_1, \dots, y_n\) and
transformed observations as \(w_1, \dots, w_n\).

Mathematical transformations for stabilizing variation
(listed in order of increasing strength):

Square root   \(w_t = \sqrt{y_t}\)
Cube root     \(w_t = \sqrt[3]{y_t}\)
Logarithm     \(w_t = \log(y_t)\)

Logarithms, in particular, are useful because they are
more interpretable: changes in a log value are relative
(percent) changes on the original scale.

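The claim that changes in a log value are relative changes can be checked numerically. The sketch below uses an invented series that grows by exactly 5% per period; the variable names are illustrative only.

```r
# Invented series growing by exactly 5% per period.
y <- 100 * 1.05^(0:5)

# Differences of logs are constant and close to the relative change.
log_diff   <- diff(log(y))
rel_change <- diff(y) / head(y, -1)

print(round(log_diff, 4))    # log(1.05) at every step
print(round(rel_change, 4))  # 0.05 at every step
```

For small changes the two quantities nearly coincide, which is why differences on the log scale are often read as approximate percentage changes.
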
Mathematical transformations

food <- aus_retail %>%
  filter(Industry == "Food retailing") %>%
  summarise(Turnover = sum(Turnover))

[Figure: food retailing turnover. x-axis: Month [1M]; y-axis: Turnover ($AUD)]

Mathematical transformations

food %>% autoplot(sqrt(Turnover)) +
  labs(y = "Square root turnover")

[Figure: square root of food retailing turnover. x-axis: Month [1M]; y-axis: Square root turnover]

Mathematical transformations

food %>% autoplot(Turnover^(1/3)) +
  labs(y = "Cube root turnover")

[Figure: cube root of food retailing turnover. x-axis: Month [1M]; y-axis: Cube root turnover]

Mathematical transformations

food %>% autoplot(log(Turnover)) +
  labs(y = "Log turnover")

[Figure: log of food retailing turnover. x-axis: Month [1M]; y-axis: Log turnover]

Mathematical transformations

food %>% autoplot(-1/Turnover) +
  labs(y = "Inverse turnover")

[Figure: negative inverse of food retailing turnover. x-axis: Month [1M]; y-axis: Inverse turnover]

Box-Cox transformations

Each of these transformations is close to a member of the family of
Box-Cox transformations:

\[
w_t =
\begin{cases}
\log(y_t), & \lambda = 0; \\
(y_t^{\lambda} - 1)/\lambda, & \lambda \neq 0.
\end{cases}
\]

• \(\lambda = 1\): no substantive transformation, \(w_t = y_t - 1\)
• \(\lambda = 1/2\): square root plus linear transformation
• \(\lambda = 0\): natural logarithm
• \(\lambda = -1\): inverse plus 1, \(w_t = 1 - 1/y_t\)

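The special cases listed above can be verified with a small base-R sketch of the Box-Cox formula. The function name `box_cox_simple` is hypothetical (chosen so it does not clash with the `box_cox()` function used elsewhere in these slides), and it assumes strictly positive data.

```r
# Box-Cox transformation as defined above (assumes y > 0).
box_cox_simple <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

y <- c(1, 10, 100, 1000)

print(box_cox_simple(y, 1))     # y - 1: a shift only
print(box_cox_simple(y, 0.5))   # 2 * (sqrt(y) - 1)
print(box_cox_simple(y, 0))     # log(y)
print(box_cox_simple(y, -1))    # 1 - 1/y

# As lambda approaches 0 the transformation approaches log(y).
print(max(abs(box_cox_simple(y, 1e-6) - log(y))))
```
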
Box-Cox transformations
[Figure: Box-Cox transformed food retailing turnover (lambda = 1). x-axis: Month; y-axis: Turnover]

Box-Cox transformations

food %>%
features(Turnover, features = guerrero)

## # A tibble: 1 x 1
## lambda_guerrero
## <dbl>
## 1 0.0895

• This attempts to balance the seasonal fluctuations
and random variation across the series.
• Always check the results.
• A low value of \(\lambda\) can give extremely large prediction
intervals.

Box-Cox transformations

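# lambda is set by hand here; the Guerrero estimate shown earlier is an alternative
food %>% autoplot(box_cox(Turnover, 0.0524)) +
  labs(y = "Box-Cox transformed turnover")

[Figure: Box-Cox transformed food retailing turnover. x-axis: Month [1M]; y-axis: Box-Cox transformed turnover]
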
Transformations

• Often no transformation is needed.
• Simple transformations are easier to explain and work
well enough.
• Transformations can have a very large effect on prediction
intervals (PI).
• If some data are zero or negative, then use \(\lambda > 0\).
• Choosing logs is a simple way to force forecasts to be
positive.
• Transformations must be reversed to obtain forecasts
on the original scale. (Handled automatically by
fable.)

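To make the last two points concrete, here is a minimal base-R sketch of reversing a Box-Cox transformation by hand. The helpers `bc` and `inv_bc` are hypothetical names written for this example, they assume positive data, and the "forecast" values are invented.

```r
# Hypothetical forward and inverse Box-Cox helpers (assume y > 0).
bc <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
inv_bc <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

y <- c(5, 20, 80)

# Round trip: transforming and then reversing recovers the data.
print(inv_bc(bc(y, 0.3), 0.3))

# With logs, any value on the transformed scale maps back to a
# positive number, which is why log forecasts are always positive.
w_forecast <- c(-2, 0, 3)   # invented values on the log scale
print(inv_bc(w_forecast, 0))
```
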
Time series components

Time series patterns

Recall:

Trend: pattern exists when there is a long-term increase or
decrease in the data.
Cyclic: pattern exists when data exhibit rises and falls that
are not of fixed period (duration usually of at least 2
years).
Seasonal: pattern exists when a series is influenced by seasonal
factors (e.g., the quarter of the year, the month, or
day of the week).

Time series decomposition

\[ y_t = f(S_t, T_t, R_t) \]

where
  \(y_t\) = data at period t
  \(T_t\) = trend-cycle component at period t
  \(S_t\) = seasonal component at period t
  \(R_t\) = remainder component at period t

Additive decomposition: \(y_t = S_t + T_t + R_t\).
Multiplicative decomposition: \(y_t = S_t \times T_t \times R_t\).

Time series decomposition

• An additive model is appropriate if the magnitude of seasonal
fluctuations does not vary with the level of the series.
• If seasonal fluctuations are proportional to the level of the
series, then a multiplicative model is appropriate.
• Multiplicative decomposition is more prevalent with
economic series.
• Alternative: use a Box-Cox transformation, and then
use additive decomposition.
• Logs turn a multiplicative relationship into an additive
relationship:

\[ y_t = S_t \times T_t \times R_t \;\Rightarrow\; \log y_t = \log S_t + \log T_t + \log R_t. \]
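
The log identity above can be checked on a toy example. The component values below are invented, and the variable names are illustrative only; this is a sketch of the algebra, not a decomposition method.

```r
# Invented multiplicative components over two "years" of quarterly data.
trend     <- seq(100, 170, by = 10)                              # T_t
seasonal  <- rep(c(0.8, 1.1, 1.3, 0.8), 2)                       # S_t
remainder <- c(1.01, 0.99, 1.02, 0.98, 1.00, 1.01, 0.99, 1.00)   # R_t

y <- seasonal * trend * remainder   # multiplicative model

# On the log scale the same components add up.
log_sum <- log(seasonal) + log(trend) + log(remainder)
print(all.equal(log(y), log_sum))
```

This is why an additive decomposition of the logged series corresponds to a multiplicative decomposition of the original series.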


US Retail Employment

us_retail_employment <- us_employment %>%
  filter(year(Month) >= 1990, Title == "Retail Trade") %>%
  select(-Series_ID)
us_retail_employment

## # A tsibble: 357 x 3 [1M]
##       Month Title        Employed
##       <mth> <chr>           <dbl>
##  1 1990 Jan Retail Trade   13256.
##  2 1990 Feb Retail Trade   12966.
##  3 1990 Mar Retail Trade   12938.
##  4 1990 Apr Retail Trade   13012.
##  5 1990 May Retail Trade   13108.
##  6 1990 Jun Retail Trade   13183.
##  7 1990 Jul Retail Trade   13170.
##  8 1990 Aug Retail Trade   13160.
##  9 1990 Sep Retail Trade   13113.
## 10 1990 Oct Retail Trade   13185.

US Retail Employment

us_retail_employment %>%
autoplot(Employed) +
labs(y = "Persons (thousands)",
title = "Total employment in US retail")

[Figure: Total employment in US retail. x-axis: Month [1M]; y-axis: Persons (thousands)]

US Retail Employment

us_retail_employment %>%
model(stl = STL(Employed))

## # A mable: 1 x 1
## stl
## <model>
## 1 <STL>

US Retail Employment

dcmp <- us_retail_employment %>%
  model(stl = STL(Employed))
components(dcmp)

## # A dable: 357 x 7 [1M]
## # Key:     .model [1]
## # :        Employed = trend + season_year + remainder
##   .model    Month Employed  trend season_~1 remai~2 seaso~3
##   <chr>     <mth>    <dbl>  <dbl>     <dbl>   <dbl>   <dbl>
## 1 stl    1990 Jan   13256. 13288.    -33.0    0.836  13289.
## 2 stl    1990 Feb   12966. 13269.   -258.   -44.6    13224.
## 3 stl    1990 Mar   12938. 13250.   -290.   -22.1    13228.
## 4 stl    1990 Apr   13012. 13231.   -220.     1.05   13232.
## 5 stl    1990 May   13108. 13211.   -114.    11.3    13223.
## 6 stl    1990 Jun   13183. 13192.    -24.3   15.5    13207.
## 7 stl    1990 Jul   13170. 13172.    -23.2   21.6    13193.
## 8 stl    1990 Aug   13160. 13151.     -9.52  17.8    13169.
## 9 stl    1990 Sep   13113. 13131.    -39.5   22.0    13153.

US Retail Employment

us_retail_employment %>%
autoplot(Employed, color='gray') +
autolayer(components(dcmp), trend, color='#D55E00') +
labs(y = "Persons (thousands)",
title = "Total employment in US retail")

[Figure: Total employment in US retail, with the STL trend component overlaid on the original series (gray). x-axis: Month [1M]; y-axis: Persons (thousands)]

US Retail Employment

components(dcmp) %>% autoplot()

[Figure: STL decomposition, Employed = trend + season_year + remainder. Panels: Employed, trend, season_year, remainder; x-axis: Month]

US Retail Employment

components(dcmp) %>% gg_subseries(season_year)

[Figure: sub-series plot of the season_year component from the stl model, one panel per month (Jan to Dec). x-axis: Month; y-axis: season_year]

Seasonal adjustment

• Useful by-product of decomposition: an easy way to calculate
seasonally adjusted data.
• Additive decomposition: seasonally adjusted data given by

\[ y_t - S_t = T_t + R_t \]

• Multiplicative decomposition: seasonally adjusted data given by

\[ y_t / S_t = T_t \times R_t \]

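The additive case can be sketched in base R without the fpp3 packages. The example below uses `stats::stl()` on the built-in monthly `co2` series as a stand-in dataset (it is not the retail employment data used in these slides), and `s.window = "periodic"` is just one possible setting.

```r
# Base-R sketch using the built-in monthly `co2` series; this is a
# stand-in dataset, not the retail employment data from the slides.
fit   <- stl(co2, s.window = "periodic")
parts <- fit$time.series            # columns: seasonal, trend, remainder

# Additive seasonal adjustment: y_t - S_t
season_adjust <- co2 - parts[, "seasonal"]

# The adjusted series equals trend + remainder (up to rounding).
print(all.equal(as.numeric(season_adjust),
                as.numeric(parts[, "trend"] + parts[, "remainder"])))
```
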
US Retail Employment

us_retail_employment %>%
autoplot(Employed, color='gray') +
autolayer(components(dcmp), season_adjust, color='#0072B2') +
labs(y = "Persons (thousands)",
title = "Total employment in US retail")

[Figure: Total employment in US retail, with the seasonally adjusted series overlaid on the original series (gray). x-axis: Month [1M]; y-axis: Persons (thousands)]

Seasonal adjustment

• We use estimates of \(S\) based on past values to seasonally
adjust a current value.
• Seasonally adjusted series reflect remainders as well as
trend. Therefore they are not “smooth”, and “downturns” or
“upturns” can be misleading.
• It is better to use the trend-cycle component to look for
turning points.

Classical Decomposition

• The traditional way to do time series decomposition is called
classical decomposition.
• The first step in a classical decomposition is to use a moving
average method.
• The simplest estimate of the trend-cycle uses moving
averages.
• A moving average of order m can be written as

\[ \hat{T}_t = \frac{1}{m} \sum_{j=-k}^{k} y_{t+j}, \quad \text{where } m = 2k + 1 \]

Moving Average Smoothing

So a moving average is an average of nearby points:

• Observations nearby in time are also likely to be close in
value.
• The average eliminates some randomness in the data, leaving a
smooth trend-cycle component.

3-MA: \(\hat{T}_t = (y_{t-1} + y_t + y_{t+1})/3\)
5-MA: \(\hat{T}_t = (y_{t-2} + y_{t-1} + y_t + y_{t+1} + y_{t+2})/5\)

• Each average is computed by dropping the oldest observation
and including the next observation.
• Averaging moves through the time series until the trend-cycle
is computed at each observation possible.

Moving averages: example

global_economy |> filter(Country == "Australia") |>
  autoplot(Exports) +
  labs(y = "% of GDP", title = "Total Australian exports")

[Figure: Total Australian exports. x-axis: Year [1Y]; y-axis: % of GDP]

Moving average smoothing

Year    Exports    5-MA
1960    12.99
1961    12.40
1962    13.94      13.46
1963    13.01      13.50
1964    14.94      13.61
...     ...        ...
2012    21.52      20.78
2013    19.99      20.81
2014    21.08      20.37
2015    20.01      20.32
2016    19.25
2017    21.27

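The 5-MA column can be reproduced by hand from the export values in the table. The sketch below uses only the first five and last five rows shown there; `centred_ma` is a hypothetical helper built on base R's `stats::filter()`, not a function from the fpp3 packages.

```r
# First five and last five export values from the table (% of GDP).
first5 <- c(12.99, 12.40, 13.94, 13.01, 14.94)   # 1960 to 1964
last5  <- c(19.99, 21.08, 20.01, 19.25, 21.27)   # 2013 to 2017

# Centred moving average of odd order m; the ends are NA because
# there are not enough neighbours on both sides.
centred_ma <- function(y, m) {
  stopifnot(m %% 2 == 1)
  as.numeric(stats::filter(y, rep(1 / m, m), sides = 2))
}

print(round(centred_ma(first5, 5), 2))  # middle value is the 1962 entry
print(round(centred_ma(last5, 5), 2))   # middle value is the 2015 entry
print(round(centred_ma(first5, 3), 2))  # 3-MA is defined for 1961 to 1963
```

The missing values at both ends match the blank 5-MA cells for 1960, 1961, 2016 and 2017: a moving average of order m = 2k + 1 cannot be computed for the first and last k observations.
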
Moving average smoothing

[Figure: Total Australian exports: 3-MA. x-axis: Year [1Y]; y-axis: % of GDP]

Moving average smoothing

[Figure: Total Australian exports: 5-MA. x-axis: Year [1Y]; y-axis: % of GDP]

Exercise:

1. For the following series, find an appropriate
transformation in order to stabilise the variance.
   • United States GDP from global_economy
   • Slaughter of Victorian “Bulls, bullocks and steers” in
     aus_livestock
   • Victorian Electricity Demand from vic_elec
   • Gas production from aus_production
2. Why is a Box-Cox transformation unhelpful for the
canadian_gas data?

Next Lecture!

• In the next lecture, we will cover the forecaster's toolbox.

Please read Chapter 5 of the textbook (Hyndman, R. J. &
Athanasopoulos, G. (2021) Forecasting: principles and practice,
3rd edition. https://otexts.com/fpp3/) beforehand.