
Time Series

Module 3
Introduction
• Time series analysis is a specific way of analyzing a sequence of data
points collected over an interval of time.
• In time series analysis, analysts record data points at consistent
intervals over a set period of time rather than just recording the data
points randomly.
Overview of Time Series Analysis
• Time series is a sequence of observations of categorical or numeric variables indexed by a date or timestamp. A clear example of time series data is the time series of a stock price. The table below shows the basic structure of time series data; in this case the observations are recorded every hour.

Timestamp               Stock Price
2015-10-11 09:00:00     100
2015-10-11 10:00:00     110
2015-10-11 11:00:00     105
2015-10-11 12:00:00     90
2015-10-11 13:00:00     120
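As a minimal sketch, the same hourly table could be loaded in Python as a pandas Series indexed by timestamp (the variable names here are illustrative, not from the source):

```python
# Minimal sketch: the hourly stock-price table above as a pandas Series
# indexed by timestamp.
import pandas as pd

prices = pd.Series(
    [100, 110, 105, 90, 120],
    index=pd.date_range("2015-10-11 09:00", periods=5, freq="h"),
    name="stock_price",
)
print(prices)
```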
• The trend is defined as the long-term underlying growth movement in a time series.
• Accurate trend spotting can only be determined if the data are available for a sufficient length of time.
• Forecasting does not produce definitive results. Forecasters can and do get things wrong, from election results and football scores to the weather.
Time series examples
• Sales data
• Gross national product
• Share price
• Unemployment rates
• Population
• Foreign debt
• Interest rates
Time series components
Time series data can be broken into these four components:
1. Secular trend
2. Seasonal variation
3. Cyclical variation
4. Irregular variation
Components of Time-Series Data
[Figure: the four components of time-series data (trend, seasonal, cyclical, and irregular fluctuations) plotted against years 1-13]
1. Secular Trend
1. Trend component: This is useful in predicting future movements. Over a long period of time, the trend shows whether the data tend to increase or decrease. The term “trend” refers to an average, long-term, smooth tendency. The increase or decrease does not have to occur uniformly: different sections of time may show increasing, decreasing, or stable tendencies. There must, however, be an overall upward, downward, or stable trend.
2. Seasonal Variation
The seasonal component of a time series is the variation in some variable due to some predetermined patterns in its behavior. This definition can be used for any type of time series, including individual commodity price quotes, interest rates, exchange rates, stock prices, and so on.
• In many applications, seasonal components can be represented by simple regression equations. This approach is sometimes referred to as a “seasonalized regression” or a “bimodal regression”.
• The seasonal variation of a time series is a pattern of change that recurs regularly over time.
• Seasonal variations are usually due to the differences between seasons and to festive occasions such as Easter and Christmas.
Examples include:
• Air conditioner sales in Summer
• Heater sales in Winter
• Flu cases in Winter
• Airline tickets for flights during school vacations
[Figure: Monthly Retail Sales in NSW Retail Department Stores]
3. Cyclical variation
• The cyclical component in a time series is the part of the movement in the variable which can be explained by other cyclical movements in the economy.
• In other words, this term captures longer-run fluctuations rather than seasonal patterns. It is also called the long-period (LP) effect or boom-bust process. For example, during recessions, business cycles are usually characterized by slower growth rates than before the recession started.
• Cyclical variations also have recurring patterns, but with a longer and more erratic time scale compared to seasonal variations.
• The name is quite misleading, because these cycles can be far from regular and it is usually impossible to predict just how long periods of expansion or contraction will be.
• There is no guarantee of a regularly returning pattern.
Cyclical variation
Examples include:
• Floods
• Wars
• Changes in interest rates
• Economic depressions or recessions
• Changes in consumer spending
Cyclical variation
[Figure: phases of an economic cycle]
This chart represents an economic cycle, but we know it doesn’t always go like this. The timing and length of each phase are not predictable.
4. Irregular variation
The irregular component is the part of the movement in the variable which cannot be explained by cyclical movements in the economy.
• An irregular (or random) variation in a time series occurs over varying (usually short) periods.
• It follows no pattern and is by nature unpredictable.
• It usually occurs randomly and may be linked to events that also occur randomly.
• Irregular variation cannot be explained mathematically.
Irregular variation
• If the variation cannot be accounted for by secular trend, seasonal or cyclical variation, then it is usually attributed to irregular variation.
Examples include:
– Sudden changes in interest rates
– Collapse of companies
– Natural disasters
– Sudden shifts in government policy
– Dramatic changes to the stock market
– Effect of Middle East unrest on petrol prices
[Figure: Monthly Value of Building Approvals (ACT)]
Operations on Time series analysis
(I) Models of time series include:
(i) Classification
(ii) Curve fitting
(iii) Explanative analysis
(iv) Descriptive analysis
(v) Explorative analysis
(vi) Forecasting
(vii) Intervention analysis
(viii) Segmentation
(II) Data classification:
(i) Stock time series data
(ii) Flow time series data
(III) Data variations:
(i) Functional analysis
(ii) Trend analysis
(iii) Seasonal variations
Box-Jenkins Methodology
• A time series consists of an ordered sequence of equally spaced values over time. Examples of a time series are
monthly unemployment rates, daily website visits, or stock prices every second.
• A time series can consist of the following components: • Trend • Seasonality • Cyclic • Random
• The trend refers to the long-term movement in a time series. It indicates whether the observation values are
increasing or decreasing over time. Examples of trends are a steady increase in sales month over month or an
annual decline of fatalities due to car accidents.
• The seasonality component describes the fixed, periodic fluctuation in the observations over time. As the name
suggests, the seasonality component is often related to the calendar. For example, monthly retail sales can
fluctuate over the year due to the weather and holidays.
• A cyclic component also refers to a periodic fluctuation, but one that is not as fixed as in the case of a seasonality component. For example, retail sales are influenced by the general state of the economy. Thus, a retail sales time series can often follow the lengthy boom-bust cycles of the economy.
• After accounting for the other three components, the random component is what remains. Although noise is certainly part of this random component, there is often some underlying structure to it that needs to be modeled to forecast future values of a given time series.
• Developed by George Box and Gwilym Jenkins, the Box-Jenkins methodology for time series analysis involves the following three main steps:
1. Condition data and select a model. • Identify and account for any trends or seasonality in the time series. • Examine the remaining time series and determine a suitable model.
2. Estimate the model parameters.
3. Assess the model and return to Step 1, if necessary.
ACF (Autocorrelation Function)
The autocorrelation function takes into consideration all past observations, irrespective of their effect on the future or present time period. It calculates the correlation between the time periods t and (t-k), and it includes the effects of all the lags or intervals between t and (t-k). The correlation is calculated using the Pearson correlation formula.
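As an illustrative sketch (not from the source), the lag-k autocorrelation can be computed as a plain Pearson correlation between the series and a k-shifted copy of itself:

```python
# Sketch: lag-k autocorrelation as the Pearson correlation between the
# series at time t and at time t-k. Note that statsmodels' acf() uses a
# slightly different normalization, so values can differ a little.
import numpy as np

def lag_autocorr(x, k):
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[k:], x[:-k])[0, 1]   # pair x_t with x_{t-k}

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 20, 200)) + rng.normal(0, 0.1, 200)
print([round(lag_autocorr(x, k), 2) for k in (1, 2, 3)])
```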
Autocorrelation Function (ACF)
• The term autocorrelation refers to the degree of similarity between
A) a given time series, and
B) a lagged version of itself, over
C) successive time intervals.
In other words, autocorrelation is intended to measure the relationship
between a variable’s present value and any past values that you may
have access to.
• Therefore, a time series autocorrelation attempts to measure the current
values of a variable against the historical data of that variable. It ultimately
plots one series over the other, and determines the degree of similarity
between the two.
• For the sake of comparison, autocorrelation is essentially the same process you would go through when calculating the correlation between two different time series on your own. The major difference is that autocorrelation uses the same time series twice: once in its original form, and once lagged by one or more time periods.
• Autocorrelation is also known as serial correlation, time series correlation
and lagged correlation. Regardless of how it’s being used, autocorrelation
is an ideal method for uncovering trends and patterns in time series data
that would have otherwise gone undiscovered.
Autocorrelation examples
• Example 1: Regression analysis- One prominent example of how
autocorrelation is commonly used takes the form of regression
analysis using time series data.
• Example 2: Signal processing- Autocorrelation is also a very
important technique in signal processing, which is a part of electrical
engineering that focuses on understanding more about (and even
modifying or synthesizing) signals like sound, images and sometimes
scientific measurements. In this context, autocorrelation can help
people better understand repeating events like musical beats —
which itself is important for determining the proper tempo of a song.
Many also use it to estimate a very specific pitch in a musical tone,
too.
Summary
• Autocorrelation represents the degree of similarity between a given
time series and a lagged version of itself over successive time
intervals.
• Autocorrelation measures the relationship between a variable's
current value and its past values.
• An autocorrelation of +1 represents a perfect positive correlation,
while an autocorrelation of -1 represents a perfect negative
correlation.
• Technical analysts can use autocorrelation to measure how much
influence past prices for a security have on its future price.
PACF (Partial Autocorrelation Function)
• The PACF determines the partial correlation between time periods t and t-k. It does not take into consideration all the time lags between t and t-k. For example, today's stock price may depend on the stock price 3 days ago without depending on yesterday's closing price. Hence we consider only the time lags having a direct impact on the future time period, neglecting the insignificant time lags in between the two time slots t and t-k.
How to differentiate when to use ACF and PACF?
• Take the example of sweet sales and the income generated in a village over a year. Under the assumption that there is a festival in the village every 2 months, we take the historical data of sweet sales and income generated for 12 months. If we plot the data by month, we can observe that for forecasting sweet sales we are interested only in alternate months, as the sale of sweets increases every two months. But to forecast the income generated next month, we have to take into consideration all 12 months of the last year.
• So in the above situation, we will use ACF to find the income generated in the future, but we will use PACF to find the sweets sold in the next month.
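In practice both functions are usually inspected as plots. A short sketch with statsmodels (the AR(1) data here is simulated purely for illustration):

```python
# Sketch: ACF vs PACF plots for a simulated AR(1) series using statsmodels.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(0)
e = rng.normal(size=200)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.7 * y[t - 1] + e[t]          # simulated AR(1) process

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
plot_acf(y, lags=12, ax=ax1)    # total effect: includes all in-between lags
plot_pacf(y, lags=12, ax=ax2)   # direct effect of each lag only
plt.show()
```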
Stationary Time Series
• A key role in time series analysis is played by processes whose properties, or at least some of them, do not vary over time. This property is captured in the important concept of stationarity. We then introduce the most commonly used family of linear time series models, the autoregressive integrated moving average (ARIMA) models. These models have assumed great importance in modeling real-world processes.
Time Series Models
• AR
• MA
• ARMA
• ARIMA
• AR, MA, ARMA, and ARIMA models are used to forecast the observation at (t+1) based on the historical data of previous time spots recorded for the same observation. However, it is necessary to make sure that the time series is stationary over the historical observation period. If the time series is not stationary, we can apply differencing to the records and check whether the resulting series is stationary over the time period.
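As a hedged sketch of that workflow (the helper name, cap of two differences, and 0.05 threshold are my own choices), the augmented Dickey-Fuller test from statsmodels can check stationarity, differencing once more each time the test fails:

```python
# Sketch: difference a series until the augmented Dickey-Fuller test
# rejects a unit root (i.e., the series looks stationary).
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def difference_until_stationary(series, max_d=2, alpha=0.05):
    s = pd.Series(series).dropna()
    for d in range(max_d + 1):
        if adfuller(s)[1] < alpha:   # index 1 of the result is the p-value
            return s, d              # stationary after d differences
        if d < max_d:
            s = s.diff().dropna()    # subtract the t-1 value from the t value
    return s, max_d                  # stop at max_d differences
```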
What Is an Autoregressive Model?
• A statistical model is autoregressive if it predicts future values based
on past values. For example, an autoregressive model might seek to
predict a stock's future prices based on its past performance.
• Autoregressive models predict future values based on past values.
• They are widely used in technical analysis to forecast future security
prices.
• Autoregressive models implicitly assume that the future will resemble
the past.
• Therefore, they can prove inaccurate under certain market
conditions, such as financial crises or periods of rapid technological
change.
AR (Auto-Regressive) Model
• The time period at t is impacted by the observations at various slots t-1, t-2, t-3, ….., t-k. The impact of each previous time spot is decided by the coefficient factor β at that particular period of time.
• The price of a share of any particular company X may depend on all the previous share prices in the time series. A model that runs a regression on the past values of the series to calculate the present or future values is known as an Auto-Regressive (AR) model.
Yₜ = β₁·Yₜ₋₁ + β₂·Yₜ₋₂ + β₃·Yₜ₋₃ + … + βₖ·Yₜ₋ₖ + εₜ
• Consider the example of a milk distribution company that produces milk every month in the country. We want to calculate the amount of milk to be produced in the current month considering the milk generated in the last year. We begin by calculating the PACF values of all 12 lags with respect to the current month. Only the lags whose PACF value exceeds a significance threshold are considered for the model analysis.
AR (Auto-Regressive) Model
For example, in the figure the values 1, 2, 3 up to 12 display the direct effect (PACF) of the milk production in the current month with respect to the given lag t. If we consider two significant values above the threshold, then the model will be termed AR(2).
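A minimal sketch of fitting such an AR(2) model with statsmodels' AutoReg (the series is simulated; real monthly milk figures would take its place):

```python
# Sketch: fitting the AR(2) case described above with statsmodels' AutoReg.
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima_process import arma_generate_sample

# Simulated stand-in data with two AR lags. The lag-polynomial convention
# [1, -b1, -b2] corresponds to y_t = b1*y_{t-1} + b2*y_{t-2} + e_t.
y = arma_generate_sample(ar=[1, -0.6, -0.2], ma=[1], nsample=200)

result = AutoReg(y, lags=2).fit()                 # regress y_t on its two past values
print(result.params)                              # intercept and the beta coefficients
print(result.predict(start=len(y), end=len(y)))   # forecast for the next period
```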
MA (Moving Average) Model
• The time period at t is impacted by unexpected external factors at various slots t-1, t-2, t-3, ….., t-k. These unexpected impacts are known as errors or residuals. The impact of each previous time spot is decided by the coefficient factor α at that particular period of time. The price of a share of any particular company X may depend on a company merger that happened overnight, or on the company shutting down due to bankruptcy. A model that uses the residuals or errors of the past time series to calculate the present or future values is known as a Moving Average (MA) model.
Yₜ = α₁·εₜ₋₁ + α₂·εₜ₋₂ + α₃·εₜ₋₃ + … + αₖ·εₜ₋ₖ + εₜ
• Consider the example of cake distribution on my birthday. Let's assume that your mom asks you to bring pastries to the party. Every year you misjudge the number of invitees and end up bringing more or fewer cakes than required. The difference between the actual and the expected number is the error. To avoid this error for the current year, we apply the moving average model to the time series and calculate the number of pastries needed this year based on the past collective errors. Next, calculate the ACF values of all the lags in the time series. Only the lags whose ACF value exceeds a significance threshold are considered for the model analysis.
• For example, in the figure the values 1, 2, 3 up to 12 display the total effect (ACF) of the count of pastries in the current month with respect to the given lag t, considering all the in-between lags between time t and the current month. If we consider two significant values above the threshold, then the model will be termed MA(2).
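A comparable sketch for the MA(2) case, using statsmodels' ARIMA class with order (0, 0, 2) (the data is simulated for illustration):

```python
# Sketch: fitting the MA(2) case described above; order (p, d, q) = (0, 0, 2).
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_generate_sample

y = arma_generate_sample(ar=[1], ma=[1, 0.6, 0.3], nsample=200)  # two MA lags

result = ARIMA(y, order=(0, 0, 2)).fit()
print(result.params)             # constant, the two alpha (MA) coefficients, sigma2
print(result.forecast(steps=1))  # next period's expected value
```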
ARMA (Auto Regressive Moving Average)
Model
• This model is a combination of the AR and MA models. In this model, the impact of previous lags along with the residuals is considered for forecasting the future values of the time series. Here β represents the coefficients of the AR model and α represents the coefficients of the MA model.
• Yₜ = β₁·Yₜ₋₁ + α₁·εₜ₋₁ + β₂·Yₜ₋₂ + α₂·εₜ₋₂ + β₃·Yₜ₋₃ + α₃·εₜ₋₃ + … + βₖ·Yₜ₋ₖ + αₖ·εₜ₋ₖ
• Consider the above graphs where the MA and AR values are plotted with their respective significance thresholds. Let's assume that we consider only 1 significant value from the AR model and likewise 1 significant value from the MA model. The ARMA model obtained from the combined values of the other two models will then be of order ARMA(1,1).
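In statsmodels terms this is simply an ARIMA with no differencing; a sketch of the ARMA(1,1) fit described above, on simulated data:

```python
# Sketch: ARMA(1,1) as ARIMA(1, 0, 1): one AR lag, no differencing, one MA lag.
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_generate_sample

y = arma_generate_sample(ar=[1, -0.5], ma=[1, 0.4], nsample=300)

result = ARIMA(y, order=(1, 0, 1)).fit()
print(result.summary())   # reports both the beta (AR) and alpha (MA) estimates
```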
ARIMA (Auto-Regressive Integrated Moving
Average) Model
• We know that in order to apply the various models we must first convert the series into a stationary time series. To achieve this, we apply the differencing or integration method, where we subtract the t-1 value from the t value of the time series. If, after applying the first differencing, we are still unable to get a stationary time series, we apply second-order differencing.
• The ARIMA model is quite similar to the ARMA model, other than the fact that it includes one more factor known as Integrated (I), i.e. differencing, which stands for the I in ARIMA. In short, the ARIMA model combines the number of differences applied to make the series stationary with the previous lags and the residual errors in order to forecast future values.
• Consider the graphs where the MA and AR values are plotted with their respective significance thresholds. Let's assume that we consider only 1 significant value from the AR model and likewise 1 significant value from the MA model. Also, the graph was initially non-stationary and we had to perform the differencing operation once in order to convert it into a stationary set. Hence the ARIMA model obtained from the combined values of the other two models along with the integration operator can be written as ARIMA(1,1,1).
Creating ARIMA models for time series forecasting
1. Determining model parameters
• ARIMA models have three key parameters: the order of autoregression, the degree of differencing and the order of the moving average. These parameters are represented as p, d, and q, respectively. Selecting the optimal combination is essential for effective forecasting.
2. Fitting ARIMA models
• Once you’ve determined the optimal (p, d, q) parameters, fit your ARIMA model to the training set
using statistical software or programming languages like Python or R. While fitting the model, pay
close attention to its residuals, as they provide crucial information about the model’s performance.
Ideally, the residuals should be white noise, indicating that the model has captured the underlying
structure of the data.
3. Model selection techniques
• In some cases, you may need to compare multiple ARIMA models with different parameter
combinations to identify the highest-performing model. Common model selection criteria include:
• Akaike information criterion (AIC)
• Bayesian information criterion (BIC)
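A hedged sketch of such a comparison: fit each candidate order and keep the one with the lowest AIC (swap in fit.bic for BIC; the helper name and search ranges are my own choices):

```python
# Sketch: grid-search candidate (p, d, q) orders and pick the lowest AIC.
import itertools
import warnings
from statsmodels.tsa.arima.model import ARIMA

def best_order_by_aic(y, p_max=2, d=1, q_max=2):
    best = None
    for p, q in itertools.product(range(p_max + 1), range(q_max + 1)):
        try:
            with warnings.catch_warnings():
                warnings.simplefilter("ignore")      # silence poor-fit warnings
                fit = ARIMA(y, order=(p, d, q)).fit()
        except Exception:
            continue                                 # skip orders that fail to fit
        if best is None or fit.aic < best[0]:
            best = (fit.aic, (p, d, q))
    return best   # (AIC value, winning order); lower AIC is better
```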

Evaluating ARIMA model performance
Accuracy metrics for time series forecasting
The following metrics can help assess the accuracy of your ARIMA model:
• Mean absolute error (MAE)
• Mean squared error (MSE)
• Root mean squared error (RMSE)
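All three can be computed in a few lines of NumPy; a sketch, where y_true are held-out actuals and y_pred the model's forecasts:

```python
# Sketch: MAE, MSE and RMSE for a set of forecasts.
import numpy as np

def forecast_errors(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    mae = np.mean(np.abs(y_true - y_pred))   # mean absolute error
    mse = np.mean((y_true - y_pred) ** 2)    # mean squared error
    return mae, mse, np.sqrt(mse)            # RMSE is the square root of MSE

print(forecast_errors([10, 12, 11], [9, 13, 11]))
```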
ARIMA model forecasting
1. Making predictions with ARIMA models
• Once you’ve evaluated your ARIMA model’s performance, you can use it to generate forecasts for future time periods.
2. Confidence intervals and prediction intervals
• You should consider the uncertainty associated with the predictions when generating forecasts. Confidence intervals provide a range within which the true value of the predicted variable is likely to fall, with a specified probability.
3. Visualizing forecasts
• Plot your ARIMA model’s forecasts, along with the associated confidence or prediction intervals, to visualize the model’s predictions and the associated uncertainty.
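A sketch covering steps 1-3 together, using statsmodels' get_forecast (the toy series and the (1,1,1) order are illustrative only):

```python
# Sketch: forecast 12 steps ahead with a 95% interval and plot the result.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA

y = np.cumsum(np.random.default_rng(2).normal(0.2, 1.0, 200))  # toy trending series
result = ARIMA(y, order=(1, 1, 1)).fit()

fc = result.get_forecast(steps=12)
mean = fc.predicted_mean
ci = fc.conf_int(alpha=0.05)       # lower/upper bounds at the 95% level

x = np.arange(len(y), len(y) + 12)
plt.plot(y, label="history")
plt.plot(x, mean, label="forecast")
plt.fill_between(x, ci[:, 0], ci[:, 1], alpha=0.3, label="95% interval")
plt.legend()
plt.show()
```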
Building ARIMA models in Python
ARIMA model implementation in Python
• Python’s statsmodels library provides tools for building and analyzing ARIMA models. Key functions include ARIMA() for model specification, fit() for fitting the model to the data and forecast() for generating predictions.
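A minimal sketch of those three calls in sequence (the 12-point series is a stand-in for real data):

```python
# Sketch: the ARIMA() / fit() / forecast() workflow named above.
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

series = pd.Series([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118])

model = ARIMA(series, order=(1, 1, 1))   # ARIMA(): specify the model
result = model.fit()                     # fit(): estimate the parameters
print(result.forecast(steps=3))          # forecast(): predict three periods ahead
```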
Best practices for Python-based ARIMA modeling
• When building ARIMA models in Python, adhere to the following best practices:
1. Preprocess and clean your data to ensure it’s compatible with ARIMA modeling
2. Use ACF and PACF plots to determine the optimal (p, d, q) parameters
3. Split your data into training and test sets and use cross-validation to assess model performance
4. Evaluate model accuracy using appropriate error metrics and visualizations

Real-world applications of ARIMA models
ARIMA modeling in finance, retail and healthcare
ARIMA models have found widespread use in various industries, including:
• Finance: ARIMA models are employed for forecasting stock prices, exchange rates and other financial time series data.
• Retail: Businesses use ARIMA models to forecast sales, manage inventory and optimize resource allocation.
• Healthcare: ARIMA models help predict patient admissions, medical resource utilization, and disease prevalence.
Reasons to choose the ARIMA model, and cautions
Pros
(i) Good for short-term forecasting
(ii) Needs only historical data
(iii) Can model non-stationary data
(iv) Based on the statistical concept of serial correlation, where past data points influence future data points

Cons
(i) Not built for long-term forecasting
(ii) Poor at predicting turning points
(iii) Computationally expensive
(iv) Parameters are subjective
THANK YOU
