Time Series m3
Module 3
Introduction
• Time series analysis is a specific way of analyzing a sequence of data
points collected over an interval of time.
• In time series analysis, analysts record data points at consistent
intervals over a set period of time rather than just recording the data
points randomly.
Overview of Time Series Analysis
• A time series is a sequence of observations of categorical or numeric variables indexed by a
date or timestamp. A clear example of time series data is the time series of a stock price. In the
table below, we can see the basic structure of time series data. In this case the observations are
recorded every hour.

Timestamp               Stock Price
2015-10-11 09:00:00     100
2015-10-11 10:00:00     110
2015-10-11 11:00:00     105
2015-10-11 12:00:00     90
2015-10-11 13:00:00     120
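A timestamp-indexed series like the one above maps directly onto a pandas Series. The sketch below rebuilds the table (assuming pandas is available; the values are the illustrative ones from the table):

```python
import pandas as pd

# Hourly stock prices from the table above (illustrative values).
prices = pd.Series(
    [100, 110, 105, 90, 120],
    index=pd.date_range("2015-10-11 09:00", periods=5, freq="h"),
    name="stock_price",
)

# A DatetimeIndex allows slicing directly by time.
morning = prices["2015-10-11 09:00":"2015-10-11 11:00"]
print(morning)
```

The date-based index is what distinguishes a time series from an ordinary list of numbers: it records *when* each observation occurred and makes time-based slicing and resampling possible.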
[Chart: a time series decomposed into trend and seasonal components, plotted over years 1–13]
1. Secular Trend
The trend component is useful in predicting future
movements. Over a long period of time, the trend
shows whether the data tends to increase or decrease.
The term “trend” refers to an average, long-term,
smooth tendency. Not all increases or decreases have
to occur simultaneously. Different sections of time
show varying tendencies in terms of trends that are
increasing, decreasing, or stable. There must, however,
be an overall upward, downward, or stable trend.
2. Seasonal Variation
The seasonal component of a time series is the variation in some variable due to some
predetermined patterns in its behavior. This definition can be used for any type of time series
including individual commodity price quotes, interest rates, exchange rates, stock prices, and so
on.
• In many applications, seasonal components can be represented by simple regression
equations with seasonal dummy variables. This approach is sometimes referred to as a
“seasonalized regression”.
The seasonal variation of a time series is a pattern of change that recurs regularly over time.
Seasonal variations are usually due to the differences between seasons and to festive occasions
such as Easter and Christmas.
Examples include:
• Air conditioner sales in Summer
• Heater sales in Winter
• Flu cases in Winter
• Airline tickets for flights during school vacations
[Chart: Monthly Retail Sales in NSW Retail Department Stores]
3. Cyclical variation
• The cyclical component in a time series is the part of the movement in the
variable which can be explained by other cyclical movements in the
economy.
• In other words, this term captures recurring patterns tied to the business cycle. It is
also called the long-period (LP) effect or boom-bust process. For example,
during recessions, business cycles are usually characterized by slower
growth rates than before the recession started.
Cyclical variations also have recurring patterns but with a longer and more
erratic time scale compared to Seasonal variations.
The name is quite misleading because these cycles can be far from regular
and it is usually impossible to predict just how long periods of expansion or
contraction will be.
There is no guarantee of a regularly returning pattern.
Examples include:
• Floods
• Wars
• Changes in interest rates
• Economic depressions or recessions
• Changes in consumer spending
[Chart: an idealized economic cycle]
This chart represents an idealized economic cycle, but we know it doesn’t always go
like this: the timing and length of each phase are not predictable.
4. Irregular variation
The irregular component is the part of the movement in the variable which
cannot be explained by the trend, seasonal, or cyclical components; it is the
unpredictable residual that remains after those components are removed.
ARIMA (AutoRegressive Integrated Moving Average) Models
• ARIMA models have three key parameters: the order of autoregression, the degree of differencing
and the order of the moving average. These parameters are represented as p, d, and q,
respectively. Selecting the optimal combination is essential for effective forecasting.
• Once you’ve determined the optimal (p, d, q) parameters, fit your ARIMA model to the training set
using statistical software or programming languages like Python or R. While fitting the model, pay
close attention to its residuals, as they provide crucial information about the model’s performance.
Ideally, the residuals should be white noise, indicating that the model has captured the underlying
structure of the data.
3. Model selection techniques
• In some cases, you may need to compare multiple ARIMA models with different parameter
combinations to identify the highest-performing model. Common model selection criteria include
the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC); lower
values indicate a better balance between goodness of fit and model complexity. These metrics
can help assess the relative quality of your ARIMA models.
• Once you’ve evaluated your ARIMA model’s performance, you can use it to generate forecasts for
future time periods.
• You should consider the uncertainty associated with the predictions when generating forecasts.
Confidence intervals provide a range within which the true value of the predicted variable is likely to
fall, with a specified probability.
3. Visualizing forecasts
• Plot your ARIMA model’s forecasts, along with the associated confidence or prediction intervals, to
visualize the model’s predictions and the associated uncertainty.
Building ARIMA models in Python
• Python’s statsmodels library provides tools for building and analyzing ARIMA models. Key functions
include ARIMA() for model specification, fit() for fitting the model to the data and forecast() for
generating predictions.
• When building ARIMA models in Python, adhere to the following best practices:
1. Preprocess and clean your data to ensure it’s compatible with ARIMA modeling
2. Use ACF and PACF plots to determine the optimal (p, d, q) parameters
3. Split your data into training and test sets and use cross-validation to assess model
performance
Applications of ARIMA models
• Finance
ARIMA models are employed for forecasting stock prices, exchange rates and other
financial time series data.
• Retail
Businesses use ARIMA models to forecast sales, manage inventory and optimize resource
allocation.
• Healthcare
ARIMA models help predict patient admissions, medical resource utilization, and disease
prevalence.
Reasons to choose the ARIMA model, and cautions
Pros
(i) Good for short-term forecasting.
(ii) Requires only historical data.
(iii) Can model non-stationary data (via differencing).
(iv) Based on the statistical concept of serial correlation, where past data points influence
future data points.
Cons
(i) Not built for long-term forecasting.
(ii) Poor at predicting turning points.
(iii) Computationally expensive.
(iv) Parameter selection is subjective.
THANK YOU