TSA - Time Series Analysis
A time series is a sequence of data points collected or recorded at regular time intervals. Time series
analysis involves studying these data points to understand the underlying patterns, trends, and relationships.
Forecasting uses these patterns to predict future values.
Time series analysis is widely used in various fields, including finance (stock prices), economics (GDP
growth), meteorology (weather prediction), healthcare (disease trends), and business (sales forecasting).
Key concepts:
• Temporal dependence: Data points are correlated with their past values.
• Stationarity: A time series is stationary if its statistical properties (mean, variance) remain constant
over time (a stationarity test is sketched just after this list).
• Seasonality & Trend: Some time series exhibit repeating patterns (seasonality) or long-term
directional movement (trend).
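A quick practical check for stationarity is a unit-root test. Below is a minimal sketch using the Augmented Dickey-Fuller test from statsmodels; the random-walk series and seed are illustrative assumptions, not data from this material:

import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(42)
series = np.cumsum(rng.normal(size=200))  # a random walk: non-stationary by construction

stat, p_value = adfuller(series)[:2]
print(f"ADF statistic: {stat:.3f}, p-value: {p_value:.3f}")
# A p-value above 0.05 means we cannot reject the unit-root null,
# so the series is treated as non-stationary.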
Cyclic Patterns (C): Long-term fluctuations that occur at irregular intervals (e.g., business cycles). These
cycles are often driven by economic or social factors. A cycle looks like a wave but, unlike seasonality, does
not repeat at a fixed interval.
Example:
• The economy booms and crashes every few years.
• Real estate prices go up and down over time.
• The stock market experiences bull (rising) and bear (falling) phases.
Irregular Component (I): Random variations or noise in data (e.g., sudden market crashes). The irregular
component captures unpredictable events; on a graph it appears as sudden spikes or drops that follow no
pattern.
Example:
• A sudden stock market crash due to an economic crisis.
• A natural disaster like an earthquake affecting agricultural production.
• The COVID-19 pandemic causing a drop in airline travel.
• Autocorrelation (ACF): Measures the correlation between a time series and a lagged version of itself.
1. If ACF(k) is high, the data points at time t and t−k are strongly related.
2. ACF helps identify seasonality → if ACF values are high at specific lags (e.g., every 12
months), it suggests a yearly pattern.
3. The ACF plot (correlogram) shows autocorrelation values for different lags.
• Partial Autocorrelation (PACF): Measures the correlation between a time series value and a lagged
value after removing the effects of the intermediate lags. In simple terms, PACF tells us:
o How much influence a past value has on the present value, ignoring the effects of other
past values.
o The correct number of lagged values (p) to use for an AR model in forecasting.
1. PACF removes the indirect influence of previous lags, so it only shows the direct effect of
past values on the present.
2. The PACF plot helps choose the AR (AutoRegressive) model order for time series
forecasting.
3. If PACF is significant at lag k but drops off after, it suggests that an AR(k) model is suitable.
Plots Used:
• Autocorrelation Function (ACF): Shows correlation at different lags; its cutoff point helps identify the MA order (q).
• Partial Autocorrelation Function (PACF): Its cutoff point helps identify the AR order (p).
A plotting sketch for both follows below.
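The snippet below is a minimal sketch of how these two plots are drawn, using statsmodels' plot_acf and plot_pacf on a simulated AR(2) series (the simulated data and its coefficients are assumptions for illustration; for an AR(2) process the PACF should cut off after lag 2):

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(0)
series = np.zeros(300)
for t in range(2, 300):  # simulate an AR(2) process
    series[t] = 0.6 * series[t - 1] - 0.3 * series[t - 2] + rng.normal()

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(series, lags=24, ax=axes[0])   # correlogram
plot_pacf(series, lags=24, ax=axes[1])  # should cut off after lag 2
plt.tight_layout()
plt.show()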
Q.1 A company records its quarterly sales (in million units) over five years:
[50, 55, 70, 90, 52, 57, 75, 95, 54, 60, 80, 100, 56, 62, 85, 110, 58, 65, 90, 120]
Identifying Trend
A trend refers to a consistent increase or decrease in values over time.
• Observing the data, we see that sales are increasing from 50 to 120 over five years.
• The growth is gradual (small increments per quarter).
• Thus, a positive trend exists.
Identifying Seasonality
Seasonality refers to patterns repeating at regular intervals.
• Within each year, sales are lowest in Q1 and highest in Q4 (e.g., 50 → 90 in year one, 58 → 120 in year five).
• This four-quarter pattern repeats every year, so seasonality exists.
The decomposition sketch below separates the trend and seasonal components.
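A minimal decomposition sketch of the quarterly sales above, using statsmodels' seasonal_decompose; the 2019 start date is an assumed placeholder, since the source gives no dates:

import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

sales = [50, 55, 70, 90, 52, 57, 75, 95, 54, 60, 80, 100,
         56, 62, 85, 110, 58, 65, 90, 120]
index = pd.date_range("2019-01-01", periods=len(sales), freq="QS")  # assumed dates
series = pd.Series(sales, index=index)

result = seasonal_decompose(series, model="additive", period=4)
print(result.trend)     # steadily rising -> positive trend
print(result.seasonal)  # repeating four-quarter pattern -> seasonality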
Q.2 Given monthly sales: [120, 135, 150, 160, 175, 190]
Calculate the 3-month moving average forecast for the next month.
Compute Moving Averages
A 3-month moving average is calculated as:
MA_t = (X_{t-2} + X_{t-1} + X_t) / 3
• Forecast for the 7th month = (160 + 175 + 190) / 3 = 175
• Actual sales = 200, so the forecast underestimated by 25 units.
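The same computation can be done with a pandas rolling mean; this is a minimal sketch of the worked example above:

import pandas as pd

sales = pd.Series([120, 135, 150, 160, 175, 190])
ma3 = sales.rolling(window=3).mean()  # moving averages for months 3..6
forecast = ma3.iloc[-1]               # forecast for month 7
print(forecast)                       # (160 + 175 + 190) / 3 = 175.0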
5. Lag Plots
Why Use?
• Determines if a time series has a pattern or is random.
• Helps identify autocorrelation (how past values influence future values).
• Assists in choosing ARIMA models for forecasting.
When to Use?
• While checking if today’s stock price is related to yesterday’s price.
• While analysing seasonal effects in sales.
• While selecting appropriate forecasting models like ARIMA that rely on autocorrelation.
Example:
• A lag plot for monthly airline passenger data can confirm seasonality.
• A lag plot of daily COVID-19 cases can help in understanding infection trends.
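A minimal lag-plot sketch using pandas' lag_plot helper; the random-walk "prices" series is simulated for illustration:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import lag_plot

rng = np.random.default_rng(1)
prices = pd.Series(100 + np.cumsum(rng.normal(size=250)))  # simulated daily "prices"

lag_plot(prices, lag=1)  # scatter of y(t) vs y(t+1)
plt.show()
# Points clustering along the diagonal indicate strong lag-1
# autocorrelation; a shapeless cloud indicates randomness.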
Time Series Transformations
Transformations are essential when working with time series data, as they help:
1. Stabilize variance (Log Transformation)
2. Remove trends and stationarize data (Differencing)
3. Make data normal (Box-Cox Transformation)
4. Scale data for machine learning models (Normalization)
Why Use Transformations?
• Helps remove heteroscedasticity (changing variance).
• Improves model accuracy.
• Makes forecasting models more effective.
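The four transformations listed above can be sketched in a few lines; the synthetic series below (positive, trending, with growing variance) is an assumption for illustration:

import numpy as np
from scipy.stats import boxcox

rng = np.random.default_rng(2)
t = np.arange(1, 121)
data = t * np.exp(0.1 * rng.normal(size=t.size))  # trending, heteroscedastic, strictly positive

log_data = np.log(data)      # 1. log transform: stabilizes variance
diff_data = np.diff(data)    # 2. differencing: removes the trend
bc_data, lam = boxcox(data)  # 3. Box-Cox: requires strictly positive values
norm_data = (data - data.min()) / (data.max() - data.min())  # 4. min-max normalization
print(f"Estimated Box-Cox lambda: {lam:.3f}")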
Confidence Interval
A range around a point forecast that is expected to contain the true value with a stated probability (e.g., a
95% confidence interval).
R-squared (R²)
The proportion of the variance in the observed data that the model explains; values closer to 1 indicate a
better fit.
ARMA Model
The Autoregressive Moving Average (ARMA) model is a widely used statistical model for analyzing and
forecasting stationary time series data. It combines two components:
1. Autoregressive (AR) Model – Uses past values of the time series to predict future values.
2. Moving Average (MA) Model – Uses past error terms (shocks) to improve predictions.
Since ARMA assumes stationarity, it is suitable for time series data with no trend or seasonal components. If the
data is non-stationary, it must be transformed using differencing before applying ARMA.
Key Applications of ARMA Model
• Stock Market Prediction – Forecasting short-term price movements.
• Economic Forecasting – Modeling GDP growth rates and inflation.
• Signal Processing – Filtering noise from signals.
• Weather Forecasting – Analyzing temperature variations.
Components of ARMA
An ARMA(p, q) model combines:
• p (AutoRegression - AR): Number of past values (lags) used for prediction.
• q (Moving Average - MA): Number of past error terms used to improve predictions.
Advantages
• Works well for stationary time series.
• Effective for short-term forecasting.
• Less computationally intensive compared to deep learning models.
Limitations
• Cannot handle non-stationary data – Requires transformation.
• Cannot capture seasonality – Use SARIMA instead for seasonal data.
• Model selection is manual – Requires tuning p and q based on ACF/PACF plots.
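A minimal fitting sketch: recent statsmodels versions have no separate ARMA class, so an ARMA(p, q) is fit as an ARIMA(p, 0, q); the simulated ARMA(1,1) data and its coefficients are illustrative assumptions:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
eps = rng.normal(size=500)
y = np.zeros(500)
for t in range(1, 500):  # simulate ARMA(1,1): y_t = 0.5*y_{t-1} + eps_t + 0.4*eps_{t-1}
    y[t] = 0.5 * y[t - 1] + eps[t] + 0.4 * eps[t - 1]

model = ARIMA(y, order=(1, 0, 1)).fit()  # d=0, so this is an ARMA(1,1)
print(model.forecast(steps=5))           # short-term forecast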
ARIMA Model
The Autoregressive Integrated Moving Average (ARIMA) model is used for time series forecasting, particularly
for non-stationary data. ARIMA extends the Autoregressive Moving Average (ARMA) model by introducing an
Integrated (I) component, which helps in making the series stationary through differencing.
ARIMA is suitable for datasets with trends but no seasonality. If seasonality exists, we use the Seasonal ARIMA
(SARIMA) model.
Key Applications of ARIMA Model
• Stock Market Forecasting – Predicting price trends.
• Economic Data Modeling – Forecasting GDP, inflation rates.
• Sales Forecasting – Estimating future product demand.
• Weather Forecasting – Predicting temperature variations.
Advantages
• Handles Trend – Unlike ARMA, ARIMA can model non-stationary data.
• Good for Short-Term Forecasting – Works well for economic and sales forecasting.
• Widely Used – Many industries rely on ARIMA models.
Limitations
• Does Not Handle Seasonality – Requires SARIMA for seasonal data.
• Sensitive to Parameter Selection – Requires tuning of p, d, q.
• Computationally Intensive – Larger datasets require more processing power.
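A minimal ARIMA fitting sketch on simulated trending data; the random walk with drift is an illustrative assumption:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(4)
trend = np.cumsum(0.5 + rng.normal(size=300))  # random walk with drift: non-stationary

model = ARIMA(trend, order=(1, 1, 1)).fit()    # d=1 differences once to remove the trend
print(model.forecast(steps=10))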
SARIMA Model
The Seasonal Autoregressive Integrated Moving Average (SARIMA) model is an extension of the ARIMA model
designed to handle seasonal time series data. While ARIMA works well for non-seasonal time series, it does not
explicitly model seasonality. SARIMA addresses this limitation by incorporating seasonal autoregressive, seasonal
differencing, and seasonal moving average components.
SARIMA is particularly useful for datasets with patterns that repeat periodically, such as:
• Monthly sales data (retail sales, airline passengers).
• Daily temperature data (seasonal climate changes).
• Quarterly GDP growth (economic trends).
Advantages
• Handles both trend and seasonality effectively.
• Works well with time series data that exhibits seasonal fluctuations.
• Can be fine-tuned using ACF and PACF plots.
Limitations
• Requires manual parameter tuning.
• Computationally expensive for long time series datasets.
• May not work well for data with sudden structural changes.
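A minimal SARIMA sketch using statsmodels' SARIMAX class on simulated monthly data with a 12-month cycle; the series is an illustrative assumption:

import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(5)
months = np.arange(240)
y = 10 + 0.05 * months + 3 * np.sin(2 * np.pi * months / 12) + rng.normal(size=240)

model = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12)).fit(disp=False)
print(model.forecast(steps=12))  # forecast one full seasonal cycle ahead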
Multivariate time series
A multivariate time series (MTS) is a collection of multiple time-dependent variables observed over time. Unlike a
univariate time series, which tracks only one variable, MTS models capture the relationships among multiple
variables, improving forecasting accuracy and revealing how the variables interact.
Examples of Multivariate Time Series:
• Economics: GDP, inflation rate, and unemployment rate over time.
• Finance: Stock prices, trading volume, and interest rates.
• Weather Forecasting: Temperature, humidity, and wind speed.
• Healthcare: Blood pressure, heart rate, and oxygen levels over time.
VARIMA Model
The Vector Autoregressive Integrated Moving Average (VARIMA) model is an extension of the VAR (Vector
AutoRegression) and VARMA (Vector AutoRegressive Moving Average) models that accounts for non-
stationarity in multivariate time series data.
• VARIMA models multiple time-dependent variables that influence each other over time.
• It includes differencing (Integrated - I component) to handle non-stationary data.
• It combines autoregressive (AR) and moving average (MA) components for multivariate data analysis.
Example Use Cases of VARIMA:
• Economics: Forecasting GDP, interest rates, and inflation together.
• Finance: Predicting stock market trends using stock prices, exchange rates, and bond yields.
• Climate Science: Studying temperature, humidity, and air pressure trends.
• Energy Forecasting: Predicting electricity consumption based on past consumption, weather data, and
industrial demand.
Components of VARIMA
VARIMA is an extension of the VARMA model with an additional integration (I) component for handling non-
stationary data.
A VARIMA (p, d, q) model consists of:
• p (AutoRegression - AR): Number of past values (lags) used for prediction.
• d (Integration - I): Number of times the series is differenced to make it stationary.
• q (Moving Average - MA): Number of past error terms used to improve predictions.
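statsmodels does not ship a dedicated VARIMA class; a common workaround, sketched below, is to difference each series (the I step, d = 1) and fit a VARMA model with VARMAX on the differenced data. The "gdp" and "inflation" series here are simulated placeholders, not real data:

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.varmax import VARMAX

rng = np.random.default_rng(6)
n = 300
data = pd.DataFrame({
    "gdp": np.cumsum(0.3 + rng.normal(size=n)),        # illustrative non-stationary series
    "inflation": np.cumsum(0.1 + rng.normal(size=n)),
})

diff = data.diff().dropna()                         # d = 1: difference to stationarity
model = VARMAX(diff, order=(1, 1)).fit(disp=False)  # VARMA(p=1, q=1) on differenced data
print(model.forecast(steps=5))                      # forecasts are in differenced units

Note that the forecasts come out in differenced units; to recover level forecasts, cumulatively sum them back from the last observed values (integrating undoes the differencing).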