5. Time Series Analysis
A series of data points recorded over a specified period of time is called time-series data.
Time-series analysis is a set of techniques for analyzing such data and extracting meaningful
statistics and other characteristics from it. One of the major objectives of the analysis is to
forecast future values. Forecasting involves extrapolating beyond the observed data, which is
inherently difficult, but a forecast reported together with an estimate of its uncertainty can
be extremely valuable.
Time series analysis is a statistical method for analysing past data over a given duration of time
in order to forecast the future. A time series is an ordered sequence of observations, usually taken
at equally spaced intervals. To understand time series data and its analysis, consider the example
of airline passenger data, which records the count of passengers over a period of time.
Exploratory Analysis:
The first step is exploratory analysis, carried out by plotting a line chart of the count of
passengers against time. Figure 1 shows the count of passengers on the y-axis and time on the
x-axis, where each interval can be considered a year. Three features stand out:
1. TREND: An increasing or decreasing pattern observed over a period of time. In this case, a
gradually increasing underlying trend is observed, i.e. the count of passengers has increased over
time.
2. SEASONALITY: A similar pattern that repeats after a fixed interval of time. In the airline
passenger example, we can observe a repeating pattern with a distinct high and low point that is
visible in every interval.
3. HETEROSCEDASTICITY: Non-constant variance, i.e. the spread of the data around its mean changes
over time. In the plot, the variance increases steadily over time.
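The three features can be reproduced on a synthetic series. The sketch below does not use the actual airline data; it assumes a hypothetical multiplicative model (a rising trend multiplied by a yearly seasonal factor), and the function names are illustrative. It compares the within-year spread on the raw scale and after a log transform, a common way to stabilise a variance that grows with the level.

```python
import math
import statistics

def synthetic_passengers(n_years=12, period=12):
    """Hypothetical monthly series: rising trend times a seasonal factor."""
    series = []
    for t in range(n_years * period):
        trend = 100 + 3 * t                                        # upward trend
        seasonal = 1 + 0.2 * math.sin(2 * math.pi * t / period)    # yearly pattern
        series.append(trend * seasonal)                            # swings grow with the level
    return series

def yearly_spread(series, period=12):
    """Standard deviation within each consecutive block of `period` points."""
    return [statistics.pstdev(series[i:i + period])
            for i in range(0, len(series), period)]

y = synthetic_passengers()
raw_spread = yearly_spread(y)
log_spread = yearly_spread([math.log(v) for v in y])

print("spread in first / last year (raw):", round(raw_spread[0], 1), round(raw_spread[-1], 1))
print("spread in first / last year (log):", round(log_spread[0], 3), round(log_spread[-1], 3))
```

On the raw scale the spread grows several-fold from the first year to the last, which is the heteroscedasticity described above; on the log scale it stays much closer to constant.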
Figure 2 shows the graph of the Airline passenger data and the decomposed components
(highlighted on the left) that we discussed above. Namely:
Trend
Seasonality
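One way such a decomposition can be computed is the classical additive method: estimate the trend with a centred moving average, then average the detrended values at each position in the cycle to get the seasonal component. The sketch below is a simplified version under that assumption, run on hypothetical synthetic data; the function names are illustrative and it is not necessarily the method used to produce Figure 2.

```python
import math

def centered_moving_average(y, period):
    """Trend estimate; None where the window does not fit."""
    n, half = len(y), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half:t + half + 1]
        if period % 2 == 0:
            # even period: end points get half weight so the window is centred on t
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend

def decompose_additive(y, period):
    """Split y into trend + seasonal + residual (needs at least two full cycles)."""
    trend = centered_moving_average(y, period)
    buckets = [[] for _ in range(period)]
    for t, (value, tr) in enumerate(zip(y, trend)):
        if tr is not None:
            buckets[t % period].append(value - tr)     # detrended value
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period
    index = [s - offset for s in raw]                  # seasonal indices sum to zero
    seasonal = [index[t % period] for t in range(len(y))]
    residual = [None if tr is None else value - tr - s
                for value, tr, s in zip(y, trend, seasonal)]
    return trend, seasonal, residual

# Hypothetical monthly data: linear trend plus a yearly sine pattern.
y = [50 + 2 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(72)]
trend, seasonal, residual = decompose_additive(y, period=12)
print("trend at t=36:", round(trend[36], 3))
print("seasonal indices:", [round(s, 2) for s in seasonal[:12]])
```

Because the synthetic series contains no noise, the residual is essentially zero here; on real data it holds whatever the trend and seasonal components do not explain.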
Time series data is generated in abundance across a variety of fields, so time series analysis
has many applications. Let us try to understand the importance of time series analysis in
different areas.
2. Finance: Widely used to understand stock market fluctuations, market volatility, yield
management, etc.
3. Social Science: Social scientists can study birth rates or death rates over a period of time
and come up with schemes accordingly.
5. Environmental Science: Environmental time series data can help us explain the rise in
temperature over the past few years. The plot shows the temperature data over a period of time.
6. Sales forecasting: Forecasting sales figures helps in understanding the company's performance.
The plot shows earnings with respect to time.
Time Series Analysis is needed to predict the future based on past data values which are
mostly dependent on time. It is used by researchers and executives to predict sales, price,
policies, and production.
KEY TAKEAWAYS
A time series is a data set that tracks a sample over time.
In particular, a time series allows one to see what factors influence certain variables from
period to period.
Time series analysis can be useful to see how a given asset, security, or economic variable
changes over time.
Forecasting methods using time series are used in both fundamental and technical analysis.
Although cross-sectional data is seen as the opposite of time series, the two are often used
together in practice.
Trend
Seasonality
Irregularity
Cyclic
Trend: A trend is a movement to relatively higher or lower values over a long period of
time. When a time series shows a general pattern that is upward, we call it an Up-Trend, and
when it shows a downward pattern, we call it a Down-Trend. When the series stays roughly
level, with no upward or downward movement, we call it a horizontal trend.
For example: A new residential site is being built, and people are moving there. You open
a hardware shop there, and at the beginning everyone buys hardware. So, the sales of the
shop are high, or we can say the trend is upward. But after some time, when everyone has
their own hardware and every house is occupied, the trend may go down.
Let's say that the sales are up for the first two years, and then they go down.
Seasonality: It’s a repeating pattern within a fixed time period. For example, Diwali is
celebrated all over India in the months of either October or November. Now, the sales of
crackers in these months are very high as compared to other months of the year. This has
been noticed for the past two years, five years, ten years, and so on, so it’s a repeating
pattern within a fixed time period, while in trend this is not the case. Taking one more
example of ice cream, the sales of ice cream go comparatively higher in summers than in
winters, so this is a seasonality again.
Irregularity: This is also known as noise. It is erratic in nature and follows no systematic
pattern. Irregularity typically occurs for a brief period and does not repeat. For example,
COVID-19 emerged suddenly. During the COVID pandemic, sales of sanitizers and masks
were high, but after some time demand for these products fell. This all happened
erratically: you cannot know in advance how many sales will occur, so this represents
random variation, which is known as irregularity.
Cyclic: These are repeating up and down movements whose duration is not fixed and often
extends beyond a year. Cyclic movements have no fixed period: they can recur within six
months, a year, or a decade. Because their timing keeps changing, they are much harder to
predict than seasonal patterns.
In Time Series Analysis, ARIMA stands for Auto-Regressive Integrated Moving Average. It is
used to predict the future values of time series using historical data.
Integration: It refers to differencing, i.e. taking the difference between the current
observation and the previous observation. It is used to make the time series stationary.
Rather than writing out the full model each time, an ARIMA model is summarised by a small
set of parameters, written ARIMA(p, d, q). The parameters are:
p: Number of previous (lagged) values used for each period. It is the order of the
autoregressive part.
d: Number of times the data is differenced to make it stationary, i.e. how many times the
integration step was applied.
q: Number of past forecast errors used. It is the order of the moving-average part.
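The role of d can be illustrated with a few lines of code. This is a minimal sketch using hypothetical data; `difference` is an illustrative helper, not part of any library. A series with a linear trend becomes constant after one round of differencing, while a quadratic trend needs two.

```python
def difference(y, d=1):
    """Apply first-order differencing d times (returns a new list)."""
    for _ in range(d):
        y = [b - a for a, b in zip(y, y[1:])]
    return y

linear = [5 + 3 * t for t in range(8)]       # linear trend
quadratic = [t * t for t in range(8)]        # quadratic trend

print(difference(linear, d=1))       # [3, 3, 3, 3, 3, 3, 3]
print(difference(quadratic, d=1))    # [1, 3, 5, 7, 9, 11, 13]
print(difference(quadratic, d=2))    # [2, 2, 2, 2, 2, 2]
```

Each round of differencing shortens the series by one value, which is one reason d is kept as small as possible in practice.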
Moving Average: A moving average is a statistical method that uses a continually updated
average to help reduce noise. It uses the average value over a specific window of time. You
achieve this by taking successive groups of data points and finding their averages.
First, you take a group of data points and average them. You find the next average by
dropping the first value of the group and including the next value in the series. Note that
the moving-average part of an ARIMA model is related but different: it models the current
value in terms of past forecast errors rather than averaging raw values.
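The sliding update described above (drop the oldest value, include the next one) can be sketched as follows. The function name and the sample values are hypothetical.

```python
def moving_average(y, window):
    """Simple moving average using a running total."""
    if window < 1 or window > len(y):
        raise ValueError("window must be between 1 and len(y)")
    total = sum(y[:window])                  # first group of data points
    averages = [total / window]
    for i in range(window, len(y)):
        total += y[i] - y[i - window]        # include the next value, drop the oldest
        averages.append(total / window)
    return averages

values = [10, 12, 11, 15, 14, 18]
print(moving_average(values, window=3))
```

Keeping a running total means each new average costs one addition and one subtraction, regardless of the window length.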
Time Series Analysis Techniques:
Analyzing time series data involves various techniques, from linear systems analysis to
nonlinear dynamics and rule induction. Let's break down each approach:
1. Linear Systems Analysis:
- In linear systems analysis, the focus is on understanding the behaviour of a system using
linear mathematical models.
- Time series data is examined in terms of linear relationships between variables over time.
- Techniques like autoregressive integrated moving average (ARIMA) modeling and linear
regression are commonly used.
- Key concepts include stationarity, autocorrelation, and the identification of trends and
seasonality.
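As a small illustration of the linear approach, the sketch below fits a straight-line trend to a series by ordinary least squares and extrapolates it one step ahead. The data and the function name are hypothetical, and the time index 0, 1, 2, ... is assumed as the only explanatory variable.

```python
def fit_linear_trend(y):
    """Least-squares fit of y[t] = intercept + slope * t for t = 0, 1, 2, ..."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

# Hypothetical series: a line with slope 1.5 plus a small alternating disturbance.
y = [20 + 1.5 * t + (1 if t % 2 else -1) for t in range(24)]
intercept, slope = fit_linear_trend(y)

forecast = intercept + slope * len(y)                            # one step ahead
detrended = [v - (intercept + slope * t) for t, v in enumerate(y)]

print("estimated slope:", round(slope, 3))
print("one-step-ahead forecast:", round(forecast, 2))
```

The detrended values are what remains once the fitted line is removed, and they are a natural input for checking stationarity or autocorrelation.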
2. Nonlinear Dynamics:
- Techniques such as chaos theory, fractal analysis, and phase space reconstruction are
used to understand the underlying dynamics of the system.
- Nonlinear time series analysis involves methods like Lyapunov exponents, attractor
reconstruction, and surrogate data testing to characterize the dynamics of the system.
- It's particularly useful for systems with complex, unpredictable behavior, such as weather
patterns or financial markets.
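To make the idea of a Lyapunov exponent concrete, the sketch below estimates it for the logistic map x -> r * x * (1 - x), a standard textbook example that is assumed here for illustration and is not taken from the text above. A positive exponent means nearby trajectories separate exponentially (chaos); a negative one means they converge.

```python
import math

def logistic_map(x, r):
    return r * x * (1.0 - x)

def lyapunov_exponent(r, x0=0.2, n_steps=20000, burn_in=500):
    """Average of log|f'(x)| along a trajectory of the logistic map."""
    x = x0
    for _ in range(burn_in):                 # discard the transient
        x = logistic_map(x, r)
    total = 0.0
    for _ in range(n_steps):
        derivative = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(derivative, 1e-300))   # guard against log(0)
        x = logistic_map(x, r)
    return total / n_steps

print("r = 3.2 (periodic):", round(lyapunov_exponent(3.2), 3))
print("r = 4.0 (chaotic): ", round(lyapunov_exponent(4.0), 3))
```

For r = 4 the exact value is known to be ln 2, so the estimate should come out close to 0.69; for r = 3.2 the map settles onto a period-2 cycle and the exponent is negative.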
3. Rule Induction:
- Rule induction involves extracting meaningful patterns or rules from the time series data.
- It aims to discover relationships or rules that describe the behaviour of the system.
- Techniques like decision trees, association rule mining, and genetic algorithms are
commonly employed.
- Rule-based models can provide interpretable insights into the underlying dynamics of the
system and help in forecasting or decision-making.
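A decision tree with a single split (a decision stump) is about the simplest rule that can be induced from a series. The sketch below searches for the threshold on the current value that best predicts whether the series rises on the next step. The data and function names are hypothetical, and a practical system would use more features and a held-out test set.

```python
def make_examples(y):
    """Pair each value with a label: does the series rise on the next step?"""
    return [(y[t], y[t + 1] > y[t]) for t in range(len(y) - 1)]

def induce_stump(examples):
    """Find the rule 'value < threshold -> rises (or does not rise)' with the most hits."""
    values = sorted({x for x, _ in examples})
    best_rule, best_correct = None, -1
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2             # candidate split between neighbours
        for rise_if_below in (True, False):
            correct = sum(
                ((x < threshold) == rise_if_below) == label
                for x, label in examples
            )
            if correct > best_correct:
                best_rule, best_correct = (threshold, rise_if_below), correct
    return best_rule, best_correct / len(examples)

# Hypothetical series in which low values tend to be followed by high ones.
series = [2, 8, 3, 9, 1, 7, 4, 6, 2, 9, 3, 8]
(threshold, rise_if_below), accuracy = induce_stump(make_examples(series))
outcome = "rises" if rise_if_below else "does not rise"
print(f"IF value < {threshold} THEN the series {outcome} next (accuracy {accuracy:.0%})")
```

The result is a single readable IF-THEN rule, which is the main attraction of rule-based models compared with less transparent alternatives.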
When analyzing time series data, it's often beneficial to employ a combination of these
approaches. Linear systems analysis provides a good starting point, especially for
understanding basic trends and patterns. However, nonlinear dynamics and rule induction
techniques are essential for capturing more complex behavior and uncovering hidden
relationships within the data. Integrating these methods can lead to a more comprehensive
understanding of the dynamics driving the time series.
Time Series Analysis Techniques: Linear vs. Nonlinear and Rule Induction
Time series data refers to a sequence of measurements over time, like stock prices, weather
patterns, or patient health readings. Analyzing these sequences helps us understand trends,
forecast future values, and uncover hidden patterns. Here, we'll explore three approaches for
time series analysis:
1. Linear Systems Analysis:
Focuses on: Linear systems analysis assumes the system generating the time series is linear.
Linear systems have proportional cause-and-effect relationships. For example, if you double
the input to a linear system, the output will also double.
Techniques: Common techniques include:
Linear Models: This approach assumes that the relationship between variables in the time
series can be adequately described using linear equations.
Autoregressive Integrated Moving Average (ARIMA): ARIMA models are widely used for
time series forecasting. They involve parameters for autoregression, differencing, and moving
averages, making them suitable for capturing trends; the seasonal extension (SARIMA) also
handles seasonal patterns.
Linear Regression: Linear regression is another technique used for modeling time series
data. It identifies linear relationships between the dependent variable (the time series) and
one or more independent variables.
Stationarity: A crucial concept in time series analysis, stationarity implies that the statistical
properties of the data, such as mean and variance, remain constant over time. Stationarity is
often a prerequisite for applying linear models effectively.
Autocorrelation: Autocorrelation measures the correlation between a time series and its
lagged values. In linear systems analysis, autocorrelation functions help identify the presence
of temporal dependencies and inform the choice of appropriate models.
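The autocorrelation at lag k can be computed directly from its definition. The sketch below uses the common sample estimator (deviations from the overall mean, normalised by the lag-0 sum of squares) on a hypothetical series with a period of 12; the function name is illustrative.

```python
import math

def autocorrelation(y, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

# Hypothetical series that repeats every 12 steps.
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]
acf = autocorrelation(series, max_lag=12)

for lag in (0, 6, 12):
    print(f"lag {lag:2d}: {acf[lag]:+.2f}")
```

The strong positive value at lag 12 and the strong negative value at lag 6 reflect the 12-step cycle; on real data, peaks like these suggest which seasonal period or lag order a model should include.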
Advantages: Linear models are interpretable, meaning the coefficients have a clear meaning
in relation to the data. They're also computationally efficient for large datasets.
Disadvantages: Linear models struggle to capture complex non-linear relationships that
might exist in real-world time series data.
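To show how the autoregressive and differencing parts of ARIMA fit together, the sketch below hand-rolls an ARIMA(1, 1, 0)-style forecast: difference the series once, fit an AR(1) model to the differences by least squares, then add the predicted changes back onto the last observed level. This is a simplified illustration on hypothetical data; library implementations such as the `ARIMA` class in statsmodels estimate the parameters differently (typically by maximum likelihood) and also handle the moving-average term.

```python
def fit_ar1(x):
    """Least-squares fit of x[t] = const + phi * x[t-1]."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = sxy / sxx
    const = mean_curr - phi * mean_prev
    return const, phi

def forecast_arima_110(y, steps):
    diffs = [b - a for a, b in zip(y, y[1:])]    # d = 1: difference once
    const, phi = fit_ar1(diffs)                  # p = 1: AR(1) on the differences
    level, change = y[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        change = const + phi * change            # predict the next change
        level += change                          # integrate back to the original scale
        forecasts.append(level)
    return forecasts, const, phi

# Hypothetical data: each change equals 1 + 0.5 * (previous change), with no noise.
changes = [10.0]
for _ in range(9):
    changes.append(1 + 0.5 * changes[-1])
y = [100.0]
for c in changes:
    y.append(y[-1] + c)

forecasts, const, phi = forecast_arima_110(y, steps=3)
print("estimated const, phi:", round(const, 3), round(phi, 3))
print("next three values:", [round(f, 3) for f in forecasts])
```

Because the synthetic changes follow the AR(1) rule exactly, the fit recovers the coefficients used to generate them; with noisy data the estimates would only be approximate.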
2. Nonlinear Dynamics:
Focuses on: Nonlinear dynamics explores the behavior of systems with non-linear
relationships. These systems can exhibit complex phenomena like chaos and bifurcations,
where small changes in initial conditions can lead to drastically different outcomes.
Techniques: Techniques used for nonlinear time series analysis include:
- Phase Space Reconstruction: Embeds the time series data into a higher-dimensional
space to reveal hidden patterns and dynamics.
- Recurrence Plot Analysis: Identifies recurring patterns in the data by marking the pairs
of times at which the state of the system is close to an earlier state.
- Bifurcation Analysis: Studies how the qualitative behavior of the system changes under
small variations in a control parameter.
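The first two techniques can be sketched in a few lines. Phase space reconstruction is shown here as time-delay embedding, and the recurrence plot as a 0/1 matrix marking pairs of embedded states that lie within a distance `eps` of each other. The parameter names (`dim`, `tau`, `eps`) and the toy signal are assumptions made for illustration; choosing good values for real data is a topic of its own.

```python
import math

def delay_embed(y, dim, tau):
    """Map a scalar series to vectors (y[t], y[t+tau], ..., y[t+(dim-1)*tau])."""
    span = (dim - 1) * tau
    return [tuple(y[t + j * tau] for j in range(dim))
            for t in range(len(y) - span)]

def recurrence_matrix(points, eps):
    """R[i][j] = 1 when states i and j are closer than eps (Euclidean distance)."""
    return [[1 if math.dist(p, q) < eps else 0 for q in points]
            for p in points]

# Hypothetical periodic signal with period 4.
signal = [0, 1, 0, -1] * 3
points = delay_embed(signal, dim=2, tau=1)
R = recurrence_matrix(points, eps=0.5)

for row in R:
    print("".join("#" if cell else "." for cell in row))
```

For this periodic signal the printed matrix shows diagonal lines spaced four steps apart, which is the typical signature of periodic behaviour in a recurrence plot.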
Advantages: Nonlinear methods can capture complex dynamics in time series data that
linear models miss.
Disadvantages: Nonlinear techniques can be computationally expensive and may require
more sophisticated mathematical knowledge to interpret the results.
3. Rule Induction:
Focuses on: Rule induction aims to extract a set of rules that govern the behavior of the time
series data. These rules can be used for classification (identifying different states or regimes)
or prediction (forecasting future values).
Techniques: Common techniques include:
- Decision Trees: A tree-like structure where each node represents a condition on the
data, and the branches lead to different outcomes or classifications.
- Association Rule Learning: Discovers relationships between different variables in the
time series data.
- Genetic Algorithms: Genetic algorithms mimic the process of natural selection to
evolve a population of potential solutions (rules) toward an optimal solution. They
can be applied to search for rules that best describe the temporal behavior of the
data.
- Fuzzy Rule-Based Systems: Fuzzy logic allows for representing uncertainty or
imprecision in the data. Fuzzy rule-based systems use linguistic variables and fuzzy
logic operations to derive rules that capture the relationships in the time series.
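As a rough illustration of association rule learning on a time series, the sketch below first converts the series into symbols ('U' for an upward step, 'D' otherwise) and then measures the support and confidence of rules of the form "pattern -> next symbol". The symbolisation scheme, the function names, and the data are all assumptions made for this example; full association rule miners such as Apriori search over many more candidate rules.

```python
from collections import Counter

def symbolize(y):
    """Encode each step as 'U' (up) or 'D' (down or flat)."""
    return ["U" if b > a else "D" for a, b in zip(y, y[1:])]

def rule_stats(symbols, antecedent_len=2):
    """Return {(pattern, next_symbol): (support, confidence)}."""
    pattern_counts = Counter()
    rule_counts = Counter()
    for i in range(len(symbols) - antecedent_len):
        pattern = tuple(symbols[i:i + antecedent_len])
        next_symbol = symbols[i + antecedent_len]
        pattern_counts[pattern] += 1
        rule_counts[(pattern, next_symbol)] += 1
    total = sum(pattern_counts.values())
    return {rule: (count / total, count / pattern_counts[rule[0]])
            for rule, count in rule_counts.items()}

# Hypothetical series: two steps up, then one step down, repeated.
series = [1, 2, 3, 2, 3, 4, 3, 4, 5, 4, 5, 6, 5]
stats = rule_stats(symbolize(series))

for (pattern, next_symbol), (support, confidence) in sorted(stats.items()):
    print(f"{''.join(pattern)} -> {next_symbol}: "
          f"support {support:.2f}, confidence {confidence:.2f}")
```

Support is the share of all windows in which the rule occurs, and confidence is how often the pattern is followed by that symbol; rules with high values of both are the ones worth keeping.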
Advantages: Rule induction methods can provide easily interpretable rules for understanding
the time series behavior.
Disadvantages: The quality of the rules depends heavily on the chosen features and the
complexity of the underlying system.
Choosing the Right Technique:
The best approach for your time series analysis depends on the nature of your data and the
research question you're trying to answer.
If you suspect a linear relationship and interpretability is important, linear models are a
good starting point.
If you suspect complex non-linear dynamics, explore techniques from nonlinear dynamics
or consider feature engineering to transform the data before applying linear methods.
Rule induction can be a valuable tool for uncovering relationships within the data,
especially when combined with other techniques.
Additional Considerations:
Stationarity: Many time series analysis techniques assume the data is stationary, meaning
the statistical properties (mean, variance) are constant over time. Detrending or differencing
the data might be necessary before applying some techniques.
Feature Engineering: Creating new features from existing ones, such as lagged values or
rolling averages, can sometimes improve the performance of any of the techniques mentioned above.
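Two common kinds of engineered features for time series are lagged values and rolling statistics. The sketch below builds both from a plain list; the function name, the feature names, and the default settings are hypothetical. Each feature uses only values that come before the target, so no future information leaks into the inputs.

```python
def make_features(y, lags=(1, 2), window=3):
    """One row per time step: lagged values, a rolling mean, and the target."""
    start = max(max(lags), window)               # first step with full history
    rows = []
    for t in range(start, len(y)):
        row = {f"lag_{k}": y[t - k] for k in lags}
        row["rolling_mean"] = sum(y[t - window:t]) / window   # past values only
        row["target"] = y[t]
        rows.append(row)
    return rows

rows = make_features([1, 2, 3, 4, 5, 6])
for row in rows:
    print(row)
```

Rows like these can be passed to a regression model, a decision tree, or any other method that expects a table of inputs and a target.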
By understanding the strengths and limitations of these time series analysis approaches, you
can select the most suitable method to extract valuable insights from your data.