Unit IV - Time Series Methods
Time series models are used to forecast events based on verified historical data. Common types
include ARIMA, smoothing-based, and moving average models. Not all models yield the same
results for the same dataset, so it is critical to determine which one works best for the individual
time series.
When forecasting, it is important to understand your goal. To narrow down the specifics of
your predictive modeling problem, ask questions about:
1. Volume of data available — more data is often more helpful, offering greater
opportunity for exploratory data analysis, model testing and tuning, and model fidelity.
2. Required time horizon of predictions — shorter time horizons are often easier to
predict — with higher confidence — than longer ones.
3. Forecast update frequency — Forecasts might need to be updated frequently over
time or might need to be made once and remain static (updating forecasts as new
information becomes available often results in more accurate predictions).
4. Forecast temporal frequency — Often forecasts can be made at lower or higher
temporal frequencies, which allows you to harness down-sampling and up-sampling of
the data (this in turn can offer benefits while modeling).
ARIMA
The ARIMA model (an acronym for Auto-Regressive Integrated Moving Average) essentially
creates a linear equation that describes and forecasts your time series data. This equation is
built from three separate parts.
The AR and MA aspects of ARIMA come from standalone models that can describe trends in
simpler time series data. With ARIMA modeling, you essentially have the power to combine
these two models along with differencing (the "I") to handle simple or complex time series
analysis. Pretty cool, right? The three parameters are:
• p: the lag order, i.e. the number of lagged observations included in the autoregressive
model AR(p)
• d: the degree of differencing, i.e. the number of times past values are subtracted from
the data
• q: the order of the moving average model MA(q)
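As a concrete illustration, the following sketch fits an ARIMA(p, d, q) model with Python's
statsmodels library. The library choice and the weekly sales values are assumptions made for
illustration only.

# A minimal sketch of fitting an ARIMA(p, d, q) model with statsmodels.
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical weekly sales volume (illustrative data only)
sales = pd.Series([17, 21, 19, 23, 18, 16, 20, 18, 22, 20, 15, 22])

# order=(p, d, q): lag order, degree of differencing, moving-average order
model = ARIMA(sales, order=(1, 1, 1))
fitted = model.fit()

# Forecast the next 4 weeks
print(fitted.forecast(steps=4))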
The table shows the weekly sales volume of a store. The store manager is not savvy with
forecasting models, so he considered the current week's sales as the forecast for the next week.
This methodology is also known as the naïve forecasting method because of its simplicity.
The difference between the actual value and the forecast value is known as the forecast error.
So, in this example, the forecast error for week 2 is the actual sales in week 2 minus the
forecast for week 2 (i.e., the week 1 sales).
A positive value of forecast error signifies that the model has underestimated the actual value
of the period. A negative value of forecast error signifies that the model has overestimated the
actual value of the period.
The following table calculates the forecast error for the rest of the weeks:
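The table itself is not reproduced here, so the short sketch below uses hypothetical weekly
figures to show how the naïve forecast and the per-week forecast errors are computed.

# Naïve forecast and forecast error on hypothetical weekly sales figures.
actual = [17, 21, 19, 23, 18, 16, 20]        # weeks 1..7 (hypothetical)
forecast = actual[:-1]                        # naïve: next week = current week
errors = [a - f for a, f in zip(actual[1:], forecast)]

for week, (a, f, e) in enumerate(zip(actual[1:], forecast, errors), start=2):
    print(f"week {week}: actual={a}, forecast={f}, error={e}")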
A simple measure of forecast accuracy is the mean or average of the forecast errors, also known
as the Mean Forecast Error (MFE).
In this example, averaging all the forecast errors gives the mean forecast error:
the MFE for this forecasting method is 0.2.
Since the MFE is positive, it signifies that the model is under-forecasting; the actual values tend
to be greater than the forecast values. Because positive and negative forecast errors tend to offset
one another, the average forecast error is likely to be small. Hence, MFE is not a particularly
useful measure of forecast accuracy.
Mean Absolute Error (MAE)
The mean absolute error avoids the problem of positive and negative forecast errors offsetting
one another. As the name suggests, it is the average of the absolute values of the forecast errors.
A related measure, the root mean squared error (RMSE), is the square root of the average of
the squared forecast errors. The RMSE for this forecast model is 4.57, meaning that, on average,
the forecast values were 4.57 units away from the actual values.
Mean Absolute Percentage Error (MAPE)
The size of MAE or RMSE depends on the scale of the data. As a result, it is difficult to
make comparisons across different time intervals (such as comparing a method forecasting
monthly sales with a method forecasting weekly sales volume). In such cases, we use the mean
absolute percentage error (MAPE).
Steps for calculating MAPE:
• Calculate the absolute percentage error for each period by dividing the absolute forecast
error by the actual value.
• Calculate the average of the individual absolute percentage errors.
In this example, the MAPE is 21%, which signifies that, on average, the forecast deviates from
the actual value by 21% in the given model.
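The sketch below computes all four accuracy measures discussed above from hypothetical
actual and forecast values; because the numbers are made up, its results will not match the
MFE = 0.2, RMSE = 4.57, and MAPE = 21% figures quoted in the text.

# MFE, MAE, RMSE, and MAPE computed from hypothetical data.
import math

actual = [21, 19, 23, 18, 16, 20]      # weeks 2..7 (hypothetical)
forecast = [17, 21, 19, 23, 18, 16]    # naïve forecast = previous week's actual
errors = [a - f for a, f in zip(actual, forecast)]

mfe = sum(errors) / len(errors)                             # Mean Forecast Error
mae = sum(abs(e) for e in errors) / len(errors)             # Mean Absolute Error
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))  # Root Mean Squared Error
mape = 100 * sum(abs(e) / a for e, a in zip(errors, actual)) / len(errors)  # MAPE in %

print(f"MFE={mfe:.2f}  MAE={mae:.2f}  RMSE={rmse:.2f}  MAPE={mape:.1f}%")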
STL approach
STL (Seasonal and Trend decomposition using LOESS) uses LOESS (locally estimated
scatterplot smoothing) to extract smooth estimates of the three components: trend, seasonality,
and remainder. The key inputs into STL are the trend-cycle window and the seasonal window,
which control how rapidly the trend-cycle and seasonal components can change.
Here we use STL to handle the seasonality and then an ARIMA(1,1,0) model on the
deseasonalized data. The seasonal component is forecast from the final full cycle of the data,
where the seasonal pattern is assumed to repeat.
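A minimal sketch of this STL-plus-ARIMA idea using statsmodels' STLForecast is shown
below; the simulated monthly series and the choice of period = 12 are assumptions made for
illustration.

# STL handles the seasonality; ARIMA(1,1,0) models the deseasonalized series.
# The seasonal component is added back when forecasting.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.forecasting.stl import STLForecast

# Hypothetical monthly series with a trend and a yearly seasonal pattern
idx = pd.date_range("2018-01-01", periods=60, freq="MS")
rng = np.random.default_rng(0)
y = pd.Series(
    10 + 0.1 * np.arange(60)
    + 3 * np.sin(2 * np.pi * np.arange(60) / 12)
    + rng.normal(scale=0.5, size=60),
    index=idx,
)

stlf = STLForecast(y, ARIMA, model_kwargs={"order": (1, 1, 0)}, period=12)
res = stlf.fit()
print(res.forecast(12))   # 12-month-ahead forecasts with seasonality re-added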
Extract Features from the Generated Model (Height, Average Energy, etc.) and Analyze for
Prediction
The Extract Vector Features tool can be used on classified ground points, either on an entire
point cloud or on a user-defined subsection. This lidar feature extraction tool lets the user derive
features such as building footprints, building roof structures, power lines, and other structures
from classified lidar ground points.
Example:
The building extraction tool will create 3D building vectors or mesh features based on points
in the Structure/Building Point group.
The default display of extracted 3D buildings is controlled by Area Feature Types: Building -
Floor, Building - Ground, Building - Roof, and Building - Wall. These styles carry over to 3D
model buildings as well.
Simplified Building Footprints - Select this option to extract only the building footprints and
other similar features from the classified ground points.
Pin Footprints to Height - Use this option to set the elevation for all generated building
footprints features to a single value.
Simplified Planes - Select this option to identify and create roof features for extracted
buildings.
Simplified Side Walls - Select this option to generate side wall features for extracted buildings.
Sharpen edges and stitch planes by adding points at planar intersections - This option will
generate point features along intersections between planes. The points will be placed between
pairs of nearby points that are contained in different planes.
Buildings as Mesh - If selected, the building roofs will each be turned into a single 3D mesh
feature.
Color Vertices by Lidar Intensity - Use this option to color the mesh vertices by the lidar
intensity value at that location.
Footprint Line Simplification Epsilon - Use this field to specify, in meters, the distance from
one vertex to another for the vertex to be removed when simplifying the building footprint
features. Larger values will further simplify the extracted building features.
Maximum Distance to Plane - Set the maximum distance, in meters, that a point can be from
an identified plane and still be considered part of the plane.
Minimum Number of Points in Plane - Use this field to set the minimum number of points
needed in an area to identify a plane and create an extracted building feature.
Minimum Footprint Area - Use this field to specify the minimum size of a building footprint
in square meters. Generated footprints smaller than the entered value will be discarded.
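To make the Footprint Line Simplification Epsilon setting more concrete, the sketch below
shows an epsilon-based line simplification in the spirit of the Ramer-Douglas-Peucker
algorithm. It illustrates the general idea only and is not the tool's actual implementation: larger
epsilon values remove more vertices and yield simpler footprints.

# Illustrative epsilon-based simplification of a building footprint outline.
import math

def point_line_distance(p, a, b):
    # Perpendicular distance from point p to the line through a and b.
    (px, py), (ax, ay), (bx, by) = p, a, b
    if (ax, ay) == (bx, by):
        return math.hypot(px - ax, py - ay)
    num = abs((bx - ax) * (ay - py) - (ax - px) * (by - ay))
    return num / math.hypot(bx - ax, by - ay)

def simplify(points, epsilon):
    # Recursively drop vertices closer than epsilon to the chord between endpoints.
    if len(points) < 3:
        return points
    dists = [point_line_distance(p, points[0], points[-1]) for p in points[1:-1]]
    i = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i - 1] > epsilon:
        left = simplify(points[: i + 1], epsilon)
        right = simplify(points[i:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]

# Hypothetical footprint with two tiny kinks; epsilon is in the same units (meters)
footprint = [(0, 0), (4.9, 0.1), (10, 0), (10, 5), (5, 5.05), (0, 5), (0, 0)]
print(simplify(footprint, epsilon=0.2))   # kinks smaller than 0.2 m are removed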
Different features capture different aspects of sound. Generally, audio features are categorised
by their level of abstraction. These broad categories cover mainly musical signals rather than
audio in general:
• High-level: These are the abstract features that are understood and enjoyed by humans.
These include instrumentation, key, chords, melody, harmony, rhythm, genre, mood,
etc.
• Mid-level: These are features we can perceive. These include pitch, beat-related
descriptors, note onsets, fluctuation patterns, MFCCs, etc. We may say that these are
aggregations of low-level features.
• Low-level: These are statistical features that are extracted from the audio. These make
sense to the machine, but not to humans. Examples include amplitude envelope, energy,
spectral centroid, spectral flux, zero-crossing rate, etc.
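As an illustration, the sketch below extracts several of the low-level features mentioned above
(plus mid-level MFCCs) using the librosa library; librosa and the file path "example.wav" are
assumptions, not something specified in the text.

# Low-level audio feature extraction with librosa (assumed library and file).
import librosa
import numpy as np

y, sr = librosa.load("example.wav", sr=22050)

zcr = librosa.feature.zero_crossing_rate(y)                # zero-crossing rate per frame
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)   # spectral centroid in Hz
rms = librosa.feature.rms(y=y)                             # frame-wise energy (RMS)
flux = librosa.onset.onset_strength(y=y, sr=sr)            # spectral-flux-based onset strength
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)         # mid-level MFCCs

print("mean ZCR:", float(np.mean(zcr)))
print("mean spectral centroid (Hz):", float(np.mean(centroid)))
print("mean RMS energy:", float(np.mean(rms)))
print("MFCC matrix shape:", mfcc.shape)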