Time Series Analysis
Problem Statement:
Forecast the number of passengers travelling on the metro over the
next year.
Solution Statement:
A time series (TS) is a collection of data points recorded at constant
time intervals. It is analyzed to determine the long-term trend, so as
to forecast future values or perform some other form of analysis. A
univariate time series has only 2 variables: time and the variable we
want to forecast.
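As a minimal sketch of the "constant time intervals" requirement, the
snippet below stores a series as (time, value) pairs and checks that the
timestamps are equally spaced. The helper name has_constant_interval and
the passenger counts are hypothetical, made up for illustration only.

```python
from datetime import date, timedelta


def has_constant_interval(timestamps):
    """Return True if every consecutive pair of timestamps is equally spaced."""
    gaps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(gaps) <= 1


# Hypothetical daily passenger counts (made-up numbers, for illustration only).
start = date(2023, 1, 1)
counts = [1120, 1180, 1325, 1290, 1210]
series = [(start + timedelta(days=i), c) for i, c in enumerate(counts)]

print(has_constant_interval([t for t, _ in series]))
```

If the check fails, the data would normally be resampled or the gaps
filled before any forecasting model is fitted.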
System Architecture:
Flow chart:
Analytics:
ARIMA model
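ARIMA (Auto Regressive Integrated Moving Average) combines 2 models,
AR (Auto Regressive) and MA (Moving Average), applied to a differenced
series. It has 3 hyperparameters: P (number of autoregressive lags),
d (order of differencing) and Q (number of moving-average lags), which
come from the AR, I and MA components respectively. These are
conventionally written in lowercase as (p, d, q). The AR part models
the correlation between the current value and the values at previous
time periods. The MA part models the current value in terms of past
forecast errors, which accounts for short-lived noise. The I part
differences the series d times to remove the trend, so that the AR and
MA parts are fitted to a stationary series.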
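The d hyperparameter counts how many times the series is differenced
(each value replaced by its change from the previous value) before the
AR and MA terms are fitted. A minimal sketch of that step, using a
hypothetical helper named difference and toy numbers rather than the
metro data:

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y[t] - y[t-1]."""
    values = list(series)
    for _ in range(d):
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# A linear trend becomes constant after one round of differencing.
linear = [10, 12, 14, 16, 18]
print(difference(linear, d=1))  # [2, 2, 2, 2]

# A quadratic trend needs two rounds (d=2).
quadratic = [t * t for t in range(6)]
print(difference(quadratic, d=2))  # [2, 2, 2, 2]
```

Each round of differencing shortens the series by one point, so d is
normally kept as small as possible (often 0, 1 or 2).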
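How to find the values of P & Q for ARIMA?
We use the ACF (Auto Correlation Function) and PACF (Partial Auto
Correlation Function) plots of the stationary (differenced) series. As
a rule of thumb, find the first lag on the x-axis at which the curve
drops to 0 on the y-axis; in practice, this is the first lag at which
it falls inside the confidence band around 0. The values read off this
way are starting candidates, and neighbouring values should also be
compared.
From PACF (at y=0), get P
From ACF (at y=0), get Q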
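The plot-reading rule above can be sketched numerically. The code below
is an illustration under stated assumptions, not a replacement for a
statistics library: it computes the sample ACF directly, derives the
PACF with the Durbin-Levinson recursion, and reports the first lag whose
value is at or below the approximate 95% upper bound 1.96/sqrt(n). The
function names and the simulated AR(1) series are hypothetical.

```python
import math
import random


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelation for lags 0..max_lag (Durbin-Levinson)."""
    r = acf(series, max_lag)
    result = [1.0]
    prev = []  # AR coefficients of order k-1: prev[j-1] holds phi(k-1, j)
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j - 1] * r[k - j] for j in range(1, k))
        den = 1.0 - sum(prev[j - 1] * r[j] for j in range(1, k))
        phi_kk = num / den
        curr = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
        curr.append(phi_kk)
        result.append(phi_kk)
        prev = curr
    return result


def first_lag_inside_band(values, n_obs):
    """First lag >= 1 whose value is at or below 1.96/sqrt(n_obs), else None."""
    upper = 1.96 / math.sqrt(n_obs)
    for lag in range(1, len(values)):
        if values[lag] <= upper:
            return lag
    return None


def simulate_ar1(phi, n, seed):
    """Simulated AR(1) series x[t] = phi * x[t-1] + noise (illustration only)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


x = simulate_ar1(phi=0.7, n=2000, seed=0)
print("PACF candidate lag:", first_lag_inside_band(pacf(x, 10), len(x)))
print("ACF candidate lag:", first_lag_inside_band(acf(x, 10), len(x)))
```

For a textbook AR(1) series the PACF is large at lag 1 and close to 0
from lag 2 onward, so the lag just before the reported one is the last
significant lag. This is why the result is best treated as a starting
candidate and checked against its neighbours.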
ADF test
In statistics and econometrics, an augmented Dickey–Fuller test
(ADF) tests the null hypothesis that a unit root is present in a time
series sample. The alternative hypothesis is different depending on
which version of the test is used, but is usually stationarity or trend-
stationarity. It is an augmented version of the Dickey–Fuller test for a
larger and more complicated set of time series models.
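To make the idea concrete, the sketch below implements the simple
(non-augmented) Dickey–Fuller regression with a constant, which is the
special case of the augmented test with zero lagged difference terms. It
regresses the change y[t] - y[t-1] on y[t-1] and returns the t-statistic
of the slope. A value below the approximate large-sample 5% critical
value of -2.86 rejects the unit-root null. The function name and the
simulated series are hypothetical. For real work, a library routine such
as adfuller in statsmodels.tsa.stattools would be the usual choice.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic of b in the regression  dy[t] = a + b * y[t-1] + e[t]."""
    series = list(series)
    lagged = series[:-1]
    delta = [curr - prev for prev, curr in zip(series, series[1:])]
    m = len(delta)
    mean_x = sum(lagged) / m
    mean_z = sum(delta) / m
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(lagged, delta))
    b = sxz / sxx
    a = mean_z - b * mean_x
    ssr = sum((z - a - b * x) ** 2 for x, z in zip(lagged, delta))
    se_b = math.sqrt(ssr / (m - 2) / sxx)
    return b / se_b


# Approximate 5% critical value, constant-only case, large samples.
CRITICAL_5PCT = -2.86

# Simulated data for illustration: white noise (stationary) and its
# cumulative sum (a random walk, which has a unit root).
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]
walk = []
total = 0.0
for e in noise:
    total += e
    walk.append(total)

for name, data in (("white noise", noise), ("random walk", walk)):
    t_stat = dickey_fuller_t(data)
    verdict = "reject unit root" if t_stat < CRITICAL_5PCT else "cannot reject unit root"
    print(name, round(t_stat, 2), verdict)
```

If the unit-root null cannot be rejected, the series is usually
differenced once and the test repeated. The number of rounds needed
gives a candidate for d.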