Time Series & Streaming
Time Series
Time series data is a sequence of data points indexed or listed in time order. What
sets it apart from other data is that time is always one of its axes when plotted. Time
series metrics are measurements tracked over time, such as daily sales in a store.
Time series data is becoming pervasive as the world gets more instrumented: sensors and
systems generate it constantly. Its applications span many industries.
Examples
Finance: Stock prices, market indices.
● Increasing importance due to connected devices, real-time data processing, IoT, and AI.
● Essential for managing data from billions of connected devices.
Applications:
Inference and Prediction: Time series analysis infers trends from past data and predicts future values
from the collected observations.
Data Collection: Time series data is created by repeating measurements over time.
Visualization: Data points are plotted on a graph for better understanding of trends.
Example: Tracking temperature hourly over a day provides insights into temperature fluctuations.
Forecasting: Future events can be forecasted based on observed patterns and trends.
Identification of Patterns:
Application: Widely used in fields such as geophysics, oceanography, and astronomy.
Periodogram: Estimates spectral density by measuring correlation between the time series and sine/cosine waves at
different frequencies.
Transformation: Data is transformed from time domain to frequency domain for analysis.
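The periodogram idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production routine: the function name `periodogram`, the scaling by 1/n, and the synthetic `hourly` temperature series are all assumptions made for the example. It correlates the mean-centered series with cosine and sine waves at each frequency k/n and reports the frequency with the most power.

```python
import math

def periodogram(x):
    """Return (frequency, power) pairs for frequencies k/n, k = 1..n//2."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    result = []
    for k in range(1, n // 2 + 1):
        # Correlate the series with a cosine and a sine wave at frequency k/n.
        c = sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(centered))
        s = sum(v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(centered))
        # One common scaling convention; others differ by a constant factor.
        result.append((k / n, (c * c + s * s) / n))
    return result

# Hypothetical data: 48 hourly temperature readings with a 24-hour cycle.
hourly = [20 + 5 * math.sin(2 * math.pi * t / 24) for t in range(48)]
spectrum = periodogram(hourly)
peak_freq, peak_power = max(spectrum, key=lambda pair: pair[1])
print(f"dominant period: {1 / peak_freq:.1f} hours")
```

The direct sums cost O(n^2) and are only meant to show the time-to-frequency transformation; real analyses would normally use a fast Fourier transform.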
Wavelet Analysis:
Definition: Wavelets are localized functions in time and frequency used for decomposing signals.
New Approach: Synthesizes older and newer ideas; a relatively new field that dates from 1983.
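To make "localized in time" concrete, here is a minimal sketch of one level of a wavelet decomposition. It assumes the Haar wavelet, the simplest choice and not one named in these notes; the helper names `haar_step` and `haar_inverse` and the sample `signal` are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the Haar transform: (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2)
    # Pairwise scaled sums capture the smooth part, pairwise differences the local change.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    s = math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

# Hypothetical signal: flat, with one brief disturbance in the third pair.
signal = [4.0, 4.0, 4.0, 4.0, 9.0, 1.0, 4.0, 4.0]
approx, detail = haar_step(signal)
rebuilt = haar_inverse(approx, detail)
print([round(d, 3) for d in detail])
```

Only the detail coefficient covering the disturbed pair is non-zero, which is the sense in which a wavelet coefficient says both what changed and when. A sine or cosine wave, by contrast, extends over the whole series.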
Critical methods of analyzing time series data
Autocorrelation:
Definition: Serial dependence, where a time series is linearly related to a lagged version of itself.
Importance:
Analysis Tools: Autocorrelation function (ACF) and partial autocorrelation function (PACF).
Model Selection: Aids in selecting appropriate ARIMA models for time series prediction.
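The sample autocorrelation function can be sketched as follows. This is a minimal version using the common biased estimator, in which every lag is divided by the lag-0 sum of squares; the function name `acf` and the repeating `sales` series are assumptions for illustration, and the PACF is not covered here.

```python
def acf(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    total = sum((v - mean) ** 2 for v in x)
    return [
        # Compare each point with the point `lag` steps later.
        sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / total
        for lag in range(max_lag + 1)
    ]

# Hypothetical series that repeats every 4 observations.
sales = [10, 12, 14, 12] * 5
correlations = acf(sales, max_lag=4)
for lag, r in enumerate(correlations):
    print(f"lag {lag}: {r:+.2f}")
```

A strong positive value at lag 4 and a strong negative value at lag 2 reflect the 4-step cycle. Reading such peaks from the ACF, together with the PACF, is how candidate orders for an ARIMA model are usually shortlisted.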
Cross-correlation:
Example: An independent variable X is positively correlated with two dependent variables, Y and Z.
Behavior: Y and Z are cross-correlated with each other because both are related to X.
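The X, Y, Z example can be checked numerically with a small sketch. The function name `cross_correlation`, the data values, and the fixed disturbance terms are all hypothetical; Y and Z are each built from X plus a small disturbance, and neither is computed from the other.

```python
import math

def cross_correlation(a, b, lag=0):
    """Correlation between a[t] and b[t + lag] for a non-negative lag."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((a[t] - mean_a) * (b[t + lag] - mean_b) for t in range(n - lag))
    norm = math.sqrt(
        sum((v - mean_a) ** 2 for v in a) * sum((v - mean_b) ** 2 for v in b)
    )
    return cov / norm

# Hypothetical driver X and two series that both depend on it.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0]
disturb_y = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
disturb_z = [-0.1, 0.2, -0.3, 0.1, 0.3, -0.2, 0.1, -0.1]
y = [2.0 * v + e for v, e in zip(x, disturb_y)]
z = [0.5 * v + e for v, e in zip(x, disturb_z)]

r_yz = cross_correlation(y, z)
print(f"cross-correlation of Y and Z at lag 0: {r_yz:.3f}")
```

The value is close to 1 even though Y and Z never influence each other, which is the point of the example: a shared driver is enough to produce cross-correlation. Passing a positive `lag` checks whether one series leads the other.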