A Hands-On Introduction To Time Series Classification (With Python Code)
Most of the time series work we are exposed to deals with generating
forecasts. Whether that’s predicting the demand or sales of a product,
the number of passengers on an airline, or the closing price of a
particular stock, we are used to applying tried and tested time series
techniques to forecasting problems.
https://medium.com/analytics-vidhya/a-hands-on-introduction-to-time-series-classification-with-python-code-48f8b442e7c1 1/20
26/03/2019 A Hands-On Introduction to Time Series Classification (with Python Code)
Table of contents
1. Introduction to Time Series Classification
1.1 ECG Signals
1.2 Image Data
1.3 Sensors
4. Preprocessing
For example, consider the following signal sample which represents the
electrical activity for one heartbeat. The image on the left represents a
normal heartbeat while the one adjacent to it represents a Myocardial
Infarction.
The data captured from the electrodes will be in time series form, and
the signals can be classified into different classes. We can also classify
EEG signals which record the electrical activity of the brain.
2) Image Classification
Images can also be in a sequential time-dependent format. Consider
the following scenario:
What other applications can you think of where we can apply time
series classification? Let me know in the comments section below the
article.
There are four motion sensors (A1, A2, A3, A4) placed across two
rooms. Have a look at the image below, which illustrates where the
sensors are positioned in each room. This two-room setup was
replicated in 3 different pairs of rooms (group1, group2, group3).
A person can move along any of the six pre-defined paths shown in the
above image. If a person walks on path 2, 3, 4 or 6, they move within the
room. On the other hand, if a person follows path 1 or path 5, we can
say that the person has moved between the rooms.
Now that the problem statement is clear, it’s time to get down to
coding! In the next section, we will look at the dataset for the problem
which should help clear up any lingering questions you might have on
this statement. You can download the dataset from this link: Indoor
User Movement Prediction.
• A Target csv file that contains the target variable for each
MovementAAL file
• The Path csv file that contains the path which the object took
Let’s have a look at the datasets. We’ll start by importing the necessary
libraries.
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
from os import listdir
Before loading all the files, let’s take a quick sneak peek into the data
we are going to deal with. Reading the first two files from the
movement data:
df1 = pd.read_csv('/MovementAAL/dataset/MovementAAL_RSS_1.csv')
df2 = pd.read_csv('/MovementAAL/dataset/MovementAAL_RSS_2.csv')
df1.head()
df2.head()
df1.shape, df2.shape
The files contain normalized data from the four sensors — A1, A2, A3,
A4. The length of the csv files (number of rows) varies, since the data
corresponding to each csv is for a different duration. To simplify things,
let us suppose the sensor data is collected every second. The first
reading was for a duration of 27 seconds (so 27 rows), while another
reading was for 26 seconds (so 26 rows).
We will have to deal with this varying length before we build our
model. For now, we will read and store the values from the sensors in a
list using the following code block:
path = 'MovementAAL/dataset/MovementAAL_RSS_'
sequences = list()
# Files are numbered 1 to 314; each one holds a single recording
for i in range(1,315):
    file_path = path + str(i) + '.csv'
    print(file_path)
    df = pd.read_csv(file_path, header=0)
    values = df.values
    sequences.append(values)
# Keep only the label column (second column) of the target file
targets = pd.read_csv('MovementAAL/dataset/MovementAAL_target.csv')
targets = targets.values[:,1]
We now have a list ‘sequences’ that contains the data from the motion
sensors and ‘targets’ which holds the labels for the csv files. When we
print sequences[0], we get the values of sensors from the first csv file:
sequences[0]
groups = pd.read_csv('MovementAAL/groups/MovementAAL_DatasetGroup.csv', header=0)
groups = groups.values[:,1]
We will take the data from the first two groups for training purposes and
the third group for testing.
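A minimal sketch of such a group-based split is shown here on toy stand-ins for `sequences`, `targets`, and `groups`. The helper name `split_by_group` and the toy values are assumptions made for illustration; they are not taken from the dataset or from the article's own code.

```python
import numpy as np

def split_by_group(items, labels, groups, train_groups=(1, 2), test_groups=(3,)):
    """Split items and labels by group id (hypothetical helper)."""
    groups = np.asarray(groups)
    labels = np.asarray(labels)
    # Positions of the recordings that belong to each side of the split.
    train_idx = np.flatnonzero(np.isin(groups, train_groups))
    test_idx = np.flatnonzero(np.isin(groups, test_groups))
    train = [items[i] for i in train_idx]
    test = [items[i] for i in test_idx]
    return train, labels[train_idx], test, labels[test_idx]

# Toy stand-ins: five "files", each a (rows x 4 sensors) array of varying length.
toy_sequences = [np.zeros((n, 4)) for n in (3, 5, 4, 6, 2)]
toy_targets = [1, -1, 1, -1, 1]
toy_groups = [1, 2, 3, 1, 3]

train, train_y, test, test_y = split_by_group(toy_sequences, toy_targets, toy_groups)
print(len(train), len(test))
```

Splitting on the group id rather than at random keeps every recording from a given pair of rooms on the same side of the split, so the test set measures how well the model transfers to rooms it has not seen.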
Preprocessing Steps
Since the time series data is of varying length, we cannot directly build
a model on this dataset. So how can we decide the ideal length of a
series? There are multiple ways in which we can deal with it and here
are a few ideas (I would love to hear your suggestions in the comment
section):
• Pad the shorter sequences with zeros to make the length of all the
series equal. In this case, we will be feeding incorrect data to the
model
• Find the maximum length of the series and pad the sequence with
the data in the last row
• Take the mean of all the lengths, truncate the longer series, and
pad the series which are shorter than the mean length
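To make the difference between these options concrete, here is a small sketch on a toy sequence using NumPy's `np.pad`. The array values and the target lengths are assumptions chosen only for illustration.

```python
import numpy as np

# Toy sequence: 3 time steps x 2 sensors (assumed values for illustration).
seq = np.array([[0.1, 0.2],
                [0.3, 0.4],
                [0.5, 0.6]])
target_len = 5
extra = target_len - len(seq)

# Pad only along the time axis (axis 0), after the existing rows.
pad_width = ((0, extra), (0, 0))
zero_padded = np.pad(seq, pad_width, mode="constant", constant_values=0.0)
edge_padded = np.pad(seq, pad_width, mode="edge")  # repeats the last row

# Truncation is a plain slice along the time axis.
truncated = seq[:2]

print(zero_padded)
print(edge_padded)
print(truncated)
```

Zero padding appends readings the sensors never produced, while `mode="edge"` holds the last observed reading, which is closer to what a stationary person would generate.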
len_sequences = []
for one_seq in sequences:
    len_sequences.append(len(one_seq))
pd.Series(len_sequences).describe()
count 314.000000
mean 42.028662
std 16.185303
min 19.000000
25% 26.000000
50% 41.000000
75% 56.000000
max 129.000000
dtype: float64
Three quarters of the files have 56 rows or fewer, and just 3 files
have more than 100 rows. Thus, taking the minimum or maximum length
does not make much sense. The 90th percentile comes out to be 60,
which is taken as the sequence length for the data. Let’s code it out:
new_seq = []
for one_seq in sequences:
    len_one_seq = len(one_seq)
    last_val = one_seq[-1]
    n = to_pad - len_one_seq
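A self-contained sketch of the same idea (pad each short sequence with its last row, cut long ones, and stack the result) might look like the following. The helper name `to_fixed_length` and the random toy data are assumptions for illustration, and the percentile is computed on the toy lengths rather than on the real dataset.

```python
import numpy as np

def to_fixed_length(seq, seq_len):
    """Cut seq to seq_len rows, or repeat its last row until it has seq_len rows."""
    seq = np.asarray(seq, dtype=float)
    if len(seq) >= seq_len:
        return seq[:seq_len]
    n_missing = seq_len - len(seq)
    # seq[-1:] keeps the 2D shape (1, n_sensors) so it can be repeated row-wise.
    filler = np.repeat(seq[-1:], n_missing, axis=0)
    return np.concatenate([seq, filler], axis=0)

# Toy stand-in for `sequences`: variable-length arrays with 4 sensor columns.
rng = np.random.default_rng(0)
toy_sequences = [rng.random((n, 4)) for n in (19, 26, 41, 56, 129)]

lengths = [len(s) for s in toy_sequences]
seq_len = int(np.percentile(lengths, 90))  # 90th percentile of the toy lengths
final_seq = np.stack([to_fixed_length(s, seq_len) for s in toy_sequences])
print(seq_len, final_seq.shape)
```

The stacked array has shape (number of files, sequence length, number of sensors), which is the 3D layout a recurrent layer expects as input.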
train = np.array(train)
validation = np.array(validation)
test = np.array(test)
train_target = np.array(train_target)
train_target = (train_target+1)/2
validation_target = np.array(validation_target)
validation_target = (validation_target+1)/2
test_target = np.array(test_target)
test_target = (test_target+1)/2
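The `(target+1)/2` step rescales the labels so they line up with a sigmoid output. A quick check on a toy array, assuming the raw labels are encoded as -1 and 1 (which is what this arithmetic implies):

```python
import numpy as np

raw = np.array([-1, 1, 1, -1])  # assumed raw label encoding
binary = (raw + 1) / 2          # -1 becomes 0.0, 1 becomes 1.0
print(binary)
```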
Note: You can get acquainted with LSTMs in this wonderfully explained
tutorial. I would advise you to go through that first as it’ll help you
understand how the below code works.
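from keras.models import Sequential
from keras.layers import LSTM, Dense
seq_len = 60  # sequence length chosen in the preprocessing step
# One LSTM layer over (seq_len time steps x 4 sensors), then a sigmoid output
model = Sequential()
model.add(LSTM(256, input_shape=(seq_len, 4)))
model.add(Dense(1, activation='sigmoid'))
model.summary()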
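As a rough check on what `model.summary()` reports, the parameter counts can be worked out by hand. This sketch assumes the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias vector); the function names are made up for illustration.

```python
def lstm_params(units, input_dim):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * (units * (input_dim + units) + units)

def dense_params(inputs, outputs):
    # One weight per input-output pair plus one bias per output.
    return inputs * outputs + outputs

lstm = lstm_params(256, 4)    # 256 units fed by 4 sensor channels
dense = dense_params(256, 1)  # single sigmoid output
print(lstm, dense, lstm + dense)  # 267264 257 267521
```

Note that the sequence length does not appear in the count: the same weights are reused at every time step.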
We will now train the model and monitor the validation accuracy:
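from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint
# Newer Keras releases name this argument learning_rate instead of lr
adam = Adam(lr=0.001)
# With metrics=['accuracy'], newer Keras releases log this as 'val_accuracy'
chk = ModelCheckpoint('best_model.pkl', monitor='val_acc', save_best_only=True, mode='max', verbose=1)
model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
model.fit(train, train_target, epochs=200, batch_size=128, callbacks=[chk], validation_data=(validation, validation_target))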
End Notes
That brings us to the end of this tutorial. The idea behind penning this
down was to introduce you to a whole new world in the time series
spectrum in a practical manner.