
Chapter 8
How to Develop CNNs for Time Series Forecasting

Convolutional Neural Network models, or CNNs for short, can be applied to time series forecasting. There are many types of CNN models that can be used for each specific type of time series forecasting problem. In this tutorial, you will discover how to develop a suite of CNN models for a range of standard time series forecasting problems. The objective of this tutorial is to provide standalone examples of each model on each type of time series problem as a template that you can copy and adapt for your specific time series forecasting problem. After completing this tutorial, you will know:

- How to develop CNN models for univariate time series forecasting.
- How to develop CNN models for multivariate time series forecasting.
- How to develop CNN models for multi-step time series forecasting.

Let's get started.

8.1 Tutorial Overview

In this tutorial, we will explore how to develop CNN models for time series forecasting. The models are demonstrated on small contrived time series problems intended to give the flavor of the type of time series problem being addressed. The chosen configuration of the models is arbitrary and not optimized for each problem; that was not the goal. This tutorial is divided into four parts; they are:

1. Univariate CNN Models
2. Multivariate CNN Models
3. Multi-step CNN Models
4. Multivariate Multi-step CNN Models

8.2 Univariate CNN Models

Although traditionally developed for two-dimensional image data, CNNs can be used to model univariate time series forecasting problems. Univariate time series are datasets comprised of a single series of observations with a temporal ordering, and a model is required to learn from the series of past observations to predict the next value in the sequence. This section is divided into two parts; they are:

1. Data Preparation
2. CNN Model

8.2.1 Data Preparation

Before a univariate series can be modeled, it must be prepared. The CNN model will learn a function that maps a sequence of past observations as input to an output observation. As such, the sequence of observations must be transformed into multiple examples from which the model can learn. Consider a given univariate sequence:

[10, 20, 30, 40, 50, 60, 70, 80, 90]

Listing 8.1: Example of a univariate time series.

We can divide the sequence into multiple input/output patterns called samples, where three time steps are used as input and one time step is used as output for the one-step prediction that is being learned.

X,              y
10, 20, 30      40
20, 30, 40      50
30, 40, 50      60
...

Listing 8.2: Example of a univariate time series as a supervised learning problem.

The split_sequence() function below implements this behavior and will split a given univariate sequence into multiple samples, where each sample has a specified number of time steps as input and a single time step as output.

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 8.3: Example of a function to split a univariate series into a supervised learning problem.
We can demonstrate this function on our small contrived dataset above. The complete example is listed below.

# univariate data preparation
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 8.4: Example of transforming a univariate time series into a supervised learning problem.

Running the example splits the univariate series into six samples where each sample has three input time steps and one output time step.

[10 20 30] 40
[20 30 40] 50
[30 40 50] 60
[40 50 60] 70
[50 60 70] 80
[60 70 80] 90

Listing 8.5: Example output from transforming a univariate time series into a supervised learning problem.

Now that we know how to prepare a univariate series for modeling, let's look at developing a CNN model that can learn the mapping of inputs to outputs.

8.2.2 CNN Model

A one-dimensional CNN is a CNN model that has a convolutional hidden layer that operates over a 1D sequence. This is followed by perhaps a second convolutional layer in some cases, such as for very long input sequences, and then a pooling layer whose job it is to distill the output of the convolutional layer to the most salient elements. The convolutional and pooling layers are followed by a dense fully connected layer that interprets the features extracted by the convolutional part of the model. A flatten layer is used between the convolutional layers and the dense layer to reduce the feature maps to a single one-dimensional vector. We can define a 1D CNN Model for univariate time series forecasting as follows.

# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

Listing 8.6: Example of defining a CNN model.
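To make the layer stack concrete, a quick sanity check (a minimal sketch, not part of the original listings; it assumes n_steps=3 and n_features=1 as used below) prints the output shape of each layer:

# sketch: inspect layer output shapes for a (3, 1) input
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.layers.convolutional import Conv1D, MaxPooling1D

model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(3, 1)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
# Conv1D -> (None, 2, 64), MaxPooling1D -> (None, 1, 64), Flatten -> (None, 64)
model.summary()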
Key in the definition is the shape of the input; that is what the model expects as input for each sample in terms of the number of time steps and the number of features. We are working with a univariate series, so the number of features is one, for one variable. The number of time steps as input is the number we chose when preparing our dataset as an argument to the split_sequence() function.

The input shape for each sample is specified in the input_shape argument on the definition of the first hidden layer. We almost always have multiple samples; therefore, the model will expect the input component of the training data to have the dimensions or shape: [samples, timesteps, features]. Our split_sequence() function in the previous section outputs the X with the shape [samples, timesteps], so we can easily reshape it to have an additional dimension for the one feature.

# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

Listing 8.7: Example of reshaping data for the CNN.

The CNN does not actually view the data as having time steps; instead, it is treated as a sequence over which convolutional read operations can be performed, like a one-dimensional image. In this example, we define a convolutional layer with 64 filter maps and a kernel size of 2. This is followed by a max pooling layer and a dense layer to interpret the input features. An output layer is specified that predicts a single numerical value. The model is fit using the efficient Adam version of stochastic gradient descent and optimized using the mean squared error, or 'mse', loss function. Once the model is defined, we can fit it on the training dataset.

# fit model
model.fit(X, y, epochs=1000, verbose=0)

Listing 8.8: Example of fitting a CNN model.

After the model is fit, we can use it to make a prediction. We can predict the next value in the sequence by providing the input [70, 80, 90] and expecting the model to predict something like [100]. The model expects the input shape to be three-dimensional with [samples, timesteps, features]; therefore, we must reshape the single input sample before making the prediction.
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

Listing 8.9: Example of reshaping data ready for making a prediction.

We can tie all of this together and demonstrate how to develop a 1D CNN model for univariate time series forecasting and make a single prediction.

# univariate cnn example
from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 8.10: Example of a CNN model for univariate time series forecasting.

Running the example prepares the data, fits the model, and makes a prediction. We can see that the model predicts the next value in the sequence.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[101.67965]]

Listing 8.11: Example output from a CNN model for univariate time series forecasting.

For an example of a CNN applied to a real-world univariate time series forecasting problem, see Chapter 14. For an example of grid searching CNN hyperparameters on a univariate time series forecasting problem, see Chapter 15.
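A note on the imports: the listings in this chapter use the standalone Keras 2 package paths. If you are running a recent TensorFlow release where keras.layers.convolutional no longer exists, the equivalent imports (an assumption about your environment, not part of the original examples) are:

# TensorFlow 2-style equivalents of the Keras imports used in this chapter
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Input, Dense, Flatten, Conv1D, MaxPooling1D, concatenate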
8.3 Multivariate CNN Models

Multivariate time series data means data where there is more than one observation for each time step. There are two main models that we may require with multivariate time series data; they are:

1. Multiple Input Series.
2. Multiple Parallel Series.

Let's take a look at each in turn.

8.3.1 Multiple Input Series

A problem may have two or more parallel input time series and an output time series that is dependent on the input time series. The input time series are parallel because each series has observations at the same time steps. We can demonstrate this with a simple example of two parallel input time series where the output series is the simple addition of the input series.

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])

Listing 8.12: Example of defining multiple parallel series.

We can reshape these three arrays of data as a single dataset where each row is a time step and each column is a separate time series. This is a standard way of storing parallel time series in a CSV file.
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

Listing 8.13: Example of defining parallel series as a dataset.

The complete example is listed below.

# multivariate data preparation
from numpy import array
from numpy import hstack
# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
print(dataset)

Listing 8.14: Example of defining a dependent time series dataset.

Running the example prints the dataset with one row per time step and one column for each of the two input and one output parallel time series.

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

Listing 8.15: Example output from defining a dependent time series dataset.

As with the univariate time series, we must structure these data into samples with input and output elements. A 1D CNN model needs sufficient context to learn a mapping from an input sequence to an output value. CNNs can support parallel input time series as separate channels, like the red, green, and blue components of an image. Therefore, we need to split the data into samples maintaining the order of observations across the two input sequences. If we chose three input time steps, then the first sample would look as follows:

Input:

10, 15
20, 25
30, 35

Listing 8.16: Example input from the first sample.

Output:

65

Listing 8.17: Example output from the first sample.

That is, the first three time steps of each parallel series are provided as input to the model, and the model associates this with the value in the output series at the third time step, in this case, 65. We can see that, in transforming the time series into input/output samples to train the model, we will have to discard some values from the output time series where we do not have values in the input time series at prior time steps. In turn, the choice of the size of the number of input time steps will have an important effect on how much of the training data is used. We can define a function named split_sequences() that will take a dataset as we have defined it with rows for time steps and columns for parallel series and return input/output samples.

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 8.18: Example of a function for preparing samples for a dependent time series.
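Note the output index in the function: the target is sequences[end_ix-1, -1], so the output value is taken from the same time step as the last input row, not the step after it. A tiny sketch (plain NumPy, not from the original text) makes the alignment visible for the first sample:

# sketch: alignment of input rows and output value for i=0, n_steps=3
from numpy import array
data = array([[10, 15, 25], [20, 25, 45], [30, 35, 65]])
print(data[0:3, :-1])  # input: rows 0..2 of the two input series
print(data[3-1, -1])   # output: 65, the dependent series at the third time step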
We can test this function on our dataset using three time steps for each input time series as input. The complete example is listed below.

# multivariate data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 8.19: Example of splitting a dependent series into samples.

Running the example first prints the shape of the X and y components. We can see that the X component has a three-dimensional structure. The first dimension is the number of samples, in this case 7. The second dimension is the number of time steps per sample, in this case 3, the value specified to the function. Finally, the last dimension specifies the number of parallel time series or the number of variables, in this case 2 for the two parallel series. This is the exact three-dimensional structure expected by a 1D CNN as input. The data is ready to use without further reshaping. We can then see that the input and output for each sample is printed, showing the three time steps for each of the two input series and the associated output for each sample.

(7, 3, 2) (7,)

[[10 15]
 [20 25]
 [30 35]] 65
[[20 25]
 [30 35]
 [40 45]] 85
[[30 35]
 [40 45]
 [50 55]] 105
[[40 45]
 [50 55]
 [60 65]] 125
[[50 55]
 [60 65]
 [70 75]] 145
[[60 65]
 [70 75]
 [80 85]] 165
[[70 75]
 [80 85]
 [90 95]] 185

Listing 8.20: Example output from splitting a dependent series into samples.

CNN Model

We are now ready to fit a 1D CNN model on this data, specifying the expected number of time steps and features for each input sample, in this case three and two respectively.

# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

Listing 8.21: Example of defining a CNN for forecasting a dependent series.

When making a prediction, the model expects three time steps for two input time series. We can predict the next value in the output series by providing the input values of:

80, 85
90, 95
100, 105

Listing 8.22: Example input for forecasting out-of-sample.

The shape of the one sample with three time steps and two variables must be [1, 3, 2]. We would expect the next value in the sequence to be 100 + 105, or 205.

# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

Listing 8.23: Example of preparing input for forecasting out-of-sample.

The complete example is listed below.

# multivariate cnn example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 8.24: Example of a CNN model for forecasting a dependent time series.

Running the example prepares the data, fits the model, and makes a prediction.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[206.0161]]

Listing 8.25: Example output from a CNN model for forecasting a dependent time series.

Multi-headed CNN Model

There is another, more elaborate way to model the problem. Each input series can be handled by a separate CNN, and the output of each of these submodels can be combined before a prediction is made for the output sequence. We can refer to this as a multi-headed CNN model. It may offer more flexibility or better performance depending on the specifics of the problem that is being modeled. For example, it allows you to configure each submodel differently for each input series, such as the number of filter maps and the kernel size. This type of model can be defined in Keras using the Keras functional API. First, we can define the first input model as a 1D CNN with an input layer that expects vectors with n_steps time steps and 1 feature.

# first input model
visible1 = Input(shape=(n_steps, n_features))
cnn1 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible1)
cnn1 = MaxPooling1D(pool_size=2)(cnn1)
cnn1 = Flatten()(cnn1)

Listing 8.26: Example of defining the first input model.

We can define the second input submodel in the same way.

# second input model
visible2 = Input(shape=(n_steps, n_features))
cnn2 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible2)
cnn2 = MaxPooling1D(pool_size=2)(cnn2)
cnn2 = Flatten()(cnn2)

Listing 8.27: Example of defining the second input model.

Now that both input submodels have been defined, we can merge the output from each model into one long vector, which can be interpreted before making a prediction for the output sequence.

# merge input models
merge = concatenate([cnn1, cnn2])
dense = Dense(50, activation='relu')(merge)
output = Dense(1)(dense)

Listing 8.28: Example of defining the output model.

We can then tie the inputs and outputs together.

# connect input and output models
model = Model(inputs=[visible1, visible2], outputs=output)

Listing 8.29: Example of connecting the input and output models.

The image below provides a schematic for how this model looks, including the shape of the inputs and outputs of each layer.
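A plot like this can be produced with Keras's plot_model() utility (a sketch, assuming pydot and graphviz are installed; the filename is illustrative):

# save a schematic of the model graph with layer shapes
from keras.utils import plot_model
plot_model(model, to_file='multiheaded_cnn.png', show_shapes=True)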

Figure 8.1: Plot of Multi-headed 1D CNN for Multivariate Time Series Forecasting.

This model requires input to be provided as a list of two elements, where each element in the list contains data for one of the submodels. In order to achieve this, we can split the 3D input data into two separate arrays of input data; that is, from one array with the shape [7, 3, 2] to two 3D arrays with the shape [7, 3, 1].

# one time series per head
n_features = 1
# separate input data
X1 = X[:, :, 0].reshape(X.shape[0], X.shape[1], n_features)
X2 = X[:, :, 1].reshape(X.shape[0], X.shape[1], n_features)

Listing 8.30: Example of preparing the input data for the multi-headed model.

These data can then be provided in order to fit the model.

# fit model
model.fit([X1, X2], y, epochs=1000, verbose=0)

Listing 8.31: Example of fitting the multi-headed model.

Similarly, we must prepare the data for a single sample as two separate two-dimensional arrays when making a single one-step prediction.

# reshape one sample for making a prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x1 = x_input[:, 0].reshape((1, n_steps, n_features))
x2 = x_input[:, 1].reshape((1, n_steps, n_features))

Listing 8.32: Example of preparing data for forecasting with the multi-headed model.

We can tie all of this together; the complete example is listed below.

# multivariate multi-headed 1d cnn example
from numpy import array
from numpy import hstack
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D
from keras.layers.merge import concatenate

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# one time series per head
n_features = 1
# separate input data
X1 = X[:, :, 0].reshape(X.shape[0], X.shape[1], n_features)
X2 = X[:, :, 1].reshape(X.shape[0], X.shape[1], n_features)
# first input model
visible1 = Input(shape=(n_steps, n_features))
cnn1 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible1)
cnn1 = MaxPooling1D(pool_size=2)(cnn1)
cnn1 = Flatten()(cnn1)
# second input model
visible2 = Input(shape=(n_steps, n_features))
cnn2 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible2)
cnn2 = MaxPooling1D(pool_size=2)(cnn2)
cnn2 = Flatten()(cnn2)
# merge input models
merge = concatenate([cnn1, cnn2])
dense = Dense(50, activation='relu')(merge)
output = Dense(1)(dense)
# connect input and output models
model = Model(inputs=[visible1, visible2], outputs=output)
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit([X1, X2], y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x1 = x_input[:, 0].reshape((1, n_steps, n_features))
x2 = x_input[:, 1].reshape((1, n_steps, n_features))
yhat = model.predict([x1, x2], verbose=0)
print(yhat)

Listing 8.33: Example of a Multi-headed CNN for forecasting a dependent time series.

Running the example prepares the data, fits the model, and makes a prediction.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[205.871]]

Listing 8.34: Example output from a Multi-headed CNN model for forecasting a dependent time series.

For an example of CNN models developed for a multivariate time series classification problem, see Chapter 24.

8.3.2 Multiple Parallel Series

An alternate time series problem is the case where there are multiple parallel time series and a value must be predicted for each. For example, given the data from the previous section:

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

Listing 8.35: Example of parallel time series.

We may want to predict the value for each of the three time series for the next time step. This might be referred to as multivariate forecasting. Again, the data must be split into input/output samples in order to train a model. The first sample of this dataset would be:

Input:

10, 15, 25
20, 25, 45
30, 35, 65

Listing 8.36: Example input from the first sample.

Output:

40, 45, 85

Listing 8.37: Example output from the first sample.

The split_sequences() function below will split multiple parallel time series with rows for time steps and one series per column into the required input/output shape.

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 8.38: Example of splitting multiple parallel time series into samples.

We can demonstrate this on the contrived problem; the complete example is listed below.

# multivariate output data prep
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 8.39: Example of splitting multiple parallel series into samples.

Running the example first prints the shape of the prepared X and y components. The shape of X is three-dimensional, including the number of samples (6), the number of time steps chosen per sample (3), and the number of parallel time series or features (3). The shape of y is two-dimensional, as we might expect, for the number of samples (6) and the number of time variables per sample to be predicted (3). The data is ready to use in a 1D CNN model that expects three-dimensional input and two-dimensional output shapes for the X and y components of each sample. Then, each of the samples is printed, showing the input and output components of each sample.

(6, 3, 3) (6, 3)

[[10 15 25]
 [20 25 45]
 [30 35 65]] [40 45 85]
[[20 25 45]
 [30 35 65]
 [40 45 85]] [ 50  55 105]
[[ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]] [ 60  65 125]
[[ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]] [ 70  75 145]
[[ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]] [ 80  85 165]
[[ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]] [ 90  95 185]

Listing 8.40: Example output from splitting multiple parallel series into samples.

Vector-Output CNN Model

We are now ready to fit a 1D CNN model on this data. In this model, the number of time steps and parallel series (features) are specified for the input layer via the input_shape argument. The number of parallel series is also used in the specification of the number of values to predict by the model in the output layer; again, this is three.

# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')

Listing 8.41: Example of defining a CNN model for forecasting multiple parallel time series.

We can predict the next value in each of the three parallel series by providing an input of three time steps for each series.

70, 75, 145
80, 85, 165
90, 95, 185

Listing 8.42: Example input for forecasting out-of-sample.

The shape of the input for making a single prediction must be 1 sample, 3 time steps, and 3 features, or [1, 3, 3].

# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

Listing 8.43: Example of preparing data for forecasting out-of-sample.

We would expect the vector output to be: [100, 105, 205]. We can tie all of this together and demonstrate a 1D CNN for multivariate output time series forecasting below.

# multivariate output 1d cnn example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 3
n_features = X.shape[2]
# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=3000, verbose=0)
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 8.44: Example of a CNN model for forecasting multiple parallel time series.

Running the example prepares the data, fits the model, and makes a prediction.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[100.11272 105.32213 205.53436]]

Listing 8.45: Example output from a CNN model for forecasting multiple parallel time series.

Multi-output CNN Model

As with multiple input series, there is another, more elaborate way to model the problem. Each output series can be handled by a separate output CNN model. We can refer to this as a multi-output CNN model. It may offer more flexibility or better performance depending on the specifics of the problem that is being modeled. This type of model can be defined in Keras using the Keras functional API. First, we can define the input model as a 1D CNN model.

# define model
visible = Input(shape=(n_steps, n_features))
cnn = Conv1D(filters=64, kernel_size=2, activation='relu')(visible)
cnn = MaxPooling1D(pool_size=2)(cnn)
cnn = Flatten()(cnn)
cnn = Dense(50, activation='relu')(cnn)

Listing 8.46: Example of defining the input model.

We can then define one output layer for each of the three series that we wish to forecast, where each output submodel will forecast a single time step.

# define output 1
output1 = Dense(1)(cnn)
# define output 2
output2 = Dense(1)(cnn)
# define output 3
output3 = Dense(1)(cnn)

Listing 8.47: Example of defining the output models.

We can then tie the input and output layers together into a single model.

# tie together
model = Model(inputs=visible, outputs=[output1, output2, output3])
model.compile(optimizer='adam', loss='mse')

Listing 8.48: Example of connecting the input and output models.

To make the model architecture clear, the schematic below shows the three separate output layers of the model and the input and output shapes of each layer.

Figure 8.2: Plot of Multi-output 1D CNN for Multivariate Time Series Forecasting.

When training the model, it will require three separate output arrays per sample. We can achieve this by converting the output training data that has the shape [6, 3] into three arrays with the shape [6, 1].

# separate output
y1 = y[:, 0].reshape((y.shape[0], 1))
y2 = y[:, 1].reshape((y.shape[0], 1))
y3 = y[:, 2].reshape((y.shape[0], 1))

Listing 8.49: Example of preparing the output samples for fitting the multi-output model.

These arrays can be provided to the model during training.

# fit model
model.fit(X, [y1,y2,y3], epochs=2000, verbose=0)

Listing 8.50: Example of fitting the multi-output model.

Tying all of this together, the complete example is listed below.

# multivariate output 1d cnn example
from numpy import array
from numpy import hstack
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 3
n_features = X.shape[2]
# separate output
y1 = y[:, 0].reshape((y.shape[0], 1))
y2 = y[:, 1].reshape((y.shape[0], 1))
y3 = y[:, 2].reshape((y.shape[0], 1))
# define model
visible = Input(shape=(n_steps, n_features))
cnn = Conv1D(filters=64, kernel_size=2, activation='relu')(visible)
cnn = MaxPooling1D(pool_size=2)(cnn)
cnn = Flatten()(cnn)
cnn = Dense(50, activation='relu')(cnn)
# define output 1
output1 = Dense(1)(cnn)
# define output 2
output2 = Dense(1)(cnn)
# define output 3
output3 = Dense(1)(cnn)
# tie together
model = Model(inputs=visible, outputs=[output1, output2, output3])
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, [y1,y2,y3], epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 8.51: Example of a Multi-output CNN model for forecasting multiple parallel time series.

Running the example prepares the data, fits the model, and makes a prediction.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[array([[100.96118]], dtype=float32),
 array([[105.502686]], dtype=float32),
 array([[205.98045]], dtype=float32)]

Listing 8.52: Example output from a Multi-output CNN model for forecasting multiple parallel time series.
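The prediction is returned as a list of three arrays, one per output head, rather than as a single row. A small follow-up (a sketch, not in the original example) stacks them into one vector for easier reading:

# stack the three (1, 1) output arrays into a single row
from numpy import hstack
print(hstack(yhat))  # e.g. [[100.96 105.50 205.98]]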
For an example of CNN models developed for a multivariate time series forecasting problem, see Chapter 19.

8.4 Multi-step CNN Models

In practice, there is little difference for the 1D CNN model between predicting a vector output that represents different output variables (as in the previous example) and a vector output that represents multiple time steps of one variable. Nevertheless, there are subtle and important differences in the way the training data is prepared. In this section, we will demonstrate the case of developing a multi-step forecast model using a vector model. Before we look at the specifics of the model, let's first look at the preparation of data for multi-step forecasting.

8.4.1 Data Preparation

As with one-step forecasting, a time series used for multi-step time series forecasting must be split into samples with input and output components. Both the input and output components will be comprised of multiple time steps and may or may not have the same number of steps. For example, given the univariate time series:

[10, 20, 30, 40, 50, 60, 70, 80, 90]

Listing 8.53: Example of a univariate time series.

We could use the last three time steps as input and forecast the next two time steps. The first sample would look as follows:

Input:

[10, 20, 30]

Listing 8.54: Example input for the first sample.

Output:

[40, 50]

Listing 8.55: Example output for the first sample.

The split_sequence() function below implements this behavior and will split a given univariate time series into samples with a specified number of input and output time steps.

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 8.56: Example of a function for splitting a univariate time series into samples for multi-step forecasting.

We can demonstrate this function on the small contrived dataset. The complete example is listed below.

# multi-step data preparation
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 8.57: Example of transforming a time series into samples for multi-step forecasting.

Running the example splits the univariate series into input and output time steps and prints the input and output components of each of the five samples.

[10 20 30] [40 50]
[20 30 40] [50 60]
[30 40 50] [60 70]
[40 50 60] [70 80]
[50 60 70] [80 90]

Listing 8.58: Example output from transforming a time series into samples for multi-step forecasting.

Now that we know how to prepare data for multi-step forecasting, let's look at a 1D CNN model that can learn this mapping.

8.4.2 Vector Output Model

The 1D CNN can output a vector directly that can be interpreted as a multi-step forecast. This approach was seen in the previous section, where one time step of each output time series was forecasted as a vector. As with the 1D CNN models for univariate data in a prior section, the prepared samples must first be reshaped. The CNN expects data to have a three-dimensional structure of [samples, timesteps, features], and in this case, we only have one feature so the reshape is straightforward.

# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

Listing 8.59: Example of reshaping data for multi-step forecasting.

With the number of input and output steps specified in the n_steps_in and n_steps_out variables, we can define a multi-step time series forecasting model.

# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')

Listing 8.60: Example of defining a CNN model for multi-step forecasting.

The model can make a prediction for a single sample. We can predict the next two steps beyond the end of the dataset by providing the input [70, 80, 90]; we would expect the predicted output to be [100, 110]. As expected by the model, the shape of the single sample of input data when making the prediction must be [1, 3, 1] for the 1 sample, 3 time steps of the input, and the single feature.

# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)

Listing 8.61: Example of reshaping data for making an out-of-sample forecast.

Tying all of this together, the 1D CNN for multi-step forecasting with a univariate time series is listed below.

# univariate multi-step vector-output 1d cnn example
from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 8.62: Example of a vector-output CNN for multi-step forecasting.

Running the example forecasts and prints the next two time steps in the sequence.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[102.86651 115.08979]]

Listing 8.63: Example output from a vector-output CNN for multi-step forecasting.

For an example of CNN models developed for a multi-step time series forecasting problem, see Chapter 19.

8.5 Multivariate Multi-step CNN Models

In the previous sections, we have looked at univariate, multivariate, and multi-step time series forecasting. It is possible to mix and match the different types of 1D CNN models presented so far for the different problems. This too applies to time series forecasting problems that involve multivariate and multi-step forecasting, but it may be a little more challenging. In this section, we will explore short examples of data preparation and modeling for multivariate multi-step time series forecasting as a template to ease this challenge, specifically:

1. Multiple Input Multi-step Output.
2. Multiple Parallel Input and Multi-step Output.

Perhaps the biggest stumbling block is in the preparation of data, so this is where we will focus our attention.

8.5.1 Multiple Input Multi-step Output

There are multivariate time series forecasting problems where the output series is separate but dependent upon the input time series, and multiple time steps are required for the output series. For example, consider our multivariate time series from a prior section:

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

Listing 8.64: Example of a multivariate time series.

We may use three prior time steps of each of the two input time series to predict two time steps of the output time series.

Input:

10, 15
20, 25
30, 35

Listing 8.65: Example input for the first sample.

Output:

65
85

Listing 8.66: Example output for the first sample.

The split_sequences() function below implements this behavior.

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out-1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 8.67: Example of a function for transforming a multivariate time series into samples for multi-step forecasting.
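As in the one-step case, the output window is anchored on the last input time step: out_end_ix is offset by n_steps_out-1 and the output slice starts at end_ix-1. A small sketch (plain NumPy, not from the original text) shows the alignment for the first sample:

# sketch: alignment for i=0 with n_steps_in=3, n_steps_out=2
from numpy import array
data = array([[10, 15, 25], [20, 25, 45], [30, 35, 65], [40, 45, 85]])
print(data[0:3, :-1])  # input: three steps of the two input series
print(data[2:4, -1])   # output: [65 85], starting at the last input step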
We can demonstrate this on our contrived dataset. The complete example is listed below.

# multivariate multi-step data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out-1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 8.68: Example of preparing a multivariate input dependent time series with multi-step forecasts.

Running the example first prints the shape of the prepared training data. We can see that the shape of the input portion of the samples is three-dimensional, comprised of six samples, with three time steps and two variables for the two input time series. The output portion of the samples is two-dimensional for the six samples and the two time steps for each sample to be predicted. The prepared samples are then printed to confirm that the data was prepared as we specified.

(6, 3, 2) (6, 2)

[[10 15]
 [20 25]
 [30 35]] [65 85]
[30 35] # choose a number of time steps


[40 45]] [ 85 105] n_steps_in, n_steps_out = 3, 2
[[30 35] # convert into input/output
[40 45] X, y = split_sequences(dataset, n_steps_in, n_steps_out)
[50 55]] [105 125] # the dataset knows the number of features, e.g. 2
[[40 45] n_features = X.shape[2]
[50 55] # define model
[60 65]] [125 145] model = Sequential()
[[50 55] model.add(Conv1D(filters=64, kernel_size=2, activation=✬relu✬, input_shape=(n_steps_in,
[60 65] n_features)))
[70 75]] [145 165] model.add(MaxPooling1D(pool_size=2))
[[60 65] model.add(Flatten())
[70 75] model.add(Dense(50, activation=✬relu✬))
[80 85]] [165 185] model.add(Dense(n_steps_out))
model.compile(optimizer=✬adam✬, loss=✬mse✬)
Listing 8.69: Example output from preparing a multivariate input dependent time series with # fit model
multi-step forecasts. model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
We can now develop a 1D CNN model for multi-step predictions. In this case, we will x_input = array([[70, 75], [80, 85], [90, 95]])
demonstrate a vector output model. The complete example is listed below. x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
# multivariate multi-step 1d cnn example print(yhat)
from numpy import array
from numpy import hstack Listing 8.70: Example of a CNN model for multivariate dependent time series with multi-step
from keras.models import Sequential forecasts.
from keras.layers import Dense
from keras.layers import Flatten Running the example fits the model and predicts the next two time steps of the output
from keras.layers.convolutional import Conv1D
sequence beyond the dataset. We would expect the next two steps to be [185, 205].
from keras.layers.convolutional import MaxPooling1D

# split a multivariate sequence into samples Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider
def split_sequences(sequences, n_steps_in, n_steps_out): running the example a few times.
X, y = list(), list()
for i in range(len(sequences)):
# find the end of this pattern [[185.57011 207.77893]]
end_ix = i + n_steps_in Listing 8.71: Example output from a CNN model for multivariate dependent time series with
out_end_ix = end_ix + n_steps_out-1
# check if we are beyond the dataset
multi-step forecasts.
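As a quick sanity check on a run like this (a small sketch, not part of the original listing), we can measure how far the prediction above is from the expected values with NumPy:

# measure the error of the forecast against the expected values
# (uses the expected and sample output values shown above)
from numpy import array
expected = array([185.0, 205.0])
predicted = array([185.57011, 207.77893])
print(abs(expected - predicted))             # absolute error per forecast step
print(((expected - predicted) ** 2).mean())  # mean squared error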
8.5.2 Multiple Parallel Input and Multi-step Output

A problem with parallel time series may require the prediction of multiple time steps of each time series. For example, consider our multivariate time series from a prior section:

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

Listing 8.72: Example of a multivariate time series.
We may use the last three time steps from each of the three time series as input to the model, and predict the next two time steps of each of the three time series as output. The first sample in the training dataset would be the following.

Input:

10, 15, 25
20, 25, 45
30, 35, 65

Listing 8.73: Example input for the first sample.

Output:

40, 45, 85
50, 55, 105

Listing 8.74: Example output for the first sample.

The split_sequences() function below implements this behavior.

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 8.75: Example of a function for transforming a multivariate time series into samples for multi-step forecasting.

We can demonstrate this function on the small contrived dataset. The complete example is listed below.

# multivariate multi-step data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 8.76: Example preparing a multivariate parallel time series with multi-step forecasts.

Running the example first prints the shape of the prepared training dataset. We can see that both the input (X) and output (y) elements of the dataset are three-dimensional for the number of samples, time steps, and variables or parallel time series respectively. The input and output elements of each series are then printed side by side so that we can confirm that the data was prepared as we expected.

(5, 3, 3) (5, 2, 3)

[[10 15 25]
 [20 25 45]
 [30 35 65]] [[ 40  45  85]
 [ 50  55 105]]
[[20 25 45]
 [30 35 65]
 [40 45 85]] [[ 50  55 105]
 [ 60  65 125]]
[[ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]] [[ 60  65 125]
 [ 70  75 145]]
[[ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]] [[ 70  75 145]
 [ 80  85 165]]
[[ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]] [[ 80  85 165]
 [ 90  95 185]]

Listing 8.77: Example output from preparing a multivariate parallel time series with multi-step forecasts.
We can now develop a 1D CNN model for this dataset. We will use a vector-output model in this case. As such, we must flatten the three-dimensional structure of the output portion of each sample in order to train the model. This means that, instead of predicting two steps for each series, the model is trained on, and expected to predict, a vector of six numbers directly.

# flatten output
n_output = y.shape[1] * y.shape[2]
y = y.reshape((y.shape[0], n_output))

Listing 8.78: Example of flattening output samples for training the model with vector output.

The complete example is listed below.

# multivariate output multi-step 1d cnn example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
# flatten output
n_output = y.shape[1] * y.shape[2]
y = y.reshape((y.shape[0], n_output))
# the dataset knows the number of features, e.g. 3
n_features = X.shape[2]
# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_output))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=7000, verbose=0)
# demonstrate prediction
x_input = array([[60, 65, 125], [70, 75, 145], [80, 85, 165]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 8.79: Example of a CNN model for multivariate parallel time series with multi-step forecasts.

Running the example fits the model and predicts the values for each of the three time series for the next two time steps beyond the end of the dataset. We would expect the values for these series and time steps to be as follows:

90, 95, 185
100, 105, 205

Listing 8.80: Example output for the out-of-sample forecast.

We can see that the model forecast gets reasonably close to the expected values.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[ 90.47855   95.621284 186.02629  100.48118  105.80815  206.52821 ]]

Listing 8.81: Example output from a CNN model for multivariate parallel time series with multi-step forecasts.

For an example of CNN models developed for a multivariate multi-step time series forecasting problem, see Chapter 19.
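Because the model outputs a flat vector of six values, it can help to restore the [steps, series] structure before reading the forecast. A minimal sketch (the variable names are assumed from the listing above; the reshape itself is mine, not from the original example):

# reshape the flat vector forecast back into [steps, series] for readability
# (assumes yhat, n_steps_out and n_features from the example above)
yhat_steps = yhat.reshape((n_steps_out, n_features))
print(yhat_steps)
# row 0 holds the forecast for time step t+1, row 1 for time step t+2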
8.6 Extensions

This section lists some ideas for extending the tutorial that you may wish to explore.

- Problem Differences. Explain the main changes to the CNN required when modeling each of univariate, multivariate and multi-step time series forecasting problems.
- Experiment. Select one example and modify it to work with your own small contrived dataset.
- Develop Framework. Use the examples in this chapter as the basis for a framework for automatically developing a CNN model for a given time series forecasting problem.

If you explore any of these extensions, I'd love to know.
8.7 Further Reading

This section provides more resources on the topic if you are looking to go deeper.

8.7.1 Books

- Deep Learning, 2016.
  https://amzn.to/2MQyLVZ
- Deep Learning with Python, 2017.
  https://amzn.to/2vMRiMe

8.7.2 Papers

- Backpropagation Applied to Handwritten Zip Code Recognition, 1989.
  https://ieeexplore.ieee.org/document/6795724/
- Gradient-based Learning Applied to Document Recognition, 1998.
  https://ieeexplore.ieee.org/document/726791/
- Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014.
  https://arxiv.org/abs/1409.1556

8.7.3 APIs

- Keras: The Python Deep Learning library.
  https://keras.io/
- Getting started with the Keras Sequential model.
  https://keras.io/getting-started/sequential-model-guide/
- Getting started with the Keras functional API.
  https://keras.io/getting-started/functional-api-guide/
- Keras Sequential Model API.
  https://keras.io/models/sequential/
- Keras Core Layers API.
  https://keras.io/layers/core/
- Keras Convolutional Layers API.
  https://keras.io/layers/convolutional/
- Keras Pooling Layers API.
  https://keras.io/layers/pooling/

8.8 Summary

In this tutorial, you discovered how to develop a suite of CNN models for a range of standard time series forecasting problems. Specifically, you learned:

- How to develop CNN models for univariate time series forecasting.
- How to develop CNN models for multivariate time series forecasting.
- How to develop CNN models for multi-step time series forecasting.

8.8.1 Next

In the next lesson, you will discover how to develop Recurrent Neural Network models for time series forecasting.
Chapter 9

How to Develop LSTMs for Time Series Forecasting

Long Short-Term Memory networks, or LSTMs for short, can be applied to time series forecasting. There are many types of LSTM models that can be used for each specific type of time series forecasting problem. In this tutorial, you will discover how to develop a suite of LSTM models for a range of standard time series forecasting problems. The objective of this tutorial is to provide standalone examples of each model on each type of time series problem as a template that you can copy and adapt for your specific time series forecasting problem. After completing this tutorial, you will know:

- How to develop LSTM models for univariate time series forecasting.
- How to develop LSTM models for multivariate time series forecasting.
- How to develop LSTM models for multi-step time series forecasting.

Let's get started.

9.1 Tutorial Overview

In this tutorial, we will explore how to develop a suite of different types of LSTM models for time series forecasting. The models are demonstrated on small contrived time series problems intended to give the flavor of the type of time series problem being addressed. The chosen configuration of the models is arbitrary and not optimized for each problem; that was not the goal. This tutorial is divided into four parts; they are:

1. Univariate LSTM Models
2. Multivariate LSTM Models
3. Multi-step LSTM Models
4. Multivariate Multi-step LSTM Models

9.2 Univariate LSTM Models

LSTMs can be used to model univariate time series forecasting problems. These are problems comprised of a single series of observations and a model is required to learn from the series of past observations to predict the next value in the sequence. We will demonstrate a number of variations of the LSTM model for univariate time series forecasting. This section is divided into six parts; they are:

1. Data Preparation
2. Vanilla LSTM
3. Stacked LSTM
4. Bidirectional LSTM
5. CNN-LSTM
6. ConvLSTM

Each of these models is demonstrated for one-step univariate time series forecasting, but can easily be adapted and used as the input part of a model for other types of time series forecasting problems.

9.2.1 Data Preparation

Before a univariate series can be modeled, it must be prepared. The LSTM model will learn a function that maps a sequence of past observations as input to an output observation. As such, the sequence of observations must be transformed into multiple examples from which the LSTM can learn. Consider a given univariate sequence:

[10, 20, 30, 40, 50, 60, 70, 80, 90]

Listing 9.1: Example of a univariate time series.

We can divide the sequence into multiple input/output patterns called samples, where three time steps are used as input and one time step is used as output for the one-step prediction that is being learned.

X,          y
10, 20, 30  40
20, 30, 40  50
30, 40, 50  60
...

Listing 9.2: Example of a univariate time series as a supervised learning problem.

The split_sequence() function below implements this behavior and will split a given univariate sequence into multiple samples where each sample has a specified number of time steps and the output is a single time step.
# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 9.3: Example of a function to split a univariate series into a supervised learning problem.

We can demonstrate this function on our small contrived dataset above. The complete example is listed below.

# univariate data preparation
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 9.4: Example of transforming a univariate time series into a supervised learning problem.

Running the example splits the univariate series into six samples where each sample has three input time steps and one output time step.

[10 20 30] 40
[20 30 40] 50
[30 40 50] 60
[40 50 60] 70
[50 60 70] 80
[60 70 80] 90

Listing 9.5: Example output from transforming a univariate time series into a supervised learning problem.
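If you prefer to avoid the explicit loop, the same samples can be produced with vectorized NumPy indexing. This is only an alternative sketch; the loop-based split_sequence() is what the rest of the chapter uses.

# vectorized alternative to split_sequence() (a sketch, not used below)
import numpy as np

seq = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
n_steps = 3
# each row of idx holds the input indices for one sample
idx = np.arange(len(seq) - n_steps)[:, None] + np.arange(n_steps)
X, y = seq[idx], seq[idx[:, -1] + 1]
print(X.shape, y.shape)  # (6, 3) (6,)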
Now that we know how to prepare a univariate series for modeling, let's look at developing LSTM models that can learn the mapping of inputs to outputs, starting with a Vanilla LSTM.

9.2.2 Vanilla LSTM

A Vanilla LSTM is an LSTM model that has a single hidden layer of LSTM units, and an output layer used to make a prediction. Key to LSTMs is that they offer native support for sequences. Unlike a CNN that reads across the entire input vector, the LSTM model reads one time step of the sequence at a time and builds up an internal state representation that can be used as a learned context for making a prediction. We can define a Vanilla LSTM for univariate time series forecasting as follows.

# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

Listing 9.6: Example of defining a Vanilla LSTM model.

Key in the definition is the shape of the input; that is, what the model expects as input for each sample in terms of the number of time steps and the number of features. We are working with a univariate series, so the number of features is one, for one variable. The number of time steps as input is the number we chose when preparing our dataset as an argument to the split_sequence() function.

The shape of the input for each sample is specified in the input_shape argument on the definition of the first hidden layer. We almost always have multiple samples, therefore, the model will expect the input component of the training data to have the dimensions or shape: [samples, timesteps, features]. Our split_sequence() function in the previous section outputs the X with the shape [samples, timesteps], so we easily reshape it to have an additional dimension for the one feature.

# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

Listing 9.7: Example of reshaping training data for the LSTM.

In this case, we define a model with 50 LSTM units in the hidden layer and an output layer that predicts a single numerical value. The model is fit using the efficient Adam version of stochastic gradient descent and optimized using the mean squared error, or 'mse', loss function. Once the model is defined, we can fit it on the training dataset.

# fit model
model.fit(X, y, epochs=200, verbose=0)

Listing 9.8: Example of fitting the LSTM model.
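Because verbose=0 hides the training progress, one way to confirm that the model actually converged over those 200 epochs is to keep the History object that fit() returns. A small optional sketch:

# optionally capture the training history to confirm convergence
history = model.fit(X, y, epochs=200, verbose=0)
print(history.history['loss'][-1])  # final training loss (MSE); should be small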
After the model is fit, we can use it to make a prediction. We can predict the next value in the sequence by providing the input: [70, 80, 90], and expecting the model to predict something like: [100]. The model expects the input shape to be three-dimensional with [samples, timesteps, features], therefore, we must reshape the single input sample before making the prediction.

# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

Listing 9.9: Example of preparing an input sample ready for making an out-of-sample forecast.

We can tie all of this together and demonstrate how to develop a Vanilla LSTM for univariate time series forecasting and make a single prediction.

# univariate lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.10: Example of a Vanilla LSTM for univariate time series forecasting.

Running the example prepares the data, fits the model, and makes a prediction. We can see that the model predicts the next value in the sequence.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[102.09213]]

Listing 9.11: Example output from a Vanilla LSTM for univariate time series forecasting.

9.2.3 Stacked LSTM

Multiple hidden LSTM layers can be stacked one on top of another in what is referred to as a Stacked LSTM model. An LSTM layer requires a three-dimensional input and LSTMs by default will produce a two-dimensional output as an interpretation from the end of the sequence. We can address this by having the LSTM output a value for each time step in the input data by setting the return_sequences=True argument on the layer. This allows us to have 3D output from the hidden LSTM layer as input to the next. We can therefore define a Stacked LSTM as follows.

# define model
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

Listing 9.12: Example of defining a Stacked LSTM model.
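To see the effect of return_sequences=True, it can help to print the layer output shapes. A minimal standalone sketch using the same layer sizes as above:

# inspect layer output shapes to see the effect of return_sequences=True
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(3, 1)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
# first LSTM outputs (None, 3, 50): one 50-unit vector per time step;
# second LSTM outputs (None, 50): a single interpretation of the sequence
model.summary()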
We can tie this together; the complete code example is listed below.

# univariate stacked lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a univariate sequence
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.13: Example of a Stacked LSTM for univariate time series forecasting.

Running the example predicts the next value in the sequence, which we expect would be 100.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[102.47341]]

Listing 9.14: Example output from a Stacked LSTM for univariate time series forecasting.

9.2.4 Bidirectional LSTM

On some sequence prediction problems, it can be beneficial to allow the LSTM model to learn the input sequence both forward and backwards and concatenate both interpretations. This is called a Bidirectional LSTM. We can implement a Bidirectional LSTM for univariate time series forecasting by wrapping the first hidden layer in a wrapper layer called Bidirectional. An example of defining a Bidirectional LSTM to read input both forward and backward is as follows.

# define model
model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

Listing 9.15: Example of defining a Bidirectional LSTM model.

The complete example of the Bidirectional LSTM for univariate time series forecasting is listed below.

# univariate bidirectional lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Bidirectional

# split a univariate sequence
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.16: Example of a Bidirectional LSTM for univariate time series forecasting.

Running the example predicts the next value in the sequence, which we expect would be 100.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[101.48093]]

Listing 9.17: Example output from a Bidirectional LSTM for univariate time series forecasting.
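One implementation detail worth noting before moving on (an aside, not from the listing above): the Bidirectional wrapper concatenates the forward and backward interpretations by default, so the wrapped 50-unit LSTM actually passes 100 features to the Dense layer. This can be verified with a quick sketch, and changed via the merge_mode argument:

# the Bidirectional wrapper's default merge_mode='concat' doubles the
# feature count passed on (50 forward + 50 backward = 100)
from keras.models import Sequential
from keras.layers import LSTM, Dense, Bidirectional

model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(3, 1)))
model.add(Dense(1))
model.summary()  # the Bidirectional layer reports an output shape of (None, 100)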
9.2.5 CNN-LSTM

A convolutional neural network, or CNN for short, is a type of neural network developed for working with two-dimensional image data. The CNN can be very effective at automatically extracting and learning features from one-dimensional sequence data such as univariate time series data. A CNN model can be used in a hybrid model with an LSTM backend where the CNN is used to interpret subsequences of input that together are provided as a sequence to an LSTM model to interpret. This hybrid model is called a CNN-LSTM.

The first step is to split the input sequences into subsequences that can be processed by the CNN model. For example, we can first split our univariate time series data into input/output samples with four steps as input and one as output. Each sample can then be split into two sub-samples, each with two time steps. The CNN can interpret each subsequence of two time steps and provide a time series of interpretations of the subsequences to the LSTM model to process as input. We can parameterize this and define the number of subsequences as n_seq and the number of time steps per subsequence as n_steps. The input data can then be reshaped to have the required structure: [samples, subsequences, timesteps, features]. For example:

# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, n_steps, n_features))

Listing 9.18: Example of reshaping data for a CNN-LSTM model.
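As a quick check of this reshape (a sketch, assuming the split_sequence() helper defined earlier in this chapter), the nine-value series yields five samples of four steps, which become five samples of two subsequences of two steps each:

# sanity check of the CNN-LSTM reshape (assumes split_sequence() from above)
X, y = split_sequence([10, 20, 30, 40, 50, 60, 70, 80, 90], 4)
print(X.shape)  # (5, 4): five samples of four time steps
X = X.reshape((X.shape[0], 2, 2, 1))
print(X.shape)  # (5, 2, 2, 1): [samples, subsequences, timesteps, features]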

We want to reuse the same CNN model when reading in each sub-sequence of data separately. This can be achieved by wrapping the entire CNN model in a TimeDistributed wrapper that will apply the entire model once per input, in this case, once per input subsequence. The CNN model first has a convolutional layer for reading across the subsequence that requires a number of filters and a kernel size to be specified. The number of filters is the number of reads or interpretations of the input sequence. The kernel size is the number of time steps included in each read operation of the input sequence. The convolution layer is followed by a max pooling layer that distills the filter maps down to half their size, keeping the most salient features. These structures are then flattened down to a single one-dimensional vector to be used as a single input time step to the LSTM layer.

# define the input cnn model
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, n_steps, n_features)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))

Listing 9.19: Example of defining the CNN input model.

Next, we can define the LSTM part of the model that interprets the CNN model's read of the input sequence and makes a prediction.

# define the output model
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))

Listing 9.20: Example of defining the LSTM output model.

We can tie all of this together; the complete example of a CNN-LSTM model for univariate time series forecasting is listed below.

# univariate cnn lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import TimeDistributed
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, subsequences, timesteps, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, n_steps, n_features))
# define model
model = Sequential()
model.add(TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu'), input_shape=(None, n_steps, n_features)))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([60, 70, 80, 90])
x_input = x_input.reshape((1, n_seq, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.21: Example of a CNN-LSTM for univariate time series forecasting.

Running the example predicts the next value in the sequence, which we expect would be 100.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[101.69263]]

Listing 9.22: Example output from a CNN-LSTM for univariate time series forecasting.

9.2.6 ConvLSTM

A type of LSTM related to the CNN-LSTM is the ConvLSTM, where the convolutional reading of input is built directly into each LSTM unit. The ConvLSTM was developed for reading two-dimensional spatial-temporal data, but can be adapted for use with univariate time series forecasting. The layer expects input as a sequence of two-dimensional images, therefore the shape of input data must be: [samples, timesteps, rows, columns, features].

For our purposes, we can split each sample into subsequences where timesteps will become the number of subsequences, or n_seq, and columns will be the number of time steps for each subsequence, or n_steps. The number of rows is fixed at 1 as we are working with one-dimensional data. We can now reshape the prepared samples into the required structure.

# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, rows, columns, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, 1, n_steps, n_features))

Listing 9.23: Example of reshaping data for a ConvLSTM model.

We can define the ConvLSTM as a single layer in terms of the number of filters and a two-dimensional kernel size in terms of (rows, columns). As we are working with a one-dimensional series, the number of rows is always fixed to 1 in the kernel. The output of the model must then be flattened before it can be interpreted and a prediction made.

# define the input cnnlstm model
model.add(ConvLSTM2D(filters=64, kernel_size=(1,2), activation='relu', input_shape=(n_seq, 1, n_steps, n_features)))
model.add(Flatten())

Listing 9.24: Example of defining the ConvLSTM input model.

The complete example of a ConvLSTM for one-step univariate time series forecasting is listed below.

# univariate convlstm example
from numpy import array
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import ConvLSTM2D

# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 4
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, rows, columns, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, 1, n_steps, n_features))
# define model
model = Sequential()
model.add(ConvLSTM2D(filters=64, kernel_size=(1,2), activation='relu', input_shape=(n_seq, 1, n_steps, n_features)))
model.add(Flatten())
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([60, 70, 80, 90])
x_input = x_input.reshape((1, n_seq, 1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.25: Example of a ConvLSTM for univariate time series forecasting.

Running the example predicts the next value in the sequence, which we expect would be 100.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.
[[103.68166]]

Listing 9.26: Example output from a ConvLSTM for univariate time series forecasting.

For an example of an LSTM applied to a real-world univariate time series forecasting problem, see Chapter 14. For an example of grid searching LSTM hyperparameters on a univariate time series forecasting problem, see Chapter 15. Now that we have looked at LSTM models for univariate data, let's turn our attention to multivariate data.

9.3 Multivariate LSTM Models

Multivariate time series data means data where there is more than one observation for each time step. There are two main models that we may require with multivariate time series data; they are:

1. Multiple Input Series.
2. Multiple Parallel Series.

Let's take a look at each in turn.

9.3.1 Multiple Input Series

A problem may have two or more parallel input time series and an output time series that is dependent on the input time series. The input time series are parallel because each series has an observation at the same time steps. We can demonstrate this with a simple example of two parallel input time series where the output series is the simple addition of the input series.

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])

Listing 9.27: Example of defining multiple input and a dependent time series.

We can reshape these three arrays of data as a single dataset where each row is a time step, and each column is a separate time series. This is a standard way of storing parallel time series in a CSV file.

# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

Listing 9.28: Example of reshaping the parallel series into the columns of a dataset.

The complete example is listed below.

# multivariate data preparation
from numpy import array
from numpy import hstack
# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
print(dataset)

Listing 9.29: Example of defining a dependent series dataset.

Running the example prints the dataset with one row per time step and one column for each of the two input and one output parallel time series.

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

Listing 9.30: Example output from defining a dependent series dataset.

As with the univariate time series, we must structure these data into samples with input and output elements. An LSTM model needs sufficient context to learn a mapping from an input sequence to an output value. LSTMs can support parallel input time series as separate variables or features. Therefore, we need to split the data into samples maintaining the order of observations across the two input sequences. If we chose three input time steps, then the first sample would look as follows:

Input:

10, 15
20, 25
30, 35

Listing 9.31: Example input from the first sample.

Output:

65

Listing 9.32: Example output from the first sample.

That is, the first three time steps of each parallel series are provided as input to the model and the model associates this with the value in the output series at the third time step, in this case, 65. We can see that, in transforming the time series into input/output samples to train the model, we will have to discard some values from the output time series where we do
not have values in the input time series at prior time steps. In turn, the choice of the size of the number of input time steps will have an important effect on how much of the training data is used. We can define a function named split_sequences() that will take a dataset as we have defined it, with rows for time steps and columns for parallel series, and return input/output samples.

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 9.33: Example of a function for transforming a dependent series dataset into samples.

We can test this function on our dataset using three time steps for each input time series as input. The complete example is listed below.

# multivariate data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 9.34: Example of splitting a dependent series dataset into samples.

Running the example first prints the shape of the X and y components. We can see that the X component has a three-dimensional structure. The first dimension is the number of samples, in this case 7. The second dimension is the number of time steps per sample, in this case 3, the value specified to the function. Finally, the last dimension specifies the number of parallel time series or the number of variables, in this case 2 for the two parallel series. This is the exact three-dimensional structure expected by an LSTM as input. The data is ready to use without further reshaping. We can then see that the input and output for each sample is printed, showing the three time steps for each of the two input series and the associated output for each sample.

(7, 3, 2) (7,)

[[10 15]
 [20 25]
 [30 35]] 65
[[20 25]
 [30 35]
 [40 45]] 85
[[30 35]
 [40 45]
 [50 55]] 105
[[40 45]
 [50 55]
 [60 65]] 125
[[50 55]
 [60 65]
 [70 75]] 145
[[60 65]
 [70 75]
 [80 85]] 165
[[70 75]
 [80 85]
 [90 95]] 185

Listing 9.35: Example output from splitting a dependent series dataset into samples.

We are now ready to fit an LSTM model on this data. Any of the varieties of LSTMs in the previous section can be used, such as a Vanilla, Stacked, Bidirectional, CNN, or ConvLSTM model. We will use a Vanilla LSTM where the number of time steps and parallel series (features) are specified for the input layer via the input_shape argument.

# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

Listing 9.36: Example of defining a Vanilla LSTM for modeling a dependent series.

When making a prediction, the model expects three time steps for two input time series. We can predict the next value in the output series providing the input values of:

80, 85
90, 95
100, 105

Listing 9.37: Example input for making an out-of-sample forecast.

The shape of the one sample with three time steps and two variables must be [1, 3, 2]. We would expect the next value in the sequence to be 100 + 105, or 205.

# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

Listing 9.38: Example of making an out-of-sample forecast.

The complete example is listed below.

# multivariate lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.39: Example of a Vanilla LSTM for multivariate dependent time series forecasting.

Running the example prepares the data, fits the model, and makes a prediction.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[208.13531]]

Listing 9.40: Example output from a Vanilla LSTM for multivariate dependent time series forecasting.

For an example of LSTM models developed for a multivariate time series classification problem, see Chapter 25.

9.3.2 Multiple Parallel Series

An alternate time series problem is the case where there are multiple parallel time series and a value must be predicted for each. For example, given the data from the previous section:

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

Listing 9.41: Example of a multivariate parallel time series.

We may want to predict the value for each of the three time series for the next time step. This might be referred to as multivariate forecasting. Again, the data must be split into input/output samples in order to train a model. The first sample of this dataset would be:

Input:
10, 15, 25
20, 25, 45
30, 35, 65

Listing 9.42: Example input from the first sample.

Output:

40, 45, 85

Listing 9.43: Example output from the first sample.

The split_sequences() function below will split multiple parallel time series with rows for time steps and one series per column into the required input/output shape.

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 9.44: Example of a function for splitting parallel series into samples.

We can demonstrate this on the contrived problem; the complete example is listed below.

# multivariate output data prep
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 9.45: Example of splitting a multivariate parallel time series into samples.

Running the example first prints the shape of the prepared X and y components. The shape of X is three-dimensional, including the number of samples (6), the number of time steps chosen per sample (3), and the number of parallel time series or features (3). The shape of y is two-dimensional as we might expect for the number of samples (6) and the number of variables per sample to be predicted (3). The data is ready to use in an LSTM model that expects three-dimensional input and two-dimensional output shapes for the X and y components of each sample. Then, each of the samples is printed showing the input and output components of each sample.

(6, 3, 3) (6, 3)

[[10 15 25]
 [20 25 45]
 [30 35 65]] [40 45 85]
[[20 25 45]
 [30 35 65]
 [40 45 85]] [ 50  55 105]
[[ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]] [ 60  65 125]
[[ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]] [ 70  75 145]
[[ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]] [ 80  85 165]
[[ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]] [ 90  95 185]

Listing 9.46: Example output from splitting a multivariate parallel time series into samples.

We are now ready to fit an LSTM model on this data. Any of the varieties of LSTMs in the previous section can be used, such as a Vanilla, Stacked, Bidirectional, CNN, or ConvLSTM model. We will use a Stacked LSTM where the number of time steps and parallel series (features) are specified for the input layer via the input_shape argument. The number of parallel series is also used in the specification of the number of values to predict by the model in the output layer; again, this is three.
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')

Listing 9.47: Example of defining a Stacked LSTM for parallel time series forecasting.

We can predict the next value in each of the three parallel series by providing an input of three time steps for each series.

70, 75, 145
80, 85, 165
90, 95, 185

Listing 9.48: Example input for making an out-of-sample forecast.

The shape of the input for making a single prediction must be 1 sample, 3 time steps, and 3 features, or [1, 3, 3].

# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)

Listing 9.49: Example of reshaping a sample for making an out-of-sample forecast.

We would expect the vector output to be:

[100, 105, 205]

Listing 9.50: Example output for an out-of-sample forecast.

We can tie all of this together and demonstrate a Stacked LSTM for multivariate output time series forecasting below.

# multivariate output stacked lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 3
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=400, verbose=0)
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.51: Example of a Stacked LSTM for multivariate parallel time series forecasting.

Running the example prepares the data, fits the model, and makes a prediction.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[101.76599  108.730484 206.63577 ]]

Listing 9.52: Example output from a Stacked LSTM for multivariate parallel time series forecasting.

For an example of LSTM models developed for a multivariate time series forecasting problem, see Chapter 20.

9.4 Multi-step LSTM Models

A time series forecasting problem that requires a prediction of multiple time steps into the future can be referred to as multi-step time series forecasting. Specifically, these are problems where the forecast horizon or interval is more than one time step. There are two main types of LSTM models that can be used for multi-step forecasting; they are:
1. Vector Output Model
2. Encoder-Decoder Model

Before we look at these models, let's first look at the preparation of data for multi-step forecasting.

9.4.1 Data Preparation

As with one-step forecasting, a time series used for multi-step time series forecasting must be split into samples with input and output components. Both the input and output components will be comprised of multiple time steps and may or may not have the same number of steps. For example, given the univariate time series:

[10, 20, 30, 40, 50, 60, 70, 80, 90]

Listing 9.53: Example of a univariate time series.

We could use the last three time steps as input and forecast the next two time steps. The first sample would look as follows:

Input:

[10, 20, 30]

Listing 9.54: Example input from the first sample.

Output:

[40, 50]

Listing 9.55: Example output from the first sample.

The split_sequence() function below implements this behavior and will split a given univariate time series into samples with a specified number of input and output time steps.

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 9.56: Example of a function for splitting a univariate series into samples for multi-step forecasting.

We can demonstrate this function on the small contrived dataset. The complete example is listed below.

# multi-step data preparation
from numpy import array

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 9.57: Example of splitting a univariate series for multi-step forecasting into samples.

Running the example splits the univariate series into input and output time steps and prints the input and output components of each.

[10 20 30] [40 50]
[20 30 40] [50 60]
[30 40 50] [60 70]
[40 50 60] [70 80]
[50 60 70] [80 90]

Listing 9.58: Example output from splitting a univariate series for multi-step forecasting into samples.
X.append(seq_x) 9.4.2 Vector Output Model
y.append(seq_y)
return array(X), array(y) Like other types of neural network models, the LSTM can output a vector directly that can
Listing 9.56: Example of a function for splitting a univariate series into samples for multi-step be interpreted as a multi-step forecast. This approach was seen in the previous section were
forecasting. one time step of each output time series was forecasted as a vector. As with the LSTMs for
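As an aside, the same fixed-window split can be expressed directly with NumPy. A minimal sketch, assuming NumPy 1.20 or later for sliding_window_view; this helper is not one of the chapter's numbered listings:

# multi-step data preparation with numpy windowing (a sketch)
from numpy import array
from numpy.lib.stride_tricks import sliding_window_view

raw_seq = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
n_steps_in, n_steps_out = 3, 2
# take every window of length n_steps_in + n_steps_out, then split each window
windows = sliding_window_view(raw_seq, n_steps_in + n_steps_out)
X, y = windows[:, :n_steps_in], windows[:, n_steps_in:]
print(X.shape, y.shape)  # (5, 3) (5, 2), matching the output of split_sequence()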
Now that we know how to prepare data for multi-step forecasting, let's look at some LSTM models that can learn this mapping.

9.4.2 Vector Output Model

Like other types of neural network models, the LSTM can output a vector directly that can be interpreted as a multi-step forecast. This approach was seen in the previous section where one time step of each output time series was forecast as a vector. As with the LSTMs for univariate data in a prior section, the prepared samples must first be reshaped. The LSTM expects data to have a three-dimensional structure of [samples, timesteps, features], and in this case, we only have one feature so the reshape is straightforward.

# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

Listing 9.59: Example of preparing data for fitting an LSTM model.

With the number of input and output steps specified in the n_steps_in and n_steps_out variables, we can define a multi-step time series forecasting model. Any of the presented LSTM model types could be used, such as Vanilla, Stacked, Bidirectional, CNN-LSTM, or ConvLSTM. Below defines a Stacked LSTM for multi-step forecasting.

# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')

Listing 9.60: Example of defining a Stacked LSTM for multi-step forecasting.
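For comparison, the Vanilla variant mentioned above would use a single hidden LSTM layer. A minimal sketch, not one of the numbered listings, reusing the n_steps_in, n_steps_out, and n_features variables defined above:

# a Vanilla LSTM alternative: one hidden LSTM layer with the same vector output
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')

Either variant outputs a vector of n_steps_out values per sample; the rest of this section continues with the Stacked version.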
The model can make a prediction for a single sample. We can predict the next two steps beyond the end of the dataset by providing the input:

[70, 80, 90]

Listing 9.61: Example input for making an out-of-sample forecast.

We would expect the predicted output to be:

[100, 110]

Listing 9.62: Expected output for making an out-of-sample forecast.

As expected by the model, the shape of the single sample of input data when making the prediction must be [1, 3, 1] for the 1 sample, 3 time steps of the input, and the single feature.

# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)

Listing 9.63: Example of preparing a sample for making an out-of-sample forecast.

Tying all of this together, the Stacked LSTM for multi-step forecasting with a univariate time series is listed below.

# univariate multi-step vector-output stacked lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=50, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.64: Example of a Stacked LSTM for multi-step time series forecasting.

Running the example forecasts and prints the next two time steps in the sequence.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[100.98096 113.28924]]

Listing 9.65: Example output from a Stacked LSTM for multi-step time series forecasting.

9.4.3 Encoder-Decoder Model

A model specifically developed for forecasting variable-length output sequences is called the Encoder-Decoder LSTM. The model was designed for prediction problems where there are both input and output sequences, so-called sequence-to-sequence, or seq2seq problems, such as translating text from one language to another. This model can be used for multi-step time series forecasting. As its name suggests, the model is comprised of two sub-models: the encoder and the decoder.
The encoder is a model responsible for reading and interpreting the input sequence. The output of the encoder is a fixed-length vector that represents the model's interpretation of the sequence. The encoder is traditionally a Vanilla LSTM model, although other encoder models can be used such as Stacked, Bidirectional, and CNN models.

# define encoder model
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))

Listing 9.66: Example of defining the encoder input model.

The decoder uses the output of the encoder as an input. First, the fixed-length output of the encoder is repeated, once for each required time step in the output sequence.

# repeat encoding
model.add(RepeatVector(n_steps_out))

Listing 9.67: Example of repeating the encoded vector.

This sequence is then provided to an LSTM decoder model. The model must output a value for each time step in the output sequence, which can be interpreted by a single output model.

# define decoder model
model.add(LSTM(100, activation='relu', return_sequences=True))

Listing 9.68: Example of defining the decoder model.

We can use the same output layer or layers to make each one-step prediction in the output sequence. This can be achieved by wrapping the output part of the model in a TimeDistributed wrapper.

# define model output
model.add(TimeDistributed(Dense(1)))

Listing 9.69: Example of defining the output model.

The full definition for an Encoder-Decoder model for multi-step time series forecasting is listed below.

# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')

Listing 9.70: Example of defining an Encoder-Decoder LSTM for multi-step forecasting.
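To see how RepeatVector and TimeDistributed change the shape of the data as it flows through the model, it can help to print a model summary. A minimal sketch, assuming n_steps_in=3, n_steps_out=2, and n_features=1 as in this section; the shapes in the comments are what Keras is expected to report:

# trace the layer output shapes of the encoder-decoder model (a sketch)
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed

n_steps_in, n_steps_out, n_features = 3, 2, 1
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))  # (None, 100)
model.add(RepeatVector(n_steps_out))  # (None, 2, 100): the encoding repeated once per output step
model.add(LSTM(100, activation='relu', return_sequences=True))  # (None, 2, 100)
model.add(TimeDistributed(Dense(1)))  # (None, 2, 1): one value per output time step
model.summary()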
As with other LSTM models, the input data must be reshaped into the expected three-dimensional shape of [samples, timesteps, features].

# reshape input training data
X = X.reshape((X.shape[0], X.shape[1], n_features))

Listing 9.71: Example of reshaping input samples for training the Encoder-Decoder LSTM.

In the case of the Encoder-Decoder model, the output, or y part, of the training dataset must also have this shape. This is because the model will predict a given number of time steps with a given number of features for each input sample.

# reshape output training data
y = y.reshape((y.shape[0], y.shape[1], n_features))

Listing 9.72: Example of reshaping output samples for training the Encoder-Decoder LSTM.
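As a quick sanity check, we can print the shapes after both reshapes; a short sketch, where the expected values follow from splitting the 9-step series with n_steps_in=3 and n_steps_out=2, which yields five samples:

# confirm both X and y are three-dimensional
print(X.shape, y.shape)  # expected: (5, 3, 1) (5, 2, 1)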
The complete example of an Encoder-Decoder LSTM for multi-step time series forecasting is listed below.

# univariate multi-step encoder-decoder lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed

# split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
y = y.reshape((y.shape[0], y.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=100, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.73: Example of an Encoder-Decoder LSTM for multi-step time series forecasting.

Running the example forecasts and prints the next two time steps in the sequence.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[[101.9736  ]
  [116.213615]]]

Listing 9.74: Example output from an Encoder-Decoder LSTM for multi-step time series forecasting.
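Note that the prediction comes back with the three-dimensional shape [1, n_steps_out, 1]. To recover the two forecast values as a flat vector, you can slice away the sample and feature dimensions; a one-line sketch using the yhat from the listing above:

# reduce the (1, 2, 1) prediction to a flat vector of two values
print(yhat[0, :, 0])  # e.g. [101.9736 116.213615]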
For an example of LSTM models developed for a multi-step time series forecasting problem, see Chapter 20.

9.5 Multivariate Multi-step LSTM Models

In the previous sections, we have looked at univariate, multivariate, and multi-step time series forecasting. It is possible to mix and match the different types of LSTM models presented so far for the different problems. This too applies to time series forecasting problems that involve multivariate and multi-step forecasting, but it may be a little more challenging. In this section, we will provide short examples of data preparation and modeling for multivariate multi-step time series forecasting as a template to ease this challenge, specifically:

1. Multiple Input Multi-step Output.

2. Multiple Parallel Input and Multi-step Output.

Perhaps the biggest stumbling block is in the preparation of data, so this is where we will focus our attention.

9.5.1 Multiple Input Multi-step Output

There are those multivariate time series forecasting problems where the output series is separate but dependent upon the input time series, and multiple time steps are required for the output series. For example, consider our multivariate time series from a prior section:

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

Listing 9.75: Example of a multivariate dependent time series.

We may use three prior time steps of each of the two input time series to predict two time steps of the output time series.

Input:

10, 15
20, 25
30, 35

Listing 9.76: Example of input from the first sample.

Output:

65
85

Listing 9.77: Example of output from the first sample.

The split_sequences() function below implements this behavior. Note that the output window is aligned so that it begins at the final input time step (65 is the output for the last input row), which is why the function offsets the output indices with end_ix-1 and n_steps_out-1.

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out-1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 9.78: Example of a function for splitting a dependent series for multi-step forecasting into samples.

We can demonstrate this on our contrived dataset. The complete example is listed below.

# multivariate multi-step data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out-1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 9.79: Example of splitting a dependent series for multi-step forecasting into samples.
Running the example first prints the shape of the prepared training data. We can see that the shape of the input portion of the samples is three-dimensional, comprised of six samples, with three time steps, and two variables for the two input time series. The output portion of the samples is two-dimensional for the six samples and the two time steps for each sample to be predicted. The prepared samples are then printed to confirm that the data was prepared as we specified.

(6, 3, 2) (6, 2)

[[10 15]
 [20 25]
 [30 35]] [65 85]
[[20 25]
 [30 35]
 [40 45]] [ 85 105]
[[30 35]
 [40 45]
 [50 55]] [105 125]
[[40 45]
 [50 55]
 [60 65]] [125 145]
[[50 55]
 [60 65]
 [70 75]] [145 165]
[[60 65]
 [70 75]
 [80 85]] [165 185]

Listing 9.80: Example output from splitting a dependent series for multi-step forecasting into samples.

We can now develop an LSTM model for multi-step predictions. A vector output or an encoder-decoder model could be used. In this case, we will demonstrate a vector output with a Stacked LSTM. The complete example is listed below.

# multivariate multi-step stacked lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out-1
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([[70, 75], [80, 85], [90, 95]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.81: Example of a Stacked LSTM for multi-step forecasting for a dependent series.
Running the example fits the model and predicts the next two time steps of the output sequence beyond the dataset. We would expect the next two steps to be [185, 205]: 185 is the output aligned with the final input time step (90 + 95), and 205 is the output one step beyond it (100 + 105). It is a challenging framing of the problem with very little data, and the arbitrarily configured version of the model gets close.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.

[[188.70619 210.16513]]

Listing 9.82: Example output from a Stacked LSTM for multi-step forecasting for a dependent series.

9.5.2 Multiple Parallel Input and Multi-step Output

A problem with parallel time series may require the prediction of multiple time steps of each time series. For example, consider our multivariate time series from a prior section:

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]

Listing 9.83: Example of a multivariate parallel time series dataset.

We may use the last three time steps from each of the three time series as input to the model and predict the next two time steps of each of the three time series as output. The first sample in the training dataset would be the following.

Input:

10, 15, 25
20, 25, 45
30, 35, 65

Listing 9.84: Example of input from the first sample.

Output:

40, 45, 85
50, 55, 105

Listing 9.85: Example of output from the first sample.

The split_sequences() function below implements this behavior.

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

Listing 9.86: Example of a function for splitting a parallel dataset for multi-step forecasting into samples.

We can demonstrate this function on the small contrived dataset. The complete example is listed below.

# multivariate multi-step data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

Listing 9.87: Example of splitting a parallel series for multi-step forecasting into samples.
Running the example first prints the shape of the prepared training dataset. We can see that both the input (X) and output (y) elements of the dataset are three-dimensional for the number of samples, time steps, and variables or parallel time series respectively. The input and output elements of each sample are then printed side by side so that we can confirm that the data was prepared as we expected.

(5, 3, 3) (5, 2, 3)

[[10 15 25]
 [20 25 45]
 [30 35 65]] [[ 40  45  85]
 [ 50  55 105]]
[[20 25 45]
 [30 35 65]
 [40 45 85]] [[ 50  55 105]
 [ 60  65 125]]
[[ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]] [[ 60  65 125]
 [ 70  75 145]]
[[ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]] [[ 70  75 145]
 [ 80  85 165]]
[[ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]] [[ 80  85 165]
 [ 90  95 185]]

Listing 9.88: Example output from splitting a parallel series for multi-step forecasting into samples.

We can use either the Vector Output or Encoder-Decoder LSTM to model this problem. In this case, we will use the Encoder-Decoder model. The complete example is listed below.

# multivariate multi-step encoder-decoder lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps_in, n_steps_out = 3, 2
# convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)
# the dataset knows the number of features, e.g. 3
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(200, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=300, verbose=0)
# demonstrate prediction
x_input = array([[60, 65, 125], [70, 75, 145], [80, 85, 165]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Listing 9.89: Example of an Encoder-Decoder LSTM for multi-step forecasting for parallel series.

Running the example fits the model and predicts the values for each of the three time series for the next two time steps beyond the end of the dataset. We would expect the values for these series and time steps to be as follows:

90, 95, 185
100, 105, 205

Listing 9.90: Expected output for an out-of-sample forecast.

We can see that the model forecast gets reasonably close to the expected values.

Note: Given the stochastic nature of the algorithm, your specific results may vary. Consider running the example a few times.
[[[ 91.86044   97.77231  189.66768 ]
  [103.299355 109.18123  212.6863  ]]]

Listing 9.91: Example output from an Encoder-Decoder LSTM for multi-step forecasting for parallel series.
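To quantify how close the forecast is, one option is to compute an error summary against the expected values; a minimal sketch using the predicted values printed above (the variable names are illustrative, not from the chapter's listings):

# error of the example forecast against the expected values (a sketch)
from numpy import array
from numpy import sqrt

expected = array([[90.0, 95.0, 185.0],
                  [100.0, 105.0, 205.0]])
predicted = array([[91.86044, 97.77231, 189.66768],
                   [103.299355, 109.18123, 212.6863]])
# root mean squared error across all series and time steps
rmse = sqrt(((expected - predicted) ** 2).mean())
print(rmse)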
For an example of LSTM models developed for a multivariate multi-step time series forecasting problem, see Chapter 20.
9.6 Extensions

This section lists some ideas for extending the tutorial that you may wish to explore.

- Problem Differences. Explain the main changes to the LSTM required when modeling each of univariate, multivariate and multi-step time series forecasting problems.

- Experiment. Select one example and modify it to work with your own small contrived dataset.

- Develop Framework. Use the examples in this chapter as the basis for a framework for automatically developing an LSTM model for a given time series forecasting problem.

If you explore any of these extensions, I'd love to know.

9.7 Further Reading

This section provides more resources on the topic if you are looking to go deeper.

9.7.1 Books

- Deep Learning, 2016.
  https://amzn.to/2MQyLVZ

- Deep Learning with Python, 2017.
  https://amzn.to/2vMRiMe

9.7.2 Papers

- Long Short-Term Memory, 1997.
  https://ieeexplore.ieee.org/document/6795963/

- Learning to Forget: Continual Prediction with LSTM, 1999.
  https://ieeexplore.ieee.org/document/818041/

- Recurrent Nets that Time and Count, 2000.
  https://ieeexplore.ieee.org/document/861302/

- LSTM: A Search Space Odyssey, 2017.
  https://arxiv.org/abs/1503.04069

- Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting, 2015.
  https://arxiv.org/abs/1506.04214v1

9.7.3 APIs

- Keras: The Python Deep Learning library.
  https://keras.io/

- Getting started with the Keras Sequential model.
  https://keras.io/getting-started/sequential-model-guide/

- Getting started with the Keras functional API.
  https://keras.io/getting-started/functional-api-guide/

- Keras Sequential Model API.
  https://keras.io/models/sequential/

- Keras Core Layers API.
  https://keras.io/layers/core/

- Keras Recurrent Layers API.
  https://keras.io/layers/recurrent/

9.8 Summary

In this tutorial, you discovered how to develop a suite of LSTM models for a range of standard time series forecasting problems. Specifically, you learned:

- How to develop LSTM models for univariate time series forecasting.

- How to develop LSTM models for multivariate time series forecasting.

- How to develop LSTM models for multi-step time series forecasting.

9.8.1 Next

This is the final lesson of this part; the next part will focus on systematically developing models for univariate time series forecasting problems.
