Deep Learning Methods
◦ How to transform a time series dataset into a two-dimensional supervised learning format
◦ How to transform a two-dimensional time series dataset into a three-dimensional structure suitable for CNNs and
LSTMs
◦ How to step through a worked example of splitting a very long time series into subsequences ready for training a
CNN or LSTM model
◦ The model will learn how to map inputs to outputs from the provided examples
y = f(X)
◦ For a univariate time series problem where we are interested in one-step predictions, the observations at prior time steps, so-called lag observations, are used as input and the output is the observation at the current time step
(10,)
The lengths of the dimensions are together referred to as the shape of the array
(7, 3) (7,)
[1 2 3] 4
[2 3 4] 5
[3 4 5] 6
[4 5 6] 7
[5 6 7] 8
[6 7 8] 9
[7 8 9] 10
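◦ These pairs can be produced by a small helper function; a minimal sketch of the split_sequence() function used throughout this section, consistent with the pairs above:
from numpy import array

# split a univariate sequence into samples with n_steps inputs and one output
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # the sample ends one step before the output observation
        end_ix = i + n_steps
        # stop when the output index runs past the end of the sequence
        if end_ix > len(sequence) - 1:
            break
        X.append(sequence[i:end_ix])
        y.append(sequence[end_ix])
    return array(X), array(y)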
The input array X will have two dimensions (samples and features), while the output array y will have one
◦ Data in this form can be used directly to train a simple neural network, such as a Multilayer Perceptron
◦ The difficulty arises when trying to prepare this data for CNNs and LSTMs, which require the data to have a three-dimensional structure
• One sequence is one sample; a batch is comprised of one or more samples
• One time step is one point of observation in the sample; one sample is comprised of multiple time steps
• One feature is one observation at a time step; one time step is comprised of one or more features
◦ The first layer in the network is actually the first hidden layer
◦ The 32 refers to the number of units in the first hidden layer
◦ The number of units in the first hidden layer is completely unrelated to the number of samples, time steps or
features in the input data
◦ If we have 7 samples and 3 time steps per sample for the input element of our time series
◦ Reshape it into [7, 3, 1] by providing a tuple to the reshape() function specifying the desired new shape of (7, 3, 1)
◦ The array must have enough data to support the new shape, which in this case it does, as [7, 3] and [7, 3, 1] hold the same 21 values
# transform input from [samples, features] to [samples, timesteps, features]
X = X.reshape((7, 3, 1))
In the more general reshape X.reshape((X.shape[0], X.shape[1], 1)), X.shape[1] is the number of columns in the 2D array, which becomes the number of time steps, and the trailing 1 is the number of features
The univariate time series with 10 time steps:
# define univariate time series
series = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(series.shape)
# transform to a supervised learning problem
X, y = split_sequence(series, 3)
print(X.shape, y.shape)
# transform input from [samples, features] to [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
print(X.shape)
The shape of the input (X) and output (y) elements of each sample after being converted into a supervised learning problem
3D Data Preparation Basics
◦ The input element of each sample is reshaped to be three-dimensional suitable for fitting an LSTM or CNN
(10,)
(7, 3) (7,)
(7, 3, 1)
We have 3,650 rows and 2 columns: a standard univariate time series dataset
data = array(data)
print(data[:5, :])
print(data.shape)
◦ If not, you may want to look at imputing the missing values, resampling the data to a new time scale, or developing a model that can handle missing values
# example of dropping the time dimension from the dataset
data = array(dataframe)
# drop the date (the first column)
data = data[:, 1]
print(data.shape)
◦ Running the example prints the shape of the dataset after the time column has been removed
(3650,)
◦ There are many ways to do this, and you may want to explore some depending on your problem
◦ Overlapping sequences may be needed for some problems
◦ Non-overlapping sequences are simpler, but the model then needs to maintain state across the sub-sequences
◦ The data can now be used as an input (X) to an LSTM model, or even a CNN model
(73, 50, 1)
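◦ As a worked sketch, assuming data is the one-dimensional array of 3,650 values from above (variable names are illustrative):
from numpy import array

# split the long univariate series into non-overlapping subsequences of 50 steps
length = 50
samples = list()
for i in range(0, len(data), length):
    samples.append(data[i:i+length])
# stack into [samples, timesteps], then add the trailing features dimension
data = array(samples)
data = data.reshape((len(samples), length, 1))
print(data.shape)  # (73, 50, 1)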
◦ How to develop a suite of MLP models for a range of standard time series forecasting problems
◦ How to develop MLP models for univariate time series forecasting
◦ How to develop MLP models for multivariate time series forecasting
◦ How to develop MLP models for multi-step time series forecasting
X, y
10, 20, 30, 40
20, 30, 40, 50
30, 40, 50, 60
...
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps))
yhat = model.predict(x_input, verbose=0)
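◦ The fit and predict calls above assume prepared data and a compiled model; a minimal end-to-end sketch, assuming the split_sequence() helper from earlier (hyperparameters are illustrative):
from numpy import array
from keras.models import Sequential
from keras.layers import Dense

# univariate series framed as supervised learning: 3 lag inputs, 1 output
raw_seq = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
n_steps = 3
X, y = split_sequence(raw_seq, n_steps)
# MLP whose input layer expects vectors with n_steps features
model = Sequential()
model.add(Dense(100, activation='relu', input_dim=n_steps))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90]).reshape((1, n_steps))
yhat = model.predict(x_input, verbose=0)
print(yhat)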
◦ Two parallel input time series where the output series is the simple addition of the input series
# multivariate data preparation
from numpy import hstack
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
The third column provides y: one output time step for each sample
◦ Transforming the time series into input/output samples to train the model
◦ Discard some values from the output time series where we do not have values in the input time series at prior time steps
◦ The choice of the number of input time steps will have an important effect on how much of the training data is used
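◦ This transform can be implemented with a multivariate variant of the split helper; a minimal sketch of such a split_sequences() function for this multiple-input case, assuming the output series is the last column of the dataset:
from numpy import array

# split a multivariate sequence into samples: inputs are all but the last
# column, the output is the last column at the end of each window
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        end_ix = i + n_steps
        if end_ix > len(sequences):
            break
        X.append(sequences[i:end_ix, :-1])
        y.append(sequences[end_ix-1, -1])
    return array(X), array(y)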
[[10 15]
[20 25]
[30 35]]
◦ The shape of the 1 sample with 3 time steps and 2 variables would be [1, 3, 2]
◦ Reshape this to be 1 sample with a vector of 6 elements or [1, 6]
◦ Expect the next value in the sequence to be 100 + 105 or 205
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_input))
yhat = model.predict(x_input, verbose=0)
[[205.86246]]
Define the first input model as an MLP with an input layer that expects vectors with n_steps features
# first input model
visible1 = Input(shape=(n_steps,))
dense1 = Dense(100, activation='relu')(visible1)
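◦ The second input series gets a submodel of the same structure, and the two are merged before a single output layer; a minimal sketch using the Keras functional API (layer sizes follow the first submodel):
from keras.models import Model
from keras.layers import Input, Dense, concatenate

# second input model, identical in structure to the first
visible2 = Input(shape=(n_steps,))
dense2 = Dense(100, activation='relu')(visible2)
# merge the interpretations of the two input series
merge = concatenate([dense1, dense2])
output = Dense(1)(merge)
# connect the input and output models
model = Model(inputs=[visible1, visible2], outputs=output)
model.compile(optimizer='adam', loss='mse')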
◦ Requires input as a list of two elements, where each element in the list contains data for one of the submodels
The first two columns provide the input to the two submodels; the third column is the output:
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
◦ The model output will be a vector, with one element for each of the three different time series
# determine the number of outputs
n_output = y.shape[1]
◦ Predict the next value in each of the three parallel series by providing an input of three time steps for each series
70, 75, 145
80, 85, 165
90, 95, 185
# flatten input
n_input = X.shape[1] * X.shape[2]
X = X.reshape((X.shape[0], n_input))
n_output = y.shape[1]
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_input))
yhat = model.predict(x_input, verbose=0)
print(yhat)
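◦ The flatten, fit, and predict steps above assume a vector-output MLP; a minimal sketch of the model definition they imply:
from keras.models import Sequential
from keras.layers import Dense

# one hidden layer interprets the flattened input; the output layer
# emits one value per parallel series
model = Sequential()
model.add(Dense(100, activation='relu', input_dim=n_input))
model.add(Dense(n_output))
model.compile(optimizer='adam', loss='mse')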
◦ Define one output layer for each of the three series that wish to forecast, where each output submodel will
forecast a single time step
# define output 1
output1 = Dense(1)(dense)
# define output 2
output2 = Dense(1)(dense)
# define output 3
output3 = Dense(1)(dense)
◦ The schematic shows the three separate output layers of the model and the input and output shapes of each
layer
# flatten input
n_input = X.shape[1] * X.shape[2]
X = X.reshape((X.shape[0], n_input))
# define model
visible = Input(shape=(n_input,))
dense = Dense(100, activation='relu')(visible)
# define output 1
output1 = Dense(1)(dense)
# define output 2
output2 = Dense(1)(dense)
# define output 3
output3 = Dense(1)(dense)
# tie together
model = Model(inputs=visible, outputs=[output1, output2, output3])
model.compile(optimizer='adam', loss='mse')
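# separate the output columns so each output layer receives its own array
# (assumed step, omitted on the slide; fit() below expects y1, y2, y3)
y1 = y[:, 0].reshape((y.shape[0], 1))
y2 = y[:, 1].reshape((y.shape[0], 1))
y3 = y[:, 2].reshape((y.shape[0], 1))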
# fit model
model.fit(X, [y1,y2,y3], epochs=2000, verbose=0)
[array([[101.16882]], dtype=float32),
array([[105.94562]], dtype=float32),
array([[207.73004]], dtype=float32)]
◦ The shape of the single sample of input data must be [1, 3] for the 1 sample and 3 time steps (features) of the input
# define model
model = Sequential()
model.add(Dense(100, activation='relu', input_dim=n_steps_in))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in))
yhat = model.predict(x_input, verbose=0)
print(yhat)
[[102.572365 113.88405 ]]
◦ Data preparation and modeling for multivariate multi-step time series forecasting
1. Multiple Input Multi-step Output
2. Multiple Parallel Input and Multi-step Output
Output for a multi-step forecast for a dependent series (the third column):
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
The output portion of the samples is two-dimensional: six samples, each with two time steps to be predicted. Input (left) and output (right) for a multi-step forecast of a dependent series:
[[10 15]
[20 25]
[30 35]] [65 85]
[[20 25]
[30 35]
[40 45]] [ 85 105]
[[30 35]
[40 45]
[50 55]] [105 125]
[[40 45]
[50 55]
[60 65]] [125 145]
[[50 55]
[60 65]
[70 75]] [145 165]
[[60 65]
[70 75]
[80 85]] [165 185]
# flatten input
n_input = X.shape[1] * X.shape[2]
X = X.reshape((X.shape[0], n_input))
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([[70, 75], [80, 85], [90, 95]])
x_input = x_input.reshape((1, n_input))
yhat = model.predict(x_input, verbose=0)
print(yhat)
[[187.13754 208.63889]]
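◦ The flatten, fit, and predict steps above assume a vector-output MLP; a minimal sketch, where n_steps_out is the number of forecast time steps (2 in this example):
from keras.models import Sequential
from keras.layers import Dense

# the output layer emits one value per forecast time step
model = Sequential()
model.add(Dense(100, activation='relu', input_dim=n_input))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')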
(5, 3, 3) (5, 2, 3)
Both the input (X) and output (y) elements of the dataset are three-dimensional for the number of samples, time steps, and variables or parallel time series. Input (left) and output (right) for multi-step forecasting of a multivariate series:
[[10 15 25]
[20 25 45]
[30 35 65]] [[ 40 45 85]
[ 50 55 105]]
[[20 25 45]
[30 35 65]
[40 45 85]] [[ 50 55 105]
[ 60 65 125]]
[[30 35 65]
[40 45 85]
[50 55 105]] [[ 60 65 125]
[ 70 75 145]]
[[40 45 85]
[50 55 105]
[60 65 125]] [[ 70 75 145]
[ 80 85 165]]
[[50 55 105]
[60 65 125]
[70 75 145]] [[ 80 85 165]
[ 90 95 185]]
# flatten output
n_output = y.shape[1] * y.shape[2]
y = y.reshape((y.shape[0], n_output))
# flatten input
n_input = X.shape[1] * X.shape[2]
X = X.reshape((X.shape[0], n_input))
# define model
model = Sequential()
model.add(Dense(100, activation='relu', input_dim=n_input))
model.add(Dense(n_output))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([[60, 65, 125], [70, 75, 145], [80, 85, 165]])
x_input = x_input.reshape((1, n_input))
yhat = model.predict(x_input, verbose=0)
print(yhat)
◦ Provide standalone examples of each model on each type of time series problem as a template that you can copy and adapt for your specific time series forecasting problem
◦ How to develop CNN models for univariate time series forecasting
◦ How to develop CNN models for multivariate time series forecasting
◦ How to develop CNN models for multi-step time series forecasting
• Data Preparation
• CNN Model
◦ Divide the sequence into multiple input/output patterns called samples for the one-step prediction that is being
learned
Three time steps are used as input and one time step is used as output:
X, y
10, 20, 30, 40
20, 30, 40, 50
30, 40, 50, 60
...
[10 20 30] 40
[20 30 40] 50
[30 40 50] 60
[40 50 60] 70
[50 60 70] 80
[60 70 80] 90
◦ The convolutional and pooling layers are followed by a dense fully connected layer that interprets the features
extracted by the convolutional part of the model
◦ A flatten layer is used between the convolutional layers and the dense layer to reduce the feature maps to a
single one-dimensional vector
◦ The split_sequence() function in the previous section outputs the X with the shape [samples,
timesteps]
◦ Reshape it to have an additional dimension for the one feature
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
◦ The model expects the input shape to be 3D with [samples, timesteps, features]
◦ Reshape the single input sample before making the prediction
# define model
model = Sequential()
model.add(Conv1D(64, 2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
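# fit model (assumed step, omitted on the slide; the epoch count is illustrative)
model.fit(X, y, epochs=1000, verbose=0)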
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
The first three time steps of each parallel series are provided as input to the model, and the model associates this with the value in the output series at the third time step (y: one output time step):
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
# define model
model = Sequential()
model.add(Conv1D(64, 2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
◦ The shape of the one sample with three time steps and two variables must be [1, 3, 2]
◦ Expect the next value in the sequence to be 100 + 105 or 205
# fit model
model.fit(X, y, epochs=1000, verbose=0)
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
◦ This type of model can be defined in Keras using the Keras functional API
◦ Want to predict the value for each of the three time series for the next time step
◦ Referred to as multivariate forecasting
◦ The data must be split into input/output samples in order to train a model
X: Input; y: Output
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
# define model
model = Sequential()
model.add(Conv1D(64, 2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')
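# fit model (assumed step, omitted on the slide; the epoch count is illustrative)
model.fit(X, y, epochs=3000, verbose=0)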
◦ The shape of the input for making a single prediction must be 1 sample, 3 time steps, and 3 features, or [1, 3, 3]
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
◦ Tie the input and output layers together into a single model
# define model
visible = Input(shape=(n_steps, n_features))
cnn = Conv1D(64, 2, activation='relu')(visible)
cnn = MaxPooling1D()(cnn)
cnn = Flatten()(cnn)
cnn = Dense(50, activation='relu')(cnn)
# define output 1
output1 = Dense(1)(cnn)
# define output 2
output2 = Dense(1)(cnn)
# define output 3
output3 = Dense(1)(cnn)
# tie together
model = Model(inputs=visible, outputs=[output1, output2, output3])
model.compile(optimizer='adam', loss='mse')
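# separate the output columns and fit (assumed steps, omitted on the slide;
# each output layer receives its own array)
y1 = y[:, 0].reshape((y.shape[0], 1))
y2 = y[:, 1].reshape((y.shape[0], 1))
y3 = y[:, 2].reshape((y.shape[0], 1))
model.fit(X, [y1, y2, y3], epochs=2000, verbose=0)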
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
◦ There are subtle and important differences in the way the training data is prepared
◦ Demonstrate the case of developing a multi-step forecast model using a vector model
◦ Use the last three time steps as input and forecast the next two time steps
X: Input y: Output
[10 20 30] [40 50]
[20 30 40] [50 60]
[30 40 50] [60 70]
[40 50 60] [70 80]
[50 60 70] [80 90]
# define model
model = Sequential()
model.add(Conv1D(64, 2, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
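# fit model (assumed step, omitted on the slide; the epoch count is illustrative)
model.fit(X, y, epochs=2000, verbose=0)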
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
4.1. Multiple Input and Multi-step Output
4.2. Multiple Parallel Input and Multi-step Output
y: two time steps of the output time series
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
# fit model
model.fit(X, y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([[70, 75], [80, 85], [90, 95]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
[[185.5557 208.05902]]
X: The last three time steps from each of the three time series
y: The next time steps of each of the three time series
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
# flatten output
n_output = y.shape[1] * y.shape[2]
y = y.reshape((y.shape[0], n_output))
# define model
model = Sequential()
model.add(Conv1D(64, 2, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_output))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=7000, verbose=0)
# demonstrate prediction
x_input = array([[60, 65, 125], [70, 75, 145], [80, 85, 165]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
◦ The LSTM architecture aims to provide RNNs with a short-term memory that can last thousands of time steps, hence the name "long short-term memory"
X, y
10, 20, 30, 40
20, 30, 40, 50
30, 40, 50, 60
...
y: One output time step
[10 20 30] 40
[20 30 40] 50
[30 40 50] 60
[40 50 60] 70
[50 60 70] 80
[60 70 80] 90
◦ The model expects the input shape to be 3D with [samples, timesteps, features]
◦ Reshape the single input sample before making the prediction
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
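# fit model (assumed step, omitted on the slide; the epoch count is illustrative)
model.fit(X, y, epochs=200, verbose=0)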
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
◦ Stacking LSTM hidden layers makes the model deeper, more accurately earning the description of a deep learning technique
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(n_steps, n_features)))
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
# define model
model = Sequential()
model.add(Bidirectional(LSTM(50, activation='relu'), input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
◦ The CNN can interpret each subsequence of two time steps and provide a time series of interpretations of the
subsequences to the LSTM model to process as input
◦ Parameterize this and define the number of subsequences as n_seq and the number of time steps per subsequence as n_steps
◦ The input data can then be reshaped to have the required structure: [samples, subsequences, timesteps,
features]
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([60, 70, 80, 90])
x_input = x_input.reshape((1, n_seq, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
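◦ The fit and predict calls above assume a CNN-LSTM in which the CNN submodel is applied to each subsequence via the TimeDistributed wrapper; a minimal sketch with illustrative layer sizes:
from keras.models import Sequential
from keras.layers import LSTM, Dense, Flatten, TimeDistributed, Conv1D, MaxPooling1D

# the CNN reads each [n_steps, n_features] subsequence independently
model = Sequential()
model.add(TimeDistributed(Conv1D(64, 1, activation='relu'),
                          input_shape=(None, n_steps, n_features)))
model.add(TimeDistributed(MaxPooling1D()))
model.add(TimeDistributed(Flatten()))
# the LSTM interprets the sequence of CNN outputs
model.add(LSTM(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')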
# reshape from [samples, timesteps] into [samples, timesteps, rows, columns, features]
n_features = 1
n_seq = 2
n_steps = 2
X = X.reshape((X.shape[0], n_seq, 1, n_steps, n_features))
# fit model
model.fit(X, y, epochs=500, verbose=0)
# demonstrate prediction
x_input = array([60, 70, 80, 90])
x_input = x_input.reshape((1, n_seq, 1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
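◦ The reshape, fit, and predict steps above assume a model built around the ConvLSTM2D layer, which performs the convolutional reading of each one-row, n_steps-column subsequence as part of the LSTM itself; a minimal sketch:
from keras.models import Sequential
from keras.layers import Dense, Flatten, ConvLSTM2D

# kernel (1, 2) spans the single row and two columns of each subsequence
model = Sequential()
model.add(ConvLSTM2D(64, (1, 2), activation='relu',
                     input_shape=(n_seq, 1, n_steps, n_features)))
model.add(Flatten())
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')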
◦ A simple example of two parallel input time series where the output series is the simple addition of the input
series
# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
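◦ Each series is then reshaped into a column and the columns are stacked horizontally to produce the dataset printed below; a minimal sketch:
from numpy import hstack

# convert each series to a column vector
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# columns: input series 1, input series 2, output series
dataset = hstack((in_seq1, in_seq2, out_seq))
print(dataset)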
y: One output time step
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
◦ The shape of the one sample with three time steps and two variables must be [1, 3, 2]
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
y: One output time step
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
◦ The shape of the input for making a single prediction must be 1 sample, 3 time steps, and 3 features, or [1, 3, 3]
# fit model
model.fit(X, y, epochs=400, verbose=0)
# demonstrate prediction
x_input = array([[70,75,145], [80,85,165], [90,95,185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
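# fit model (assumed step, omitted on the slide; the epoch count is illustrative)
model.fit(X, y, epochs=50, verbose=0)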
◦ The shape of the single sample of input data when making the prediction must be [1, 3, 1] for the 1 sample, 3
time steps of the input, and the single feature
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
◦ Use the same output layer or layers to make each one-step prediction in the output sequence
◦ Achieved by wrapping the output part of the model in a TimeDistributed wrapper
# define model output
model.add(TimeDistributed(Dense(1)))
◦ The input data must be reshaped into the expected three-dimensional shape of [samples, timesteps, features]
# reshape input training data
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(100, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
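# the output must also be three-dimensional, [samples, timesteps, features],
# before fitting this model (assumed steps; the epoch count is illustrative)
y = y.reshape((y.shape[0], y.shape[1], n_features))
model.fit(X, y, epochs=100, verbose=0)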
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
y: Two time steps of the output time series
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([[70, 75], [80, 85], [90, 95]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
X: The last three time steps from each of the three time series
y: The next time steps of each of the three time series
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
# fit model
model.fit(X, y, epochs=300, verbose=0)
# demonstrate prediction
x_input = array([[60, 65, 125], [70, 75, 145], [80, 85, 165]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
◦ Output from an Encoder-Decoder LSTM for multi-step forecasting for parallel series
[[[ 90.46589 95.4396 185.87837]
[100.56111 105.4582 206.39198]]]