Temporal Convolutional Network (TCN)
By,
Hira Khan
Supervised by,
Prof. Dr. Nadeem Javaid
Motivation
• Convolutional networks can achieve better performance than RNNs in
many tasks while avoiding common drawbacks of recurrent models,
such as the exploding/vanishing gradient problem and poor long-term
memory retention.
• Using a convolutional network instead of a recurrent one can lead to
performance improvements as it allows for parallel computation of
outputs.
Temporal Convolutional Network (TCN)
• The seminal work of Lea et al. (2016) first proposed Temporal
Convolutional Networks (TCNs) for video-based action segmentation.
• A Temporal Convolutional Network (TCN) is a type of neural network
architecture designed for sequence modeling tasks.
• TCNs leverage the power of convolutional operations to capture and
learn patterns within sequential data.
• It combines simplicity, autoregressive prediction, and very long memory.
• Applications: time series forecasting, natural language processing, and
more.
Working of TCN
• The TCN is designed from two basic principles:
• The architecture can take a sequence of any length and
map it to an output sequence of the same length, just as
with an RNN.
• The convolutions in the architecture are causal,
meaning that there is no information “leakage” from
future to past.
Working of TCN
• Input Sequence:
• The input to a TCN is sequential data.
• It could be a time series, a sentence in natural language,
or any other sequential data.
• For a TCN, the input and output sequences of the model
are of the same length.
Input vector (ri, ti, fi) → TCN → Output vector (ro, to, fo)
Where,
i = input vector
o = output vector
r = rows/batch size
t = timestamp/input length
f = features/input or output size
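As a rough illustration of this shape convention, here is a minimal sketch; the batch size, sequence length, and feature counts are made-up numbers, and the use of PyTorch tensors is an assumption for illustration only.

```python
import torch

# Shape convention from the slide: r = rows/batch size, t = timesteps, f = features.
r, t, f_in = 32, 100, 8           # e.g. 32 series, 100 timesteps, 8 features each
x = torch.randn(r, t, f_in)       # input tensor  (r_i, t_i, f_i)

# A TCN maps this to an output of the SAME length t; only the feature
# dimension may change (e.g. 1 predicted value per timestep).
f_out = 1
y_shape = (r, t, f_out)           # output tensor (r_o, t_o, f_o), with t_o == t_i
print(x.shape, y_shape)
```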
Working of TCN
• TCNs use a 1D fully convolutional network (FCN) to
process the input sequence.
• Standard 1D convolutional layer with a kernel size of 3.
Working of TCN
• 1D convolutional layer:
For the purposes of this forecasting model, the stride is always set to 1.
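A minimal PyTorch sketch of such a layer (channel counts and lengths are illustrative), showing that without padding a kernel of size 3 with stride 1 shortens the sequence:

```python
import torch
import torch.nn as nn

# Plain (unpadded) 1D convolution: kernel_size = 3, stride = 1.
conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, stride=1)

x = torch.randn(1, 1, 10)         # (batch, channels, input_length = 10)
y = conv(x)
print(y.shape)                    # torch.Size([1, 1, 8]) -> 10 - 3 + 1 = 8, shorter than the input
```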
Working of TCN
• 1D convolutional layer with zero padding:
• To ensure the output tensor has the same length as the input tensor, zero padding is applied.
• Example: to get an output_length of 4 from an input_length of 4 with a kernel_size of 3, left zero-padding of 2 (i.e. kernel_size − 1) is applied, as sketched below.
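A small sketch of that padding arithmetic in PyTorch (tensor sizes mirror the 4-step example above; the library choice is an assumption):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

kernel_size = 3
conv = nn.Conv1d(1, 1, kernel_size)          # no built-in padding

x = torch.randn(1, 1, 4)                     # input_length = 4
x_padded = F.pad(x, (kernel_size - 1, 0))    # left zero-padding of 2, nothing on the right
y = conv(x_padded)                           # output_length = 4 + 2 - 3 + 1 = 4
print(y.shape)                               # torch.Size([1, 1, 4]) -> same length as the input
```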
Working of TCN
• Causal convolutional layer:
• Causal convolutional layers left-pad the input sequence with zeros, which preserves the causality of temporal dependencies during the convolution operation.
• The output at time T can only convolve with elements from time T and earlier in the previous layer.
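One way to check the "no leakage" property is to perturb a future input and confirm that earlier outputs do not change; a minimal sketch (the single-channel layer and sequence length are illustrative assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def causal_conv(x, conv, kernel_size):
    """Left-pad by kernel_size - 1, so output[t] depends only on input[<= t]."""
    return conv(F.pad(x, (kernel_size - 1, 0)))

torch.manual_seed(0)
conv = nn.Conv1d(1, 1, kernel_size=3)

x = torch.randn(1, 1, 10)
y1 = causal_conv(x, conv, 3)

x_future = x.clone()
x_future[..., 7:] += 100.0                   # change only timesteps 7..9 ("the future")
y2 = causal_conv(x_future, conv, 3)

# Outputs up to t = 6 are identical: no information leaks from future to past.
print(torch.allclose(y1[..., :7], y2[..., :7]))   # True
```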
Working of TCN
• Causal convolutional layer:
• Drawback:
• For long sequences, covering a large history requires either very large
filters or an extremely deep network (see the rough calculation below).
• Very deep stacks reintroduce the vanishing/exploding gradient problem.
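As a back-of-the-envelope check (the 1000-step history and kernel size of 3 are illustrative numbers, not from the slides), without dilation the number of stacked causal layers needed grows linearly with the history to be covered:

```python
# Receptive field of L stacked (non-dilated) causal convolutions with kernel size k:
#   receptive_field = L * (k - 1) + 1
# So covering `history` past steps needs roughly (history - 1) / (k - 1) layers.
k = 3
history = 1000                                   # illustrative: ~1000 past timesteps
layers_needed = -(-(history - 1) // (k - 1))     # ceiling division
print(layers_needed)                             # 500 layers -- hence the need for dilation
```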
Working of TCN
• Dilated causal convolutional layer:
• Dilated convolutions skip inputs with a spacing (dilation factor) that typically doubles at each layer (1, 2, 4, ...), so the receptive field grows exponentially with depth instead of requiring very large filters or extremely deep stacks.
• Pairs of dilated causal convolutions are wrapped in a residual block:
Y = Activation(x + f(x))
where x is the input, f(x) is the residual function (the stacked dilated convolutions), and Y is the output of the residual block.
• Receptive field = 1 + 2 · (k − 1) · s · (2^D − 1),
where D is the depth of dilated convolutional layers per residual block, k is the kernel size, and s is the number of residual blocks (the factor 2 reflects the two convolutional layers in each block).
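To make the block concrete, here is a minimal PyTorch sketch of a dilated causal residual block; the channel counts, ReLU activation, and four-block stack are illustrative assumptions, not taken from the slides:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCausalConv1d(nn.Module):
    """Causal convolution: left-pad by (kernel_size - 1) * dilation so no future leaks in."""
    def __init__(self, channels, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                       # x: (batch, channels, length)
        return self.conv(F.pad(x, (self.pad, 0)))

class ResidualBlock(nn.Module):
    """Two dilated causal convolutions plus a skip connection: Y = Activation(x + f(x))."""
    def __init__(self, channels, kernel_size, dilation):
        super().__init__()
        self.conv1 = DilatedCausalConv1d(channels, kernel_size, dilation)
        self.conv2 = DilatedCausalConv1d(channels, kernel_size, dilation)

    def forward(self, x):
        fx = self.conv2(F.relu(self.conv1(x)))  # residual function f(x)
        return F.relu(x + fx)                   # Y = Activation(x + f(x))

# Dilation doubles from block to block (1, 2, 4, 8); with kernel_size = 3 this stack
# sees 1 + 2*(3 - 1)*(1 + 2 + 4 + 8) = 61 past timesteps while being only 4 blocks deep.
tcn = nn.Sequential(*[ResidualBlock(8, kernel_size=3, dilation=2 ** d) for d in range(4)])
x = torch.randn(1, 8, 64)                       # (batch, channels, length)
print(tcn(x).shape)                             # torch.Size([1, 8, 64]) -- same length as input
```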
Working of TCN
• Output Layer:
• The final layer of the TCN produces the desired output
dimensions from the learned representations by passing
them through a series of dense layers.
• For regression tasks like time series forecasting, a linear
activation function might be used and the output could
be a single value or a sequence of predicted values.
• For classification tasks, a softmax or sigmoid activation
might be employed, depending on the number of classes and
the nature of the problem; the output might then be, for
example, classification labels in NLP tasks.
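A hedged sketch of the two kinds of output head described above; the layer sizes, the 64-channel TCN output, and the choice to classify from the last timestep are assumptions for illustration:

```python
import torch
import torch.nn as nn

tcn_features = torch.randn(32, 100, 64)      # illustrative TCN output: (batch, timesteps, channels)

# Regression head (e.g. time series forecasting): linear activation, one value per timestep.
regression_head = nn.Linear(64, 1)
forecast = regression_head(tcn_features)      # (32, 100, 1)

# Classification head (e.g. 5 classes): softmax over logits from the last timestep's features.
classification_head = nn.Linear(64, 5)
logits = classification_head(tcn_features[:, -1, :])
probs = torch.softmax(logits, dim=-1)         # (32, 5)

print(forecast.shape, probs.shape)
```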
Strengths and Drawback of TCN
• Strengths:
• TCNs are less computationally expensive, as they do not require the complex gating
mechanisms used in recurrent neural networks (RNNs).
• By using 1D convolutional layers, they can easily learn long-range dependencies, like
RNNs.
• They can take variable-length sequences and map them to output sequences of the
same length through the 1D FCN.
• They provide a large receptive field for learning relevant context that varies
significantly in scale (short- and long-range dependencies in the input sequence).
• Due to the use of residual blocks, they do not suffer from the vanishing gradient
problem that can affect RNNs, and they can capture more complex temporal dependencies.
• TCNs have been shown to be effective for a variety of time series tasks, such as
forecasting, classification, and segmentation.
• TCNs are capable of capturing different levels of abstraction by stacking multiple layers
of dilated convolutions.
Strengths and Drawback of TCN
• Drawbacks:
• For a large receptive field, TCNs can have high memory
requirements, especially when processing long sequences, because they
need to store the entire input sequence in memory.
• TCNs require more parameters to be tuned for a large receptive field, making them
computationally expensive.
• They require a large amount of training data to achieve good performance, which
can be a challenge for some time series tasks.
• TCNs can be difficult to interpret, which makes it hard to understand
how they arrive at their predictions.
References
[1] https://unit8.com/resources/temporal-convolutional-networks-and-forecasting/
[2] https://www.youtube.com/watch?v=TSGZBXILk14
[3] Oord, A. V. D., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., ...
& Kavukcuoglu, K. (2016). Wavenet: A generative model for raw audio. arXiv
preprint arXiv:1609.03499.
[4] Duc, T. N., Minh, C. T., Xuan, T. P., & Kamioka, E. (2020). Convolutional neural
networks for continuous QoE prediction in video streaming services. IEEE
Access, 8, 116268-116278.
[5] https://lukeguerdan.com/blog/2019/intro-to-tcns/
[6] Bai, S., Kolter, J. Z., & Koltun, V. (2018). An empirical evaluation of generic
convolutional and recurrent networks for sequence modeling. arXiv preprint
arXiv:1803.01271.