Temporal Convolutional Network (TCN)

1) Temporal convolutional networks (TCNs) are a type of neural network architecture designed for sequence modeling tasks such as time series forecasting and natural language processing; they leverage convolutional operations to capture patterns within sequential data. 2) TCNs apply causal dilated convolutional layers with residual connections to efficiently capture long-term dependencies without loss of information, which allows them to take variable-length sequences as input and produce output sequences of the same length. 3) The strengths of TCNs include their ability to efficiently capture long-term dependencies, handle variable-length sequences, and avoid issues like vanishing gradients; however, their interpretability is limited and they require large amounts of data and memory for long sequences.


Temporal Convolutional Network (TCN)
By,
Hira Khan
Supervised by,
Prof. Dr. Nadeem Javaid
Motivation
• Convolutional networks can achieve better performance than RNNs on many tasks while avoiding common drawbacks of recurrent models, such as the exploding/vanishing gradient problem and limited memory retention.
• Using a convolutional network instead of a recurrent one can lead to performance improvements, as it allows outputs to be computed in parallel.
Temporal Convolutional Network (TCN)
• The seminal work of Lea et al. (2016) first proposed Temporal Convolutional Networks (TCNs) for video-based action segmentation.
• A Temporal Convolutional Network (TCN) is a type of neural network architecture designed for sequence modeling tasks.
• TCNs leverage the power of convolutional operations to capture and learn patterns within sequential data.
• TCNs combine simplicity, autoregressive prediction, and very long memory.
• Applications: time series forecasting, natural language processing, and more.
Working of TCN
• The TCN is designed from two basic principles:
• The architecture can take a sequence of any length and map it to an output sequence of the same length, just as with an RNN.
• The convolutions in the architecture are causal, meaning that there is no information “leakage” from the future to the past.
Working of TCN
• Input Sequence:
• The input to a TCN is sequential data: a time series, a sentence in natural language, or any other sequence.
• For a TCN, the input and output sequences of the model are of the same length.
• The TCN maps an input tensor of shape (r_i, t_i, f_i) to an output tensor of shape (r_o, t_o, f_o), where:
i = input, o = output,
r = rows / batch size,
t = timestamps / input length,
f = features / input or output size.
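As a quick illustration of this (r, t, f) shape convention, the following minimal Python sketch (assuming PyTorch; the sizes are hypothetical) shows an input tensor and the shape contract a TCN must satisfy:

```python
import torch

# Hypothetical sizes, used only to illustrate the (r, t, f) convention.
r_i, t_i, f_i = 32, 100, 8        # batch size, input length, input features

x = torch.randn(r_i, t_i, f_i)    # input tensor of shape (r_i, t_i, f_i)
print(x.shape)                    # torch.Size([32, 100, 8])

# A TCN maps this to an output of shape (r_o, t_o, f_o), where
# r_o == r_i and t_o == t_i; only the feature dimension f may change.
```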
Working of TCN
• TCNs use a 1D fully convolutional network (FCN) to process the input sequence.
• Standard 1D convolutional layer with a kernel size of 3.
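A minimal sketch of such a standard 1D convolutional layer, assuming PyTorch (note that nn.Conv1d expects the feature/channel dimension before the time dimension, i.e. (batch, features, length)). Without padding, the output is shorter than the input:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 4)              # (batch, features, length=4)
conv = nn.Conv1d(in_channels=8, out_channels=16, kernel_size=3)  # no padding

y = conv(x)
print(y.shape)                        # torch.Size([1, 16, 2]): 4 - 3 + 1 = 2
```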
Working of TCN
• 1D convolutional layer: for the purposes of this forecasting model, the stride is always set to 1.
Working of TCN
• 1D convolutional layer with zero padding: to ensure that the output tensor has the same length as the input tensor, zero padding is applied. For example, to get an output_length of 4 from an input_length of 4 with a kernel_size of 3, left zero-padding of 2 is applied.
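The length arithmetic behind this can be checked with a small helper (a sketch; the formula mirrors the standard 1D convolution output-length formula, with `padding` taken as the total number of zeros added):

```python
def conv1d_output_length(input_length, kernel_size, padding=0, stride=1, dilation=1):
    """Output length of a 1D convolution, with `padding` counted as the
    total number of zeros added (left + right)."""
    return (input_length + padding - dilation * (kernel_size - 1) - 1) // stride + 1

# input_length = 4, kernel_size = 3, left zero-padding of 2 -> output_length = 4
print(conv1d_output_length(4, kernel_size=3, padding=2))   # 4
```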
Working of TCN
• Causal convolutional layer:
The padding strategy employed by causal convolutional layers involves left-padding the input sequence with zeros, which maintains the causality of temporal dependencies during the convolution operation.

The output at time T can only convolve with elements from time T and earlier in the previous layer.
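A minimal causal convolution sketch, assuming PyTorch: the layer left-pads by (kernel_size - 1) * dilation zeros, so each output at time T depends only on inputs at time T and earlier (the class name and sizes here are illustrative, not the slide's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1D convolution whose output at time T only sees inputs at time T and earlier."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation=1):
        super().__init__()
        self.left_pad = (kernel_size - 1) * dilation     # zeros added on the past side
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):                                # x: (batch, channels, length)
        return self.conv(F.pad(x, (self.left_pad, 0)))   # output length == input length

layer = CausalConv1d(1, 1, kernel_size=3)
x = torch.randn(1, 1, 10)
y = layer(x)

# Causality check: changing future inputs must not change earlier outputs.
x2 = x.clone()
x2[..., 7:] += 1.0
y2 = layer(x2)
assert torch.allclose(y[..., :7], y2[..., :7])
```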
Working of TCN
• The TCN is designed from two basic principles:
• The architecture can take a sequence of any length and map it to an output sequence of the same length, just as with an RNN.
• The convolutions in the architecture are causal, meaning that there is no information “leakage” from the future to the past.

TCN = 1st principle + 2nd principle

TCN = Dilated Convolution + Zero Padding + Causal Convolution
Working of TCN
• Causal convolutional layer:

The output at time T can only convolve with elements from time T and earlier in the previous layer.
Working of TCN
• Causal convolutional layer:
Drawback:
• For large sequences, very large filters or an extremely deep network are required.
• This leads back to the vanishing/exploding gradient problem.

The output at time T can only convolve with elements from time T and earlier in the previous layer.
Working of TCN
• Dilated Causal convolutional layer:

For kernel size K and dilation factor D:

Effective history of a layer = (K - 1) × D

Dilated convolution enables an exponentially large receptive field, allowing the model to learn from an extremely large effective history.
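The effective history per layer and the receptive field of a stack of such layers can be tabulated with a short sketch (plain Python; the kernel size and dilation schedule are illustrative):

```python
def effective_history(kernel_size, dilation):
    """Extra past steps a single dilated causal conv layer sees: (K - 1) * D."""
    return (kernel_size - 1) * dilation

def receptive_field(kernel_size, dilations):
    """Past context of stacked dilated causal conv layers:
    the current step plus each layer's effective history."""
    return 1 + sum(effective_history(kernel_size, d) for d in dilations)

K = 3
dilations = [1, 2, 4, 8]                  # dilation doubles at every layer
for d in dilations:
    print(f"dilation {d}: effective history = {effective_history(K, d)}")
print("receptive field of the stack:", receptive_field(K, dilations))   # 31
```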
Convolutional layer: Standard vs. Causal vs. Dilated convolution.
Working of TCN
• Residual Blocks:
• Residual blocks make it possible to build very deep networks without the risk of vanishing/exploding gradients and enable the network to learn more easily.
• TCNs often use residual connections to improve training and model performance.

Y = x + f(x)

where x is the input, f(x) is the residual function, and Y is the output of the residual block.

A residual block contains a branch leading out to a series of transformations whose outputs are added back to the inputs, allowing the layers to learn modifications to the identity mapping rather than the entire transformation.
Working of TCN
• Residual Blocks:
• Residual blocks make it possible to build very deep networks without the risk of vanishing/exploding gradients and enable the network to learn more easily.
• TCNs often use residual connections to improve training and model performance.
• In a TCN, when the input and output widths differ, an additional 1x1 convolution is used so that the element-wise addition receives tensors of the same shape.
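A minimal residual-block sketch along these lines, assuming PyTorch (it follows the Bai et al. [6] style but omits weight normalization and dropout; names and sizes are illustrative): two dilated causal convolutions form f(x), and a 1x1 convolution matches the channel width when the input and output widths differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """TCN-style residual block: Y = activation(x + f(x)),
    where f(x) is two dilated causal convolutions."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation              # causal left-padding
        self.conv1 = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)
        self.conv2 = nn.Conv1d(out_ch, out_ch, kernel_size, dilation=dilation)
        # 1x1 convolution so the element-wise addition receives tensors of the
        # same shape when in_ch != out_ch.
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else None

    def forward(self, x):                                    # x: (batch, in_ch, length)
        out = F.relu(self.conv1(F.pad(x, (self.pad, 0))))    # first causal conv + ReLU
        out = self.conv2(F.pad(out, (self.pad, 0)))          # second causal conv
        res = x if self.downsample is None else self.downsample(x)
        return F.relu(out + res)                             # add back to the input

block = ResidualBlock(8, 16, kernel_size=3, dilation=2)
y = block(torch.randn(4, 8, 50))
print(y.shape)                                               # torch.Size([4, 16, 50])
```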
Working of TCN
• Residual Blocks:
• The receptive field is what allows the TCN network to be stabilized.
• Figure: the output of two residual blocks stacked over each other in a TCN, where D is the depth (number of convolutional layers) per block, k is the kernel size, and s is the number of residual blocks.
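The slide's formula is not reproduced here, but under the common assumptions that each residual block stacks D causal convolutions of kernel size k and that the dilation doubles from one block to the next (1, 2, 4, ...), the receptive field of s stacked blocks works out to 1 + D·(k - 1)·(2^s - 1). The short Python sketch below simply evaluates that expression with illustrative numbers:

```python
def tcn_receptive_field(k, D, s):
    """Receptive field of s stacked residual blocks, assuming each block holds
    D causal convolutions of kernel size k and the dilation doubles per block;
    a conv layer at dilation d contributes (k - 1) * d past steps."""
    return 1 + D * (k - 1) * sum(2 ** i for i in range(s))

# Illustrative numbers: kernel size 3, D = 2 conv layers per block, s = 4 blocks.
print(tcn_receptive_field(k=3, D=2, s=4))    # 1 + 2 * 2 * (1 + 2 + 4 + 8) = 61
```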
Working of TCN
• Output Layer:
• The final layer of the TCN produces the desired output dimensions from the learned representations by passing them through a series of dense layers.
• For regression tasks such as time series forecasting, a linear activation function is typically used, and the output can be a single value or a sequence of predicted values.
• For classification tasks, a softmax or sigmoid activation may be employed, depending on the number of classes and the nature of the problem; the output may then take a different form, such as classification labels in NLP tasks.
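Putting the pieces together, here is a hedged end-to-end sketch, assuming PyTorch and reusing the ResidualBlock class from the residual-block sketch above (layer sizes and block count are illustrative): stacked residual blocks with doubling dilation, followed by a per-time-step linear output layer as used for forecasting. For classification, the head would instead feed a softmax or sigmoid.

```python
import torch
import torch.nn as nn

class TinyTCN(nn.Module):
    """Minimal TCN sketch: stacked residual blocks (dilation doubling per block)
    followed by a per-time-step linear output layer for regression."""
    def __init__(self, in_features, hidden, out_features, kernel_size=3, n_blocks=4):
        super().__init__()
        blocks = [ResidualBlock(in_features if i == 0 else hidden,
                                hidden, kernel_size, dilation=2 ** i)
                  for i in range(n_blocks)]
        self.blocks = nn.Sequential(*blocks)
        self.head = nn.Linear(hidden, out_features)    # linear activation for regression

    def forward(self, x):                    # x: (batch, time, features)
        h = self.blocks(x.transpose(1, 2))   # conv layers expect (batch, features, time)
        return self.head(h.transpose(1, 2))  # (batch, time, out_features)

model = TinyTCN(in_features=8, hidden=16, out_features=1)
y = model(torch.randn(32, 100, 8))
print(y.shape)                               # torch.Size([32, 100, 1])
```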
Strengths and Drawback of TCN
• Strengths:
• TCNs are less computationally expensive than recurrent neural networks (RNNs) because they do not require complex gating mechanisms.
• By using 1D convolutional layers, they can easily learn long-range dependencies, like RNNs.
• They can take variable-length sequences and map them to output sequences of the same length through a 1D FCN.
• They provide a large receptive field for learning relevant context that varies significantly in scale (short- and long-range dependencies in the input sequence).
• Thanks to residual blocks, they do not suffer from the vanishing gradient problem that can affect RNNs, and they can capture more complex temporal dependencies.
• TCNs have been shown to be effective for a variety of time series tasks, such as forecasting, classification, and segmentation.
• TCNs can capture different levels of abstraction by stacking multiple layers of dilated convolutions.
Strengths and Drawback of TCN
• Drawbacks:
• For a large receptive field, TCNs can have high memory requirements, especially when processing long sequences, because they need to store the entire input sequence in memory.
• TCNs require more parameters to be tuned for a large receptive field, making them computationally expensive.
• They require a large amount of training data to achieve good performance, which can be a challenge for some time series tasks.
• TCNs can be difficult to interpret, which makes it hard to understand how they make predictions.
References
[1] https://unit8.com/resources/temporal-convolutional-networks-and-forecasting/
[2] https://www.youtube.com/watch?v=TSGZBXILk14
[3] Oord, A. V. D., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., ... & Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.
[4] Duc, T. N., Minh, C. T., Xuan, T. P., & Kamioka, E. (2020). Convolutional neural networks for continuous QoE prediction in video streaming services. IEEE Access, 8, 116268-116278.
[5] https://lukeguerdan.com/blog/2019/intro-to-tcns/
[6] Bai, S., Kolter, J. Z., & Koltun, V. (2018). An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271.
