
Encoder-Decoder Sequence-to-Sequence Architecture

Mr. Sivadasan E T
Associate Professor
Vidya Academy of Science and Technology, Thrissur
Encoder-Decoder Sequence-to-Sequence Architecture

The Encoder-Decoder Sequence-to-Sequence (Seq2Seq) architecture is a machine learning architecture designed for tasks involving sequential data.

It takes an input sequence, processes it, and generates an output sequence.
Encoder-Decoder Sequence-to-Sequence Architecture

The architecture consists of two fundamental components: an encoder and a decoder.

The encoder processes the input sequence and transforms it into a fixed-size hidden representation.

The decoder uses this hidden representation to generate the output sequence.
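The slides do not give code, but a minimal sketch of this two-part structure might look as follows, assuming a PyTorch GRU implementation; the module names, layer sizes, and vocabulary handling are illustrative choices, not taken from the slides:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Reads the input sequence and returns a fixed-size hidden representation."""
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src_ids):                  # src_ids: (batch, src_len)
        _, hidden = self.rnn(self.embed(src_ids))
        return hidden                            # (1, batch, hidden_dim): the context

class Decoder(nn.Module):
    """Generates the output sequence one token at a time from the context."""
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, prev_ids, hidden):         # prev_ids: (batch, 1)
        output, hidden = self.rnn(self.embed(prev_ids), hidden)
        return self.out(output), hidden          # logits over the output vocabulary
```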
Encoder-Decoder Sequence-to-Sequence Architecture

The encoder-decoder structure allows the model to handle input and output sequences of different lengths, which is what makes it well suited to sequential data.

The model is trained to maximize the likelihood of the correct output sequence given the input sequence.
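Written as a formula, the objective the slide describes is the log-likelihood of the target sequence given the input, maximized over the training pairs; the symbols below (θ for the model parameters, the sum over training pairs (X, Y)) are standard notation rather than notation from the slides:

```latex
\theta^{\ast} \;=\; \arg\max_{\theta} \sum_{(X,\,Y)} \log P\!\left(Y \mid X;\, \theta\right)
```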
Encoder-Decoder Sequence-to-Sequence Architecture

Commonly used in tasks such as natural language processing (NLP), speech recognition, machine translation, and question answering, where the input and output sequences in the training set are generally not of the same length (although their lengths might be related).
Encoder-Decoder Sequence-to-Sequence Architecture

Imagine we have an input sentence:
👉 "The sky is"

The correct output word (what the model should predict) is:
👉 "blue"
Encoder-Decoder Sequence-to-Sequence Architecture

Encoder Block

The main purpose of the encoder block is to process the input sequence and capture its information in a fixed-size context vector.
Encoder

1. The input sequence is fed into the encoder.

2. The encoder processes each element of the input sequence using neural networks (or a transformer architecture).

3. Throughout this process, the encoder maintains an internal state, and the final hidden state serves as the context vector, encapsulating a compressed representation of the entire input sequence.

4. This context vector captures the semantic meaning and important information of the input sequence, as illustrated in the sketch below.
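Read at the level of single time steps, the loop described above might be sketched as follows; this restates the Encoder from the earlier sketch with an explicit step-by-step GRU cell, and all names and sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

emb_dim, hidden_dim, vocab_size = 64, 128, 1000    # illustrative sizes
embed = nn.Embedding(vocab_size, emb_dim)
cell = nn.GRUCell(emb_dim, hidden_dim)

def encode(src_ids):
    """Process input token ids one step at a time; the final hidden state is the context vector."""
    hidden = torch.zeros(1, hidden_dim)             # initial internal state
    for token_id in src_ids:                        # step 2: process each element in turn
        x = embed(torch.tensor([token_id]))
        hidden = cell(x, hidden)                    # step 3: internal state is updated
    return hidden                                   # steps 3-4: context vector for the decoder
```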
Decoder Block

The decoder block is similar to the encoder block.

The decoder processes the context vector from the encoder to generate the output sequence incrementally.
Decoder Architecture

In the training phase, the decoder receives both the context vector and the desired target output sequence (the ground truth), a strategy commonly called teacher forcing.

During inference, the decoder relies on its own previously generated outputs as inputs for subsequent steps, as sketched below.
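A minimal sketch of these two regimes, reusing the Decoder module from the earlier sketch; the greedy argmax decoding, the special-token ids, and the function names are assumptions for illustration, not taken from the slides:

```python
import torch
import torch.nn.functional as F

SOS, EOS = 0, 1   # illustrative special-token ids

def train_step(decoder, context, tgt_ids):
    """Teacher forcing: feed the ground-truth previous token at every step."""
    hidden, loss = context, 0.0
    for t in range(len(tgt_ids) - 1):
        prev = torch.tensor([[tgt_ids[t]]])                # ground-truth previous token
        logits, hidden = decoder(prev, hidden)
        loss = loss + F.cross_entropy(logits[:, 0], torch.tensor([tgt_ids[t + 1]]))
    return loss

def generate(decoder, context, max_len=20):
    """Inference: feed back the model's own previous prediction."""
    hidden, prev, out = context, torch.tensor([[SOS]]), []
    for _ in range(max_len):
        logits, hidden = decoder(prev, hidden)
        next_id = logits[:, 0].argmax(dim=-1).item()       # greedy choice of next token
        if next_id == EOS:
            break
        out.append(next_id)
        prev = torch.tensor([[next_id]])
    return out
```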
Encoder-Decoder Sequence-to-Sequence Architecture

We often call the input to the RNN the "context." We want to produce a representation of this context, C.

The context C might be a vector or a sequence of vectors that summarizes the input sequence X = (x(1), ..., x(n_x)).
Encoder-Decoder Sequence-to-Sequence Architecture

The idea is very simple:

1. An encoder or reader or input RNN processes the input sequence. The encoder emits the context C, usually as a simple function of its final hidden state.
Encoder-Decoder Sequence-to-Sequence Architecture

2. A decoder or writer or output RNN is conditioned on that fixed-length vector to generate the output sequence Y = (y(1), ..., y(n_y)) (or to compute the probability of a given output sequence).
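In this notation, a common way to write what the slides describe is: the context C is a function of the encoder's final hidden state, and the decoder factorizes the probability of Y one token at a time, conditioned on C (the notation below follows standard RNN seq2seq write-ups and is an assumption, not given in the slides):

```latex
C = q\!\left(h^{(n_x)}\right), \qquad
P(Y \mid X) \;=\; \prod_{t=1}^{n_y} P\!\left(y^{(t)} \,\middle|\, y^{(1)}, \ldots, y^{(t-1)},\, C\right)
```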
Thank You!
