Module 3 Part 2: Encoder

The document provides an overview of the encoder-decoder architecture used in recurrent neural networks for machine translation, detailing the roles of the encoder and decoder in processing input sequences. It includes a forward-pass example demonstrating how the encoder converts the input into a latent vector and how the decoder generates the output sequence. Applications of this architecture include machine translation, speech recognition, text summarization, chatbots, and image captioning.

Contents

Introduction
Encoder and decoder architectures
Example: forward pass
Applications

Reference:
Section 10.4 of Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, MIT Press, 2016.
Encoder-Decoder Architecture: Overview (video): https://www.youtube.com/watch?v=671xny8iows

Introduction
The encoder-decoder architecture for recurrent neural networks is the standard neural machine translation method.
There are three main blocks in the encoder-decoder model:
1. Encoder: converts the input sequence into a single fixed-length vector (the hidden/latent vector).
2. Hidden/latent vector: the intermediate representation passed from the encoder to the decoder.
3. Decoder: converts the hidden/latent vector into the output sequence.
Encoder-decoder models are jointly trained to maximize the conditional probability of the target sequence given the input sequence.
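In symbols, training maximizes log P(y1, …, yT′ | x1, …, xT) over the training pairs, and this conditional probability factorizes as a sum over output positions of log P(yt | y1, …, yt−1, x1, …, xT): the encoder summarizes the conditioning on the input sequence, and the decoder models each output term.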
RNN vs RNN-based Encoder-Decoder
Simple RNN:
• Consists of a single RNN layer that maps inputs x_t directly to outputs y_t at each time step, which limits its ability to learn global dependencies over long sequences.
Encoder-Decoder RNN:
• Composed of two RNNs: the encoder and the decoder.
• Encoder: processes the entire input sequence into a fixed-length representation (the final hidden/latent state), capturing global context in a latent representation.
• Decoder: uses the encoder's latent representation to generate the output sequence.
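A minimal Python sketch of the contrast, using scalar hidden states and the toy weights from the worked example later in these notes (the function names, and giving the plain RNN the decoder's output weights, are illustrative assumptions rather than anything specified in the slides):

import math

def simple_rnn(xs, Wx=0.8, Wh=0.6, Wy=1.2, bx=0.1, by=0.05):
    # Plain RNN: a single layer maps each input x_t to an output y_t as it goes;
    # there is no separate "read the whole input first" phase.
    h, ys = 0.0, []
    for x in xs:
        h = math.tanh(Wx * x + Wh * h + bx)
        ys.append(Wy * h + by)
    return ys

def encode(xs, Wx=0.8, Wh=0.6, bx=0.1):
    # Encoder RNN: consume the whole input sequence and return only the
    # final hidden state, i.e. the fixed-length latent/context vector.
    h = 0.0
    for x in xs:
        h = math.tanh(Wx * x + Wh * h + bx)
    return h

def decode(context, steps, Wh=0.6, Wy=1.2, bh=0.1, by=0.05):
    # Decoder RNN: start from the encoder's context vector and unroll for the
    # desired number of output steps, emitting one output per step.
    s, ys = context, []
    for _ in range(steps):
        s = math.tanh(Wh * s + bh)
        ys.append(Wy * s + by)
    return ys

print(simple_rnn([0.5, 0.6, 0.7]))         # outputs produced step by step, no global context
print(decode(encode([0.5, 0.6, 0.7]), 3))  # outputs produced from a single context vector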
Example (same setup as for the RNN)
Inputs: x = [0.5, 0.6, 0.7]
Targets: y = [0.6, 0.7, 0.8]
Initial hidden state: h0 = 0
Weights: Wx = 0.8, Wh = 0.6
Biases: bx = 0.1

1. Encoder Forward Pass
Each hidden state is computed as ht = tanh(Wx⋅xt + Wh⋅ht−1 + bx).
• Time step t=1: h1 = tanh(0.8⋅0.5 + 0.6⋅0 + 0.1) = tanh(0.5) ≈ 0.462
• Time step t=2: h2 = tanh(0.8⋅0.6 + 0.6⋅0.462 + 0.1) = tanh(0.857) ≈ 0.695
• Time step t=3: h3 = tanh(0.8⋅0.7 + 0.6⋅0.695 + 0.1) = tanh(1.077) ≈ 0.792
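These values can be reproduced with a few lines of Python (a minimal check using only the weights above):

import math

Wx, Wh, bx = 0.8, 0.6, 0.1
h = 0.0  # initial hidden state h0
for t, x in enumerate([0.5, 0.6, 0.7], start=1):
    h = math.tanh(Wx * x + Wh * h + bx)
    print(f"h{t} = {h:.3f}")   # h1 = 0.462, h2 = 0.695, h3 = 0.792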

2. Decoder Forward Pass
The decoder generates the target sequence using the hidden state from the encoder: the decoder's initial hidden state is the encoder's final hidden state, s0 = h3 ≈ 0.792.
For each time step t, the decoder updates its hidden state and produces an output:
st = tanh(Wh⋅st−1 + bh)
yt = Wy⋅st + by
Weights: Wh = 0.6, Wy = 1.2
Biases: bh = 0.1, by = 0.05
Time step t=1:
s1 = tanh(0.6⋅0.792 + 0.1) = tanh(0.575) ≈ 0.519; y1 = 1.2⋅0.519 + 0.05 ≈ 0.673
Time step t=2:
s2 = tanh(0.6⋅0.519 + 0.1) = tanh(0.411) ≈ 0.390; y2 = 1.2⋅0.390 + 0.05 ≈ 0.518
Time step t=3:
s3 = tanh(0.6⋅0.390 + 0.1) = tanh(0.334) ≈ 0.322; y3 = 1.2⋅0.322 + 0.05 ≈ 0.436
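The decoder steps can be checked in the same way (a minimal sketch; as above, the decoder state here is updated from its previous state only, since the example gives no decoder input weight):

import math

Wh, Wy, bh, by = 0.6, 1.2, 0.1, 0.05
s = 0.792  # decoder starts from the encoder's final hidden state h3
for t in range(1, 4):
    s = math.tanh(Wh * s + bh)   # hidden-state update
    y = Wy * s + by              # output for this time step
    print(f"s{t} = {s:.3f}, y{t} = {y:.3f}")
# s1 = 0.519, y1 = 0.673; s2 = 0.390, y2 = 0.518; s3 = 0.322, y3 = 0.436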

Encoder-decoder or sequence-to-sequence RNN
Applications
• Machine Translation: translating "Bonjour" (French) to "Hello" (English).
• Speech Recognition: converting spoken words in an audio file to text.
• Text Summarization: summarizing a 1,000-word article into a 100-word abstract.
• Chatbots and Conversational Agents: responding to "What's the weather today?" with "It's sunny and 25°C."
• Image Captioning: generating the caption "A dog playing with a ball in the park" for a given image.