Module 3 Part 2: Encoder

The document provides an overview of the encoder-decoder architecture used in recurrent neural networks for machine translation, detailing the roles of the encoder and decoder in processing input sequences. It includes a forward-pass example demonstrating how the encoder converts the input into a latent vector and how the decoder generates the output sequence. Applications of this architecture include machine translation, speech recognition, text summarization, chatbots, and image captioning.

Contents

Introduction
Encoder and decoder architectures
Example: forward pass
Applications

Reference:
Section 10.4 of Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, MIT Press, 2016.
Encoder-Decoder Architecture: Overview (video): https://www.youtube.com/watch?v=671xny8iows

Introduction
The encoder-decoder architecture for recurrent neural networks is the standard neural machine translation method.
There are three main blocks in the encoder-decoder model:
1. Encoder: converts the input sequence into a single fixed-length vector (the hidden/latent vector).
2. Hidden/latent vector: the intermediate representation passed from the encoder to the decoder.
3. Decoder: converts the hidden/latent vector into the output sequence.
Encoder-decoder models are jointly trained to maximize the conditional probability of the target sequence given the input sequence.
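In symbols, training maximizes log P(y1, …, yT′ | x1, …, xT) over the training pairs, and this conditional probability factorizes as a sum over output positions of log P(yt | y1, …, yt−1, x1, …, xT): the encoder summarizes the conditioning on the input sequence, and the decoder models each output term.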
RNN vs RNN-based Encoder-Decoder
Simple RNN:
• Consists of a single RNN layer that maps inputs x_t directly to outputs y_t at each time step, which limits its ability to learn global dependencies over long sequences.
Encoder-Decoder RNN:
• Composed of two RNNs: the encoder and the decoder.
• Encoder: processes the entire input sequence into a fixed-length representation (the final hidden/latent state), capturing global context in a latent representation.
• Decoder: uses the encoder's latent representation to generate the output sequence.
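A minimal Python sketch of the contrast, using scalar hidden states and the toy weights from the worked example later in these notes (the function names, and giving the plain RNN the decoder's output weights, are illustrative assumptions rather than anything specified in the slides):

import math

def simple_rnn(xs, Wx=0.8, Wh=0.6, Wy=1.2, bx=0.1, by=0.05):
    # Plain RNN: a single layer maps each input x_t to an output y_t as it goes;
    # there is no separate "read the whole input first" phase.
    h, ys = 0.0, []
    for x in xs:
        h = math.tanh(Wx * x + Wh * h + bx)
        ys.append(Wy * h + by)
    return ys

def encode(xs, Wx=0.8, Wh=0.6, bx=0.1):
    # Encoder RNN: consume the whole input sequence and return only the
    # final hidden state, i.e. the fixed-length latent/context vector.
    h = 0.0
    for x in xs:
        h = math.tanh(Wx * x + Wh * h + bx)
    return h

def decode(context, steps, Wh=0.6, Wy=1.2, bh=0.1, by=0.05):
    # Decoder RNN: start from the encoder's context vector and unroll for the
    # desired number of output steps, emitting one output per step.
    s, ys = context, []
    for _ in range(steps):
        s = math.tanh(Wh * s + bh)
        ys.append(Wy * s + by)
    return ys

print(simple_rnn([0.5, 0.6, 0.7]))         # outputs produced step by step, no global context
print(decode(encode([0.5, 0.6, 0.7]), 3))  # outputs produced from a single context vector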
Example (same setup as for the RNN)
Inputs: x = [0.5, 0.6, 0.7]
Targets: y = [0.6, 0.7, 0.8]
Initial hidden state: h0 = 0
Weights: Wx = 0.8, Wh = 0.6
Biases: bx = 0.1

1. Encoder Forward Pass
Each hidden state is computed as ht = tanh(Wx⋅xt + Wh⋅ht−1 + bx).
• Time step t=1: h1 = tanh(0.8⋅0.5 + 0.6⋅0 + 0.1) = tanh(0.5) ≈ 0.462
• Time step t=2: h2 = tanh(0.8⋅0.6 + 0.6⋅0.462 + 0.1) = tanh(0.857) ≈ 0.695
• Time step t=3: h3 = tanh(0.8⋅0.7 + 0.6⋅0.695 + 0.1) = tanh(1.077) ≈ 0.792
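These values can be reproduced with a few lines of Python (a minimal check using only the weights above):

import math

Wx, Wh, bx = 0.8, 0.6, 0.1
h = 0.0  # initial hidden state h0
for t, x in enumerate([0.5, 0.6, 0.7], start=1):
    h = math.tanh(Wx * x + Wh * h + bx)
    print(f"h{t} = {h:.3f}")   # h1 = 0.462, h2 = 0.695, h3 = 0.792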

2. Decoder Forward Pass
The decoder generates the target sequence using the hidden state from the encoder: the decoder's initial hidden state is the encoder's final hidden state, s0 = h3 ≈ 0.792.
For each time step t, the decoder updates its hidden state and produces an output:
st = tanh(Wh⋅st−1 + bh)
yt = Wy⋅st + by
Weights: Wh = 0.6, Wy = 1.2
Biases: bh = 0.1, by = 0.05
Time step t=1:
s1 = tanh(0.6⋅0.792 + 0.1) = tanh(0.575) ≈ 0.519; y1 = 1.2⋅0.519 + 0.05 ≈ 0.673
Time step t=2:
s2 = tanh(0.6⋅0.519 + 0.1) = tanh(0.411) ≈ 0.390; y2 = 1.2⋅0.390 + 0.05 ≈ 0.518
Time step t=3:
s3 = tanh(0.6⋅0.390 + 0.1) = tanh(0.334) ≈ 0.322; y3 = 1.2⋅0.322 + 0.05 ≈ 0.436
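The decoder steps can be checked in the same way (a minimal sketch; as above, the decoder state here is updated from its previous state only, since the example gives no decoder input weight):

import math

Wh, Wy, bh, by = 0.6, 1.2, 0.1, 0.05
s = 0.792  # decoder starts from the encoder's final hidden state h3
for t in range(1, 4):
    s = math.tanh(Wh * s + bh)   # hidden-state update
    y = Wy * s + by              # output for this time step
    print(f"s{t} = {s:.3f}, y{t} = {y:.3f}")
# s1 = 0.519, y1 = 0.673; s2 = 0.390, y2 = 0.518; s3 = 0.322, y3 = 0.436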

Encoder-decoder or sequence-to-sequence RNN
Applications
• Machine Translation: translating "Bonjour" (French) to "Hello" (English).
• Speech Recognition: converting spoken words in an audio file to text.
• Text Summarization: summarizing a 1,000-word article into a 100-word abstract.
• Chatbots and Conversational Agents: responding to "What's the weather today?" with "It's sunny and 25°C."
• Image Captioning: generating the caption "A dog playing with a ball in the park" for a given image.