
Copyright Notice

These slides are distributed under the Creative Commons License.

DeepLearning.AI makes these slides available for educational purposes. You may not use or distribute
these slides for commercial purposes. You may make copies of these slides and use or distribute them for
educational purposes as long as you cite DeepLearning.AI as the source of the slides.

For the rest of the details of the license, see https://creativecommons.org/licenses/by-sa/2.0/legalcode


Recurrent Neural Networks

deeplearning.ai

Why sequence models?
Examples of sequence data

Speech recognition: audio clip → “The quick brown fox jumped over the lazy dog.”
Music generation: ∅ → music
Sentiment classification: “There is nothing to like in this movie.” → star rating
DNA sequence analysis: AGCCCCTGTGAGGAACTAG → labeled subsequence of AGCCCCTGTGAGGAACTAG
Machine translation: “Voulez-vous chanter avec moi?” → “Do you want to sing with me?”
Video activity recognition: video frames → “Running”
Named entity recognition: “Yesterday, Harry Potter met Hermione Granger.” → the same sentence with the names tagged

Andrew Ng
Recurrent Neural Networks

Notation
Motivating example

x: Harry Potter and Hermione Granger invented a new spell.
   x<1>, x<2>, x<3>, …, x<9>  (Tx = 9)

y: 1 1 0 1 1 0 0 0 0  (1 where the word is part of a person’s name)

Representing words

Each word is represented by its index in a dictionary, e.g.:

And = 367
Invented = 4700
A = 1
New = 5976
Spell = 8376
Harry = 4075
Potter = 6830
Hermione = 4200
Gran… = 4000

x<t> is then the one-hot vector for word t: all zeros except a 1 at that word’s index.
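The one-hot representation above is straightforward to build. A minimal numpy sketch, reusing the slide’s dictionary indices (Harry = 4075, Potter = 6830, And = 367) and assuming a 10,000-word vocabulary:

```python
import numpy as np

def one_hot(index, vocab_size):
    """One-hot column vector: all zeros except a 1 at the word's index."""
    v = np.zeros((vocab_size, 1))
    v[index] = 1.0
    return v

# Indices taken from the slide's dictionary; vocabulary size assumed 10,000.
vocab_size = 10000
word_to_index = {"harry": 4075, "potter": 6830, "and": 367}

x1 = one_hot(word_to_index["harry"], vocab_size)
print(x1.shape)       # (10000, 1)
print(x1[4075, 0])    # 1.0
```

Each x<t> in the sequence is produced the same way, one vector per word.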
Recurrent Neural Networks

Recurrent Neural Network Model
Why not a standard network?

[Diagram: a standard fully connected network taking x<1>, …, x<Tx> as inputs and producing y<1>, …, y<Ty> as outputs.]

Problems:
- Inputs and outputs can be different lengths in different examples.
- It doesn’t share features learned across different positions of text.
Recurrent Neural Networks

He said, “Teddy Roosevelt was a great President.”


He said, “Teddy bears are on sale!”
Forward Propagation

[Diagram: the RNN unrolled over time. Starting from a<0> = 0, each step takes x<t> and a<t-1> and produces a<t> and ŷ<t>.]

a<t> = g(Waa a<t-1> + Wax x<t> + ba)
ŷ<t> = g(Wya a<t> + by)
Simplified RNN notation

a<t> = g(Waa a<t-1> + Wax x<t> + ba)
ŷ<t> = g(Wya a<t> + by)

Stacking Waa and Wax side by side into a single matrix Wa = [Waa ; Wax] gives the simplified form:

a<t> = g(Wa [a<t-1>, x<t>] + ba)
ŷ<t> = g(Wy a<t> + by)
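The simplified notation maps directly to code. A minimal numpy sketch of one forward step; the layer sizes and random initialization below are illustrative, not from the slides:

```python
import numpy as np

def rnn_step(a_prev, x_t, Wa, ba, Wy, by):
    """a<t> = tanh(Wa [a<t-1>, x<t>] + ba); yhat<t> = softmax(Wy a<t> + by)."""
    concat = np.vstack([a_prev, x_t])   # [a<t-1>, x<t>] stacked into one vector
    a_t = np.tanh(Wa @ concat + ba)
    z = Wy @ a_t + by
    e = np.exp(z - z.max())             # numerically stable softmax
    return a_t, e / e.sum()

rng = np.random.default_rng(0)
n_a, n_x, n_y = 5, 3, 4                 # hidden, input, output sizes (made up)
Wa = rng.standard_normal((n_a, n_a + n_x)) * 0.1
ba = np.zeros((n_a, 1))
Wy = rng.standard_normal((n_y, n_a)) * 0.1
by = np.zeros((n_y, 1))

a0 = np.zeros((n_a, 1))                 # a<0> = zero vector
a1, y1 = rnn_step(a0, rng.standard_normal((n_x, 1)), Wa, ba, Wy, by)
print(a1.shape, y1.shape)               # (5, 1) (4, 1)
```

Calling `rnn_step` in a loop over t, feeding each a<t> back in, is exactly the unrolled forward propagation of the previous slide.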
Recurrent Neural Networks

Backpropagation through time
Forward propagation and backpropagation

[Diagram: forward propagation runs left to right through a<1>, …, a<Tx>, producing ŷ<1>, …, ŷ<Ty>; backpropagation flows right to left through the same graph.]

ℒ<t>(ŷ<t>, y<t>) = −y<t> log ŷ<t> − (1 − y<t>) log(1 − ŷ<t>)

ℒ(ŷ, y) = Σt ℒ<t>(ŷ<t>, y<t>)

Backpropagation through time: the gradient of ℒ is passed backwards through every time step, from t = Ty down to t = 1, which is what gives the procedure its name.
Recurrent Neural Networks

Different types of RNNs
Examples of RNN architectures

[Slide diagrams of the different input/output configurations, summarized in the next slide.]
Summary of RNN types

[Diagrams of the five configurations:]
- One to one: a standard network; a single input x, a single output ŷ.
- One to many: one input x<1>, a sequence of outputs ŷ<1>, …, ŷ<Ty> (each output fed back in as the next input).
- Many to one: inputs x<1>, …, x<Tx>, a single output ŷ.
- Many to many (Tx = Ty): one output per input position.
- Many to many (Tx ≠ Ty): an encoder reads x<1>, …, x<Tx>, then a decoder emits ŷ<1>, …, ŷ<Ty>.
Recurrent Neural Networks

Language model and sequence generation
What is language modelling?

Speech recognition:
“The apple and pair salad.”
“The apple and pear salad.”

P(The apple and pair salad) =
P(The apple and pear salad) =

A language model assigns a probability to each sentence, letting the recognizer prefer the far more likely second transcription.
Language modelling with an RNN

Training set: a large corpus of English text.

Cats average 15 hours of sleep a day. <EOS>

The Egyptian Mau is a breed of cat. <EOS>
RNN model

[Diagram: at each step the RNN takes the previous word as input (x<1> = 0 to start) and outputs a softmax distribution over the next word.]

Cats average 15 hours of sleep a day. <EOS>

ℒ<t>(ŷ<t>, y<t>) = −Σi yi<t> log ŷi<t>

ℒ = Σt ℒ<t>(ŷ<t>, y<t>)
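The two equations above sum the per-step cross-entropies over the sequence. A small numpy sketch; the probability values below are made-up illustrations:

```python
import numpy as np

def sequence_loss(y_hats, ys):
    """L = sum over t of ( -sum over i of y_i<t> * log yhat_i<t> )."""
    return sum(-np.sum(y * np.log(y_hat)) for y_hat, y in zip(y_hats, ys))

# Two time steps over a 3-word toy vocabulary; y<t> is one-hot for the true word.
y_hats = [np.array([[0.7], [0.2], [0.1]]),
          np.array([[0.1], [0.8], [0.1]])]
ys = [np.array([[1.0], [0.0], [0.0]]),
      np.array([[0.0], [1.0], [0.0]])]

loss = sequence_loss(y_hats, ys)
print(round(float(loss), 4))   # -log 0.7 - log 0.8 ≈ 0.5798
```

Because y<t> is one-hot, each term simply picks out −log of the probability the model gave to the correct next word.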
Recurrent Neural Networks

Sampling novel sequences
Sampling a sequence from a trained RNN

[Diagram: start with a<0> = 0 and x<1> = 0. Sample ŷ<1> from the first softmax, feed the sampled word back in as x<2>, sample ŷ<2>, and so on, until <EOS> (or a maximum length) is reached.]
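The sampling procedure above is a short loop. A minimal numpy sketch; the toy cell with random weights below is a stand-in for a trained model, and the vocabulary size, <EOS> index, and seed are arbitrary choices:

```python
import numpy as np

def sample_sequence(step_fn, vocab_size, n_a, max_len=10, eos_index=0, seed=1):
    """Start from a<0> = 0 and x<1> = 0; at each step, sample a word index
    from the softmax yhat<t> and feed its one-hot back in as the next input.
    Stop when <EOS> (index eos_index) is sampled or max_len is reached."""
    rng = np.random.default_rng(seed)
    a = np.zeros((n_a, 1))
    x = np.zeros((vocab_size, 1))
    indices = []
    for _ in range(max_len):
        a, y_hat = step_fn(a, x)
        idx = int(rng.choice(vocab_size, p=y_hat.ravel()))  # sample, don't argmax
        indices.append(idx)
        if idx == eos_index:
            break
        x = np.zeros((vocab_size, 1))
        x[idx] = 1.0
    return indices

# Toy "trained" cell: random weights over a 6-word vocabulary.
rng = np.random.default_rng(0)
V, n_a = 6, 4
Wa = rng.standard_normal((n_a, n_a + V)) * 0.1
Wy = rng.standard_normal((V, n_a)) * 0.1

def toy_step(a_prev, x_t):
    a_t = np.tanh(Wa @ np.vstack([a_prev, x_t]))
    e = np.exp(Wy @ a_t)
    return a_t, e / e.sum()

seq = sample_sequence(toy_step, V, n_a)
print(seq)   # a short list of sampled word indices
```

Sampling (rather than taking the argmax) is what makes each generated sequence novel.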
Character-level language model

ŷ<t> is a softmax vector the same length as the vocabulary. The entry with the highest probability gives the word for ŷ<t> (or, instead of the argmax, pick by random choice weighted by the probabilities). Each softmax is thus P(y<t> | y<1>, …, y<t-1>), so the probability of the whole sentence is the product P(y<1>) · P(y<2> | y<1>) · ….

Word-level vocabulary = [a, aaron, …, zulu, <UNK>]
A character-level model instead uses a vocabulary of individual characters (letters, digits, punctuation).

[Diagram: the same sampling network as above, unrolled over characters instead of words.]
Sequence generation

If the model is trained on a news corpus, the generated text has no romance or metaphor in it; if it is trained on Shakespeare, the output is lyrical and metaphorical, full of figures of speech.

News:
President enrique peña nieto, announced sench’s sulk former coming football langston paring.
“I was not at all surprised,” said hich langston.
“Concussion epidemic”, to be examined.
The gray football the told some and this has on the uefa icon, should money as.

Shakespeare:
The mortal moon hath her eclipse in love.
And subject of this thou art another this fold.
When besser be my love to me see sabl’s.
For whose are ruse of mine eyes heaves.
Recurrent Neural Networks

Vanishing gradients with RNNs
Vanishing gradients with RNNs

Example: translating from Vietnamese to English, “đã là” must be rendered as “was” or “were”. The RNN needs information from much earlier in the sentence (e.g. the word “cat” or “cats”) to choose, but that word is far away: the history does get carried along, yet across such a long gap the gradient cannot update it. This is why the LSTM is needed.

The cat, which already ate …, was full.
The cats, which already ate …, were full.

[Diagram: the unrolled RNN; the gradient from a late output ŷ<Ty> must flow back through many steps to influence early inputs, shrinking along the way.]

Exploding gradients can also occur; they are easier to detect (parameters blow up to NaN) and can be handled with gradient clipping.
Recurrent Neural Networks

Gated Recurrent Unit (GRU)
RNN unit

a<t> = g(Wa [a<t-1>, x<t>] + ba)
GRU (simplified)

c<t> = memory cell, with c<t> = a<t> in the GRU.

c̃<t> = tanh(Wc [c<t-1>, x<t>] + bc)
Γu = σ(Wu [c<t-1>, x<t>] + bu)
c<t> = Γu ∗ c̃<t> + (1 − Γu) ∗ c<t-1>

The cat, which already ate …, was full.

[Cho et al., 2014. On the properties of neural machine translation: Encoder-decoder approaches]
[Chung et al., 2014. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling]
Full GRU

Previously (simplified): c<t> = c̃<t>, i.e. c<t> = tanh(Wc [c<t-1>, x<t>] + bc).
Replaced by: c<t> = Γu ∗ c̃<t> + (1 − Γu) ∗ c<t-1>.

c̃<t> = tanh(Wc [Γr ∗ c<t-1>, x<t>] + bc)
Γu = σ(Wu [c<t-1>, x<t>] + bu)
Γr = σ(Wr [c<t-1>, x<t>] + br)
c<t> = Γu ∗ c̃<t> + (1 − Γu) ∗ c<t-1>
a<t> = c<t>

Idea: replace the plain activation a = tanh(…) with a candidate c̃<t> = tanh(…), then let a gate Γu, varying between 0 and 1, act as a coefficient (a switch adjusting the “volume” at each step): c<t> = Γu ∗ c̃<t> + (1 − Γu) ∗ c<t-1>. Γu is the scale between c̃<t> and c<t>: it controls how much the history influences the candidate in forming c<t>. Normally c<t> would just equal c̃<t>, but with the gate we weigh in part of the past value; Γu is the percentage of present versus past.

The cat, which already ate …, was full.
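The full-GRU equations above translate directly into a step function. A minimal numpy sketch; the shapes and random initialization are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(c_prev, x_t, Wc, bc, Wu, bu, Wr, br):
    """Full GRU step: relevance gate Gamma_r, candidate c~<t>, update gate
    Gamma_u, then c<t> = Gamma_u * c~<t> + (1 - Gamma_u) * c<t-1>."""
    concat = np.vstack([c_prev, x_t])                   # [c<t-1>, x<t>]
    gamma_r = sigmoid(Wr @ concat + br)                 # relevance gate
    gamma_u = sigmoid(Wu @ concat + bu)                 # update gate
    c_tilde = np.tanh(Wc @ np.vstack([gamma_r * c_prev, x_t]) + bc)
    c_t = gamma_u * c_tilde + (1.0 - gamma_u) * c_prev  # gated memory update
    return c_t                                          # a<t> = c<t> in the GRU

rng = np.random.default_rng(0)
n_c, n_x = 4, 3
shape = (n_c, n_c + n_x)
Wc, Wu, Wr = (rng.standard_normal(shape) * 0.1 for _ in range(3))
bc = bu = br = np.zeros((n_c, 1))

c1 = gru_step(np.zeros((n_c, 1)), rng.standard_normal((n_x, 1)),
              Wc, bc, Wu, bu, Wr, br)
print(c1.shape)   # (4, 1)
```

When Γu is near 0 the cell simply copies c<t-1> forward, which is how the GRU carries “the cat” across a long gap.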
Recurrent Neural Networks

LSTM (long short-term memory) unit
GRU and LSTM

GRU:
c̃<t> = tanh(Wc [Γr ∗ c<t-1>, x<t>] + bc)
Γu = σ(Wu [c<t-1>, x<t>] + bu)
Γr = σ(Wr [c<t-1>, x<t>] + br)
c<t> = Γu ∗ c̃<t> + (1 − Γu) ∗ c<t-1>
a<t> = c<t>

(In the simplified GRU there is no Γr at all; only Γu and (1 − Γu).)

Unlike the GRU, the LSTM has a separate gate for each term: Γu for c̃<t> and a forget gate Γf for c<t-1>, plus an output gate Γo that turns c<t> into a<t>. The effective coefficient on c̃<t> is therefore Γo ∗ Γu, and on c<t-1> it is Γo ∗ Γf.

[Hochreiter & Schmidhuber 1997. Long short-term memory]


LSTM units

GRU:
c̃<t> = tanh(Wc [Γr ∗ c<t-1>, x<t>] + bc)
Γu = σ(Wu [c<t-1>, x<t>] + bu)
Γr = σ(Wr [c<t-1>, x<t>] + br)
c<t> = Γu ∗ c̃<t> + (1 − Γu) ∗ c<t-1>
a<t> = c<t>

LSTM:
c̃<t> = tanh(Wc [a<t-1>, x<t>] + bc)
Γu = σ(Wu [a<t-1>, x<t>] + bu)
Γf = σ(Wf [a<t-1>, x<t>] + bf)
Γo = σ(Wo [a<t-1>, x<t>] + bo)
c<t> = Γu ∗ c̃<t> + Γf ∗ c<t-1>
a<t> = Γo ∗ c<t>

[Hochreiter & Schmidhuber 1997. Long short-term memory]
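The LSTM column above, written as a step function. A minimal numpy sketch following the slide’s equations (a<t> = Γo ∗ c<t>); shapes and initialization are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_prev, c_prev, x_t, Wc, bc, Wu, bu, Wf, bf, Wo, bo):
    """One LSTM step: update gate Gamma_u, forget gate Gamma_f, output gate
    Gamma_o; c<t> = Gamma_u * c~<t> + Gamma_f * c<t-1>; a<t> = Gamma_o * c<t>."""
    concat = np.vstack([a_prev, x_t])           # gates all read [a<t-1>, x<t>]
    c_tilde = np.tanh(Wc @ concat + bc)         # candidate memory
    gamma_u = sigmoid(Wu @ concat + bu)
    gamma_f = sigmoid(Wf @ concat + bf)
    gamma_o = sigmoid(Wo @ concat + bo)
    c_t = gamma_u * c_tilde + gamma_f * c_prev  # separate update and forget gates
    a_t = gamma_o * c_t
    return a_t, c_t

rng = np.random.default_rng(0)
n_a, n_x = 4, 3
shape = (n_a, n_a + n_x)
Wc, Wu, Wf, Wo = (rng.standard_normal(shape) * 0.1 for _ in range(4))
bc = bu = bf = bo = np.zeros((n_a, 1))

a1, c1 = lstm_step(np.zeros((n_a, 1)), np.zeros((n_a, 1)),
                   rng.standard_normal((n_x, 1)),
                   Wc, bc, Wu, bu, Wf, bf, Wo, bo)
print(a1.shape, c1.shape)   # (4, 1) (4, 1)
```

Note the key difference from the GRU in code: the cell keeps two running vectors, a<t> and c<t>, and the forget gate replaces the (1 − Γu) term.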
LSTM in pictures

[Diagram of one LSTM cell: a<t-1> and x<t> feed the forget gate Γf, the update gate Γu, the output gate Γo (all sigmoids) and the tanh candidate c̃<t>; c<t> = Γu ∗ c̃<t> + Γf ∗ c<t-1>; the output gate then produces a<t>, which also passes through a softmax to give ŷ<t>. Chaining cells passes both c<t> and a<t> to the next step: c<0>, a<0> → cell → c<1>, a<1> → cell → c<2>, a<2> → …]

The a path carries the value; the c path carries the history, storing past values that supplement a.


Recurrent Neural Networks

Bidirectional RNN
Getting information from the future

He said, “Teddy bears are on sale!”
He said, “Teddy Roosevelt was a great President!”

[Diagram: a forward-only RNN unrolled over the seven words with outputs ŷ<1>, …, ŷ<7>. At the word “Teddy” the network has only seen x<1>, …, x<3>, so it cannot tell whether “Teddy” starts a person’s name without the words that come later.]
Bidirectional RNN (BRNN)

[Diagram: two unrolled RNNs over the same input, one running forward and one running backward; each output combines the two activations:]

ŷ<t> = g(Wy [a→<t>, a←<t>] + by)

Wy does not only have to map the forward activations to the right outputs; it must satisfy the backward direction as well. Roughly: if Wy is trained so that “I love you” gives “anh yêu em”, then the reversed reading “You love I” must likewise give “em yêu anh”. In other words, Wy jointly solves the two-way problem over both the forward and the backward activations.
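The bidirectional idea can be sketched as two recurrent passes plus a joint output layer computing ŷ<t> = Wy [a→<t>, a←<t>] + by. A minimal numpy sketch with toy random-weight cells standing in for trained ones (a final activation g is omitted for brevity):

```python
import numpy as np

def brnn_forward(xs, step_fwd, step_bwd, Wy, by, n_a):
    """Run one recurrent pass left-to-right and one right-to-left, then
    compute each output from both activations at that position."""
    a = np.zeros((n_a, 1))
    fwd = []
    for x in xs:                        # forward pass: t = 1 .. Tx
        a = step_fwd(a, x)
        fwd.append(a)
    a = np.zeros((n_a, 1))
    bwd = [None] * len(xs)
    for t in reversed(range(len(xs))):  # backward pass: t = Tx .. 1
        a = step_bwd(a, xs[t])
        bwd[t] = a
    # yhat<t> combines a-forward<t> and a-backward<t>
    return [Wy @ np.vstack([f, b]) + by for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(0)
n_a, n_x, n_y, T = 4, 3, 2, 5
W_f = rng.standard_normal((n_a, n_a + n_x)) * 0.1
W_b = rng.standard_normal((n_a, n_a + n_x)) * 0.1
step_f = lambda a, x: np.tanh(W_f @ np.vstack([a, x]))
step_b = lambda a, x: np.tanh(W_b @ np.vstack([a, x]))
Wy = rng.standard_normal((n_y, 2 * n_a)) * 0.1
by = np.zeros((n_y, 1))

xs = [rng.standard_normal((n_x, 1)) for _ in range(T)]
ys = brnn_forward(xs, step_f, step_b, Wy, by, n_a)
print(len(ys), ys[0].shape)   # 5 (2, 1)
```

Because the backward pass must finish before any output is computed, a BRNN needs the entire input sequence up front.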
Recurrent Neural Networks

Deep RNNs
Deep RNN example

[Diagram: three stacked recurrent layers. Layer l has activations a[l]<t>; each a[l]<t> depends on a[l]<t-1> (same layer, previous step) and a[l-1]<t> (layer below, same step). The inputs x<1>, …, x<4> feed layer 1, and the outputs ŷ<1>, …, ŷ<4> come off the top layer, e.g. through per-step softmaxes.]

Example: translating “I love you too” with an English vocabulary of 1023 words and a Vietnamese vocabulary of 923 words. Each input word (“I”, “love”, “you”, “too”) is a 1×1023 one-hot vector; each output softmax is 1×923 (over “tôi”, “yêu”, …). For a classifier over the classes (dog, cat), the softmax would instead be 1×2; an image input would likewise be flattened into a single vector.

A deep RNN makes each block deeper, so the number of weight matrices grows from one set for a single-layer RNN to one set per layer, i.e. it scales with the depth.
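Stacking works by feeding each layer’s activation sequence into the next layer as its input sequence. A minimal numpy sketch of a two-layer deep RNN; all sizes are made up:

```python
import numpy as np

def deep_rnn_forward(xs, layers):
    """Stacked RNN: layer l's activation sequence is layer l+1's input sequence.
    `layers` is a list of (Wa, ba) pairs; each cell computes
    a[l]<t> = tanh(Wa [a[l]<t-1>, input<t>] + ba)."""
    seq = xs
    for Wa, ba in layers:
        a = np.zeros((ba.shape[0], 1))   # a[l]<0> = 0
        out = []
        for x in seq:
            a = np.tanh(Wa @ np.vstack([a, x]) + ba)
            out.append(a)
        seq = out                        # this layer's outputs feed the next
    return seq                           # top-layer activations, one per step

rng = np.random.default_rng(0)
n_x, n1, n2, T = 3, 5, 4, 4
layers = [
    (rng.standard_normal((n1, n1 + n_x)) * 0.1, np.zeros((n1, 1))),
    (rng.standard_normal((n2, n2 + n1)) * 0.1, np.zeros((n2, 1))),
]
xs = [rng.standard_normal((n_x, 1)) for _ in range(T)]
top = deep_rnn_forward(xs, layers)
print(len(top), top[0].shape)   # 4 (4, 1)
```

Each (Wa, ba) pair is one layer’s weight set, which is exactly why the parameter count grows with depth.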
