Exercise 1
1.1 Your friend’s mood depends on the weather of the past few days as well as on the
current day’s weather. You’ve collected data for the past 365 days on the weather,
which you represent as a sequence x<1>,…,x<365>. You’ve also collected
data on your friend’s mood, which you represent as y<1>,…,y<365>. You’d like
to build a model to map from x→y. Should you use a Unidirectional RNN or
Bidirectional RNN for this problem?
1.2 Explain why vanishing gradients affect RNNs more heavily than deep FNNs.
Solution:
1.1 You need a unidirectional RNN because your friend’s mood depends only on
the current and past days’ weather. Information about future weather
would not help you.
1.2 Vanishing gradients affect RNNs much more because the weight matrices are shared
across time steps, so backpropagation through time multiplies by the same matrix
repeatedly, whereas in an FNN the weight matrices are different at each layer.
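The argument in 1.2 can be illustrated numerically. The sketch below assumes scalar, purely linear networks (no activation function), so the gradient is just a product of weights; the function names and the example weights are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def gradient_through_time(w_rec: float, steps: int) -> float:
    """Gradient of h(T) w.r.t. h(0) for a scalar linear RNN h(t) = w_rec*h(t-1) + x(t).

    The same weight is reused at every time step, so the chain rule
    multiplies by w_rec once per step, giving w_rec ** steps.
    """
    grad = 1.0
    for _ in range(steps):
        grad *= w_rec
    return grad


def gradient_through_layers(weights: Sequence[float]) -> float:
    """Same chain-rule product for a scalar linear FNN with one weight per layer."""
    grad = 1.0
    for w in weights:
        grad *= w
    return grad


if __name__ == "__main__":
    # Shared weight 0.5 over 20 time steps: the gradient shrinks to 0.5 ** 20.
    print(gradient_through_time(0.5, 20))
    # Hand-picked layer weights (an assumption): small and large factors can offset each other.
    print(gradient_through_layers([0.5, 2.0] * 10))
```

With a shared weight below 1 in magnitude the product can only shrink, while independent layer weights are not forced to push the gradient in the same direction at every step.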
Exercise 2
In this exercise we show several “rolled” computational graphs of RNNs. For each
one, you are asked to (1) draw the “unrolled” computational graph for the given
input, making sure to write the values of the input, hidden and output units and also
the weights for each time step; (2) write the RNN equations; and (3) explain what
the RNN is learning.
2.1 For the following RNN, we have the following input sequence: x(t=0)=2,
x(t=1)=-0.5; x(t=2)=1
[Figure: rolled RNN. Input unit → linear hidden unit (W=1); hidden unit → itself
(recurrent, W=1); hidden unit → linear output unit (W=1).]
2.2 For the following RNN, we have the following input sequence x(t=0)=(2,-2);
x(t=1)=(0,3.5); x(t=2)=(1,2.2)
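[Figure: rolled RNN. Input unit 1 → linear hidden unit (W=1); input unit 2 → linear
hidden unit (W=-1); hidden unit → itself (recurrent, W=1); hidden unit → logistic
output unit (W=1).]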
SOLUTION
2.1 Equations: h(t)=Wx(t)+Wh(t-1); y(t)=Wh(t), with all weights W=1. The network
computes the running sum of the inputs.
Unrolled graph (every weight is W=1):
            T=1    T=2    T=3
output y    2      1.5    2.5
hidden h    2      1.5    2.5
input x     2      -0.5   1
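The unrolled values for 2.1 can be checked with a short simulation. This is a minimal sketch that assumes the hidden state starts at 0 before the first input; the function and parameter names are illustrative, not part of the exercise.

```python
def run_linear_rnn(inputs, w_in=1.0, w_rec=1.0, w_out=1.0, h_init=0.0):
    """Unroll h(t) = w_in*x(t) + w_rec*h(t-1) and y(t) = w_out*h(t)."""
    hidden, outputs = [], []
    h = h_init  # assumed initial hidden state
    for x in inputs:
        h = w_in * x + w_rec * h
        hidden.append(h)
        outputs.append(w_out * h)
    return hidden, outputs


hidden, outputs = run_linear_rnn([2.0, -0.5, 1.0])
print(hidden)   # [2.0, 1.5, 2.5]
print(outputs)  # [2.0, 1.5, 2.5]
```

Each hidden value equals the sum of all inputs seen so far, which is what a recurrent weight of 1 with a linear unit produces.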
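2.2 Equations: h(t)=Wx(t)+Uh(t-1); y(t)=f(Wh(t)), with $f(z) = \frac{1}{1+e^{-z}}$.
This network compares the running totals of the first and second inputs: the output
is above 0.5 when the first input's total is larger and below 0.5 when the second
input's total is larger.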
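Unrolled graph (outputs are f applied to the hidden values, rounded to two decimals):
            T=1    T=2    T=3
output y    0.98   0.62   0.33
hidden h    4      0.5    -0.7
input x1    2      0      1
input x2    -2     3.5    2.2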
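The hidden and output values for 2.2 follow directly from the equations and can be reproduced with the sketch below. It assumes an initial hidden state of 0, input weights (1, -1), and recurrent and output weights of 1 as drawn in the rolled graph; the function names are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def run_comparison_rnn(inputs, w_in=(1.0, -1.0), w_rec=1.0, w_out=1.0, h_init=0.0):
    """Unroll h(t) = w_in . x(t) + w_rec*h(t-1) and y(t) = sigmoid(w_out*h(t))."""
    hidden, outputs = [], []
    h = h_init  # assumed initial hidden state
    for x1, x2 in inputs:
        h = w_in[0] * x1 + w_in[1] * x2 + w_rec * h
        hidden.append(h)
        outputs.append(sigmoid(w_out * h))
    return hidden, outputs


hidden, outputs = run_comparison_rnn([(2.0, -2.0), (0.0, 3.5), (1.0, 2.2)])
print([round(h, 2) for h in hidden])   # [4.0, 0.5, -0.7]
print([round(y, 2) for y in outputs])  # [0.98, 0.62, 0.33]
```

The hidden unit holds the difference between the two running totals, and the logistic output squashes that difference into the interval (0, 1).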
Exercise 3.
The figure shows an RNN with one input unit x, one logistic hidden unit h, and one linear output
unit y. The network parameters are Wxh=-0.1, Whh=0.5 and Why=0.25, hbias=0.4
and ybias=0.0. The input takes the values 18, 9, -8 at time steps 0, 1 and 2.
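[Figure: RNN unrolled over time steps T=0, T=1, T=2. Inputs x0, x1, x2 feed hidden
units h0, h1, h2 through Wxh; each hidden unit feeds the next one through Whh; hidden
units h0, h1, h2 feed outputs y1, y2, y3 through Why.]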
3.1-Compute the hidden value h0
3.2-Compute the output value y1
3.3-Compute the output value y2
SOLUTION:
3.1
$f(z) = \frac{1}{1+e^{-z}}$
$h_0 = f(x_0 W_{xh} + h_{-1} W_{hh} + b)$;
$h_0 = f(18 \cdot (-0.1) + 0 + 0.4)$;
$h_0 = \frac{1}{1+e^{1.4}} \approx 0.2$;
3.2
$y_1 = W_{hy} h_0 = 0.25 \cdot 0.2 = 0.05$
3.3
$h_1 = f(x_1 W_{xh} + h_0 W_{hh} + b)$
$y_2 = W_{hy} h_1 = 0.25 \cdot h_1$
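The three answers can be verified numerically, including the value of y2 that the solution leaves in symbolic form. This is a minimal sketch that assumes the hidden state before time step 0 is 0, as in the solution for 3.1; the constant and function names are illustrative. Results are rounded to two decimals, so they match the hand calculation even though it uses the rounded value of h0.

```python
import math

# Parameters from the exercise statement.
W_XH, W_HH, W_HY = -0.1, 0.5, 0.25
H_BIAS, Y_BIAS = 0.4, 0.0


def sigmoid(z):
    """Logistic function f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, h_prev=0.0):
    """Return the hidden values and the linear outputs for each time step."""
    hidden, outputs = [], []
    for x in inputs:
        h_prev = sigmoid(x * W_XH + h_prev * W_HH + H_BIAS)
        hidden.append(h_prev)
        outputs.append(W_HY * h_prev + Y_BIAS)
    return hidden, outputs


hidden, outputs = forward([18, 9, -8])
# The figure labels the output computed from h0 as y1 and the one from h1 as y2.
print(round(hidden[0], 2))   # h0 -> 0.2
print(round(outputs[0], 2))  # y1 -> 0.05
print(round(outputs[1], 2))  # y2 -> 0.1
```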