Networks Review
Deep Learning
CS427/CS5310/EE414/EE513
By Murtaza Taj
Derivatives
• Power rule: f(x) = x^n  ⟹  d(f(x))/dx = n x^(n−1) d(x)/dx = n x^(n−1)
  Example: f(x) = x^3  ⟹  d(f(x))/dx = 3x^(3−1) d(x)/dx = 3x^2
• Product rule: f(x, y) = x^n y^m  ⟹  f′(x, y) = x^n ∂(y^m)/∂y + y^m ∂(x^n)/∂x
  Likewise f(x, y) = x^m / y^n = x^m y^(−n)
• Chain rule: z = x^m, y = 1 + z^n  ⟹  ∂y/∂x = (∂z/∂x)(∂y/∂z) = (m x^(m−1))(n z^(n−1))
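These rules can be sanity-checked symbolically; a minimal sketch with sympy (my own illustration, not part of the slides):

```python
import sympy as sp

x, n, m = sp.symbols('x n m', positive=True)

# Power rule: d(x^n)/dx − n·x^(n−1) should simplify to 0
print(sp.simplify(sp.diff(x**n, x) - n * x**(n - 1)))

# Concrete case: d(x^3)/dx = 3x^2
print(sp.diff(x**3, x))

# Chain rule: z = x^m, y = 1 + z^n  =>  dy/dx = (m·x^(m−1))·(n·z^(n−1))
z = x**m
y = 1 + z**n
print(sp.simplify(sp.diff(y, x) - (m * x**(m - 1)) * (n * z**(n - 1))))
```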
Biological Neuron

w^T = {w0, w1, w2, ⋯}
x^T = {1, x1, x2, ⋯}
Analytical vs. Iterative Solution

y = w^T x
Key Computation: Forward Pass

y = w^T x
E = (1/n) Σ_{i∈train} (t^(i) − y^(i))²
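A minimal sketch of this forward pass in NumPy (the data and weights are made-up illustrative values; the first input column is the constant 1 that multiplies the bias weight w0):

```python
import numpy as np

X = np.array([[1.0, 0.5, 1.5],    # each row is one example; first column is the bias input 1
              [1.0, 2.0, -1.0]])
t = np.array([1.0, 0.0])          # targets t^(i)
w = np.array([0.1, 0.2, -0.3])    # weights {w0, w1, w2}

y = X @ w                         # forward pass: y^(i) = w^T x^(i)
E = np.mean((t - y) ** 2)         # E = (1/n) * sum_i (t^(i) - y^(i))^2
print(y, E)
```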
Key Computation: Forward/Backward Pass

y^(i) = w^T x^(i) = Σ_j w_j x_j^(i)
E = (1/n) Σ_{i∈train} (t^(i) − y^(i))²

∂E/∂w_j = Σ_i (∂y^(i)/∂w_j)(∂E/∂y^(i)) = −(2/n) Σ_{i∈train} (t^(i) − y^(i)) x_j^(i)
∂E/∂x_j = −(2/n) Σ_{i∈train} (t^(i) − y^(i)) w_j
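A sketch of the weight gradient above in NumPy, checked against a central-difference estimate (same made-up data as the forward-pass sketch):

```python
import numpy as np

X = np.array([[1.0, 0.5, 1.5],
              [1.0, 2.0, -1.0]])
t = np.array([1.0, 0.0])
w = np.array([0.1, 0.2, -0.3])

def error(w):
    y = X @ w
    return np.mean((t - y) ** 2)

# Analytic gradient from the slide: dE/dw_j = -(2/n) * sum_i (t^(i) - y^(i)) x_j^(i)
y = X @ w
grad = -(2.0 / len(t)) * X.T @ (t - y)

# Numerical check with central differences
eps = 1e-6
num_grad = np.array([(error(w + eps * e) - error(w - eps * e)) / (2 * eps)
                     for e in np.eye(len(w))])
print(grad, num_grad)   # the two should agree closely
```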
Gradient Descent

• Δw_j ← initialize to zeros
• For each iteration
  • For each training example (x^(i), t^(i))
    • For each weight w_j
      Δw_j ← Δw_j + η x_j^(i) (t^(i) − y^(i))
  • For each weight w_j
    w_j ← w_j + Δw_j

Over one pass this accumulates Δw_j = η Σ_i x_j^(i) (t^(i) − y^(i)).
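A minimal sketch of this loop in Python (the data, learning rate η, and iteration count are made-up illustrative choices):

```python
import numpy as np

X = np.array([[1.0, 0.5, 1.5],
              [1.0, 2.0, -1.0],
              [1.0, -1.0, 0.5]])
t = np.array([1.0, 0.0, 0.5])
w = np.zeros(X.shape[1])
eta = 0.05

for _ in range(200):                      # "for each iteration"
    dw = np.zeros_like(w)                 # Δw initialized to zeros
    for x_i, t_i in zip(X, t):            # "for each training example"
        y_i = w @ x_i
        dw += eta * x_i * (t_i - y_i)     # Δw_j ← Δw_j + η x_j^(i) (t^(i) − y^(i))
    w = w + dw                            # w_j ← w_j + Δw_j
print(w)
```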
Gradient Descent

y = (a + b) × (b − c)
y = ab − ac + b² − bc
Computation Graph

y = (a + b) × (b − c)
y = ab − ac + b² − bc

• We want to calculate the gradient w.r.t. a, b, c:
  ∂y/∂a, ∂y/∂b, ∂y/∂c
• For example, ∂y/∂a = b − c
Computation Graph

y = (a + b) × (b − c)
y = ab − ac + b² − bc

[Graph: a and b feed a "+" node, b and c feed a "−" node, and the two results feed a "×" node that produces y.]
Computation Graph - Forward Pass

• Let us consider a = 5, b = −2, c = 4
y = (a + b) × (b − c)
d = a + b = 5 − 2 = 3
e = b − c = −2 − 4 = −6
Computation Graph - Forward Pass

• Let us consider a = 5, b = −2, c = 4
y = (a + b) × (b − c)
d = a + b = 5 − 2 = 3
e = b − c = −2 − 4 = −6
y = d × e = 3 × (−6) = −18
Computation Graph - Backward Pass

• We start the backward pass by finding the derivative of the final output with respect to the final output (itself!):
∂y/∂y = 1
Computation Graph - Backward Pass

• Now, given y = d × e:
∂y/∂d = e = b − c = −6
∂y/∂e = d = a + b = 3
Computation Graph - Backward Pass

d = a + b,  e = b − c,  y = d × e = (a + b) × (b − c)
∂y/∂d = e = −6
∂y/∂e = d = 3
Computation Graph - Backward Pass

• From the chain rule we have ∂y/∂c = (∂y/∂e)(∂e/∂c) = d × (−1) = −3

y = (a + b) × (b − c)
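The whole forward/backward walkthrough above fits in a few lines of Python; a minimal sketch (the ∂y/∂b branch is not derived on the slides, but it follows the same chain rule with the two paths through d and e added):

```python
# Manual forward and backward pass over the graph y = (a + b) * (b - c),
# with the same values a = 5, b = -2, c = 4.
a, b, c = 5.0, -2.0, 4.0

# Forward pass
d = a + b            # d = 3
e = b - c            # e = -6
y = d * e            # y = -18

# Backward pass (chain rule, from the output back to the inputs)
dy_dd = e                           # ∂y/∂d = e = -6
dy_de = d                           # ∂y/∂e = d = 3
dy_da = dy_dd * 1.0                 # ∂y/∂a = ∂y/∂d · ∂d/∂a = -6
dy_db = dy_dd * 1.0 + dy_de * 1.0   # b feeds both d and e, so the two paths add: e + d = -3
dy_dc = dy_de * (-1.0)              # ∂y/∂c = ∂y/∂e · ∂e/∂c = d · (-1) = -3

print(y, dy_da, dy_db, dy_dc)
```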
Computation Graph - Backward Pass

• g = (x + y) z
DIY Examples

https://ekababisong.org/gcp-ml-seminar/tensorflow/
Types of Neurons

Each neuron computes f(x, w) from inputs weighted by w0, w1, w2, w3:
• Linear Neuron
• Logistic Neuron
• Perceptron
• Potentially more. Require a convex loss function for gradient descent.

Slide Credit: HKUST
Logistic Neuron

z = w0 + Σ_j w_j x_j
y = 1 / (1 + e^(−z))
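A minimal sketch of this neuron in Python (the weights and inputs are made-up illustrative values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w0 = 0.1                       # bias weight
w = np.array([0.4, -0.6])      # input weights
x = np.array([1.5, 2.0])       # inputs

z = w0 + w @ x                 # z = w0 + sum_j w_j x_j
y = sigmoid(z)                 # y = 1 / (1 + e^(-z))
print(z, y)
```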
Key Computation: Forward-Prop

z = w0 + Σ_j w_j x_j
y = 1 / (1 + e^(−z))
E = (1/2) (t − y)²

The backward pass needs ∂E/∂w_j and ∂E/∂x_j, built from {∂z/∂w_j, ∂z/∂x_j}, ∂y/∂z, and ∂E/∂y.
Key Computation: Back-Prop

z = w0 + Σ_j w_j x_j
y = 1 / (1 + e^(−z))
E = (1/2) (t − y)²

∂E/∂w_j = (∂z/∂w_j)(∂y/∂z)(∂E/∂y) = −x_j y(1 − y)(t − y)
∂E/∂x_j follows the same chain with ∂z/∂x_j = w_j, giving −w_j y(1 − y)(t − y)
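A sketch of this backward pass for a single training example in Python (values are made-up; the chain of factors matches the slide):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w0, w = 0.1, np.array([0.4, -0.6])
x = np.array([1.5, 2.0])
t = 1.0

z = w0 + w @ x
y = sigmoid(z)
E = 0.5 * (t - y) ** 2

dE_dy = -(t - y)                # from E = 1/2 (t - y)^2
dy_dz = y * (1 - y)             # sigmoid derivative (derived on the next slide)
dE_dw = x * dy_dz * dE_dy       # = -x_j * y(1 - y) * (t - y)
dE_dw0 = 1.0 * dy_dz * dE_dy    # bias weight: dz/dw0 = 1
print(E, dE_dw, dE_dw0)
```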
Derivation of Logistic Neuron

y = 1 / (1 + e^(−z)) = (1 + e^(−z))^(−1)

∂y/∂z = −1 · (−e^(−z)) / (1 + e^(−z))² = [1 / (1 + e^(−z))] · [e^(−z) / (1 + e^(−z))] = y(1 − y)

because e^(−z) / (1 + e^(−z)) = [(1 + e^(−z)) − 1] / (1 + e^(−z)) = 1 − 1 / (1 + e^(−z)) = 1 − y
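A quick numerical check of ∂y/∂z = y(1 − y) using central differences (an illustration, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

eps = 1e-6
for z in [-2.0, 0.0, 1.5]:
    numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
    analytic = sigmoid(z) * (1 - sigmoid(z))
    print(z, numeric, analytic)   # the two columns should match to about 6 decimals
```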
Key Computation: Back-Prop

z = w0 + Σ_j w_j x_j
y = 1 / (1 + e^(−z))
E = (1/2) Σ_{i∈train} (t − y)²

∂E/∂w_j = Σ_i (∂z/∂w_j)(∂y/∂z)(∂E/∂y) = −Σ_i x_j y(1 − y)(t − y)
Key Computation: Back-Prop

• Iterative Solution (gradient descent on E):
w_j ← w_j + Δw_j, where Δw_j = −η ∂E/∂w_j
i.e. w_j ← w_j − η ∂E/∂w_j
sigmoid vs tanh
Rectified Linear Units (ReLU)

Perceptron
Example: w = [0.79, 0.96, 0.66]
Implements a linear function f(x1, x2).
AND Function
OR Function

With weights [1, 1, 1] (bias weight first) on ±1 inputs, the weighted sum w0·1 + w1·x1 + w2·x2 gives:
• (−1, −1): 1 × 1 − 1 × 1 − 1 × 1 = −1
• (−1, 1): 1 × 1 − 1 × 1 + 1 × 1 = 1
• (1, −1): 1 × 1 + 1 × 1 − 1 × 1 = 1
• (1, 1): 1 × 1 + 1 × 1 + 1 × 1 = 3

The sign of the sum reproduces OR.
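A sketch of these perceptrons in Python on ±1 inputs. The OR weights (bias 1, input weights 1) are the ones implied by the computations above; the AND weights (bias −1, input weights 1) are my own choice for illustration:

```python
import numpy as np

def perceptron(w0, w, x):
    """Threshold unit: +1 if the weighted sum is positive, else -1."""
    return 1 if w0 + np.dot(w, x) > 0 else -1

inputs = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
for x in inputs:
    print(x,
          "OR:", perceptron(1.0, [1.0, 1.0], x),
          "AND:", perceptron(-1.0, [1.0, 1.0], x))
```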
What to do in case of a non-linear problem? XOR

No set of weights can separate these classes.

XOR Function

Consider XOR:
x1   x2   XOR(x1, x2)
-1   -1   -1
-1    1    1
 1   -1    1
 1    1   -1

No set of weights can separate these classes.
XOR

XOR(x1, x2) = OR(AND(x1, ¬x2), AND(¬x1, x2))

x1   x2   AND(x1, ¬x2)   AND(¬x1, x2)   OR(AND(x1, ¬x2), AND(¬x1, x2))
-1   -1        -1             -1                 -1
-1    1        -1              1                  1
 1   -1         1             -1                  1
 1    1        -1             -1                 -1
XOR

[Network diagram: x1 and x2 feed two hidden perceptrons, h1 = AND(x1, ¬x2) and h2 = AND(¬x1, x2), whose outputs feed a final perceptron computing XOR(x1, x2) = OR(h1, h2); the weights shown are ±1.]
XOR

By combining two Perceptrons, we are able to create a non-linear decision boundary.
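A sketch of this two-layer XOR network in Python on ±1 inputs; the weight values are a natural choice for the AND/OR decomposition above, not necessarily the exact numbers on the slide's figure:

```python
import numpy as np

def unit(w0, w, x):
    """Threshold perceptron: +1 if the weighted sum is positive, else -1."""
    return 1 if w0 + np.dot(w, x) > 0 else -1

def xor_net(x1, x2):
    h1 = unit(-1.0, [1.0, -1.0], [x1, x2])   # h1 = AND(x1, ¬x2)
    h2 = unit(-1.0, [-1.0, 1.0], [x1, x2])   # h2 = AND(¬x1, x2)
    return unit(1.0, [1.0, 1.0], [h1, h2])   # XOR(x1, x2) = OR(h1, h2)

for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print(x1, x2, xor_net(x1, x2))
```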
A Powerful Model

http://playground.tensorflow.org/
Readings
• Gradient Descent
• Forward and Backward pass of Neural Network
• [Nikhil Buduma CH1] Neural, Perceptron, Regression, Logistic Regression
• [Nikhil Buduma CH2] Training feedforward NN, back prop., Gradient Descent
Multilayer Networks

[Network diagram: inputs x1, x2 connect to hidden units h1, h2 through weights w1–w4 and bias b1; h1, h2 connect to outputs o1, o2 through weights w5–w8 and bias b2. Each unit computes a pre-activation z and an activation y, written z_h1, y_h1, z_h2, y_h2 for the hidden layer and z_o1, y_o1, z_o2, y_o2 for the output layer.]
Key Computation: Back-Prop

∂E/∂w5 = ?
Key Computation: Back-Prop

∂E_T/∂w5 = (∂z_o1/∂w5)(∂y_o1/∂z_o1)(∂E_T/∂y_o1) = y_h1 · y_o1(1 − y_o1) · (−(t_o1 − y_o1))
Key Computation: Back-Prop

∂E/∂w1 = ?
Key Computation: Back-Prop

∂E/∂w1 = (∂z_h1/∂w1)(∂y_h1/∂z_h1)(∂E/∂y_h1)

where ∂E/∂y_h1 collects two paths through the output layer:
(∂z_o1/∂y_h1)(∂y_o1/∂z_o1)(∂E_o1/∂y_o1) and (∂z_o2/∂y_h1)(∂y_o2/∂z_o2)(∂E_o2/∂y_o2)
Key Computation: Back-Prop

∂E/∂w1 = (∂z_h1/∂w1)(∂y_h1/∂z_h1)(∂E/∂y_h1)

∂E/∂y_h1 = (∂z_o1/∂y_h1)(∂y_o1/∂z_o1)(∂E_o1/∂y_o1) + (∂z_o2/∂y_h1)(∂y_o2/∂z_o2)(∂E_o2/∂y_o2)

[If the network had a third output o3 connected to h1 and h2 through w9 and w10, a third term (∂z_o3/∂y_h1)(∂y_o3/∂z_o3)(∂E_o3/∂y_o3) would be added in the same way.]
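A sketch of these two chains on the 2-2-2 network in Python (sigmoid units, E_T = E_o1 + E_o2); all numbers are made-up illustrative values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x1, x2 = 0.5, 0.3
t_o1, t_o2 = 1.0, 0.0
w1, w2, w3, w4 = 0.1, 0.2, 0.3, 0.4      # input -> hidden
w5, w6, w7, w8 = 0.5, 0.6, 0.7, 0.8      # hidden -> output
b1, b2 = 0.1, 0.1

# Forward pass
z_h1 = w1*x1 + w2*x2 + b1; y_h1 = sigmoid(z_h1)
z_h2 = w3*x1 + w4*x2 + b1; y_h2 = sigmoid(z_h2)
z_o1 = w5*y_h1 + w6*y_h2 + b2; y_o1 = sigmoid(z_o1)
z_o2 = w7*y_h1 + w8*y_h2 + b2; y_o2 = sigmoid(z_o2)

# dE/dw5 = (dz_o1/dw5)(dy_o1/dz_o1)(dE/dy_o1)
dE_dw5 = y_h1 * y_o1*(1 - y_o1) * (-(t_o1 - y_o1))

# dE/dw1 = (dz_h1/dw1)(dy_h1/dz_h1)(dE/dy_h1),
# where dE/dy_h1 adds the contributions through o1 and o2
dE_dyh1 = (w5 * y_o1*(1 - y_o1) * (-(t_o1 - y_o1))
           + w7 * y_o2*(1 - y_o2) * (-(t_o2 - y_o2)))
dE_dw1 = x1 * y_h1*(1 - y_h1) * dE_dyh1

print(dE_dw5, dE_dw1)
```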
Numerical Example

A Step by Step Backpropagation Example

https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
Multilayer Networks

[Recap of the same 2-2-2 network: inputs x1, x2 → hidden h1, h2 (weights w1–w4, bias b1) → outputs o1, o2 (weights w5–w8, bias b2), with pre-activations z and activations y at each unit.]
A Step by Step Backpropagation Example
A Step by Step Backpropagation Example

For the rest of this tutorial we're going to work with a single training set: given inputs 0.05 and 0.10, we want the neural network to output 0.01 and 0.99.
A Step by Step Backpropagation Example

y_o1 = 1 / (1 + e^(−z_o1))

For o2 we get y_o2 in the same way.
A Step by Step Backpropagation Example

E_o1 = (1/2) (t_o1 − y_o1)²
A Step by Step Backpropagation Example

z_o1 → y_o1 → E_o1 = (1/2) (t_o1 − y_o1)²

E_T = E_o1 + E_o2
A Step by Step Backpropagation Example
A Step by Step Backpropagation Example

E_T = E_o1 + E_o2
A Step by Step Backpropagation Example

Following the same process for ∂E_o2/∂y_h1, we get:

∂E_T/∂w1 = (∂z_h1/∂w1)(∂y_h1/∂z_h1)(∂E_T/∂y_h1)

Similarly, the same chain gives the gradients for the other weights.
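A numerical cross-check of the hand-derived chain for ∂E_T/∂w1 in Python, using the inputs (0.05, 0.10) and targets (0.01, 0.99) stated above; the weights and biases below are placeholder values (the linked post uses its own initial weights):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.05, 0.10])
t = np.array([0.01, 0.99])
b1, b2 = 0.35, 0.60
W_h = np.array([[0.15, 0.20],   # [w1, w2] -> h1
                [0.25, 0.30]])  # [w3, w4] -> h2
W_o = np.array([[0.40, 0.45],   # [w5, w6] -> o1
                [0.50, 0.55]])  # [w7, w8] -> o2

def total_error(W_h, W_o):
    y_h = sigmoid(W_h @ x + b1)
    y_o = sigmoid(W_o @ y_h + b2)
    return 0.5 * np.sum((t - y_o) ** 2)

# Analytic dE_T/dw1 via the chain from the previous slides
y_h = sigmoid(W_h @ x + b1)
y_o = sigmoid(W_o @ y_h + b2)
dE_dyh1 = np.sum(W_o[:, 0] * y_o * (1 - y_o) * (-(t - y_o)))
dE_dw1 = x[0] * y_h[0] * (1 - y_h[0]) * dE_dyh1

# Finite-difference check on w1
eps = 1e-6
W_plus = W_h.copy();  W_plus[0, 0] += eps
W_minus = W_h.copy(); W_minus[0, 0] -= eps
numeric = (total_error(W_plus, W_o) - total_error(W_minus, W_o)) / (2 * eps)
print(dE_dw1, numeric)   # should agree closely
```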
Take Home - Written Assignment

Derive the formula for:
∂E/∂w2 = ?    ∂E/∂w6 = ?
∂E/∂w3 = ?    ∂E/∂w7 = ?
∂E/∂w4 = ?    ∂E/∂w8 = ?
Readings