Part 1.2. Back Propagation
Instructor:
Assoc. Prof. Dr. Truong Ngoc Son
Chapter 2
Back Propagation
Outline
Multiclass Classification With Softmax regression
Loss function – cross entropy
Stochastic gradient descent – batch and
mini-batch gradient descent
Translating math into code
Multiclass Classification with Softmax Regression
Multiclass classification example
[Figure: fully connected network. The inputs x1 … xn feed a hidden layer of neurons with weights W(1)j,i, biases b(1)j, pre-activations z(1)j, activations a(1)j and activation function f; the hidden layer feeds the output layer with weights W(2)k,j, biases b(2)k and outputs o1 … ok.]
Multiclass classification example
[Figure: the same network, with the predictive outputs o compared against the desired outputs y.]

Predictive output, o    Desired output, y
0.9                     1
0.7                     0
0.5                     0

With N training samples and K outputs, the cost / loss is

L = (1/N) Σ_{t=1}^{N} Σ_{k=1}^{K} (y_k^(t) − o_k^(t))^2
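As a quick numeric check of this cost on the single sample shown in the table above (so N = 1 and K = 3), a minimal sketch:

y = [1, 0, 0]          # desired outputs from the table
o = [0.9, 0.7, 0.5]    # predictive outputs from the table

N = 1
L = (1/N) * sum((yk - ok)**2 for yk, ok in zip(y, o))
print(L)               # 0.1^2 + 0.7^2 + 0.5^2 = 0.75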
Multiclass classification with logistic regression
The sigmoid function is used for the neurons at the output layer.
Softmax is used for multiclass classification in the logistic regression model, whereas sigmoid is used for binary classification in the logistic regression model.
The softmax enforces that the probabilities of the output classes sum to one:

σ(z)_j = e^(z_j) / Σ_{k=1}^{K} e^(z_k)
Multiclass classification with softmax regression
The softmax function is mostly used in the final layer of a neural network.
The outputs form a probability distribution:

σ(z)_j = e^(z_j) / Σ_{k=1}^{K} e^(z_k)
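A minimal sketch of this formula in NumPy (the logit values are made up for illustration), confirming that the outputs sum to one:

import numpy as np

z = np.array([2.0, 1.0, 0.1])          # example logits
o = np.exp(z) / np.sum(np.exp(z))      # softmax: e^(z_j) / sum_k e^(z_k)
print(o)                               # approx [0.659, 0.242, 0.099]
print(o.sum())                         # 1.0 – a probability distribution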
Loss function – cross entropy
Cross-entropy takes the negative log likelihood of the predicted probability.
Cross-entropy loss:

Loss = − Σ_{i=1}^{M} y_i log(o_i)

M – number of classes
y – class label (one-hot encoded)
o – predicted probability for the observation
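A minimal numeric sketch, reusing the illustrative softmax outputs from above: with a one-hot label, only the log-probability of the true class contributes to the loss.

import numpy as np

y = np.array([1, 0, 0])                # one-hot label, true class is class 0
o = np.array([0.659, 0.242, 0.099])    # predicted probabilities (illustrative)

loss = -np.sum(y * np.log(o))          # cross-entropy = -log(o_true)
print(loss)                            # approx 0.417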
Back propagation
Feedforward propagation with softmax activation
[Figure: the network with inputs x1 … xn, hidden activations a(1)1 … a(1)m and softmax outputs o1 … ok.]

z_j^(1) = Σ_{i=1}^{n} x_i w_{j,i}^(1) + b_j^(1)

a_j^(1) = 1 / (1 + e^(−z_j^(1)))

z_k^(2) = Σ_{j=1}^{m} a_j^(1) w_{k,j}^(2) + b_k^(2)

o_k = e^(z_k^(2)) / Σ_{j=1}^{K} e^(z_j^(2))
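A minimal per-sample sketch of these four equations in NumPy. The toy sizes and the random initialisation are assumptions for illustration only; the names Wh, bh, Wo, bo mirror the ones used in the lecture code later on.

import numpy as np

n, m, K = 4, 5, 3                      # inputs, hidden neurons, classes (toy sizes)
rng = np.random.default_rng(0)

x  = rng.random(n)                     # one input sample
Wh = rng.uniform(-0.5, 0.5, (m, n))    # hidden weights w^(1)_{j,i}
bh = np.zeros(m)                       # hidden biases b^(1)_j
Wo = rng.uniform(-0.5, 0.5, (K, m))    # output weights w^(2)_{k,j}
bo = np.zeros(K)                       # output biases b^(2)_k

z1 = Wh @ x + bh                       # z^(1)_j = sum_i x_i w^(1)_{j,i} + b^(1)_j
a  = 1.0 / (1.0 + np.exp(-z1))         # a^(1)_j = sigmoid(z^(1)_j)
z2 = Wo @ a + bo                       # z^(2)_k = sum_j a^(1)_j w^(2)_{k,j} + b^(2)_k
o  = np.exp(z2) / np.sum(np.exp(z2))   # o_k = softmax(z^(2))_k
print(o, o.sum())                      # class probabilities, summing to 1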
Feedforward propagation with softmax activation

[Figure: the hidden activations a1 … am feed the output neurons z1 … zk through the weights W, producing the softmax outputs o1 … ok.]

Gradient descent:   w_{k,j} = w_{k,j} − η ∂L/∂w_{k,j}

L(y, o) = − Σ_{k=1}^{K} y_k log(o_k)

o_k = e^(z_k) / Σ_{j=1}^{K} e^(z_j)

Apply the chain rule:

∂L/∂w_{k,j} = (∂L/∂z_k) (∂z_k/∂w_{k,j})

∂L/∂z_k = (∂L/∂o_k) (∂o_k/∂z_k)
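Carrying the chain rule through (every output o_j depends on z_k through the softmax denominator, so the full sum over j is needed) gives the standard softmax-with-cross-entropy gradient; this is what the later code computes as d = o − y. Here δ_{jk} denotes the Kronecker delta:

∂L/∂z_k = Σ_{j=1}^{K} (∂L/∂o_j)(∂o_j/∂z_k)
        = Σ_{j=1}^{K} (−y_j/o_j) · o_j (δ_{jk} − o_k)
        = −y_k + o_k Σ_j y_j
        = o_k − y_k                    (since Σ_j y_j = 1 for a one-hot label)

∂L/∂w_{k,j} = (∂L/∂z_k)(∂z_k/∂w_{k,j}) = (o_k − y_k) a_j^(1)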
[Figure: the hidden activation a(1)j is connected to every output neuron o1 … ok through the weights W(2)k,j.]

∂L/∂a_j^(1) = δ_j^(1)

a_j^(1) affects all neurons of the next layer via the weights, so δ_j^(1) collects the errors propagated back from every output neuron.
Back-propagation

[Figure: the same network, with the error flowing back from the output layer to the hidden layer.]

Back propagate the error:

δ_1^(1) = δ_1^(2) w_{1,1}^(2) + … + δ_k^(2) w_{k,1}^(2)

δ_j^(1) = Σ_{k=1}^{K} δ_k^(2) w_{k,j}^(2)

Back propagate through the sigmoid function:

∂L/∂z_j^(1) = δ_j^(1) a_j^(1) (1 − a_j^(1))

Update the weights:

w_{j,i}^(1) = w_{j,i}^(1) − η ∂L/∂w_{j,i}^(1)

w_{k,j}^(2) = w_{k,j}^(2) − η ∂L/∂w_{k,j}^(2) = w_{k,j}^(2) − η δ_k^(2) ∂z_k^(2)/∂w_{k,j}^(2)
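A minimal per-sample sketch of these update rules, continuing the toy forward pass above (x, a, o, Wh, bh, Wo, bo). The one-hot target and learning rate are made-up values; the lecture's own code below applies the same rules to whole batches.

y = np.array([1.0, 0.0, 0.0])          # one-hot target for the sample
eta = 0.5                              # learning rate

d2 = o - y                             # delta^(2)_k = dL/dz^(2)_k = o_k - y_k
d1 = Wo.T @ d2                         # delta^(1)_j = sum_k delta^(2)_k w^(2)_{k,j}
d1s = d1 * a * (1.0 - a)               # back propagate through the sigmoid

Wo -= eta * np.outer(d2, a)            # dL/dw^(2)_{k,j} = delta^(2)_k a^(1)_j
bo -= eta * d2
Wh -= eta * np.outer(d1s, x)           # dL/dw^(1)_{j,i} = (dL/dz^(1)_j) x_i
bh -= eta * d1s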
MNIST Dataset
[Figure: the fully connected network (input layer – hidden layer – output layer) applied to the MNIST handwritten digit images.]
Feed forward propagation
[Figure: the same network, written with the weight matrices Wh (hidden layer) and Wo (output layer).]

Define:
Wh = W(1)
Wo = W(2)

Forward:
z1 = X Whᵀ
a = σ(z1) = 1 / (1 + e^(−z1))
z2 = a Woᵀ
o_k = e^(z2_k) / Σ_{j=1}^{K} e^(z2_j)
Back-propagation error (element-wise product)

Parameters:
n: number of inputs
m: number of neurons in the hidden layer
k: number of neurons in the output layer
t: number of training samples per batch (batch_size)
M: total number of training samples

[Figure: for a batch, the output-layer errors form a t × k matrix d, one row (d1, d2, …, dk) per training sample; the output weight matrix Wo is k × m and the hidden weight matrix Wh is m × n.]

Back propagate the error:

δ_1^(1) = δ_1^(2) w_{1,1}^(2) + … + δ_k^(2) w_{k,1}^(2)

δ_j^(1) = Σ_{k=1}^{K} δ_k^(2) w_{k,j}^(2)

In matrix form, for the whole batch: dh = d Wo, a (t × k)(k × m) = t × m matrix holding the back-propagated error of every hidden neuron for every training sample.
[Figure: the same network, annotated with the matrix update rules.]

Update the output weights:

ΔWo = −(η/t) dᵀ a
Wo = Wo + ΔWo

Back propagate through the sigmoid (element-wise product):

dhs = dh ⊙ a ⊙ (1 − a)

Update the hidden weights:

ΔWh = −(η/t) dhsᵀ X
Wh = Wh + ΔWh
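A minimal vectorized sketch of these matrix equations on random data (the toy sizes, random targets and single update step are assumptions for illustration); it mirrors what the training loop below does on MNIST.

import numpy as np

t, n, m, k = 8, 4, 5, 3                   # batch size, inputs, hidden units, classes
rng = np.random.default_rng(1)

X  = rng.random((t, n))                   # one batch of inputs, t x n
Y  = np.eye(k)[rng.integers(0, k, t)]     # one-hot targets, t x k
Wh = rng.uniform(-0.5, 0.5, (m, n))
Wo = rng.uniform(-0.5, 0.5, (k, m))
eta = 0.5

# forward pass in matrix form (as defined earlier)
a  = 1.0 / (1.0 + np.exp(-(X @ Wh.T)))    # t x m hidden activations
z2 = a @ Wo.T                             # t x k output pre-activations
o  = np.exp(z2) / np.exp(z2).sum(axis=1, keepdims=True)

# back-propagation in matrix form
d   = o - Y                               # t x k output-layer error
dh  = d @ Wo                              # t x m back-propagated error
dhs = dh * a * (1.0 - a)                  # element-wise product through the sigmoid

Wo += -(eta / t) * (d.T @ a)              # delta Wo = -(eta/t) d^T a    (k x m)
Wh += -(eta / t) * (dhs.T @ X)            # delta Wh = -(eta/t) dhs^T X  (m x n)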
import numpy as np
import tensorflow as tf
#load dataset
print("Load MNIST Database")
mnist = tf.keras.datasets.mnist
(x_train,y_train),(x_test,y_test)= mnist.load_data()
x_train=np.reshape(x_train,(60000,784))/255.0
x_test= np.reshape(x_test,(10000,784))/255.0
y_train = np.matrix(np.eye(10)[y_train])
y_test = np.matrix(np.eye(10)[y_test])
print("----------------------------------")
print(x_train.shape)
print(y_train.shape)
Python code
Define functions
def sigmoid(x):
    return 1./(1.+np.exp(-x))

def softmax(x):
    return np.divide(np.matrix(np.exp(x)),np.mat(np.sum(np.exp(x),axis=1)))

def Forwardpass(X,Wh,bh,Wo,bo):
    zh = [email protected] + bh
    a = sigmoid(zh)
    z = [email protected] + bo
    o = softmax(z)
    return o

def AccTest(label,prediction): # calculate the matching score
    OutMaxArg = np.argmax(prediction,axis=1)
    LabelMaxArg = np.argmax(label,axis=1)
    Accuracy = np.mean(OutMaxArg==LabelMaxArg)
    return Accuracy
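A quick sanity check of the helpers on random data (the toy batch of 100 samples and the random weights are assumptions); with untrained weights the accuracy should sit near chance level, about 0.1 for 10 classes.

Xs = np.random.uniform(0, 1, (100, 784))                    # 100 fake "images"
Ys = np.matrix(np.eye(10)[np.random.randint(0, 10, 100)])   # random one-hot labels
Wh0 = np.matrix(np.random.uniform(-0.5, 0.5, (512, 784)))
bh0 = np.zeros((1, 512))
Wo0 = np.random.uniform(-0.5, 0.5, (10, 512))
bo0 = np.zeros((1, 10))

out = Forwardpass(Xs, Wh0, bh0, Wo0, bo0)                   # 100 x 10 class probabilities
print(AccTest(Ys, out))                                     # roughly 0.1 with random weights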
Python code
Define network architecture, initialize weights
learningRate = 0.5
Epoch=50
NumTrainSamples=60000
NumTestSamples=10000
NumInputs=784
NumHiddenUnits=512
NumClasses=10
#initial weights
#hidden layer
Wh=np.matrix(np.random.uniform(-0.5,0.5,(NumHiddenUnits,NumInputs)))
bh= np.random.uniform(0,0.5,(1,NumHiddenUnits))
dWh= np.zeros((NumHiddenUnits,NumInputs))
dbh= np.zeros((1,NumHiddenUnits))
#Output layer
Wo=np.random.uniform(-0.5,0.5,(NumClasses,NumHiddenUnits))
bo= np.random.uniform(0,0.5,(1,NumClasses))
dWo= np.zeros((NumClasses,NumHiddenUnits))
dbo= np.zeros((1,NumClasses))
Python code – Batch Gradient Descent
Training the model
from IPython.display import clear_output
import matplotlib.pyplot as plt   # needed for the accuracy plot below

loss = []
Acc = []
for ep in range(Epoch):
    #feed forward propagation
    x = x_train
    y = y_train
    zh = [email protected] + bh
    a = sigmoid(zh)
    z = [email protected] + bo
    o = softmax(z)
    #calculate loss
    loss.append(-np.sum(np.multiply(y,np.log10(o))))
    #calculate the error for the output layer
    d = o-y
    #Back propagate error
    dh = d@Wo
    dhs = np.multiply(np.multiply(dh,a),(1-a))
    #update weights
    dWo = np.matmul(np.transpose(d),a)
    dbo = np.mean(d) # consider a is 1 for the bias input
    dWh = np.matmul(np.transpose(dhs),x)
    dbh = np.mean(dhs) # consider a is 1 for the bias input
    Wo = Wo - learningRate*dWo/NumTrainSamples
    bo = bo - learningRate*dbo
    Wh = Wh - learningRate*dWh/NumTrainSamples
    bh = bh - learningRate*dbh
    #Test accuracy on the test set after each epoch
    prediction = Forwardpass(x_test,Wh,bh,Wo,bo)
    Acc.append(AccTest(y_test,prediction))
    clear_output(wait=True)
    plt.plot([i for i, _ in enumerate(Acc)],Acc,'o')
    plt.show()
Python code
prediction = Forwardpass(x_test,Wh,bh,Wo,bo)
Rate = AccTest(y_test,prediction)
print(Rate)
Python code – Mini-Batch Gradient Descent
Training the model
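The mini-batch training loop itself is not included in this extract. Below is a minimal sketch of how it could look, reusing the variables defined earlier (learningRate, Epoch, NumTrainSamples, Wh, bh, Wo, bo, x_train, y_train, x_test, y_test); the batch size of 128 and the per-epoch shuffling are assumptions, not taken from the lecture.

BatchSize = 128                     # assumed batch size
Acc = []
for ep in range(Epoch):
    # shuffle the training set at the start of every epoch (assumed)
    idx = np.random.permutation(NumTrainSamples)
    for start in range(0, NumTrainSamples, BatchSize):
        batch = idx[start:start+BatchSize]
        x = x_train[batch]
        y = y_train[batch]
        # forward pass on the mini-batch
        zh = [email protected] + bh
        a = sigmoid(zh)
        z = [email protected] + bo
        o = softmax(z)
        # back propagate the error for this mini-batch
        d = o - y
        dh = d@Wo
        dhs = np.multiply(np.multiply(dh,a),(1-a))
        # update weights, averaging over the mini-batch instead of the full training set
        Wo = Wo - learningRate*np.matmul(np.transpose(d),a)/len(batch)
        bo = bo - learningRate*np.mean(d)
        Wh = Wh - learningRate*np.matmul(np.transpose(dhs),x)/len(batch)
        bh = bh - learningRate*np.mean(dhs)
    # track test accuracy once per epoch
    prediction = Forwardpass(x_test,Wh,bh,Wo,bo)
    Acc.append(AccTest(y_test,prediction))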