Neural - N - Problems - MLP

A multi-layer perceptron uses backpropagation to update the weights in its network. It processes input patterns one at a time: for each pattern it performs a forward pass to calculate outputs, then a backward pass to calculate errors and update weights. Over multiple epochs this decreases the total error across all patterns, until a stopping criterion is met.


Multi-layer perceptron

Back Propagation
Steps of a single epoch
For each pattern:
• Forward prop
o Calculate $net_j$ and $o_j$ for all neurons (except input-layer and bias neurons)
o Calculate the specific error (for the single pattern)
• Back prop
o Calculate $\delta_j$ for all neurons (except input-layer and bias neurons)
o Calculate $\Delta w_{i,j}$ for all variable weights, including bias weights
o $w_{i,j} := w_{i,j} + \Delta w_{i,j}$
After end of epoch:
• Calculate total error = sum of specific errors
• Check the stopping condition
• Run another epoch if the stopping condition is False (a Python sketch of this control flow follows)
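
As a sketch of this control flow in Python (assuming a train_pattern function that does one forward and one backward pass for a single pattern and returns its specific error; a concrete version for the worked example appears after the Notes sections below):

    def run_epoch(patterns, weights, eta, train_pattern):
        # One epoch: process the patterns one at a time and
        # accumulate the total error (sum of specific errors).
        total_error = 0.0
        for x, t in patterns:
            total_error += train_pattern(x, t, weights, eta)
        return total_error

    def train(patterns, weights, eta, train_pattern, tolerance, max_epochs):
        # Run epochs until a stopping condition is met.
        for epoch in range(max_epochs):
            total_error = run_epoch(patterns, weights, eta, train_pattern)
            if total_error <= tolerance:
                break
        return weights, total_error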
Notes
Assuming layer K is before layer H, and either layer L is after layer H, or layer H is the output layer:

• $net_h = \sum_{k \in K} w_{k,h} \, o_k$
• $o_h = f_{act}(net_h)$. The default is the logistic sigmoid $f_{act}(net) = \frac{1}{1 + e^{-net}}$
• $\delta_h$:
o If h is a hidden neuron: $\delta_h = f'_{act}(net_h) \cdot \sum_{l \in L} \delta_l \, w_{h,l}$
Default (sigmoid activation): $\delta_h = o_h (1 - o_h) \sum_{l \in L} \delta_l \, w_{h,l}$
o If h is an output neuron: $\delta_h = f'_{act}(net_h) \cdot \left( -\frac{\partial Err_p}{\partial y_h} \right)$
Default (sigmoid activation and $Err_p = \frac{1}{2} \sum_{h \in H} (t_h - y_h)^2$): $\delta_h = y_h (1 - y_h)(t_h - y_h)$

Warning! If a different activation function or a different error function is used, you must calculate the derivatives $f'_{act}(net_h)$ and $-\frac{\partial Err_p}{\partial y_h}$ yourself.
Notes
• $\Delta w_{i,j} = \eta \, o_i \, \delta_j$
• $w_{i,j} := w_{i,j} + \eta \, o_i \, \delta_j$
• If i is an input neuron: $o_i = x_i$
• If i is a bias neuron: $o_i = 1$
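
To make these formulas concrete, here is a minimal Python sketch of one forward pass plus one backward pass for the 2-2-1 network used in the example below (sigmoid activations, squared error). The dictionary representation of the weights and the function names are my own choices, not part of the notes:

    import math

    def f_act(net):
        # Default activation: logistic sigmoid 1 / (1 + e^(-net))
        return 1.0 / (1.0 + math.exp(-net))

    def train_pattern(x, t, w, eta):
        # Neurons 1, 2 = inputs; 3, 4 = hidden; 5 = output; 'b' = bias.
        # Forward prop: net_h = sum_k w_kh * o_k, then o_h = f_act(net_h)
        o = {1: x[0], 2: x[1], 'b': 1.0}  # o_i = x_i for inputs, o_b = 1 for bias
        o[3] = f_act(w['w13'] * o[1] + w['w23'] * o[2] + w['wb3'])
        o[4] = f_act(w['w14'] * o[1] + w['w24'] * o[2] + w['wb4'])
        o[5] = f_act(w['w35'] * o[3] + w['w45'] * o[4] + w['wb5'])
        err = 0.5 * (t - o[5]) ** 2  # specific error Err_p
        # Back prop: all deltas are computed before any weight changes
        d = {5: o[5] * (1 - o[5]) * (t - o[5])}     # output neuron
        d[3] = o[3] * (1 - o[3]) * d[5] * w['w35']  # hidden neurons
        d[4] = o[4] * (1 - o[4]) * d[5] * w['w45']
        # Weight update: w_ij := w_ij + eta * o_i * delta_j
        for i, j, name in [(3, 5, 'w35'), (4, 5, 'w45'), ('b', 5, 'wb5'),
                           (1, 3, 'w13'), (2, 3, 'w23'), ('b', 3, 'wb3'),
                           (1, 4, 'w14'), (2, 4, 'w24'), ('b', 4, 'wb4')]:
            w[name] = w[name] + eta * o[i] * d[j]
        return err

This is exactly the per-pattern step that the run_epoch sketch above iterates over.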
Example: the XOR function
x1 x2 t
0 0 0
0 1 1
1 0 1
1 1 0

Assume learning rate = 0.3
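
In Python, this training set (the XOR truth table) and learning rate might be written as follows; the names are illustrative:

    # Each pattern is ((x1, x2), t), where t = XOR(x1, x2)
    patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    eta = 0.3  # learning rate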


Epoch: 1
Pattern 1: x1 = 0, x2 = 0, t = 0
• Initial weights:
w13 = 0.3 w23 = -0.1 wb3 = 0.2
w14 = -0.2 w24 = 0.2 wb4 = -0.3
w35 = 0.4 w45 = -0.2 wb5 = 0.4

• Forward prop:
o net3 = w13 * x1 + w23 * x2 + wb3 = 0.3 * 0 - 0.1 * 0 + 0.2 = 0.2
o o3 = 1/(1 + e^-net3) = 1/(1 + e^-0.2) = 0.5498
o net4 = w14 * x1 + w24 * x2 + wb4 = -0.2 * 0 + 0.2 * 0 - 0.3 = -0.3
o o4 = 1/(1 + e^-net4) = 1/(1 + e^0.3) = 0.4256
o net5 = w35 * o3 + w45 * o4 + wb5 = 0.4 * 0.5498 - 0.2 * 0.4256 + 0.4 = 0.5348
o y = 1/(1 + e^-net5) = 1/(1 + e^-0.5348) = 0.6306
• Calculating error:
o Err_p1 = 0.5 * (0 - 0.6306)^2 = 0.1988
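
These forward-pass values can be checked with a short Python script (a sketch using the initial weights above):

    import math

    def sig(net):
        return 1.0 / (1.0 + math.exp(-net))

    x1, x2, t = 0, 0, 0
    w13, w23, wb3 = 0.3, -0.1, 0.2
    w14, w24, wb4 = -0.2, 0.2, -0.3
    w35, w45, wb5 = 0.4, -0.2, 0.4

    o3 = sig(w13 * x1 + w23 * x2 + wb3)   # sig(0.2)    ≈ 0.5498
    o4 = sig(w14 * x1 + w24 * x2 + wb4)   # sig(-0.3)   ≈ 0.4256
    y = sig(w35 * o3 + w45 * o4 + wb5)    # sig(0.5348) ≈ 0.6306
    print(o3, o4, y, 0.5 * (t - y) ** 2)  # error Err_p1 ≈ 0.1988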
Epoch: 1
Pattern 1: x1 = 0, x2 = 0, t = 0
Back Prop:

1) Finding delta
δ5 = y * (1 - y) * (t - y) = 0.6306 * (1 - 0.6306) * (0 - 0.6306) = -0.1469
δ3 = o3 * (1 - o3) * δ5 * w35 = 0.5498 * (1 - 0.5498) * (-0.1469) * 0.4 = -0.0145
δ4 = o4 * (1 - o4) * δ5 * w45 = 0.4256 * (1 - 0.4256) * (-0.1469) * (-0.2) = 0.0072

2) Finding new weights


w35 := w35 + η * o3 * δ5 = 0.4 + 0.3 * 0.5498 * (-0.1469) = 0.3758
w45 := w45 + η * o4 * δ5 = -0.2 + 0.3 * 0.4256 * (-0.1469) = -0.2188
wb5 := wb5 + η * 1 * δ5 = 0.4 + 0.3 * 1 * (-0.1469) = 0.3559
w14 := w14 + η * x1 * δ4 = -0.2 + 0.3 * 0 * 0.0072 = -0.2
w24 := w24 + η * x2 * δ4 = 0.2 + 0.3 * 0 * 0.0072 = 0.2
wb4 := wb4 + η * 1 * δ4 = -0.3 + 0.3 * 1 * 0.0072 = -0.2978
w13 := w13 + η * x1 * δ3 = 0.3 + 0.3 * 0 * (-0.0145) = 0.3
w23 := w23 + η * x2 * δ3 = -0.1 + 0.3 * 0 * (-0.0145) = -0.1
wb3 := wb3 + η * 1 * δ3 = 0.2 + 0.3 * 1 * (-0.0145) = 0.1957
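
The backward pass can be checked the same way (a sketch that starts from the rounded forward-pass values above):

    # Rounded forward-pass values for pattern 1, as in the notes
    o3, o4, y, t, eta = 0.5498, 0.4256, 0.6306, 0, 0.3

    d5 = y * (1 - y) * (t - y)      # ≈ -0.1469
    d3 = o3 * (1 - o3) * d5 * 0.4   # old w35 = 0.4  → ≈ -0.0145
    d4 = o4 * (1 - o4) * d5 * -0.2  # old w45 = -0.2 → ≈ 0.0072

    w35 = 0.4 + eta * o3 * d5   # ≈ 0.3758
    w45 = -0.2 + eta * o4 * d5  # ≈ -0.2188
    wb5 = 0.4 + eta * 1 * d5    # ≈ 0.3559
    wb4 = -0.3 + eta * 1 * d4   # ≈ -0.2978
    wb3 = 0.2 + eta * 1 * d3    # ≈ 0.1957
    # w13, w23, w14, w24 are unchanged because x1 = x2 = 0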
Epoch: 1
Pattern 2: x1 = 0, x2 = 1, t = 1
• weights:
w13 = 0.3 w23 = -0.1 wb3 = 0.1957
w14 = -0.2 w24 = 0.2 wb4 = -0.2978
w35 = 0.3758 w45 = -0.2188 wb5 = 0.3559

• Forward prop:
o net3 = w13 * x1 + w23 * x2 + wb3 = …
o o3 = 1/(1 + e^-net3) = …
o net4 = w14 * x1 + w24 * x2 + wb4 = …
o o4 = 1/(1 + e^-net4) = …
o net5 = w35 * o3 + w45 * o4 + wb5 = …
o y = 1/(1+e^-net5) = …
• Calculating error:
o Err_p2 = 0.5 * (t - y)^2 = …
Epoch: 1
Pattern 2: x1 = 0, x2 = 1, t = 1
Back Prop:

1) Finding delta
δ5 = y * (1 - y) * (t - y) = …
δ3 = o3 * (1 - o3) * δ5 * w35 = …
δ4 = o4 * (1 - o4) * δ5 * w45 = …

2) Finding new weights


w35 := w35 + η * o3 * δ5 = …
w45 := w45 + η * o4 * δ5 = …
wb5 := wb5 + η * 1 * δ5 = …
w14 := w14 + η * x1 * δ4 = …
w24 := w24 + η * x2 * δ4 = …
wb4 := wb4 + η * 1 * δ4 = …
w13 := w13 + η * x1 * δ3 = …
w23 := w23 + η * x2 * δ3 = …
wb3 := wb3 + η * 1 * δ3 = …
Epoch: 1
Pattern 3: x1 = 1, x2 = 0, t = 1
• weights:
w13 = ? w23 = ? wb3 = ?
w14 = ? w24 = ? wb4 = ?
w35 = ? w45 = ? wb5 = ?

• Forward prop:
o net3 = w13 * x1 + w23 * x2 + wb3 = …
o o3 = 1/(1 + e^-net3) = …
o net4 = w14 * x1 + w24 * x2 + wb4 = …
o o4 = 1/(1 + e^-net4) = …
o net5 = w35 * o3 + w45 * o4 + wb5 = …
o y = 1/(1+e^-net5) = …
• Calculating error:
o Err_p3 = 0.5 * (t - y)^2 = …
Epoch: 1
Pattern 3: x1 = 1, x2 = 0, t = 1
Back Prop:

1) Finding delta
δ5 = y * (1 - y) * (t - y) = …
δ3 = o3 * (1 - o3) * δ5 * w35 = …
δ4 = o4 * (1 - o4) * δ5 * w45 = …

2) Finding new weights


w35 := w35 + η * o3 * δ5 = …
w45 := w45 + η * o4 * δ5 = …
wb5 := wb5 + η * 1 * δ5 = …
w14 := w14 + η * x1 * δ4 = …
w24 := w24 + η * x2 * δ4 = …
wb4 := wb4 + η * 1 * δ4 = …
w13 := w13 + η * x1 * δ3 = …
w23 := w23 + η * x2 * δ3 = …
wb3 := wb3 + η * 1 * δ3 = …
Epoch: 1
Pattern 4: x1 = 1, x2 = 1, t = 0
• weights:
w13 = ? w23 = ? wb3 = ?
w14 = ? w24 = ? wb4 = ?
w35 = ? w45 = ? wb5 = ?

• Forward prop:
o net3 = w13 * x1 + w23 * x2 + wb3 = …
o o3 = 1/(1 + e^-net3) = …
o net4 = w14 * x1 + w24 * x2 + wb4 = …
o o4 = 1/(1 + e^-net4) = …
o net5 = w35 * o3 + w45 * o4 + wb5 = …
o y = 1/(1+e^-net5) = …
• Calculating error:
o Err_p4 = 0.5 * (t - y)^2 = …
Epoch: 1
Pattern 4: x1 = 1, x2 = 1, t = 0
Back Prop:

1) Finding delta
δ5 = y * (1 - y) * (t - y) = …
δ3 = o3 * (1 - o3) * δ5 * w35 = …
δ4 = o4 * (1 - o4) * δ5 * w45 = …

2) Finding new weights


w35 := w35 + η * o3 * δ5 = …
w45 := w45 + η * o4 * δ5 = …
wb5 := wb5 + η * 1 * δ5 = …
w14 := w14 + η * x1 * δ4 = …
w24 := w24 + η * x2 * δ4 = …
wb4 := wb4 + η * 1 * δ4 = …
w13 := w13 + η * x1 * δ3 = …
w23 := w23 + η * x2 * δ3 = …
wb3 := wb3 + η * 1 * δ3 = …
End of Epoch 1

Total error = Err_p1 + Err_p2 + Err_p3 + Err_p4 = 0.1988 + …

If total error <= tolerance (if given): stop training

If epoch number = max number of epochs (if given): stop training

Otherwise, run another epoch starting from the last weights
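
Putting everything together, here is a minimal end-to-end sketch that trains the example network on all four patterns until one of the stopping conditions is met. The tolerance and maximum epoch count below are illustrative values, not taken from the notes:

    import math

    def sig(net):
        return 1.0 / (1.0 + math.exp(-net))

    # XOR patterns, learning rate, and the example's initial weights
    patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    eta = 0.3
    w = {'w13': 0.3, 'w23': -0.1, 'wb3': 0.2,
         'w14': -0.2, 'w24': 0.2, 'wb4': -0.3,
         'w35': 0.4, 'w45': -0.2, 'wb5': 0.4}
    tolerance, max_epochs = 0.01, 100000  # illustrative stopping values

    for epoch in range(1, max_epochs + 1):
        total_error = 0.0
        for (x1, x2), t in patterns:
            # Forward prop
            o3 = sig(w['w13'] * x1 + w['w23'] * x2 + w['wb3'])
            o4 = sig(w['w14'] * x1 + w['w24'] * x2 + w['wb4'])
            y = sig(w['w35'] * o3 + w['w45'] * o4 + w['wb5'])
            total_error += 0.5 * (t - y) ** 2
            # Back prop (deltas use the pre-update weights)
            d5 = y * (1 - y) * (t - y)
            d3 = o3 * (1 - o3) * d5 * w['w35']
            d4 = o4 * (1 - o4) * d5 * w['w45']
            # Weight updates: w_ij := w_ij + eta * o_i * delta_j
            w['w35'] += eta * o3 * d5
            w['w45'] += eta * o4 * d5
            w['wb5'] += eta * 1 * d5
            w['w13'] += eta * x1 * d3
            w['w23'] += eta * x2 * d3
            w['wb3'] += eta * 1 * d3
            w['w14'] += eta * x1 * d4
            w['w24'] += eta * x2 * d4
            w['wb4'] += eta * 1 * d4
        if total_error <= tolerance:  # stopping condition
            break

    print(epoch, total_error)

On the first pattern of the first epoch, this reproduces Err_p1 ≈ 0.1988 and the updated weights from the worked example.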
