Neural network intro lecture 4

The document provides an overview of deep learning and neural networks, detailing their components such as layers, input/output, loss functions, and optimizers. It explains the processes of feedforward, backpropagation, and the importance of parameters and hyperparameters in training models. Additionally, it discusses performance assessment, including concepts like overfitting and dropout techniques to improve model accuracy.

Neural Network

What is Deep Learning

[Figure: a dataset of labelled images, e.g. images labelled Cat and Dog]

It is called deep learning when we use a neural network as the model in supervised learning.


What is Deep Learning

Neural network

We can have millions of neurons to analyse and learn the patterns (features) of a given dataset and memorize those patterns.
NN Basics and Concepts
Neural Network components

A neural network is composed of 4 main components:

1) Layers
2) Input and output
3) Loss function
4) Optimizer
Layer
Layers

A node is also known as a neuron (e.g. a layer with 4 neurons).

A layer in which every node connects to every node of the next layer is called a fully connected (dense) layer.


Input & Output

Input layer: features
Output layer: classes/labels
Input & Output
[Figure: an m×n image (pixels up to Xmn) fed into the network, with output classes Class 1 and Class 2]

The whole image is input to the first layer at once.
E.g. if you have 100 images, each image will be inserted into the NN one by one.
Input & Output

[Figure: iris dataset example with features x1 … x4 and output classes Setosa and others]

The four features of the first row are input to the first (blue) layer at once.
Loss Function

4 main components

1) Neuron
2) Weights and biases
3) Activation function
4) Feedforward
Neuron

An artificial neuron is referred to as a perceptron.

[Figure: a single neuron with weight w1, bias b1, summation S and activation F]
Weight & Bias

Input layer → Layer 1

x : input
w : weight
b : bias
F(s) : output

s = w*x + b
Activation function

Input layer → Layer 1

x : input
w : weight
b : bias
F(s) : output

s = w*x + b
F(s) = 1 / (1 + e^(-s))
Feed forward

Input layer → Layer 1

x : input, w : weight, b : bias, F(s) : output

x = 1
w = 0.3
b = -0.3

s = w*x + b = 0.3*1 + (-0.3) = 0
F(s) = 1 / (1 + e^(-s)) = ?
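As a quick check of this feedforward step, here is a minimal Python sketch (the values follow the slide; using the sigmoid from the activation-function slide, s = 0 gives F(s) = 0.5):

    import math

    def sigmoid(s):
        # Activation function F(s) = 1 / (1 + e^(-s))
        return 1.0 / (1.0 + math.exp(-s))

    x, w, b = 1.0, 0.3, -0.3   # values from the slide
    s = w * x + b              # s = w*x + b = 0
    print(s, sigmoid(s))       # 0.0 0.5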
Loss
Forward direction to reach F(s)

x = 1
w = 0.3
b = -0.3
s = w*x + b
F(s) : output
target value : label

Loss = target value – F(s)

Loss is the difference between the target value and F(s). It is also called the error.
MSE (mean squared error) is one common way to measure it.
Loss
Forward direction to reach F(s)

w = 0.3, b = -0.3, s = w*x + b
Let x = 1, 2, 3, 4

w*x + b             s      target value   loss
0.3*1 + (-0.3)      0       0             0
0.3*2 + (-0.3)      0.3    -1             1.68
0.3*3 + (-0.3)      0.6    -2             6.75
0.3*4 + (-0.3)      0.9    -3             15.21

(The loss values here appear to be the squared difference between the target value and s, consistent with MSE.)
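A small sketch that reproduces this table, assuming (as the numbers suggest) that the per-sample loss is the squared difference between the target value and s; the printed losses match the slide's table up to small rounding differences:

    # Reproduce the loss table for w = 0.3, b = -0.3
    w, b = 0.3, -0.3
    xs      = [1, 2, 3, 4]
    targets = [0, -1, -2, -3]

    for x, t in zip(xs, targets):
        s = w * x + b            # weighted sum
        loss = (t - s) ** 2      # squared-error assumption
        print(f"x={x}  s={s:.1f}  target={t}  loss={loss:.2f}")
    # prints losses 0.00, 1.69, 6.76, 15.21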
Optimizer

4 main components

1) Backpropagation
2) Optimizer
3) Learning rate
4) Epoch & Accuracy
Back propagation
Backward direction

We go backward in an effort to minimize the loss.

[Figure: the single neuron (x, w, b, s, F) with the loss computed from F(s) and the target value (label)]

Loss = target value – F(s)

Optimizer (Reducing the loss)
Backward direction (change in values)

We go backward in an effort to minimize the loss using an optimizer. The optimizer is a function that changes w and b so that the loss is zero.

What are w and b so that the loss is zero?  Answer: w = -1, b = 1

s = w*x + b
w*x + b             s      target value   loss
-1*1 + 1            0       0             0
-1*2 + 1           -1      -1             0
-1*3 + 1           -2      -2             0
-1*4 + 1           -3      -3             0

Loss = target value – F(s) = 0
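As a toy illustration of what the optimizer does (not the lecture's exact algorithm), the sketch below repeatedly nudges w and b in the direction that lowers a squared loss on the same four samples; it converges to the w = -1, b = 1 solution shown in the table:

    # Toy gradient-descent optimizer for s = w*x + b on the slide's data
    xs      = [1, 2, 3, 4]
    targets = [0, -1, -2, -3]
    w, b, lr = 0.3, -0.3, 0.05       # start from the earlier values; lr = learning rate

    for _ in range(2000):
        dw = db = 0.0
        for x, t in zip(xs, targets):
            err = (w * x + b) - t    # gradient of 0.5*(s - t)^2 with respect to s
            dw += err * x
            db += err
        w -= lr * dw / len(xs)       # move w toward lower loss
        b -= lr * db / len(xs)       # move b toward lower loss

    print(round(w, 3), round(b, 3))  # approximately -1.0 and 1.0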
Learning rate
Backward direction (change in values)

The optimizer updates the weight and bias toward zero loss. The learning rate is the rate at which the optimizer changes the weights and biases.

s = w*x + b
Loss = target value – F(s) = 0
Epoch
Forward direction to reach the target value, then backward direction

One epoch consists of one forward pass and then one backward pass, and the optimizer is executed once for each sample in the dataset.

[Figure: a dataset of 6 inputs (x1 … x6) fed into the neuron, with the loss, target value (label) and optimizer in the loop]

Fetching all 6 inputs into the neuron in one cycle is one epoch.
You need to run several epochs until the loss is zero.
Now let's add more layers (multilayer)
Adding more neurons

[Figure: two inputs x1, x2 connected to two neurons, with weights w1 … w4, biases b1, b2, sums S1, S2 and activations F(S1), F(S2)]

w1, w2, …, w4 are weights (every path has a weight).
Every input must have a path to every neuron (node).
b1 and b2 are biases (every node has a bias).
Every node has an activation function F(s):
F(S1) = 1 / (1 + e^(-S1))
F(S2) = 1 / (1 + e^(-S2))
Adding more layers
Input → layer 1 → layer 2 → output

[Figure: a fully connected network with 2 inputs (x1, x2), 2 hidden layers and an output layer, with weights w1 … w16 and biases b1 … b5]

The model has 2 inputs and 2 hidden layers.
Layer 1 has 2 nodes, layer 2 has 3 nodes, and the output layer has two labels (target value 1 and target value 2).

Every input has a path to every node.
Every node has a path to every node of the next layer.
Every path has a weight, and the values are different.
Every node has a different bias (b), except the output nodes.
Every node has the same activation function (F).
Every node has a different output F(s).
Every node has to do the summation (s).
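A minimal sketch of the forward pass through this 2 → 2 → 3 → 2 model (the weight and bias values here are random, purely for illustration; the slide's convention that the output nodes have no bias is kept):

    import math, random

    def sigmoid(s):
        return 1.0 / (1.0 + math.exp(-s))

    def dense(inputs, weights, biases):
        # One fully connected layer: every input has a weighted path to every node,
        # each node sums its inputs, adds its bias and applies F(s)
        return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
                for ws, b in zip(weights, biases)]

    def random_layer(n_in, n_out, use_bias=True):
        ws = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        bs = [random.uniform(-1, 1) if use_bias else 0.0 for _ in range(n_out)]
        return ws, bs

    random.seed(0)
    w1, b1 = random_layer(2, 2)                  # layer 1: 2 nodes
    w2, b2 = random_layer(2, 3)                  # layer 2: 3 nodes
    w3, b3 = random_layer(3, 2, use_bias=False)  # output layer: 2 nodes, no bias

    x = [1.0, 0.5]                               # two inputs x1, x2
    out = dense(dense(dense(x, w1, b1), w2, b2), w3, b3)
    print(out)                                   # two outputs to compare with target values 1 and 2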
F(s) is the activation function

The final value F(s) that comes out of a node is determined by the activation function. In this example we use the sigmoid function as the activation function.

Sigmoid function: F(s) = 1 / (1 + e^(-s))

Sigmoid vs. Tanh vs. ReLU vs. Leaky ReLU
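For comparison, minimal sketches of these four activation functions (the 0.01 negative slope for Leaky ReLU is a commonly used default assumed here, not taken from the slide):

    import math

    def sigmoid(s):
        return 1.0 / (1.0 + math.exp(-s))    # squashes s into (0, 1)

    def tanh(s):
        return math.tanh(s)                  # squashes s into (-1, 1)

    def relu(s):
        return max(0.0, s)                   # keeps positives, zeroes out negatives

    def leaky_relu(s, slope=0.01):
        return s if s > 0 else slope * s     # small negative slope instead of zero

    for s in (-2.0, 0.0, 2.0):
        print(s, sigmoid(s), tanh(s), relu(s), leaky_relu(s))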
Feed forward
From x, forward direction to reach F(s)

[Figure: the dataset inputs x1, x2 flow through layer 1 and layer 2 to produce the two outputs F1(S) and F2(S)]
Loss
Forward direction to reach F(s)

[Figure: the same network; each output F is compared with its target value (label)]

Loss = target value – F
Every path has a loss; we take the average of the losses.
Back propagation
From F(s), backward direction to reach x

We go backward in an effort to minimize the loss by changing the values of the weights and biases.

[Figure: the same network, with weights w1 … w16 and biases b1 … b5, traversed backward from the outputs (target value 1 and 2) toward the inputs]
Optimizer
Backward direction

We go backward in an effort to minimize the loss. The optimizer is a function that changes the weights and biases so that the loss is zero.

[Figure: the same network, with weights w1 … w16 and biases b1 … b5]
Optimizer
Backward direction (change in values)

[Figure: a smaller network with input x1, weights w1 … w3 and biases b1 … b3, whose output is compared with the target value (label) to give the loss]

The optimizer works at every path.
Objective: Loss = 0
Types of optimizer

GradientDescentOptimizer

AdadeltaOptimizer

MomentumOptimizer

AdamOptimizer

FtrlOptimizer

RMSPropOptimizer
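These names follow the TensorFlow 1.x tf.train API. As a rough, hedged sketch of how the equivalents are chosen in TensorFlow 2.x / Keras (assuming TensorFlow 2.x is installed; the counterparts live under tf.keras.optimizers):

    import tensorflow as tf

    # Rough Keras counterparts of the optimizers listed above
    optimizers = {
        "gradient_descent": tf.keras.optimizers.SGD(learning_rate=0.01),
        "momentum":         tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
        "adadelta":         tf.keras.optimizers.Adadelta(),
        "adam":             tf.keras.optimizers.Adam(),
        "ftrl":             tf.keras.optimizers.Ftrl(),
        "rmsprop":          tf.keras.optimizers.RMSprop(),
    }
    # An optimizer is then attached to a model, e.g.:
    # model.compile(optimizer=optimizers["adam"], loss="mse")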
Learning rate
Backward direction

The optimizer updates the weights and biases toward zero loss, changing the values according to the learning rate.

[Figure: the small network with input x1, weights w1 … w3 and biases b1 … b3]

Loss = target value – F
Learning rate: the rule that the optimizer has to follow in changing w and b.
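A one-weight sketch of the role the learning rate plays in the update (the gradient value here is just an example number, not from the slide):

    # The optimizer moves a weight against its gradient;
    # the learning rate controls how big that step is.
    def update(w, grad, learning_rate):
        return w - learning_rate * grad

    w, grad = 0.3, 1.3                   # example weight and gradient of the loss w.r.t. w
    for lr in (0.01, 0.1, 1.0):
        print(lr, update(w, grad, lr))   # a bigger learning rate gives a bigger change in w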
Epoch
Forward direction to reach the target value, then backward direction

One epoch consists of one forward pass and then one backward pass, and the optimizer is executed once for all samples in the dataset.

[Figure: the dataset (x1, x2) flows through layer 1, layer 2 and the output layer; the loss against the target value (label) drives the backward pass]

Loss = target value – F
Epoch, batch & iterations
Epoch

Batch

Iteration
Epoch, batch & iterations

This approach is called "mini-batch gradient descent".


Epoch, batch & iterations

Dataset is 100 samples

Epoch = 40
Num_of_batch (iterations) = 5
Batch_size = 20

for i = 1 to Epoch:
    for j = 1 to Num_of_batch:
        compute the loss and optimize over one batch of Batch_size samples
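A minimal Python sketch of this loop (train_step is a hypothetical placeholder standing in for "compute the loss and optimize"):

    # Mini-batch loop: 100 samples, 40 epochs, 5 batches of 20 samples
    dataset    = list(range(100))                 # stand-in for 100 samples
    epochs     = 40
    batch_size = 20
    num_of_batch = len(dataset) // batch_size     # 5 iterations per epoch

    def train_step(batch):
        # Hypothetical: compute the loss on this batch and let the
        # optimizer update the weights and biases.
        pass

    for i in range(epochs):
        for j in range(num_of_batch):
            train_step(dataset[j * batch_size:(j + 1) * batch_size])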
Epoch, batch & iterations

What is happening during an epoch?

Dataset is 1 sample

Epoch = 4
Num_of_batch = 1
Batch_size = 1

for i = 1 to Epoch:
    for j = 1 to Num_of_batch:
        compute the loss and optimize over one batch of Batch_size samples

After 4 epochs the optimizer achieves 0 error.
Parameter and hyperparameter

Parameter: any value that is changed by the computer.
These are the weights and biases, which are automatically updated by the optimizer.

Hyperparameter: any value that is changed by a human.
These are the learning rate, epochs, batch size, number of layers, number of nodes and dropout rate.
Tutorial
How many parameters in this model
[Figure: the two-hidden-layer model from the earlier slides, with inputs x1, x2, weights w1 … w16 and biases b1 … b5]

How many layers?
How many nodes?
How many inputs?
How many activation functions?
How many classes?
How many weights?
How many biases?
How many optimizers?
How many parameters?
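One way to check such counts is a small sketch that tallies weights and biases for any fully connected model of this shape (it follows the slide's convention that the output nodes have no bias; this model is 2 inputs → 2 nodes → 3 nodes → 2 outputs):

    # Count weights and biases for a fully connected network
    def count_parameters(layer_sizes, output_bias=False):
        # one weight per path between consecutive layers
        weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
        biases = sum(layer_sizes[1:-1])           # one bias per hidden node
        if output_bias:
            biases += layer_sizes[-1]             # output biases, if the model uses them
        return weights, biases

    w, b = count_parameters([2, 2, 3, 2])
    print(w, b, w + b)                            # weights, biases, total parameters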
Assessing performance
Assessing the performance
Train data (80%)   |   Test data (20%)

Dataset is 100 samples
Epoch = 40
Num_of_batch = 5
Batch_size = 20

Validation phase
Each single epoch → we run the train data.
At the end of each single epoch → we run the test data.

for i = 1 to Epoch:
    for j = 1 to Num_of_batch:
        compute the loss and optimize over one batch of Batch_size samples

Accuracy is the percentage of right predictions over the number of samples in the test data. It is used during validation (at the end of every epoch) or during the testing phase (at the end of all epochs).

Loss is measured during each single epoch (during training).
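A sketch of this split and loop (80/20 split; predict and train_step are hypothetical placeholders for the model's forward pass and the optimizer step):

    # 80/20 split, mini-batch training, validation at the end of every epoch
    dataset = [(i, i % 2) for i in range(100)]          # 100 (sample, label) stand-ins
    train_data, test_data = dataset[:80], dataset[80:]
    epochs, batch_size = 40, 20

    def predict(x):         # hypothetical forward pass through the network
        return x % 2

    def train_step(batch):  # hypothetical: compute the loss and run the optimizer
        pass

    for epoch in range(epochs):
        for j in range(0, len(train_data), batch_size):
            train_step(train_data[j:j + batch_size])
        # validation: accuracy = right predictions / number of test samples
        correct = sum(predict(x) == y for x, y in test_data)
        print(epoch, correct / len(test_data))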


Assessing the performance
Train data (80%)   |   Test data (20%)

(Same setup as the previous slide: 100 samples, Epoch = 40, Num_of_batch = 5, Batch_size = 20; each epoch trains on the train data and is validated on the test data.)

Overfitting is when the loss in the validation phase is much bigger than in the training phase.

Underfitting is simply when the loss is much bigger during the training phase.
Assessing the performance
Overfitting is when training results are very good but the validation/testing results are a bit worse.

Dropout

Randomly pick nodes and disable them.

We give every node a probability of being alive.

E.g. say the probability is 0.5. Then every node will be 50% alive or 50% dead.

Dropout is always related to overcoming overfitting.
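A minimal sketch of the idea: during training, a random keep/drop decision is applied to each node's output (the 0.5 probability follows the slide's example):

    import random

    def dropout(outputs, keep_prob=0.5):
        # Each node's output is kept with probability keep_prob, otherwise zeroed (dropped)
        return [o if random.random() < keep_prob else 0.0 for o in outputs]

    layer_outputs = [0.8, 0.1, 0.6, 0.3]
    print(dropout(layer_outputs))   # on average, half the nodes are disabled on each pass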


Thank you
