
Neural Network

1
Objectives
• Understand neural networks
• What are activation functions?
• How does backpropagation work in a neural network?
• Understand feedforward neural networks

2
Motivation for Neural Network

• Use biology as inspiration for a mathematical model
• A neuron receives signals from previous neurons
• It generates a signal (or not) according to its inputs
• It passes signals on to the next neurons
• By layering many neurons, we can create complex models
3
Overview of Neural Network
• A neural network is a function
• Comprised of:
  • Neurons, which pass input values through functions and output the result
  • Weights, which carry values between neurons
• 3 main types of layers:
  • Input layer
  • Hidden layer(s)
  • Output layer

[Figure: input layer (feature vector), hidden layer 1, hidden layer 2, output layer (label). A 3-layer neural net with 3 input units, 4 hidden units in the 1st and 2nd hidden layers, and 1 output unit.]
Naming conventions: an N-layer neural network has
• N-1 layers of hidden units
• 1 output layer
4
Basic Neuron Visualization

[Figure: data from the previous layer (x1, x2, x3) enters the neuron through weights (w1, w2, w3). The neuron performs some form of computation, transforms the input with an activation function, and outputs the transformed data.]
Mathematical Model of the Neuron in a Neural Network

[Figure: inputs x1, x2, x3 are multiplied by weights w1, w2, w3 and summed with a bias b in the cell body; an activation function f is applied to produce the output.]

Net input:
    z = Σ_i w_i x_i + b = w1 x1 + w2 x2 + w3 x3 + b

Output:
    f(z) = f(Σ_i w_i x_i + b)

Another form of single-node visualization: the McCulloch-Pitts model.
7
In Vector Notation

• Bias, b
• Activation function, f
• Net input, z = w · x + b
• Output to the next layer, a = f(z) = f(w · x + b)


8
Relation to Logistic Regression
When we choose the activation function f to be the “sigmoid” function:

    f(z) = 1 / (1 + e^(−z))

then a neuron is simply a “unit” of logistic regression.

9
Example Neuron Computation

Sigmoid activation function:
    f(Σ_i w_i x_i + b) = f(z) = 1 / (1 + e^(−z))

Inputs, weights, and bias:
    x1 = 0.9,  w1 = 2
    x2 = 0.2,  w2 = 3
    x3 = 0.3,  w3 = −1
    b  = 0.5

Net input:
    z = Σ_i w_i x_i + b = 2(0.9) + 3(0.2) + (−1)(0.3) + 0.5 = 2.6

Neuron output:
    f(z) = 1 / (1 + e^(−2.6)) ≈ 0.93
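To make the arithmetic concrete, here is a minimal Python/NumPy sketch of this single-neuron computation (the function and variable names are mine, not from the slides):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Values from the example above
x = np.array([0.9, 0.2, 0.3])   # inputs from the previous layer
w = np.array([2.0, 3.0, -1.0])  # weights
b = 0.5                         # bias

z = np.dot(w, x) + b            # net input: 2(0.9) + 3(0.2) + (-1)(0.3) + 0.5 = 2.6
a = sigmoid(z)                  # neuron output: 1 / (1 + e^(-2.6)) ≈ 0.93

print(z, a)                     # 2.6  0.93...
```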
Why Neural Network?
• Why not just use a single neuron? Why do we need a larger network?
• A single neuron (like logistic regression) only permits a linear decision
boundary.
• Most real-world problems are considerably more complicated!

13
Feedforward Neural Network

[Figure (slides 14-17): a feedforward network with an input layer (x1, x2, x3), two hidden layers of sigmoid (σ) units, and an output layer (ŷ1, ŷ2, ŷ3). Successive slides highlight the input layer, the hidden layers, and the output layer.]
Feedforward Neural Network

[Figure (slides 18-20): the same network annotated with notation.]
• Weights are represented by matrices W^(1), W^(2), W^(3).
• Net inputs z^(2), z^(3), z^(4) are the sums of weighted inputs, before the activation function.
• Activations a^(1), a^(2), a^(3), a^(4) are the outputs of the neurons, passed on to the next layer.
Matrix representation of computation

    z^(2) = x W^(1) + b^(1),    a^(2) = f(z^(2))

where W^(1) is a 3x4 matrix, z^(2) is a 4-vector, and a^(2) is a 4-vector.

[Figure: the inputs x1, x2, x3 feeding the first hidden layer of σ units, labeled W^(1), z^(2), a^(2).]
21
Matrix representation of computation

For a single training instance (data point):
• Input: vector x (a row vector of length 3)
• Output: vector ŷ (a row vector of length 3)

    z^(2) = x W^(1) + b^(1),       a^(2) = f(z^(2))
    z^(3) = a^(2) W^(2) + b^(2),   a^(3) = f(z^(3))
    z^(4) = a^(3) W^(3) + b^(3),   ŷ = f(z^(4))
22
Matrix representation of computation

• In practice, we do these computations for many data points at the same time, by “stacking” the rows into a matrix.
• But the equations look the same!
• Input: matrix X (an n x 3 matrix, each row a single instance)
• Output: matrix Ŷ (an n x 3 matrix, each row a single prediction)

The next step is to adjust the weights to learn from data.
23
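Here is a minimal NumPy sketch of this stacked (batched) forward pass for the 3-4-4-3 network in the figure; the weight values are random placeholders, not values from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Random placeholder parameters for a 3 -> 4 -> 4 -> 3 network
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 4)), np.zeros(4)
W3, b3 = rng.normal(size=(4, 3)), np.zeros(3)

def forward(X):
    """Forward pass; X has shape (n, 3), one training instance per row."""
    z2 = X @ W1 + b1          # net input of hidden layer 1, shape (n, 4)
    a2 = sigmoid(z2)
    z3 = a2 @ W2 + b2         # net input of hidden layer 2, shape (n, 4)
    a3 = sigmoid(z3)
    z4 = a3 @ W3 + b3         # net input of the output layer, shape (n, 3)
    y_hat = sigmoid(z4)       # predictions, shape (n, 3)
    return y_hat

X = rng.normal(size=(5, 3))   # 5 stacked training instances
print(forward(X).shape)       # (5, 3)
```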
How to Train a Neural Net?
• Put in training inputs, get the output
• Compare the output to the correct answers: look at the loss function J
• Adjust and repeat!
• Backpropagation tells us how to make a single adjustment using calculus.

[Figure: input (feature vector) → network → output (label).]
24
How to Train a Neural Net?
• Using Gradient Descent!
1. Make prediction
2. Calculate Loss
3. Calculate gradient of the loss function w.r.t. parameters
4. Update parameters by taking a step in the opposite direction
5. Iterate

25
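As a runnable illustration of these five steps, here is a sketch that trains a single sigmoid neuron (the logistic-regression unit from earlier) by gradient descent; the toy data, labels, and learning rate are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up toy data: 4 instances with 3 features each, binary labels
X = np.array([[0.9, 0.2, 0.3],
              [0.1, 0.8, 0.5],
              [0.7, 0.6, 0.1],
              [0.2, 0.1, 0.9]])
y = np.array([1.0, 0.0, 1.0, 0.0])

w, b, lr = np.zeros(3), 0.0, 0.5

for step in range(1000):
    y_hat = sigmoid(X @ w + b)                                       # 1. make prediction
    J = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))    # 2. calculate loss (log-loss)
    grad_z = (y_hat - y) / len(y)                                    # 3. gradient of the loss
    grad_w = X.T @ grad_z                                            #    w.r.t. the parameters
    grad_b = grad_z.sum()
    w -= lr * grad_w                                                 # 4. step in the opposite direction
    b -= lr * grad_b                                                 # 5. iterate

print(J, y_hat.round(2))
```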
Feedforward Neural Network

[Figure: 1. Pass in the input (x1, x2, x3). 2. Calculate each layer. 3. Get the output (ŷ1, ŷ2, ŷ3). 4. Evaluate it against the labels (y1, y2, y3).]
26
How to Train a Neural Net?
• How could we change the weights to make our Loss Function
lower?
• Think of the neural net as a function F: X → Y
• F is a complex computation involving many weights
• Given the structure, the weights “define” the function F (and
therefore define our model)
• Loss Function is J(y,F(x))
27
How to Train a Neural Net?
• Get ∂J/∂w for every weight w in the network.
• This tells us what direction to adjust each weight if we want to lower our loss function.
• Make an adjustment and repeat!

28
How to Train a Neural Net?
[Figure: the network with weight matrices W^(1), W^(2), W^(3). We want the gradient of the loss with respect to each weight matrix.]
29
How to Train a Neural Net?
• Get ∂J/∂w for every weight in the network
• Use calculus: the chain rule, etc.
• The functions are chosen to have “nice” derivatives
• Numerical issues need to be considered

30
How to Train a Neural Net?

• Recall that:
• Though they appear complex, the expressions above are easy to compute!
31
Backpropagation
We want the gradient of the loss with respect to each weight matrix:

    ∂J(y_i, ŷ_i)/∂W^(1),   ∂J(y_i, ŷ_i)/∂W^(2),   ∂J(y_i, ŷ_i)/∂W^(3)

[Figure: the network with each gradient shown above the corresponding weight matrix.]

Update the parameters by taking a step in the opposite direction.
32
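To illustrate, here is a NumPy sketch of backpropagation for a tiny one-hidden-layer network; the 3-4-1 architecture, sigmoid activations, squared-error loss, and random data are illustrative choices and are not taken from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Tiny 3 -> 4 -> 1 network with illustrative random data
X = rng.normal(size=(8, 3))
y = rng.integers(0, 2, size=(8, 1)).astype(float)

W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
lr = 0.5

for step in range(2000):
    # Forward pass
    z2 = X @ W1 + b1
    a2 = sigmoid(z2)
    z3 = a2 @ W2 + b2
    y_hat = sigmoid(z3)
    J = 0.5 * np.mean((y_hat - y) ** 2)                  # squared-error loss

    # Backward pass (chain rule, layer by layer)
    d_z3 = (y_hat - y) * y_hat * (1 - y_hat) / len(y)    # dJ/dz3
    d_W2 = a2.T @ d_z3                                   # dJ/dW2
    d_b2 = d_z3.sum(axis=0)
    d_a2 = d_z3 @ W2.T                                   # propagate back to layer 2
    d_z2 = d_a2 * a2 * (1 - a2)                          # dJ/dz2
    d_W1 = X.T @ d_z2                                    # dJ/dW1
    d_b1 = d_z2.sum(axis=0)

    # Gradient-descent update: step in the opposite direction
    W2 -= lr * d_W2; b2 -= lr * d_b2
    W1 -= lr * d_W1; b1 -= lr * d_b1

print("final loss:", J)
```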
Activation Functions – Sigmoid Function

The “sigmoid” function:  f(z) = 1 / (1 + e^(−z))

• The curve looks like an S-shape
• Sigmoid outputs a value between 0 and 1
• Used for models where we have to predict a probability as the output (0 to 1)
33
Activation Functions – Softmax Function

    softmax(z)_j = e^(z_j) / Σ_{k=1}^{K} e^(z_k),   for j = 1, …, K

• For multiclass classification
• Takes a vector of real numbers as input, and normalizes it into a probability distribution proportional to the exponentials of the input numbers
• After applying softmax, each element will be in the range 0 to 1, and the elements will add up to 1
• The output can be interpreted as a probability distribution
34
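A minimal NumPy implementation of this formula (the max-subtraction is a standard numerical-stability trick, not something discussed on the slide):

```python
import numpy as np

def softmax(z):
    """Normalize a vector of real numbers into a probability distribution."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())        # subtract the max for numerical stability; result unchanged
    return e / e.sum()

p = softmax([2.0, 1.0, 0.1])
print(p, p.sum())                  # approx. [0.659 0.242 0.099], sums to 1.0
```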
Activation Functions – Hyperbolic Tangent (tanh) Function

• Generally better than the sigmoid function
• Negative inputs are mapped strongly negative, and zero inputs are mapped near zero in the tanh graph
• Mainly used for classification between two classes
35
Activation Functions – Rectified Linear Unit (ReLU)

• The most widely used activation function right now
• Used in almost all convolutional neural networks and deep learning models
• Half rectified (from the bottom): f(z) is zero when z is less than zero, and f(z) is equal to z when z is greater than or equal to zero
• Range: [0, infinity)
36
Activation Functions – “Leaky” Rectified Linear Unit (LReLU or LReL)

• The leak helps to increase the range of the ReLU function; the slope α for negative inputs is typically 0.01 or so
• The range of the Leaky ReLU is (−infinity, infinity)
37
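Putting the last few slides together, here is a short NumPy sketch of these activation functions (using α = 0.01 for the leaky variant, as on the slide):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))       # S-shaped, output in (0, 1)

def tanh(z):
    return np.tanh(z)                      # output in (-1, 1), zero-centered

def relu(z):
    return np.maximum(0.0, z)              # 0 for z < 0, z otherwise; range [0, inf)

def leaky_relu(z, alpha=0.01):
    return np.where(z < 0, alpha * z, z)   # small slope alpha for negative inputs

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (sigmoid, tanh, relu, leaky_relu):
    print(f.__name__, f(z).round(3))
```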
Choice of Activation Functions
• Sigmoid functions and their combinations generally work better in the case of classifiers
• Sigmoid and tanh functions are sometimes avoided due to the vanishing gradient problem
• ReLU is a general-purpose activation function and is used in most cases these days
• If we encounter dead neurons in our network, the leaky ReLU function is the best choice
• Keep in mind that the ReLU function should only be used in the hidden layers
• As a rule of thumb, begin with the ReLU function and move to other activation functions if ReLU doesn’t provide optimal results
38
Activation Functions
• Each neural network had three hidden layers with three units in each one.
• The only difference was the activation function.
• Learning rate: 0.03; regularization: L2.
39
Dropout overview
• Neural networks can represent extremely complex data
• A very large number of parameters allows NNs to memorize a dataset
• We want to regularize (smooth) their solution:
  • Prevent single neurons from dominating
  • Require other neurons to be more flexible
• Dropout: randomly zero the output of neurons during training, so the other neurons have to adapt

40
Dropout model

41
Dropout layer

[Figure: the dropout layer acts as a set of gates on the neuron outputs.]
42
Knocking out and rescaling neurons

During training, we randomly drop each neuron with probability p

43
Knocking out and rescaling neurons

When running the model, we scale the outputs of the neuron by (1 − p)

44
Knocking out and rescaling neurons

This ensures that the expected value of each neuron's output stays the same at run time as it was during training

45
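Here is a short NumPy sketch of this scheme as described on the slides (drop with probability p during training, scale by 1 − p at run time); the activations and p value are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(a, p, training):
    """Drop each neuron's output with probability p during training;
    scale outputs by (1 - p) at run time."""
    if training:
        mask = rng.random(a.shape) >= p     # keep each unit with probability 1 - p
        return a * mask                     # knocked-out units output 0
    return a * (1.0 - p)                    # rescale so the expected value matches

a = np.array([0.2, 0.9, 0.5, 0.7])          # activations of a hidden layer (made up)
print(dropout(a, p=0.5, training=True))     # some entries zeroed at random
print(dropout(a, p=0.5, training=False))    # all entries scaled by 0.5
```

Many libraries instead use the "inverted dropout" variant, which divides the kept activations by (1 − p) during training so that nothing needs to be rescaled at run time.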
Early Stopping
• Another, more heuristic approach to regularization is early stopping.
• This refers to choosing some rule for when to stop training.
• Example:
  • Check the validation log-loss every 10 epochs.
  • If it is higher than it was last time, stop and use the previous model (i.e., from 10 epochs earlier).

46
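A small sketch of that example rule in Python; the list of validation losses is made up, and the checkpoint-every-10-epochs bookkeeping is an assumption for illustration:

```python
def early_stopping_index(val_losses):
    """Stop as soon as the validation log-loss rises, and keep the previous model."""
    for i in range(1, len(val_losses)):
        if val_losses[i] > val_losses[i - 1]:
            return i - 1        # index of the checkpoint to keep (10 epochs earlier)
    return len(val_losses) - 1  # never rose: keep the last checkpoint

# Illustrative made-up validation log-losses, recorded every 10 epochs
losses = [0.69, 0.55, 0.48, 0.44, 0.46, 0.50]
best = early_stopping_index(losses)
print(f"stop after checkpoint {best} (epoch {10 * (best + 1)}), loss {losses[best]}")
```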
Concept of a “pseudo-ensemble”
• A collection of child models spawned from a parent model by perturbing it according to some noise process
• In a deep neural network, dropout trains a pseudo-ensemble of child subnetworks generated by randomly masking nodes in the parent network
47
Model 1, Model 2, Model 3, etc.

[Figures (slides 48-50): examples of child subnetworks produced by different random dropout masks of the parent network.]
List of all the hyperparameters you can tweak in a basic Multilayer Perceptron (MLP)

1. The number of hidden layers
2. The number of neurons in each hidden layer
3. The activation function used in each hidden layer and in the output layer. Generally, the ReLU activation function (or one of its variants) is a good default for the hidden layers.
4. For the output layer, in general you will want the logistic activation function for binary classification, the softmax activation function for multiclass classification, or no activation function for regression.
5. If the MLP overfits the training data, you can try …
References:
• https://levelup.gitconnected.com/vanishing-and-exploding-gradients-ae7fb88f3b66
• https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6
• https://medium.com/arteos-ai/the-differences-between-sigmoid-and-softmax-activation-function-12adee8cf322
52
