
Neural Networks and Deep Learning
Perceptrons, Feed-Forward Nets, Feature-Space Partitioning, Function Approximation and Classification
PERCEPTRONS
A better model

• Frank Rosenblatt
– Psychologist, Logician
– Inventor of the solution to everything, aka the Perceptron (1958)

Rosenblatt’s perceptron

• Original perceptron model


– Groups of sensors (S) on retina combine onto cells in association
area A1
– Groups of A1 cells combine into Association cells A2
– Signals from A2 cells combine into response cells R
– All connections may be excitatory or inhibitory
Rosenblatt’s perceptron

• Even included feedback between A and R cells


– Ensures mutually exclusive outputs

Rosenblatt’s perceptron

• Simplified perceptron model


– Association units combine sensory input with fixed
weights
– Response units combine associative units with
learnable weights
Perceptron: Simplified model

• A number of inputs combine linearly
  – Threshold logic: fire if the combined input exceeds a threshold T, i.e. output 1 if Σi wi·xi ≥ T, else 0
The Universal Model
• Originally assumed it could represent any Boolean circuit and perform any logic
– “the embryo of an electronic computer that [the Navy] expects
will be able to walk, talk, see, write, reproduce itself and be
conscious of its existence,” New York Times (8 July) 1958
– “Frankenstein Monster Designed by Navy That Thinks,” Tulsa,
Oklahoma Times 1958
Linearity or affinity
Linear vs. Affine?

[Diagram: a unit over inputs 1 … N computing an affine combination of its inputs]

  z = Σ_{i=1..N} wi·xi + b

• A threshold unit
  – "Fires" if the affine function of the inputs is positive
• The bias is the negative of the threshold T in the previous slide
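
A minimal sketch of such a threshold unit in Python (NumPy is assumed; the example weights and bias are illustrative, not taken from the slides):

import numpy as np

def threshold_unit(x, w, b):
    """Affine threshold unit: outputs 1 ("fires") if w.x + b > 0, else 0.
    The bias b plays the role of -T in the threshold formulation."""
    z = np.dot(w, x) + b
    return 1 if z > 0 else 0

# Example: two inputs with threshold T = 1.5, i.e. bias b = -1.5
print(threshold_unit(np.array([1, 1]), np.array([1.0, 1.0]), -1.5))  # 1 (fires)
print(threshold_unit(np.array([1, 0]), np.array([1.0, 1.0]), -1.5))  # 0
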
The standard paradigm for statistical pattern recognition
The standard Perceptron architecture

1. Convert the raw input vector into a vector of feature activations. Use hand-written programs based on common sense to define the features.
2. Learn how to weight each of the feature activations to get a single scalar quantity.
3. If this quantity is above some threshold, decide that the input vector is a positive example of the target class.

[Architecture diagram, top to bottom: decision unit, learned weights, feature units, hand-coded weights or programs, input units]
How to learn biases using the same rule as we use for learning weights

• A threshold is equivalent to having a negative bias.
• We can avoid having to figure out a separate learning rule for the bias by using a trick:
  – A bias is exactly equivalent to a weight on an extra input line that always has an activity of 1.
  – We can now learn a bias as if it were a weight.

[Diagram: a unit with weights b, w1, w2 on inputs 1, x1, x2]
Also provided a learning algorithm

Sequential learning:
  w ← w + η·(d(x) - y(x))·x
where d(x) is the desired output in response to input x, y(x) is the actual output in response to x, and η is a learning rate.

• Boolean tasks
• Update the weights whenever the perceptron output is wrong
  – Update the weight by the product of the input and the error between the desired and actual outputs
• Proved convergence for linearly separable classes
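
A minimal sketch of this learning rule in Python, using the bias-as-extra-input trick from the earlier slide (NumPy is assumed; the OR training set and the learning rate are illustrative):

import numpy as np

def train_perceptron(X, d, eta=1.0, epochs=100):
    """Sequential perceptron learning with the bias as an extra weight.

    X: (n_samples, n_features) inputs; d: desired outputs in {0, 1}.
    Whenever the output is wrong, apply w <- w + eta * (d - y) * x.
    """
    X = np.hstack([X, np.ones((len(X), 1))])       # constant input 1 carries the bias
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for x, target in zip(X, d):
            y = 1 if np.dot(w, x) > 0 else 0       # threshold unit
            if y != target:
                w += eta * (target - y) * x        # update only on errors
                mistakes += 1
        if mistakes == 0:                          # converged (guaranteed if linearly separable)
            break
    return w

# Example: learn the (linearly separable) OR function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 1, 1, 1])
w = train_perceptron(X, d)
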
Perceptron
[Diagrams: single perceptrons implementing Boolean gates, e.g. X AND Y (weights 1, 1; threshold 2), NOT X (weight -1; threshold 0), X OR Y (weights 1, 1; threshold 1)]

Values shown on edges are weights; numbers in the circles are thresholds

• Easily shown to mimic any Boolean gate


• But…
Individual units

No solution for XOR!

[Diagram: a single unit over X and Y with unknown weights and threshold; no assignment of values implements XOR]
A single neuron is not enough

• Individual elements are weak computational elements


– Marvin Minsky and Seymour Papert, 1969, Perceptrons:
An Introduction to Computational Geometry

• Networked elements are required


Multi-layer Perceptron!
[Diagram: an MLP computing X XOR Y; two hidden threshold units feed an output unit, edge values are weights, circle values are thresholds]

Hidden Layer

• XOR
– The first layer is a “hidden” layer
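
A minimal sketch of an XOR MLP built from threshold units (the specific weights and thresholds below are one standard choice, not necessarily those in the diagram):

def step(z):
    """Heaviside threshold: fire (1) if z >= 0, else 0."""
    return 1 if z >= 0 else 0

def xor_mlp(x, y):
    """XOR via a one-hidden-layer MLP of threshold units.

    h1 fires for (x AND NOT y), h2 for (NOT x AND y); the output unit ORs them.
    """
    h1 = step(1 * x - 1 * y - 1)    # x AND (NOT y): weights (1, -1), threshold 1
    h2 = step(-1 * x + 1 * y - 1)   # (NOT x) AND y: weights (-1, 1), threshold 1
    return step(h1 + h2 - 1)        # OR of the hidden units: weights (1, 1), threshold 1

for x in (0, 1):
    for y in (0, 1):
        print(x, y, xor_mlp(x, y))  # prints the XOR truth table
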
A more generic model

[Diagram: a deeper MLP of threshold units over inputs X, Y, Z, A; edge values are weights, circle values are thresholds]

• A “multi-layer” perceptron
• Can compose arbitrarily complicated Boolean functions!
– In cognitive terms: Can compute arbitrary Boolean functions over
sensory input
– More on this in the next class
Neuron model: Logistic unit

Sigmoid (logistic) activation function: σ(z) = 1 / (1 + e^(-z)).

Credit: Andrew Ng
Neural Network

Layer 1 Layer 2 Layer 3


Other network architectures

Layer 1 Layer 2 Layer 3 Layer 4


Deep Structures
• In any directed graph with input source nodes and
output sink nodes, “depth” is the length of the longest
path from a source to a sink
– A “source” node in a directed graph is a node that has only
outgoing edges
– A “sink” node is a node that has only incoming edges

• Left: Depth = 2. Right: Depth = 3


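A small sketch of this definition in Python (the adjacency-dict graph encoding and the example network are my own, for illustration):

from functools import lru_cache

def depth(graph):
    """Depth of a DAG = length (in edges) of the longest source-to-sink path.

    graph: dict mapping each node to a list of its successors.
    """
    @lru_cache(maxsize=None)
    def longest_from(node):
        succs = graph.get(node, [])
        if not succs:                       # sink: no outgoing edges
            return 0
        return 1 + max(longest_from(s) for s in succs)

    # sources: nodes that never appear as anyone's successor
    sources = set(graph) - {v for succs in graph.values() for v in succs}
    return max(longest_from(s) for s in sources)

# Example: x1, x2 -> h -> y has depth 2
g = {"x1": ["h"], "x2": ["h"], "h": ["y"], "y": []}
print(depth(g))  # 2
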
FEATURE-SPACE PARTITIONING
TOY EXAMPLE WITH HUMAN-LEARNING OF WEIGHTS

Non-linear classification example: XOR/XNOR
x1, x2 are binary (0 or 1).

[Plots: the XOR/XNOR class labels in the (x1, x2) plane; the two classes are not separable by a single line]
Simple example: AND
  h(x) = σ(-30 + 20·x1 + 20·x2)
  x1 x2 → h(x): (0,0) ≈ 0, (0,1) ≈ 0, (1,0) ≈ 0, (1,1) ≈ 1

Example: OR function
  h(x) = σ(-10 + 20·x1 + 20·x2)
  x1 x2 → h(x): (0,0) ≈ 0, (0,1) ≈ 1, (1,0) ≈ 1, (1,1) ≈ 1

Negation (NOT x1):
  h(x) = σ(10 - 20·x1)
  x1 → h(x): 0 ≈ 1, 1 ≈ 0

Putting it together (XNOR):
  a1 = σ(-30 + 20·x1 + 20·x2)   (x1 AND x2)
  a2 = σ( 10 - 20·x1 - 20·x2)   ((NOT x1) AND (NOT x2))
  h  = σ(-10 + 20·a1 + 20·a2)   (a1 OR a2)
  x1 x2 → h: (0,0) → 1, (0,1) → 0, (1,0) → 0, (1,1) → 1
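
A small Python sketch of this composition (the gate weights are those reconstructed above; the helper names are mine, not from the slides):

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_and(x1, x2):
    return sigmoid(-30 + 20 * x1 + 20 * x2)       # ~1 only when both inputs are 1

def logistic_nor_like(x1, x2):
    return sigmoid(10 - 20 * x1 - 20 * x2)        # ~1 only when both inputs are 0

def logistic_or(a1, a2):
    return sigmoid(-10 + 20 * a1 + 20 * a2)       # ~1 when either input is ~1

def xnor(x1, x2):
    """Two-layer network: OR of (x1 AND x2) and ((NOT x1) AND (NOT x2))."""
    return logistic_or(logistic_and(x1, x2), logistic_nor_like(x1, x2))

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xnor(x1, x2)))        # 1, 0, 0, 1: the XNOR truth table
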
Capabilities of Threshold Neurons
•What do we do if we need a more complex function?
•Just like Threshold Logic Units, we can also combine
multiple artificial neurons to form networks with increased
capabilities.
•For example, we can build a two-layer network with any
number of neurons in the first layer giving input to a single
neuron in the second layer.
•The neuron in the second layer could, for example,
implement an AND function.
Capabilities of Threshold Neurons
[Diagram: a two-layer network; several threshold neurons, each over inputs x1 and x2, feed a single output neuron]

• What kind of function can such a network realize?

Capabilities of Threshold Neurons
• Assume that the dotted lines in the diagram represent the input-dividing lines implemented by the neurons in the first layer:

[Plot: lines partitioning the input plane; the axes are the 1st and 2nd input components]

Then, for example, the second-layer neuron could output 1 if the input is within a polygon, and 0 otherwise.
Capabilities of Threshold Neurons
•However, we still may want to implement functions that
are more complex than that.
•An obvious idea is to extend our network even further.
•Let us build a network that has three layers, with
arbitrary numbers of neurons in the first and second layers
and one neuron in the third layer.
•The first and second layers are completely connected,
that is, each neuron in the first layer sends its output to
every neuron in the second layer.
Capabilities of Threshold Neurons
[Diagram: a three-layer network; first-layer threshold neurons over x1 and x2 feed second-layer neurons oi, which feed a single third-layer neuron]

• What type of function can a three-layer network realize?

Capabilities of Threshold Neurons
• Assume that the polygons in the diagram indicate the input regions for which each of the second-layer neurons yields output 1:

[Plot: several polygons in the input plane; the axes are the 1st and 2nd input components]

Then, for example, the third-layer neuron could output 1 if the input is within any of the polygons, and 0 otherwise.
Capabilities of Threshold Neurons
•The more neurons there are in the first layer, the more
vertices can the polygons have.
•With a sufficient number of first-layer neurons, the
polygons can approximate any given shape.
•The more neurons there are in the second layer, the more
of these polygons can be combined to form the output
function of the network.
•With a sufficient number of neurons and appropriate
weight vectors wi, a three-layer network of threshold
neurons can realize any function Rn → {0, 1}.
Pattern Separation and NN architecture
The perceptron convergence procedure:
Training binary output neurons as classifiers

• Add an extra component with value 1 to each input vector. The “bias” weight on this
component is minus the threshold. Now we can forget the threshold.
• Pick training cases using any policy that ensures that every training case will keep
getting picked.
– If the output unit is correct, leave its weights alone.
– If the output unit incorrectly outputs a zero, add the input vector to the weight
vector.
– If the output unit incorrectly outputs a 1, subtract the input vector from the
weight vector.
• This is guaranteed to find a set of weights that gets the right answer for all the
training cases if any such set exists.
FUNCTION APPROXIMATION,
CLASSIFICATION….

• Multi-layer Perceptrons as universal Boolean


functions
• MLPs as universal classifiers
• MLPs as universal approximators
The perceptron as a Boolean gate
[Diagrams: single perceptrons implementing Boolean gates, e.g. X AND Y (weights 1, 1; threshold 2), NOT X (weight -1; threshold 0), X OR Y (weights 1, 1; threshold 1)]

Values in the circles are thresholds; values on edges are weights

• A perceptron can model any simple binary Boolean gate
Perceptron as a Boolean gate
[Diagram: a unit with weights +1 on inputs X1 … XL, weights -1 on XL+1 … XN, and threshold L]

Will fire only if X1 … XL are all 1 and XL+1 … XN are all 0

• The universal AND gate
  – AND over any number of inputs
    • Any subset of which may be negated
Perceptron as a Boolean gate
[Diagram: a unit with weights +1 on inputs X1 … XL, weights -1 on XL+1 … XN, and threshold L - N + 1]

Will fire if any of X1 … XL is 1 or any of XL+1 … XN is 0

• The universal OR gate
  – OR over any number of inputs
    • Any subset of which may be negated
Perceptron as a Boolean Gate
[Diagram: a unit with weight +1 on every input and threshold K]

Will fire only if at least K inputs are 1

• Generalized majority gate
  – Fire if at least K inputs are of the desired polarity
Perceptron as a Boolean Gate
[Diagram: a unit with weights +1 on X1 … XL, weights -1 on XL+1 … XN, and threshold L - N + K]

Will fire only if the total number of X1 … XL that are 1 and XL+1 … XN that are 0 is at least K

• Generalized majority gate
  – Fire if at least K inputs are of the desired polarity
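A sketch of the generalized majority gate as a single threshold unit (the polarity encoding and the example inputs are mine, for illustration):

def generalized_majority(xs, polarity, K):
    """Generalized majority gate as one threshold unit.

    xs: list of 0/1 inputs; polarity[i] is +1 if input i is desired to be 1,
    -1 if it is desired to be 0 (i.e. that input is negated).
    Fires (returns 1) if at least K inputs have their desired polarity.
    Weights = polarity, threshold = L - N + K, with L = number of +1 polarities.
    """
    N = len(xs)
    L = sum(1 for p in polarity if p == +1)
    z = sum(p * x for p, x in zip(polarity, xs))   # weighted sum with +/-1 weights
    return 1 if z >= L - N + K else 0

# Example: fire if at least 2 of (X1, X2, NOT X3) hold
print(generalized_majority([1, 0, 0], [+1, +1, -1], K=2))  # 1 (X1 and NOT X3 hold)
print(generalized_majority([0, 0, 1], [+1, +1, -1], K=2))  # 0 (none hold)
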
The perceptron is not enough

[Diagram: a single unit over X and Y with unknown weights and threshold]
• Cannot compute an XOR


Multi-layer perceptron
[Diagram: an MLP computing X XOR Y; two hidden threshold units feed an output unit, edge values are weights, circle values are thresholds]

Hidden Layer

• MLPs can compute the XOR


Multi-layer perceptron XOR
[Diagram: an XOR MLP over X and Y using only 2 neurons; both inputs feed the hidden unit and the output unit, and the hidden unit feeds the output with a large negative weight. Thanks to Gerald Friedland]

• With 2 neurons
– 5 weights and two thresholds
Multi-layer perceptron
[Diagram: a deeper MLP of threshold units over inputs X, Y, Z, A]
• MLPs can compute more complex Boolean functions
• MLPs can compute any Boolean function
– Since they can emulate individual gates
• MLPs are universal Boolean functions
MLP as Boolean Functions
[Diagram: the same MLP of threshold units over inputs X, Y, Z, A]

• MLPs are universal Boolean functions


– Any function over any number of inputs and any number of outputs
• But how many “layers” will they need?
How many layers for a Boolean MLP?

Truth table (shows all the input combinations for which the output is 1):

X1 X2 X3 X4 X5 | Y
 0  0  1  1  0 | 1
 0  1  0  1  1 | 1
 0  1  1  0  0 | 1
 1  0  0  0  1 | 1
 1  0  1  1  1 | 1
 1  1  0  0  1 | 1

• A Boolean function is just a truth table


How many layers for a Boolean MLP?

The truth table can be expressed in disjunctive normal form, one AND term per row with output 1:

Y = (¬X1·¬X2·X3·X4·¬X5) + (¬X1·X2·¬X3·X4·X5) + (¬X1·X2·X3·¬X4·¬X5) + (X1·¬X2·¬X3·¬X4·X5) + (X1·¬X2·X3·X4·X5) + (X1·X2·¬X3·¬X4·X5)

• Expressed in disjunctive normal form


How many layers for a Boolean MLP?

[Diagram: a one-hidden-layer MLP over X1 … X5 with one hidden AND unit per row of the truth table, all feeding a single OR output unit]
• Any truth table can be expressed in this manner!
• A one-hidden-layer MLP is a Universal Boolean Function
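
A sketch of this DNF construction in Python: one hidden AND unit per row of the truth table and an OR unit on top (the helper name and the example calls are mine):

def truth_table_mlp(true_rows):
    """One-hidden-layer MLP of threshold units built from a truth table.

    true_rows: the input tuples for which the output is 1. Each true row becomes
    a hidden AND unit (weight +1 where the row has a 1, -1 where it has a 0,
    threshold = number of 1s in the row); the output unit ORs the hidden units.
    """
    def predict(x):
        hidden = []
        for row in true_rows:
            w = [1 if r == 1 else -1 for r in row]
            z = sum(wi * xi for wi, xi in zip(w, x))
            hidden.append(1 if z >= sum(row) else 0)   # fires only on an exact match
        return 1 if sum(hidden) >= 1 else 0            # OR of the hidden units
    return predict

# The six rows with output 1 from the truth table above
rows = [(0, 0, 1, 1, 0), (0, 1, 0, 1, 1), (0, 1, 1, 0, 0),
        (1, 0, 0, 0, 1), (1, 0, 1, 1, 1), (1, 1, 0, 0, 1)]
f = truth_table_mlp(rows)
print(f((0, 0, 1, 1, 0)), f((1, 1, 1, 1, 1)))   # 1 0
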
Recap: The MLP as a classifier

[Diagram: an MLP taking 784-dimensional (MNIST) inputs]

• MLP as a function over real inputs


• MLP as a function that finds a complex “decision
boundary” over a space of reals
A Perceptron on Reals
[Diagram: a perceptron over real inputs x1 … xN; in the (x1, x2) plane it outputs 1 on one side of the line w1·x1 + w2·x2 = T and 0 on the other]

  y = 1 if Σi wi·xi ≥ T, else 0

• A perceptron operates on real-valued vectors
  – This is a linear classifier
Boolean functions with a
real perceptron
[Plots: three Boolean functions of X and Y drawn on the unit square with corners (0,0), (0,1), (1,0), (1,1); each is separated by a straight line]

• Boolean perceptrons are also linear classifiers
  – Purple regions are 1
Composing complicated "decision" boundaries

[Plot: an arbitrary coloured region in the (x1, x2) plane]

Can now be composed into "networks" to compute arbitrary classification "boundaries"

• Build a network of units with a single output that fires if the input is in the coloured area
Booleans over the reals

[Plots: a pentagonal region in the (x1, x2) plane is built up one bounding line at a time, each line being the decision boundary of one threshold unit over (x1, x2)]

• The network must fire if the input is in the coloured area
Booleans over the reals

[Diagram: five threshold units y1 … y5, one per side of the pentagon, feed an AND unit that tests Σ_{i=1..N} yi ≥ 5; the numbers in the plot give the value of the sum in each region: 5 inside the pentagon, 4 or 3 outside]

• The network must fire if the input is in the coloured area
  – The AND compares the sum of the hidden outputs to 5
• NB: What would the pattern be if it compared it to 4?
Composing a hexagon

[Diagram: six threshold units y1 … y6 over (x1, x2), one per side, feed an output unit that tests Σ_{i=1..N} yi ≥ 6; the sum is 6 inside the hexagon and drops to 5, 4, 3 in the regions outside]

• The polygon net
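
A sketch of the polygon net in Python (assumes a convex polygon given counter-clockwise; NumPy is assumed, and the unit-square example is illustrative):

import numpy as np

def polygon_net(vertices):
    """Polygon net: one threshold unit per polygon side, plus a summing output.

    vertices: the polygon's corners in order (assumed convex, counter-clockwise).
    Each hidden unit fires when the point is on the inner side of one edge;
    the output fires only when all N of them fire (sum >= N).
    """
    V = np.asarray(vertices, dtype=float)
    N = len(V)

    def predict(p):
        p = np.asarray(p, dtype=float)
        total = 0
        for i in range(N):
            a, b = V[i], V[(i + 1) % N]
            edge, to_p = b - a, p - a
            # unit fires if p is to the left of edge a->b (inside for CCW polygons);
            # the cross product is an affine function of p, so this is a threshold unit
            total += 1 if edge[0] * to_p[1] - edge[1] * to_p[0] >= 0 else 0
        return 1 if total >= N else 0
    return predict

# Example: the unit square as a 4-sided polygon net
square = polygon_net([(0, 0), (1, 0), (1, 1), (0, 1)])
print(square((0.5, 0.5)), square((2.0, 0.5)))   # 1 0
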
How about a heptagon? 16 sides? 64 sides? 1000 sides?

[Plots: polygons with increasing numbers of sides, and the value of Σ yi in each region]

• What are the sums in the different regions?
  – A pattern emerges as we consider N > 6…
  – N is the number of sides of the polygon
Polygon net

[Diagram: N threshold units y1 … yN over (x1, x2) feed an output unit that tests Σ_{i=1..N} yi ≥ N?]
In the limit

[Diagram: N threshold units over (x1, x2) feed an output that tests Σ_{i=1..N} yi ≥ N?; a plot shows the value of the sum as a function of the distance of x from the polygon's center, for increasing N]

• Value of the sum at the output unit, as a function of distance from the center, as N increases
• For a small radius, it's a near-perfect cylinder
  – N in the cylinder, N/2 outside
Composing a circle

[Diagram: a very large number N of threshold units feed an output that tests Σ_{i=1..N} yi ≥ N?; the sum is N inside the circle and N/2 outside]

• The circle net
  – Very large number of neurons
  – Sum is N inside the circle, N/2 outside almost everywhere
  – Circle can be at any location
Adding circles

  Σ_{i=1..2N} yi - N/2 ≥ 0?

• The "sum" of two circle sub-nets is exactly N/2 inside either circle, and 0 almost everywhere outside
Composing an arbitrary figure

  Σ_{i=1..KN} yi - N/2 ≥ 0?

• Just fit in an arbitrary number of circles
  – More accurate approximation with a greater number of smaller circles
  – Can achieve arbitrary precision
MLP: Universal classifier

  Σ_{i=1..KN} yi - N/2 ≥ 0?

• MLPs can capture any classification boundary
• A one-hidden-layer MLP can model any classification boundary
• MLPs are universal classifiers
MLP as a continuous-valued regression

[Diagram: input x feeds two threshold units with thresholds T1 and T2; their outputs are combined with weights +1 and -1 by a summing output unit, producing f(x), a square pulse between T1 and T2]

• A simple 3-unit MLP with a "summing" output unit can generate a "square pulse" over an input
  – Output is 1 only if the input lies between T1 and T2
  – T1 and T2 can be arbitrarily specified
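
A minimal sketch of the square-pulse construction (the function names and example values are mine):

def step(z):
    return 1.0 if z >= 0 else 0.0

def square_pulse(x, T1, T2):
    """Two threshold units (thresholds T1, T2) combined with weights +1 and -1
    by a summing output unit: 1 for T1 <= x < T2, 0 elsewhere."""
    return step(x - T1) - step(x - T2)

print(square_pulse(0.5, T1=0.0, T2=1.0))   # 1.0 (inside the pulse)
print(square_pulse(1.5, T1=0.0, T2=1.0))   # 0.0 (outside)
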
MLP as a continuous-valued regression

[Diagram: many pulse-generating pairs of threshold units over the input x; each pulse is scaled and all are summed to approximate f(x)]

• A simple 3-unit MLP can generate a "square pulse" over an input
• An MLP with many units can model an arbitrary function over an input
  – To arbitrary precision
    • Simply make the individual pulses narrower
• A one-hidden-layer MLP can model an arbitrary function of a single input
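
A sketch of the approximation argument, reusing square_pulse from the previous sketch (the sine example and the pulse count are illustrative):

import math

def approximate(f, lo, hi, n_pulses):
    """Approximate f on [lo, hi] with a sum of scaled square pulses.

    Each pulse covers one sub-interval and is scaled by the value of f at its
    centre; narrower pulses (larger n_pulses) give a finer approximation.
    """
    width = (hi - lo) / n_pulses
    edges = [lo + i * width for i in range(n_pulses)]
    heights = [f(t + width / 2) for t in edges]

    def f_hat(x):
        # square_pulse is the 3-unit construction from the previous sketch
        return sum(h * square_pulse(x, t, t + width) for h, t in zip(heights, edges))
    return f_hat

sin_hat = approximate(math.sin, 0.0, math.pi, n_pulses=200)
print(round(sin_hat(math.pi / 2), 3), round(math.sin(math.pi / 2), 3))  # ~1.0 vs 1.0
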
For higher dimensions

[Diagram: a circle net (output N inside the circle, N/2 outside) followed by a summing unit that adds a bias of -N/2, producing a cylinder]

• An MLP can compose a cylinder
  – N/2 in the circle, 0 outside
MLP as a continuous-valued function

[Diagram: many cylinders over the input space, scaled and summed by an additive output unit]

• MLPs can actually compose arbitrary functions in any number of dimensions!
  – Even with only one hidden layer
• As sums of scaled and shifted cylinders
  – To arbitrary precision
    • By making the cylinders thinner
• The MLP is a universal approximator!
MLPs with additive output units are universal approximators

[Diagram: the output unit computes a weighted sum over N groups of K hidden units (indices (i-1)·K + j, for i = 1…N, j = 1…K), one group per cylinder]

• MLPs can actually compose arbitrary functions
• But the explanation so far only holds if the output unit only performs summation
  – i.e. does not have an additional "activation"
The network as a function

• Output unit with activation function
  – Threshold or sigmoid, or any other
• The network is actually a universal map from the entire domain of input values to the entire range of the output activation
  – i.e. all the values the activation function of the output neuron can take
A GEOMETRICAL VIEW OF
PERCEPTRONS
Weight-space

• This space has one dimension per weight.

• A point in the space represents a particular setting of all the weights.

• Assuming that we have eliminated the threshold, each training case can
be represented as a hyperplane through the origin.
– The weights must lie on one side of this hyper-plane to get the answer
correct.
Weight space

• Each training case defines a plane (shown as a black line in the diagram)
  – The plane goes through the origin and is perpendicular to the input vector.
  – On one side of the plane the output is wrong because the scalar product of the weight vector with the input vector has the wrong sign.

[Diagram: an input vector with correct answer = 1; a good weight vector on the correct side of the plane through the origin, and a bad weight vector on the wrong side]
Weight space

• Each training case defines a plane (shown as a black line in the diagram)
  – The plane goes through the origin and is perpendicular to the input vector.
  – On one side of the plane the output is wrong because the scalar product of the weight vector with the input vector has the wrong sign.

[Diagram: an input vector with correct answer = 0; good weights and bad weights on either side of the plane through the origin]
The cone of feasible solutions

• To get all training cases right we need to find a point on the right side of all the planes.
  – There may not be any such point!
• If there are any weight vectors that get the right answer for all cases, they lie in a hyper-cone with its apex at the origin.
  – So the average of two good weight vectors is a good weight vector.
• The problem is convex.

[Diagram: two input vectors, one with correct answer = 0 and one with correct answer = 1, their planes through the origin, and the cone of good weight vectors between them; bad weights lie outside the cone]
Learning with hidden units
• Networks without hidden units are very limited in the input-output mappings they can
learn to model.
– More layers of linear units do not help. It's still linear.
– Fixed output non-linearities are not enough.
• We need multiple layers of adaptive, non-linear hidden units. But how can we train such
nets?
– We need an efficient way of adapting all the weights, not just the last layer. This is
hard.
– Learning the weights going into hidden units is equivalent to learning features.
– This is difficult because nobody is telling us directly what the hidden units should do.
LEARNING THE WEIGHTS OF A LOGISTIC
OUTPUT NEURON
Logistic neurons

  z = b + Σi xi·wi
  y = 1 / (1 + e^(-z))

• These give a real-valued output that is a smooth and bounded function of their total input.
  – They have nice derivatives which make learning easy.

[Plot: y = 1/(1 + e^(-z)), rising from 0 through y = 0.5 at z = 0 toward 1]
The derivatives of a logistic neuron

• The derivatives of the logit, z, with respect to the inputs and the weights are very simple:

  z = b + Σi xi·wi
  ∂z/∂wi = xi,   ∂z/∂xi = wi

• The derivative of the output with respect to the logit is simple if you express it in terms of the output:

  y = 1 / (1 + e^(-z))
  dy/dz = y(1 - y)
The derivatives of a logistic neuron

  y = 1 / (1 + e^(-z)) = (1 + e^(-z))^(-1)

  dy/dz = -1·(-e^(-z)) / (1 + e^(-z))^2 = [1 / (1 + e^(-z))] · [e^(-z) / (1 + e^(-z))] = y(1 - y)

because

  e^(-z) / (1 + e^(-z)) = [(1 + e^(-z)) - 1] / (1 + e^(-z)) = 1 - 1/(1 + e^(-z)) = 1 - y
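
A small numerical check of dy/dz = y(1 - y) (plain Python; the test point is arbitrary):

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """dy/dz expressed in terms of the output: y * (1 - y)."""
    y = sigmoid(z)
    return y * (1.0 - y)

# Compare the analytic derivative with a central finite difference at z = 0.7
z, eps = 0.7, 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(round(sigmoid_grad(z), 6), round(numeric, 6))   # the two values should agree
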
Using the chain rule to get the derivatives needed for learning the weights of a logistic unit

• To learn the weights we need the derivative of the output with respect to each weight:

  ∂y/∂wi = (∂z/∂wi)·(dy/dz) = xi·y(1 - y)

• Delta rule:

  ∂E/∂wi = Σn (∂y^n/∂wi)·(∂E/∂y^n) = -Σn xi^n · y^n(1 - y^n) · (t^n - y^n)

  The extra term y^n(1 - y^n) is the slope of the logistic.
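
A sketch of this gradient for a squared-error loss E = ½ Σn (t^n - y^n)^2, in plain Python (the function names, toy data, and learning rate are mine):

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_gradient(X, t, w, b):
    """Gradient of E = 0.5 * sum_n (t_n - y_n)^2 for a single logistic unit.

    Implements dE/dw_i = -sum_n x_i^n * y^n (1 - y^n) * (t^n - y^n), i.e. the
    delta rule above with the extra slope-of-the-logistic term.
    X is a list of input lists, t a list of targets in [0, 1].
    """
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, target in zip(X, t):
        z = b + sum(wi * xi for wi, xi in zip(w, x))
        y = sigmoid(z)
        delta = -(target - y) * y * (1.0 - y)          # dE/dz for this case
        for i, xi in enumerate(x):
            grad_w[i] += delta * xi
        grad_b += delta
    return grad_w, grad_b

# One gradient-descent step on a tiny illustrative dataset
X, t = [[0.0, 1.0], [1.0, 1.0]], [0.0, 1.0]
w, b, lr = [0.1, -0.2], 0.0, 0.5
gw, gb = logistic_gradient(X, t, w, b)
w = [wi - lr * gi for wi, gi in zip(w, gw)]
b = b - lr * gb
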
