
Artificial Neural Networks and Deep Learning
Contents

• Introduction
  Motivation, Biological Background

• Threshold Logic Units
  Definition, Geometric Interpretation, Limitations, Networks of TLUs, Training

• General Neural Networks
  Structure, Operation, Training

• Multi-layer Perceptrons
  Definition, Function Approximation, Gradient Descent, Backpropagation, Variants, Sensitivity Analysis

• Deep Learning
  Many-layered Perceptrons, Rectified Linear Units, Auto-Encoders, Feature Construction, Image Analysis

• Radial Basis Function Networks
  Definition, Function Approximation, Initialization, Training, Generalized Version

• Self-Organizing Maps
  Definition, Learning Vector Quantization, Neighborhood of Output Neurons

• Hopfield Networks and Boltzmann Machines
  Definition, Convergence, Associative Memory, Solving Optimization Problems, Probabilistic Models

• Recurrent Neural Networks
  Differential Equations, Vector Networks, Backpropagation through Time
General (Artificial) Neural Networks
General Neural Networks

Basic graph theoretic notions

A (directed) graph is a pair G = (V, E) consisting of a (finite) set V of vertices or nodes and a (finite) set E ⊆ V × V of edges.

We call an edge e = (u, v) ∈ E directed from vertex u to vertex v.

Let G = (V, E) be a (directed) graph and u ∈ V a vertex. Then the vertices of the set

pred(u) = { v ∈ V | (v, u) ∈ E }

are called the predecessors of the vertex u, and the vertices of the set

succ(u) = { v ∈ V | (u, v) ∈ E }

are called the successors of the vertex u.
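These notions translate directly into code. A minimal sketch, assuming the edge set of the example network that appears later in this chapter (vertex names are illustrative strings):

```python
# Vertices and directed edges of the example network used below.
V = {"u1", "u2", "u3"}
E = {("u1", "u2"), ("u1", "u3"), ("u2", "u3"), ("u3", "u1")}

def pred(u):
    """Predecessors of u: all v with an edge (v, u) in E."""
    return {v for (v, w) in E if w == u}

def succ(u):
    """Successors of u: all w with an edge (u, w) in E."""
    return {w for (v, w) in E if v == u}

print(pred("u3"))  # {'u1', 'u2'}
print(succ("u1"))  # {'u2', 'u3'}
```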

General Neural Networks

General definition of a neural network

An (artificial) neural network is a (directed) graph G = (U, C), whose vertices u ∈ U are called neurons or units and whose edges c ∈ C are called connections.

The set U of vertices is partitioned into
• the set U_in of input neurons,
• the set U_out of output neurons, and
• the set U_hidden of hidden neurons.

It is

U = U_in ∪ U_out ∪ U_hidden,
U_in ≠ ∅, U_out ≠ ∅, U_hidden ∩ (U_in ∪ U_out) = ∅.
General Neural Networks

Each connection (v, u) ∈ C possesses a weight w_uv, and each neuron u ∈ U possesses three (real-valued) state variables:
• the network input net_u,
• the activation act_u, and
• the output out_u.

Each input neuron u ∈ U_in also possesses a fourth (real-valued) state variable,
• the external input ext_u.

Furthermore, each neuron u ∈ U possesses three functions:
• the network input function $f_{\mathrm{net}}^{(u)}\colon \mathbb{R}^{2|\mathrm{pred}(u)|+\kappa_1(u)} \to \mathbb{R}$,
• the activation function $f_{\mathrm{act}}^{(u)}\colon \mathbb{R}^{\kappa_2(u)} \to \mathbb{R}$, and
• the output function $f_{\mathrm{out}}^{(u)}\colon \mathbb{R} \to \mathbb{R}$,

which are used to compute the values of the state variables.
General Neural Networks

Types of (artificial) neural networks:

• If the graph of a neural network is acyclic, it is called a feed-forward network.
• If the graph of a neural network contains cycles (backward connections), it is called a recurrent network.

Representation of the connection weights as a matrix (for U = {u_1, ..., u_r}; the row lists the source neuron, the column the target neuron; absent connections get weight 0):

$$\begin{pmatrix}
w_{u_1 u_1} & w_{u_2 u_1} & \cdots & w_{u_r u_1} \\
w_{u_1 u_2} & w_{u_2 u_2} & \cdots & w_{u_r u_2} \\
\vdots & \vdots & & \vdots \\
w_{u_1 u_r} & w_{u_2 u_r} & \cdots & w_{u_r u_r}
\end{pmatrix}$$
General Neural Networks: Example

A simple recurrent neural network:

[Figure: network with input neurons u1, u2 (external inputs x1, x2) and output neuron u3 (output y); connections u1 → u2 with weight 1, u1 → u3 with weight −2, u2 → u3 with weight 3, and u3 → u1 with weight 4.]

Weight matrix of this network (rows = source neuron, columns = target neuron):

$$\begin{array}{c|ccc}
 & u_1 & u_2 & u_3 \\\hline
u_1 & 0 & 1 & -2 \\
u_2 & 0 & 0 & 3 \\
u_3 & 4 & 0 & 0
\end{array}$$
Structure of a Generalized Neuron

A generalized neuron is a simple numeric processor.

[Figure: data flow through a generalized neuron u. The outputs out_{v_1} = in_{uv_1}, ..., out_{v_n} = in_{uv_n} of the predecessor neurons enter via connections with weights w_{uv_1}, ..., w_{uv_n}. The network input function f_net^(u) (with parameters σ_1, ..., σ_l) computes the network input net_u, the activation function f_act^(u) (with parameters θ_1, ..., θ_k) computes the activation act_u, and the output function f_out^(u) computes the output out_u. The external input ext_u enters the neuron directly.]
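The figure's data flow maps directly onto code. A minimal sketch under the structure above (the concrete function choices are placeholders, not the lecture's reference implementation):

```python
from typing import Callable, Sequence

def neuron_update(weights: Sequence[float], inputs: Sequence[float],
                  f_net: Callable, f_act: Callable, f_out: Callable) -> float:
    """One update of a generalized neuron: f_net -> f_act -> f_out."""
    net_u = f_net(weights, inputs)   # network input from weights and inputs
    act_u = f_act(net_u)             # activation (parameters baked into f_act)
    return f_out(act_u)              # output

# Placeholder choices: weighted sum, threshold at 1, identity output.
out_u = neuron_update(
    [1.0, -2.0], [1.0, 0.5],
    f_net=lambda w, x: sum(wi * xi for wi, xi in zip(w, x)),
    f_act=lambda net: 1.0 if net >= 1.0 else 0.0,
    f_out=lambda act: act,
)
print(out_u)  # 0.0, since net = 1*1 + (-2)*0.5 = 0 < 1
```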
General Neural Networks: Example

[Figure: the example network again; each neuron carries the threshold θ = 1.]
$$f_{\mathrm{net}}^{(u)}(\vec{w}_u, \vec{\mathrm{in}}_u) = \sum_{v \in \mathrm{pred}(u)} w_{uv}\, \mathrm{in}_{uv} = \sum_{v \in \mathrm{pred}(u)} w_{uv}\, \mathrm{out}_v$$

$$f_{\mathrm{act}}^{(u)}(\mathrm{net}_u, \theta) = \begin{cases} 1, & \text{if } \mathrm{net}_u \ge \theta, \\ 0, & \text{otherwise,} \end{cases}$$

$$f_{\mathrm{out}}^{(u)}(\mathrm{act}_u) = \mathrm{act}_u$$
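With these functions the example network is a few lines of code. A minimal sketch in plain Python (names illustrative):

```python
# weights[u][v] holds w_uv, the weight of the connection from v to u.
weights = {
    "u1": {"u3": 4.0},
    "u2": {"u1": 1.0},
    "u3": {"u1": -2.0, "u2": 3.0},
}
theta = {"u1": 1.0, "u2": 1.0, "u3": 1.0}   # all thresholds equal 1

def update(u, out):
    """Recompute neuron u from the current outputs of its predecessors."""
    net = sum(w * out[v] for v, w in weights[u].items())  # f_net: weighted sum
    return 1.0 if net >= theta[u] else 0.0                # f_act; f_out = identity
```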

General Neural Networks: Example

Updating the activations of the neurons

                u1   u2   u3
input phase      1    0    0
work phase       1    0    0    net_u3 = −2 < 1
                 0    0    0    net_u1 =  0 < 1
                 0    0    0    net_u2 =  0 < 1
                 0    0    0    net_u3 =  0 < 1
                 0    0    0    net_u1 =  0 < 1

• Order in which the neurons are updated:
  u3, u1, u2, u3, u1, u2, u3, ...
• Input phase: activations/outputs in the initial state.
• Work phase: the activations/outputs of the next neuron to update are computed
  from the outputs of the other neurons and the weights/threshold.
• A stable state with a unique output is reached.

General Neural Networks: Example

Updating the activations of the neurons

                u1   u2   u3
input phase      1    0    0
work phase       1    0    0    net_u3 = −2 < 1
                 1    1    0    net_u2 =  1 ≥ 1
                 0    1    0    net_u1 =  0 < 1
                 0    1    1    net_u3 =  3 ≥ 1
                 0    0    1    net_u2 =  0 < 1
                 1    0    1    net_u1 =  4 ≥ 1
                 1    0    0    net_u3 = −2 < 1

• Order in which the neurons are updated:
  u3, u2, u1, u3, u2, u1, u3, ...
• No stable state is reached (oscillation of output).
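Running the `update` sketch from above with both update orders reproduces the two behaviors (illustrative driver code):

```python
# Work phase for both update orders, starting from the input phase
# state act(u1) = 1, act(u2) = act(u3) = 0.
for order in (["u3", "u1", "u2"], ["u3", "u2", "u1"]):
    out = {"u1": 1.0, "u2": 0.0, "u3": 0.0}
    print("update order:", order)
    for step in range(9):            # three rounds through the order
        u = order[step % 3]
        out[u] = update(u, out)
        print(out["u1"], out["u2"], out["u3"])
# The first order settles into the stable state (0, 0, 0); the second
# keeps cycling through six states and never stabilizes.
```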

General Neural Networks: Training

Definition of learning tasks for a neural network

A fixed learning task L_fixed for a neural network with
• n input neurons U_in = {u_1, ..., u_n} and
• m output neurons U_out = {v_1, ..., v_m}

is a set of training patterns $l = (\vec{\imath}^{(l)}, \vec{o}^{(l)})$, each consisting of
• an input vector $\vec{\imath}^{(l)} = \bigl(\mathrm{ext}_{u_1}^{(l)}, \ldots, \mathrm{ext}_{u_n}^{(l)}\bigr)$ and
• an output vector $\vec{o}^{(l)} = \bigl(o_{v_1}^{(l)}, \ldots, o_{v_m}^{(l)}\bigr)$.

A fixed learning task is solved if, for all training patterns l ∈ L_fixed, the neural network computes from the external inputs contained in the input vector $\vec{\imath}^{(l)}$ of a training pattern l the outputs contained in the corresponding output vector $\vec{o}^{(l)}$.
General Neural Networks: Training

Solving a fixed learning task: Error definition

• Measure how well a neural network solves a given fixed learning task.
• Compute differences between desired and actual outputs.
• Do not sum the differences directly, in order to avoid errors canceling each other.
• The square has favorable properties for deriving the adaptation rules.

$$e = \sum_{l \in L_{\mathrm{fixed}}} e^{(l)} = \sum_{v \in U_{\mathrm{out}}} e_v = \sum_{l \in L_{\mathrm{fixed}}} \sum_{v \in U_{\mathrm{out}}} e_v^{(l)},$$

where $e_v^{(l)} = \bigl(o_v^{(l)} - \mathrm{out}_v^{(l)}\bigr)^2$.
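For a concrete fixed learning task this error sum is a few lines of code. A sketch with hypothetical training patterns and a stand-in network:

```python
# Each training pattern is a pair (input vector, desired output vector);
# the values here are made up for illustration.
patterns = [((0.0, 1.0), (1.0,)),
            ((1.0, 1.0), (0.0,))]

def sse(patterns, predict):
    """Sum of squared errors e over all patterns and output neurons."""
    return sum((o_v - out_v) ** 2
               for inputs, desired in patterns
               for o_v, out_v in zip(desired, predict(inputs)))

# Stand-in network that always outputs 0:
print(sse(patterns, lambda x: (0.0,)))  # 1.0
```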

General Neural Networks: Training

Definition of learning tasks for a neural network

A free learning task L_free for a neural network with
• n input neurons U_in = {u_1, ..., u_n}

is a set of training patterns $l = (\vec{\imath}^{(l)})$, each consisting of
• an input vector $\vec{\imath}^{(l)} = \bigl(\mathrm{ext}_{u_1}^{(l)}, \ldots, \mathrm{ext}_{u_n}^{(l)}\bigr)$.

Properties:
• There is no desired output for the training patterns.
• Outputs can be chosen freely by the training method.
• Solution idea: Similar inputs should lead to similar outputs.
  (clustering of input vectors)
General Neural Networks: Preprocessing

Normalization of the input vectors

• Compute the expected value and the (corrected) standard deviation for each input k:

$$\mu_k = \frac{1}{|L|} \sum_{l \in L} \mathrm{ext}_{u_k}^{(l)} \qquad \text{and} \qquad \sigma_k = \sqrt{\frac{1}{|L| - 1} \sum_{l \in L} \Bigl(\mathrm{ext}_{u_k}^{(l)} - \mu_k\Bigr)^2}.$$

• Normalize the input vectors to expected value 0 and standard deviation 1:

$$\mathrm{ext}_{u_k}^{(l)(\mathrm{new})} = \frac{\mathrm{ext}_{u_k}^{(l)(\mathrm{old})} - \mu_k}{\sigma_k}$$

• Such a normalization avoids unit and scaling problems.
  It is also known as z-scaling or z-score standardization.
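A sketch of this z-score standardization with numpy (the data matrix is made up for illustration):

```python
import numpy as np

# Rows = training patterns, columns = inputs; values are made up.
X = np.array([[1.0, 200.0],
              [2.0, 180.0],
              [3.0, 220.0],
              [4.0, 160.0]])

mu = X.mean(axis=0)               # expected value mu_k of each input
sigma = X.std(axis=0, ddof=1)     # corrected std dev (divisor |L| - 1)
X_new = (X - mu) / sigma          # each column: mean 0, std dev 1

print(X_new.mean(axis=0))         # ~[0. 0.]
print(X_new.std(axis=0, ddof=1))  # [1. 1.]
```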
