
Slide10

Haykin Chapter 14: Neurodynamics
(3rd Ed. Chapter 13)
CPSC 636-600
Instructor: Yoonsuck Choe
Spring 2015

Neural Networks with Temporal Behavior

• Inclusion of feedback gives temporal characteristics to neural networks: recurrent networks.
• Two ways to add feedback:
  – Local feedback
  – Global feedback
• Recurrent networks can become stable or unstable.
• The main interest is in a recurrent network's stability: neurodynamics.
• Stability is a property of the whole system: coordination between parts is necessary.

Stability in Nonlinear Dynamical Systems

• Lyapunov stability: more on this later.
• Study of neurodynamics:
  – Deterministic neurodynamics: expressed as nonlinear differential equations.
  – Stochastic neurodynamics: expressed in terms of stochastic nonlinear differential equations; recurrent networks perturbed by noise.

Preliminaries: Dynamical Systems

• A dynamical system is a system whose state varies with time.
• State-space model: values of state variables change over time.
• Example: x_1(t), x_2(t), ..., x_N(t) are state variables that hold different values under the independent variable t. This describes a system of order N, and x(t) is called the state vector. The dynamics of the system are expressed using ordinary differential equations:

  d/dt x_j(t) = F_j(x_j(t)),   j = 1, 2, ..., N,

  or, more conveniently,

  d/dt x(t) = F(x(t)).
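To make the state-space equation concrete, here is a minimal Python sketch that traces a trajectory of dx/dt = F(x) with simple Euler steps. The particular vector field F, the step size, and the initial condition are illustrative assumptions, not part of the slides.

```python
import numpy as np

def F(x):
    """Hypothetical 2nd-order vector field (a damped oscillator, chosen for illustration)."""
    x1, x2 = x
    return np.array([x2, -x1 - 0.5 * x2])

def trajectory(x0, dt=0.01, steps=1000):
    """Integrate dx/dt = F(x) with simple Euler steps from the initial condition x0."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        xs.append(xs[-1] + dt * F(xs[-1]))
    return np.array(xs)            # each row is a point on the trajectory (orbit)

orbit = trajectory([1.0, 0.0])
print(orbit[-1])                   # the state spirals in toward the origin for this F
```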
Autonomous vs. Non-autonomous Dynamical Systems

• Autonomous: F(·) does not explicitly depend on time.
• Non-autonomous: F(·) explicitly depends on time.

F as a Vector Field

• Since dx/dt can be seen as velocity, F(x) can be seen as a velocity vector field, or simply a vector field.
• In a vector field, each point in space (x) is associated with one unique vector (direction and magnitude). In a scalar field, each point has one scalar value.

State Space

[Figure: a trajectory in the (x1, x2) plane, with points at t = 0, 1, 2, ... and the tangent vector dx/dt at the initial condition.]

• It is convenient to view the state-space equation dx/dt = F(x) as describing the motion of a point in N-dimensional space (Euclidean or non-Euclidean). Note: t is continuous!
• The points traversed over time are called the trajectory or the orbit.
• The tangent vector shows the instantaneous velocity at the initial condition.

Phase Portrait and Vector Field

[Figure: phase portrait of a two-dimensional system overlaid on its vector field.]

• Red curves show the state (phase) portrait, represented by trajectories from different initial points.
• The blue arrows in the background show the vector field.

Source: http://www.math.ku.edu/~byers/ode/b_cp_lab/pict.html

Conditions for the Solution of the State Space Equation

• A unique solution to the state space equation exists only under certain conditions, which restrict the form of F(x).
• For a solution to exist, it is sufficient for F(x) to be continuous in all of its arguments.
• For uniqueness, it must meet the Lipschitz condition.
• Lipschitz condition:
  – Let x and u be a pair of vectors in an open set M in a normed vector space. A vector function F(x) that satisfies

      ‖F(x) − F(u)‖ ≤ K‖x − u‖

    for some constant K is said to be Lipschitz, and K is called the Lipschitz constant for F(x).
  – If the partial derivatives ∂F_i/∂x_j are finite everywhere, F(x) meets the Lipschitz condition.
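As a quick sanity check of the Lipschitz inequality, the sketch below reuses the illustrative vector field from the earlier example. Since that field is linear, F(x) = Ax, the spectral norm of A serves as a Lipschitz constant K; the sampling loop is only a numerical illustration, not a proof.

```python
import numpy as np

# Illustrative linear field F(x) = A x; its spectral norm is a valid Lipschitz constant K.
A = np.array([[0.0, 1.0],
              [-1.0, -0.5]])

def F(x):
    return A @ x

K = np.linalg.norm(A, 2)           # largest singular value of A

rng = np.random.default_rng(0)
for _ in range(1000):
    x, u = rng.normal(size=2), rng.normal(size=2)
    assert np.linalg.norm(F(x) - F(u)) <= K * np.linalg.norm(x - u) + 1e-12
print("Lipschitz condition holds on all sampled pairs with K =", K)
```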
Stability of Equilibrium States

• x̄ ∈ M is said to be an equilibrium state (or singular point) of the system if

  dx/dt |_{x = x̄} = F(x̄) = 0.

• How the system behaves near these equilibrium states is of great interest.
• Near these points, we can approximate the dynamics by linearizing F(x) (using a Taylor expansion) around x̄, i.e., writing x(t) = x̄ + ∆x(t):

  F(x) ≈ F(x̄) + A ∆x(t) = A ∆x(t),

  where A is the Jacobian evaluated at the equilibrium:

  A = ∂F(x)/∂x |_{x = x̄}.

Stability in the Linearized System

• In the linearized system, the properties of the Jacobian matrix A determine the behavior near equilibrium points.
• This is because

  d/dt ∆x(t) ≈ A ∆x(t).

• If A is nonsingular, A^{-1} exists, and the linearization can be used to describe the local behavior near the equilibrium x̄.
• The eigenvalues of the matrix A characterize different classes of behaviors.
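A small sketch of this linearization step, under assumed dynamics (a damped-pendulum-like field with an equilibrium at the origin): estimate the Jacobian A at x̄ by finite differences and inspect its eigenvalues.

```python
import numpy as np

def F(x):
    """Assumed vector field (damped pendulum); the origin is an equilibrium since F(0) = 0."""
    x1, x2 = x
    return np.array([x2, -np.sin(x1) - 0.5 * x2])

def jacobian(F, x_bar, eps=1e-6):
    """Finite-difference estimate of A = dF/dx evaluated at x_bar."""
    n = len(x_bar)
    A = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        A[:, j] = (F(x_bar + e) - F(x_bar - e)) / (2 * eps)
    return A

A = jacobian(F, np.zeros(2))
print(np.linalg.eigvals(A))   # complex with negative real parts here: locally a stable focus
```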

Eigenvalues/Eigenvectors

• For a square matrix A, if a nonzero vector x and a scalar value λ exist so that

  (A − λI)x = 0,

  then x is called an eigenvector of A and λ an eigenvalue.
• Note, the above is simply

  Ax = λx.

• An intuitive meaning: x is a direction in which applying the linear transformation A only changes the magnitude of x (by λ) but not the angle.
• There can be as many as n eigenvector/eigenvalue pairs for an n × n matrix.

Example: 2nd-Order System

[Figure: phase portraits of a second-order system for the different eigenvalue types listed below.]

The positive/negative and real/imaginary character of the eigenvalues of the Jacobian determines the behavior:

• Stable node (real, negative); unstable focus (complex, positive real part)
• Stable focus (complex, negative real part); saddle point (real, opposite signs)
• Unstable node (real, positive); center (complex, zero real part)
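The classification above can be written down directly from the eigenvalues. The sketch below is one possible mapping for 2×2 Jacobians; borderline and repeated-eigenvalue cases are glossed over, and the test matrices are made-up examples.

```python
import numpy as np

def classify_equilibrium(A, tol=1e-9):
    """Map the eigenvalues of a 2x2 Jacobian A to the behavior classes listed above (a sketch)."""
    lam = np.linalg.eigvals(A)
    re, im = lam.real, lam.imag
    if np.all(np.abs(im) > tol):                     # complex-conjugate pair
        if np.all(np.abs(re) < tol):
            return "center"
        return "stable focus" if np.all(re < 0) else "unstable focus"
    if re[0] * re[1] < 0:                            # real eigenvalues of opposite sign
        return "saddle point"
    return "stable node" if np.all(re < 0) else "unstable node"

print(classify_equilibrium(np.array([[0.0, 1.0], [-1.0, -0.5]])))   # -> stable focus
print(classify_equilibrium(np.array([[1.0, 0.0], [0.0, -2.0]])))    # -> saddle point
```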
Definitions of Stability

• Uniformly stable: for an arbitrary ε > 0, there exists a positive δ such that ‖x(0) − x̄‖ < δ implies ‖x(t) − x̄‖ < ε for all t > 0.
• Convergent: there exists a positive δ such that ‖x(0) − x̄‖ < δ implies x(t) → x̄ as t → ∞.
• Asymptotically stable: both stable and convergent.
• Globally asymptotically stable: stable, and all trajectories of the system converge to x̄ as time t approaches infinity.

Lyapunov's Theorem

• Theorem 1: The equilibrium state x̄ is stable if, in a small neighborhood of x̄, there exists a positive definite function V(x) such that its derivative with respect to time is negative semidefinite in that region.
• Theorem 2: The equilibrium state x̄ is asymptotically stable if, in a small neighborhood of x̄, there exists a positive definite function V(x) such that its derivative with respect to time is negative definite in that region.
• A scalar function V(x) that satisfies these conditions is called a Lyapunov function for the equilibrium state x̄.
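A numerical illustration of Theorem 1, using the same assumed damped-oscillator system as in the earlier sketches and the candidate V(x) = ½‖x‖². Along trajectories, dV/dt = ∇V · F(x), which for this particular F works out to −0.5·x2² ≤ 0, i.e., negative semidefinite, so the origin is stable.

```python
import numpy as np

def F(x):
    """Illustrative system from the earlier sketches."""
    x1, x2 = x
    return np.array([x2, -x1 - 0.5 * x2])

def V(x):
    """Candidate Lyapunov function: positive definite around x_bar = 0."""
    return 0.5 * np.dot(x, x)

def dV_dt(x):
    """Derivative along trajectories: dV/dt = grad(V) . dx/dt = x . F(x) = -0.5 * x2**2."""
    return np.dot(x, F(x))

rng = np.random.default_rng(1)
worst = max(dV_dt(x) for x in rng.uniform(-1, 1, size=(1000, 2)))
print(worst)   # <= 0: negative semidefinite near the origin, so the origin is stable (Theorem 1)
```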

Attractors

• Dissipative systems are characterized by attracting sets or manifolds of dimensionality lower than that of the embedding space. These are called attractors.
• Regions of initial conditions of nonzero state space volume converge to these attractors as time t increases.

Types of Attractors

[Figure: a point attractor (left) and a limit-cycle attractor (right).]

• Point attractors (left)
• Limit cycle attractors (right)
• Strange (chaotic) attractors (not shown)
Neurodynamical Models

We will focus on state variables that are continuous-valued, and on dynamics expressed by differential equations or difference equations.

Properties:

• Large number of degrees of freedom.
• Nonlinearity.
• Dissipative (as opposed to conservative), i.e., an open system.
• Noise.

Manipulation of Attractors as a Recurrent Network Paradigm

• We can identify attractors with computational objects (associative memories, input-output mappers, etc.).
• In order to do so, we must exercise control over the locations of the attractors in the state space of the system.
• A learning algorithm will manipulate the equations governing the dynamical behavior so that desired locations of attractors are set.
• One good way to do this is to use the energy-minimization paradigm (e.g., by Hopfield).

Hopfield Model

• N units with full connections among every node (no self-feedback).
• Implements a content-addressable memory.
• Given M input patterns, each having the same dimensionality as the network, they can be memorized in attractors of the network.
• Starting with an initial pattern, the dynamics will converge toward the attractor of the basin of attraction in which the initial pattern was placed.

Discrete Hopfield Model

• Based on the McCulloch-Pitts model (neurons with +1 or −1 output).
• The energy function is defined as

  E = −(1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} w_ji x_i x_j   (i ≠ j).

• The network dynamics will evolve in the direction that minimizes E.
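The energy function translates directly into code. A minimal sketch follows; the 4-unit weight matrix is a made-up example obtained by storing the single pattern [1, 1, −1, 1] with the Hebbian rule described on a later slide, with a zero diagonal.

```python
import numpy as np

def energy(W, x):
    """E = -(1/2) * sum over i != j of w_ji * x_i * x_j (assumes W has a zero diagonal)."""
    return -0.5 * x @ W @ x

# Made-up 4-unit example: W stores the single pattern [1, 1, -1, 1].
xi = np.array([1, 1, -1, 1], dtype=float)
W = np.outer(xi, xi) / len(xi)
np.fill_diagonal(W, 0.0)

print(energy(W, xi))                              # the stored pattern sits at low energy (-1.5)
print(energy(W, np.array([1., -1., -1., -1.])))   # a perturbed state has higher energy (0.5)
```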
Content-Addressable Memory

• Map a set of patterns to be memorized ξ_µ onto fixed points x_µ in the dynamical system realized by the recurrent network.
• Encoding: mapping from ξ_µ to x_µ.
• Decoding: reverse mapping from the state space x_µ to ξ_µ.

Hopfield Model: Storage

• The learning is similar to Hebbian learning:

  w_ji = (1/N) Σ_{µ=1}^{M} ξ_{µ,j} ξ_{µ,i}

  with w_ji = 0 if i = j. (Learning is one-shot.)

• In matrix form the above becomes:

  W = (1/N) Σ_{µ=1}^{M} ξ_µ ξ_µ^T − (M/N) I.

• The resulting weight matrix W is symmetric: W = W^T.
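A sketch of the one-shot storage rule; the two 6-bit patterns are arbitrary examples chosen only to exercise the code.

```python
import numpy as np

def store(patterns):
    """One-shot Hebbian storage: W = (1/N) * sum_mu xi_mu xi_mu^T - (M/N) * I."""
    patterns = np.asarray(patterns, dtype=float)    # shape (M, N), entries +1/-1
    M, N = patterns.shape
    W = patterns.T @ patterns / N                   # (1/N) * sum of outer products
    np.fill_diagonal(W, 0.0)                        # removes the (M/N) I diagonal term
    return W

# two arbitrary 6-bit patterns used as an example
patterns = [[1, -1, 1, -1, 1, -1],
            [1, 1, -1, -1, 1, 1]]
W = store(patterns)
assert np.allclose(W, W.T)     # the resulting weight matrix is symmetric, as stated above
print(W)
```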

Hopfield Model: Activation (Retrieval)

• Initialize the network with a probe pattern ξ_probe:

  x_j(0) = ξ_probe,j.

• Update the output of each neuron (picking them at random) as

  x_j(n + 1) = sgn( Σ_{i=1}^{N} w_ji x_i(n) )

  until x reaches a fixed point.

Spurious States

• The weight matrix W is symmetric, thus the eigenvalues of W are all real.
• For a large number of patterns M, the matrix is degenerate, i.e., several eigenvectors can have the same eigenvalue.
• These eigenvectors form a subspace, and when the associated eigenvalue is 0, it is called a null space.
• This is due to M being smaller than the number of neurons N.
• Hopfield network as a content-addressable memory:
  – The discrete Hopfield network acts as a vector projector (it projects the probe vector onto the subspace spanned by the training patterns).
  – The underlying dynamics drive the network to converge to one of the corners of the unit hypercube.
• Spurious states are those corners of the hypercube that do not belong to the training pattern set.
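Returning to the retrieval rule above, here is one possible asynchronous-update sketch; the stored pattern and the corrupted probe are made-up examples, and sgn(0) is arbitrarily treated as +1.

```python
import numpy as np

def retrieve(W, probe, max_sweeps=100, rng=None):
    """Asynchronous retrieval: repeatedly set x_j = sgn(sum_i w_ji x_i) for randomly
    ordered units until a full sweep produces no change (a fixed point)."""
    rng = rng or np.random.default_rng()
    x = np.asarray(probe, dtype=float).copy()
    for _ in range(max_sweeps):
        changed = False
        for j in rng.permutation(len(x)):            # pick units in random order
            new = 1.0 if W[j] @ x >= 0 else -1.0     # sgn, treating sgn(0) as +1
            if new != x[j]:
                x[j], changed = new, True
        if not changed:                              # fixed point reached
            break
    return x

# store one made-up pattern, then recall it from a corrupted probe
xi = np.array([1, -1, 1, -1, 1, 1], dtype=float)
W = np.outer(xi, xi) / len(xi)
np.fill_diagonal(W, 0.0)
print(retrieve(W, [1, 1, 1, -1, 1, 1]))              # recovers the stored pattern
```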
Storage Capacity of Hopfield Network

• Given a probe equal to the stored pattern ξ_ν, the activation of the jth neuron can be decomposed into a signal term and a noise term:

  v_j = Σ_{i=1}^{N} w_ji ξ_{ν,i}
      = (1/N) Σ_{µ=1}^{M} Σ_{i=1}^{N} ξ_{µ,j} ξ_{µ,i} ξ_{ν,i}
      = ξ_{ν,j} + (1/N) Σ_{µ=1, µ≠ν}^{M} Σ_{i=1}^{N} ξ_{µ,j} ξ_{µ,i} ξ_{ν,i},

  where the first term is the signal (ξ_{ν,j} ∈ {±1}) and the second term is the noise.

• The signal-to-noise ratio is defined as

  ρ = variance of signal / variance of noise = 1 / ((M − 1)/N) ≈ N/M.

• The reciprocal of ρ, called the load parameter, is designated α. According to Amit and others, this value needs to be less than the critical value α_c = 0.14.

Storage Capacity of Hopfield Network (cont'd)

• Given α_c = 0.14, the storage capacity becomes

  M_c = α_c N = 0.14 N

  when some error is allowed in the final patterns.

• For almost error-free performance, the storage capacity becomes

  M_c = N / (2 log_e N).

• Thus, the storage capacity of a Hopfield network scales less than linearly with the size N of the network.
• This is a major limitation of the Hopfield model.
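Plugging in numbers (N = 1000 is an arbitrary example size) makes the gap between the two capacity formulas concrete:

```python
import numpy as np

N = 1000                             # network size (arbitrary example)
alpha_c = 0.14                       # critical load parameter
M_with_errors = alpha_c * N          # capacity when some retrieval errors are tolerated
M_error_free = N / (2 * np.log(N))   # capacity for almost error-free recall
print(M_with_errors, M_error_free)   # -> 140.0 and about 72.4
```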

Cohen-Grossberg Theorem

• Cohen and Grossberg (1983) showed how to assess the stability of a certain class of neural networks:

  d/dt u_j = a_j(u_j) [ b_j(u_j) − Σ_{i=1}^{N} c_ji ϕ_i(u_i) ],   j = 1, 2, ..., N.

• A neural network with the above dynamics admits a Lyapunov function defined as:

  E = (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} c_ji ϕ_i(u_i) ϕ_j(u_j) − Σ_{j=1}^{N} ∫_0^{u_j} b_j(λ) ϕ'_j(λ) dλ,

  where ϕ'_j(λ) = d/dλ (ϕ_j(λ)).

Cohen-Grossberg Theorem (cont'd)

• For the definition on the previous slide to be valid, the following conditions need to be met:
  – The synaptic weights are symmetric.
  – The function a_j(u_j) satisfies the condition for nonnegativity.
  – The nonlinear activation function ϕ_j(u_j) needs to follow the monotonicity condition:

    ϕ'_j(u_j) = d/du_j ϕ_j(u_j) ≥ 0.

• With the above,

  dE/dt ≤ 0,

  ensuring global stability of the system.

• The Hopfield model can be seen as a special case of the Cohen-Grossberg theorem.
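For the last bullet, the correspondence (as I recall it from Haykin's comparison of the two models, written for the continuous Hopfield neuron with capacitance C_j, leakage resistance R_j, and bias current I_j; treat this as a hedged summary rather than part of the slide) is roughly:

  a_j(u_j) = 1/C_j,   b_j(u_j) = −u_j/R_j + I_j,   c_ji = −w_ji,   ϕ_i(u_i) = the neuron's sigmoidal activation,

so that the Cohen-Grossberg dynamics reduce to C_j du_j/dt = −u_j/R_j + Σ_i w_ji ϕ_i(u_i) + I_j.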
Demo

• Noisy input

• Partial input

• Capacity overload

