Slide 08
Haykin Chapter 14: Neurodynamics
(3rd Ed. Chapter 13)

CPSC 636-600
Instructor: Yoonsuck Choe
Spring 2015

• Inclusion of feedback gives temporal characteristics to neural networks: recurrent networks.
• Two ways to add feedback:
  – Local feedback
  – Global feedback
• Recurrent networks can become unstable or stable.
Autonomous vs. Non-autonomous Dynamical Systems

State Space

[Figure: a trajectory in the (x1, x2) state space from t = 0 to t = 1, with the tangent vector dx/dt drawn at the initial condition.]

• It is convenient to view the state-space equation dx/dt = F(x) as describing the motion of a point in N-dimensional space (Euclidean or non-Euclidean). Note: t is continuous!
• The points traversed over time are called the trajectory or the orbit.
• The tangent vector shows the instantaneous velocity at the initial condition.

F as a Vector Field

• Since dx/dt can be seen as a velocity, F(x) can be seen as a velocity vector field, or simply a vector field.
• In a vector field, each point x in space is associated with one unique vector (direction and magnitude); in a scalar field, each point has one scalar value.
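To make the state-space picture concrete, here is a minimal sketch (my own illustration, not from the slides; the two-dimensional F, step size, and initial condition are arbitrary choices) that treats F(x) as a velocity field and traces an orbit by forward-Euler integration of dx/dt = F(x):

```python
import numpy as np

def F(x):
    """Example vector field (a damped oscillator); any smooth F(x) would do."""
    x1, x2 = x
    return np.array([x2, -x1 - 0.5 * x2])

def trajectory(x0, dt=0.01, steps=1000):
    """Trace the orbit of dx/dt = F(x) from x0 using forward Euler."""
    orbit = [np.array(x0, dtype=float)]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append(x + dt * F(x))   # move along the velocity F(x) for a small time dt
    return np.array(orbit)

orbit = trajectory([1.0, 0.0])
print(orbit[0], orbit[-1])            # initial condition and the state near t = 10
```

Plotting the rows of `orbit`, together with arrows of F(x) on a grid, reproduces the trajectory-plus-vector-field picture of the phase-portrait slide below.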
Phase Portrait and Vector Field

• Red curves show the state (phase) portrait, represented by trajectories from different initial points.
• The blue arrows in the background show the vector field.

Source: http://www.math.ku.edu/~byers/ode/b_cp_lab/pict.html

Conditions for the Solution of the State Space Equation

• A unique solution to the state-space equation exists only under certain conditions, which restricts the form of F(x).
• If

      ‖F(x) − F(u)‖ ≤ K‖x − u‖

  for some constant K, F(x) is said to be Lipschitz, and K is called the Lipschitz constant for F(x).
  – If the partial derivatives ∂Fi/∂xj are finite everywhere, F(x) meets the Lipschitz condition.
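As a rough numerical probe of the Lipschitz condition (my own sketch, not part of the slides), one can sample random pairs (x, u) and compute ‖F(x) − F(u)‖ / ‖x − u‖; the largest ratio observed is a lower bound on any valid Lipschitz constant K over the sampled region:

```python
import numpy as np

def F(x):
    # Same example vector field as in the earlier sketch; substitute the system of interest.
    x1, x2 = x
    return np.array([x2, -x1 - 0.5 * x2])

rng = np.random.default_rng(0)
ratios = []
for _ in range(10_000):
    x = rng.uniform(-5, 5, size=2)
    u = rng.uniform(-5, 5, size=2)
    dist = np.linalg.norm(x - u)
    if dist > 1e-12:                       # avoid dividing by (near) zero
        ratios.append(np.linalg.norm(F(x) - F(u)) / dist)

print("empirical lower bound on K:", max(ratios))
```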
Stability of Equilibrium States

• Linearizing the system around an equilibrium state x̄ (writing x(t) = x̄ + ∆x(t)) gives d∆x(t)/dt ≈ A ∆x(t), where A is the Jacobian:

      A = ∂F(x)/∂x |_{x = x̄}

Stability in the Linearized System

• The eigenvalues of the matrix A characterize different classes of behaviors.
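A small sketch of the linearization recipe (again my own illustration, using the same example F): estimate the Jacobian A = ∂F/∂x at the equilibrium x̄ by finite differences and inspect the eigenvalues of A; if all real parts are negative, the equilibrium of the linearized system is asymptotically stable.

```python
import numpy as np

def F(x):
    x1, x2 = x
    return np.array([x2, -x1 - 0.5 * x2])   # F(0) = 0, so x̄ = 0 is an equilibrium

def jacobian(F, xbar, eps=1e-6):
    """Central-difference estimate of A = dF/dx evaluated at xbar."""
    n = len(xbar)
    A = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        A[:, j] = (F(xbar + e) - F(xbar - e)) / (2 * eps)
    return A

A = jacobian(F, np.zeros(2))
eig = np.linalg.eigvals(A)
print("eigenvalues of A:", eig)
print("all Re < 0 (stable):", bool(np.all(eig.real < 0)))
```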
The equilibrium state x̄ is said to be:

• Uniformly stable if, for an arbitrary ε > 0, there exists a positive δ such that ‖x(0) − x̄‖ < δ implies ‖x(t) − x̄‖ < ε for all t > 0.
• Convergent if there exists a positive δ such that ‖x(0) − x̄‖ < δ implies x(t) → x̄ as t → ∞.
• Asymptotically stable if it is both stable and convergent.
• Globally asymptotically stable if it is stable and all trajectories of the system converge to x̄ as time t approaches infinity.

Lyapunov's Theorems

• Theorem 1: The equilibrium state x̄ is stable if in a small neighborhood of x̄ there exists a positive definite function V(x) such that its derivative with respect to time is negative semidefinite in that region.
• Theorem 2: The equilibrium state x̄ is asymptotically stable if in a small neighborhood of x̄ there exists a positive definite function V(x) such that its derivative with respect to time is negative definite in that region.
• A scalar function V(x) that satisfies these conditions is called a Lyapunov function for the equilibrium state x̄.
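As a standard illustration of Theorem 2 (not from the slides): for the linear system dx/dt = −x with equilibrium x̄ = 0, take V(x) = ‖x‖². Then V is positive definite, and dV/dt = 2xᵀ(dx/dt) = −2‖x‖² is negative definite in any neighborhood of the origin, so the origin is asymptotically stable and V is a Lyapunov function for x̄ = 0.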
Neurodynamical Models

We will focus on systems whose state variables are continuous-valued and whose dynamics are expressed in differential or difference equations.

Properties:

• Large number of degrees of freedom
• Nonlinearity
• Dissipative (as opposed to conservative), i.e., an open system
• Noise

Manipulation of Attractors as a Recurrent Network Paradigm

• We can identify attractors with computational objects (associative memories, input-output mappers, etc.).
• In order to do so, we must exercise control over the locations of the attractors in the state space of the system.
• A learning algorithm will manipulate the equations governing the dynamical behavior so that the attractors are set at desired locations.
• One good way to do this is to use the energy-minimization paradigm (e.g., by Hopfield).
• Starting with an initial pattern, the dynamics will converge toward the attractor of the basin of attraction in which the initial pattern was placed.
Content-Addressable Memory

• Map a set of patterns to be memorized ξµ onto fixed points xµ in the dynamical system realized by the recurrent network.

Hopfield Model: Storage

• The learning is similar to Hebbian learning (a small sketch follows below):

      W = (1/N) Σ_{µ=1}^{M} ξµ ξµᵀ − (M/N) I
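A minimal NumPy sketch of this storage rule (my own illustration; the pattern count M and size N below are arbitrary):

```python
import numpy as np

def store(patterns):
    """Hopfield storage: W = (1/N) * sum_mu xi_mu xi_mu^T - (M/N) * I."""
    M, N = patterns.shape                 # M bipolar (+/-1) patterns of length N
    W = patterns.T @ patterns / N         # (1/N) * sum of outer products
    W -= (M / N) * np.eye(N)              # removes the self-connections (zero diagonal)
    return W

rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(3, 100))     # M = 3, N = 100
W = store(patterns)
print(W.shape, np.allclose(np.diag(W), 0))        # N x N weights, zero diagonal
```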
• Retrieval: initialize the network with the probe pattern,

      xj(0) = ξprobe,j

• Update the output of each neuron (picking neurons at random) as

      xj(n + 1) = sgn( Σ_{i=1}^{N} wji xi(n) )

  until x reaches a fixed point (see the sketch below).

• For a large number of patterns M, the weight matrix is degenerate, i.e., several eigenvectors can have the same eigenvalue.
• These eigenvectors form a subspace; when the associated eigenvalue is 0, it is called a null space.
• This is due to M being smaller than the number of neurons N.
• Hopfield network as content-addressable memory:
  – The discrete Hopfield network acts as a vector projector (it projects the probe vector onto the subspace spanned by the training patterns).
  – The underlying dynamics drive the network to converge to one of the corners of the unit hypercube.
• Spurious states are those corners of the hypercube that do not belong to the training pattern set.
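And a sketch of the asynchronous retrieval dynamics described above (my own illustration; it reuses the storage rule and corrupts a stored pattern to form the probe):

```python
import numpy as np

def recall(W, probe, max_sweeps=100, rng=None):
    """Asynchronous updates x_j <- sgn(sum_i w_ji x_i) until a fixed point is reached."""
    if rng is None:
        rng = np.random.default_rng()
    x = probe.copy()
    for _ in range(max_sweeps):
        changed = False
        for j in rng.permutation(len(x)):          # pick neurons in random order
            new = 1 if W[j] @ x >= 0 else -1       # sign of the local field of neuron j
            if new != x[j]:
                x[j] = new
                changed = True
        if not changed:                            # no neuron changed: fixed point
            break
    return x

rng = np.random.default_rng(1)
patterns = rng.choice([-1, 1], size=(3, 100))      # M = 3 stored patterns, N = 100
M, N = patterns.shape
W = patterns.T @ patterns / N - (M / N) * np.eye(N)

probe = patterns[0].copy()
probe[rng.choice(N, size=10, replace=False)] *= -1   # flip 10 of the 100 bits
x = recall(W, probe, rng=rng)
print("overlap with the stored pattern:", x @ patterns[0] / N)   # 1.0 means perfect recall
```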
Storage Capacity of Hopfield Network

• Given a probe equal to the stored pattern ξν, the activation of the j-th neuron can be decomposed into a signal term and a noise term:

      vj = Σ_{i=1}^{N} wji ξν,i
         = (1/N) Σ_{µ=1}^{M} Σ_{i=1}^{N} ξµ,j ξµ,i ξν,i
         = ξν,j + (1/N) Σ_{µ=1, µ≠ν}^{M} Σ_{i=1}^{N} ξµ,j ξµ,i ξν,i

  where the first term is the signal (note ξν,j³ = ξν,j, since ξν,j ∈ {±1}) and the second term is the noise.

• The signal-to-noise ratio is defined as

      ρ = variance of signal / variance of noise = 1 / ((M − 1)/N) ≈ N/M

• The reciprocal of ρ, called the load parameter, is designated α. According to Amit and others, this value needs to be less than the critical value αc = 0.14.

Storage Capacity of Hopfield Network (cont'd)

• Given αc = 0.14, the storage capacity becomes

      Mc = αc N = 0.14 N

  when some error is allowed in the final patterns.

• For almost error-free performance, the storage capacity becomes

      Mc = N / (2 loge N)

• Thus, the storage capacity of the Hopfield network scales less than linearly with the size N of the network.
• This is a major limitation of the Hopfield model.
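For example (my own arithmetic, not on the slide): with N = 1000 neurons, Mc = 0.14N ≈ 140 patterns can be stored if some recall errors are tolerated, but only about N / (2 loge N) ≈ 1000 / 13.8 ≈ 72 patterns for almost error-free recall.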
Recall in the Hopfield network is typically demonstrated under several conditions:

• Noisy input
• Partial input
• Capacity overload