SFML DATE 18 Lecture2 Eigen Notes
I asked: “What does mathematics mean to you?” And some people answered: “The
manipulation of numbers, the manipulation of structures.” And if I had asked what
music means to you, would you have answered: “The manipulation of notes”? - Serge
Lang
1 Introduction
1.1 Previously...
In the previous class, Alex taught you all about the basics of vectors and matrices, as well as the mathematical operations used to manipulate them. As Alex pointed out, vectors and matrices are vital tools for making sense of neural data.
In this class, we will begin to explore some fundamental properties of matrices that tell us very
important information about the structure of those matrices.
[**Important Note**]: an intuitive understanding of these topics is best achieved through geometry and pictures, which were presented in class. At some point I will incorporate picture examples within this document, but for now, refer to the accompanying "chalkboard" notes (and take your own notes!).
Where we previously had used the basis B to describe specific locations in space, we now must use the new coordinate system defined by the matrix A.
This means that the vector $v = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$ is also transformed according to this matrix A, and that transformation can be summarized in the following matrix-vector multiplication:
$$Av = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} (3)(3) + (1)(4) \\ (0)(3) + (2)(4) \end{bmatrix} = \begin{bmatrix} 13 \\ 8 \end{bmatrix}$$
In sum, you can think of a matrix as reshaping the whole space via operations on its basis vectors.
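To confirm the arithmetic, here is a minimal check in MATLAB (or Octave), using the same A and v as above:

% The transformation matrix and the vector from the example above
A = [3 1; 0 2];
v = [3; 4];

% Matrix-vector multiplication; should display [13; 8]
A * v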
Now, think of what this transformation does to other vectors on the grid. In particular, think of the span of a vector. To remind you, a simple intuition for the span is that it is the set of all vectors that are scalar multiples of that vector, i.e., the line that runs through that vector. Most vectors, when undergoing a transformation, will be knocked off their span.
Example 2. To illustrate this numerically, simply apply the transformation to the vector (multiply the matrix by the vector), and observe that the resulting vector is not a scalar multiple of the original vector. We will demonstrate with the vector $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$:
$$Av = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}$$
However, some really special vectors don't get knocked off their span. How do we know that? Well, if we apply the transformation to such a vector, what we get is a simple rescaling of that vector, which, by definition, means that it remains on its span.
Example 3. For example, the vector $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ does not get knocked off its span:
$$Av = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix}$$
Notice that the resulting vector is just the original vector scaled by a factor of 2: $\begin{bmatrix} 2 \\ -2 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ -1 \end{bmatrix}$.
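To see the difference between these two cases numerically, here is a minimal MATLAB (or Octave) check using the same A as above:

A = [3 1; 0 2];

% [1; 1] is knocked off its span: the result [4; 2] is not a scalar multiple of [1; 1]
A * [1; 1]

% [1; -1] stays on its span: the result [2; -2] is just 2 * [1; -1]
A * [1; -1]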
Guess what? You've basically just understood eigenvalues and eigenvectors. An eigenvector of a matrix A is a vector v that stays on its span under the transformation, being merely rescaled by some scalar λ, its eigenvalue:
Av = λv (1)
Therefore, finding the eigenvectors and eigenvalues of the matrix A comes down to finding the
values of v and λ that satisfy this equation.
To do so, we first rewrite the right-hand side using the identity matrix I:
Av = λIv (2)
(which nicely balances out both sides as matrix-vector products), and then move the right-hand-side term over to the left and factor out the v:
(A − λI)v = 0 (3)
Notice that this new equation essentially gives us a new matrix: the original matrix with the eigenvalue λ subtracted from each diagonal entry. We have to find the v that, when multiplied by this new matrix, gives us zero. But before we do that, let's think a little bit about what this equation means.
First, notice that if v is 0, the equation is always satisfied. But the zero vector is, by definition, not counted as an eigenvector, and it is kind of uninteresting anyway. So we want to look for non-zero vectors v that satisfy this equation. Notice that to solve for v, we first have to solve for λ, which presents another unknown.
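As a concrete check of Eq. 3, take the matrix A and the vector from Example 3, which we saw was rescaled by a factor of 2 (so λ = 2 there); then (A − λI)v should give the zero vector. A minimal MATLAB (or Octave) sketch:

A = [3 1; 0 2];
v = [1; -1];
lambda = 2;                    % the scaling factor we observed in Example 3

% (A - lambda*I) times v should display the zero vector [0; 0]
(A - lambda * eye(2)) * v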
2.4 Determinants
We have to take a bit of a digression in order to proceed. The determinant is again one of those
concepts that’s a bit hard to grasp and is usually presented unintuitively. But a simple, intuitive
way to think about the determinant is that it describes the change in area that occurs when applying a matrix transformation.
Example 4. For example, here is a matrix that stretches space 3 units wide and 4 units tall:
$$A = \begin{bmatrix} 3 & 0 \\ 0 & 4 \end{bmatrix}$$
The resulting "area" of the expanded square is (3)(4) = 12, and det(A) = 12. Of course, this is a very simple scaling matrix.
Example 5. Another simple example that instead "squishes" space to zero is a matrix transformation that maps all vectors onto a single line:
$$A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}$$
This "squishing" into a lower-dimensional space also corresponds to the determinant of the matrix being zero.
Other matrices will transform the space such that the area is not so easy to calculate. Luckily
for us, the determinant of any 2x2 matrix is pretty easy to calculate:
Definition 2. Formally, the determinant det(A) of a 2x2 matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is
$$\det(A) = ad - bc$$
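A quick check of this formula against Examples 4 and 5, using MATLAB's built-in det function (also available in Octave):

% Example 4: ad - bc = (3)(4) - (0)(0) = 12
det([3 0; 0 4])

% Example 5: ad - bc = (1)(2) - (2)(1) = 0, so this matrix squishes space onto a line
det([1 2; 1 2])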
So back to the original objective: we want to find eigenvalues λ that, when plugged into this equation, make the determinant of this new matrix 0. Why? Because a determinant of 0 means that this matrix transformation squishes all the vectors onto a lower dimension (a line), and that is the only way a non-zero eigenvector times this matrix (i.e., (A − λI)v) can equal 0 (which, remember, is our original goal: to find an eigenvector v that satisfies Eq. 1). Now let's try with an example:
Example 6. Let $A = \begin{bmatrix} 2 & 1 \\ 2 & 3 \end{bmatrix}$. Find the eigenvectors v and eigenvalues λ of A.
Part 1. Find (A − λI):
$$(A - \lambda I) = \begin{bmatrix} 2 - \lambda & 1 \\ 2 & 3 - \lambda \end{bmatrix}$$
Part 2. Find the eigenvalues λ that make det(A − λI) = 0:
$$\det(A - \lambda I) = (2 - \lambda)(3 - \lambda) - (1)(2) = \lambda^2 - 5\lambda + 4 = (\lambda - 4)(\lambda - 1) = 0$$
Therefore, our eigenvalues are λ = 4 and λ = 1.
Part 3. Plug each eigenvalue into (A − λI)v = 0 and solve for v:
For λ = 4:
$$(A - \lambda I)v = \begin{bmatrix} 2-4 & 1 \\ 2 & 3-4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$
$$= \begin{bmatrix} -2 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$
$$= \begin{bmatrix} -2 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$
Notice that between lines 2 and 3, I applied the following row operation in order to make the linear system easier to solve: row2 = row2 + row1.
This is a technique called Gaussian Elimination for solving linear systems of equations.
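If you would rather let the computer do the row reduction, MATLAB's rref function (also in Octave) returns the reduced row echelon form of a matrix, which encodes the same relationship between x1 and x2:

% Row-reduce (A - 4I); the result [1 -0.5; 0 0] says x1 = 0.5*x2, i.e., x2 = 2*x1
rref([-2 1; 2 -1])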
However, because the purpose of learning about eigenvalues and eigenvectors is not so you can spend your days calculating them by hand, we'll just do it once. In the future, know that the [V,D] = eig(A) function in MATLAB will return the eigenvectors in the columns of the matrix V, and the eigenvalues on the diagonal of the matrix D.
Now we can solve the system easily.
$$-2x_1 + 1x_2 = 0$$
$$x_2 = 2x_1$$
Here, we see that any x1 and x2 that satisfy this relationship can constitute our eigenvector. For simplicity, let's set x1 = 1. Then x2 = 2. So now we've found our first eigenvector, with corresponding λ = 4:
$$v_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
Now, one more eigenvector to find.
For λ = 1:
$$(A - \lambda I)v = \begin{bmatrix} 2-1 & 1 \\ 2 & 3-1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$
$$= \begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$
$$= \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$
$$1x_1 + 1x_2 = 0$$
$$x_1 = -x_2$$
Again, any x1 and x2 that satisfy this relationship can constitute our eigenvector. Let's set x1 = 1, making x2 = −1. Now we've found our second eigenvector, with corresponding λ = 1:
$$v_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$
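As a sanity check on this worked example, here is the eig function mentioned above applied to our matrix. Note that this is just a quick sketch: eig normalizes each eigenvector to unit length and does not guarantee an ordering, so the columns of V will be unit-length scalings of [1; 2] and [1; −1], possibly in either order.

A = [2 1; 2 3];

% D is a diagonal matrix holding the eigenvalues (1 and 4);
% each column of V is the corresponding (normalized) eigenvector
[V, D] = eig(A)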
If a matrix has eigenvalues λ > 1, then the space will continue to be stretched and stretched and stretched in the particular direction of the corresponding eigenvector. But if a matrix has eigenvalues λ < 1, then the space will shrink and shrink and shrink in the particular direction of the corresponding eigenvector. In this way, eigenvalues can describe the dynamics of a system, how it evolves over time, and whether it is stable (λ < 1) or unstable (λ > 1).
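Here is a small numerical illustration of that idea in MATLAB (or Octave). The matrix below is a made-up example, chosen so that one eigenvalue is greater than 1 and the other is less than 1; repeatedly applying it stretches space along one eigenvector while shrinking it along the other.

% Hypothetical matrix with eigenvalues 0.5 and 2 (eigenvectors [1; 0] and [0; 1])
B = [0.5 0; 0 2];
x = [1; 1];                    % has a component along both eigenvectors

% Apply the transformation repeatedly: the [1; 0] component shrinks toward zero
% while the [0; 1] component keeps growing
for t = 1:5
    x = B * x;
end
x                              % approximately [0.03; 32]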
To bring this back to an application to neuroscience, let us consider a network of 3 neurons, N1, N2, and N3. They are all connected to each other, and the connection strengths between them can be summarized in a matrix of synaptic weights W, whose ij-th entry Wij describes the synaptic weight between neuron Ni and neuron Nj. This matrix is symmetric, meaning that the ij-th element equals the ji-th element.
If we go back to our idea of a matrix as a transformation of coordinate space, what does this matrix of synaptic weights, W, tell us? More specifically, what do the eigenvalues and eigenvectors of W tell us?
Suppose we deliver an input to each neuron, summarized in a vector u. How can we predict the activity of the three neurons, N1, N2, and N3, in response to this input? To calculate the total synaptic drive to each of the three neurons, we take the product of the synaptic weight matrix with the vector of inputs, Wu. Because each row of the matrix defines the connectivity from one neuron to the others, the dot product of one row of W with the column vector of inputs u "sums up" the synaptic drive from all inputs to that one neuron, giving us its activity. Doing this for all three rows will give us the resulting activity of all three neurons in response to that input:
$$Wu = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \\ w_{31} & w_{32} & w_{33} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \begin{bmatrix} w_{11}u_1 + w_{12}u_2 + w_{13}u_3 \\ w_{21}u_1 + w_{22}u_2 + w_{23}u_3 \\ w_{31}u_1 + w_{32}u_2 + w_{33}u_3 \end{bmatrix}$$
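As a concrete illustration, here is what that product looks like in MATLAB (or Octave) for three neurons. The numbers in W and u are made up purely for this example.

% Hypothetical symmetric synaptic weight matrix (W(i,j) = W(j,i)) and input vector
W = [0.2 0.5 0.1;
     0.5 0.3 0.4;
     0.1 0.4 0.2];
u = [1; 0; 2];

% Each entry of W*u is the summed synaptic drive to one neuron
activity = W * u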
Now imagine this is a recurrent process, such that your inputs cause a chain-reaction of events
where the output activity from the previous timestep becomes the input at the next timestep. This
is like watching neural activity change over time:
$$u_{t+1} = W u_t$$
Now we can begin to see what the eigenvalues of W are telling us: if an eigenvalue λ < 1, we know that the neural activity in the direction of the corresponding eigenvector will decay over time. If λ > 1, we know that the neural activity in the direction of the corresponding eigenvector will explode over time. This foreshadows what John will teach you in a couple of lectures about differential equations, dynamical systems, and fixed points!
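To close the loop, here is a small simulation of that recurrent update, u(t+1) = W*u(t), alongside the eigenvalues of W. The weights and initial input are the same made-up numbers as above, so treat this as a sketch of the idea rather than a realistic network model.

% Hypothetical symmetric weight matrix and initial activity
W = [0.2 0.5 0.1;
     0.5 0.3 0.4;
     0.1 0.4 0.2];
u = [1; 0; 2];

% The eigenvalues of W tell us which directions of activity grow or decay;
% for these made-up weights they all have magnitude less than 1
[V, D] = eig(W);
disp(diag(D))

% Run the recurrence u(t+1) = W*u(t) for a number of timesteps
for t = 1:20
    u = W * u;
end

% The activity decays overall, and its direction is dominated by the eigenvector
% whose eigenvalue has the largest magnitude (compare u/norm(u) to the columns of V)
disp(u / norm(u))
disp(norm(u))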