SFML DATE 16 Lecture1 Basics Notes
1 Introduction
The first week of this course serves as an introduction to linear algebra and its applications to
neuroscience. The amount of neural data that one can collect in a single experiment nowadays
dwarfs the amount that could be collected even twenty years ago. We can now image calcium
signals from nearly all of the 100,000 neurons in a larval zebrafish brain, or, using dense silicon
probes, record spikes from thousands of neurons in the mouse brain.
The magnitude of this data boggles the mind and frustrates our intuition, so to make sense of it,
we must use mathematical tools. Linear algebra is central among these. When you fit your neural
data to behavioral or cognitive variables using a general linear model (GLM), you’re using linear
algebra. When you perform a principal component analysis (PCA) to visualize high-dimensional
neural data in 2 or 3 dimensions, you’re using linear algebra. So too, when you model neural
dynamics using linear differential equations.
The centrality of linear algebra in neural data analysis necessitates our taking a little time to
familiarize ourselves with it. Our goal is to achieve a geometric intuition for neural data structures
as objects in a space and for linear operations as manipulations of these objects (or of how we see
the space).
(35, 35, 10) − (40, 25, 15) = (35 − 40, 35 − 25, 10 − 15) = (−5, 10, −5)
Example 3. Example 2 illustrates an important idea about vectors: it really only makes sense to call two objects vectors of the same kind if you can compare them. Let us consider the case where we record spikes
from a mouse and obtain firing rates in Hz from three units, (15, 24, 8). The next day, we record
from a second mouse and obtain firing rates from three units in the same brain region, (20, 14, 12).
While these two lists can both be seen as vectors, it makes no sense for us to add them, to take a difference between them, or to combine them in any other way, because they correspond to different neurons! On the other hand, we could concatenate them into (15, 24, 8, 20, 14, 12). This
would be pooling units across animals. We could then compare how activity changes across all of
these neurons when we make an experimental manipulation.
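To make this concrete, here is a short numpy sketch (the variable names are ours; the numbers are taken from the examples above): subtracting two recordings of the same units is meaningful, while recordings from different animals can only be concatenated, not added.

import numpy as np

# Two measurements of the SAME three units (as in Example 2): subtraction is meaningful.
day1 = np.array([35, 35, 10])
day2 = np.array([40, 25, 15])
print(day1 - day2)            # [-5 10 -5], the change in firing rate of each unit

# Recordings from DIFFERENT animals: adding them makes no sense,
# but we can concatenate them into one longer vector that pools units.
mouse1 = np.array([15, 24, 8])
mouse2 = np.array([20, 14, 12])
pooled = np.concatenate([mouse1, mouse2])
print(pooled)                 # [15 24  8 20 14 12]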
Let us formalize the idea illustrated in the above example: formally, we say that the two vectors belong to two different vector spaces. We should think of a vector space as a collection of all vectors of the same "type" - vectors that can be added together, subtracted from each other, and manipulated with each other in other ways. We give the formal definition of a vector space below.
Definition 4. A vector space V is a collection of vectors, along with two arithmetic operations, vector addition and scalar multiplication, that satisfy the following 8 axioms, letting a, b, and c be vector members of V :
1. a + b = b + a (commutativity of addition)
2. (a + b) + c = a + (b + c) (associativity of addition)
3. There exists a zero vector 0 ∈ V such that a + 0 = a for every a ∈ V (additive identity)
4. For every a ∈ V there exists an element −a ∈ V such that a + (−a) = 0 (additive inverse)
5. For real number scalars α and β and v ∈ V , α (βv) = (αβ) v (associativity of multiplication)
6. 1 ∗ a = a (multiplicative identity)
7. For every real number scalar α, α (a + b) = αa + αb (distributivity of multiplication I)
8. For real number scalars α and β, (α + β) a = αa + βa (distributivity of multiplication II)
What does the dot product of two vectors tell us? It tells us how similar the vectors are. More precisely, it is related to the angle between the vectors: for vectors of fixed length, the larger the dot product, the smaller the angle. You will derive an expression of this relationship in your homework.
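As a quick numerical illustration (separate from the homework derivation, and with arbitrarily chosen vectors), the sketch below uses the relation a · b = |a||b| cos θ to recover the angle between two vectors from their dot product.

import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

dot = np.dot(a, b)
# For fixed lengths, a larger dot product means a smaller angle:
# cos(theta) = (a . b) / (|a| |b|)
cos_theta = dot / (np.linalg.norm(a) * np.linalg.norm(b))
theta = np.degrees(np.arccos(cos_theta))
print(dot, theta)   # 1.0, approximately 45 degrees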
Definition 6. A set of non-zero vectors {v1 , v2 , ..., vn } is linearly dependent if there exists one
member of the set, vi , that can be written as a linear combination of the other members:
vi = c1 v1 + c2 v2 + ... + ci−1 vi−1 + ci+1 vi+1 + ... + cn vn
where the c's are scalar multiplication factors. Since vi is non-zero, at least one of the c's must also be non-zero. Therefore, by moving vi from the left to the right side in the above equation, we can see that an alternative, equivalent definition of linear dependence is that there exists a set of scalars {c1 , c2 , ..., cn }, at least one of which is non-zero, such that

c1 v1 + c2 v2 + ... + cn vn = 0
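A practical way to test a set of vectors for linear dependence is to stack them into a matrix and compare the matrix rank to the number of vectors. The sketch below illustrates this idea with made-up vectors; it is one convenient check, not the only one.

import numpy as np

# Three vectors in R^3; v3 = v1 + 2*v2, so the set is linearly dependent.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + 2 * v2

V = np.stack([v1, v2, v3])            # each row is one vector
rank = np.linalg.matrix_rank(V)
print(rank, rank < V.shape[0])        # 2, True -> linearly dependent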
It is easy to see that {(1, 0), (0, 1)} is a basis set for R2. This basis has a special name, the standard basis, because it is the most intuitive.
This is an efficient description of a vector space because any element of the space can be written
as a linear combination of members of the basis set. Obviously, there are many different bases for
any vector space. In fact, there are infinitely many. However, all bases for a vector space must
obey a single rule - they must have the same cardinality, or number of members. The proof of this fact is beyond the scope of this course. The cardinality of a vector space's basis is known as the dimension of the vector space. So when we say that the Cartesian plane is two-dimensional, we mean that any basis of it must consist of two vectors. Similarly, the space that you, the reader, live in is three-dimensional, because to specify any point relative to an arbitrary origin, we need
to provide exactly three distances. These can be the x, y, and z Cartesian coordinates, or the
radius and two angles of spherical coordinates.
How does the idea of a basis set help us describe vector spaces? We can think of the basis set
as the coordinate system in which each member of a vector space can be specified.
Example 14. Let us return to R2 and define our coordinates using the standard basis B = {(1, 0), (0, 1)}. Then, letting e1 = (1, 0) and e2 = (0, 1), we can express any location x in R2 as

x = (α, β)_B = α e1 + β e2

In other words, we head α steps in the e1 direction and β steps in the e2 direction.
But recall that there are many different bases for any vector space. This means that there are many different coordinates with which locations in that space can be expressed. I can tell you to go 2 miles north (and 0 miles east), or I can tell you to go √2 miles northwest and √2 miles northeast. Following either set of directions, or using either set of coordinates, you would arrive at the same place. We can state this idea more rigorously in the following way.
Example 15. Let B = {(1, 0), (0, 1)} and C = {(1/√2)(1, 1), (1/√2)(−1, 1)} be two bases for R2. Taking the first coordinate to be miles east and the second to be miles north, the two vectors of C are unit vectors pointing northeast and northwest. Then

(0, 2)_B = (√2, √2)_C
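The "miles north" example can be verified numerically. In the sketch below (variable names are ours), the first coordinate is miles east and the second is miles north; the C-coordinates of a location are recovered by solving a small linear system.

import numpy as np

ne = np.array([1.0, 1.0]) / np.sqrt(2)   # unit vector pointing northeast
nw = np.array([-1.0, 1.0]) / np.sqrt(2)  # unit vector pointing northwest

# Going sqrt(2) miles northeast and sqrt(2) miles northwest...
x = np.sqrt(2) * ne + np.sqrt(2) * nw
print(x)                                  # (0, 2) -> 2 miles due north

# Conversely, recover the C-coordinates of the point (0, 2) by solving C c = x,
# where the columns of C are the basis vectors.
C = np.column_stack([ne, nw])
c = np.linalg.solve(C, np.array([0.0, 2.0]))
print(c)                                  # [1.41421356 1.41421356] = (sqrt(2), sqrt(2))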
Definition 16. A linear transformation T that maps elements of one vector space, V , to another
vector space, W , is a transformation that preserves vector addition and scalar multiplication. More
precisely, we have for vectors v1 , v2 ∈ V and a real number α:
1. T (v1 + v2 ) = T (v1 ) + T (v2 )
2. T (αv1 ) = αT (v1 )
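Any matrix acting by multiplication defines a linear transformation. The sketch below numerically checks the two defining properties for an arbitrarily chosen 2 x 2 matrix and pair of vectors.

import numpy as np

A = np.array([[2.0, -1.0],
              [0.5,  3.0]])          # T(v) = A v
v1 = np.array([1.0, 2.0])
v2 = np.array([-3.0, 0.5])
alpha = 1.7

# Property 1: T(v1 + v2) = T(v1) + T(v2)
print(np.allclose(A @ (v1 + v2), A @ v1 + A @ v2))       # True

# Property 2: T(alpha * v1) = alpha * T(v1)
print(np.allclose(A @ (alpha * v1), alpha * (A @ v1)))   # True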
Linear transformations are very important for the analysis and modeling of data, including
neuroscientific data. For example, the popular dimensionality reduction methods principal component analysis (PCA) and independent component analysis (ICA) rely on linear transformations.
Linear transformations are also used for change-of-variables methods for solving differential equa-
tions. Finally, fitting data with linear models (e.g. regression, GLMs) uses linear transformations
as well.
4.2 Matrices
We will now introduce the idea of matrices, relate matrices to linear transformations, and develop
an intuition for matrix-vector multiplication as coordinate transformation.
Let v = (v1, v2) be a vector, and a, b, c, d be real-valued scalars. Then a linear transformation of v, T(v), will yield a vector (av1 + bv2, cv1 + dv2). Recalling our definition of dot product, we can write the transformation in two equivalent ways:

T(v) = [ (a, b) · (v1, v2) ]  =  v1 [ a ]  +  v2 [ b ]
       [ (c, d) · (v1, v2) ]        [ c ]        [ d ]
Matrix multiplication is a convenient shorthand with which we can express the above operation.

T(v) = [ a  b ] [ v1 ]
       [ c  d ] [ v2 ]
Here, the array

[ a  b ]
[ c  d ]

is called a matrix. In general, we can think of any matrix A as a list of vectors ai ; each vector is a row of the matrix:

A = [ a1 ]
    [ a2 ]
    [ ...]
    [ an ]
When a matrix is multiplied with a vector, the result is a new vector whose ith element is the dot
product of the ith row of the matrix and the vector:
u = Av = [ u1 ]  =  [ a1 · v ]
         [ u2 ]     [ a2 · v ]
         [ ...]     [  ...   ]
         [ un ]     [ an · v ]
Note some potentially confusing notation here: the italic u’s are the components of the bold,
unitalicized vector u. They are not vectors but are instead real-valued scalars. On the other hand,
the bold a’s and v’s are vectors, lists of numbers.
Also note that because we must take the dot product of each matrix row with our vector, the number of columns of the matrix - that is, the length of each row - must equal the number of vector components in order for the multiplication to be defined.
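The row-by-row dot product picture can be spelled out directly in numpy. The sketch below (with an arbitrary 2 x 3 matrix) compares the built-in matrix-vector product with an explicit loop over the rows.

import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])      # 2 x 3 matrix: each row has length 3
v = np.array([2.0, -1.0, 1.0])       # 3 components, matching the row length

u_builtin = A @ v
u_by_rows = np.array([np.dot(row, v) for row in A])  # i-th entry = (i-th row) . v
print(u_builtin, np.allclose(u_builtin, u_by_rows))  # [0. 2.] True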
Finally, similarly to multiplying a matrix and a vector together, we can also multiply together
two matrices. Here, we think of the second matrix (the matrix on the right of the multiplication) as a list of vectors, in which each vector is a column of the matrix. Therefore, we require that the
number of columns of the left matrix equals the number of rows of the right matrix, again to allow
for dot products.
Example 17. Let

A = [ a1 ]
    [ a2 ]
    [ ...]
    [ an ]

and

B = [ b1  b2  ...  bm ]

Then

AB = [ a1 · b1   a1 · b2   ...   a1 · bm ]
     [ a2 · b1   a2 · b2   ...   a2 · bm ]
     [   ...       ...     ...     ...   ]
     [ an · b1   an · b2   ...   an · bm ]
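The same entry-by-entry picture can be checked numerically: entry (i, j) of AB is the dot product of row i of A with column j of B. The sketch below compares that explicit construction with the built-in product for two arbitrary matrices.

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])          # 3 x 2
B = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, -1.0]])     # 2 x 3 (columns of A = rows of B)

AB_builtin = A @ B
AB_by_dots = np.array([[np.dot(A[i], B[:, j]) for j in range(B.shape[1])]
                       for i in range(A.shape[0])])
print(np.allclose(AB_builtin, AB_by_dots))   # True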
Clearly, the first two columns of T, (1, 0) and (0, 2), are linearly independent. Furthermore, they serve as a basis for R2, so each of the remaining columns of T can be written as a linear combination of them. Therefore, the rank of T is 2.
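The matrix T itself is not reproduced in these notes, so the sketch below uses a hypothetical stand-in built the same way: its first two columns are (1, 0) and (0, 2), and its remaining columns are linear combinations of those, so the rank comes out to 2.

import numpy as np

c1 = np.array([1.0, 0.0])
c2 = np.array([0.0, 2.0])
# Extra columns that are linear combinations of c1 and c2 (hypothetical, for illustration).
T = np.column_stack([c1, c2, c1 + c2, 3 * c1 - 0.5 * c2])
print(np.linalg.matrix_rank(T))   # 2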
Finally, our view of matrix multiplication as linear transformation allows us to develop a method
for changing the coordinate system with which we navigate a vector space; such a change is also
known as a change of basis.
Example 19. Change of basis.
Suppose we have a vector v = (3, 4)_B, where B is the standard basis, B = {(1, 0), (0, 1)}. Note that

v = 3 ∗ [ 1 ]  +  4 ∗ [ 0 ]
        [ 0 ]         [ 1 ]

Can we write v using the coordinates of another basis, say C = {(1, 1), (1, −1)}? In other words, our goal is to find a linear combination of elements of C that gives us v. We wish to find real numbers u1 and u2 that solve the following equation:

v = u1 ∗ [ 1 ]  +  u2 ∗ [  1 ]
         [ 1 ]          [ −1 ]

How can we solve this equation? First, let's think back to basic algebra. If we have an equation a ∗ x = b and we wish to solve for x, to get x by itself, we can multiply both sides by the multiplicative inverse of a:

x = a⁻¹ ∗ a ∗ x = a⁻¹ ∗ b

For matrix-vector equations, the same problem-solving process applies. First, we notice that

[ 0.5   0.5 ] ∗ [ 1   1 ]  =  [ 1  0 ]
[ 0.5  −0.5 ]   [ 1  −1 ]     [ 0  1 ]

Then

[ 0.5   0.5 ] ∗ [ 1   1 ] ∗ [ u1 ]  =  [ 1  0 ] ∗ [ u1 ]  =  [ u1 ]
[ 0.5  −0.5 ]   [ 1  −1 ]   [ u2 ]     [ 0  1 ]   [ u2 ]     [ u2 ]

and

[ u1 ]  =  [ 0.5   0.5 ] ∗ [ 1  0 ] ∗ [ 3 ]  =  [  3.5 ]
[ u2 ]     [ 0.5  −0.5 ]   [ 0  1 ]   [ 4 ]     [ −0.5 ]
In general, suppose we have one vector b written with the coordinate system B and another, c, written with the coordinate system C, with both describing the same location. Then, letting B and C denote the matrices specified by the bases B and C, respectively, the following equality holds:

Bb = Cc
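This relation gives a direct recipe for changing basis numerically: given the basis matrices B and C and the coordinates b, solve Cc = Bb for c. The sketch below reproduces the numbers of Example 19.

import numpy as np

B = np.eye(2)                          # standard basis vectors as columns
C = np.array([[1.0,  1.0],
              [1.0, -1.0]])            # columns are the basis vectors of C
b = np.array([3.0, 4.0])               # coordinates of v in basis B

# Bb = Cc  =>  c = C^{-1} B b (solve the system rather than inverting explicitly)
c = np.linalg.solve(C, B @ b)
print(c)                               # [ 3.5 -0.5]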
Here, the neuron receives four excitatory inputs (shown in blue) and one inhibitory input (shown in red). The membrane potential of the neuron, and therefore its propensity to fire, is a function of these inputs. Let us call the inputs i1 through i5. Each input depends on two things - the activity of the presynaptic neuron, and the strength of the synaptic connection between the presynaptic and postsynaptic neuron. Taking the overall drive, d, to the postsynaptic neuron to be the sum of the five inputs, we have
d = i1 + i2 + i3 + i4 + i5
In shorthand, we can write
d = Σ_{j=1}^{5} ij
If we call the activity of the jth presynaptic neuron aj and the synaptic strength or weight wj, then we have

d = Σ_{j=1}^{5} ij = Σ_{j=1}^{5} aj wj
Notice that this second sum is simply a dot product of two vectors: the activity vector, which specifies the activities of all presynaptic neurons, and the weight vector, which specifies the strength of each presynaptic neuron's synapse onto the postsynaptic neuron:

d = [ w1  w2  w3  w4  w5 ] ∗ [ a1 ]
                             [ a2 ]
                             [ a3 ]
                             [ a4 ]
                             [ a5 ]
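As a final numerical sketch (the weights and activities below are invented for illustration), the total drive is just this dot product of the weight vector with the activity vector.

import numpy as np

w = np.array([0.5, 1.0, 0.25, 0.75, -1.5])   # synaptic weights; the negative entry is the inhibitory input
a = np.array([10.0, 4.0, 20.0, 4.0, 4.0])    # presynaptic firing rates (arbitrary made-up values)

d = np.dot(w, a)                             # total drive to the postsynaptic neuron
print(d)                                     # 5 + 4 + 5 + 3 - 6 = 11.0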