Sparse Matrix Operations
David F. Gleich
September 5, 2022
Learning objectives: Learn how to perform basic operations with sparse matrices using the Candyland matrix as an example.

1 intro to candyland
As we mentioned, there are many real-world problems that involve sparse matrices. In
a few lectures, we’ll see how discretizations of the Laplacian operator on 2d grids will give us
sparse matrices. For this class, we are going to continue working with random processes.
The game of Candyland is played on 133 squares. At each turn, a player draws a card
from a deck of cards. This determines where they move to for the next turn. There is no
interaction with other players (other than sharing the same deck). For our study here,
we are going to model the game where we simply draw a random card from the total
set at each time, so there is no memory at all in the game. This means that the resulting
system is a Markov chain, or a memoryless stochastic process. While there is a great deal
of formality we can get into with all of these things, the key thing to remember is that what
happens at each step can just be described by a matrix that tells you the probability of what
happens next.
1.1 THE CANDYLAND MODEL
So we are going to create the matrix for Candyland that gives the probabilities of moving between the 133 squares, along with two special squares: one for the start (state 140) and one for the destination (state 134). There are also a set of 5 special states that involve exceptions to the rules (states 135, 136, 137, 138, 139).
In this case, the game of Candyland can be modeled with a 140 × 140 matrix T. (The data files to recreate T are available on the course website.) If we show the matrix with a small ● for each nonzero entry, then it looks like the figure below.
[Figure: the nonzero pattern of the 140 × 140 matrix T, drawn with a ● for each nonzero entry. (Margin note from the original: TODO – double check this one isn’t transposed.)]
This is clearly sparse as most of the matrix is empty. This is because it’s impossible to get
between most pairs of squares in Candyland in a single move.
Let T = [t_1 t_2 ... t_140] be the column-wise partition, where t_j describes the probability of ending up at each of the 140 states given that we are in state j. Put another way, T(i, j) is the probability of moving to state i given that you are in state j. Consequently, after one step of the game, the probability that the player is in any state can be read off from t_140. This is because the player starts in state 140.
Now, what’s the probability of being in any state after two steps? We can use the matrix to work out this probability:

    p_2 = T t_140

is the probability of the player being in any state after two steps. This is just a matrix-vector operation!
Now to figure out where the player is after any number of steps, we proceed iteratively:

    p_3 = T p_2 = T^2 t_140.

By induction, we get that the probability the player is in any state after k steps is

    p_k = T^(k-1) t_140.
The key point: in order to compute this probability, we only need to compute
matrix-vector products with a sparse matrix.
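As a concrete illustration (not code from the notes; the function name state_probabilities and the hard-coded start state are just for this sketch), this is what the iteration looks like in Julia:

    # Compute p_k = T^(k-1) t_140 = T^k e_140 by repeated matrix-vector
    # products, never forming a power of T explicitly. T can be dense or
    # sparse; only the product T*p is needed.
    function state_probabilities(T, k)
        n = size(T, 1)
        p = zeros(n)
        p[140] = 1.0          # e_140: the player begins in the start state
        for _ = 1:k
            p = T*p           # one step of the game = one matrix-vector product
        end
        return p
    end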
1.2 COMPUTING EXPECTED LENGTH OF A CANDYLAND GAME
The Candyland game ends when the player is in state 134 in this particular model. Let X be the random variable that is the length of the Candyland game. Then we want to compute the expected value of X. Recall that the expected value of a discrete random variable is

    E[X] = ∑_{k ≥ 1} k ⋅ Pr[X = k],

and in this model the probability that the game ends at step k is the probability of being in state 134 after k steps, that is, Pr[X = k] = [T^(k-1) t_140]_134.
In practice, we can’t run this sum until infinity, even though the game could, in theory, last a very long time. We can compute this via the following algorithm.

    Create a starting vector p = e_140 because we start in state 140.
    EX ← 0
    For length = 1 to the maximum game length considered
        p ← Tp
        EX ← EX + length ⋅ p_134
    return EX

The key algorithm step is to compute the matrix-vector product Tp. In Julia, the code is

    function candylandlength(T, maxlen)
        n = size(T, 1)
        p = zeros(n)
        p[140] = 1.0               # start in state 140
        ex = 0.0
        for len = 1:maxlen
            p = T*p                # one step of the game
            ex += len*p[134]       # accumulate len ⋅ Pr[game ends at step len]
        end
        return ex
    end
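For example (hypothetical usage; it assumes T has been loaded as the 140 × 140 sparse matrix from the course data files, and the cutoffs are arbitrary):

    ex100 = candylandlength(T, 100)
    ex300 = candylandlength(T, 300)
    # Comparing the two cutoffs gives a sense of whether the truncated sum
    # has effectively converged.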
2 sparse matrix storage: storing only the valid transitions for candyland and performing a matrix-vector product
The idea with sparse matrix storage is that we only store the nonzero entries of the
matrix. Anything that is not stored is assumed to be zero. This is illustrated in the following
figure.
    ⎡ 0 16 13  0  0  0 ⎤
    ⎢ 0  0 10 12  0  0 ⎥
    ⎢ 0  4  0  0 14  0 ⎥
    ⎢ 0  0  9  0  0 20 ⎥
    ⎢ 0  0  0  7  0  4 ⎥
    ⎣ 0  0  0  0  0  0 ⎦

Indexed storage:
    I = [  3  3  5  5  2  1  2  4  4  1 ]
    J = [  5  2  6  4  3  3  4  6  3  2 ]
    V = [ 14  4  4  7 10 13 12 20  9 16 ]
The arrays I, J, and V store the row index, column index, and nonzero value associated
with each nonzero entry in the matrix. There are 30 values in the arrays, whereas storing
all the entries in the matrix would need 36 values. Although this isn’t a particularly large
difference, it is less data.
For the matrix T in the Candyland problem, there are 6816 entries in the arrays I, J, V
whereas there would be 19600 entries in the matrix T had we stored all the zeros.
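To make the indexed storage concrete, here is a small sketch (not from the notes; the name dense_to_indexed is just for illustration) that extracts the I, J, V arrays from a dense matrix:

    # Collect a (row, column, value) triplet for every nonzero entry.
    function dense_to_indexed(A)
        I = Int[]
        J = Int[]
        V = eltype(A)[]
        m, n = size(A)
        for j = 1:n, i = 1:m           # any ordering of the nonzeros is allowed
            if A[i, j] != 0
                push!(I, i); push!(J, j); push!(V, A[i, j])
            end
        end
        return I, J, V
    end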
We can use this data structure to implement a matrix-vector product. Recall that

    y = Ax  means that  y_i = ∑_{j = 1, ..., n} A_{i,j} x_j  for all i.

If A_{i,j} = 0 then it plays no role in the final summation and we can write the equivalent expression:

    y = Ax  means that  y_i = ∑_{j where A_{i,j} ≠ 0} A_{i,j} x_j  for all i.

This means that an algorithm simply has to implement this accumulation over all nonzero entries in the matrix. This is exactly what is stored in the arrays I, J, V.
The algorithm in Julia is:

    function indexed_sparse_matrix_vector_product(x,I,J,V,m,n)
        y = zeros(m)
        for nzi=1:length(I)
            i,j,v = I[nzi], J[nzi], V[nzi]
            y[i] += v*x[j]
        end
        return y
    end
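As a quick check (an illustrative example, not from the notes), running the routine on the I, J, V arrays from the figure above with x = ones(6) returns the row sums of the example matrix:

    I = [3, 3, 5, 5, 2, 1, 2, 4, 4, 1]
    J = [5, 2, 6, 4, 3, 3, 4, 6, 3, 2]
    V = [14, 4, 4, 7, 10, 13, 12, 20, 9, 16]
    y = indexed_sparse_matrix_vector_product(ones(6), I, J, V, 6, 6)
    # y == [29.0, 22.0, 18.0, 29.0, 11.0, 0.0], the row sums of the example matrix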
    ⎡ 0 16 13  0  0  0 ⎤
    ⎢ 0  0 10 12  0  0 ⎥
    ⎢ 0  4  0  0 14  0 ⎥
    ⎢ 0  0  9  0  0 20 ⎥
    ⎢ 0  0  0  7  0  4 ⎥
    ⎣ 0  0  0  0  0  0 ⎦

Indexed storage sorted by column:
    I = [  1  3  1  2  4  2  5  3  4  5 ]
    J = [  2  2  3  3  3  4  4  5  6  6 ]
    V = [ 16  4 13 10  9 12  7 14 20  4 ]

Compressed sparse column:
    colptr = [  1  1  3  6  8  9 11 ]
    rowval = [  1  3  1  2  4  2  5  3  4  5 ]
    nzval  = [ 16  4 13 10  9 12  7 14 20  4 ]
This figure shows that, when the data are sorted by increasing column index, there are multiple values with the same column index in adjacent entries of the J array. We can compress these into a list of pointers. This means that we create a new array called colptr that stores the starting index of all the entries in the I, V arrays associated with a given column. Entries of column j are stored in

    rowval[colptr[j]] ... rowval[colptr[j+1] − 1]
    nzval[colptr[j]] ... nzval[colptr[j+1] − 1].

This means if colptr[j] = colptr[j+1] then there are no entries in the column. (See the example in column 1.)
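To make the compression step concrete, here is one way to build colptr, rowval, and nzval from the indexed arrays (an illustrative sketch; the name indexed_to_csc is ours, and this is not necessarily how any particular library does it):

    # Counting sort of the (I, J, V) triplets by column index.
    function indexed_to_csc(I, J, V, n)
        nz = length(I)
        counts = zeros(Int, n)              # number of nonzeros in each column
        for nzi = 1:nz
            counts[J[nzi]] += 1
        end
        colptr = ones(Int, n+1)             # colptr[j+1] - colptr[j] = counts[j]
        for j = 1:n
            colptr[j+1] = colptr[j] + counts[j]
        end
        rowval = zeros(Int, nz)
        nzval = zeros(eltype(V), nz)
        next = copy(colptr)                 # next free slot within each column
        for nzi = 1:nz
            j = J[nzi]
            rowval[next[j]] = I[nzi]
            nzval[next[j]] = V[nzi]
            next[j] += 1
        end
        return colptr, rowval, nzval
    end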
This structure enables efficient iteration over the elements of the matrix for matrix-
vector products, just like indexed storage, with only minimal changes to the loop. In Julia,
the algorithm is:6 6
This algorithm is especially particular to
using 0 based or 1-based indexing.
    function csc_sparse_matrix_vector_product(x,colptr,rowval,nzval,m,n)
        y = zeros(m)
        for j=1:n
            for nzi=colptr[j]:colptr[j+1]-1       # entries of column j
                i,v = rowval[nzi], nzval[nzi]
                y[i] += v*x[j]
            end
        end
        return y
    end
Both Julia and Matlab use compressed sparse column as their preferred sparse matrix format.
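In Julia, this is the SparseMatrixCSC type from the SparseArrays standard library. A small sketch with the 6 × 6 example (field access shown only for illustration; the accessor functions rowvals, nonzeros, and nzrange are the documented interface):

    using SparseArrays

    A = sparse([3, 3, 5, 5, 2, 1, 2, 4, 4, 1],          # I
               [5, 2, 6, 4, 3, 3, 4, 6, 3, 2],          # J
               [14, 4, 4, 7, 10, 13, 12, 20, 9, 16],    # V
               6, 6)

    A.colptr     # [1, 1, 3, 6, 8, 9, 11], as in the compressed sparse column figure
    A.rowval     # row indices, grouped by column
    A.nzval      # nonzero values, in the same order as rowval
    A * ones(6)  # matrix-vector products use this compressed structure internally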
    ⎡ 0 16 13  0  0  0 ⎤
    ⎢ 0  0 10 12  0  0 ⎥
    ⎢ 0  4  0  0 14  0 ⎥
    ⎢ 0  0  9  0  0 20 ⎥
    ⎢ 0  0  0  7  0  4 ⎥
    ⎣ 0  0  0  0  0  0 ⎦

Indexed storage sorted by row:
    I = [  1  1  2  2  3  3  4  4  5  5 ]
    J = [  2  3  3  4  2  5  3  6  4  6 ]
    V = [ 16 13 10 12  4 14  9 20  7  4 ]

Compressed sparse row:
    rowptr = [  1  3  5  7  9 11 11 ]
    colval = [  2  3  3  4  2  5  3  6  4  6 ]
    nzval  = [ 16 13 10 12  4 14  9 20  7  4 ]
Entries of row i are stored in

    colval[rowptr[i]] ... colval[rowptr[i+1] − 1]
    nzval[rowptr[i]] ... nzval[rowptr[i+1] − 1].
If we were to implement the matrix-vector routine for compressed sparse row matrices, however, there is an interesting optimization possible because, for each row, all of the updates to the output vector y happen at the same index.
    function csr_sparse_matrix_vector_product(x,rowptr,colval,nzval,m,n)
        y = zeros(m)
        for i=1:m
            yi = 0.0
            for nzi=rowptr[i]:rowptr[i+1]-1       # entries of row i
                j,v = colval[nzi], nzval[nzi]
                yi += v*x[j]
            end
            y[i] = yi                             # a single write to y per row
        end
        return y
    end
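As a final quick check (illustrative only; the function name matches the routine above), the compressed sparse row arrays from the figure reproduce the same row sums as the indexed-storage example:

    rowptr = [1, 3, 5, 7, 9, 11, 11]
    colval = [2, 3, 3, 4, 2, 5, 3, 6, 4, 6]
    nzval  = [16, 13, 10, 12, 4, 14, 9, 20, 7, 4]
    y = csr_sparse_matrix_vector_product(ones(6), rowptr, colval, nzval, 6, 6)
    # y == [29.0, 22.0, 18.0, 29.0, 11.0, 0.0], matching the example matrix's row sums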