03 Myers Bit Vector
This exposition has been developed by C. Gröpl, G. Klau, D. Weese, and K. Reinert. It is based on the following
sources, which are all recommended reading:
1. G. Myers (1999) A Fast Bit-Vector Algorithm for Approximate String Matching Based on Dynamic Programming,
Journal of the ACM, 46(3): 395-415
2. Heikki Hyyrö (2001) Explaining and Extending the Bit-parallel Approximate String Matching Algorithm of
Myers, Technical report, University of Tampere, Finland.
3. Heikki Hyyrö (2003) A bit-vector algorithm for computing Levenshtein and Damerau edit distances, Nordic
Journal of Computing, 10: 29-39
We will look at three approaches to approximate string matching:
1. The classical dynamic programming algorithm, discovered and rediscovered many times since the 1960s.
It computes a DP table indexed by the positions of the text and the pattern.
2. This can be improved using an idea due to Ukkonen, if we are interested in hits of a pattern in a text with
bounded edit distance (which is usually the case).
3. The latter can be further sped up using bit parallelism by an algorithm due to Myers. The key idea is
to represent the differences between the entries in the DP matrix instead of their absolute values.
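For a pattern $p = p_1 \dots p_m$ and a text $t = t_1 \dots t_n$, the entries of the DP matrix C satisfy the recurrence

$$C[i,j] = \min \{\, C[i-1,j-1] + \delta_{ij},\; C[i-1,j] + 1,\; C[i,j-1] + 1 \,\},$$

where

$$\delta_{ij} = \begin{cases} 0, & \text{if } p_i = t_j, \\ 1, & \text{otherwise.} \end{cases}$$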
Each location j in the last (m-th) row with C[m, j] ≤ k is a solution to our query.
The matrix C is initialized:
• at the upper boundary by C[0, j] = 0, since an occurrence can start anywhere in the text, and
• at the left boundary by C[i, 0] = i, according to the edit distance.
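The DP table with these boundary conditions can be filled directly. The following Python sketch is only an illustration; the function names `semi_global_dp` and `occurrences` and the use of 0-based Python strings are assumptions of this example, not part of the original script.

```python
def semi_global_dp(pattern, text):
    """Fill the (m+1) x (n+1) DP matrix C for approximate matching."""
    m, n = len(pattern), len(text)
    # Upper boundary C[0][j] = 0: an occurrence may start anywhere in the text.
    C = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        C[i][0] = i  # left boundary: i pattern characters against the empty text
        for j in range(1, n + 1):
            delta = 0 if pattern[i - 1] == text[j - 1] else 1
            C[i][j] = min(C[i - 1][j - 1] + delta,  # match or mismatch
                          C[i - 1][j] + 1,          # pattern character unmatched
                          C[i][j - 1] + 1)          # text character unmatched
    return C


def occurrences(pattern, text, k):
    """Return all end positions j (1-based) with C[m][j] <= k."""
    last_row = semi_global_dp(pattern, text)[-1]
    return [j for j in range(1, len(text) + 1) if last_row[j] <= k]
```

For example, `occurrences("annual", "annealing", 1)` lists the end positions of all occurrences with at most one difference.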
Example. We want to find all occurrences with fewer than 2 differences of the query annual in the text
annealing. The DP matrix looks as follows:
Bit vector algorithms for approximate string matching, by C. Gröpl, G. Klau, K. Reinert, October 28, 2013, 14:053001
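        A  N  N  E  A  L  I  N  G
     0  0  0  0  0  0  0  0  0  0
A    1  0  1  1  1  0  1  1  1  1
N    2  1  0  1  2  1  1  2  1  2
N    3  2  1  0  1  2  2  2  2  2
U    4  3  2  1  1  2  3  3  3  3
A    5  4  3  2  2  1  2  3  4  4
L    6  5  4  3  3  2  1  2  3  4
The only entry of the last row with value less than 2 is C[6, 6] = 1, so the single occurrence ends at text position 6.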
[Figure: column-wise computation of the DP matrix for sequence x and sequence y, showing the initial case, the recursive case, and which part of the matrix is in memory and active.]
The approximate matches of the pattern can be output on-the-fly when the final value of the column vector
is (re-)calculated.
The algorithm still requires O(mn) time to run, but it uses only O(m) memory.
Note that in order to obtain the actual alignment (not just its score), we need to trace back by some other
means from the point where the match was found, as the preceding columns of the DP matrix are no longer
available.
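A minimal sketch of this column-wise scheme is given below. It keeps a single column in memory and reports matches as soon as the last entry of the column is known. The generator name `search_columnwise` and the `(end_position, distance)` output format are assumptions of this example.

```python
def search_columnwise(pattern, text, k):
    """Yield (end_position, distance) using only one column of the DP matrix."""
    m = len(pattern)
    col = list(range(m + 1))  # column 0: C[i][0] = i
    for j, ch in enumerate(text, start=1):
        diag = col[0]  # C[i-1][j-1] for i = 1 (row 0 is always 0)
        col[0] = 0     # C[0][j] = 0
        for i in range(1, m + 1):
            left = col[i]  # C[i][j-1], about to be overwritten
            delta = 0 if pattern[i - 1] == ch else 1
            col[i] = min(diag + delta, col[i - 1] + 1, left + 1)
            diag = left    # becomes C[i-1][j-1] for the next row
        if col[m] <= k:
            yield j, col[m]  # report on-the-fly; earlier columns are gone
```

Because the earlier columns are overwritten, this sketch can report only end positions and distances, not alignments.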
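Instead of the absolute values we consider the horizontal, vertical and diagonal differences between adjacent entries of C:

$$\Delta h_{i,j} = C[i,j] - C[i,j-1] \in \{-1, 0, +1\},$$
$$\Delta v_{i,j} = C[i,j] - C[i-1,j] \in \{-1, 0, +1\},$$
$$\Delta d_{i,j} = C[i,j] - C[i-1,j-1] \in \{0, +1\}.$$

Exercise. Prove that these deltas are indeed within the claimed ranges.
The delta vectors are encoded as bit-vectors by the following boolean variables:

$$VP_{i,j} \equiv (\Delta v_{i,j} = +1), \qquad VN_{i,j} \equiv (\Delta v_{i,j} = -1),$$
$$HP_{i,j} \equiv (\Delta h_{i,j} = +1), \qquad HN_{i,j} \equiv (\Delta h_{i,j} = -1),$$
$$D0_{i,j} \equiv (\Delta d_{i,j} = 0).$$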
$\Delta v_{i,j}$ for the example:
        A  N  N  E  A  L  I  N  G
     0  0  0  0  0  0  0  0  0  0
A    1  0  1  1  1  0  1  1  1  1
N    1  1 -1  0  1  1  0  1  0  1
N    1  1  1 -1 -1  1  1  0  1  0
U    1  1  1  1  0  0  1  1  1  1
A    1  1  1  1  1 -1 -1  0  1  1
L    1  1  1  1  1  1 -1 -1 -1  0
We denote by $score_j$ the edit distance of a pattern occurrence ending at text position $j$. The key ideas of
Myers’ algorithm are as follows:
1. Instead of computing C we compute the ∆ values, which in turn are represented as bit-vectors.
2. We compute the matrix column by column as in Ukkonen’s version of the DP algorithm.
3. We maintain the value $score_j$ using the fact that $score_0 = m$ and $score_j = score_{j-1} + \Delta h_{m,j}$.
In the following slides we will make some observations about the dependencies of the bit-vectors.
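To make the delta encoding concrete, the sketch below derives the vertical bit-vectors of one column and the running score from an explicitly computed DP matrix. It is meant only as a check of the definitions, not as the bit-parallel algorithm itself. The helper names and the convention that bit i−1 of an integer stands for row i are assumptions of this example.

```python
def dp_matrix(pattern, text):
    """Plain DP matrix with C[0][j] = 0 and C[i][0] = i."""
    m, n = len(pattern), len(text)
    C = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        C[i][0] = i
        for j in range(1, n + 1):
            d = 0 if pattern[i - 1] == text[j - 1] else 1
            C[i][j] = min(C[i - 1][j - 1] + d, C[i - 1][j] + 1, C[i][j - 1] + 1)
    return C


def column_bitvectors(C, j):
    """Encode the vertical deltas of column j as (VP, VN); bit i-1 is row i."""
    vp = vn = 0
    for i in range(1, len(C)):
        dv = C[i][j] - C[i - 1][j]
        assert dv in (-1, 0, 1)  # the claimed range of vertical deltas
        if dv == 1:
            vp |= 1 << (i - 1)
        elif dv == -1:
            vn |= 1 << (i - 1)
    return vp, vn


def scores_from_deltas(C):
    """Rebuild the last row from score_0 = m and the horizontal deltas."""
    m = len(C) - 1
    score, out = m, []
    for j in range(1, len(C[0])):
        dh = C[m][j] - C[m][j - 1]
        assert dh in (-1, 0, 1)  # the claimed range of horizontal deltas
        score += dh
        out.append(score)
    return out
```

With `C = dp_matrix("annual", "annealing")`, `scores_from_deltas(C)` equals the last row of `C` without its first entry, which is exactly the information the algorithm needs for reporting.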
3.9 Observations
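Let's have a look at the ∆s. The first observation is that

$$HN_{i,j} \Leftrightarrow VP_{i,j-1} \text{ AND } D0_{i,j}.$$

Proof. If $HN_{i,j}$ then $\Delta h_{i,j} = -1$ by definition, and hence $\Delta v_{i,j-1} = 1$ and $\Delta d_{i,j} = 0$. This holds true, because
otherwise the range of possible values ({−1, 0, +1}) would be violated. Conversely, $\Delta v_{i,j-1} = 1$ and $\Delta d_{i,j} = 0$ give $C[i,j] = C[i-1,j-1] = C[i,j-1] - 1$.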
[Figure: the four cells (i−1,j−1), (i−1,j), (i,j−1), (i,j); cell (i,j−1) holds x+1 and cell (i,j) holds x.]
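By symmetry we have:

$$VN_{i,j} \Leftrightarrow HP_{i-1,j} \text{ AND } D0_{i,j}.$$

The next observation is that

$$HP_{i,j} \Leftrightarrow VN_{i,j-1} \text{ OR NOT } (VP_{i,j-1} \text{ OR } D0_{i,j}).$$

Proof. If $HP_{i,j}$ then $VP_{i,j-1}$ cannot hold without violating the ranges. Hence $\Delta v_{i,j-1}$ is −1 or 0. In the first case
$VN_{i,j-1}$ is true, whereas in the second case we have neither $VP_{i,j-1}$ nor $D0_{i,j}$.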
[Figure: the four cells (i−1,j−1), (i−1,j), (i,j−1), (i,j); cell (i−1,j−1) holds x or x+1, cell (i,j−1) holds x and cell (i,j) holds x+1.]
Again by symmetry we have

$$VP_{i,j} \Leftrightarrow HN_{i-1,j} \text{ OR NOT } (HP_{i-1,j} \text{ OR } D0_{i,j}).$$
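The equivalences for HN, VN, HP and VP can be checked by brute force on a small example. The sketch below recomputes the DP matrix and tests all four in every cell. The function name `check_observations` is an assumption of this example, and the check is a sanity test rather than a proof.

```python
def check_observations(pattern, text):
    """Return True if the four equivalences hold in every cell of C."""
    m, n = len(pattern), len(text)
    C = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        C[i][0] = i
        for j in range(1, n + 1):
            d = 0 if pattern[i - 1] == text[j - 1] else 1
            C[i][j] = min(C[i - 1][j - 1] + d, C[i - 1][j] + 1, C[i][j - 1] + 1)

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dh = C[i][j] - C[i][j - 1]               # horizontal delta at (i, j)
            dv = C[i][j] - C[i - 1][j]               # vertical delta at (i, j)
            d0 = C[i][j] == C[i - 1][j - 1]          # D0 at (i, j)
            dv_left = C[i][j - 1] - C[i - 1][j - 1]  # vertical delta at (i, j-1)
            dh_up = C[i - 1][j] - C[i - 1][j - 1]    # horizontal delta at (i-1, j)
            hn_ok = (dh == -1) == (dv_left == 1 and d0)
            vn_ok = (dv == -1) == (dh_up == 1 and d0)
            hp_ok = (dh == 1) == (dv_left == -1 or not (dv_left == 1 or d0))
            vp_ok = (dv == 1) == (dh_up == -1 or not (dh_up == 1 or d0))
            if not (hn_ok and vn_ok and hp_ok and vp_ok):
                return False
    return True
```

In row 0 all horizontal deltas are 0 and in column 0 all vertical deltas are +1, so the boundary cells need no special treatment here.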
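Finally, $D0_{i,j} = 1$ iff C[i, j] and C[i − 1, j − 1] have the same value. This can be true for three possible reasons,
which correspond to the three cases of the DP recurrence:
1. $p_i = t_j$ (the diagonal case with $\delta_{ij} = 0$).
2. $C[i,j] = C[i,j-1] + 1$ and $C[i,j-1] = C[i-1,j-1] - 1$, that is, $VN_{i,j-1}$ holds.
3. $C[i,j] = C[i-1,j] + 1$ and $C[i-1,j] = C[i-1,j-1] - 1$, that is, $HN_{i-1,j}$ holds.
Replacing $HN_{i-1,j}$ by $VP_{i-1,j-1} \text{ AND } D0_{i-1,j}$, the formula

$$D0_{i,j} = (p_i = t_j) \text{ OR } VN_{i,j-1} \text{ OR } (VP_{i-1,j-1} \text{ AND } D0_{i-1,j})$$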
is of the form

$$D0_i = X_i \text{ OR } (Y_{i-1} \text{ AND } D0_{i-1}),$$

where

$$X_i := (t_j = p_i) \text{ OR } VN_i \quad \text{and} \quad Y_i := VP_i.$$
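Here we omitted the index j from the notation for clarity (VN and VP refer to column j − 1, D0 to column j). We solve for $D0_i$. Unrolling the first few terms we
get:

$$D0_1 = X_1,$$
$$D0_2 = X_2 \text{ OR } (X_1 \text{ AND } Y_1),$$
$$D0_3 = X_3 \text{ OR } (X_2 \text{ AND } Y_2) \text{ OR } (X_1 \text{ AND } Y_1 \text{ AND } Y_2).$$

In general we have:

$$D0_i = \mathrm{OR}_{r=1}^{i} \, (X_r \text{ AND } Y_r \text{ AND } Y_{r+1} \text{ AND } \dots \text{ AND } Y_{i-1}),$$

where the term for r = i is just $X_i$.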
To put this in words, let s ≤ i be minimal such that $Y_s = \dots = Y_{i-1} = 1$ (so $Y_{s-1} = 0$, and s = i if $Y_{i-1} = 0$). Then $D0_i = 1$ iff $X_r = 1$ for some
s ≤ r ≤ i. That is, the block of consecutive 1s in Y that ends immediately to the right of i, together with position i itself, must contain a 1 of X.
Here is an example:
Y = 00011111000011
X = 00001010000101
D0 = 00111110000111
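The bit-vectors are written with position 1 as the rightmost bit. A position i in D0 is set to 1 if $X_i = 1$, or if the position immediately to its right in Y is a 1 and
the run of consecutive 1s in Y that starts there and extends to the right contains a position r with $X_r = 1$.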
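The example can be reproduced by evaluating the recurrence bit by bit. The sketch below does this with a plain loop, before any bit-parallel trick is applied. The function name `d0_by_recurrence` and the convention that bit i−1 of a Python integer holds the entry with index i are assumptions of this example.

```python
def d0_by_recurrence(x, y, m):
    """Evaluate D0_i = X_i OR (Y_{i-1} AND D0_{i-1}) for i = 1..m.

    Bit i-1 of each integer holds the entry with index i,
    so the rightmost bit is position 1.
    """
    d0 = 0
    prev_d0 = 0  # D0_{i-1}; nothing precedes position 1
    prev_y = 0   # Y_{i-1}; nothing precedes position 1
    for i in range(m):
        x_i = (x >> i) & 1
        cur = x_i | (prev_y & prev_d0)
        d0 |= cur << i
        prev_d0 = cur
        prev_y = (y >> i) & 1
    return d0
```

For the vectors of the example, `format(d0_by_recurrence(0b00001010000101, 0b00011111000011, 14), "014b")` yields the D0 row shown above.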
3.11 Computing D0
Now we solve the problem of computing D0 from X and Y using bit-vector and arithmetic operations.
The first step is to compute X & Y and then add Y to it. The addition carries every $X_r = 1$ that is aligned
with a $Y_r = 1$ to the position immediately to the left of the block of 1s in Y that contains it.
Again our example:
Y = 00011111000011
X = 00001010000101
X & Y = 00001010000001
(X & Y) + Y = 00101001000100
···
D0 = 00111110000111
Note the two 1s that got propagated to the end of the runs of 1s in Y.
However, note also the one 1 that is not in the solution but was introduced by adding a 1 in Y to a 0 in
(X & Y).
As a remedy we XOR the term with Y so only the bits that changed during the propagation stay turned
on. Again our example:
Y = 00011111000011
X = 00001010000101
X & Y = 00001010000001
(X & Y) + Y = 00101001000100
((X & Y) + Y) ^ Y = 00110110000111
···
D0 = 00111110000111
Now we are almost done. The only thing left to fix is the observation that there may be several Xr bits under
the same block of Y. Of those all but the first remain unchanged and hence will not be marked by the XOR.
To fix this and to account for the case X1 = 1 we OR the final result with X. Again our example:
Y = 00011111000011
X = 00001010000101
X & Y = 00001010000001
(X & Y) + Y = 00101001000100
((X & Y) + Y) ^ Y = 00110110000111
(((X & Y) + Y) ^ Y) | X = 00111110000111
= D0 = 00111110000111
Now we can substitute X back by $(p_i = t_j) \text{ OR } VN$ and Y by VP.
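The three steps (add, XOR, OR) can be tried out directly. The sketch below returns the intermediate values together with D0. Python integers have no fixed width, so the sum is cut to m bits by hand; this mask, the function name `d0_bit_parallel` and the returned dictionary are assumptions of this example.

```python
def d0_bit_parallel(x, y, m):
    """Compute D0 from X and Y with one addition, one XOR and one OR."""
    mask = (1 << m) - 1              # emulates an m-bit machine word
    anded = x & y                    # X bits that sit inside a block of Y
    summed = (anded + y) & mask      # carries run through the blocks of 1s in Y
    changed = summed ^ y             # keep only the bits the addition changed
    d0 = changed | x                 # restore X bits that the XOR did not mark
    return {"X&Y": anded, "sum": summed, "xor": changed, "D0": d0}


def show(value, m):
    """Format a bit-vector with position 1 as the rightmost character."""
    return format(value, "0{}b".format(m))
```

A call such as `show(d0_bit_parallel(0b00001010000101, 0b00011111000011, 14)["D0"], 14)` prints the final row of the example.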
(1) // Preprocessing
(2) for c ∈ Σ do B[c] = 0^m od
(3) for j ∈ 1 . . . m do B[p_j] = B[p_j] | 0^(m−j) 1 0^(j−1) od
(4) VP = 1^m; VN = 0^m;
(5) score = m;
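In a language with hash maps the preprocessing step becomes very short. The sketch below stores masks only for characters that occur in the pattern and treats every other character as the all-zero mask. The function name `build_masks` and the dictionary representation of B are assumptions of this example.

```python
def build_masks(pattern):
    """Return B with bit j-1 of B[c] set iff pattern position j holds c."""
    masks = {}
    for j, ch in enumerate(pattern):  # j is 0-based here, so it is the bit index
        masks[ch] = masks.get(ch, 0) | (1 << j)
    return masks


def mask_for(masks, ch):
    """Look up B[ch]; characters absent from the pattern map to 0^m."""
    return masks.get(ch, 0)
```

For the pattern annual, `format(build_masks("annual")["a"], "06b")` gives `010001`: positions 1 and 5 of the pattern hold an a.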
(1) // Searching
(2) for pos ∈ 1 . . . n do
(3) X = B[t_pos] | VN;
(4) D0 = ((VP + (X & VP)) ^ VP) | X;
(5) HN = VP & D0;
(6) HP = VN | ~(VP | D0);
(7) X = HP << 1;
(8) VN = X & D0;
(9) VP = (HN << 1) | ~(X | D0);
(10) // Scoring and output
(11) if HP & 10^(m−1) ≠ 0^m
(12) then score += 1;
(13) else if HN & 10^(m−1) ≠ 0^m
(14) then score −= 1;
(15) fi
(16) fi
(17) if score ≤ k report occurrence at pos fi;
(18) od
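The search loop can be transcribed almost line by line. The sketch below is one possible Python rendering for patterns of length at least 1. Since Python integers are unbounded, every addition, shift and complement is cut to m bits with an explicit mask; that masking, the function name `myers_search` and the `(pos, score)` output are assumptions of this example.

```python
def myers_search(pattern, text, k):
    """Report (pos, score) for every text position with score <= k."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must not be empty")
    mask = (1 << m) - 1   # 1^m, emulates an m-bit word
    top = 1 << (m - 1)    # 10^(m-1), the bit of the last pattern row

    # Preprocessing: bit j-1 of B[c] is set iff p_j = c.
    B = {}
    for j, ch in enumerate(pattern):
        B[ch] = B.get(ch, 0) | (1 << j)
    VP, VN, score = mask, 0, m

    hits = []
    for pos, ch in enumerate(text, start=1):
        X = B.get(ch, 0) | VN
        D0 = ((((X & VP) + VP) & mask) ^ VP) | X
        HN = VP & D0
        HP = VN | (~(VP | D0) & mask)
        X = (HP << 1) & mask
        VN = X & D0
        VP = ((HN << 1) & mask) | (~(X | D0) & mask)
        # Scoring: the horizontal delta in row m updates the score.
        if HP & top:
            score += 1
        elif HN & top:
            score -= 1
        if score <= k:
            hits.append((pos, score))
    return hits
```

Calling `myers_search("annual", "annealing", 1)` reports the same end position as the DP matrix of the introductory example.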
Note that this algorithm easily computes the edit distance if we append the code fragment | 0^(m−1)1 to line 7,
so that it reads X = (HP << 1) | 0^(m−1)1. This is because in this case there is a horizontal increment in the 0-th row. In the case of reporting all
occurrences each column starts with 0.
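A sketch of this variant is shown below. It differs from the search loop only in the extra 1 that is shifted into X and in returning the final score instead of reporting positions. The function name `edit_distance`, the explicit m-bit mask and the special case for an empty pattern are assumptions of this example.

```python
def edit_distance(pattern, text):
    """Edit distance between pattern and text, computed column by column."""
    m = len(pattern)
    if m == 0:
        return len(text)  # only insertions remain
    mask = (1 << m) - 1
    top = 1 << (m - 1)
    B = {}
    for j, ch in enumerate(pattern):
        B[ch] = B.get(ch, 0) | (1 << j)
    VP, VN, score = mask, 0, m
    for ch in text:
        X = B.get(ch, 0) | VN
        D0 = ((((X & VP) + VP) & mask) ^ VP) | X
        HN = VP & D0
        HP = VN | (~(VP | D0) & mask)
        if HP & top:
            score += 1
        elif HN & top:
            score -= 1
        # Row 0 now grows by 1 per column, so a 1 enters at the lowest bit.
        X = ((HP << 1) | 1) & mask
        VN = X & D0
        VP = ((HN << 1) & mask) | (~(X | D0) & mask)
    return score
```

For instance, `edit_distance("kitten", "sitting")` evaluates to 3.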
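Example. The bit masks B for the pattern annual (the rightmost bit belongs to the first pattern position; * stands for any other character):
a 010001
l 100000
n 000110
u 001000
* 000000
In addition, we start with
VN = 000000
VP = 111111
score 6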
Reading a 010001
D0 = 111111
HN = 111111
HP = 000000
VN = 000000
VP = 111110
score 5
Reading n 000110
D0 = 111110
HN = 111110
HP = 000001
VN = 000010
VP = 111101
score 4
Reading n 000110
D0 = 111110
HN = 111100
HP = 000010
VN = 000100
VP = 111001
score 3
Reading e 000000
D0 = 000100
HN = 000000
HP = 000110
VN = 000100
VP = 110011
score 3
Reading a 010001
D0 = 110111
HN = 110011
HP = 001100
VN = 010000
VP = 100110
score 2
and so on. In the exercises you will extend the algorithm to handle queries longer than the machine word size w.
Myers’ bit-vector algorithm is efficient if the number of DP rows is less than or equal to the machine word
length (typically 64). For larger matrices, the bitwise operations must be emulated using multiple words at the
expense of running time.
Banded Myers’ bit vector algorithm:
Algorithm outline:
• Right-shift VP and VN (see b)
• Use either D0 or HP, HN to track score (dark cells in c)
[Figure 5, from "A Bit-Vector Algorithm for Computing Levenshtein and Damerau Edit Distances", with panels a), b), c) and columns labelled j−1 and j. Caption: a) Horizontal tiling (left) and diagonal tiling (right). b) The figure shows how the diagonal step aligns the (j − 1)th column vector one step above the jth column vector. c) The figure depicts in gray the region of diagonals, which are filled according to Ukkonen’s rule. The cells on the lower boundary are in darker tone.]
3.16 Preprocessing and Searching
• Given a text t of length n and a pattern p of length m
• We consider a band of the DP matrix:
– consisting of w consecutive diagonals
– where the leftmost diagonal is the main diagonal shifted by c to the left
[Figure: the band of width w inside the DP matrix; axis labels 0, c, m and 0, m−c, n.]
Now, we don’t encode the whole DP column but the intersection of a column and the band (plus one
diagonal left of the band). Thus we store only w vertical deltas in VP and VN. The lowest bits encode the
differences between the rightmost and the left adjacent diagonal.
The pattern bit mask computation remains the same.
(1) // Preprocessing
(2) for i ∈ Σ do B[i] = 0^m od
(3) for j ∈ 1 . . . m do B[p_j] = B[p_j] | 0^(m−j) 1 0^(j−1); od
(4) VP = 1^w; VN = 0^w;
Text excerpt shown together with Figure 5 (it starts and ends mid-sentence):
... have a value VN_j[l_v] = 1. This is impossible, because it can happen only if D0_j has
an ”extra” set bit at position l_v + 1 and HP_j[l_v] = 1, and these two conditions cannot
simultaneously be true.
In addition to the obvious way of first computing VP_j and VN_j in normal fashion
and then shifting them up (to the right) when processing the (j + 1)th column, we
propose also a second option. It can be seen that essentially the same shifting effect
can be achieved already when the vectors VP_j and VN_j are computed by making the
following changes on the last two lines of the algorithms in Figures 3 and 4:
- The diagonal zero delta vector D0_j is shifted one step to the right on the second
last line.
- The left shifts of the horizontal delta vectors are removed.
- The OR-operation of VP_j with 1 is removed.
This second alternative uses less bit operations, but the choice between the two may
depend on other practical issues. For example if several bit vectors have to be used
in encoding D0_j, the column-wise top-to-bottom order may make it more difficult to
shift D0_j up than shifting both VP_j and VN_j down.
We also modify the way some cell values are explicitly maintained. We choose
to calculate the values along the lower boundary of the filled area of the dynamic
programming matrix (Figure 5c). For two diagonally consecutive cells D[i − 1, j − 1] ...
// Original
for pos ∈ 1 . . . n do
    X = B[t_pos] | VN;
    D0 = ((VP + (X & VP)) ^ VP) | X;
    HN = VP & D0;
    HP = VN | ~(VP | D0);
    X = HP << 1;
    VN = X & D0;
    VP = (HN << 1) | ~(X | D0);
    // Scoring and output
    ...
od

// Banded
for pos ∈ 1 . . . n do
    // Use shifted pattern mask
    B = (B[t_pos] 0^w >> (pos + c)) & 1^w;
    X = B | VN;
    D0 = ((VP + (X & VP)) ^ VP) | X;
    HN = VP & D0;
    HP = VN | ~(VP | D0);
    X = D0 >> 1;
    VN = X & HP;
    VP = HN | ~(X | HP);
    // Scoring and output
    ...
od
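The shifted pattern mask is the one step of the banded variant that has no counterpart in the original loop. The sketch below reads it as follows: B[t_pos] is extended by w zero bits, shifted right by pos + c and cut to w bits. That reading, the helper names, and the resulting correspondence between bits and pattern rows are assumptions of this example rather than statements from the script.

```python
def shifted_mask(b_t, w, pos, c):
    """Pattern mask of the current text character as seen by the band.

    b_t is the full mask B[t_pos]; the result has w bits.
    """
    return ((b_t << w) >> (pos + c)) & ((1 << w) - 1)


def row_of_bit(b, w, pos, c):
    """Pattern row (1-based) that bit b of the shifted mask refers to.

    Under the reading above, the highest bit (b = w-1) belongs to row pos + c,
    the leftmost diagonal of the band; rows outside 1..m never yield a set bit.
    """
    return pos + b - (w - 1 - c)
```

With the mask `0b010001` of the character a in annual, `shifted_mask(0b010001, 4, 1, 1)` gives `0b0100`: in column 1 the only row of the band holding an a is row 1, which sits at bit 2.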
[Figure: initial values of the band in two panels with rows −5, ..., 0 above rows 1, ..., 6. In the left panel each of the rows −5, ..., 0 holds its own row number in all six columns, and the initial vectors are VP = 1^w, VN = 0^w. In the right panel each of these rows holds 0 1 2 3 4 5, and the initial vectors are VP = 1^(c+1) 0^(w−c−1), VN = 0^w.]