0% found this document useful (1 vote)
67 views2 pages

DGIM Algorithm Theory Explanation

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (1 vote)
67 views2 pages

DGIM Algorithm Theory Explanation

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 2

Why DGIM algorithm?

Suppose we have a window of length N on a binary stream. We want at all times to be able
to answer queries of the form “how many 1’s are there in the last k bits?” for any k≤ N. For
this purpose we use the DGIM algorithm.
The basic version of the algorithm uses O(log2 N) bits to represent a window of N bits, and
allows us to estimate the number of 1’s in the window with an error of no more than 50%.
To begin, each bit of the stream has a timestamp, the position in which it arrives. The first
bit has timestamp 1, the second has timestamp 2, and so on.
Since we only need to distinguish positions within the window of length N, we shall
represent timestamps modulo N, so they can be represented by log2 N bits. If we also store
the total number of bits ever seen in the stream (i.e., the most recent timestamp) modulo N,
then we can determine from a timestamp modulo N where in the current window the bit with
that timestamp is.
We divide the window into buckets, 5 consisting of:
1. The timestamp of its right (most recent) end.
2. The number of 1’s in the bucket. This number must be a power of 2, and we refer to
the number of 1’s as the size of the bucket.

To represent a bucket, we need log2 N bits to represent the timestamp (modulo N) of its right
end. To represent the number of 1’s we only need log2 log2 N bits. The reason is that we
know this number i is a power of 2, say 2j , so we can represent i by coding j in binary. Since
j is at most log2 N, it requires log2 log2 N bits. Thus, O(logN) bits suffice to represent a
bucket. There are six rules that must be followed when representing a stream by buckets.

• The right end of a bucket is always a position with a 1.

• Every position with a 1 is in some bucket.

• No position is in more than one bucket.

• There are one or two buckets of any given size, up to some maximum size.
• All sizes must be a power of 2.

• Buckets cannot decrease in size as we move to the left (back in time).

You might also like