
Bda Experiment 4: Roll No. A-52 Name: Janmejay Patil Class: BE-A Batch: A3 Date of Experiment: Date of Submission Grade

The document describes an experiment implementing the DGIM algorithm. The DGIM algorithm uses O(log² N) bits to represent a window of N bits and allows estimating the number of 1's in the window with an error of no more than 50%. The algorithm forms "buckets" of 1's based on certain rules and stores the start and end index and size of each bucket. The program takes an input stream, implements the bucketing algorithm, and outputs the buckets and estimates the number of 1's in a given suffix of the stream.


BDA EXPERIMENT 4

Roll No. A-52 Name: JANMEJAY PATIL


Class: BE-A Batch: A3
Date of Experiment: Date of Submission
Grade :

B.1. DGIM algorithm:

Write a program by considering any stream to implement the DGIM algorithm.

DGIM algorithm (Datar-Gionis-Indyk-Motwani Algorithm)

Designed to estimate the number of 1’s in the last N bits of a binary data stream. This
algorithm uses O(log² N) bits to represent a window of N bits and estimates the number
of 1’s in the window with an error of no more than 50%.

In other words, the estimate never differs from the true count by more than 50%.

In the DGIM algorithm, each arriving bit has a timestamp equal to the position at which it
arrives: the first bit has timestamp 1, the second bit has timestamp 2, and so on.
Positions are tracked relative to the window size N, so timestamps can be stored modulo N
(the window size is usually taken as a multiple of 2). The window is divided into buckets
consisting of 1’s and 0’s.
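
To illustrate the timestamp idea, here is a minimal sketch that assumes timestamps are stored modulo N. The function names `stored_timestamp` and `age` and the window size of 8 are made up for this example and are not part of the experiment's program.

```python
def stored_timestamp(position, n):
    """Timestamp kept for the bit at 1-based stream position, reduced modulo n."""
    return position % n


def age(stored_ts, stored_now, n):
    """How many positions ago a bit arrived; valid while the true age is below n."""
    return (stored_now - stored_ts) % n


N = 8                                   # hypothetical window size
ts_of_bit = stored_timestamp(6, N)      # a bit that arrived at position 6
now = stored_timestamp(11, N)           # the stream has advanced to position 11
print(age(ts_of_bit, now, N))           # 5 positions ago, so still inside the window
```

Because only ages below N matter, the reduced timestamps are enough to decide whether a bit is still inside the window.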

RULES FOR FORMING THE BUCKETS:

➢ The right end of a bucket must always be a 1 (0’s to the right of the last 1 are
not part of the bucket). E.g. 1001011 → a bucket of size 4: it contains four 1’s
and has a 1 at its right end.
➢ Every bucket must contain at least one 1; otherwise no bucket can be formed.
➢ The size of every bucket (its number of 1’s) must be a power of 2.
➢ Bucket sizes cannot decrease as we move to the left (sizes are in non-decreasing
order towards the left).
➢ There can be at most two buckets of any one size; when a third appears, the two
oldest are merged into one bucket of twice the size.
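
The rules above can be checked mechanically. The following is a rough sketch, assuming buckets are given as (start, end, size) tuples ordered from oldest to newest, the same layout the program in B.2 uses. The helper name `check_buckets` is hypothetical, and the "at most two buckets per size" check reflects the merge condition in the program's `checker` function.

```python
from collections import Counter


def check_buckets(bits, buckets):
    """Return a list of rule violations for (start, end, size) buckets, oldest first."""
    problems = []
    for start, end, size in buckets:
        ones = sum(bits[start:end + 1])
        if bits[end] != 1:
            problems.append(f"bucket {start}-{end}: right end is not 1")
        if ones != size:
            problems.append(f"bucket {start}-{end}: holds {ones} ones, not {size}")
        if size < 1 or size & (size - 1):
            problems.append(f"bucket {start}-{end}: size {size} is not a power of 2")
    sizes = [size for _, _, size in buckets]
    # Older buckets sit further left, so sizes must not grow from oldest to newest.
    if any(older < newer for older, newer in zip(sizes, sizes[1:])):
        problems.append("bucket sizes decrease towards the left")
    if any(count > 2 for count in Counter(sizes).values()):
        problems.append("more than two buckets share a size")
    return problems


# The example from the rules: 1001011 as a single bucket of size 4.
print(check_buckets([1, 0, 0, 1, 0, 1, 1], [(0, 6, 4)]))   # no violations: []
```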
B.2. Input and Output:

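inp = list(map(int, input("Enter Elements : ").split()))
print("Length of Input: ", len(inp))

bucket_list = []          # (start_index, end_index, size) tuples, oldest first
bucket_size_count = {}    # bucket size -> number of buckets of that size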

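def checker():
    # Whenever three buckets share a size, merge the two oldest of them,
    # then repeat the check for the doubled size.
    size = 2
    while bucket_size_count.get(size, 0) > 2:
        # bucket_list runs oldest -> newest, so the first match is the oldest.
        idx = next(j for j, b in enumerate(bucket_list) if b[2] == size)
        s1, e1, _ = bucket_list.pop(idx)
        s2, e2, _ = bucket_list.pop(idx)
        bucket_list.insert(idx, (s1, e2, size * 2))
        bucket_size_count[size] -= 2
        bucket_size_count[size * 2] = bucket_size_count.get(size * 2, 0) + 1
        size *= 2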

start_index = 0
end_index = 0
pair = 0    # 1 while a single 1 is waiting for a partner

for i in range(len(inp)):
    bit = inp[i]
    if bit == 1:
        if pair == 1:
            # Second 1 of a pair: close a bucket of size 2.
            end_index = i
            pair = 0
            bucket_list.append((start_index, end_index, 2))
            if 2 in bucket_size_count:
                bucket_size_count[2] += 1
            else:
                bucket_size_count[2] = 1
            checker()
        else:
            # First 1 of a pair: remember where the bucket starts.
            start_index = i
            pair = 1

print(bucket_list)

starts = []
ends = []
for s, e, size in bucket_list:
    starts.append(s)
    ends.append(e)

# Print the stream with extra spacing at the bucket boundaries.
print("Buckets are: ", end="")
for i in range(len(inp)):
    bit = inp[i]
    if i in starts:
        print(" ", bit, end="")
    elif i in ends:
        print(bit, end=" ")
    else:
        print(bit, end=" ")

print("\nNo. of buckets: ", len(bucket_list))

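k = int(input("\nEnter k : "))
length = len(inp)

# Index of the oldest bit among the last k bits (window is bound1 .. length - 1).
bound1 = length - k
ones_count = 0

# Walk the buckets from newest to oldest.
for s, e, size in bucket_list[::-1]:
    if e < bound1:
        # Bucket lies entirely before the window; all older ones do too.
        break
    elif s < bound1:
        # Bucket straddles the window boundary: count half of it.
        ones_count += size // 2
    else:
        # Bucket lies entirely inside the window.
        ones_count += size

print("Number of 1's in Last", k, "bits are ", ones_count)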

OUTPUT
B.3. Observations and learning:

Advantages

➢ Stores only O(log² N) bits
➢ O(log N) counts of log₂ N bits each
➢ Easy update as more bits enter
➢ Error in count no greater than the number of 1’s in the unknown area.
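
As a rough illustration of the storage claim, the sketch below tallies the bits needed under a simplified accounting: at most two buckets per size, one timestamp stored modulo N per bucket, and only the exponent of each bucket size stored. The function name `dgim_storage_bits` and this accounting are assumptions made for the example, not a figure taken from the experiment.

```python
def dgim_storage_bits(n):
    """Rough upper bound on the bits DGIM keeps for a window of n bits (n >= 2)."""
    timestamp_bits = (n - 1).bit_length()       # timestamps are stored modulo n
    num_sizes = n.bit_length()                  # bucket sizes 1, 2, 4, ... up to n
    size_bits = (num_sizes - 1).bit_length()    # store only the exponent of the size
    max_buckets = 2 * num_sizes                 # at most two buckets per size
    return max_buckets * (timestamp_bits + size_bits)


for n in (1024, 2 ** 20):
    print(n, dgim_storage_bits(n))   # 1024 -> 308 bits, 1048576 -> 1050 bits
```

Under these assumptions the summary grows with log² N, while storing the window itself would take N bits.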

Drawbacks

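➢ The count is only an estimate: the oldest bucket that overlaps the window may lie
partly outside it (the unknown region), and only half of its size is counted, so
the answer can be off by up to 50%.
➢ If the window were instead split into fixed-length blocks, all the 1’s could fall in
the unknown region and the error would be unbounded. DGIM avoids this because each
bucket holds a fixed number of 1’s, which keeps the error within the 50% bound.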
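
The error behaviour can be checked empirically. The sketch below is a textbook-style DGIM counter that starts from buckets of size 1; it is an illustrative variant and not the program from B.2, which starts from buckets of size 2. The class name `DGIM`, its methods, and the input stream are all made up for this example.

```python
class DGIM:
    """Textbook-style DGIM counter (illustrative sketch)."""

    def __init__(self, n):
        self.n = n           # window size
        self.t = 0           # current position; kept unbounded here for clarity
        self.buckets = []    # (timestamp of newest 1, size), newest bucket first

    def add(self, bit):
        self.t += 1
        # Drop the oldest bucket once its newest 1 has left the window.
        if self.buckets and self.buckets[-1][0] <= self.t - self.n:
            self.buckets.pop()
        if bit != 1:
            return
        self.buckets.insert(0, (self.t, 1))
        i = 0
        # Three buckets of one size: merge the two oldest, then check the next size.
        while i + 2 < len(self.buckets) and self.buckets[i][1] == self.buckets[i + 2][1]:
            ts, size = self.buckets[i + 1]
            self.buckets[i + 1:i + 3] = [(ts, size * 2)]
            i += 1

    def estimate(self, k):
        """Estimate the number of 1's among the last k bits (k <= n)."""
        total = oldest = 0
        for ts, size in self.buckets:
            if ts <= self.t - k:
                break
            total += size
            oldest = size
        return total - oldest // 2   # count only half of the oldest bucket


stream = [1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1] * 8   # made-up input
counter = DGIM(n=32)
for bit in stream:
    counter.add(bit)
exact = sum(stream[-32:])
print("exact:", exact, "estimate:", counter.estimate(32))
```

Comparing the printed estimate with the exact count for different streams and values of k shows how far the approximation drifts in practice.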

B.4. Conclusion:

Hence, we have successfully written a program that implements the DGIM algorithm.
