Module 4
How is a Bloom filter useful in Big Data Analytics? Explain with a suitable
example.
Bloom Filter: A Bloom filter is a space-efficient probabilistic data
structure that is used to test whether an element is a member of a set.
1. Unlike a standard hash table, a Bloom filter of fixed size can represent a set
with a large number of elements. It never reaches a "filled up" state.
However, the probability of false positives increases as more entries are
added, until all the bits are set to 1, after which every query returns
a positive result.
2. It is not possible to delete an element from a Bloom filter: if we clear the
bits at the indices generated by the hash functions for a single element, bits shared
with other elements are cleared too, and those elements would wrongly be reported as absent.
3. A Bloom filter may return false positives but never false negatives. That is,
it may report that an entry is present when it is not in the set (false
positive), but it never reports that an entry is absent when it is in the set (false negative).
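The three properties above can be illustrated with a minimal sketch. The class name `BloomFilter`, the method names, and the use of salted SHA-256 digests as the k hash functions are assumptions made for illustration; practical implementations typically use faster non-cryptographic hashes and a packed bit array.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: m bits, k hash functions (illustrative only)."""

    def __init__(self, m, k):
        self.m = m              # number of bits in the filter
        self.k = k              # number of hash functions
        self.bits = [0] * m     # all bits start at 0

    def _indices(self, item):
        # Derive k indices by salting SHA-256 with the hash-function number.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        # Insertion only ever sets bits; there is no delete operation.
        for idx in self._indices(item):
            self.bits[idx] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[idx] for idx in self._indices(item))


bf = BloomFilter(m=64, k=3)
for word in ["hadoop", "spark", "hive"]:
    bf.add(word)

# Every inserted item is always reported as present (no false negatives).
print(all(bf.might_contain(w) for w in ["hadoop", "spark", "hive"]))  # True
```

A query for a word that was never added usually returns False, but it can return True if its three bits happen to have been set by other words. That is the false-positive case.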
Bloom Filtering
• Suited to membership tests on a set or list when space is an issue
• It selects the tuples that satisfy the given criteria and rejects the others
• Bloom filters may give false positives but never yield false negatives
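As one possible example of Bloom filtering on records, the sketch below keeps only the tuples whose key may belong to a set of interest. The user IDs, the records, and the values of `M` and `K` are hypothetical, and a Python integer stands in for the bit array.

```python
import hashlib

M, K = 256, 3   # filter size in bits and number of hash functions (assumed values)


def positions(value):
    # K bit positions for a value, taken from salted SHA-256 digests.
    return [int(hashlib.sha256(f"{i}:{value}".encode()).hexdigest(), 16) % M
            for i in range(K)]


def build_filter(values):
    bits = 0                      # a Python int used as an M-bit array
    for v in values:
        for p in positions(v):
            bits |= 1 << p
    return bits


def passes(bits, value):
    # True if every bit for this value is set (possible member).
    return all((bits >> p) & 1 for p in positions(value))


hot_users = ["u17", "u42", "u99"]          # hypothetical set of interest
hot_filter = build_filter(hot_users)

records = [("u42", "login"), ("u03", "click"), ("u17", "purchase")]
selected = [r for r in records if passes(hot_filter, r[0])]
print(selected)
```

The records for `u42` and `u17` are always selected. The record for `u03` is rejected unless it happens to be a false positive, so a later exact check can be used where false positives are not acceptable.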
Depending on what false-positive rate the application can tolerate, and given approximately
how many elements will be inserted into the filter, we can control this probability.
By varying k (the number of hash functions) and m (the size of the filter in bits), we set the size of the filter and its probability of generating false
positives. The larger the filter, the lower the probability of false positives.
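The effect of the filter size can be checked numerically. The sketch below uses the standard approximation P = (1 - (1 - 1/m)^(kn))^k for the false-positive probability after n insertions, which assumes the k hash functions are independent and uniformly distributed; the function name `false_positive_rate` and the sample values are chosen here for illustration.

```python
def false_positive_rate(m, n, k):
    """Approximate false-positive probability after n insertions
    into an m-bit filter with k hash functions."""
    return (1 - (1 - 1 / m) ** (k * n)) ** k


# Fix n and k, then grow the filter: the probability should fall.
for m in (1_000, 5_000, 10_000):
    print(m, round(false_positive_rate(m, n=1_000, k=3), 4))
```

With n and k held fixed, each increase in m lowers the printed probability, which matches the statement that a larger filter gives fewer false positives.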
Number of Hash Functions:
Too few hash functions give a high false-positive rate, but the more hash functions there are, the more quickly the filter fills up.
More hash functions also make the filter slower, since every insertion and query must compute all k hashes.
The relation between the number of hash functions k, the size of the filter m, and the number of expected
entries n is
$k = \frac{m}{n} \ln 2$
which gives the optimal number of hash functions required for the filter to work efficiently
at a desired false-positive rate.
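To put numbers on this, the sketch below sizes a filter for a target false-positive rate p. It uses two standard results that are assumptions relative to these notes: m = -n ln p / (ln 2)^2 for the number of bits, and k = (m / n) ln 2 for the number of hash functions. The function names are illustrative.

```python
import math


def optimal_size(n, p):
    """Bits m needed for n entries at target false-positive rate p."""
    return math.ceil(-n * math.log(p) / (math.log(2) ** 2))


def optimal_hash_count(m, n):
    """Number of hash functions k = (m / n) * ln 2, rounded, at least 1."""
    return max(1, round(m / n * math.log(2)))


n = 1_000_000                  # expected number of entries (example value)
m = optimal_size(n, p=0.01)    # target: 1% false positives
k = optimal_hash_count(m, n)
print(m, k)
```

For one million entries at a 1% false-positive rate this comes to roughly 9.6 million bits (about 1.2 MB) and 7 hash functions, far less space than storing the entries themselves.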
https://www.geeksforgeeks.org/bloom-filters-introduction-and-python-implementation/
https://www.enjoyalgorithms.com/blog/bloom-filter