
Bloom Filters - A Probabilistic Data Structure - LinkedIn

The document discusses Bloom Filters, a probabilistic data structure used for membership queries, highlighting their ability to filter out non-members while potentially generating false positives. It explains the implementation process, including how to insert elements and check for membership using hash functions, with a runtime complexity of O(K). The article encourages readers to subscribe for future content on system design and algorithms.

Uploaded by

Aman
Copyright
© © All Rights Reserved


Systems That Scale
Discussing System Design concepts and Algorithms
Weekly newsletter, 26,690 subscribers

Bloom Filters - A Probabilistic Data Structure
Saurav Prateek
Engineer @ Google | Ex-SWE @ GeeksForGeeks | Authoring engineering…
February 23, 2024

Introduction
Bloom Filters are compact data structures that answer
membership queries. They reliably filter out the elements
which are not a part of a set.
Suppose we have a set S and want to check whether an
element X is a part of S or not. In this case a Bloom
Filter can answer in two ways:
1. Element X may or may not be a part of Set S: The
Bloom Filter responds with a Maybe! This means that X
is probably present in the set S, but the answer can be
wrong. The lower the filter's false-positive rate, the
more likely it is that a Maybe corresponds to an
element that is actually present.
2. Element X is definitely not a part of Set S: The Bloom
Filter responds with a Definitely Not! In this case we
can be completely sure that the element is not present
in the set.
In this way a Bloom Filter never produces False
Negatives: when it answers Definitely Not, the element is
guaranteed to be absent. A Maybe, however, can be wrong,
so the filter can generate some False Positives.
These data structures are generally used in those
scenarios where False Positives are acceptable.

For example, we can tolerate raising a false alarm for a
fire hazard which did not take place, but we can never
fail to raise an alarm for an actual fire hazard.

Implementing a Standard Bloom Filter
A Standard Bloom Filter can be implemented with a vector
and a group of Hash Functions.
Standard Bloom Filters work most efficiently when we
know the size of our search space in advance. Suppose
the size of our search space is N. Now let's understand
how we can insert an element into the Bloom Filter and
how we can answer Membership queries.
Inserting an element into Bloom Filter
Suppose we have an element X which needs to be
inserted into the Bloom Filter. We will take the Bloom Filter
to be a vector of size N. Remember N was the size of our
search space!
Initially the Bloom Filter vector has all its values set
to 0 denoting an empty filter.
We will also have a set of K hash functions H where:

H = {h1, h2, h3, … , hK}

Each of these hash functions generates a uniformly
distributed value in the range from 0 to N-1.
Now, during insertion we will compute the hash value of X
with all the K hash functions and set the corresponding
positions in the Bloom Filter vector to 1.

The Algorithm for this method can be described as follows.
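A minimal sketch of the insertion step in Python. The article does not name a hash family, so the K hash functions are simulated here by salting SHA-256 with the function index; that choice, and the names hash_positions and bf_insert, are assumptions made for illustration rather than the author's own code.

```python
import hashlib


def hash_positions(item: str, k: int, n: int) -> list[int]:
    """Return k positions in the range 0..n-1 for the given item.

    Assumption: h_i(item) is simulated as SHA-256 of "i:item" reduced
    modulo n. Any family of well-distributed hash functions would do.
    """
    positions = []
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
        positions.append(int.from_bytes(digest[:8], "big") % n)
    return positions


def bf_insert(bits: list[int], item: str, k: int) -> None:
    """Set every hashed position of the item to 1 in the filter vector."""
    for pos in hash_positions(item, k, len(bits)):
        bits[pos] = 1


# Usage: an empty filter of size N = 32 with K = 3 hash functions.
bloom = [0] * 32
bf_insert(bloom, "apple", k=3)
```

After the call, between one and three entries of bloom are 1, since two hash functions may land on the same position.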
Searching an element in the Bloom Filter
Suppose we have an element X which needs to be
checked if it is present in a set S or not. Given that all the
elements of the set S are present in the Bloom Filter (BF),
we will use the data-structure to answer the Membership
queries.
To answer this we again compute the hash of element X
with all the K hash functions we talked about in the
previous section. Once computed, we check whether the
corresponding locations in the Bloom Filter vector are
all set to 1. If any one of the locations is set to 0,
we can safely say that the element does not exist in the
set. If all of them are set to 1, the element may be
present in the set.

The algorithm for the operation looks like this.
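A minimal sketch of the membership query in Python. As an assumption for illustration, the K hash functions are simulated by salting SHA-256 with the function index, and the names hash_positions and bf_might_contain are hypothetical, not taken from the article.

```python
import hashlib


def hash_positions(item: str, k: int, n: int) -> list[int]:
    """Return k positions in 0..n-1 (assumed salted SHA-256 hash family)."""
    return [
        int.from_bytes(
            hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()[:8], "big"
        ) % n
        for i in range(k)
    ]


def bf_might_contain(bits: list[int], item: str, k: int) -> bool:
    """Return False for Definitely Not, True for Maybe."""
    # A single 0 among the hashed positions proves the item was never inserted.
    return all(bits[pos] == 1 for pos in hash_positions(item, k, len(bits)))


# Usage: mark the positions of "apple" in a filter of size N = 32, K = 3.
bloom = [0] * 32
for pos in hash_positions("apple", 3, 32):
    bloom[pos] = 1

print(bf_might_contain(bloom, "apple", k=3))       # True: inserted items always pass
print(bf_might_contain([0] * 32, "mango", k=3))    # False: an empty filter rejects everything
```

A query for an element that was never inserted usually returns False, but it returns True whenever all of its positions happen to have been set by other elements; that is the False Positive case.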


The run time complexity of both operations, inserting
an element into the Bloom Filter and answering a
Membership query, is O(K), where K is the number of
distinct hash functions used in the Bloom Filter. This
assumes that each hash function is computed in constant
time.

Conclusion
We discussed Bloom Filters in detail along with the
implementation process of a Standard Bloom Filter. We
also looked into how we can insert and search an element
in the Bloom Filter along with their run-time complexity
and code demonstrations.
Meanwhile, you can Like and Share this edition among
your peers and also subscribe to this Newsletter so that
you get notified when I come up with more content in
future. Share this Newsletter with anyone who might
benefit from this content.
Until next time, Dive Deep and Keep Learning!
37th edition in "Systems That Scale"! 🔥 ⚡
We discussed "Bloom Filters", a probabilistic data structure, in depth. This edition
covers:
✅ Implementation of Standard Bloom Filters.
✅ Inserting an element into the Bloom Filters and Membership Queries.
✅ Code demonstration of the insertion and membership-query methods.
Read On! 💯
