Ranged Queries Using Bloom Filters Final

This document proposes using Bloom filters to answer range queries on a set of strings efficiently. It describes:
1. Inserting prefixes of each string into the Bloom filter, so that a query can check whether any stored string lies between two query strings.
2. A range query algorithm that checks prefixes of increasing length to see whether any stored string falls within the query range.
3. The costs: O(nk) space and O(k) time per query or insert, with an optimized version that uses O(nk/log k) space at the same query time.

Uploaded by

Alice Qing Wong

Range Queries using Bloom Filters

Basim Baig, Hau Chan, Samuel McCauley, Alice Wong
Computer Science Department, Stony Brook University

Bloom Filter (Review)

A space-efficient data structure that represents a set S (a subset of a universe U) and answers membership queries such that
Given: x in U
Output:
return No => if x is not in S
return Yes => if x is in S, with probability >= (1 - ε) (false positives are possible)
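A minimal sketch of such a filter in Python may help make the Yes/No behavior concrete. It assumes SHA-256 with a per-hash seed as the hash family; the class name `BloomFilter`, the method names, and the default sizes are illustrative choices, not taken from the slides.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: a bit array plus num_hashes seeded hashes."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing "seed:item" with SHA-256.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely not in S"; True means "probably in S".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

A `False` answer is always correct because `add` only ever sets bits; a `True` answer can be a false positive when other items happen to have set the same bits.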

Goal
Maintain a set S, a subset of a universe U of strings, under the following operations:
Insert(x): S <- S U {x}
Query(x, y): Is there a string in S between x and y?
return No => if no string in S lies between x and y
return Yes => if there is a string in S between x and y, with a (small) false positive probability

Our Result
C*nk space
O(k) time for range queries/inserts
An optimized version that reduces the space to C*nk/log(k) while retaining the same query time

Idea
Let B be our Bloom filter structure representing S. For each insert(K), K in U, we insert every prefix K[0, p*i], i = 1, ..., |K|/p, into B. (We assume that |K| is divisible by p.)
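The prefix enumeration can be sketched as follows. The function name `prefixes` is hypothetical, and the slice `key[:p * i]` is one reading of the slide's K[0, p*i] notation.

```python
def prefixes(key, p):
    """Return the prefixes of key with lengths p, 2p, ..., len(key).

    Mirrors the slide's assumption that len(key) is divisible by p.
    """
    if p <= 0 or len(key) % p != 0:
        raise ValueError("len(key) must be a positive multiple of p")
    return [key[: p * i] for i in range(1, len(key) // p + 1)]


# Each returned prefix would be added to the Bloom filter B on insert(key).
for prefix in prefixes("cbcaab", 2):
    print(prefix)
```

With p = 2 and the key "cbcaab", this yields three inserts, which matches the |K|/p count used in the space analysis.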

Algorithm for Range Queries

We assume that all strings in U have maximum length K. For a query between X and Y, query(X, Y), if X and Y have different lengths, we pad the shorter one with wildcard characters. Procedure, starting with i = 1:
1. If p*i > K => return yes
2. For every string x between X[0, p*i] and Y[0, p*i] (inclusive):
   1. if Bloom filter query(x) returns true for more than one child => return yes
   2. if Bloom filter query(x) returns true for only the leftmost => increment i and repeat
   3. if every query(x) returns no => return no
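The level-by-level search can be sketched in Python as below. This is one interpretation rather than the authors' exact procedure: it assumes all keys and both query endpoints already have the same length `key_len` (padding is left out), it returns yes as soon as a stored prefix lies strictly between the two endpoint prefixes, and it descends only along prefixes equal to an endpoint prefix. The names `PrefixRangeFilter`, `BloomFilter`, and their parameters are hypothetical.

```python
import hashlib
from itertools import product


class BloomFilter:
    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits, self.num_hashes = num_bits, num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[q // 8] & (1 << (q % 8)) for q in self._positions(item))


class PrefixRangeFilter:
    def __init__(self, alphabet, key_len, p):
        if key_len % p != 0:
            raise ValueError("key_len must be divisible by p")
        self.key_len, self.p = key_len, p
        self.bloom = BloomFilter()
        # Every block of p characters; these are brute-forced at each level.
        self.blocks = ["".join(t) for t in product(sorted(alphabet), repeat=p)]

    def insert(self, key):
        for end in range(self.p, self.key_len + 1, self.p):
            self.bloom.add(key[:end])

    def query(self, lo, hi):
        """True if some stored key may lie in [lo, hi]; False is definite."""
        if lo > hi:
            return False
        frontier = {""}  # stored prefixes that coincide with an endpoint prefix
        for end in range(self.p, self.key_len + 1, self.p):
            lo_pre, hi_pre = lo[:end], hi[:end]
            next_frontier = set()
            for parent in frontier:
                for block in self.blocks:
                    cand = parent + block
                    if cand < lo_pre or cand > hi_pre:
                        continue
                    if not self.bloom.might_contain(cand):
                        continue
                    if lo_pre < cand < hi_pre:
                        return True  # every key with this prefix is in range
                    next_frontier.add(cand)
            if not next_frontier:
                return False
            frontier = next_frontier
        return True  # a full-length key equal to lo or hi was reported


f = PrefixRangeFilter("abc", key_len=4, p=1)
f.insert("cbca")
print(f.query("cbaa", "cbcc"))
```

In this sketch the frontier never holds more than two prefixes (the two endpoint paths), so each level costs at most two brute-force passes over the blocks of p characters.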

Space Analysis
The size of the structure is proportional to the number of inserted strings, say N, times the number of Bloom filter inserts required for the longest string. If the maximum length of an inserted string is K, that string takes K/p inserts, so we need at most O(NK/p) inserts into the Bloom filter.
Space:

Query Analysis
A Bloom filter has O(1) lookup time, and at each level we need to look up at most all of the brute-force elements. Hence, the range query time of our structure is:

Error analysis:
You can set the error rate to whatever value you need, but a lower error rate requires more space. This affects the space less than you might expect: the dominant factor is still the k/p factor outside the log.
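The trade-off can be illustrated numerically. The sketch below assumes the standard Bloom filter sizing of about -ln(ε) / (ln 2)^2 bits per inserted item, which is a textbook formula and not one stated in these slides; here ε is the false positive rate of a single Bloom filter lookup, and the function names are hypothetical.

```python
import math


def bloom_bits(num_inserts, eps):
    """Bits for a standard Bloom filter with num_inserts items and error eps."""
    return math.ceil(-num_inserts * math.log(eps) / (math.log(2) ** 2))


def prefix_filter_bits(n_strings, max_len, p, eps):
    # At most n_strings * max_len / p prefixes are inserted in total.
    return bloom_bits(n_strings * max_len // p, eps)


for eps in (0.01, 0.005):
    for max_len in (20, 40):
        print(eps, max_len, prefix_filter_bits(100, max_len, 2, eps))
```

Under this assumption, halving ε adds only a constant number of bits per inserted prefix, while doubling k/p doubles the total, which is consistent with the remark that the k/p factor dominates.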

Optimization
Instead of brute-forcing down every path that matches the input range strings, we brute-force only selected nodes.

Modified Bloom filter

Modified Query Algorithm

Ex 2: [cbcaa-cbcc] LCA = <cbc>
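Assuming that LCA here means the lowest common ancestor of the two range endpoints in the prefix tree, which is their longest common prefix, the example can be reproduced with a short helper. The function name `lca_prefix` is hypothetical.

```python
def lca_prefix(x, y):
    """Longest common prefix of x and y (their LCA in a prefix tree)."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return x[:n]


print(lca_prefix("cbcaa", "cbcc"))  # prints "cbc"
```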

Modified costs
Query cost:

Space cost:

How to get the labels

Keep four Bloom filters labeled left, right, middle, and both. Preprocessing places every node in the appropriate Bloom filter. The downside is that this makes the structure very static. In the worst case, you need to revert to the unmodified algorithm.

Thank you
