Principles of Database Management Systems: 4.2: Hashing Techniques

This document discusses hashing techniques for database management systems. It describes how hashing works by using a hash function to map keys to storage locations. Two common hashing alternatives are presented: using the hash value directly to determine the storage block, or locating records indirectly via index buckets. The document then discusses dynamic hashing techniques like extensible hashing and linear hashing that allow hash tables to grow without full reorganizations.

Principles of Database Management Systems
4.2: Hashing Techniques

Pekka Kilpeläinen
(after Stanford CS245 slide originals by Hector Garcia-Molina, Jeff Ullman and Jennifer Widom)

DBMS 200 Notes 4.2: Hashi 1

Hashing?
Locating the storage block of a record by the hash value h(k) of its key k
Normally really fast: records are (often) located by a single disk access

Hashing
[Figure: a key is mapped by h(key) to one of the buckets (typically 1 disk block each)]

Two alternatives
(1) Hash value determines the storage block directly
[Figure: key -> h(key) -> block holding the records]
Used to implement a primary index

Two alternatives
(2) Records located indirectly via index buckets
[Figure: key -> h(key) -> index bucket -> record]
Used for a secondary index

Example hash function

Key = x1 x2 ... xn, an n-byte character string
Have b buckets
h = (x1 + x2 + ... + xn) mod b, a value in {0, 1, ..., b-1}
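
The example hash function can be sketched in a few lines of Python. This is only an illustration: the name byte_sum_hash is hypothetical, and the key is assumed to be encoded as UTF-8 bytes.

```python
def byte_sum_hash(key: str, b: int) -> int:
    """Sum the bytes x1..xn of the key and reduce modulo the bucket count b."""
    if b <= 0:
        raise ValueError("number of buckets must be positive")
    return sum(key.encode("utf-8")) % b


# Keys made of the same characters in a different order always collide,
# because addition ignores the order of the bytes.
print(byte_sum_hash("abc", 4), byte_sum_hash("cba", 4))
```

The result always lies in {0, ..., b-1}, but permutations of the same characters land in the same bucket, so the spread over buckets can be uneven for real key sets.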

This may not be the best function

Good hash function: Expected number of keys/bucket is the same for all buckets

Read Knuth Vol. 3 if you really need to select a good function.

Next: example to illustrate inserts, overflows, and deletes with a hash function h(K)

EXAMPLE: 2 records/bucket

INSERT:
h(a) = 1
h(b) = 2
h(c) = 1
h(d) = 0
h(e) = 1

Resulting buckets:
0: d
1: a, c (overflow block: e)
2: b
3: (empty)
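
The insert example can be replayed with a small in-memory model. This is a sketch under stated assumptions: the class StaticHashFile and its methods are hypothetical names, a bucket is modeled as a list of blocks (the primary block followed by overflow blocks), and the hash values are taken from the example instead of being computed.

```python
class StaticHashFile:
    """Fixed number of buckets; each bucket is a chain of fixed-size blocks."""

    def __init__(self, num_buckets, block_capacity, h):
        self.capacity = block_capacity
        self.h = h  # hash function: key -> bucket number
        # Every bucket starts with one empty primary block.
        self.buckets = [[[]] for _ in range(num_buckets)]

    def insert(self, key):
        chain = self.buckets[self.h(key)]
        for block in chain:
            if len(block) < self.capacity:
                block.append(key)
                return
        chain.append([key])  # all blocks full: chain a new overflow block

    def lookup(self, key):
        return any(key in block for block in self.buckets[self.h(key)])


# Hash values as given in the example.
example_hash = {"a": 1, "b": 2, "c": 1, "d": 0, "e": 1}
hash_file = StaticHashFile(num_buckets=4, block_capacity=2,
                           h=example_hash.__getitem__)
for key in "abcde":
    hash_file.insert(key)

for number, chain in enumerate(hash_file.buckets):
    print(number, chain)
```

Bucket 1 ends up with two blocks: the primary block holding a and c, and an overflow block holding e.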

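EXAMPLE: deletion

Delete: e, f, c
[Figure: buckets 0-3, with an overflow block chained to bucket 1. After deleting f, maybe move g up; after deleting c, a record from the overflow block (d) can move into the freed slot.]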

Rule of thumb:
Try to keep space utilization between 50% and 80%
Utilization = (# keys used) / (total # keys that fit)
If < 50%, wasting space
If > 80%, overflows significant (depends on how good the hash function is and on # keys/bucket)
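
As a quick worked example of the rule of thumb, the sketch below computes the utilization of a hash file. The function names are hypothetical, and "total # keys that fit" is assumed to count primary blocks only (number of buckets times keys per bucket).

```python
def utilization(keys_used: int, num_buckets: int, keys_per_bucket: int) -> float:
    """Utilization = # keys used / total # keys that fit."""
    return keys_used / (num_buckets * keys_per_bucket)


def assess(u: float) -> str:
    """Apply the 50%-80% rule of thumb to a utilization value."""
    if u < 0.5:
        return "wasting space"
    if u > 0.8:
        return "overflows likely significant"
    return "within the 50%-80% band"


# 5 keys stored in 4 buckets that hold 2 keys each: 5 / 8
u = utilization(keys_used=5, num_buckets=4, keys_per_bucket=2)
print(f"{u:.3f}: {assess(u)}")
```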

How do we cope with growth?
Overflows and reorganizations
Dynamic hashing: # of buckets may vary
- Extensible
- Linear
- also others ...

Extensible hashing: two ideas

(a) Use i of the b bits output by the hash function (for example, b = 32)
[Figure: the b-bit value h(K) = 00110101...; the leading i bits are used, and i grows over time]

(b) Use directory
[Figure: h(K)[i] selects a directory entry, which points to a bucket]

Directory contains 2^i pointers to buckets, and stores i.
Each bucket stores j, indicating the # of bits used for placing the records in this block (j ≤ i)

Extensible Hashing: Insertion
If there's room in bucket h(k)[i], place the record there; otherwise:
- If j = i, set i = i+1 and double the directory (now j < i)
- If j < i, split the block in two and distribute its records among them, now using j+1 bits of h(k) (repeat until some records end up in the new bucket); update the pointers of the bucket array
See the next example
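
The insertion procedure can be sketched in Python as follows. This is an illustrative in-memory model, not a disk-based implementation: the class and method names are hypothetical, keys are assumed to be small integers that serve as their own b-bit hash values, and h(k)[i] is taken to be the i leading bits of h(k).

```python
class Bucket:
    def __init__(self, j):
        self.j = j        # number of hash bits used to place records here
        self.keys = []


class ExtensibleHashTable:
    def __init__(self, hash_bits=4, capacity=2, h=lambda k: k):
        self.b = hash_bits        # b: number of bits produced by h
        self.capacity = capacity  # records per bucket (block)
        self.h = h
        self.i = 1                # leading bits used by the directory
        self.directory = [Bucket(1), Bucket(1)]

    def _index(self, key):
        # h(k)[i]: the i leading bits of the b-bit hash value
        return self.h(key) >> (self.b - self.i)

    def lookup(self, key):
        return key in self.directory[self._index(key)].keys

    def insert(self, key):
        while True:
            bucket = self.directory[self._index(key)]
            if len(bucket.keys) < self.capacity:
                bucket.keys.append(key)
                return
            if bucket.j == self.b:
                # All hash bits are used up; a real file would need overflow blocks.
                raise OverflowError("cannot split bucket any further")
            if bucket.j == self.i:
                # Double the directory: old entry d becomes entries 2d and 2d+1.
                self.i += 1
                self.directory = [bkt for bkt in self.directory for _ in (0, 1)]
            self._split(bucket)  # then retry the insertion

    def _split(self, bucket):
        bucket.j += 1
        new = Bucket(bucket.j)
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:
            # Bit number j (counted from the left) decides which block gets k.
            bit = (self.h(k) >> (self.b - bucket.j)) & 1
            (new if bit else bucket).keys.append(k)
        # Redirect the directory entries whose j-th leading bit is 1.
        shift = self.i - bucket.j
        for d, bkt in enumerate(self.directory):
            if bkt is bucket and (d >> shift) & 1:
                self.directory[d] = new


table = ExtensibleHashTable(hash_bits=4, capacity=2)
for key in [0b0001, 0b1001, 0b1100, 0b1010]:
    table.insert(key)
print(table.i, [[format(k, "04b") for k in bkt.keys] for bkt in table.directory])
```

The loop in insert covers the "repeat" step: if all records end up on one side after a split, the target bucket is still full and is split again.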
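Example: h(k) is 4 bits; 2 keys/block
Before: i = 1
0 -> bucket (j=1): 0001
1 -> bucket (j=1): 1001, 1100
Insert 1010: bucket 1 is full and j = i, so the directory doubles and the bucket is split
After (new directory): i = 2
00, 01 -> bucket (j=1): 0001
10 -> bucket (j=2): 1001, 1010
11 -> bucket (j=2): 1100

Example continued
Insert: 0111, 0000
0111 fits into the bucket (j=1) that holds 0001. Inserting 0000 overflows that bucket; since j < i, it is split without doubling the directory
After: i = 2
00 -> bucket (j=2): 0000, 0001
01 -> bucket (j=2): 0111
10 -> bucket (j=2): 1001, 1010
11 -> bucket (j=2): 1100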

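Example continued
Insert: 1001
Bucket 10 (1001, 1010) is full and j = i = 2, so the directory doubles and the bucket is split
After: i = 3
000, 001 -> bucket (j=2): 0000, 0001
010, 011 -> bucket (j=2): 0111
100 -> bucket (j=3): 1001, 1001
101 -> bucket (j=3): 1010
110, 111 -> bucket (j=2): 1100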

Extensible hashing: deletion

Reverse insert procedure

Example:
Walk thru insert example in reverse!

Summary: Extensible hashing
+ Can handle growing files without full reorganizations
+ Only one data block examined
- Indirection (not bad if the directory is in memory)
- Directory doubles in size (first it fits in memory, then it does not: sudden performance degradation)

Linear hashing: grow # of buckets by one

Two ideas:
(a) Use the i low-order bits of the hash value
[Figure: the b-bit value 01110101; the i low-order bits are used, and i grows]
(b) File grows linearly

No bucket directory needed

Linear Hashing: Parameters
n: number of buckets in use
  buckets numbered 0 ... n-1
i: number of bits of h(k) used to address buckets
  i = ⌈log2 n⌉
r: number of records in hash table
  ratio r/n limited to fit an avg bucket in a block
  next example: r ≤ 1.7n, and a block holds 2 records
  => AVG bucket occupancy is ≤ 1.7/2 = 0.85 of a block

Example: 2 keys/block, b = 4 bits, n = 2, i = 1
Bucket 0: 0000, 1010
Bucket 1: 1111
Insert 0101: goes to bucket 1, which now holds 0101, 1111
Now r = 4 > 1.7n, so get new bucket 10 and distribute keys between buckets 00 and 10

Rule: If h(k)[i] = (a1 ... ai)_2 < n, then look at bucket h(k)[i]; else look at bucket h(k)[i] - 2^(i-1) = (0 a2 ... ai)_2
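
The addressing rule translates directly into a few lines of Python. The function name bucket_address is hypothetical; hash values are assumed to be non-negative integers, and h(k)[i] is taken to be the i low-order bits.

```python
def bucket_address(hash_value: int, i: int, n: int) -> int:
    """Map a hash value to a bucket number in 0..n-1 using its i low-order bits."""
    m = hash_value & ((1 << i) - 1)  # h(k)[i] = (a1 ... ai)_2
    if m < n:
        return m                     # bucket m is in use
    return m - (1 << (i - 1))        # clear the leading bit: (0 a2 ... ai)_2


# With n = 3 buckets and i = 2 bits, bucket 11 is not in use yet,
# so a key whose hash value ends in 11 is redirected to bucket 01.
print(bucket_address(0b0111, i=2, n=3))
```

Because n > 2^(i-1) always holds, clearing the leading bit yields the number of a bucket that is already in use.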

n = 3, i = 2; distribute keys between buckets 00 and 10:
Bucket 00: 0000 (1010 moves to bucket 10)
Bucket 01: 0101, 1111
Bucket 10: 1010

n = 3, i = 2; insert 0001:
can have overflow chains!
Bucket 00: 0000
Bucket 01: 0101, 1111 -> overflow block: 0001
Bucket 10: 1010

n = 3, i = 2; insert 0111:
bucket 11 not in use, so redirect to 01
Bucket 00: 0000
Bucket 01: 0101, 1111 -> overflow block: 0001, 0111
Bucket 10: 1010
now r = 6 > 1.7n -> get new bucket 11

n = 4, i = 2; distribute keys between buckets 01 and 11:
Bucket 00: 0000
Bucket 01: 0101, 0001
Bucket 10: 1010
Bucket 11: 1111, 0111
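
The whole linear hashing example can be replayed with a small in-memory model. This is a sketch under stated assumptions: the class LinearHashFile and its members are hypothetical names, keys are integers that serve as their own hash values, and each bucket is kept as one Python list in which everything beyond the block capacity stands for the overflow chain.

```python
import math


class LinearHashFile:
    def __init__(self, capacity=2, max_ratio=1.7, h=lambda k: k):
        self.capacity = capacity    # records per block
        self.max_ratio = max_ratio  # upper limit for r/n
        self.h = h
        self.n = 2                  # buckets in use, numbered 0..n-1
        self.i = 1                  # low-order hash bits used
        self.r = 0                  # records stored
        self.buckets = [[], []]

    def _address(self, key):
        m = self.h(key) & ((1 << self.i) - 1)
        return m if m < self.n else m - (1 << (self.i - 1))

    def insert(self, key):
        self.buckets[self._address(key)].append(key)
        self.r += 1
        if self.r > self.max_ratio * self.n:
            self._add_bucket()

    def _add_bucket(self):
        new = self.n                     # number of the bucket being added
        self.n += 1
        if self.n > (1 << self.i):
            self.i += 1                  # one more bit is needed
        old = new - (1 << (self.i - 1))  # bucket that held the new bucket's keys
        keys, self.buckets[old] = self.buckets[old], []
        self.buckets.append([])
        for k in keys:                   # distribute keys between old and new
            self.buckets[self._address(k)].append(k)

    def overflow_blocks(self, bucket):
        return max(0, math.ceil(len(self.buckets[bucket]) / self.capacity) - 1)


lh = LinearHashFile()
for key in [0b0000, 0b1010, 0b1111, 0b0101, 0b0001, 0b0111]:
    lh.insert(key)
print(lh.n, lh.i, [[format(k, "04b") for k in bkt] for bkt in lh.buckets])
```

Note that only one bucket is split per added bucket, in a fixed order (00, then 01, ...), regardless of which bucket actually overflowed.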

Example Continued: How to grow beyond this?

i = 2 -> 3
[Figure: buckets 00, 01, 10, 11 are readdressed with 3 bits as 000, 001, 010, 011; new buckets 100, 101, 110, 111 are added one at a time. m = max used block, advancing from 11 to 100, 101, ...]

Summary: Linear Hashing
+ Can handle growing files without full reorganizations
+ No indirection directory of extensible hashing
- Can have overflow chains, but probability of long chains can be kept low by controlling the r/n fill ratio (?)

Summary

Hashing
- How it works
- Dynamic hashing
- Extensible
- Linear

Indexing vs Hashing
Index definition in SQL

Indexing vs Hashing

Hashing is good for probes given a key

e.g., SELECT ...
      FROM R
      WHERE R.A = 5

Indexing vs Hashing

INDEXING (including B-trees) is good for range searches:

e.g., SELECT ...
      FROM R
      WHERE R.A > 5
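
The difference can be illustrated with in-memory stand-ins. In this sketch a Python dict plays the role of a hash index and a sorted key list searched with bisect plays the role of an ordered index such as a B-tree; the rows and function names are hypothetical.

```python
import bisect

# Hypothetical relation R as (A, payload) pairs.
rows = [(7, "g"), (2, "b"), (5, "e"), (9, "i"), (3, "c")]

# Hash-style index on A: suits the probe WHERE R.A = 5.
hash_index = {}
for a, payload in rows:
    hash_index.setdefault(a, []).append(payload)


def probe_equal(value):
    return hash_index.get(value, [])


# Ordered index on A: suits the range search WHERE R.A > 5.
sorted_rows = sorted(rows)
sorted_keys = [a for a, _ in sorted_rows]


def probe_greater(value):
    start = bisect.bisect_right(sorted_keys, value)
    return [payload for _, payload in sorted_rows[start:]]


print(probe_equal(5), probe_greater(5))
```

The hash index gives no help with the range query: hashing scatters neighboring key values, so every bucket would have to be scanned, whereas the ordered structure finds the start of the range and reads on from there.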

Index definition in SQL

CREATE INDEX name ON rel (attr)

CREATE UNIQUE INDEX name ON rel (attr)
  (defines a candidate key)

DROP INDEX name
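
These statements can be tried out with Python's built-in sqlite3 module. This is only a sketch: SQLite is used here as a convenient stand-in for a DBMS, and the table, column values, and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rel (attr INTEGER, other TEXT)")

# Plain index, then drop it again.
conn.execute("CREATE INDEX rel_attr_idx ON rel (attr)")
conn.execute("DROP INDEX rel_attr_idx")

# Unique index: attr now behaves like a candidate key.
conn.execute("CREATE UNIQUE INDEX rel_attr_uniq ON rel (attr)")
conn.execute("INSERT INTO rel VALUES (5, 'first')")
try:
    conn.execute("INSERT INTO rel VALUES (5, 'duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Column 1 of each PRAGMA index_list row is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('rel')")]
print(duplicate_rejected, index_names)
```

Neither CREATE INDEX statement says which index structure is used; that choice is left to the system.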

Note: CANNOT SPECIFY TYPE OF INDEX (e.g. B-tree, Hashing, ...)
OR PARAMETERS (e.g. Load Factor, Size of Hash, ...)
... at least in SQL
Oracle and IBM DB2 UDB provide a PCTFREE clause to indicate the proportion of B-tree blocks initially left unfilled
Oracle: Hash clusters with built-in or DBA-specified hash function

The BIG picture.
Chapters 2 & 3: Storage, records,
blocks...
Chapter 4: Access Mechanisms
- Indexes
- B trees
- Hashing
NEXT
Chapters 6 & 7: Query Processing
