Linear Hashing: Historical Background


Linear Hashing

Donghui Zhang (1), Yannis Manolopoulos (2), Yannis Theodoridis (3), and Vassilis J. Tsotras (4)
(1) Paradigm4, Inc., Waltham, MA, USA
(2) Aristotle University, Thessaloniki, Greece
(3) University of Piraeus, Piraeus, Greece
(4) University of California-Riverside, Riverside, CA, USA

Definition

Linear Hashing is a dynamically updateable disk-based index structure which implements a hashing scheme and which grows or shrinks one bucket at a time. The index is used to support exact match queries, i.e., find the record with a given key. Compared with the B+-tree index, which also supports exact match queries (in a logarithmic number of I/Os), Linear Hashing has a better expected query cost of O(1) I/Os. Compared with Extendible Hashing, Linear Hashing does not use a bucket directory, and when an overflow occurs, it is not always the overflown bucket that is split. The name Linear Hashing is used because the number of buckets grows or shrinks in a linear fashion. Overflows are handled by creating a chain of pages under the overflown bucket. The hashing function changes dynamically, and at any given instant at most two hashing functions are used by the scheme.

Historical Background

A hash table is an in-memory data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key, find the corresponding value. It works by transforming the key, using a hash function, into a hash: a number that is used as an index into an array to locate the bucket where the values should be. Multiple keys may be hashed to the same bucket, and all keys in a bucket should be searched upon a query. Hash tables are often used to implement associative arrays, sets and caches. Like arrays, hash tables have O(1) lookup cost on average.

Foundations

The Linear Hashing scheme was introduced by Litwin [2].

Initial Layout

The Linear Hashing scheme has $m$ initial buckets labeled 0 through $m - 1$, an initial hashing function $h_0(k) = f(k) \% m$ that is used to map any key $k$ into one of the $m$ buckets (for simplicity assume $h_0(k) = k \% m$), and a pointer $p$ which points to the bucket to be split next whenever an overflow page is generated (initially $p = 0$). An example is shown in Fig. 1.
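As a minimal sketch of the initial layout described above, the following Python fragment builds $m = 4$ empty buckets, defines $h_0$ with $f(k) = k$, and places a few keys. The key values are hypothetical and are not the contents of Fig. 1; using one in-memory list per bucket is also an assumption made only for illustration.

```python
m = 4          # number of initial buckets (the value used in Fig. 1)
p = 0          # split pointer: the next bucket to be split

def h0(key: int) -> int:
    """Initial hash function h_0(k) = k % m, taking f(k) = k."""
    return key % m

# One list per bucket stands in for the bucket's pages.
buckets = [[] for _ in range(m)]

# Hypothetical keys, chosen for illustration (not the contents of Fig. 1).
for key in (5, 9, 14, 7):
    buckets[h0(key)].append(key)
```

Keys 5 and 9 both land in bucket 1, which shows how several keys can share one bucket before any split has taken place.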

© Springer Science+Business Media LLC 2017

L. Liu, M.T. Özsu (eds.), Encyclopedia of Database Systems,
DOI 10.1007/978-1-4899-7993-3_742-2

Linear Hashing, Fig. 1 An initial Linear Hashing. Here $m = 4$, $p = 0$, $h_0(k) = k \% 4$

Linear Hashing, Fig. 2 The Linear Hashing after inserting 11 into Fig. 1. Here $p = 1$, $h_0(k) = k \% 4$, $h_1(k) = k \% 8$

Bucket Split

When the first overflow occurs (it can occur in any bucket), bucket 0, which is pointed to by $p$, is split (rehashed) into two buckets: the original bucket 0 and a new bucket $m$. A new empty page is also added to the overflown bucket to accommodate the overflow. The search values originally mapped into bucket 0 (using function $h_0$) are now distributed between buckets 0 and $m$ using a new hashing function $h_1$.

As an example, Fig. 2 shows the layout of the Linear Hashing of Fig. 1 after inserting a new record with key 11. The circled records are the existing records that are moved to the new bucket. In more detail, the record is inserted into bucket $11 \% 4 = 3$. The bucket overflows and an overflow page is introduced to accommodate the new record. Bucket 0 is split and the records originally in bucket 0 are distributed between bucket 0 and bucket 4, using a new hash function $h_1(k) = k \% 8$.

The next bucket overflow, such as one triggered by inserting two records in bucket 2 or four records in bucket 3 in Fig. 2, will cause a new split that attaches a new bucket $m + 1$, and the contents of bucket 1 will be distributed using $h_1$ between buckets 1 and $m + 1$. A crucial property of $h_1$ is that search values that were originally mapped by $h_0$ to some bucket $j$ must be remapped either to bucket $j$ or to bucket $j + m$. This is a necessary property for Linear Hashing to work. An example of such a hashing function is $h_1(k) = k \% 2m$. Further bucket overflows will cause additional bucket splits in linear bucket-number order (increasing $p$ by one for every split).

Round and Hash Function Advancement

After enough overflows, all original $m$ buckets will be split. This marks the end of splitting-round 0. During round 0, $p$ advanced successively from bucket 0 to bucket $m - 1$. At the end of round 0 the Linear Hashing scheme has a total of $2m$ buckets. Hashing function $h_0$ is no longer needed, as all $2m$ buckets can be addressed by hashing function $h_1$. Variable $p$ is reset to 0 and a new round, namely splitting-round 1, starts. A new hash function $h_2$ will start to be used.

In general, the Linear Hashing scheme involves a family of hash functions $h_0$, $h_1$, $h_2$, and so on. Let the initial function be $h_0(k) = f(k) \% m$; then any later hash function is $h_i(k) = f(k) \% (2^i m)$. This way, it is guaranteed that if $h_i$ hashes a key to bucket $j \in [0..2^i m - 1]$, $h_{i+1}$ will hash the same key to either bucket $j$ or bucket $j + 2^i m$. At any time, two hash functions $h_i$ and $h_{i+1}$ are used.

Figures 3 and 4 illustrate the cases at the end of splitting-round 0 and at the beginning of splitting-round 1. In general, in splitting round $i$, the hash functions $h_i$ and $h_{i+1}$ are used. At the beginning of round $i$, $p = 0$ and there are $2^i m$ buckets. When all of these buckets are split, splitting round $i + 1$ starts: $p$ goes back to 0, the number of buckets becomes $2^{i+1} m$, and hash functions $h_{i+1}$ and $h_{i+2}$ start to be used.
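The guarantee that $h_{i+1}$ sends a key either to its old bucket $j$ or to $j + 2^i m$ can be checked numerically. The sketch below assumes $f(k) = k$ and integer keys; the function names `h` and `check_split_property` and the sample bucket contents are hypothetical, introduced only for this illustration.

```python
def h(i: int, key: int, m: int = 4) -> int:
    """Round-i hash function h_i(k) = k % (2**i * m), taking f(k) = k."""
    return key % (2 ** i * m)

def check_split_property(i: int, m: int = 4, keys=range(1000)) -> bool:
    """Verify that h_{i+1}(k) is either h_i(k) or h_i(k) + 2**i * m."""
    for k in keys:
        j = h(i, k, m)
        assert h(i + 1, k, m) in (j, j + 2 ** i * m)
    return True

# Splitting bucket 0 in round 0 with m = 4: every key with h_0(k) == 0
# either stays in bucket 0 or moves to the new bucket m = 4 under h_1.
# The keys are hypothetical, not the records shown in Fig. 1.
bucket0 = [4, 8, 16, 20]
stay = [k for k in bucket0 if h(1, k) == 0]
move = [k for k in bucket0 if h(1, k) == 4]
```

Because $2^{i+1} m$ is a multiple of $2^i m$, the remainder modulo the larger value differs from the remainder modulo the smaller one by either 0 or $2^i m$, which is why a split never has to touch any bucket other than bucket $p$ and its new image.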

Linear Hashing, Fig. 3 The Linear Hashing at the end of round 0. Here $p = 3$, $h_0(k) = k \% m$, $h_1(k) = k \% 2^1 m$

Linear Hashing, Fig. 4 The Linear Hashing at the beginning of round 1. Here $p = 0$, $h_1(k) = k \% 2^1 m$, $h_2(k) = k \% 2^2 m$

Component Summary and Search Scheme

In summary, at any time a Linear Hashing scheme has the following components:

• A value $i$ which indicates the current splitting round.
• A variable $p \in [0..2^i m - 1]$ which indicates the bucket to be split next.
• A total of $2^i m + p$ buckets, each of which consists of a primary page and possibly some overflow pages.
• Two hash functions $h_i$ and $h_{i+1}$.

A search scheme is needed to map a key $k$ to a bucket, either when searching for an existing record or when inserting a new record. The search scheme works as follows:

1. If $h_i(k) \ge p$, choose bucket $h_i(k)$, since the bucket has not been split yet in the current round.
2. If $h_i(k) < p$, choose bucket $h_{i+1}(k)$, which can be either $h_i(k)$ or its split image $h_i(k) + 2^i m$.

For example, in Fig. 2, $p = 1$. To search for record 5, since $h_0(5) = 1 \ge p$, one goes directly to bucket 1 to find the record. But to search for record 4, since $h_0(4) = 0 < p$, one needs to use $h_1$ to decide the actual bucket. In this case, the record should be searched for in bucket $h_1(4) = 4$.

Variations

A split performed whenever a bucket overflow occurs is an uncontrolled split. Let $l$ denote the Linear Hashing scheme's load factor, i.e., $l = S/b$ where $S$ is the total number of records and $b$ is the number of buckets used. The load factor achieved by uncontrolled splits is usually between 50% and 70%, depending on the page size and the search value distribution [2]. In practice, higher storage utilization is achieved if a split is triggered not by an overflow, but when the load factor $l$ becomes greater than some upper threshold. This is called a controlled split and can typically achieve 95% utilization. Other controlled schemes exist where a split is delayed until both the threshold condition holds and an overflow occurs.

Deletions will cause the hashing scheme to shrink. Buckets that have been split can be recombined if the load factor falls below some lower threshold. Then two buckets are merged together; this operation is the reverse of splitting and occurs in reverse linear order. Practical values for the lower and upper thresholds are 0.7 and 0.9, respectively.
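The components, the search scheme, and the controlled split described above can be tied together in a small in-memory model. This is only a sketch under stated assumptions, not the structure from [2]: the class name `LinearHash` and its method names are hypothetical, $f(k) = k$, each bucket is a single Python list instead of a chain of disk pages, and the load factor is normalized by an assumed page capacity `page_size` so that a threshold such as 0.9 is meaningful. Deletions and bucket merging are left out.

```python
class LinearHash:
    """Toy in-memory model of Linear Hashing with controlled splits."""

    def __init__(self, m: int = 4, page_size: int = 4, max_load: float = 0.9):
        self.m = m                    # number of initial buckets
        self.i = 0                    # current splitting round
        self.p = 0                    # next bucket to be split
        self.page_size = page_size    # assumed records per primary page
        self.max_load = max_load      # upper load-factor threshold
        self.buckets = [[] for _ in range(m)]
        self.count = 0                # total number of records S

    def _h(self, level: int, key: int) -> int:
        # h_level(k) = k % (2**level * m), taking f(k) = k
        return key % (2 ** level * self.m)

    def address(self, key: int) -> int:
        """Search scheme: use h_i unless that bucket was already split."""
        j = self._h(self.i, key)
        if j < self.p:                # bucket j was split in this round
            j = self._h(self.i + 1, key)
        return j

    def load_factor(self) -> float:
        # Assumption: records divided by total primary-page capacity.
        return self.count / (len(self.buckets) * self.page_size)

    def search(self, key: int) -> bool:
        return key in self.buckets[self.address(key)]

    def insert(self, key: int) -> None:
        self.buckets[self.address(key)].append(key)
        self.count += 1
        if self.load_factor() > self.max_load:   # controlled split
            self._split()

    def _split(self) -> None:
        old = self.buckets[self.p]
        self.buckets.append([])       # new bucket has index 2**i * m + p
        stay = [k for k in old if self._h(self.i + 1, k) == self.p]
        move = [k for k in old if self._h(self.i + 1, k) != self.p]
        self.buckets[self.p], self.buckets[-1] = stay, move
        self.p += 1
        if self.p == 2 ** self.i * self.m:       # all buckets of round i split
            self.i += 1
            self.p = 0
```

Setting `p = 1` on an instance with `m = 4` and `i = 0` reproduces the search example given for Fig. 2: `address(5)` yields bucket 1 and `address(4)` yields bucket 4. Note that the bucket that is split is always bucket `p`, regardless of which bucket received the record that pushed the load factor over the threshold.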

Linear Hashing has been further investigated in an effort to design more efficient variations. In [3] a performance comparison study of four Linear Hashing variations is reported.

Key Applications

Linear Hashing has been implemented in commercial database systems. It is used in applications where the exact match query is the most important query, such as hash join [4]. It has been adopted in the Icon language [1].

Cross-References

Extendible Hashing
Hashing
Hash-based Indexing

Recommended Reading

1. Griswold WG, Townsend GM. The design and implementation of dynamic hashing for sets and tables in Icon. Softw Pract Ex. 1993;23(4):351–67.
2. Litwin W. Linear hashing: a new tool for file and table addressing. In: Proceedings of the Sixth International Conference on Very Large Databases; 1980. p. 212–23.
3. Manolopoulos Y, Lorentzos N. Performance of linear hashing schemes for primary key retrieval. Inf Syst. 1994;19(5):433–46.
4. Schneider DA, DeWitt DJ. Tradeoffs in processing complex join queries via hashing in multiprocessor database machines. In: Proceedings of the 16th International Conference on Very Large Databases; 1990. p. 469–80.
