Hashing: An Ideal Hash Table

The document discusses hashing and hash tables. It defines hashing as a technique for performing insertions, deletions and searches in constant average time by mapping keys to addresses in a structure. It defines a hash table as a data structure for storing items so that searching, insertion and deletion can be done quickly. A hash table uses a hash function to map keys to indexes in an array of buckets or memory locations. The document also discusses hash functions, load factor, separate chaining, open addressing including linear probing, quadratic probing and double hashing to handle collisions when keys map to the same index.

Uploaded by

Mehwish Mehmood
Copyright © All Rights Reserved

HASHING

• The implementation of hash tables is called hashing.

• Hashing is a technique for performing insertions, deletions, and finds in constant average time.

• It is a process whereby an item is placed into a structure based on a key-to-address mapping.

• It takes a potentially huge range of values and maps it to a much smaller range of values.

HASH TABLE
• A hash table is a data structure for storing a set of items so that searching, insertion, and deletion can be performed quickly.

• A hash table is normally created with a certain number of buckets, or storage locations.

• The hash table data structure is merely an array of fixed size containing the keys.

• Each data element in a hash table is identified by a key.

• A key is a string with an associated value.

• Each key is mapped into some number in the range 0 to m-1 (m = table size) and placed in the appropriate cell.

• The mapping is called a hash function. Ideally it should be simple to compute and should ensure that any two distinct keys get different cells.

• Hash tables are one of the most efficient means of searching for data.

0   John
1   Phil
2
3   Dave
4

AN IDEAL HASH TABLE

LOAD FACTOR
The load factor is the ratio of the number of elements in the hash table to the table size. It is denoted by lambda (λ):

λ = N / M

Where;
N = number of elements (keys).
M = size of the table.
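
The ratio above can be computed directly. The following is a minimal sketch; the function name load_factor is an assumption made for illustration only.

```python
def load_factor(num_keys: int, table_size: int) -> float:
    """Return lambda = N / M for a table holding N keys in M cells."""
    if table_size <= 0:
        raise ValueError("table size must be positive")
    return num_keys / table_size

# Five keys stored in a table of ten cells give a load factor of one half.
print(load_factor(5, 10))  # 0.5
```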
HASH FUNCTION
• The hashing algorithm is called the hash function.

• The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes.

• A hash table works best if the table size is a prime number.

• Hashing is done with a hash function such that;

H (Primary Key) = External Key

There are seven different methods of hashing, that is, ways of inserting values into a key-accessed table. The two most popular methods are:

1. Division Method
2. Multiplication Method

1. DIVISION METHOD
Formula:
H (k) = k mod m (if the table starts at index 0)
H (k) = (k mod m) + 1 (if the table starts at index 1)
Where;
m = table size
k = key

EXAMPLE:
Store the given key values {18, 12, 71, 28} into a hash table of size m=7 using the division method.

H (18) = 18 mod 7 = 4
H (12) = 12 mod 7 = 5
H (71) = 71 mod 7 = 1
H (28) = 28 mod 7 = 0

Placing each key in the cell given by its hash value, we get the hash table:

Hash value   Key
0            28
1            71
2
3
4            18
5            12
6
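
The division-method example can be replayed in a few lines. This is a sketch for a table indexed from 0; the names division_hash and table are assumptions chosen for illustration.

```python
def division_hash(key: int, table_size: int) -> int:
    """Division method for a table indexed from 0: H(k) = k mod m."""
    return key % table_size

m = 7
table = [None] * m                       # one cell per index 0..m-1
for key in (18, 12, 71, 28):
    table[division_hash(key, m)] = key   # these four keys do not collide

print(table)  # [28, 71, None, None, 18, 12, None]
```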

2. MULTIPLICATION METHOD
Formula:
H (k) = ⌊ m (kA mod 1) ⌋
Where;
m = table size
k = key
A = 0.01 (constant)
kA mod 1 = the fractional part of kA
⌊ ⌋ = floor (round down to the nearest integer)

EXAMPLE:
Store the given key values {18, 25, 71} into a hash table of size m=5 using the multiplication method.

H (18) = ⌊ 5 (18 * 0.01 mod 1) ⌋
H (18) = ⌊ 5 (0.18) ⌋
H (18) = ⌊ 0.9 ⌋
H (18) = 0

H (71) = ⌊ 5 (71 * 0.01 mod 1) ⌋
H (71) = ⌊ 5 (0.71) ⌋
H (71) = ⌊ 3.55 ⌋
H (71) = 3

H (25) = ⌊ 5 (25 * 0.01 mod 1) ⌋
H (25) = ⌊ 5 (0.25) ⌋
H (25) = ⌊ 1.25 ⌋
H (25) = 1

Hash value   Key
0            18
1            25
2
3            71
4
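
The multiplication-method arithmetic can be checked with the sketch below. It uses exact fractions instead of floating point so that the fractional part is not disturbed by rounding; that choice, and the name multiplication_hash, are assumptions of this sketch rather than part of the method.

```python
import math
from fractions import Fraction

def multiplication_hash(key: int, table_size: int,
                        a: Fraction = Fraction(1, 100)) -> int:
    """Multiplication method: H(k) = floor(m * (k*A mod 1)), with A = 0.01 by default."""
    fractional_part = (key * a) % 1          # k*A mod 1
    return math.floor(table_size * fractional_part)

for key in (18, 25, 71):
    print(key, multiplication_hash(key, 5))
# 18 0
# 25 1
# 71 3
```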

SEPARATE CHAINING
• A chained hash table stores its data in linked lists; each list holds all of the entries whose keys map to the same hash value. A list (or bucket) can grow to contain as many entries as required, but as it grows, access to its entries becomes gradually slower.

• For separate chaining the load factor should be close to 1.

• The main disadvantage of chaining is that one needs 3m memory cells for the data.

• A further disadvantage is that chaining requires pointers, which tends to slow the algorithm down a bit.

0   A
1
2
3
4
5   E -> C
6

A SEPARATE CHAINING TABLE
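
A chained table can be sketched as follows. Two simplifications are assumed here: each chain is a Python list rather than a hand-built linked list, and Python's built-in hash() stands in for the hash function. The class name ChainedHashTable is hypothetical.

```python
class ChainedHashTable:
    """Separate chaining: every cell holds a chain of (key, value) pairs."""

    def __init__(self, table_size: int = 7):
        self.buckets = [[] for _ in range(table_size)]

    def _chain(self, key):
        # Built-in hash() is an assumption; any hash function would do.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)      # overwrite an existing key
                return
        chain.append((key, value))           # otherwise extend the chain

    def find(self, key):
        for existing_key, value in self._chain(key):
            if existing_key == key:
                return value
        return None

    def delete(self, key):
        chain = self._chain(key)
        chain[:] = [(k, v) for k, v in chain if k != key]

table = ChainedHashTable(7)
for key in (5, 12, 19):                      # all three map to cell 5 (mod 7)
    table.insert(key, str(key))
print(table.buckets[5])  # [(5, '5'), (12, '12'), (19, '19')]
```

The longer a single chain becomes, the more entries find has to walk through, which is the slowdown described in the first bullet.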

OPEN ADDRESSING
• In the open addressing hash table scheme, the elements are stored in the hash table itself.

• In open addressing, if a collision occurs, alternative cells are tried until an empty cell is found.

• Standard deletion cannot be performed in open addressing hash tables, because the cell might have caused a collision that pushed a later key past it; emptying the cell would cut the probe sequence that leads to that key.

• The load factor should be below 0.5 for open addressing.

• Three probing techniques are introduced:

   - LINEAR PROBING
   - QUADRATIC PROBING
   - DOUBLE HASHING
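
One common workaround for the deletion problem, not described in these notes, is to mark a deleted cell with a special marker instead of emptying it. The sketch below assumes that approach and a simple next-cell probe; the names EMPTY, DELETED, probe, insert, find and delete are all hypothetical.

```python
EMPTY = None
DELETED = object()   # marker for a deleted cell (hypothetical name)

def probe(key, m):
    """Cells to try, in order: (key mod m + i) mod m for i = 0, 1, ..., m-1."""
    return ((key % m + i) % m for i in range(m))

def insert(table, key):
    for slot in probe(key, len(table)):
        if table[slot] is EMPTY or table[slot] is DELETED:
            table[slot] = key
            return slot
    raise RuntimeError("table is full")

def find(table, key):
    for slot in probe(key, len(table)):
        if table[slot] is EMPTY:      # a truly empty cell ends the search
            return None
        if table[slot] == key:        # deleted markers are skipped over
            return slot
    return None

def delete(table, key):
    slot = find(table, key)
    if slot is not None:
        table[slot] = DELETED         # mark the cell instead of emptying it

table = [EMPTY] * 10
insert(table, 19)    # goes to cell 9
insert(table, 39)    # collides at cell 9 and wraps around to cell 0
delete(table, 19)
print(find(table, 39))  # 0
```

Had cell 9 simply been emptied, the search for 39 would have stopped at cell 9 and reported the key as missing.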
1. LINEAR PROBING
• The least efficient of the three methods.

• Performance degrades as the hash table gets nearly full.

• Causes primary clustering.

FORMULA FOR LINEAR PROBING:

H (k, i) = (h’ (k) + i) mod m

Where;
i = probe number (i = 0 for the first attempt, then 1, 2, ... after each collision; it restarts for every key)
k = key
m = table size
h’(k) = hash value obtained by the division method.
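
The formula describes a sequence of cells that is tried in order. A minimal sketch, assuming the division method for h’(k); the generator name linear_probes is hypothetical.

```python
from itertools import islice

def linear_probes(key: int, m: int):
    """Yield H(k, i) = (h'(k) + i) mod m for i = 0, 1, ..., m-1."""
    home = key % m                  # h'(k), division method
    for i in range(m):
        yield (home + i) % m

# The first four cells tried for key 119 in a table of size 10.
print(list(islice(linear_probes(119, 10), 4)))  # [9, 0, 1, 2]
```

Because consecutive probes visit neighbouring cells, keys that collide pile up in runs of occupied cells, which is the primary clustering mentioned above.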

2. QUADRATIC PROBING
• Eliminates the primary clustering problem of linear probing.

• There is no guarantee of finding an empty cell once the table gets more than half full.

• Faster in practice.

• Creates the secondary clustering problem.

• If quadratic probing is used and the table size is prime, then a new element can always be inserted if the table is at least half empty.

FORMULA FOR QUADRATIC PROBING:

H (k, i) = (h’ (k) + i²) mod m
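
The guarantee for prime table sizes can be observed on small tables. The sketch below assumes the division method for h’(k) and a probe number i that starts at 0; the function name quadratic_probes is hypothetical.

```python
def quadratic_probes(key: int, m: int, count: int):
    """Return the first `count` cells H(k, i) = (h'(k) + i*i) mod m, i = 0, 1, ..."""
    home = key % m
    return [(home + i * i) % m for i in range(count)]

# m = 11 (prime): the first 6 probes all land on different cells.
print(quadratic_probes(7, 11, 6))                 # [7, 8, 0, 5, 1, 10]

# m = 10: only 6 of the 10 cells are ever reached, however long the sequence.
print(sorted(set(quadratic_probes(7, 10, 100))))  # [1, 2, 3, 6, 7, 8]
```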

3. DOUBLE HASHING
• The ideal hashing approach.

• Prevents both primary and secondary clustering.

• More complicated, and requires more running time for hashing.

• m must be a prime number.

• h2 (k) must never be equal to zero.

FORMULA FOR DOUBLE HASHING:

H (k, i) = (h1 (k) + i h2 (k)) mod m

Where;
h1 (k) = k mod m
h2 (k) = R - (k mod R)
R = the largest prime number smaller than the table size.
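
A short sketch of the two pieces of this formula: finding R and generating the probe sequence. The helper names largest_prime_below and double_hash_probes are hypothetical, and h1 is assumed to be the division method.

```python
def largest_prime_below(n: int) -> int:
    """Return the largest prime strictly smaller than n (n must be at least 3)."""
    for candidate in range(n - 1, 1, -1):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            return candidate
    raise ValueError("no prime below n")

def double_hash_probes(key: int, m: int, count: int):
    """Return the first `count` cells (h1(k) + i*h2(k)) mod m, i = 0, 1, ..."""
    r = largest_prime_below(m)
    h1 = key % m
    h2 = r - (key % r)              # lies in 1..r, so it is never zero
    return [(h1 + i * h2) % m for i in range(count)]

print(largest_prime_below(10))        # 7
print(double_hash_probes(49, 10, 3))  # [9, 6, 3]
```

Since k mod R is at most R-1, the step h2 (k) is at least 1, which satisfies the requirement that it never be zero.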
NUMERICAL
Using linear probing, store the following elements in a hash table of size 10:
19, 211, 54, 39, 43, 119, 1101, 7, 111, 207

Data:
M=10, index 0-9
First find the hash values for the given keys using the division method formula
H (k) = k (mod 10)
H (19) = 19 (mod 10) = 9
H (211) = 211 (mod 10) = 1
H (54) = 54 (mod 10) = 4
H (39) = 39 (mod 10) = 9 (collision occurs here)
H (43) = 43 (mod 10) = 3
H (119) = 119 (mod 10) = 9 (collision occurs here)
H (1101) = 1101 (mod 10) = 1 (collision occurs here)
H (7) = 7 (mod 10) = 7
H (111) = 111 (mod 10) = 1 (collision occurs here)
H (207) = 207 (mod 10) = 7 (collision occurs here)

USING LINEAR PROBING:

H (k, i) = (h’ (k) + i) mod m

The keys are inserted in the given order. For each colliding key, i starts at 1 and increases by 1 until an empty cell is found.

FOR 39:
H (39, 1) = (9+1) mod 10
= 10 mod 10
= 0 (empty space found)

FOR 119:
H (119, 1) = (9+1) mod 10
= 0 (already occupied, put i=2 now)
H (119, 2) = (9+2) mod 10
= 1 (already occupied, put i=3 now)
H (119, 3) = (9+3) mod 10
= 2 (empty space found)

FOR 1101:
H (1101, 1) = (1+1) mod 10
= 2 (already occupied, put i=2 now)
H (1101, 2) = (1+2) mod 10
= 3 (already occupied, put i=3 now)
H (1101, 3) = (1+3) mod 10
= 4 (already occupied, put i=4 now)
H (1101, 4) = (1+4) mod 10
= 5 (empty space found)

FOR 111:
H (111, 1) = (1+1) mod 10
= 2 (already occupied, put i=2 now)
H (111, 2) = (1+2) mod 10
= 3 (already occupied, put i=3 now)
H (111, 3) = (1+3) mod 10
= 4 (already occupied, put i=4 now)
H (111, 4) = (1+4) mod 10
= 5 (already occupied, put i=5 now)
H (111, 5) = (1+5) mod 10
= 6 (empty space found)

FOR 207:
H (207, 1) = (7+1) mod 10
= 8 (empty space found)

HASH VALUES   KEYS
0             39
1             211
2             119
3             43
4             54
5             1101
6             111
7             7
8             207
9             19
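
The whole linear probing exercise can be replayed with the sketch below, assuming the probe number i restarts at 0 for every key. The function name insert_linear is hypothetical.

```python
def insert_linear(table, key):
    """Insert key with linear probing and return the cell that was used."""
    m = len(table)
    for i in range(m):                  # i restarts at 0 for every key
        slot = (key % m + i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("table is full")

table = [None] * 10
for key in (19, 211, 54, 39, 43, 119, 1101, 7, 111, 207):
    insert_linear(table, key)

print(table)  # [39, 211, 119, 43, 54, 1101, 111, 7, 207, 19]
```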

Using quadratic probing, store the following elements in the hash table.
199, 11, 354, 99, 343, 9, 101, 77, 1011, 107

GIVEN HASH TABLE

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
-     -     -     -     -     -     -     -     -     -

Data:
M=10, index 0-9
The hash value of each key is found using the division method formula
H (k) = k (mod 10)
and each collision is resolved using the quadratic formula
H (k, i) = (h’ (k) + i²) mod m
where i starts at 1 for each colliding key and increases by 1 until an empty cell is found.

H (199) = 199 (mod 10) = 9

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
-     -     -     -     -     -     -     -     -     199

H (11) = 11 (mod 10) = 1

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
-     11    -     -     -     -     -     -     -     199

H (354) = 354 (mod 10) = 4

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
-     11    -     -     354   -     -     -     -     199

H (99) = 99 (mod 10) = 9 Here collision occurs

H (99, 1) = (9+1²) mod 10
= 10 mod 10
= 0 (empty space found)

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    -     -     354   -     -     -     -     199

H (343) = 343 (mod 10) = 3

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    -     343   354   -     -     -     -     199

H (9) = 9 (mod 10) = 9 Here collision occurs

H (9, 1) = (9+1²) mod 10
= 10 mod 10
= 0 (already occupied, put i=2 now)
H (9, 2) = (9+2²) mod 10
= 13 mod 10
= 3 (already occupied, put i=3 now)
H (9, 3) = (9+3²) mod 10
= 18 mod 10
= 8 (empty space found)

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    -     343   354   -     -     -     9     199

H (101) = 101 (mod 10) = 1 Here collision occurs

H (101, 1) = (1+1²) mod 10
= 2 mod 10
= 2 (empty space found)

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    101   343   354   -     -     -     9     199

H (77) = 77 (mod 10) = 7

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    101   343   354   -     -     77    9     199

H (1011) = 1011 (mod 10) = 1 Here collision occurs

H (1011, 1) = (1+1²) mod 10
= 2 mod 10
= 2 (already occupied, put i=2 now)
H (1011, 2) = (1+2²) mod 10
= 5 mod 10
= 5 (empty space found)

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    101   343   354   1011  -     77    9     199

H (107) = 107 (mod 10) = 7 Here collision occurs

H (107, 1) = (7+1²) mod 10
= 8 mod 10
= 8 (already occupied, put i=2 now)
H (107, 2) = (7+2²) mod 10
= 11 mod 10
= 1 (already occupied, put i=3 now)
H (107, 3) = (7+3²) mod 10
= 16 mod 10
= 6 (empty space found)

H[0]  H[1]  H[2]  H[3]  H[4]  H[5]  H[6]  H[7]  H[8]  H[9]
99    11    101   343   354   1011  107   77    9     199
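
The quadratic probing exercise can be replayed with the sketch below. It assumes the standard reading of the formula, in which the probe number i restarts at 0 for every key; under that assumption 101 lands in cell 2 and 77 in its home cell 7. The function name insert_quadratic is hypothetical.

```python
def insert_quadratic(table, key):
    """Insert key with quadratic probing and return the cell that was used."""
    m = len(table)
    for i in range(m):                  # i restarts at 0 for every key
        slot = (key % m + i * i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no empty cell found on the probe sequence")

table = [None] * 10
for key in (199, 11, 354, 99, 343, 9, 101, 77, 1011, 107):
    insert_quadratic(table, key)

print(table)  # [99, 11, 101, 343, 354, 1011, 107, 77, 9, 199]
```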

HASH TABLE WITH SECOND HASH FUNCTION

DATA {89, 18, 49, 58, 69}, m=10
H (k, i) = (h1 (k) + i h2 (k)) mod m
h1 (k) = k mod 10
h2 (k) = 7 - (k mod 7)

H (89) = 89 mod 10 => 9 (empty space found)

H (18) = 18 mod 10 => 8 (empty space found)

H (49) = 49 mod 10 => 9 (already occupied)
h2 (49) = 7 - (49 mod 7)
= 7
H (49, 1) = (9 + 1x7) mod 10
= 16 mod 10
= 6 (empty space found)

H (58) = 58 mod 10 => 8 (already occupied)
h2 (58) = 7 - (58 mod 7)
= 5
H (58, 1) = (8 + 1x5) mod 10
= 13 mod 10
= 3 (empty space found)

H (69) = 69 mod 10 => 9 (already occupied)
h2 (69) = 7 - (69 mod 7)
= 1
H (69, 1) = (9 + 1x1) mod 10
= 10 mod 10
= 0 (empty space found)

HASH VALUE   KEY
0            69
1
2
3            58
4
5
6            49
7
8            18
9            89
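
The double hashing exercise can be replayed with the sketch below, assuming h1 (k) = k mod m and R = 7 as in the exercise. The function name insert_double is hypothetical.

```python
def insert_double(table, key, r=7):
    """Insert key with double hashing and return the cell that was used."""
    m = len(table)
    h1 = key % m
    h2 = r - (key % r)                  # step size, never zero
    for i in range(m):
        slot = (h1 + i * h2) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no empty cell found on the probe sequence")

table = [None] * 10
for key in (89, 18, 49, 58, 69):
    insert_double(table, key)

print(table)  # [69, None, None, 58, None, None, 49, None, 18, 89]
```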

APPLICATIONS OF HASH TABLES / HASHING

1. Hashing can be used to reduce the time and space requirements of programs.

2. It provides powerful techniques for pattern matching.

3. Hashing is used for performance improvement, error checking, and authentication.

4. It provides faster data retrieval.

5. It is also used in creating and verifying digital signatures.

6. Compilers use hash tables to keep track of declared variables in source code. Hash tables suit this task because only inserts and finds are performed.

7. Hash tables are well suited to online spell checking.

8. They are useful for any graph theory problem where the nodes have real names instead of numbers.

9. A common use of hash tables is in programs that play games.

10. They are typically used in database management systems.
