Hashing

Chapter 28

Liang, Introduction to Java Programming, Ninth Edition, (c) 2013 Pearson Education, Inc. All
rights reserved.
Objectives
➢ To know what hashing is for.
➢ To obtain the hash code for an object and design the hash
function to map a key to an index .
➢ To handle collisions using open addressing.
➢ To know the differences among linear probing, quadratic
probing, and double hashing.
➢ To handle collisions using separate chaining.
➢ To understand the load factor and the need for rehashing.
➢ To implement MyHashMap using hashing.
➢ To implement MyHashSet using hashing.

Why Hashing?
• In a binary search tree, an element can be found in
O(log(n)) time in a well-balanced search tree.
• Is there a more efficient way to search for an element in
a container?
• This chapter introduces a technique called hashing.
• You can use hashing to implement a map or a set to
search, insert, and delete an element in O(1) time.

Map
• A map is a data structure that stores entries. Each entry
contains two parts: key and value.
• The key is also called a search key, which is used to
search for the corresponding value.
• For example, a dictionary can be stored in a map, where
the words are the keys and the definitions of the words
are the values.
• A map is also called a dictionary, a hash table, or an
associative array. The new trend is to use the term map.

What is Hashing?
• If you know the index of an element in the array, you can
retrieve the element using the index in O(1) time.
• So, can we store the values in an array and use the key as the
index to find the value?
• The answer is yes if you can map a key to an index.
• The array that stores the values is called a hash table. The
function that maps a key to an index in the hash table is called a
hash function.
• Hashing is a technique that retrieves the value using the index
obtained from key without performing a search.

Hash Function and Hash Codes
• A typical hash function:
✓ first converts a search key to an integer value called a hash
code, and
✓ … then compresses the hash code into an index of the hash
table.

Hash Codes for Primitive Data Types
➢ To get hash codes for keys of type byte, short,
int and char simply cast them to int.
➢ To get the hash code for a float, use
Float.floatToIntBits(key)
➢ To get the hash code for a long, use folding:
int hashCode = (int)(key ^ (key >> 32));
➢ For a double, convert it to a long value using
Double.doubleToLongBits(key) and then fold the
resulting long value.
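
The rules above can be collected into a small helper class. This is only a sketch: the class name PrimitiveHashCodes and the method name hash are hypothetical and do not come from the textbook.

```java
class PrimitiveHashCodes {
    // byte, short, int, and char keys: a widening cast to int is enough.
    static int hash(char key) {
        return (int) key;
    }

    // float: reinterpret the 32-bit pattern of the float as an int.
    static int hash(float key) {
        return Float.floatToIntBits(key);
    }

    // long: fold the high 32 bits into the low 32 bits with XOR.
    static int hash(long key) {
        return (int) (key ^ (key >> 32));
    }

    // double: take the 64-bit pattern as a long, then fold it.
    static int hash(double key) {
        return hash(Double.doubleToLongBits(key));
    }

    public static void main(String[] args) {
        System.out.println(hash('A'));      // 65
        System.out.println(hash(1L << 32)); // 1
    }
}
```

Because the cast to int keeps only the low 32 bits, the signed shift >> used here and the unsigned shift >>> give the same folded result.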

Hash Codes for Strings
➢ Search keys are often strings, so it is important to have
good hash functions for strings.
➢ One solution is to use the sum of the Unicode values of all
characters as the hash code.
– This solution produces many collisions, because different
keys that contain the same letters get the same hash code.
➢ A better solution is to generate a hash code that takes the
position of the characters into consideration
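– For example, use a polynomial hash code:

$s_0 b^{n-1} + s_1 b^{n-2} + \cdots + s_{n-1}$

where $s_i$ is the character at index $i$, $n$ is the length of the string, and $b$ is a constant base.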
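
A minimal sketch comparing the two approaches follows. The class and method names are hypothetical. The polynomial is evaluated with Horner's rule, and the base b = 31 in the usage note is chosen because java.lang.String.hashCode() is documented to use it.

```java
class StringHashCodes {
    // Naive approach: sum of the character codes. Character order is ignored,
    // so any two strings with the same letters collide.
    static int sumHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h += s.charAt(i);
        }
        return h;
    }

    // Polynomial hash evaluated with Horner's rule:
    // (...((s0 * b + s1) * b + s2)...) * b + s(n-1)
    // Integer overflow simply wraps around, which is acceptable for hashing.
    static int polynomialHash(String s, int b) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * b + s.charAt(i);
        }
        return h;
    }
}
```

For example, sumHash gives "tod" and "dot" the same hash code, while polynomialHash with b = 31 separates them and agrees with "dot".hashCode().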


Compressing Hash Codes
The hash code can be a large integer that is outside the range of
hash table indices.
– In such cases, the hash code needs to be scaled down to fit the
index range.
The most common way to scale a hash code to the range
0 to N-1 is
h(hashCode) = hashCode % N
In java.util.HashMap, N is always a power of 2. In this
case, for a non-negative hash code, the following holds:
h(hashCode) = hashCode % N
= hashCode & (N-1)
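
The sketch below shows both compression styles; the class and method names are hypothetical. One detail worth noting: in Java, % can return a negative result when hashCode is negative, so the general version uses Math.floorMod, while the bit mask always yields an index in 0 to N-1 when N is a power of 2.

```java
class Compression {
    // General case: works for any n > 0 and for negative hash codes.
    static int compressMod(int hashCode, int n) {
        return Math.floorMod(hashCode, n);
    }

    // Power-of-two case: n must be a power of 2, so n - 1 is a mask of low bits.
    static int compressMask(int hashCode, int n) {
        return hashCode & (n - 1);
    }

    public static void main(String[] args) {
        System.out.println(37 % 16);              // 5
        System.out.println(compressMask(37, 16)); // 5
        System.out.println(-7 % 16);              // -7, not a valid index
        System.out.println(compressMask(-7, 16)); // 9
    }
}
```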
Handling Collision
A collision occurs when two keys are mapped to
the same index.
Generally, there are two ways of handling
collisions:
– Open Addressing
The process of finding an open location in the table in the
event of a collision
– Separate Chaining
Places all entries with the same hash index in the same
location. A bucket is used to hold the entries in the same
location
Handling Collisions Using Open
Addressing

Open Addressing has several variations:


– Linear Probing,
– Quadratic Probing, and
– Double hashing

Linear Probing Animation
http://www.cs.armstrong.edu/liang/animation/web/LinearProbing.html
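
As a rough illustration of what the animation shows, linear probing starts at index k = h(key) and tries k, k+1, k+2, … (wrapping around) until it finds an empty cell. The sketch below assumes the simple hash function h(key) = key % N and stores only keys; the class name is hypothetical.

```java
class LinearProbing {
    // Inserts key and returns the index used, or -1 if the table is full.
    // Assumes h(key) = key mod N, where N is the table length.
    static int insert(Integer[] table, int key) {
        int n = table.length;
        int k = Math.floorMod(key, n);
        for (int j = 0; j < n; j++) {
            int index = (k + j) % n; // consecutive cells starting at k
            if (table[index] == null) {
                table[index] = key;
                return index;
            }
        }
        return -1;
    }
}
```

With a table of size 11 that already holds 44, 4, 16, 28, and 21, inserting 26 starts at index 4, finds indices 4, 5, and 6 occupied, and settles at index 7.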

Quadratic Probing
• Linear probing looks at the consecutive cells beginning at index
k.
• This often results in clustering of entries
• Quadratic probing can avoid the clustering problem in linear
probing.

[Figure: quadratic probing in a hash table with indices 0 to 10 that holds keys 44 (index 0), 4 (index 4), 16 (index 5), 28 (index 6), and 21 (index 10). A new element with key 26 is to be inserted. Quadratic probing examines the indices k, k+1, k+4, k+9, …, that is, k + j^2 for j >= 1, and probes 2 times before finding an empty cell. For simplicity, only the keys are shown and the values are not shown.]
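
The figure's scenario can be reproduced with a short sketch. It assumes h(key) = key % N and wraps each probe index with % N; the class name is hypothetical.

```java
class QuadraticProbing {
    // Inserts key and returns the index used, or -1 if no empty cell was found
    // within N probes. Assumes h(key) = key mod N.
    static int insert(Integer[] table, int key) {
        int n = table.length;
        int k = Math.floorMod(key, n);
        for (int j = 0; j < n; j++) {
            int index = (k + j * j) % n; // k, k+1, k+4, k+9, ...
            if (table[index] == null) {
                table[index] = key;
                return index;
            }
        }
        return -1;
    }
}
```

For key 26 in the table from the figure, k = 26 % 11 = 4. Index 4 and index 5 (k+1) are occupied, and index 8 (k+4) is empty, so 26 is stored at index 8. Unlike linear probing, quadratic probing is not guaranteed to visit every cell, which is why the sketch can return -1 even when the table is not full.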

Quadratic Probing
http://cs.armstrong.edu/liang/animation/web/QuadraticProbing.html

Double Hashing
• Double hashing uses a secondary hash function on the keys to
determine the increments to avoid the clustering problem.

h'(k) = 7 - k % 7

[Figure: three snapshots of a hash table with indices 0 to 10 that contains the keys 45, 58, 4, 27, and 23, showing the successive probes for key 12: h(12) = 1 (occupied by key 45), h(12) + h'(12) = 3 (occupied by key 58), and h(12) + 2*h'(12) = 5.]
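
A small sketch of the probe sequence follows. The secondary function h'(k) = 7 - k % 7 is taken from the slide; the primary function h(k) = k % N is an assumption consistent with h(12) = 1 for a table of size 11. The class and method names are hypothetical, and keys are assumed to be non-negative.

```java
class DoubleHashing {
    // Primary hash function (assumed): h(k) = k % N, for non-negative keys.
    static int h(int key, int n) {
        return key % n;
    }

    // Secondary hash function from the slide: always in the range 1..7.
    static int hPrime(int key) {
        return 7 - key % 7;
    }

    // Index examined on the j-th probe: (h(k) + j * h'(k)) % N.
    static int probe(int key, int j, int n) {
        return (h(key, n) + j * hPrime(key)) % n;
    }

    // Inserts key and returns the index used, or -1 if N probes all failed.
    static int insert(Integer[] table, int key) {
        for (int j = 0; j < table.length; j++) {
            int index = probe(key, j, table.length);
            if (table[index] == null) {
                table[index] = key;
                return index;
            }
        }
        return -1;
    }
}
```

For key 12 and N = 11, h(12) = 1 and h'(12) = 2, so the probe sequence is 1, 3, 5, 7, 9, 0, … Because 11 is prime, any increment from 1 to 7 eventually reaches every cell.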

Class Exercise
Show how the keys 15, 57, 34, 83, 28, 41, 53, 70, 44, and 39
will be placed in a hash table of size 13 using
a. Quadratic probing
b. Double hashing with h'(key) = 7 - key % 7
Suppose the size of the hash table is doubled after adding 39.
Give the new hash table after adding the key 52 using double
hashing.

Handling Collisions Using Separate Chaining
• The separate chaining scheme places all entries with the same hash index into
the same location, rather than finding new locations.
• Each location in the separate chaining scheme is called a bucket. A bucket is a
container that holds multiple entries.

[Figure: separate chaining in a hash table with indices 0 to 10. Key 44 is in the bucket at index 0, key 16 at index 5, key 28 at index 6, and key 21 at index 10. The new element with key 26 is added to the bucket at index 4, which already holds key 4. For simplicity, only the keys are shown and the values are not shown.]
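
The figure's scenario can be sketched with one linked list per bucket. The class name is hypothetical, only keys are stored, and h(key) = key % N is assumed.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class SeparateChaining {
    // One bucket (a linked list of keys) per hash table index.
    final List<LinkedList<Integer>> buckets = new ArrayList<>();

    SeparateChaining(int n) {
        for (int i = 0; i < n; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Adds key to its bucket (if not already present) and returns the bucket index.
    int insert(int key) {
        int index = Math.floorMod(key, buckets.size());
        LinkedList<Integer> bucket = buckets.get(index);
        if (!bucket.contains(key)) {
            bucket.add(key);
        }
        return index;
    }
}
```

With 11 buckets, keys 4 and 26 both hash to index 4, so after both insertions that bucket holds [4, 26] and no other location is probed.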

Separate Chaining Animation
http://cs.armstrong.edu/liang/animation/web/SeparateChaining.html

Implementing Map Using Hashing
«interface» MyMap<K, V>
+clear(): void                        Removes all entries from this map.
+containsKey(key: K): boolean         Returns true if this map contains an entry for the specified key.
+containsValue(value: V): boolean     Returns true if this map maps one or more keys to the specified value.
+entrySet(): Set<Entry<K, V>>         Returns a set consisting of the entries in this map.
+get(key: K): V                       Returns a value for the specified key in this map.
+getAll(key: K): Set<V>               Returns all values for the specified key in this map.
+isEmpty(): boolean                   Returns true if this map contains no mappings.
+keySet(): Set<K>                     Returns a set consisting of the keys in this map.
+put(key: K, value: V): V             Puts a mapping in this map.
+remove(key: K): void                 Removes the entries for the specified key.
+size(): int                          Returns the number of mappings in this map.
+values(): Set<V>                     Returns a set consisting of the values in this map.

MyHashMap<K, V>                       Concrete class that implements MyMap.

MyMap.Entry<K, V>
-key: K
-value: V
+Entry(key: K, value: V)              Constructs an entry with the specified key and value.
+getKey(): K                          Returns the key in the entry.
+getValue(): V                        Returns the value in the entry.

Listings: MyMap, MyHashMap, TestMyHashMap
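
To make the idea concrete, here is a much-reduced sketch of a map built on separate chaining. It is not the textbook's MyHashMap: the class name SimpleHashMap, the initial capacity of 4, and the load factor threshold of 0.75 are all assumptions for illustration, and only get, put, and size are shown. When the ratio of entries to buckets reaches the threshold, the table is doubled and every entry is rehashed.

```java
import java.util.LinkedList;
import java.util.Objects;

class SimpleHashMap<K, V> {
    private static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private static final double MAX_LOAD_FACTOR = 0.75; // assumed threshold
    private LinkedList<Entry<K, V>>[] table;            // one bucket per index
    private int size = 0;

    @SuppressWarnings("unchecked")
    SimpleHashMap() {
        table = new LinkedList[4]; // capacity is kept a power of 2
    }

    // Compresses the key's hash code into an index using the bit mask.
    private int index(Object key, int capacity) {
        return (key == null ? 0 : key.hashCode()) & (capacity - 1);
    }

    V get(K key) {
        LinkedList<Entry<K, V>> bucket = table[index(key, table.length)];
        if (bucket != null) {
            for (Entry<K, V> e : bucket) {
                if (Objects.equals(e.key, key)) return e.value;
            }
        }
        return null;
    }

    // Adds or replaces a mapping; returns the old value, or null if none.
    V put(K key, V value) {
        if (size >= MAX_LOAD_FACTOR * table.length) rehash();
        int i = index(key, table.length);
        if (table[i] == null) table[i] = new LinkedList<>();
        for (Entry<K, V> e : table[i]) {
            if (Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        table[i].add(new Entry<>(key, value));
        size++;
        return null;
    }

    int size() { return size; }

    // Doubles the table and reinserts every entry at its new index.
    @SuppressWarnings("unchecked")
    private void rehash() {
        LinkedList<Entry<K, V>>[] old = table;
        table = new LinkedList[old.length * 2];
        for (LinkedList<Entry<K, V>> bucket : old) {
            if (bucket == null) continue;
            for (Entry<K, V> e : bucket) {
                int i = index(e.key, table.length);
                if (table[i] == null) table[i] = new LinkedList<>();
                table[i].add(e);
            }
        }
    }
}
```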

Complexity of Key Map Operations

Implementing Set Using Hashing
«interface» MySet<E>
+clear(): void                        Removes all elements from this set.
+contains(e: E): boolean              Returns true if the element is in the set.
+add(e: E): boolean                   Adds the element to the set and returns true if the element is added successfully.
+remove(e: E): boolean                Removes the element from the set and returns true if the set contained the element.
+isEmpty(): boolean                   Returns true if this set contains no elements.
+size(): int                          Returns the number of elements in this set.
+iterator(): java.util.Iterator<E>    Returns an iterator for the elements in this set.

MyHashSet<E>                          Concrete class that implements MySet.

Listings: MySet, MyHashSet
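
The boolean return values of add and remove can be illustrated with a small chained set. This is a sketch under stated assumptions, not the textbook's MyHashSet: the class name SimpleHashSet is hypothetical, the capacity is fixed by the caller, and rehashing and iteration are omitted.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class SimpleHashSet<E> {
    private final List<LinkedList<E>> buckets = new ArrayList<>();
    private int size = 0;

    SimpleHashSet(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Finds the bucket that the element hashes to.
    private LinkedList<E> bucketFor(Object e) {
        int h = (e == null) ? 0 : e.hashCode();
        return buckets.get(Math.floorMod(h, buckets.size()));
    }

    boolean contains(E e) {
        return bucketFor(e).contains(e);
    }

    // Returns true only if the element was not already in the set.
    boolean add(E e) {
        LinkedList<E> bucket = bucketFor(e);
        if (bucket.contains(e)) return false;
        bucket.add(e);
        size++;
        return true;
    }

    // Returns true only if the set contained the element.
    boolean remove(E e) {
        boolean removed = bucketFor(e).remove((Object) e);
        if (removed) size--;
        return removed;
    }

    int size() { return size; }

    boolean isEmpty() { return size == 0; }
}
```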

Complexity of Key Set Operations
