5 Hash - New
Introduction to Hashing
Objective
• Sparse Data
• Key Based Data
• Hash Table
• Hash Functions
• Collision Resolution
• Applications
Sparse data
• There are many players in a complex game
• Each player has an identification number (key)
• The range of the keys can be 1~1,000,000
• No two players have the same key
• Suppose now we have 3 players
– 954,323
– 447,829
– 194,332
• The keys are far apart from each other, so the data is sparse
• Player information that is looked up by its key is called key-based data
Sparse data
• How can we store the data so that a player’s information can be retrieved easily by key?
– Array indexed by key:
• Wastes a lot of memory: 1,000,000 slots are needed to hold 3 players
(player 3 at index 194,332, player 2 at index 447,829, player 1 at index 954,323)
– Linked list:
• Slow to search if we have 10,000 players
– Hash table:
• The best solution in this case!
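As a rough illustration of key-based lookup, the sketch below stores the three example keys in the C++ standard library's hash table, std::unordered_map. The Player fields, the placeholder names, and the helper names makePlayerTable and findPlayer are assumptions made for this example; the slides do not define them.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical player record; the slides do not say what a player stores.
struct Player {
    std::string name;
    int number;
};

using PlayerTable = std::unordered_map<int, Player>;

// Build a table holding the three example keys from the slide.
PlayerTable makePlayerTable() {
    PlayerTable players;
    players[954323] = Player{"player1", 1};
    players[447829] = Player{"player2", 2};
    players[194332] = Player{"player3", 3};
    return players;
}

// Return a pointer to the player stored under `key`, or nullptr if absent.
const Player* findPlayer(const PlayerTable& players, int key) {
    assert(key >= 1 && key <= 1000000);  // key range stated on the slide
    auto it = players.find(key);
    return it == players.end() ? nullptr : &it->second;
}
```

Only three entries are stored, yet any key in the range 1~1,000,000 can be looked up directly, without scanning a list.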
Basic Hash Table
• Advantages:
– Quickly store sparse key-based data in a reasonable
amount of space
– Quickly determine if a certain key is within the table
[Figure: a hash table with slots 0 to 9]
Hash Functions
A good hash function:
• Is fast to compute
• Minimizes collisions
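One simple hash function that is fast to compute is the remainder after division by the table size. The sketch below assumes non-negative integer keys; the name modHash is chosen for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Division-method hash: maps a key to a slot in the range [0, tableSize).
std::size_t modHash(std::uint64_t key, std::size_t tableSize) {
    assert(tableSize > 0);
    return static_cast<std::size_t>(key % tableSize);
}
```

With a table of 10 slots, the example keys 954,323, 447,829 and 194,332 map to slots 3, 9 and 2.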
Combination of Hash Functions
• Collisions happen easily if we use only the % (modulo) function
• Combination:
– Apply hash function h1 on key to obtain mid_key
– Apply hash function h2 on mid_key to obtain Slot_id
• Example:
– Apply %101 to 12320324111220 to get 79
– Apply %10 to that result
• 79 % 10 = 9
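The two-step example above can be sketched as a single function. The name combinedHash and its parameters are assumptions for illustration; h1 is taken to be key % prime and h2 to be mid_key % tableSize, as in the slide's example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Apply h1 (modulo a prime) and then h2 (modulo the table size).
std::size_t combinedHash(std::uint64_t key, std::uint64_t prime, std::size_t tableSize) {
    assert(prime > 0 && tableSize > 0);
    std::uint64_t midKey = key % prime;                   // h1: key -> mid_key
    return static_cast<std::size_t>(midKey % tableSize);  // h2: mid_key -> Slot_id
}
```

Calling combinedHash(12320324111220ULL, 101, 10) reproduces the slide: the intermediate value is 79 and the slot is 9.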
Collision Resolution - Open Addressing
• Linear Probing
– On a collision, try Slot_id+1, Slot_id+2, …
• Quadratic Probing
– On a collision, try Slot_id+1, Slot_id+4, …
• Double Hashing
– On a collision, try Slot_id+h2(x), Slot_id+2h2(x), … (a prime table size is important)
[Figure: key 954,323 is inserted into a table with slots 0 to 9 in which slots 3 and 4 are full]
• General rule: on a collision, try other slots in a fixed order (indices wrap around modulo the table size)
• How to find data?
– If the key is not in the slot examined, try the next position given by the same probing rule
– Every key has a preference order over all the positions
– To find a key, search the slots in that same order of preference
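The three probing rules differ only in the offset added to the home slot. The sketch below lists the slots a key would try under each rule; the enum ProbeRule and the function probeSequence are names invented for this example, and step stands for the value h2(x) of the second hash function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class ProbeRule { Linear, Quadratic, DoubleHash };

// Return the first `count` slots tried for a key whose home slot is `home`.
// `step` is h2(x) and is used only by double hashing.
std::vector<std::size_t> probeSequence(ProbeRule rule, std::size_t home, std::size_t step,
                                       std::size_t tableSize, std::size_t count) {
    assert(tableSize > 0);
    std::vector<std::size_t> slots;
    for (std::size_t i = 0; i < count; ++i) {
        std::size_t offset = 0;
        switch (rule) {
            case ProbeRule::Linear:     offset = i;        break;  // +0, +1, +2, ...
            case ProbeRule::Quadratic:  offset = i * i;    break;  // +0, +1, +4, ...
            case ProbeRule::DoubleHash: offset = i * step; break;  // +0, +h2, +2*h2, ...
        }
        slots.push_back((home + offset) % tableSize);  // wrap around the table
    }
    return slots;
}
```

For a home slot of 3 in a table of size 10, linear probing tries 3, 4, 5, 6 and quadratic probing tries 3, 4, 7, 2. With an assumed step of 7, double hashing tries 3, 0, 7, 4.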
Collision Resolution - Separate Chaining
• Problems with Open Addressing?
• Use linked lists to resolve collisions
– Every slot in the hash table is a linked list
– Collision → insert into the corresponding list
– Find data → search the corresponding list
[Figure: the chain stored in each slot]
Slot 1: 441 → 361 → 91
Slot 2: 512
Slot 3: 63 → 723
Slot 4: 74
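A minimal sketch of separate chaining, assuming non-negative integer keys and the hash key % (number of slots). The class name ChainedSet and its methods are hypothetical and are not taken from the slides.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

class ChainedSet {
public:
    explicit ChainedSet(std::size_t slots) : table_(slots) { assert(slots > 0); }

    // Collision or not, the key is appended to the list of its slot.
    void insert(int key) {
        std::list<int>& chain = table_[slot(key)];
        if (std::find(chain.begin(), chain.end(), key) == chain.end())
            chain.push_back(key);
    }

    // Finding a key only searches the list of its slot.
    bool contains(int key) const {
        const std::list<int>& chain = table_[slot(key)];
        return std::find(chain.begin(), chain.end(), key) != chain.end();
    }

    std::size_t chainLength(std::size_t s) const { return table_.at(s).size(); }

private:
    std::size_t slot(int key) const {
        assert(key >= 0);
        return static_cast<std::size_t>(key) % table_.size();
    }

    std::vector<std::list<int>> table_;  // one linked list per slot
};
```

Inserting 441, 361, 91, 512, 63, 723 and 74 into a ChainedSet with 10 slots gives a chain of three keys in slot 1 and two keys in slot 3, which matches the figure.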
Collision Resolution
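• Example: insert 11, 22, 33, 44, 55, 66, 77, 88, 99, 21
– Using linear probing (the result below is consistent with a table of size 10 and slot = key % 10)
Slot: 0 1 2 3 4 5 6 7 8 9
Key: 21 11 22 33 44 55 66 77 88 99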
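The example can be traced with a short function. This is a sketch under the assumption that the table has 10 slots and the hash is key % 10; the names insertAllLinear and kEmpty are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmpty = -1;  // marker for an unused slot (assumes keys are non-negative)

// Insert the keys in order with linear probing and return the final table.
std::vector<int> insertAllLinear(const std::vector<int>& keys, std::size_t tableSize) {
    assert(tableSize > 0);
    std::vector<int> table(tableSize, kEmpty);
    for (int key : keys) {
        assert(key >= 0);
        std::size_t home = static_cast<std::size_t>(key) % tableSize;
        for (std::size_t j = 0; j < tableSize; ++j) {
            std::size_t slot = (home + j) % tableSize;  // home, home+1, home+2, ...
            if (table[slot] == kEmpty) {
                table[slot] = key;
                break;
            }
        }
        // If every slot was occupied, the table is full and the key is dropped.
    }
    return table;
}
```

Key 21 hashes to slot 1, finds slots 1 through 9 occupied, wraps around, and lands in slot 0.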
More on Hash Table Size
• A prime table size is important in the following cases:
a) For quadratic probing, we have the following property:
– If quadratic probing is used and the table size is prime, then a new element can always be inserted if the table is at least half empty (why does this hold only for a prime size?).
See Section 5.4.2
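One way to explore this property is to count how many different slots quadratic probing reaches. The helper below is a hypothetical experiment, not part of the slides: it counts the distinct slots among the first probes positions home + i*i.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Count the distinct slots visited by the first `probes` quadratic probes.
std::size_t distinctQuadraticSlots(std::size_t home, std::size_t tableSize,
                                   std::size_t probes) {
    assert(tableSize > 0);
    std::set<std::size_t> seen;
    for (std::size_t i = 0; i < probes; ++i) {
        seen.insert((home + i * i) % tableSize);
    }
    return seen.size();
}
```

With the prime size 11, the first 6 probes reach 6 different slots, which is more than half of the table. With size 16, even 16 probes reach only 4 different slots, so an insertion can fail while most of the table is still empty.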
More Questions
• How can rehashing be used?
– With rehashing, quadratic probing can always insert a new item, because the table can always be kept at least half empty.
• How do you keep the table size prime when you rehash?
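One common answer to the last question is to grow the table to the first prime that is at least twice the old size. The sketch below assumes that policy; the function names are chosen for this example, and trial division is used only because it is short.

```cpp
#include <cassert>
#include <cstddef>

// Trial division; adequate for table sizes, not for very large numbers.
bool isPrime(std::size_t n) {
    if (n < 2) return false;
    for (std::size_t d = 2; d * d <= n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Smallest prime that is greater than or equal to n.
std::size_t nextPrime(std::size_t n) {
    while (!isPrime(n)) ++n;
    return n;
}

// New capacity to use when rehashing: roughly double, and still prime.
std::size_t grownCapacity(std::size_t oldCapacity) {
    assert(oldCapacity > 0);
    return nextPrime(2 * oldCapacity);
}
```

For example, a table of size 11 would grow to 23, and a table of size 101 would grow to 211. After choosing the new size, every stored key has to be inserted again, because its slot depends on the table size.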
Application: Dictionary
• How does Word perform spell checking?
• A dictionary (a large hash table) is kept
• Words are hashed into that dictionary
• How to hash words
– Establish a map between characters and numbers
– E.g. A → 136, F → 356, T → 927, E → 442, R → 091
– “AFTER” corresponds to the key 136,356,927,442,091
– Hashing “AFTER” is then equivalent to hashing that key
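The word-to-key mapping can be sketched as follows. Only the five character codes given on the slide are included, and the function names are assumptions for this example. Because concatenated keys grow by three digits per letter, the second function reduces modulo the table size after every letter instead of building the full key first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Character codes from the slide; other letters are not defined there.
const std::map<char, int>& charCodes() {
    static const std::map<char, int> codes = {
        {'A', 136}, {'F', 356}, {'T', 927}, {'E', 442}, {'R', 91}};  // R is 091
    return codes;
}

// Concatenate the three-digit codes into one integer key.
// Illustration only: the key overflows 64 bits for words longer than 6 letters.
std::uint64_t wordToKey(const std::string& word) {
    std::uint64_t key = 0;
    for (char c : word) key = key * 1000 + charCodes().at(c);  // throws on unknown letters
    return key;
}

// Same slot as wordToKey(word) % tableSize, computed letter by letter.
std::size_t slotForWord(const std::string& word, std::size_t tableSize) {
    assert(tableSize > 0);
    std::uint64_t slot = 0;
    for (char c : word) slot = (slot * 1000 + charCodes().at(c)) % tableSize;
    return static_cast<std::size_t>(slot);
}
```

Here wordToKey("AFTER") yields 136356927442091, and slotForWord("AFTER", 101) gives the same slot as taking that key modulo 101.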
How to write a Hash class?
• Exercises:
– 1. Use linear probing to write a hash class
– 2. Use this class to implement your own small dictionary
Hash implementation (linear probing)
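The original listing for these slides is not available here, so the following is only a sketch of what a linear probing hash class could look like. It reuses the identifiers that appear in the later snippets (arr, hashIndex, hashCode, capacity); HashNode, HashMap, insert and find are assumed names, keys are assumed to be non-negative integers, and deletion is left out because it needs extra care with open addressing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct HashNode {
    int key;
    std::string value;
};

class HashMap {
public:
    explicit HashMap(std::size_t cap) : arr(cap), capacity(cap) { assert(cap > 0); }

    // Insert a new key or update an existing one; returns false if the table is full.
    bool insert(int key, const std::string& value) {
        std::size_t hashIndex = probe(key);
        if (arr[hashIndex] == nullptr) {
            arr[hashIndex].reset(new HashNode{key, value});
            return true;
        }
        if (arr[hashIndex]->key == key) {
            arr[hashIndex]->value = value;
            return true;
        }
        return false;  // every slot was probed and none was usable
    }

    // Return a pointer to the stored value, or nullptr if the key is absent.
    const std::string* find(int key) const {
        std::size_t hashIndex = probe(key);
        if (arr[hashIndex] != nullptr && arr[hashIndex]->key == key)
            return &arr[hashIndex]->value;
        return nullptr;
    }

private:
    std::size_t hashCode(int key) const {
        assert(key >= 0);
        return static_cast<std::size_t>(key) % capacity;
    }

    // Linear probing: stop at the key, at an empty slot, or after `capacity` steps.
    std::size_t probe(int key) const {
        std::size_t hashIndex = hashCode(key);
        std::size_t j = 0;
        while (arr[hashIndex] != nullptr && arr[hashIndex]->key != key && j < capacity) {
            j++;
            hashIndex = (hashCode(key) + j) % capacity;
        }
        return hashIndex;
    }

    std::vector<std::unique_ptr<HashNode>> arr;
    std::size_t capacity;
};
```

Insertion and lookup share the same probe loop, so a key is always searched for in the same order in which it was placed. Changing the single line that computes the next hashIndex is enough to switch to another probing rule.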
Hash implementation (quadratic probing)
• Modify the code to implement quadratic probing
int j = 0;
while (arr[hashIndex] != NULL && arr[hashIndex]->key != key && j < capacity) {
    j++;
    // Compute the next probe position: home slot plus j squared
    hashIndex = (hashCode(key) + j * j) % capacity;
}
Hash implementation (double hashing)
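• Modify the code to implement double hashing
int j = 0;
// The step size comes from the second hash function and must never be 0
int hash2 = hashCode2(key);
while (arr[hashIndex] != NULL && arr[hashIndex]->key != key && j < capacity) {
    j++;
    // Compute the next probe position: home slot plus j steps of size hash2
    hashIndex = (hashCode(key) + j * hash2) % capacity;
}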
Applications of hashing
Learning Objectives
1. Understand the concept of hashing
2. Be able to insert keys step by step into a hash table, given the data and the probing rule
3. Know the properties of quadratic probing and double hashing
4. Be able to implement a hash table
Programming Resources
• Data Structure/Algorithm Visualization
– https://www.cs.usfca.edu/~galles/visualization/Algorithms.html
• Online Judge
– PKU OJ, USACO, Leetcode
• ACM ICPC