Lecture18 PDF
Questions?
Why do we need hashing?
Can there be entries in the hash table with same key?
Can there be entries in the hash table with same value?
Can there be two entries in the hash table with same
Questions
What would be a good table size to select given n keys
to insert?
What is load factor?
What would be a good load factor?
What would you do if the load factor is too high?
Questions
How would you select a hash function?
Is it easy to find one?
What is a collision?
A collision occurs when two keys map to the same
location
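To make the definition concrete, here is a minimal sketch. It assumes a simple modulo hash on non-negative integer keys; the names slot_for and collides are hypothetical and not part of the lecture's library.

```c
/* Hypothetical hash for illustration: map a non-negative integer key
   to a location in a table with table_size slots. */
int slot_for(int key, int table_size) {
    return key % table_size;
}

/* Returns 1 when two distinct keys map to the same location. */
int collides(int k1, int k2, int table_size) {
    return k1 != k2 && slot_for(k1, table_size) == slot_for(k2, table_size);
}
```

With a table of 10 slots, keys 12 and 22 both map to location 2, so they collide; keys 12 and 23 do not.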
Why do collisions occur?
Mainly due to bad hash functions, although some collisions are
unavoidable once there are more possible keys than table locations
Eg: imagine hashing 1000 keys, where each key is on
Separate Chaining
Pros
No probing necessary
Cons
More complicated implementation: an array of linked lists
A large number of collisions can still produce long chains and a slow table
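A minimal sketch of separate chaining may help here. It assumes integer keys and values, a fixed table of 8 buckets, and a modulo hash; all names (node, buckets, chain_insert, chain_lookup) are hypothetical and chosen only for this example.

```c
#include <stdlib.h>

#define NBUCKETS 8   /* assumed table size, chosen only for the example */

/* One node in a bucket's chain. */
struct node {
    int key;
    int value;
    struct node *next;
};

/* Each slot holds the head of a (possibly empty) linked list. */
static struct node *buckets[NBUCKETS];

static unsigned bucket_of(int key) {
    return (unsigned)key % NBUCKETS;
}

/* Insert at the head of the chain; returns 0 on success, -1 if out of memory.
   Duplicate keys are not checked in this sketch. */
int chain_insert(int key, int value) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL) return -1;
    unsigned b = bucket_of(key);
    n->key = key;
    n->value = value;
    n->next = buckets[b];
    buckets[b] = n;
    return 0;
}

/* Walk the chain for the key's bucket; returns 1 and stores the value if found. */
int chain_lookup(int key, int *value_out) {
    for (struct node *n = buckets[bucket_of(key)]; n != NULL; n = n->next) {
        if (n->key == key) {
            *value_out = n->value;
            return 1;
        }
    }
    return 0;
}
```

Keys 3 and 11 land in the same bucket here, yet both remain retrievable because the colliding entries simply share a chain. No probing for another slot is needed.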
Load factor
Need to keep the load factor reasonably under control
If load factor becomes too large, rehash
Rehash
The process of creating a larger table to distribute the
keys better
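The load factor and the rehash step can be sketched as follows. This assumes a chained table of integer keys, load factor defined as stored keys divided by table size, and doubling as the growth policy; the lf_ names are hypothetical, and the lecture's ht_rehash may work differently.

```c
#include <stdlib.h>

struct lf_node { int key; struct lf_node *next; };

struct lf_table {
    struct lf_node **slots;  /* array of chain heads */
    int size;                /* number of slots, m */
    int count;               /* number of stored keys, n */
};

/* Load factor = n / m. */
double load_factor(const struct lf_table *t) {
    return (double)t->count / t->size;
}

/* Allocate an empty table; returns 0 on success, -1 on failure. */
int lf_init(struct lf_table *t, int size) {
    t->slots = calloc((size_t)size, sizeof *t->slots);
    t->size = size;
    t->count = 0;
    return t->slots ? 0 : -1;
}

/* Insert at the head of the key's chain; returns 0 on success, -1 on failure. */
int lf_insert(struct lf_table *t, int key) {
    struct lf_node *n = malloc(sizeof *n);
    if (!n) return -1;
    unsigned b = (unsigned)key % (unsigned)t->size;
    n->key = key;
    n->next = t->slots[b];
    t->slots[b] = n;
    t->count++;
    return 0;
}

/* Rehash: allocate a table twice as large (an assumed policy) and
   relink every existing node into its new chain. */
int lf_rehash(struct lf_table *t) {
    int new_size = t->size * 2;
    struct lf_node **fresh = calloc((size_t)new_size, sizeof *fresh);
    if (!fresh) return -1;
    for (int i = 0; i < t->size; i++) {
        struct lf_node *n = t->slots[i];
        while (n) {
            struct lf_node *next = n->next;
            unsigned b = (unsigned)n->key % (unsigned)new_size;
            n->next = fresh[b];
            fresh[b] = n;
            n = next;
        }
    }
    free(t->slots);
    t->slots = fresh;
    t->size = new_size;
    return 0;
}
```

With 6 keys in a table of 4 slots the load factor is 1.5; after one rehash the table has 8 slots and the load factor drops to 0.75.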
Array of hashnode*s
Size of the table
Function pointers
equal: compares two elements and returns success (equal) or
failure (not equal)
free_key, free_value
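One possible layout for these fields is sketched below. The type and member names, the chained hashnode, and the convention that equal returns 1 for success and 0 for failure are all assumptions; the actual definitions in hashlib.h may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: one key/value pair plus a link to the next node. */
typedef struct hashnode {
    void *key;
    void *value;
    struct hashnode *next;
} hashnode;

/* Hypothetical table: the fields listed above. */
typedef struct hashtable {
    hashnode **table;                 /* array of hashnode*s */
    int size;                         /* size of the table */
    int (*equal)(void *a, void *b);   /* assumed: 1 if equal, 0 if not */
    void (*free_key)(void *key);      /* releases a key's memory */
    void (*free_value)(void *value);  /* releases a value's memory */
} hashtable;

/* Example equal function a client might supply for string keys. */
int str_equal(void *a, void *b) {
    return strcmp((const char *)a, (const char *)b) == 0;
}
```

Storing the comparison and cleanup routines as function pointers lets the library handle void* keys and values without knowing their types.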
Client considerations
Must provide a hash function
It is also possible for the library to provide a generic hash function,
as the Java API does
Must allocate memory for key and value (if necessary)
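As an illustration of the allocation point, a client that reads keys into a reusable buffer would likely copy each key to the heap before inserting it. The helper name copy_string is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Copy a string onto the heap so it outlives the caller's buffer.
   Returns NULL if the allocation fails; the caller owns the copy. */
char *copy_string(const char *s) {
    size_t len = strlen(s) + 1;   /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL) memcpy(copy, s, len);
    return copy;
}
```

A copy made this way is what a free_key or free_value function would later release.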
Implementation
hashlib.h
client.c
hashlib.c
hashlib.o
a.out
Data Structures
Library Interface
ht_init
ht_insert
ht_retrieve
ht_rehash
ht_set functions
equal, free_key, free_value
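The sketch below shows how such an interface might fit together. Only the names ht_init, ht_insert, and ht_retrieve come from the list above; their signatures, the struct layout, and the one-argument hash callback are assumptions made for illustration, and rehashing and the free functions are omitted.

```c
#include <stdlib.h>
#include <string.h>

typedef struct hashnode {
    void *key, *value;
    struct hashnode *next;
} hashnode;

typedef struct {
    hashnode **table;                /* array of chain heads */
    int size;                        /* number of chains */
    unsigned (*hash)(void *key);     /* assumed one-argument callback */
    int (*equal)(void *a, void *b);  /* assumed: nonzero if equal */
} hashtable;

/* Allocate an empty table with `size` chains; returns NULL on failure. */
hashtable *ht_init(int size, unsigned (*hash)(void *),
                   int (*equal)(void *, void *)) {
    hashtable *ht = malloc(sizeof *ht);
    if (!ht) return NULL;
    ht->table = calloc((size_t)size, sizeof *ht->table);
    if (!ht->table) { free(ht); return NULL; }
    ht->size = size;
    ht->hash = hash;
    ht->equal = equal;
    return ht;
}

/* Insert at the head of the key's chain; 0 on success, -1 on failure.
   The table stores the pointers; the client owns the memory. */
int ht_insert(hashtable *ht, void *key, void *value) {
    hashnode *n = malloc(sizeof *n);
    if (!n) return -1;
    unsigned b = ht->hash(key) % (unsigned)ht->size;
    n->key = key;
    n->value = value;
    n->next = ht->table[b];
    ht->table[b] = n;
    return 0;
}

/* Return the value stored under key, or NULL if the key is absent. */
void *ht_retrieve(hashtable *ht, void *key) {
    unsigned b = ht->hash(key) % (unsigned)ht->size;
    for (hashnode *n = ht->table[b]; n; n = n->next)
        if (ht->equal(n->key, key)) return n->value;
    return NULL;
}

/* Client side: hash and equality callbacks for string keys. */
static unsigned str_hash(void *key) {
    unsigned h = 0;
    for (const unsigned char *p = key; *p; p++) h = 31 * h + *p;
    return h;
}

static int str_equal(void *a, void *b) { return strcmp(a, b) == 0; }
```

A client would call ht_init(16, str_hash, str_equal), insert pairs with ht_insert, and look them up with ht_retrieve, which returns NULL for a missing key.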
Client implementation
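int hashcode(void* s, int m) {
    /* Takes a pointer to a key, treated here as a string, and
       computes the hash code. m is the string size. The
       multiply-by-31 scheme is an assumed, illustrative choice. */
    const unsigned char* p = s;
    unsigned int h = 0;
    for (int i = 0; i < m; i++)
        h = 31 * h + p[i];
    return (int)(h & 0x7fffffff); /* keep the result non-negative */
}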
Code Examples