Lab8-Hash

Lab 6 focuses on hash functions and hash tables, introducing their definitions, properties, and common techniques for implementation. It covers collision resolution strategies, performance analysis, and various applications of hash tables. The lab includes exercises for implementing hash functions, detecting collisions, and creating a hash table using linear probing for collision resolution.

Uploaded by

thhainguyen1206

Lab 6.

HF & HT Data structures and Algorithms CSC10004

Lab 6

Hash Functions & Hash Tables

1 Introduction to Hash Functions and Hash Tables

1.1 Hash Functions
A hash function is a mathematical function that maps data of arbitrary size to fixed-size values.
In the context of data structures, hash functions are used to transform keys into array indices,
allowing for efficient data access. A good hash function has several desirable properties:

• Deterministic: The same input should always produce the same output.

• Efficiency: The computation should be fast.

• Uniform Distribution: The function should map inputs as evenly as possible over the
output range.

• Low Collision Rate: Different inputs should rarely map to the same output.

For a hash function h and a key k, the hash value (or hash code) is calculated as:

h(k) = index in the hash table (1)

Common techniques for creating hash functions include:

1. Division Method: h(k) = k mod m, where m is the size of the hash table.

2. Multiplication Method: h(k) = ⌊m · (k · A mod 1)⌋, where A is a constant in the range (0, 1) and (k · A mod 1) is the fractional part of k · A.

3. Universal Hashing: A family of hash functions from which one is chosen at random, which ensures good average-case performance.
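As a minimal sketch of the first two techniques for integer keys, the functions below implement the division and multiplication methods. The names divisionHash and multiplicationHash are illustrative and not part of the lab skeleton, and non-negative keys are assumed.

```cpp
#include <cmath>

// Division method: h(k) = k mod m (assumes k >= 0 and m > 0).
int divisionHash(int k, int m) {
    return k % m;
}

// Multiplication method: h(k) = floor(m * frac(k * A)), with 0 < A < 1.
int multiplicationHash(int k, double A, int m) {
    double product = k * A;
    double fractional = product - std::floor(product); // k * A mod 1
    return static_cast<int>(std::floor(m * fractional));
}
```

With m = 10, divisionHash(123, 10) gives 3 because 123 = 12 · 10 + 3. The multiplication method always lands in [0, m-1] because the fractional part is strictly less than 1.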

1.2 Hash Tables

A hash table (or hash map) is a data structure that implements an associative array, a structure
that can map keys to values. A hash table uses a hash function to compute an index into an array
of buckets or slots, from which the desired value can be found.

University of Science Faculty of Information Technology Page 1

Figure 1: A simple hash table with string keys and integer values

The main advantage of hash tables is their efficiency: with a good hash function and a bounded load factor, basic operations such as insertion, deletion, and lookup take O(1) time on average, independent of the number of elements stored.

1.3 Collision Resolution

A collision occurs when two different keys hash to the same index. Since the set of possible keys is usually far larger than the table, collisions are practically unavoidable, and hash table implementations must include collision resolution strategies:

1. Separate Chaining: Each bucket holds a linked list of all key-value pairs that hash to the same index. The expected lookup time is

lookup time = O(1 + α)    (2)

where α is the load factor (number of elements divided by the number of buckets).

2. Open Addressing: All elements are stored directly in the hash table array. When a collision occurs, we probe for an empty slot according to some probing sequence:

• Linear Probing: h(k, i) = (h(k) + i) mod m, where i is the probe sequence number.
• Quadratic Probing: h(k, i) = (h(k) + c₁·i + c₂·i²) mod m, where c₁, c₂ are constants.
• Double Hashing: h(k, i) = (h₁(k) + i · h₂(k)) mod m, using two different hash functions.
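The three probing formulas can be sketched as small functions that return the slot examined on the i-th probe. The function names are illustrative, and the quadratic version assumes c₁ = 0 and c₂ = 1.

```cpp
// Linear probing: (h + i) mod m.
int linearProbe(int h, int i, int m) {
    return (h + i) % m;
}

// Quadratic probing with c1 = 0, c2 = 1: (h + i^2) mod m.
int quadraticProbe(int h, int i, int m) {
    return (h + i * i) % m;
}

// Double hashing: (h1 + i * h2) mod m; h2 must be nonzero.
int doubleHashProbe(int h1, int h2, int i, int m) {
    return (h1 + i * h2) % m;
}
```

For an initial hash of 9 in a table of size 10, the first four probes are 9, 0, 1, 2 for linear probing, 9, 0, 3, 8 for quadratic probing, and 9, 2, 5, 8 for double hashing with a step of 3.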

1.4 Performance Analysis

The performance of hash tables depends on various factors:

• Load Factor (α): The ratio of the number of elements to the table size.

α = n / m    (3)

where n is the number of elements and m is the table size.

• Time Complexity:

– Average case: O(1) for search, insert, and delete operations.

– Worst case: O(n) when many elements collide at the same index.

• Space Complexity: O(n), where n is the number of elements.

1.5 Applications
Hash tables are widely used in various applications:

• Database Indexing: To quickly locate records.

• Caches: For fast data retrieval and lookup.

• Symbol Tables: In compilers and interpreters.

• Associative Arrays: Implementation in programming languages.

• Password Authentication: Storing password hashes instead of actual passwords.

• Spell Checkers: Quick word lookup.

• Internet Routers: For packet forwarding.

In this lab, you will implement various hash functions and hash table operations, exploring
the trade-offs between different collision resolution strategies and analyzing their performance
characteristics.

2 Exercise 1: Basic Hash Function Implementation

In this exercise, you will implement a simple hash function for string keys. Hash functions are
crucial components of hash tables, mapping data of variable size to fixed-size values.
Your task is to create a hash function that:

1. Takes a string key and table size as parameters

2. Sums the ASCII values of all characters in the string

3. Returns the sum modulo the table size

More specifically, the requirements are:

1. Implement hashFunction() so that it processes string keys

2. The function should return an integer index within the range [0, tableSize-1]

3. Use modular arithmetic to ensure the hash value fits within the table size

4. Test the function with the provided example strings

Example Input/Output

1. Input:

• Input strings: "apple", "banana", "cherry", "date", "elderberry"

• Table size: 10

2. Expected output:
    Hash values for different keys (table size = 10):
    Key: apple, Hash value: 0
    Key: banana, Hash value: 9
    Key: cherry, Hash value: 3
    Key: date, Hash value: 4
    Key: elderberry, Hash value: 2

    // File: Exercise_1.cpp
    #include <iostream>
    #include <string>
    #include <vector>

    int hashFunction(const std::string& key, int tableSize) {
        // TODO: Implement a hash function that sums the ASCII values of all characters
        // in the key and returns the sum modulo tableSize

        return 0; // Replace with your implementation
    }

    int main() {
        std::string keys[] = {"apple", "banana", "cherry", "date", "elderberry"};
        int tableSize = 10;

        std::cout << "Hash values for different keys (table size = " << tableSize << "):" << std::endl;
        for (const auto& key : keys) {
            std::cout << "Key: " << key << ", Hash value: " << hashFunction(key, tableSize) << std::endl;
        }

        return 0;
    }

In addition, you can use a stronger hash function to improve the distribution of hash values:

1. Polynomial Hash Function: A polynomial hash function takes character positions into account, making it sensitive to character order.

    int improvedHashFunction1(const std::string& key, int tableSize) {
        long hash = 0;
        const int p = 31; // Prime number
        long p_pow = 1;

        for (char c : key) {
            // Maps 'a'..'z' to 1..26; assumes lowercase ASCII keys
            hash = (hash + (c - 'a' + 1) * p_pow) % tableSize;
            p_pow = (p_pow * p) % tableSize;
        }

        return static_cast<int>(hash);
    }

2. djb2 Hash Function: A well-known string hash function with good distribution properties.

    int improvedHashFunction2(const std::string& key, int tableSize) {
        unsigned long hash = 5381;

        for (char c : key) {
            hash = ((hash << 5) + hash) + c; // hash * 33 + c
        }

        return hash % tableSize;
    }

3. FNV-1a Hash Function: Fast, with good distribution and low collision rates.

    int improvedHashFunction3(const std::string& key, int tableSize) {
        const unsigned int fnv_prime = 16777619u;
        unsigned int hash = 2166136261u; // FNV offset basis (32-bit)

        for (char c : key) {
            hash ^= static_cast<unsigned char>(c); // XOR in one byte, without sign extension
            hash *= fnv_prime;
        }

        return hash % tableSize;
    }
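The difference between the ASCII-sum hash and an order-sensitive hash shows up on anagrams. The sketch below is illustrative only: sumHash and djb2Raw are hypothetical helper names, and djb2Raw returns the 64-bit value before it is reduced modulo the table size.

```cpp
#include <cstdint>
#include <string>

// ASCII-sum hash: ignores character order, so anagrams always collide.
int sumHash(const std::string& key, int tableSize) {
    int sum = 0;
    for (unsigned char c : key) sum += c;
    return sum % tableSize;
}

// djb2 before reduction to the table size: hash * 33 + c per character.
std::uint64_t djb2Raw(const std::string& key) {
    std::uint64_t hash = 5381;
    for (unsigned char c : key) hash = hash * 33 + c;
    return hash;
}
```

The keys "listen" and "silent" contain the same characters, so sumHash returns the same index for both at every table size, while their djb2Raw values differ. After the final modulo, two keys can still share an index in a small table, but the collision is no longer guaranteed.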

3 Exercise 2: Collision Detection

In this exercise, you will implement a function to detect collisions in a hash table. Collisions occur
when two or more distinct keys hash to the same index in the hash table. Your task is to identify
and report all collisions that occur when a set of keys is hashed using a simple hash function.
More specifically, the requirements are:

1. Implement the detectCollisions() function that identifies which indices have collisions

2. Track how many keys hash to each index in the table

3. Print all indices that have more than one key (collisions)

4. List the specific keys that collide at each index

Example Input/Output

1. Input:

• Input keys: "cat", "dog", "rat", "pig", "owl", "fox", "hen", "ant", "bee"
• Table size: 7

2. Expected output:
    Detecting collisions for table size 7:
    Collision at index 4:
    - "cat"
    - "fox"
    Collision at index 5:
    - "rat"
    - "pig"
    Collision at index 6:
    - "dog"
    - "bee"

    // File: Exercise_2.cpp
    #include <iostream>
    #include <string>
    #include <vector>

    int hashFunction(const std::string& key, int tableSize) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % tableSize;
    }

    void detectCollisions(const std::vector<std::string>& keys, int tableSize) {
        // TODO: Create a vector to track how many keys hash to each index
        // Print out all indices that have more than one key (collisions)
        // For each collision, print the keys that collided
    }

    int main() {
        std::vector<std::string> keys = {"cat", "dog", "rat", "pig", "owl", "fox", "hen", "ant", "bee"};
        int tableSize = 7;

        std::cout << "Detecting collisions for table size " << tableSize << ":" << std::endl;
        detectCollisions(keys, tableSize);

        return 0;
    }

In addition, the following techniques can help reduce collisions:

1. Use Prime Number Table Sizes: Prime numbers help distribute hash values more evenly.

    // Choose a prime number close to but larger than your expected number of elements
    int betterTableSize = 11; // Instead of 10

2. Universal Hashing: Pick the hash function at random from a carefully designed family.

    int universalHashFunction(const std::string& key, int tableSize, int a, int b, int p) {
        // p is a prime larger than the largest possible character value
        // a and b are random integers between 1 and p-1
        long long hash = 0;
        for (char c : key) {
            hash = (hash * a + static_cast<unsigned char>(c)) % p;
        }
        // b shifts the result before it is reduced to the table size
        return static_cast<int>(((hash + b) % p) % tableSize);
    }

3. Double Hashing: Use a secondary hash function to determine the probe step when resolving collisions.

    int secondHashFunction(const std::string& key, int tableSize) {
        // A different hash function than the primary one
        unsigned int hash = 0; // unsigned, so overflow wraps instead of going negative
        for (char c : key) {
            hash = hash * 31 + static_cast<unsigned char>(c);
        }
        // Make sure this never returns 0 to avoid infinite loops in probing
        return 1 + static_cast<int>(hash % (tableSize - 1));
    }

4 Exercise 3: Linear Probing Implementation

In this exercise, you will implement a hash table that uses linear probing to handle collisions.
Linear probing is a collision resolution technique where, if the intended slot for a key is already
occupied, the algorithm checks the next slot, and continues until it finds an empty slot.
More specifically, the requirements are:

1. Complete the implementation of a hash table class with linear probing

2. Implement the insert method that handles collisions using linear probing

3. Implement the search method that can find values even after collision resolution

4. The hash table should store key-value pairs where keys are strings and values are integers

Example Input/Output

1. Input:

• Insert ("apple", 5); Insert ("banana", 8); Insert ("cherry", 3); Insert ("date",
12); Insert ("grape", 10); Insert ("lemon", 7)
• Search for "banana"; Search for "kiwi"

2. Expected output:
    Hash Table Contents:
    0: apple -> 5
    1: lemon -> 7
    2: Empty
    3: cherry -> 3
    4: date -> 12
    5: Empty
    6: Empty
    7: grape -> 10
    8: Empty
    9: banana -> 8

    Lookup Operations:
    Found banana with value 8
    Could not find kiwi

    // File: Exercise_3.cpp
    #include <iostream>
    #include <string>
    #include <vector>

    class HashTable {
    private:
        struct Entry {
            std::string key;
            int value;
            bool isOccupied;

            Entry() : key(""), value(0), isOccupied(false) {}
        };

        std::vector<Entry> table;
        int size;

        int hashFunction(const std::string& key) {
            int sum = 0;
            for (char c : key) {
                sum += static_cast<int>(c);
            }
            return sum % size;
        }

    public:
        HashTable(int tableSize) : size(tableSize) {
            table.resize(size);
        }

        // TODO: Implement the insert method with linear probing
        bool insert(const std::string& key, int value) {
            // 1. Compute the initial hash value
            // 2. If the slot is empty, insert the entry
            // 3. If there's a collision, use linear probing to find the next available slot
            // 4. Return false if the table is full

            return false; // Replace with your implementation
        }

        // TODO: Implement the search method
        bool search(const std::string& key, int& value) {
            // 1. Compute the initial hash value
            // 2. Check if the key exists at that position
            // 3. If not, use linear probing to search for the key
            // 4. Return true and set the value if found, false otherwise

            return false; // Replace with your implementation
        }

        // Print the contents of the hash table
        void print() {
            for (int i = 0; i < size; i++) {
                if (table[i].isOccupied) {
                    std::cout << i << ": " << table[i].key << " -> " << table[i].value << std::endl;
                } else {
                    std::cout << i << ": Empty" << std::endl;
                }
            }
        }
    };

    int main() {
        HashTable ht(10);

        ht.insert("apple", 5);
        ht.insert("banana", 8);
        ht.insert("cherry", 3);
        ht.insert("date", 12);
        ht.insert("grape", 10);
        ht.insert("lemon", 7);

        std::cout << "Hash Table Contents:" << std::endl;
        ht.print();

        std::cout << "\nLookup Operations:" << std::endl;
        int value;
        if (ht.search("banana", value)) {
            std::cout << "Found banana with value " << value << std::endl;
        } else {
            std::cout << "Could not find banana" << std::endl;
        }

        if (ht.search("kiwi", value)) {
            std::cout << "Found kiwi with value " << value << std::endl;
        } else {
            std::cout << "Could not find kiwi" << std::endl;
        }

        return 0;
    }

Furthermore, you can leverage some techniques to improve the hashing performance:

1. Adding Deletion Capability: Add a method to delete entries while properly handling the probe sequence.

    bool remove(const std::string& key) {
        // Find the key using linear probing
        // Mark the slot as deleted but not empty (tombstone)
        // This requires adding an "isDeleted" flag to the Entry struct
        // Return true if successful, false if key not found

        return false; // Replace with your implementation
    }

2. Load Factor Tracking: Add functionality to monitor and maintain an efficient load factor.

    float getLoadFactor() {
        int occupiedCount = 0;
        for (const auto& entry : table) {
            if (entry.isOccupied) {
                occupiedCount++;
            }
        }
        return static_cast<float>(occupiedCount) / size;
    }

    void rehash() {
        // Implement a rehashing mechanism when load factor exceeds threshold
        // Create a new table with larger size
        // Re-insert all elements from the old table
    }

3. Quadratic Probing

    // In insert method:
    int i = 0;
    int index = (initialHash + i * i) % size; // Use quadratic sequence instead of linear
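The tombstone idea from the first item can be sketched on a simplified table. This is not the lab's HashTable class: ProbingSet is a hypothetical name, it stores non-negative integer keys only, and it uses a three-state enum in place of the isOccupied/isDeleted flags.

```cpp
#include <vector>

// Minimal open-addressing set of non-negative ints with linear probing.
class ProbingSet {
    enum class State { Empty, Occupied, Deleted };
    struct Slot {
        int key = 0;
        State state = State::Empty;
    };
    std::vector<Slot> slots;

    int home(int key) const { return key % static_cast<int>(slots.size()); }

    // Returns the index holding key, or -1. Tombstones do not stop the scan.
    int find(int key) const {
        int m = static_cast<int>(slots.size());
        for (int i = 0; i < m; ++i) {
            int idx = (home(key) + i) % m;
            if (slots[idx].state == State::Empty) return -1;
            if (slots[idx].state == State::Occupied && slots[idx].key == key) return idx;
        }
        return -1;
    }

public:
    explicit ProbingSet(int m) : slots(m) {}

    bool insert(int key) {
        if (find(key) != -1) return true; // already present
        int m = static_cast<int>(slots.size());
        for (int i = 0; i < m; ++i) {
            int idx = (home(key) + i) % m;
            if (slots[idx].state != State::Occupied) { // empty slot or tombstone
                slots[idx].key = key;
                slots[idx].state = State::Occupied;
                return true;
            }
        }
        return false; // table is full
    }

    bool contains(int key) const { return find(key) != -1; }

    bool remove(int key) {
        int idx = find(key);
        if (idx == -1) return false;
        slots[idx].state = State::Deleted; // leave a tombstone, not an empty slot
        return true;
    }
};
```

With a table of size 10, the keys 9, 19, and 29 occupy slots 9, 0, and 1. Removing 19 leaves a tombstone in slot 0, so a later search for 29 still probes past it and succeeds. Had the slot been reset to empty, the search would have stopped there and missed 29.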

5 Exercise 4: Quadratic Probing

In this exercise, you will implement a hash table that uses quadratic probing to resolve collisions.
Quadratic probing is a collision resolution technique where, if the intended slot for a key is already
occupied, the algorithm checks positions at quadratically increasing distances from the original
position.
More specifically, the requirements are:

1. Complete the implementation of a hash table class with quadratic probing

2. Implement the insert method that handles collisions using quadratic probing

3. Implement the search method that can find values even after collision resolution

4. Ensure that insert returns false once the table is more than 70% full

5. Handle deleted entries correctly during search operations

Example Input/Output

1. Input:

• Insert 9 key-value pairs including "apple", "banana", "cherry", etc.

2. Expected output:
    Inserted apple
    Inserted banana
    Inserted cherry
    Inserted date
    Inserted grape
    Inserted lemon
    Inserted orange
    Inserted pear
    Inserted fig

    Hash Table Contents:
    0: orange -> 9
    1: Empty
    2: fig -> 6
    3: cherry -> 3
    4: Empty
    5: Empty
    6: lemon -> 7
    7: grape -> 10
    8: pear -> 4
    9: Empty
    10: apple -> 5
    11: banana -> 8
    12: date -> 12

    // File: Exercise_4.cpp
    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    class HashTable {
    private:
        struct Entry {
            std::string key;
            int value;
            bool isOccupied;
            bool isDeleted; // For handling deletions

            Entry() : key(""), value(0), isOccupied(false), isDeleted(false) {}
        };

        std::vector<Entry> table;
        int size;
        int count; // Number of elements in the table

        int hashFunction(const std::string& key) {
            int sum = 0;
            for (char c : key) {
                sum += static_cast<int>(c);
            }
            return sum % size;
        }

    public:
        HashTable(int tableSize) : size(tableSize), count(0) {
            table.resize(size);
        }

        // TODO: Implement the insert method with quadratic probing
        bool insert(const std::string& key, int value) {
            // 1. If the table is more than 70% full, return false
            // 2. Compute the initial hash value
            // 3. Use quadratic probing: h(k, i) = (h(k) + i^2) % size to find an empty slot
            // 4. Return true if successful, false if the item cannot be inserted

            return false; // Replace with your implementation
        }

        // TODO: Implement the search method
        bool search(const std::string& key, int& value) {
            // Use quadratic probing for searching

            return false; // Replace with your implementation
        }

        // Print the contents of the hash table
        void print() {
            for (int i = 0; i < size; i++) {
                if (table[i].isOccupied && !table[i].isDeleted) {
                    std::cout << i << ": " << table[i].key << " -> " << table[i].value << std::endl;
                } else if (table[i].isDeleted) {
                    std::cout << i << ": Deleted" << std::endl;
                } else {
                    std::cout << i << ": Empty" << std::endl;
                }
            }
        }
    };

    int main() {
        // Using a prime number for table size is recommended for quadratic probing
        HashTable ht(13);

        std::vector<std::pair<std::string, int>> data = {
            {"apple", 5}, {"banana", 8}, {"cherry", 3},
            {"date", 12}, {"grape", 10}, {"lemon", 7},
            {"orange", 9}, {"pear", 4}, {"fig", 6}
        };

        for (const auto& item : data) {
            if (ht.insert(item.first, item.second)) {
                std::cout << "Inserted " << item.first << std::endl;
            } else {
                std::cout << "Failed to insert " << item.first << std::endl;
            }
        }

        std::cout << "\nHash Table Contents:" << std::endl;
        ht.print();

        return 0;
    }

Furthermore, you can leverage some techniques to improve the hashing performance:

1. Double Hashing: Replace the quadratic step with a key-dependent step size computed by a second hash function, for better distribution.

    int secondHash(const std::string& key) {
        unsigned int hash = 0; // unsigned, so overflow wraps instead of going negative
        for (char c : key) {
            hash = hash * 17 + static_cast<unsigned char>(c);
        }
        return 1 + static_cast<int>(hash % (size - 1)); // Ensure result is between 1 and size-1
    }

    // Then in insert/search:
    int step = secondHash(key);
    int index = (initialHash + i * step) % size;

2. Automatic Rehashing: Implement a rehashing mechanism when the load factor exceeds a threshold.

    void rehash() {
        std::vector<Entry> oldTable = table;

        // Double the size, preferably to the next prime number
        size = nextPrime(2 * size);
        count = 0;
        table.clear();
        table.resize(size);

        // Reinsert all non-deleted entries
        for (const auto& entry : oldTable) {
            if (entry.isOccupied && !entry.isDeleted) {
                insert(entry.key, entry.value);
            }
        }
    }

3. Prime-Sized Tables: Always use prime numbers for the table size to ensure optimal distribution with quadratic probing.

    // Recall the simple prime check from your Fundamentals class
    bool isPrime(int n) {
        if (n <= 1) return false;
        if (n <= 3) return true;
        if (n % 2 == 0 || n % 3 == 0) return false;

        for (int i = 5; i * i <= n; i += 6) {
            if (n % i == 0 || n % (i + 2) == 0) return false;
        }
        return true;
    }

    // Returns the smallest prime strictly greater than n
    int nextPrime(int n) {
        if (n <= 1) return 2;

        int prime = n;
        bool found = false;

        while (!found) {
            prime++;
            if (isPrime(prime)) found = true;
        }

        return prime;
    }
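One way to see why a prime table size matters for quadratic probing is to count how many distinct slots the probe sequence can reach. The helper below is a hypothetical illustration that assumes the sequence h(k, i) = (h(k) + i²) mod m used in this exercise.

```cpp
#include <set>

// Counts the distinct slots reached by (h + i*i) mod m for i = 0..m-1.
int distinctQuadraticSlots(int h, int m) {
    std::set<int> visited;
    for (int i = 0; i < m; ++i) {
        visited.insert((h + i * i) % m);
    }
    return static_cast<int>(visited.size());
}
```

For m = 13 the sequence reaches 7 of the 13 slots, which is more than half of the table, whereas for m = 16 it reaches only 4. With a prime m, the first ⌈m/2⌉ probes are guaranteed to be distinct, so an insertion into a table that is less than half full always finds a free slot.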

6 Exercise 5: Double Hashing

In this exercise, you will implement a hash table that uses double hashing to resolve collisions.
Double hashing is a collision resolution technique that uses two hash functions: the first to deter-
mine the initial position, and the second to determine the step size for probing when a collision
occurs.
More specifically, the requirements are:

1. Complete the implementation of a hash table class with double hashing

2. Implement the hash2() function that creates a suitable secondary hash

3. Implement the insert() method that handles collisions using double hashing

4. Implement the search() method that can find values after collision resolution

5. The hash table should store key-value pairs where keys are strings and values are integers

Example Input/Output

1. Input:

• Insert 10 key-value pairs including "cat", "dog", "bird", etc.

• Search for "tiger"

2. Expected output:
    Inserted cat
    Inserted dog
    Inserted bird
    Inserted fish
    Inserted lion
    Inserted tiger
    Inserted bear
    Inserted wolf
    Inserted fox
    Inserted deer

    Hash Table Contents:
    0: cat -> 1
    1: bird -> 3
    2: dog -> 2
    3: Empty
    4: Empty
    5: lion -> 5
    6: tiger -> 6
    7: bear -> 7
    8: fox -> 9
    9: deer -> 10
    10: fish -> 4
    11: wolf -> 8
    12: Empty

    Found tiger with value 6

    // File: Exercise_5.cpp
    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    class HashTable {
    private:
        struct Entry {
            std::string key;
            int value;
            bool isOccupied;

            Entry() : key(""), value(0), isOccupied(false) {}
        };

        std::vector<Entry> table;
        int size;
        int count;

        // Primary hash function
        int hash1(const std::string& key) {
            int sum = 0;
            for (char c : key) {
                sum += static_cast<int>(c);
            }
            return sum % size;
        }

        // Secondary hash function
        int hash2(const std::string& key) {
            // TODO: Implement a different hash function for double hashing
            // It should never return 0 to avoid an infinite loop during probing
            // A common approach is to use a prime number less than the table size:
            // R - (key hash % R)
            // where R is the largest prime number less than table size

            return 1; // Replace with your implementation
        }

    public:
        HashTable(int tableSize) : size(tableSize), count(0) {
            table.resize(size);
        }

        // TODO: Implement insert with double hashing
        bool insert(const std::string& key, int value) {
            // 1. If the table is more than 70% full, return false
            // 2. Compute both hash values
            // 3. Use double hashing formula: h(k, i) = (h1(k) + i * h2(k)) % size
            // 4. Return true if successful, false if the item cannot be inserted

            return false; // Replace with your implementation
        }

        // TODO: Implement search with double hashing
        bool search(const std::string& key, int& value) {
            // Use double hashing for searching

            return false; // Replace with your implementation
        }

        void print() {
            for (int i = 0; i < size; i++) {
                if (table[i].isOccupied) {
                    std::cout << i << ": " << table[i].key << " -> " << table[i].value << std::endl;
                } else {
                    std::cout << i << ": Empty" << std::endl;
                }
            }
        }
    };

    int main() {
        HashTable ht(13); // Using a prime number for table size

        std::vector<std::pair<std::string, int>> data = {
            {"cat", 1}, {"dog", 2}, {"bird", 3}, {"fish", 4},
            {"lion", 5}, {"tiger", 6}, {"bear", 7}, {"wolf", 8},
            {"fox", 9}, {"deer", 10}
        };

        for (const auto& item : data) {
            if (ht.insert(item.first, item.second)) {
                std::cout << "Inserted " << item.first << std::endl;
            } else {
                std::cout << "Failed to insert " << item.first << std::endl;
            }
        }

        std::cout << "\nHash Table Contents:" << std::endl;
        ht.print();

        int value;
        if (ht.search("tiger", value)) {
            std::cout << "\nFound tiger with value " << value << std::endl;
        } else {
            std::cout << "\nCould not find tiger" << std::endl;
        }

        return 0;
    }

Furthermore, you can leverage some techniques to improve the hashing performance:

1. Better Secondary Hash Function: A popular secondary hash function uses a prime number smaller than the table size: PRIME - (key % PRIME), where PRIME is a prime smaller than TABLE_SIZE.

2. Load Factor Monitoring: Add functionality to track and respond to the hash table’s load
factor

3. Handling Deletion: When deleting elements, it's important to place what is called a "tombstone" at the location rather than simply marking it as empty. This allows the search sequence to continue past deleted elements. This can be implemented by adding an isDeleted flag to the Entry structure. Please check this link for more information.
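The secondary hash from the first item can be sketched as follows. The names stepSize and probe are illustrative, keyHash stands for any non-negative integer hash of the key (for example the ASCII sum), and a table size of 13 with PRIME = 11 is assumed in the example values.

```cpp
// Step size for double hashing: PRIME - (keyHash % PRIME).
// For keyHash >= 0 the result is always in [1, PRIME], so it is never 0.
int stepSize(int keyHash, int prime) {
    return prime - (keyHash % prime);
}

// i-th probe position: (h1 + i * step) mod tableSize.
int probe(int h1, int step, int i, int tableSize) {
    return (h1 + i * step) % tableSize;
}
```

For keyHash = 100 and PRIME = 11, the step is 11 - (100 % 11) = 10. Starting from slot 0 in a table of size 13, the probes are then 0, 10, 7, and so on. Because the table size is prime, any step between 1 and 12 eventually visits every slot.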

7 Exercise 6: Universal Hash Function

In this exercise, you will implement a universal hash function using the multiplication method as
described by Knuth. This method is particularly effective at distributing hash values uniformly
across a hash table, making it suitable for various applications.
More specifically, the requirements are:

1. Implement the universalHash() function using the multiplication method

2. The function should take a key, a constant A (between 0 and 1), and the table size as
parameters

3. Test the function with different values of A to observe how the distribution changes

4. Analyze the distribution of hash values for different keys and constants

Example Input/Output

1. Input:

• Keys: 123, 456, 789, 101, 202, 303, 404, 505, 606, 707
• Constant A: 0.6180339887 (golden ratio minus 1)
• Table size: 10

2. Expected output:

    Hash values using multiplication method (table size = 10):
    Key: 123, Hash: 0
    Key: 456, Hash: 8
    Key: 789, Hash: 6
    Key: 101, Hash: 4
    Key: 202, Hash: 8
    Key: 303, Hash: 2
    Key: 404, Hash: 6
    Key: 505, Hash: 1
    Key: 606, Hash: 5
    Key: 707, Hash: 9

    // File: Exercise_6.cpp
    #include <iostream>
    #include <vector>
    #include <cmath>

    // Universal hash function using multiplication method
    int universalHash(int key, double A, int m) {
        // TODO: Implement the multiplication method
        // 1. Multiply the key by A (a constant between 0 and 1)
        // 2. Take the fractional part of the result
        // 3. Multiply by m and take the floor
        // This approach tends to distribute keys uniformly across the hash table,
        // even when the keys have patterns or are sequential.
        // The optimal choice of A is often cited as (sqrt(5)-1)/2 ≈ 0.6180339887,
        // which is derived from the golden ratio and has been shown to produce
        // excellent distribution properties.

        return 0; // Replace with your implementation
    }

    int main() {
        std::vector<int> keys = {123, 456, 789, 101, 202, 303, 404, 505, 606, 707};
        double A = 0.6180339887; // (sqrt(5) - 1) / 2, a popular choice
        int tableSize = 10;

        std::cout << "Hash values using multiplication method (table size = " << tableSize << "):" << std::endl;
        for (int key : keys) {
            std::cout << "Key: " << key << ", Hash: " << universalHash(key, A, tableSize) << std::endl;
        }

        // Try different values of A
        std::vector<double> AValues = {0.1, 0.3, 0.5, 0.7, 0.9};

        std::cout << "\nHash distribution with different values of A:" << std::endl;
        for (double a : AValues) {
            std::cout << "A = " << a << ": ";
            for (int key : keys) {
                std::cout << universalHash(key, a, tableSize) << " ";
            }
            std::cout << std::endl;
        }

        return 0;
    }

Theoretical Background
The multiplication method computes h(k) = ⌊m · (k · A mod 1)⌋ for a constant A with 0 < A < 1.
It works for any table size m, but m is commonly chosen as a power of 2 so that the computation
reduces to an integer multiplication and a bit shift. Knuth suggests using A = (√5 − 1)/2 ≈ 0.6180339887,
which is the golden ratio minus 1. This value of A tends to give a good distribution of hash values.
Multiplicative hashing takes the hash index from the fractional part of the product k · A.
For computational efficiency, this is typically done using fixed-point integer arithmetic rather
than floating-point operations.
The golden-ratio choice of A is interesting because the fractional parts of A, 2A, 3A, ... are spread
very evenly over the interval [0, 1): each new point falls into one of the largest remaining gaps,
so sequential keys tend to map to well-separated positions.
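
The fixed-point variant mentioned above can be sketched as follows. This is a minimal illustration, not part of the exercise template: it assumes 32-bit unsigned keys and a table size m = 2^p, and the names `kKnuthA` and `multiplicativeHash` are hypothetical. The constant 2654435769 is ⌊A · 2^32⌋ for A = (√5 − 1)/2.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: keys are 32-bit unsigned integers and the table size is 2^p.
// kKnuthA is floor(A * 2^32) for A = (sqrt(5) - 1) / 2.
constexpr std::uint32_t kKnuthA = 2654435769u;

// Returns an index in [0, 2^p). Requires 1 <= p <= 32.
std::uint32_t multiplicativeHash(std::uint32_t key, unsigned p) {
    assert(p >= 1 && p <= 32);
    // The 32-bit product wraps modulo 2^32, which keeps exactly the
    // fractional bits of key * A in fixed-point form.
    std::uint32_t product = key * kKnuthA;
    // The top p bits of the fraction correspond to floor(2^p * frac(key * A)).
    return product >> (32 - p);
}
```

For example, with p = 4 (a table of 16 slots), key 1 maps to slot 9 and key 2 maps to slot 3, matching ⌊16 · 0.618...⌋ = 9 and ⌊16 · 0.236...⌋ = 3. The floating-point version required in Exercise 6 accepts any table size, whereas this shortcut only applies when m is a power of 2.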

8 Exercise 7: Separate Chaining Implementation

In this exercise, you will implement a hash table that uses separate chaining with linked lists to
handle collisions. Separate chaining is a collision resolution technique where each bucket in the
hash table contains a linked list of key-value pairs that hash to the same bucket.
Separate chaining provides reliable performance with:

• Average case time complexity: O(1) for insert, search, and delete operations

• Worst case (when all keys hash to the same bucket): O(n) where n is the number of elements

More specifically, the requirements are:


1. Complete the implementation of a hash table class with separate chaining

2. Implement the destructor to properly free memory

3. Implement the insert method to add new key-value pairs or update existing ones

4. Implement the search method to find values by key

5. Implement the remove method to delete nodes from the chains

6. The hash table should store key-value pairs where keys are strings and values are integers
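
The chain operations behind these requirements can be sketched on a single bucket. This is only an illustration under stated assumptions: `ChainNode`, `chainInsert`, `chainRemove`, and `chainClear` are hypothetical free functions that mirror the idea of the exercise rather than its class interface, and the pointer-to-pointer traversal is one of several possible ways to unlink a node.

```cpp
#include <cassert>
#include <string>

// Hypothetical node type; the exercise template defines its own Node struct.
struct ChainNode {
    std::string key;
    int value;
    ChainNode* next;
};

// Inserts key into the chain starting at head, or updates it if present.
// Returns true if a new node was created, false if a value was updated.
bool chainInsert(ChainNode*& head, const std::string& key, int value) {
    ChainNode** link = &head;              // points at the pointer to patch
    while (*link != nullptr) {
        if ((*link)->key == key) {
            (*link)->value = value;        // key exists: update only
            return false;
        }
        link = &(*link)->next;
    }
    *link = new ChainNode{key, value, nullptr};  // append at the tail
    return true;
}

// Unlinks and deletes the node holding key. Returns false if not found.
bool chainRemove(ChainNode*& head, const std::string& key) {
    ChainNode** link = &head;
    while (*link != nullptr) {
        if ((*link)->key == key) {
            ChainNode* doomed = *link;
            *link = doomed->next;          // works for head and inner nodes alike
            delete doomed;
            return true;
        }
        link = &(*link)->next;
    }
    return false;
}

// Frees every node of the chain; a destructor would do this per bucket.
void chainClear(ChainNode*& head) {
    while (head != nullptr) {
        ChainNode* next = head->next;
        delete head;
        head = next;
    }
    assert(head == nullptr);
}
```

Searching for the key before allocating avoids creating a node that would have to be discarded when the key already exists.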

Example Input/Output

1. Input:

• Insert 12 key-value pairs

• Search for "orange"
• Remove "banana" and "fig"

2. Expected output:
Hash Table Contents:
Bucket 0: (cherry, 3) (peach, 15)
Bucket 1: (date, 12) (kiwi, 11)
Bucket 2: (mango, 2)
Bucket 3: (grape, 10)
Bucket 4: (apple, 5) (lemon, 7)
Bucket 5: (banana, 8) (orange, 9)
Bucket 6: (pear, 4) (fig, 6)

Lookup Operations:
Found orange with value 9

After removing 'banana' and 'fig':
Bucket 0: (cherry, 3) (peach, 15)
Bucket 1: (date, 12) (kiwi, 11)
Bucket 2: (mango, 2)
Bucket 3: (grape, 10)
Bucket 4: (apple, 5) (lemon, 7)
Bucket 5: (orange, 9)
Bucket 6: (pear, 4)


// File: Exercise_7.cpp
#include <iostream>
#include <string>
#include <vector>

class HashTable {
private:
    struct Node {
        std::string key;
        int value;
        Node* next;

        Node(const std::string& k, int v) : key(k), value(v), next(nullptr) {}
    };

    std::vector<Node*> table;
    int size;

    int hashFunction(const std::string& key) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % size;
    }

public:
    HashTable(int tableSize) : size(tableSize) {
        table.resize(size, nullptr);
    }

    ~HashTable() {
        // TODO: Free all allocated memory to prevent memory leaks
    }

    // TODO: Implement insert with separate chaining
    void insert(const std::string& key, int value) {
        // 1. Compute the hash value to determine the bucket
        // 2. Traverse the linked list at that bucket; if the key already
        //    exists, update its value and return
        // 3. Otherwise, create a new node
        // 4. If the bucket is empty, make the new node the head;
        //    if not, append the node to the linked list
    }

    // TODO: Implement search
    bool search(const std::string& key, int& value) {
        // 1. Compute the hash value
        // 2. Traverse the linked list at that bucket to find the key
        // 3. Return true and set the value if found, false otherwise

        return false; // Replace with your implementation
    }

    // TODO: Implement remove
    bool remove(const std::string& key) {
        // 1. Compute the hash value
        // 2. Traverse the linked list to find and remove the node with the
        //    given key
        // 3. Return true if removed, false if not found

        return false; // Replace with your implementation
    }

    void print() {
        for (int i = 0; i < size; i++) {
            std::cout << "Bucket " << i << ": ";
            Node* current = table[i];
            if (!current) {
                std::cout << "Empty";
            }
            while (current) {
                std::cout << "(" << current->key << ", " << current->value << ") ";
                current = current->next;
            }
            std::cout << std::endl;
        }
    }
};

int main() {
    HashTable ht(7);

    std::vector<std::pair<std::string, int>> data = {
        {"apple", 5}, {"banana", 8}, {"cherry", 3}, {"date", 12},
        {"grape", 10}, {"lemon", 7}, {"orange", 9}, {"pear", 4},
        {"fig", 6}, {"kiwi", 11}, {"mango", 2}, {"peach", 15}
    };

    for (const auto& item : data) {
        ht.insert(item.first, item.second);
    }

    std::cout << "Hash Table Contents:" << std::endl;
    ht.print();

    std::cout << "\nLookup Operations:" << std::endl;
    int value;
    if (ht.search("orange", value)) {
        std::cout << "Found orange with value " << value << std::endl;
    } else {
        std::cout << "Could not find orange" << std::endl;
    }

    std::cout << "\nAfter removing 'banana' and 'fig':" << std::endl;
    ht.remove("banana");
    ht.remove("fig");
    ht.print();

    return 0;
}

9 Exercise 8: Load Factor Analysis

In this exercise, you will implement a function to analyze how the load factor of a hash table
affects its performance. The load factor is a critical metric in hash tables, defined as the ratio
of the number of stored elements to the total number of buckets. As the load factor increases,
collisions become more frequent, which can degrade performance.
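
As a rough reference point for this analysis, the classical approximations attributed to Knuth for linear probing under a uniform hashing assumption are ½(1 + 1/(1 − α)) probes for a successful search and ½(1 + 1/(1 − α)²) for an unsuccessful one, where α is the load factor. The sketch below simply evaluates these two formulas; the function names are hypothetical, and measured averages can differ noticeably, for instance when the hash function does not spread keys uniformly.

```cpp
#include <cassert>

// Approximate expected probes for a successful search with linear probing
// at load factor alpha, assuming uniform hashing. Requires 0 <= alpha < 1.
double expectedProbesHit(double alpha) {
    assert(alpha >= 0.0 && alpha < 1.0);
    return 0.5 * (1.0 + 1.0 / (1.0 - alpha));
}

// Approximate expected probes for an unsuccessful search (or an insertion
// of a new key) with linear probing at load factor alpha.
double expectedProbesMiss(double alpha) {
    assert(alpha >= 0.0 && alpha < 1.0);
    double free = 1.0 - alpha;  // fraction of empty slots
    return 0.5 * (1.0 + 1.0 / (free * free));
}
```

At α = 0.5 these give 1.5 and 2.5 probes; at α = 0.9 they give about 5.5 and 50.5, which illustrates how quickly unsuccessful searches degrade as the table fills.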
More specifically, the requirements are:

1. Implement the analyzeLoadFactor() function to measure performance at different load factors

2. Create a hash table with a specified initial size


3. Insert random keys until reaching different load factor thresholds (e.g., 0.1, 0.2, ..., 0.9)

4. For each load factor threshold, perform a fixed number of lookup operations

5. Measure and report the average number of probes needed for operations

6. Visualize or display the results in a meaningful way

Example Input/Output

1. Expected output:
Performance Analysis of Hash Table with Different Load Factors
--------------------------------------------------------------
Table size: 1000
Operations per load factor: 1000

Load Factor | Avg Probes (Insert) | Avg Probes (Search) | Avg Probes (Failed Search)
---------------------------------------------------------------------------------------
0.10        | 1.05                | 1.03                | 1.10
0.20        | 1.11                | 1.08                | 1.22
0.30        | 1.17                | 1.15                | 1.43
0.40        | 1.25                | 1.22                | 1.66
0.50        | 1.38                | 1.33                | 2.00
0.60        | 1.52                | 1.48                | 2.50
0.70        | 1.83                | 1.76                | 3.33
0.80        | 2.27                | 2.18                | 5.00
0.90        | 3.60                | 3.48                | 10.00

// File: Exercise_8.cpp
#include <iostream>
#include <string>
#include <vector>
#include <chrono>
#include <random>

class HashTable {
private:
    struct Entry {
        std::string key;
        int value;
        bool isOccupied;

        Entry() : key(""), value(0), isOccupied(false) {}
    };

    std::vector<Entry> table;
    int size;
    int count;
    int probeCount; // Track total probes for performance analysis

    int hashFunction(const std::string& key) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % size;
    }

public:
    HashTable(int tableSize) : size(tableSize), count(0), probeCount(0) {
        table.resize(size);
    }

    bool insert(const std::string& key, int value) {
        if (static_cast<double>(count) / size >= 0.9) {
            return false; // Table is too full
        }

        int index = hashFunction(key);
        int i = 0;

        while (i < size) {
            int probeIndex = (index + i) % size; // Linear probing
            probeCount++;

            if (!table[probeIndex].isOccupied) {
                table[probeIndex].key = key;
                table[probeIndex].value = value;
                table[probeIndex].isOccupied = true;
                count++;
                return true;
            } else if (table[probeIndex].key == key) {
                table[probeIndex].value = value; // Update existing value
                return true;
            }

            i++;
        }

        return false; // Table is full after probing all positions
    }

    bool search(const std::string& key) {
        int index = hashFunction(key);
        int i = 0;

        while (i < size) {
            int probeIndex = (index + i) % size;
            probeCount++;

            if (!table[probeIndex].isOccupied) {
                return false; // Key doesn't exist
            } else if (table[probeIndex].key == key) {
                return true; // Key found
            }

            i++;
        }

        return false; // Key doesn't exist after probing all positions
    }

    int getProbeCount() const {
        return probeCount;
    }

    void resetProbeCount() {
        probeCount = 0;
    }

    double getLoadFactor() const {
        return static_cast<double>(count) / size;
    }
};

// Generate a random string of a given length
std::string generateRandomString(int length) {
    static const char alphanum[] =
        "0123456789"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz";

    // Seed the generator once and reuse it across calls
    static std::mt19937 gen(std::random_device{}());
    std::uniform_int_distribution<> dis(0, sizeof(alphanum) - 2);

    std::string result;
    result.reserve(length);

    for (int i = 0; i < length; ++i) {
        result += alphanum[dis(gen)];
    }

    return result;
}

// TODO: Implement a function to analyze how load factor affects performance
void analyzeLoadFactor() {
    // 1. Create a hash table with a large size (e.g., 1000)
    // 2. Insert random keys until reaching different load factors
    //    (e.g., 0.1, 0.2, ..., 0.9)
    // 3. For each load factor, perform a fixed number of random lookups
    // 4. Measure the average number of probes needed for successful and
    //    unsuccessful lookups
    // 5. Print out the results and draw conclusions
}

int main() {
    analyzeLoadFactor();
    return 0;
}


10 Exercise 9: Implement Cuckoo Hashing

In this exercise, you will implement a hash table using cuckoo hashing with two hash functions.
Cuckoo hashing is a collision resolution technique named after the cuckoo bird’s behavior of pushing
other birds’ eggs out of their nests. Similarly, when inserting a new key that collides with an existing
key, the existing key is displaced to its alternative location.
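
The displacement loop can be sketched on a deliberately simplified table. The following is an illustration under stated assumptions, not the exercise's class: it stores non-negative `int` keys only, uses -1 as the empty marker, and the names `TinyCuckoo`, `h1`, and `h2` as well as the two toy hash functions are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Illustrative cuckoo table for non-negative int keys; -1 marks an empty slot.
struct TinyCuckoo {
    std::vector<int> t1, t2;
    int size;

    explicit TinyCuckoo(int n) : t1(n, -1), t2(n, -1), size(n) { assert(n > 0); }

    // Toy hash functions chosen only to keep the example easy to trace.
    int h1(int key) const { return key % size; }
    int h2(int key) const { return (key / size) % size; }

    // A key can only live in one of its two candidate slots.
    bool contains(int key) const {
        return t1[h1(key)] == key || t2[h2(key)] == key;
    }

    // Returns false if no free slot was found within maxLoop rounds.
    // In that case the last displaced key is no longer stored; a full
    // implementation would keep it and reinsert it while rehashing.
    bool insert(int key, int maxLoop) {
        assert(key >= 0);
        if (contains(key)) return true;
        for (int i = 0; i < maxLoop; ++i) {
            std::swap(key, t1[h1(key)]);  // place key, pick up the occupant
            if (key == -1) return true;   // the slot was empty
            std::swap(key, t2[h2(key)]);  // move the evicted key to table 2
            if (key == -1) return true;
        }
        return false;                     // probable cycle: caller should rehash
    }
};
```

With a table size of 5, inserting 3 and then 8 (both hash to slot 3 in the first table) pushes 3 into the second table, while 3, 28, and 53 share both candidate slots, so the third insertion exhausts `maxLoop` and reports failure.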
More specifically, the requirements are:

1. Complete the implementation of a cuckoo hash table class with two tables

2. Implement the insert() method that handles collisions using the cuckoo algorithm

3. Implement the search() method that looks for keys in both tables

4. Implement the rehash() method to resize the table when cycles are detected

5. The hash table should store key-value pairs where keys are strings and values are integers

Example Input/Output

1. Input: Insert 8 key-value pairs including "apple", "banana", "cherry", etc.

2. Expected output:
Inserted apple
Inserted banana
Inserted cherry
Inserted date
Inserted grape
Inserted lemon
Inserted orange
Inserted pear

Cuckoo Hash Table Contents:
Table 1:
0: Empty
1: Empty
2: banana -> 8
3: Empty
4: Empty
5: date -> 12
6: Empty
7: orange -> 9
8: pear -> 4
9: Empty

Table 2:
0: grape -> 10
1: Empty
2: apple -> 5
3: Empty
4: cherry -> 3
5: Empty
6: lemon -> 7
7: Empty
8: Empty
9: Empty

Current load factor: 0.4

Found orange with value 9

// File: Exercise_9.cpp
#include <iostream>
#include <string>
#include <vector>
#include <chrono> // For seeding the random number generator

class CuckooHashTable {
private:
    struct Entry {
        std::string key;
        int value;
        bool isOccupied;

        Entry() : key(""), value(0), isOccupied(false) {}
    };

    std::vector<Entry> table1;
    std::vector<Entry> table2;
    int size;
    int count;
    int maxLoop; // Maximum number of displacement iterations

    // First hash function
    // Unsigned arithmetic is used so that overflow wraps around instead of
    // being undefined behavior.
    int hash1(const std::string& key) {
        unsigned int sum = 0;
        for (char c : key) {
            sum = sum * 31 + static_cast<unsigned char>(c);
        }
        return static_cast<int>(sum % static_cast<unsigned int>(size));
    }

    // Second hash function
    int hash2(const std::string& key) {
        unsigned int sum = 0;
        for (char c : key) {
            sum = sum * 37 + static_cast<unsigned char>(c);
        }
        return static_cast<int>(sum % static_cast<unsigned int>(size));
    }

public:
    CuckooHashTable(int tableSize) : size(tableSize), count(0), maxLoop(tableSize) {
        table1.resize(size);
        table2.resize(size);
    }

    // TODO: Implement insert with cuckoo hashing
    bool insert(const std::string& key, int value) {
        // 1. Check if the key is already in either table
        // 2. If not, try to insert in table1
        // 3. If table1 position is occupied, evict the current entry and move
        //    it to table2
        // 4. Continue this process until either:
        //    - An empty slot is found
        //    - We've exceeded maxLoop iterations (indicating a cycle)

        return false; // Replace with your implementation
    }

    // TODO: Implement search
    bool search(const std::string& key, int& value) {
        // 1. Check table1 using hash1
        // 2. If not found, check table2 using hash2
        // 3. Return true and set the value if found, false otherwise

        return false; // Replace with your implementation
    }

    // TODO: Implement rehash for when we detect a cycle
    bool rehash() {
        // 1. Create new tables with increased size
        // 2. Reinsert all elements from the original tables
        // 3. Return true if successful, false otherwise

        return false; // Replace with your implementation
    }

    double getLoadFactor() const {
        return static_cast<double>(count) / (2 * size); // Two tables
    }

    void print() {
        std::cout << "Table 1:" << std::endl;
        for (int i = 0; i < size; i++) {
            if (table1[i].isOccupied) {
                std::cout << i << ": " << table1[i].key << " -> "
                          << table1[i].value << std::endl;
            } else {
                std::cout << i << ": Empty" << std::endl;
            }
        }

        std::cout << "\nTable 2:" << std::endl;
        for (int i = 0; i < size; i++) {
            if (table2[i].isOccupied) {
                std::cout << i << ": " << table2[i].key << " -> "
                          << table2[i].value << std::endl;
            } else {
                std::cout << i << ": Empty" << std::endl;
            }
        }
    }
};

int main() {
    CuckooHashTable ht(10);

    std::vector<std::pair<std::string, int>> data = {
        {"apple", 5}, {"banana", 8}, {"cherry", 3}, {"date", 12},
        {"grape", 10}, {"lemon", 7}, {"orange", 9}, {"pear", 4}
    };

    for (const auto& item : data) {
        if (ht.insert(item.first, item.second)) {
            std::cout << "Inserted " << item.first << std::endl;
        } else {
            std::cout << "Failed to insert " << item.first << ", rehashing..."
                      << std::endl;
            if (ht.rehash()) {
                ht.insert(item.first, item.second);
                std::cout << "Inserted " << item.first << " after rehash"
                          << std::endl;
            } else {
                std::cout << "Failed to rehash and insert " << item.first
                          << std::endl;
            }
        }
    }

    std::cout << "\nCuckoo Hash Table Contents:" << std::endl;
    ht.print();

    std::cout << "\nCurrent load factor: " << ht.getLoadFactor() << std::endl;

    int value;
    if (ht.search("orange", value)) {
        std::cout << "Found orange with value " << value << std::endl;
    } else {
        std::cout << "Could not find orange" << std::endl;
    }

    return 0;
}


11 Exercise 10: Perfect Hashing with Secondary Tables

In this exercise, you will implement a two-level hash table structure for perfect hashing of a static
set of keys. Perfect hashing guarantees O(1) worst-case lookup time with no collisions, making it
ideal for static datasets that are queried frequently but rarely updated.
More specifically, the requirements are:

1. Complete the implementation of a perfect hash table using a two-level structure

2. Implement the build() function to construct the hash table from a fixed set of key-value
pairs

3. Implement the search() function to find values by key in constant time

4. Ensure the hash table uses O(n) space overall, where n is the number of elements

// File: Exercise_10.cpp
#include <iostream>
#include <string>
#include <vector>
#include <cmath>

class PerfectHashTable {
private:
    struct SecondaryTable {
        std::vector<std::pair<std::string, int>> entries;
        int size;
        double a; // Universal hash function parameter

        SecondaryTable(int tableSize, double hashParam)
            : size(tableSize), a(hashParam) {
            entries.resize(size, {"", -1});
        }

        int hash(const std::string& key) {
            // Universal hash function for secondary table
            // Unsigned arithmetic keeps the sum non-negative on overflow,
            // so the resulting index always lies in [0, size).
            unsigned int sum = 0;
            for (char c : key) {
                sum = sum * 31 + static_cast<unsigned char>(c);
            }
            return static_cast<int>(size * std::fmod(a * sum, 1.0));
        }
    };

    std::vector<SecondaryTable*> primaryTable;
    int size;

    int primaryHash(const std::string& key) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % size;
    }

public:
    PerfectHashTable(int tableSize) : size(tableSize) {
        // Initialize with nullptrs to create secondary tables only when needed
        primaryTable.resize(size, nullptr);
    }

    ~PerfectHashTable() {
        // Free all secondary tables
        for (SecondaryTable* table : primaryTable) {
            delete table;
        }
    }

    // TODO: Implement the build function to construct the perfect hash table
    void build(const std::vector<std::pair<std::string, int>>& data) {
        // 1. Distribute items into buckets using the primary hash function
        // 2. For each non-empty bucket, create a secondary table with
        //    size = (number of items)^2
        // 3. Choose a proper hash function for each secondary table to avoid
        //    collisions
        // 4. Insert items into secondary tables
    }

    // TODO: Implement the search function
    bool search(const std::string& key, int& value) {
        // 1. Use primary hash to find the correct secondary table
        // 2. If the secondary table exists, use its hash function to find the
        //    item
        // 3. Return true and set the value if found, false otherwise

        return false; // Replace with your implementation
    }

    void print() {
        for (int i = 0; i < size; i++) {
            std::cout << "Primary bucket " << i << ": ";
            if (!primaryTable[i]) {
                std::cout << "Empty" << std::endl;
                continue;
            }

            std::cout << "Secondary table size = " << primaryTable[i]->size
                      << std::endl;
            for (int j = 0; j < primaryTable[i]->size; j++) {
                if (primaryTable[i]->entries[j].second != -1) {
                    std::cout << "  " << j << ": "
                              << primaryTable[i]->entries[j].first
                              << " -> " << primaryTable[i]->entries[j].second
                              << std::endl;
                }
            }
        }
    }
};

int main() {
    std::vector<std::pair<std::string, int>> data = {
        {"apple", 5}, {"banana", 8}, {"cherry", 3}, {"date", 12},
        {"grape", 10}, {"lemon", 7}, {"orange", 9}, {"pear", 4},
        {"fig", 6}, {"kiwi", 11}
    };

    PerfectHashTable pht(7);
    pht.build(data);

    std::cout << "Perfect Hash Table Contents:" << std::endl;
    pht.print();

    int value;
    if (pht.search("date", value)) {
        std::cout << "\nFound date with value " << value << std::endl;
    } else {
        std::cout << "\nCould not find date" << std::endl;
    }

    if (pht.search("watermelon", value)) {
        std::cout << "Found watermelon with value " << value << std::endl;
    } else {
        std::cout << "Could not find watermelon" << std::endl;
    }

    return 0;
}

Performance analysis:
1) Time Complexity

• Build: O(n) expected time, where n is the number of elements. While we need to try multiple
hash functions until finding a collision-free one, the expected number of trials is constant.

• Search: O(1) worst-case time, since we make exactly two hash function evaluations and array
accesses.

2) Space Complexity

• O(n) expected space overall. Although each secondary table is sized quadratically in the
number of elements it contains, the total space across all secondary tables is O(n) in
expectation when the primary table has Θ(n) buckets and uses a good primary hash function.
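
The "try hash functions until one is collision-free" step of the build phase can be sketched as below. This is a minimal illustration under stated assumptions rather than the exercise's solution: it uses distinct integer keys smaller than the prime 2^31 − 1 and the hash family ((a · k + b) mod p) mod m instead of the template's string hash, and the names `SecondaryParams`, `slotOf`, and `findCollisionFree` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

constexpr std::uint64_t kPrime = 2147483647ULL;  // 2^31 - 1, assumed > every key

// Parameters of one secondary hash function: ((a * k + b) mod p) mod m.
struct SecondaryParams {
    std::uint64_t a, b;
    std::size_t m;  // secondary table size, (number of keys)^2
};

std::size_t slotOf(const SecondaryParams& h, std::uint32_t key) {
    return static_cast<std::size_t>(((h.a * key + h.b) % kPrime) % h.m);
}

// Draws (a, b) at random until all keys of one bucket land in distinct slots.
// Assumes the keys are distinct and smaller than kPrime.
SecondaryParams findCollisionFree(const std::vector<std::uint32_t>& keys,
                                  std::mt19937& rng) {
    assert(!keys.empty());
    SecondaryParams h{1, 0, keys.size() * keys.size()};
    std::uniform_int_distribution<std::uint64_t> pickA(1, kPrime - 1);
    std::uniform_int_distribution<std::uint64_t> pickB(0, kPrime - 1);
    for (;;) {
        h.a = pickA(rng);
        h.b = pickB(rng);
        std::vector<bool> used(h.m, false);
        bool collision = false;
        for (std::uint32_t k : keys) {
            std::size_t s = slotOf(h, k);
            if (used[s]) { collision = true; break; }  // retry with new (a, b)
            used[s] = true;
        }
        if (!collision) return h;
    }
}
```

Because the table has n² slots for n keys, a randomly drawn function from this family is collision-free with probability greater than 1/2, which is why the expected number of retries per bucket is a small constant.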

Advantages                                            | Disadvantages
------------------------------------------------------|------------------------------------------------------
Guaranteed O(1) worst-case lookup time                | Not suitable for dynamic datasets (requires rebuilding when data changes)
No need to handle collisions during lookup            | Higher space overhead compared to some other hashing schemes
Good for static datasets that are queried frequently  | Complex implementation compared to simpler hashing methods


12 Exercise 11: Password checker

Write a program that evaluates whether a password is “good” based on specified criteria. The
program should read a candidate password from the command line and a dictionary of common
words from standard input, then determine if the password meets all security requirements.
A password is considered “good” if and only if it meets ALL of the following criteria:

1. It is at least 8 characters long

2. It is not a word found in the dictionary

3. It is not a dictionary word followed by a digit 0-9 (e.g., “password1”)

4. It is not two dictionary words separated by a digit (e.g., “hello2world”)
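
The four criteria can be sketched as a single predicate. This is an illustration only: `isGoodPassword`, `WordLookup`, and `isDigit` are hypothetical names, the dictionary lookup is passed in as a callable so that any hash table from this lab could back it, and matching is assumed to be case-sensitive because the statement does not say otherwise.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <string>

// Hypothetical lookup type: returns true if the word is in the dictionary.
using WordLookup = std::function<bool(const std::string&)>;

bool isDigit(char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

bool isGoodPassword(const std::string& pw, const WordLookup& isWord) {
    if (pw.size() < 8) return false;  // criterion 1: minimum length
    if (isWord(pw)) return false;     // criterion 2: plain dictionary word
    // Criterion 3: dictionary word followed by a single digit.
    if (isDigit(pw.back()) && isWord(pw.substr(0, pw.size() - 1))) {
        return false;
    }
    // Criterion 4: two dictionary words separated by a single digit.
    for (std::size_t i = 1; i + 1 < pw.size(); ++i) {
        if (isDigit(pw[i]) && isWord(pw.substr(0, i)) &&
            isWord(pw.substr(i + 1))) {
            return false;
        }
    }
    return true;
}
```

Each candidate split costs one or two dictionary lookups, so with an O(1) average-time hash table the number of lookups grows linearly with the password length.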


Regulations
Please follow these regulations:

• You are allowed to use any IDE.

• After completing the assignment, check your submission before and after uploading it to Moodle.

• Prohibited libraries: <set>, <unordered_set>, <map>, <unordered_map>, <algorithm>,
<list>, <stack>, <queue>, and <bits/stdc++.h>.

• You can use <vector> or any libraries that are not in the prohibited libraries listed above.

Your source code must be submitted as a single compressed file named according to the
format StudentID.zip. The directory is organized as follows:
StudentID
    Exercise_1.cpp
    Exercise_2.cpp
    Exercise_3.cpp
    Exercise_4.cpp
    Exercise_5.cpp
    Exercise_6.cpp
    Exercise_7.cpp
    Exercise_8.cpp
    Exercise_9.cpp
    Exercise_10.cpp

The end.

