
Lab 6.

HF & HT Data structures and Algorithms CSC10004

Lab 6

Hash Functions & Hash Tables

1 Introduction to Hash Functions and Hash Tables


1.1 Hash Functions
A hash function is a mathematical function that maps data of arbitrary size to fixed-size values.
In the context of data structures, hash functions are used to transform keys into array indices,
allowing for efficient data access. A good hash function has several desirable properties:

• Deterministic: The same input should always produce the same output.

• Efficiency: The computation should be fast.

• Uniform Distribution: The function should map inputs as evenly as possible over the
output range.

• Low Collision Rate: Different inputs should rarely map to the same output.

For a hash function h and a key k, the hash value (or hash code) is calculated as:

h(k) = index in the hash table (1)

Common techniques for creating hash functions include:

1. Division Method: h(k) = k mod m, where m is the size of the hash table.

2. Multiplication Method: h(k) = ⌊m · (k · A mod 1)⌋, where A is a constant in the range (0, 1)
and k · A mod 1 denotes the fractional part of k · A.

3. Universal Hashing: A family of hash functions chosen randomly to ensure good average-
case performance.
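
As a rough illustration of the first and third techniques, the sketch below applies the division method to integer keys and draws one function from the family h(k) = ((a · k + b) mod p) mod m. The names divisionHash, UniversalHash, and makeUniversalHash, as well as the prime p = 2^31 - 1, are assumptions made for this example and are not part of the lab's starter code.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Division method: h(k) = k mod m (assumes a non-negative key and m > 0).
int divisionHash(std::uint64_t key, int m) {
    assert(m > 0);
    return static_cast<int>(key % static_cast<std::uint64_t>(m));
}

// One member of the family h(k) = ((a*k + b) mod p) mod m.
struct UniversalHash {
    std::uint64_t a, b, p;
    int m;
    int operator()(std::uint64_t key) const {
        // key % p and a are both below 2^31, so the product fits in 64 bits.
        return static_cast<int>(((a * (key % p) + b) % p) % static_cast<std::uint64_t>(m));
    }
};

// Draw a from [1, p-1] and b from [0, p-1] at random.
UniversalHash makeUniversalHash(int m, std::mt19937_64& rng) {
    assert(m > 0);
    const std::uint64_t p = 2147483647ULL;  // the prime 2^31 - 1 (assumed choice)
    std::uniform_int_distribution<std::uint64_t> pickA(1, p - 1);
    std::uniform_int_distribution<std::uint64_t> pickB(0, p - 1);
    return UniversalHash{pickA(rng), pickB(rng), p, m};
}
```

For example, divisionHash(42, 10) yields 2. A UniversalHash object returns the same index for the same key every time, but a freshly drawn object will usually map the keys differently, which is what protects against a fixed set of bad inputs.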

1.2 Hash Tables


A hash table (or hash map) is a data structure that implements an associative array, a structure
that can map keys to values. A hash table uses a hash function to compute an index into an array
of buckets or slots, from which the desired value can be found.

University of Science Faculty of Information Technology Page 1



Figure 1: A simple hash table with string keys and integer values

The main advantage of hash tables is their efficiency: they provide constant-time average-case
performance, O(1), for basic operations like insertion, deletion, and lookup, provided that the hash
function spreads keys evenly and the load factor stays bounded.
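
To make the associative-array idea concrete, here is a minimal sketch that uses the standard library's std::unordered_map, which is a hash table. The helper names makeStock and lookup and the sample data are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Build a small table with string keys and integer values (sample data is made up).
std::unordered_map<std::string, int> makeStock() {
    std::unordered_map<std::string, int> stock;
    stock["apple"] = 5;            // insertion
    stock["banana"] = 8;
    stock.insert({"cherry", 3});   // insertion through insert()
    stock.erase("banana");         // deletion
    assert(stock.size() == 2);     // apple and cherry remain
    return stock;
}

// Lookup: returns true and writes the value if the key is present.
bool lookup(const std::unordered_map<std::string, int>& table,
            const std::string& key, int& value) {
    auto it = table.find(key);
    if (it == table.end()) return false;
    value = it->second;
    return true;
}
```

Each of these operations runs in constant time on average. The exercises in this lab build the same behaviour by hand.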

1.3 Collision Resolution


A collision occurs when two different keys hash to the same index. Since this is practically
unavoidable, hash table implementations must include collision resolution strategies:

1. Separate Chaining: Each bucket holds a linked list of all key-value pairs that hash to the
same index.
lookup time = O(1 + α) (2)

where α is the load factor (number of elements divided by the number of buckets).

2. Open Addressing: All elements are stored directly in the hash table array. When a collision
occurs, we probe for an empty slot according to some probing sequence:


• Linear Probing: h(k, i) = (h(k) + i) mod m, where i is the probe sequence number.
• Quadratic Probing: h(k, i) = (h(k) + c₁·i + c₂·i²) mod m, where c₁ and c₂ are constants.
• Double Hashing: h(k, i) = (h₁(k) + i · h₂(k)) mod m, using two different hash functions.
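
The three probing formulas can be compared side by side with a short sketch. The function names below are hypothetical; each one returns the first few slots that would be examined for a given initial hash value.

```cpp
#include <cassert>
#include <vector>

// First `count` slots visited by linear probing: (h + i) mod m.
std::vector<int> linearProbes(int h, int m, int count) {
    assert(m > 0 && h >= 0);
    std::vector<int> slots;
    for (int i = 0; i < count; ++i) slots.push_back((h + i) % m);
    return slots;
}

// Quadratic probing: (h + c1*i + c2*i^2) mod m.
std::vector<int> quadraticProbes(int h, int m, int c1, int c2, int count) {
    assert(m > 0 && h >= 0);
    std::vector<int> slots;
    for (int i = 0; i < count; ++i) slots.push_back((h + c1 * i + c2 * i * i) % m);
    return slots;
}

// Double hashing: (h1 + i*h2) mod m, where h2 must be non-zero.
std::vector<int> doubleHashProbes(int h1, int h2, int m, int count) {
    assert(m > 0 && h1 >= 0 && h2 > 0);
    std::vector<int> slots;
    for (int i = 0; i < count; ++i) slots.push_back((h1 + i * h2) % m);
    return slots;
}
```

With an initial hash of 11 and m = 13, linear probing visits 11, 12, 0, 1; quadratic probing with c₁ = 0 and c₂ = 1 visits 11, 12, 2, 7; and double hashing with a step of 4 visits 11, 2, 6, 10.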

1.4 Performance Analysis


The performance of hash tables depends on various factors:

• Load Factor (α): The ratio of the number of elements to the table size.

α = n / m (3)

where n is the number of elements and m is the table size.

• Time Complexity:

– Average case: O(1) for search, insert, and delete operations.


– Worst case: O(n) when many elements collide at the same index.

• Space Complexity: O(n), where n is the number of elements.
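
The load factor from equation (3) is simple to compute in code. The sketch below is only an illustration: the names loadFactor and exceedsThreshold are hypothetical, and the default limit of 0.7 mirrors the 70% limit used later in this lab rather than a universal constant.

```cpp
#include <cassert>

// Load factor alpha = n / m for n stored elements and m slots.
double loadFactor(int n, int m) {
    assert(m > 0 && n >= 0);
    return static_cast<double>(n) / m;
}

// True when the table is fuller than the chosen limit (0.7 is an assumed default).
bool exceedsThreshold(int n, int m, double maxLoad = 0.7) {
    return loadFactor(n, m) > maxLoad;
}
```

For instance, 9 elements in 13 slots give α ≈ 0.69, which is still below the limit, while 10 elements in 13 slots give α ≈ 0.77, which exceeds it.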

1.5 Applications
Hash tables are widely used in various applications:

• Database Indexing: To quickly locate records.

• Caches: For fast data retrieval and lookup.

• Symbol Tables: In compilers and interpreters.

• Associative Arrays: Implementation in programming languages.

• Password Authentication: Storing password hashes instead of actual passwords.

• Spell Checkers: Quick word lookup.

• Internet Routers: For packet forwarding.


In this lab, you will implement various hash functions and hash table operations, exploring
the trade-offs between different collision resolution strategies and analyzing their performance
characteristics.

2 Exercise 1: Basic Hash Function Implementation


In this exercise, you will implement a simple hash function for string keys. Hash functions are
crucial components of hash tables, mapping data of variable size to fixed-size values.
Your task is to create a hash function that:

1. Takes a string key and table size as parameters

2. Sums the ASCII values of all characters in the string

3. Returns the sum modulo the table size

More specifically, the requirements are:

1. Implement hashFunction() so that it processes string keys

2. The function should return an integer index within the range [0, tableSize-1]

3. Use modular arithmetic to ensure the hash value fits within the table size

4. Test the function with the provided example strings

Example Input/Output

1. Input:

• Input strings: "apple", "banana", "cherry", "date", "elderberry"


• Table size: 10

2. Expected output:
Hash values for different keys (table size = 10):
Key: apple, Hash value: 0
Key: banana, Hash value: 9
Key: cherry, Hash value: 3
Key: date, Hash value: 4
Key: elderberry, Hash value: 2

// File: Exercise_1.cpp
#include <iostream>
#include <string>
#include <vector>

int hashFunction(const std::string& key, int tableSize) {
    // TODO: Implement a hash function that sums the ASCII values of all characters
    // in the key and returns the sum modulo tableSize

    return 0; // Replace with your implementation
}

int main() {
    std::string keys[] = {"apple", "banana", "cherry", "date", "elderberry"};
    int tableSize = 10;

    std::cout << "Hash values for different keys (table size = " << tableSize << "):" << std::endl;
    for (const auto& key : keys) {
        std::cout << "Key: " << key << ", Hash value: " << hashFunction(key, tableSize) << std::endl;
    }

    return 0;
}

You can also use a stronger hash function to improve the distribution of hash values:

1. Polynomial Hash Function: A polynomial hash function considers character positions, making
it more sensitive to character order

int improvedHashFunction1(const std::string& key, int tableSize) {
    long long hash = 0;
    const int p = 31;       // Prime base
    long long p_pow = 1;    // p^i mod tableSize

    for (char c : key) {
        // Use the unsigned character code so that characters outside 'a'..'z'
        // cannot produce a negative term (and therefore a negative index)
        int code = static_cast<unsigned char>(c) + 1;
        hash = (hash + code * p_pow) % tableSize;
        p_pow = (p_pow * p) % tableSize;
    }

    return static_cast<int>(hash);
}

2. djb2 Hash Function: A well-known string hash function with good distribution properties

int improvedHashFunction2(const std::string& key, int tableSize) {
    unsigned long hash = 5381;

    for (char c : key) {
        hash = ((hash << 5) + hash) + static_cast<unsigned char>(c); // hash * 33 + c
    }

    return static_cast<int>(hash % tableSize);
}

3. FNV-1a Hash Function: Fast with good distribution and low collision rates

int improvedHashFunction3(const std::string& key, int tableSize) {
    // 32-bit FNV-1a constants (assumes unsigned int is 32 bits wide)
    const unsigned int fnv_prime = 16777619u;
    unsigned int hash = 2166136261u;

    for (char c : key) {
        hash ^= static_cast<unsigned char>(c);
        hash *= fnv_prime;
    }

    return static_cast<int>(hash % tableSize);
}
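
To see why sensitivity to character order matters, the sketch below compares the additive hash from the exercise with a djb2-style hash on two strings that contain the same letters. The names sumHash and djb2Hash and the table size 101 are choices made for this illustration only.

```cpp
#include <cassert>
#include <string>

// Additive hash: sum of character codes modulo the table size.
int sumHash(const std::string& key, int tableSize) {
    assert(tableSize > 0);
    int sum = 0;
    for (char c : key) sum += static_cast<unsigned char>(c);
    return sum % tableSize;
}

// djb2-style hash: hash * 33 + c, so a character's weight depends on its position.
int djb2Hash(const std::string& key, int tableSize) {
    assert(tableSize > 0);
    unsigned long hash = 5381;
    for (char c : key) hash = hash * 33 + static_cast<unsigned char>(c);
    return static_cast<int>(hash % static_cast<unsigned long>(tableSize));
}
```

With a table size of 101, sumHash gives "ab" and "ba" the same index because addition ignores order, and the same holds for any pair of anagrams such as "listen" and "silent". djb2Hash separates "ab" from "ba" because the first character is multiplied by 33 one more time than the second.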

3 Exercise 2: Collision Detection


In this exercise, you will implement a function to detect collisions in a hash table. Collisions occur
when two or more distinct keys hash to the same index in the hash table. Your task is to identify
and report all collisions that occur when a set of keys is hashed using a simple hash function.
More specifically, the requirements are:

1. Implement the detectCollisions() function that identifies which indices have collisions

2. Track how many keys hash to each index in the table

3. Print all indices that have more than one key (collisions)


4. List the specific keys that collide at each index

Example Input/Output

1. Input:

• Input keys: "cat", "dog", "rat", "pig", "owl", "fox", "hen", "ant", "bee"
• Table size: 7

2. Expected output:
Detecting collisions for table size 7:
Collision at index 4:
  - "cat"
  - "fox"
Collision at index 5:
  - "rat"
  - "pig"
Collision at index 6:
  - "dog"
  - "bee"

// File: Exercise_2.cpp
#include <iostream>
#include <string>
#include <vector>

int hashFunction(const std::string& key, int tableSize) {
    int sum = 0;
    for (char c : key) {
        sum += static_cast<int>(c);
    }
    return sum % tableSize;
}

void detectCollisions(const std::vector<std::string>& keys, int tableSize) {
    // TODO: Create a vector to track how many keys hash to each index
    // Print out all indices that have more than one key (collisions)
    // For each collision, print the keys that collided
}

int main() {
    std::vector<std::string> keys = {"cat", "dog", "rat", "pig", "owl", "fox", "hen", "ant", "bee"};
    int tableSize = 7;

    std::cout << "Detecting collisions for table size " << tableSize << ":" << std::endl;
    detectCollisions(keys, tableSize);

    return 0;
}

You can also apply the following techniques to reduce collisions:

1. Use Prime Number Table Sizes: Prime numbers help distribute hash values more evenly

// Choose a prime number close to but larger than your expected number of elements
int betterTableSize = 11; // Instead of 10

2. Universal Hashing: Choose the hash function at random from a carefully designed family

int universalHashFunction(const std::string& key, int tableSize, int a, int b, int p) {
    // p is a prime larger than the largest possible character value
    // a and b are random integers between 1 and p-1
    long long hash = 0;
    for (char c : key) {
        hash = (hash * a + static_cast<unsigned char>(c)) % p;
    }
    // Add the random offset b before reducing to the table range
    return static_cast<int>(((hash + b) % p) % tableSize);
}

3. Double Hashing: Using a secondary hash function to resolve collisions

int secondHashFunction(const std::string& key, int tableSize) {
    // A different hash function than the primary one.
    // Unsigned arithmetic wraps on overflow instead of becoming negative.
    unsigned int hash = 0;
    for (char c : key) {
        hash = hash * 31u + static_cast<unsigned char>(c);
    }
    // Make sure this never returns 0 to avoid infinite loops in probing
    return 1 + static_cast<int>(hash % static_cast<unsigned int>(tableSize - 1));
}
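
The effect of a prime table size is easiest to see with keys that share a common factor with the table size. The sketch below is an illustration under simple assumptions: integer keys, the division method, and a hypothetical helper named largestBucket.

```cpp
#include <cassert>
#include <vector>

// Size of the fullest bucket when integer keys are hashed with k mod m.
int largestBucket(const std::vector<int>& keys, int m) {
    assert(m > 0);
    std::vector<int> counts(m, 0);
    int largest = 0;
    for (int k : keys) {
        int index = k % m;   // keys are assumed to be non-negative
        ++counts[index];
        if (counts[index] > largest) largest = counts[index];
    }
    return largest;
}
```

For the keys 10, 20, ..., 100, a table of size 10 puts all ten keys into bucket 0, while a table of size 11 gives every key its own bucket.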


4 Exercise 3: Linear Probing Implementation


In this exercise, you will implement a hash table that uses linear probing to handle collisions.
Linear probing is a collision resolution technique where, if the intended slot for a key is already
occupied, the algorithm checks the next slot, and continues until it finds an empty slot.
More specifically, the requirements are:

1. Complete the implementation of a hash table class with linear probing

2. Implement the insert method that handles collisions using linear probing

3. Implement the search method that can find values even after collision resolution

4. The hash table should store key-value pairs where keys are strings and values are integers

Example Input/Output

1. Input:

• Insert ("apple", 5); Insert ("banana", 8); Insert ("cherry", 3); Insert ("date",
12); Insert ("grape", 10); Insert ("lemon", 7)
• Search for "banana"; Search for "kiwi"

2. Expected output:
Hash Table Contents:
0: apple -> 5
1: lemon -> 7
2: Empty
3: cherry -> 3
4: date -> 12
5: Empty
6: Empty
7: grape -> 10
8: Empty
9: banana -> 8

Lookup Operations:
Found banana with value 8
Could not find kiwi
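
The position of lemon at index 1 in the expected output can be traced with a small helper. This is only a sketch of the probing step, not the full exercise: the names sumHash and probePath are hypothetical, and the occupancy of the table is passed in as a plain vector of flags.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same additive hash as the exercise skeleton.
int sumHash(const std::string& key, int size) {
    int sum = 0;
    for (char c : key) sum += static_cast<unsigned char>(c);
    return sum % size;
}

// Returns the slots examined, in order, until a free one is found.
// occupied[i] is true when slot i already holds an entry.
std::vector<int> probePath(const std::string& key, const std::vector<bool>& occupied) {
    const int size = static_cast<int>(occupied.size());
    assert(size > 0);
    std::vector<int> path;
    int index = sumHash(key, size);
    for (int i = 0; i < size; ++i) {
        path.push_back(index);
        if (!occupied[index]) break;   // free slot found
        index = (index + 1) % size;    // linear probing: move to the next slot
    }
    return path;
}
```

The character codes of "lemon" sum to 539, so its initial index is 9. With slots 0, 3, 4, 7, and 9 already taken by apple, cherry, date, grape, and banana, the probe path is 9, 0, 1: the search wraps around the end of the table and stops at the first free slot.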

// File: Exercise_3.cpp
#include <iostream>
#include <string>
#include <vector>

class HashTable {
private:
    struct Entry {
        std::string key;
        int value;
        bool isOccupied;

        Entry() : key(""), value(0), isOccupied(false) {}
    };

    std::vector<Entry> table;
    int size;

    int hashFunction(const std::string& key) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % size;
    }

public:
    HashTable(int tableSize) : size(tableSize) {
        table.resize(size);
    }

    // TODO: Implement the insert method with linear probing
    bool insert(const std::string& key, int value) {
        // 1. Compute the initial hash value
        // 2. If the slot is empty, insert the entry
        // 3. If there's a collision, use linear probing to find the next available slot
        // 4. Return false if the table is full

        return false; // Replace with your implementation
    }

    // TODO: Implement the search method
    bool search(const std::string& key, int& value) {
        // 1. Compute the initial hash value
        // 2. Check if the key exists at that position
        // 3. If not, use linear probing to search for the key
        // 4. Return true and set the value if found, false otherwise

        return false; // Replace with your implementation
    }

    // Print the contents of the hash table
    void print() {
        for (int i = 0; i < size; i++) {
            if (table[i].isOccupied) {
                std::cout << i << ": " << table[i].key << " -> " << table[i].value << std::endl;
            } else {
                std::cout << i << ": Empty" << std::endl;
            }
        }
    }
};

int main() {
    HashTable ht(10);

    ht.insert("apple", 5);
    ht.insert("banana", 8);
    ht.insert("cherry", 3);
    ht.insert("date", 12);
    ht.insert("grape", 10);
    ht.insert("lemon", 7);

    std::cout << "Hash Table Contents:" << std::endl;
    ht.print();

    std::cout << "\nLookup Operations:" << std::endl;
    int value;
    if (ht.search("banana", value)) {
        std::cout << "Found banana with value " << value << std::endl;
    } else {
        std::cout << "Could not find banana" << std::endl;
    }

    if (ht.search("kiwi", value)) {
        std::cout << "Found kiwi with value " << value << std::endl;
    } else {
        std::cout << "Could not find kiwi" << std::endl;
    }

    return 0;
}

You can also extend the hash table with the following features:

1. Adding Deletion Capability: Add a method to delete entries while properly handling the
probe sequence

bool remove(const std::string& key) {
    // Find the key using linear probing
    // Mark the slot as deleted but not empty (tombstone)
    // This requires adding an "isDeleted" flag to the Entry struct
    // Return true if successful, false if key not found

    return false; // Replace with your implementation
}

2. Load Factor Tracking: Add functionality to monitor and maintain an efficient load factor

float getLoadFactor() {
    int occupiedCount = 0;
    for (const auto& entry : table) {
        if (entry.isOccupied) {
            occupiedCount++;
        }
    }
    return static_cast<float>(occupiedCount) / size;
}

void rehash() {
    // Implement a rehashing mechanism when load factor exceeds threshold
    // Create a new table with larger size
    // Re-insert all elements from the old table
}

3. Quadratic Probing

// In insert method:
int i = 0;
int index = (initialHash + i * i) % size; // Use quadratic sequence instead of linear
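
The tombstone idea from the first item can be sketched as a compact linear-probing table. This is one possible arrangement under stated assumptions, not the lab's reference solution: the class name TombstoneTable and the helper findSlot are hypothetical, the capacity is fixed, and there is no rehashing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal linear-probing table with tombstones (illustrative only).
class TombstoneTable {
    struct Entry {
        std::string key;
        int value = 0;
        bool isOccupied = false;
        bool isDeleted = false;   // tombstone marker
    };
    std::vector<Entry> table;

    int hashFunction(const std::string& key) const {
        int sum = 0;
        for (char c : key) sum += static_cast<unsigned char>(c);
        return sum % static_cast<int>(table.size());
    }

    // Index of the slot holding `key`, or -1. Tombstones do not stop the search.
    int findSlot(const std::string& key) const {
        const int size = static_cast<int>(table.size());
        int index = hashFunction(key);
        for (int i = 0; i < size; ++i, index = (index + 1) % size) {
            const Entry& e = table[index];
            if (!e.isOccupied && !e.isDeleted) return -1;   // truly empty: chain ends
            if (e.isOccupied && e.key == key) return index;
        }
        return -1;
    }

public:
    explicit TombstoneTable(int size) : table(size) { assert(size > 0); }

    bool insert(const std::string& key, int value) {
        int existing = findSlot(key);
        if (existing != -1) { table[existing].value = value; return true; }  // update
        const int size = static_cast<int>(table.size());
        int index = hashFunction(key);
        for (int i = 0; i < size; ++i, index = (index + 1) % size) {
            if (!table[index].isOccupied) {   // empty slot or reusable tombstone
                table[index].key = key;
                table[index].value = value;
                table[index].isOccupied = true;
                table[index].isDeleted = false;
                return true;
            }
        }
        return false;   // table is full
    }

    bool search(const std::string& key, int& value) const {
        int index = findSlot(key);
        if (index == -1) return false;
        value = table[index].value;
        return true;
    }

    bool remove(const std::string& key) {
        int index = findSlot(key);
        if (index == -1) return false;
        table[index].isOccupied = false;
        table[index].isDeleted = true;   // leave a tombstone, not an empty slot
        return true;
    }
};
```

In a table of size 10, "banana" and "lemon" both hash to index 9, so lemon is stored at index 0. After banana is removed, a search for lemon still succeeds because the tombstone at index 9 tells the search to keep probing. Had the slot been marked empty, the search would have stopped there and reported lemon as missing.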

5 Exercise 4: Quadratic Probing


In this exercise, you will implement a hash table that uses quadratic probing to resolve collisions.
Quadratic probing is a collision resolution technique where, if the intended slot for a key is already
occupied, the algorithm checks positions at quadratically increasing distances from the original
position.
More specifically, the requirements are:

1. Complete the implementation of a hash table class with quadratic probing

2. Implement the insert method that handles collisions using quadratic probing

3. Implement the search method that can find values even after collision resolution

4. Ensure that insert returns false once the table is more than 70% full

5. Handle deleted entries correctly during search operations

Example Input/Output

1. Input:

• Insert 9 key-value pairs including "apple", "banana", "cherry", etc.

2. Expected output:
Inserted apple
Inserted banana
Inserted cherry
Inserted date
Inserted grape
Inserted lemon
Inserted orange
Inserted pear
Inserted fig

Hash Table Contents:
0: orange -> 9
1: Empty
2: fig -> 6
3: cherry -> 3
4: Empty
5: Empty
6: lemon -> 7
7: grape -> 10
8: pear -> 4
9: Empty
10: apple -> 5
11: banana -> 8
12: date -> 12

// File: Exercise_4.cpp
#include <iostream>
#include <string>
#include <vector>

class HashTable {
private:
    struct Entry {
        std::string key;
        int value;
        bool isOccupied;
        bool isDeleted; // For handling deletions

        Entry() : key(""), value(0), isOccupied(false), isDeleted(false) {}
    };

    std::vector<Entry> table;
    int size;
    int count; // Number of elements in the table

    int hashFunction(const std::string& key) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % size;
    }

public:
    HashTable(int tableSize) : size(tableSize), count(0) {
        table.resize(size);
    }

    // TODO: Implement the insert method with quadratic probing
    bool insert(const std::string& key, int value) {
        // 1. If the table is more than 70% full, return false
        // 2. Compute the initial hash value
        // 3. Use quadratic probing: h(k, i) = (h(k) + i^2) % size to find an empty slot
        // 4. Return true if successful, false if the item cannot be inserted

        return false; // Replace with your implementation
    }

    // TODO: Implement the search method
    bool search(const std::string& key, int& value) {
        // Use quadratic probing for searching

        return false; // Replace with your implementation
    }

    // Print the contents of the hash table
    void print() {
        for (int i = 0; i < size; i++) {
            if (table[i].isOccupied && !table[i].isDeleted) {
                std::cout << i << ": " << table[i].key << " -> " << table[i].value << std::endl;
            } else if (table[i].isDeleted) {
                std::cout << i << ": Deleted" << std::endl;
            } else {
                std::cout << i << ": Empty" << std::endl;
            }
        }
    }
};

int main() {
    // Using a prime number for table size is recommended for quadratic probing
    HashTable ht(13);

    std::vector<std::pair<std::string, int>> data = {
        {"apple", 5}, {"banana", 8}, {"cherry", 3},
        {"date", 12}, {"grape", 10}, {"lemon", 7},
        {"orange", 9}, {"pear", 4}, {"fig", 6}
    };

    for (const auto& item : data) {
        if (ht.insert(item.first, item.second)) {
            std::cout << "Inserted " << item.first << std::endl;
        } else {
            std::cout << "Failed to insert " << item.first << std::endl;
        }
    }

    std::cout << "\nHash Table Contents:" << std::endl;
    ht.print();

    return 0;
}

You can also extend the hash table with the following techniques:

1. Double Hashing: Replace the quadratic step with a step size computed by a second hash
function for better distribution

int secondHash(const std::string& key) {
    // Unsigned arithmetic wraps on overflow, so the result can never be negative
    unsigned int hash = 0;
    for (char c : key) {
        hash = hash * 17u + static_cast<unsigned char>(c);
    }
    // Ensure result is between 1 and size-1
    return 1 + static_cast<int>(hash % static_cast<unsigned int>(size - 1));
}

// Then in insert/search:
int step = secondHash(key);
int index = (initialHash + i * step) % size;

2. Automatic Rehashing: Implement a rehashing mechanism when the load factor exceeds a
threshold

void rehash() {
    std::vector<Entry> oldTable = table;
    int oldSize = size;

    // Double the size, preferably to the next prime number
    size = nextPrime(2 * size);
    count = 0;
    table.clear();
    table.resize(size);

    // Reinsert all non-deleted entries
    for (const auto& entry : oldTable) {
        if (entry.isOccupied && !entry.isDeleted) {
            insert(entry.key, entry.value);
        }
    }
}

3. Prime-Sized Tables: Prefer prime numbers for the table size. With a prime size, quadratic
probing typically reaches more distinct slots before the probe sequence starts to repeat

// Recall the simple prime-checking function from your Fundamental class
bool isPrime(int n) {
    if (n <= 1) return false;
    if (n <= 3) return true;
    if (n % 2 == 0 || n % 3 == 0) return false;

    for (int i = 5; i * i <= n; i += 6) {
        if (n % i == 0 || n % (i + 2) == 0) return false;
    }
    return true;
}

// Returns the smallest prime strictly greater than n
int nextPrime(int n) {
    if (n <= 1) return 2;

    int prime = n;
    bool found = false;

    while (!found) {
        prime++;
        if (isPrime(prime)) found = true;
    }

    return prime;
}
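
To check how much of the table a quadratic probe sequence can actually reach, one can count the distinct slots it visits. The helper below is a hypothetical illustration for the sequence h(k, i) = (h(k) + i²) mod m used in this exercise.

```cpp
#include <cassert>
#include <set>

// Number of distinct slots reached by (start + i*i) mod m for i = 0..m-1.
int distinctQuadraticSlots(int start, int m) {
    assert(m > 0 && start >= 0);
    std::set<int> seen;
    for (long long i = 0; i < m; ++i) {
        seen.insert(static_cast<int>((start + i * i) % m));
    }
    return static_cast<int>(seen.size());
}
```

For the prime size m = 13 the sequence reaches 7 of the 13 slots, whereas for m = 16 it reaches only 4 of the 16 slots. In both cases an insertion can fail even though free slots remain, which is one reason to cap the load factor and to rehash into a larger prime-sized table.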

6 Exercise 5: Double Hashing


In this exercise, you will implement a hash table that uses double hashing to resolve collisions.
Double hashing is a collision resolution technique that uses two hash functions: the first to deter-
mine the initial position, and the second to determine the step size for probing when a collision
occurs.
More specifically, the requirements are:

1. Complete the implementation of a hash table class with double hashing

2. Implement the hash2() function that creates a suitable secondary hash

3. Implement the insert() method that handles collisions using double hashing

4. Implement the search() method that can find values after collision resolution

5. The hash table should store key-value pairs where keys are strings and values are integers

Example Input/Output

1. Input:

• Insert 10 key-value pairs including "cat", "dog", "bird", etc.


• Search for "tiger"

2. Expected output:
Inserted cat
Inserted dog
Inserted bird
Inserted fish
Inserted lion
Inserted tiger
Inserted bear
Inserted wolf
Inserted fox
Inserted deer

Hash Table Contents:
0: cat -> 1
1: bird -> 3
2: dog -> 2
3: Empty
4: Empty
5: lion -> 5
6: tiger -> 6
7: bear -> 7
8: fox -> 9
9: deer -> 10
10: fish -> 4
11: wolf -> 8
12: Empty

Found tiger with value 6

// File: Exercise_5.cpp
#include <iostream>
#include <string>
#include <vector>

class HashTable {
private:
    struct Entry {
        std::string key;
        int value;
        bool isOccupied;

        Entry() : key(""), value(0), isOccupied(false) {}
    };

    std::vector<Entry> table;
    int size;
    int count;

    // Primary hash function
    int hash1(const std::string& key) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % size;
    }

    // Secondary hash function
    int hash2(const std::string& key) {
        // TODO: Implement a different hash function for double hashing
        // It should never return 0 to avoid an infinite loop during probing
        // A common approach is to use a prime number less than the table size:
        //     R - (key hash % R)
        // where R is the largest prime number less than table size

        return 1; // Replace with your implementation
    }

public:
    HashTable(int tableSize) : size(tableSize), count(0) {
        table.resize(size);
    }

    // TODO: Implement insert with double hashing
    bool insert(const std::string& key, int value) {
        // 1. If the table is more than 70% full, return false
        // 2. Compute both hash values
        // 3. Use double hashing formula: h(k, i) = (h1(k) + i * h2(k)) % size
        // 4. Return true if successful, false if the item cannot be inserted

        return false; // Replace with your implementation
    }

    // TODO: Implement search with double hashing
    bool search(const std::string& key, int& value) {
        // Use double hashing for searching

        return false; // Replace with your implementation
    }

    void print() {
        for (int i = 0; i < size; i++) {
            if (table[i].isOccupied) {
                std::cout << i << ": " << table[i].key << " -> " << table[i].value << std::endl;
            } else {
                std::cout << i << ": Empty" << std::endl;
            }
        }
    }
};

int main() {
    HashTable ht(13); // Using a prime number for table size

    std::vector<std::pair<std::string, int>> data = {
        {"cat", 1}, {"dog", 2}, {"bird", 3}, {"fish", 4},
        {"lion", 5}, {"tiger", 6}, {"bear", 7}, {"wolf", 8},
        {"fox", 9}, {"deer", 10}
    };

    for (const auto& item : data) {
        if (ht.insert(item.first, item.second)) {
            std::cout << "Inserted " << item.first << std::endl;
        } else {
            std::cout << "Failed to insert " << item.first << std::endl;
        }
    }

    std::cout << "\nHash Table Contents:" << std::endl;
    ht.print();

    int value;
    if (ht.search("tiger", value)) {
        std::cout << "\nFound tiger with value " << value << std::endl;
    } else {
        std::cout << "\nCould not find tiger" << std::endl;
    }

    return 0;
}

You can also apply the following techniques to improve the hash table:

1. Better Secondary Hash Function: A popular secondary hash function uses a prime number
smaller than the table size: PRIME - (key % PRIME), where PRIME is a prime smaller than
TABLE_SIZE.

2. Load Factor Monitoring: Add functionality to track and respond to the hash table's load
factor.

3. Handling Deletion: When deleting elements, it is important to place a "tombstone" at the
location rather than simply marking it as empty. This allows the search sequence to continue
past deleted elements. This can be implemented by adding an isDeleted flag to the Entry
structure. Please check this link for more information.
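
The secondary hash from the first item can be sketched as follows. The names largestPrimeBelow and secondaryHash are hypothetical, and taking the key's numeric value to be the sum of its character codes is an assumption of this example rather than something the lab prescribes.

```cpp
#include <cassert>
#include <string>

// Largest prime strictly below n (assumes n > 2).
int largestPrimeBelow(int n) {
    assert(n > 2);
    for (int candidate = n - 1; candidate >= 2; --candidate) {
        bool prime = true;
        for (int d = 2; d * d <= candidate; ++d) {
            if (candidate % d == 0) { prime = false; break; }
        }
        if (prime) return candidate;
    }
    return 2;   // not reached when n > 2
}

// Secondary hash R - (k mod R), where k is the sum of the character codes.
// The result lies in [1, R], so the probe step is never 0.
int secondaryHash(const std::string& key, int tableSize) {
    const int R = largestPrimeBelow(tableSize);
    int sum = 0;
    for (char c : key) sum += static_cast<unsigned char>(c);
    return R - (sum % R);
}
```

For a table of size 13 this uses R = 11. The key "deer" has a character sum of 416, so its step is 11 - (416 mod 11) = 2; it collides with "cat" at index 0, then with "dog" at index 2, and would settle at index 4. The expected output above places deer at index 9 instead, which is what a step of 1 + (416 mod 12) = 9 produces, so the final layout depends on which secondary hash you choose.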

7 Exercise 6: Universal Hash Function


In this exercise, you will implement a universal hash function using the multiplication method as
described by Knuth. This method is particularly effective at distributing hash values uniformly
across a hash table, making it suitable for various applications.
More specifically, the requirements are:

1. Implement the universalHash() function using the multiplication method

2. The function should take a key, a constant A (between 0 and 1), and the table size as
parameters

3. Test the function with different values of A to observe how the distribution changes

4. Analyze the distribution of hash values for different keys and constants

Example Input/Output

1. Input:

• Keys: 123, 456, 789, 101, 202, 303, 404, 505, 606, 707
• Constant A: 0.6180339887 (golden ratio minus 1)
• Table size: 10

2. Expected output:

Hash values using multiplication method (table size = 10):
Key: 123, Hash: 0
Key: 456, Hash: 8
Key: 789, Hash: 6
Key: 101, Hash: 4
Key: 202, Hash: 8
Key: 303, Hash: 2
Key: 404, Hash: 6
Key: 505, Hash: 1
Key: 606, Hash: 5
Key: 707, Hash: 9

// File: Exercise_6.cpp
#include <iostream>
#include <vector>
#include <cmath>

// Universal hash function using multiplication method
int universalHash(int key, double A, int m) {
    // TODO: Implement the multiplication method
    // 1. Multiply the key by A (a constant between 0 and 1)
    // 2. Take the fractional part of the result
    // 3. Multiply by m and take the floor
    // This approach tends to distribute keys uniformly across the hash table,
    // even when the keys have patterns or are sequential.
    // The optimal choice of A is often cited as (sqrt(5) - 1) / 2 ≈ 0.6180339887,
    // which is derived from the golden ratio and has been shown to produce
    // excellent distribution properties.

    return 0; // Replace with your implementation
}

int main() {
    std::vector<int> keys = {123, 456, 789, 101, 202, 303, 404, 505, 606, 707};
    double A = 0.6180339887; // (sqrt(5) - 1) / 2, a popular choice
    int tableSize = 10;

    std::cout << "Hash values using multiplication method (table size = " << tableSize << "):" << std::endl;
    for (int key : keys) {
        std::cout << "Key: " << key << ", Hash: " << universalHash(key, A, tableSize) << std::endl;
    }

    // Try different values of A
    std::vector<double> AValues = {0.1, 0.3, 0.5, 0.7, 0.9};

    std::cout << "\nHash distribution with different values of A:" << std::endl;
    for (double a : AValues) {
        std::cout << "A = " << a << ": ";
        for (int key : keys) {
            std::cout << universalHash(key, a, tableSize) << " ";
        }
        std::cout << std::endl;
    }

    return 0;
}

Theoretical Background
The multiplication method computes h(k) = floor(m * frac(k * A)), where A is a constant with
0 < A < 1 and frac(x) is the fractional part of x. It works for any table size m, but m is usually
chosen as a power of 2 because the computation then reduces to one multiplication and one bit
shift. Knuth suggests using A = 0.5*(sqrt(5) - 1), approximately 0.6180339887, which is the
golden ratio minus 1. This value of A helps ensure a good distribution of hash values.
Multiplicative hashing takes the hash index from the fractional part of the product of the key
and A. For computational efficiency, this is typically done using fixed-point arithmetic (A is
stored as a large integer scaled by the machine word size) rather than floating-point operations.
The choice of A as related to the golden ratio is interesting because the fractional parts of its
successive multiples (A, 2A, 3A, ...) split the unit interval into gaps that stay as even as possible,
so consecutive keys produce a well-distributed sequence of hash values.
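
The fixed-point variant mentioned in the Theoretical Background can be sketched as follows. This is a minimal illustration, not part of the exercise skeleton: the names multiplicativeHash and kGolden are hypothetical, keys are assumed to be 32-bit unsigned integers, and the table size is assumed to be 2^p.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: kGolden is floor(A * 2^32) for A = (sqrt(5) - 1) / 2.
constexpr std::uint32_t kGolden = 2654435769u;

// Fixed-point multiplicative hash for a table of size 2^p (1 <= p <= 32).
std::uint32_t multiplicativeHash(std::uint32_t key, unsigned p) {
    assert(p >= 1 && p <= 32);
    // The 32-bit product wraps modulo 2^32, which keeps exactly the
    // fractional part of key * A in fixed-point form.
    std::uint32_t product = key * kGolden;
    // The top p bits of that fraction equal floor(2^p * frac(key * A)).
    return product >> (32 - p);
}
```

With p = 4 (16 slots), key 1 maps to slot 9, which matches floor(16 * 0.618...) = 9. No floating-point operation is involved, which is why this form is usually preferred in practice.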

8 Exercise 7: Separate Chaining Implementation


In this exercise, you will implement a hash table that uses separate chaining with linked lists to
handle collisions. Separate chaining is a collision resolution technique where each bucket in the
hash table contains a linked list of key-value pairs that hash to the same bucket.
Separate chaining provides reliable performance with:

• Average case time complexity: O(1 + α) for insert, search, and delete operations, where α is
the load factor; this is O(1) as long as α is bounded by a constant

• Worst case (when all keys hash to the same bucket): O(n), where n is the number of elements
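
The idea of one chain per bucket can be sketched as follows. This is a simplified illustration under stated assumptions: it uses std::vector as the chain instead of the linked list required by the exercise, and the names ChainedMap, indexOf, put, and find are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Illustrative only: each bucket is a std::vector acting as the chain.
struct ChainedMap {
    std::vector<std::vector<std::pair<std::string, int>>> buckets;

    explicit ChainedMap(std::size_t bucketCount) : buckets(bucketCount) {
        assert(bucketCount > 0);
    }

    // Character-sum hash, the same idea as the exercise skeleton.
    std::size_t indexOf(const std::string& key) const {
        std::size_t sum = 0;
        for (unsigned char c : key) sum += c;
        return sum % buckets.size();
    }

    // Update the value if the key is already in the chain, else append.
    void put(const std::string& key, int value) {
        auto& chain = buckets[indexOf(key)];
        for (auto& entry : chain) {
            if (entry.first == key) { entry.second = value; return; }
        }
        chain.emplace_back(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        const auto& chain = buckets[indexOf(key)];
        for (const auto& entry : chain) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }
};
```

With 7 buckets, "apple" and "mango" both have a character sum of 530 and therefore share bucket 5; the chain keeps both entries, so neither lookup is affected by the collision.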

More specifically, the requirements are:


1. Complete the implementation of a hash table class with separate chaining

2. Implement the destructor to properly free memory

3. Implement the insert method to add new key-value pairs or update existing ones

4. Implement the search method to find values by key

5. Implement the remove method to delete nodes from the chains

6. The hash table should store key-value pairs where keys are strings and values are integers

Example Input/Output

1. Input:

• Insert 12 key-value pairs


• Search for "orange"
• Remove "banana" and "fig"

2. Expected output:
Hash Table Contents:
Bucket 0: (banana, 8) (lemon, 7)
Bucket 1: (date, 12)
Bucket 2: (cherry, 3) (grape, 10) (fig, 6) (kiwi, 11) (peach, 15)
Bucket 3: Empty
Bucket 4: (pear, 4)
Bucket 5: (apple, 5) (mango, 2)
Bucket 6: (orange, 9)

Lookup Operations:
Found orange with value 9

After removing 'banana' and 'fig':
Bucket 0: (lemon, 7)
Bucket 1: (date, 12)
Bucket 2: (cherry, 3) (grape, 10) (kiwi, 11) (peach, 15)
Bucket 3: Empty
Bucket 4: (pear, 4)
Bucket 5: (apple, 5) (mango, 2)
Bucket 6: (orange, 9)


// File: Exercise_7.cpp
#include <iostream>
#include <string>
#include <utility>
#include <vector>

class HashTable {
private:
    struct Node {
        std::string key;
        int value;
        Node* next;

        Node(const std::string& k, int v) : key(k), value(v), next(nullptr) {}
    };

    std::vector<Node*> table;
    int size;

    // Sum of character codes modulo the number of buckets
    int hashFunction(const std::string& key) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % size;
    }

public:
    HashTable(int tableSize) : size(tableSize) {
        table.resize(size, nullptr);
    }

    ~HashTable() {
        // TODO: Free all allocated memory to prevent memory leaks
    }

    // TODO: Implement insert with separate chaining
    void insert(const std::string& key, int value) {
        // 1. Compute the hash value to determine the bucket
        // 2. Create a new node
        // 3. If the bucket is empty, make the new node the head
        // 4. If not, append the node to the linked list or update if the key
        //    exists
    }

    // TODO: Implement search
    bool search(const std::string& key, int& value) {
        // 1. Compute the hash value
        // 2. Traverse the linked list at that bucket to find the key
        // 3. Return true and set the value if found, false otherwise

        return false; // Replace with your implementation
    }

    // TODO: Implement remove
    bool remove(const std::string& key) {
        // 1. Compute the hash value
        // 2. Traverse the linked list to find and remove the node with the
        //    given key
        // 3. Return true if removed, false if not found

        return false; // Replace with your implementation
    }

    // Print every bucket and the chain stored in it
    void print() {
        for (int i = 0; i < size; i++) {
            std::cout << "Bucket " << i << ": ";
            Node* current = table[i];
            if (!current) {
                std::cout << "Empty";
            }
            while (current) {
                std::cout << "(" << current->key << ", " << current->value << ") ";
                current = current->next;
            }
            std::cout << std::endl;
        }
    }
};

int main() {
    HashTable ht(7);

    std::vector<std::pair<std::string, int>> data = {
        {"apple", 5}, {"banana", 8}, {"cherry", 3}, {"date", 12},
        {"grape", 10}, {"lemon", 7}, {"orange", 9}, {"pear", 4},
        {"fig", 6}, {"kiwi", 11}, {"mango", 2}, {"peach", 15}
    };

    for (const auto& item : data) {
        ht.insert(item.first, item.second);
    }

    std::cout << "Hash Table Contents:" << std::endl;
    ht.print();

    std::cout << "\nLookup Operations:" << std::endl;
    int value;
    if (ht.search("orange", value)) {
        std::cout << "Found orange with value " << value << std::endl;
    } else {
        std::cout << "Could not find orange" << std::endl;
    }

    std::cout << "\nAfter removing 'banana' and 'fig':" << std::endl;
    ht.remove("banana");
    ht.remove("fig");
    ht.print();

    return 0;
}

9 Exercise 8: Load Factor Analysis


In this exercise, you will implement a function to analyze how the load factor of a hash table
affects its performance. The load factor is a critical metric in hash tables, defined as the ratio
of the number of stored elements to the total number of buckets. As the load factor increases,
collisions become more frequent, which can degrade performance.
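
As a reference point for the measurements, the relationship between load factor and probe counts can be sketched with the well-known approximations for linear probing attributed to Knuth. This is a hedged illustration: the function names are hypothetical, and the formulas assume a hash function that spreads keys uniformly, so measured values from a simple character-sum hash may be noticeably worse.

```cpp
#include <cassert>

// Load factor: number of stored elements divided by number of buckets.
double loadFactor(int count, int buckets) {
    assert(buckets > 0);
    return static_cast<double>(count) / buckets;
}

// Approximate expected probes for a successful search with linear probing:
// 0.5 * (1 + 1 / (1 - alpha)), valid for 0 <= alpha < 1.
double expectedProbesHit(double alpha) {
    assert(alpha >= 0.0 && alpha < 1.0);
    return 0.5 * (1.0 + 1.0 / (1.0 - alpha));
}

// Approximate expected probes for an unsuccessful search (or an insert):
// 0.5 * (1 + 1 / (1 - alpha)^2), valid for 0 <= alpha < 1.
double expectedProbesMiss(double alpha) {
    assert(alpha >= 0.0 && alpha < 1.0);
    double d = 1.0 - alpha;
    return 0.5 * (1.0 + 1.0 / (d * d));
}
```

At a load factor of 0.5 these give 1.5 probes per hit and 2.5 per miss; at 0.9 a miss costs about 50.5 probes, which shows how sharply performance degrades as the table fills.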
More specifically, the requirements are:

1. Implement the analyzeLoadFactor() function to measure performance at different load factors

2. Create a hash table with a specified initial size


3. Insert random keys until reaching different load factor thresholds (e.g., 0.1, 0.2, ..., 0.9)

4. For each load factor threshold, perform a fixed number of lookup operations

5. Measure and report the average number of probes needed for operations

6. Visualize or display the results in a meaningful way

Example Input/Output

1. Expected output (illustrative; exact values vary from run to run because the keys are random):
Performance Analysis of Hash Table with Different Load Factors
--------------------------------------------------------------
Table size: 1000
Operations per load factor: 1000

Load Factor | Avg Probes (Insert) | Avg Probes (Search) | Avg Probes (Failed Search)
---------------------------------------------------------------------------
0.10        | 1.05                | 1.03                | 1.10
0.20        | 1.11                | 1.08                | 1.22
0.30        | 1.17                | 1.15                | 1.43
0.40        | 1.25                | 1.22                | 1.66
0.50        | 1.38                | 1.33                | 2.00
0.60        | 1.52                | 1.48                | 2.50
0.70        | 1.83                | 1.76                | 3.33
0.80        | 2.27                | 2.18                | 5.00
0.90        | 3.60                | 3.48                | 10.00

// File: Exercise_8.cpp
#include <iostream>
#include <string>
#include <vector>
#include <chrono>
#include <random>

class HashTable {
private:
    struct Entry {
        std::string key;
        int value;
        bool isOccupied;

        Entry() : key(""), value(0), isOccupied(false) {}
    };

    std::vector<Entry> table;
    int size;
    int count;
    int probeCount; // Track total probes for performance analysis

    int hashFunction(const std::string& key) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % size;
    }

public:
    HashTable(int tableSize) : size(tableSize), count(0), probeCount(0) {
        table.resize(size);
    }

    bool insert(const std::string& key, int value) {
        if (static_cast<double>(count) / size >= 0.9) {
            return false; // Table is too full
        }

        int index = hashFunction(key);
        int i = 0;

        while (i < size) {
            int probeIndex = (index + i) % size; // Linear probing
            probeCount++;

            if (!table[probeIndex].isOccupied) {
                table[probeIndex].key = key;
                table[probeIndex].value = value;
                table[probeIndex].isOccupied = true;
                count++;
                return true;
            } else if (table[probeIndex].key == key) {
                table[probeIndex].value = value; // Update existing value
                return true;
            }

            i++;
        }

        return false; // Table is full after probing all positions
    }

    bool search(const std::string& key) {
        int index = hashFunction(key);
        int i = 0;

        while (i < size) {
            int probeIndex = (index + i) % size;
            probeCount++;

            if (!table[probeIndex].isOccupied) {
                return false; // Key doesn't exist
            } else if (table[probeIndex].key == key) {
                return true; // Key found
            }

            i++;
        }

        return false; // Key doesn't exist after probing all positions
    }

    int getProbeCount() const {
        return probeCount;
    }

    void resetProbeCount() {
        probeCount = 0;
    }

    double getLoadFactor() const {
        return static_cast<double>(count) / size;
    }
};

// Generate a random string of a given length
std::string generateRandomString(int length) {
    static const char alphanum[] =
        "0123456789"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz";

    // Seed once and reuse the generator across calls
    static std::mt19937 gen(std::random_device{}());
    static std::uniform_int_distribution<> dis(0, sizeof(alphanum) - 2);

    std::string result;
    result.reserve(length);

    for (int i = 0; i < length; ++i) {
        result += alphanum[dis(gen)];
    }

    return result;
}

// TODO: Implement a function to analyze how load factor affects performance
void analyzeLoadFactor() {
    // 1. Create a hash table with a large size (e.g., 1000)
    // 2. Insert random keys until reaching different load factors (e.g., 0.1,
    //    0.2, ..., 0.9)
    // 3. For each load factor, perform a fixed number of random lookups
    // 4. Measure the average number of probes needed for successful and
    //    unsuccessful lookups
    // 5. Print out the results and draw conclusions
}

int main() {
    analyzeLoadFactor();
    return 0;
}


10 Exercise 9: Implement Cuckoo Hashing


In this exercise, you will implement a hash table using cuckoo hashing with two hash functions.
Cuckoo hashing is a collision resolution technique named after the cuckoo bird’s behavior of pushing
other birds’ eggs out of their nests. Similarly, when inserting a new key that collides with an existing
key, the existing key is displaced to its alternative location.
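
The displacement step can be sketched on plain integer keys. This is a simplified illustration under stated assumptions: the struct CuckooSketch, its hash functions h1 and h2, and the default maxLoop of 16 are hypothetical choices; keys are non-negative integers with -1 marking an empty slot; and no rehash is performed on failure.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Illustrative cuckoo set for non-negative int keys; -1 marks an empty slot.
struct CuckooSketch {
    std::vector<int> t1, t2;
    int n;

    explicit CuckooSketch(int slots) : t1(slots, -1), t2(slots, -1), n(slots) {
        assert(slots > 0);
    }

    int h1(int key) const { return key % n; }
    int h2(int key) const { return (key / n) % n; }

    // A key can only live in one of its two candidate slots.
    bool contains(int key) const {
        return t1[h1(key)] == key || t2[h2(key)] == key;
    }

    // Returns false when the displacement chain exceeds maxLoop. In that
    // case the key left in cur stays unplaced; a full implementation would
    // rehash and reinsert it.
    bool insert(int key, int maxLoop = 16) {
        assert(key >= 0);
        if (contains(key)) return true;
        int cur = key;
        for (int i = 0; i < maxLoop; ++i) {
            std::swap(cur, t1[h1(cur)]);  // place in table 1, pick up the old occupant
            if (cur == -1) return true;
            std::swap(cur, t2[h2(cur)]);  // evicted key moves to its slot in table 2
            if (cur == -1) return true;
        }
        return false;
    }
};
```

With 5 slots per table, inserting 3, 8, and 13 (all of which map to slot 3 in table 1) leaves 13 in table 1 and pushes 3 and 8 into table 2. Inserting 0, 25, and 50 fails on the third key, because all three compete for the same two slots; this is the cycle that the rehash step is meant to resolve.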
More specifically, the requirements are:

1. Complete the implementation of a cuckoo hash table class with two tables

2. Implement the insert() method that handles collisions using the cuckoo algorithm

3. Implement the search() method that looks for keys in both tables

4. Implement the rehash() method to resize the table when cycles are detected

5. The hash table should store key-value pairs where keys are strings and values are integers

Example Input/Output

1. Input: Insert 8 key-value pairs including "apple", "banana", "cherry", etc.

2. Expected output (illustrative; the exact slot positions depend on the hash functions and on the
order of displacements):
Inserted apple
Inserted banana
Inserted cherry
Inserted date
Inserted grape
Inserted lemon
Inserted orange
Inserted pear

Cuckoo Hash Table Contents:
Table 1:
0: Empty
1: Empty
2: banana -> 8
3: Empty
4: Empty
5: date -> 12
6: Empty
7: orange -> 9
8: pear -> 4
9: Empty

Table 2:
0: grape -> 10
1: Empty
2: apple -> 5
3: Empty
4: cherry -> 3
5: Empty
6: lemon -> 7
7: Empty
8: Empty
9: Empty

Current load factor: 0.4

Found orange with value 9

// File: Exercise_9.cpp
#include <iostream>
#include <string>
#include <utility>
#include <vector>
#include <chrono> // For seeding the random number generator

class CuckooHashTable {
private:
    struct Entry {
        std::string key;
        int value;
        bool isOccupied;

        Entry() : key(""), value(0), isOccupied(false) {}
    };

    std::vector<Entry> table1;
    std::vector<Entry> table2;
    int size;
    int count;
    int maxLoop; // Maximum number of displacement iterations

    // First hash function (unsigned arithmetic wraps around instead of
    // overflowing, which would be undefined behavior for a signed int)
    int hash1(const std::string& key) {
        unsigned int sum = 0;
        for (char c : key) {
            sum = sum * 31 + static_cast<unsigned char>(c);
        }
        return static_cast<int>(sum % size);
    }

    // Second hash function (same scheme with a different multiplier)
    int hash2(const std::string& key) {
        unsigned int sum = 0;
        for (char c : key) {
            sum = sum * 37 + static_cast<unsigned char>(c);
        }
        return static_cast<int>(sum % size);
    }

public:
    CuckooHashTable(int tableSize) : size(tableSize), count(0), maxLoop(tableSize) {
        table1.resize(size);
        table2.resize(size);
    }

    // TODO: Implement insert with cuckoo hashing
    bool insert(const std::string& key, int value) {
        // 1. Check if the key is already in either table
        // 2. If not, try to insert in table1
        // 3. If table1 position is occupied, evict the current entry and move
        //    it to table2
        // 4. Continue this process until either:
        //    - An empty slot is found
        //    - We've exceeded maxLoop iterations (indicating a cycle)

        return false; // Replace with your implementation
    }

    // TODO: Implement search
    bool search(const std::string& key, int& value) {
        // 1. Check table1 using hash1
        // 2. If not found, check table2 using hash2
        // 3. Return true and set the value if found, false otherwise

        return false; // Replace with your implementation
    }

    // TODO: Implement rehash for when we detect a cycle
    bool rehash() {
        // 1. Create new tables with increased size
        // 2. Reinsert all elements from the original tables
        // 3. Return true if successful, false otherwise

        return false; // Replace with your implementation
    }

    double getLoadFactor() const {
        return static_cast<double>(count) / (2 * size); // Two tables
    }

    void print() {
        std::cout << "Table 1:" << std::endl;
        for (int i = 0; i < size; i++) {
            if (table1[i].isOccupied) {
                std::cout << i << ": " << table1[i].key << " -> "
                          << table1[i].value << std::endl;
            } else {
                std::cout << i << ": Empty" << std::endl;
            }
        }

        std::cout << "\nTable 2:" << std::endl;
        for (int i = 0; i < size; i++) {
            if (table2[i].isOccupied) {
                std::cout << i << ": " << table2[i].key << " -> "
                          << table2[i].value << std::endl;
            } else {
                std::cout << i << ": Empty" << std::endl;
            }
        }
    }
};

int main() {
    CuckooHashTable ht(10);

    std::vector<std::pair<std::string, int>> data = {
        {"apple", 5}, {"banana", 8}, {"cherry", 3}, {"date", 12},
        {"grape", 10}, {"lemon", 7}, {"orange", 9}, {"pear", 4}
    };

    for (const auto& item : data) {
        if (ht.insert(item.first, item.second)) {
            std::cout << "Inserted " << item.first << std::endl;
        } else {
            std::cout << "Failed to insert " << item.first << ", rehashing..."
                      << std::endl;
            if (ht.rehash()) {
                ht.insert(item.first, item.second);
                std::cout << "Inserted " << item.first << " after rehash"
                          << std::endl;
            } else {
                std::cout << "Failed to rehash and insert " << item.first
                          << std::endl;
            }
        }
    }

    std::cout << "\nCuckoo Hash Table Contents:" << std::endl;
    ht.print();

    std::cout << "\nCurrent load factor: " << ht.getLoadFactor() << std::endl;

    int value;
    if (ht.search("orange", value)) {
        std::cout << "Found orange with value " << value << std::endl;
    } else {
        std::cout << "Could not find orange" << std::endl;
    }

    return 0;
}


11 Exercise 10: Perfect Hashing with Secondary Tables


In this exercise, you will implement a two-level hash table structure for perfect hashing of a static
set of keys. Perfect hashing guarantees O(1) worst-case lookup time with no collisions, making it
ideal for static datasets that are queried frequently but rarely updated.
More specifically, the requirements are:

1. Complete the implementation of a perfect hash table using a two-level structure

2. Implement the build() function to construct the hash table from a fixed set of key-value
pairs

3. Implement the search() function to find values by key in constant time

4. Ensure the hash table uses O(n) space overall, where n is the number of elements

// File: Exercise_10.cpp
#include <iostream>
#include <string>
#include <utility>
#include <vector>
#include <cmath>

class PerfectHashTable {
private:
    struct SecondaryTable {
        std::vector<std::pair<std::string, int>> entries;
        int size;
        double a; // Universal hash function parameter

        SecondaryTable(int tableSize, double hashParam)
            : size(tableSize), a(hashParam) {
            entries.resize(size, {"", -1}); // A value of -1 marks an empty slot
        }

        int hash(const std::string& key) {
            // Universal hash function for secondary table.
            // Unsigned arithmetic wraps around instead of overflowing, so the
            // fractional part below is never negative.
            unsigned int sum = 0;
            for (char c : key) {
                sum = sum * 31 + static_cast<unsigned char>(c);
            }
            return static_cast<int>(size * std::fmod(a * sum, 1.0));
        }
    };

    std::vector<SecondaryTable*> primaryTable;
    int size;

    int primaryHash(const std::string& key) {
        int sum = 0;
        for (char c : key) {
            sum += static_cast<int>(c);
        }
        return sum % size;
    }

public:
    PerfectHashTable(int tableSize) : size(tableSize) {
        // Initialize with nullptrs to create secondary tables only when needed
        primaryTable.resize(size, nullptr);
    }

    ~PerfectHashTable() {
        // Free all secondary tables
        for (SecondaryTable* table : primaryTable) {
            delete table;
        }
    }

    // TODO: Implement the build function to construct the perfect hash table
    void build(const std::vector<std::pair<std::string, int>>& data) {
        // 1. Distribute items into buckets using the primary hash function
        // 2. For each non-empty bucket, create a secondary table with
        //    size = (number of items)^2
        // 3. Choose a proper hash function for each secondary table to avoid
        //    collisions
        // 4. Insert items into secondary tables
    }

    // TODO: Implement the search function
    bool search(const std::string& key, int& value) {
        // 1. Use primary hash to find the correct secondary table
        // 2. If the secondary table exists, use its hash function to find the
        //    item
        // 3. Return true and set the value if found, false otherwise

        return false; // Replace with your implementation
    }

    void print() {
        for (int i = 0; i < size; i++) {
            std::cout << "Primary bucket " << i << ": ";
            if (!primaryTable[i]) {
                std::cout << "Empty" << std::endl;
                continue;
            }

            std::cout << "Secondary table size = " << primaryTable[i]->size
                      << std::endl;
            for (int j = 0; j < primaryTable[i]->size; j++) {
                if (primaryTable[i]->entries[j].second != -1) {
                    std::cout << "  " << j << ": "
                              << primaryTable[i]->entries[j].first
                              << " -> " << primaryTable[i]->entries[j].second
                              << std::endl;
                }
            }
        }
    }
};

int main() {
    std::vector<std::pair<std::string, int>> data = {
        {"apple", 5}, {"banana", 8}, {"cherry", 3}, {"date", 12},
        {"grape", 10}, {"lemon", 7}, {"orange", 9}, {"pear", 4},
        {"fig", 6}, {"kiwi", 11}
    };

    PerfectHashTable pht(7);
    pht.build(data);

    std::cout << "Perfect Hash Table Contents:" << std::endl;
    pht.print();

    int value;
    if (pht.search("date", value)) {
        std::cout << "\nFound date with value " << value << std::endl;
    } else {
        std::cout << "\nCould not find date" << std::endl;
    }

    if (pht.search("watermelon", value)) {
        std::cout << "Found watermelon with value " << value << std::endl;
    } else {
        std::cout << "Could not find watermelon" << std::endl;
    }

    return 0;
}

Performance analysis:
1) Time Complexity

• Build: O(n) expected time, where n is the number of elements. While we need to try multiple
hash functions until finding a collision-free one, the expected number of trials is constant.

• Search: O(1) worst-case time, since we make exactly two hash function evaluations and array
accesses.

2) Space Complexity

• O(n) expected space overall. Although each secondary table is sized quadratically to the
number of elements it contains, the total space across all secondary tables is O(n) in
expectation when using a good primary hash function.
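
The "retry until collision-free" step behind the build cost can be sketched on integer keys. This is a minimal illustration under stated assumptions, not the string-based scheme of the exercise skeleton: the names kPrime, slotOf, SecondaryParams, and buildSecondary are hypothetical, the hash family is h(k) = ((a*k + b) mod p) mod m, and keys must be distinct and smaller than kPrime (duplicate keys would make the loop run forever).

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Assumption: keys are distinct and smaller than this prime (2^31 - 1).
const std::uint64_t kPrime = 2147483647ULL;

// h(k) = ((a * k + b) mod kPrime) mod m
std::size_t slotOf(std::uint64_t a, std::uint64_t b, std::uint32_t key, std::size_t m) {
    return static_cast<std::size_t>(((a * key + b) % kPrime) % m);
}

struct SecondaryParams {
    std::uint64_t a, b;  // parameters of the collision-free hash
    std::size_t m;       // secondary table size, (number of keys)^2
    int trials;          // how many (a, b) pairs were tried
};

// Draws random (a, b) pairs until no two keys of the bucket share a slot.
SecondaryParams buildSecondary(const std::vector<std::uint32_t>& keys, std::uint32_t seed) {
    assert(!keys.empty());
    std::mt19937 gen(seed);
    const std::size_t m = keys.size() * keys.size();
    for (int trials = 1; ; ++trials) {
        std::uint64_t a = gen() % (kPrime - 1) + 1;  // 1 .. kPrime - 1
        std::uint64_t b = gen() % kPrime;            // 0 .. kPrime - 1
        std::vector<bool> used(m, false);
        bool collisionFree = true;
        for (std::uint32_t k : keys) {
            std::size_t s = slotOf(a, b, k, m);
            if (used[s]) { collisionFree = false; break; }
            used[s] = true;
        }
        if (collisionFree) return {a, b, m, trials};
    }
}
```

Because the table has k^2 slots for k keys, a randomly drawn pair (a, b) is collision-free with probability greater than 1/2, so the expected number of trials is below 2. That is the reason the build time stays linear in expectation.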

Advantages                                           | Disadvantages
Guaranteed O(1) worst-case lookup time               | Not suitable for dynamic datasets (requires rebuilding when data changes)
No need to handle collisions during lookup           | Higher space overhead compared to some other hashing schemes
Good for static datasets that are queried frequently | Complex implementation compared to simpler hashing methods


12 Exercise 11: Password checker


Write a program that evaluates whether a password is “good” based on specified criteria. The
program should read a candidate password from the command line and a dictionary of common
words from standard input, then determine if the password meets all security requirements.
A password is considered “good” if and only if it meets ALL of the following criteria:

1. It is at least 8 characters long

2. It is not a word found in the dictionary

3. It is not a dictionary word followed by a digit 0-9 (e.g., “password1”)

4. It is not two dictionary words separated by a digit (e.g., “hello2world”)
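
The four criteria can be sketched as a single predicate over a hand-written string set. This is a hedged illustration, not a reference solution: the names WordSet and isGoodPassword are hypothetical, matching is assumed to be case-sensitive, the empty string is assumed not to be a dictionary word, and reading the command line and standard input is left out.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Minimal chained hash set of strings, written by hand because the lab
// regulations prohibit <unordered_set>.
class WordSet {
    std::vector<std::vector<std::string>> buckets;

    std::size_t indexOf(const std::string& w) const {
        std::size_t h = 0;
        for (unsigned char c : w) h = h * 31 + c;  // unsigned wraparound is well defined
        return h % buckets.size();
    }

public:
    explicit WordSet(std::size_t n) : buckets(n) { assert(n > 0); }

    void add(const std::string& w) {
        if (!contains(w)) buckets[indexOf(w)].push_back(w);
    }

    bool contains(const std::string& w) const {
        for (const std::string& s : buckets[indexOf(w)]) {
            if (s == w) return true;
        }
        return false;
    }
};

bool isGoodPassword(const std::string& pw, const WordSet& dict) {
    if (pw.size() < 8) return false;       // criterion 1: minimum length
    if (dict.contains(pw)) return false;   // criterion 2: plain dictionary word
    // Criterion 3: dictionary word followed by one digit
    if (std::isdigit(static_cast<unsigned char>(pw.back())) &&
        dict.contains(pw.substr(0, pw.size() - 1))) {
        return false;
    }
    // Criterion 4: two dictionary words separated by one digit
    for (std::size_t i = 1; i + 1 < pw.size(); ++i) {
        if (std::isdigit(static_cast<unsigned char>(pw[i])) &&
            dict.contains(pw.substr(0, i)) &&
            dict.contains(pw.substr(i + 1))) {
            return false;
        }
    }
    return true;
}
```

Each dictionary lookup is O(1) on average, so checking a password of length L costs O(L) lookups plus the substring copies, regardless of how large the dictionary is.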


Regulations
Please follow these regulations:

• You are allowed to use any IDE.

• After completing the assignment, check your submission before and after uploading to Moodle.

• Prohibited libraries: <set>, <unordered_set>, <map>, <unordered_map>, <algorithm>,
<list>, <stack>, <queue>, and <bits/stdc++.h>.

• You can use <vector> or any libraries that are not in the prohibited libraries listed above.

Your source code must be submitted as a compressed file named according to the format
StudentID.zip. The directory is organized as follows:
StudentID
Exercise 1.cpp
Exercise 2.cpp
Exercise 3.cpp
Exercise 4.cpp
Exercise 5.cpp
Exercise 6.cpp
Exercise 7.cpp
Exercise 8.cpp
Exercise 9.cpp
Exercise 10.cpp

The end.
