
Group 15 Hash Tables

The document discusses hash tables and hashing. It defines a hash table as a data structure used to store information by mapping keys to values. It describes how hashing works by using a hash function to convert a key into an integer index in an array where the associated value can be stored or retrieved. The document discusses factors that affect hash table design like hashing functions, table size, and collision handling schemes like separate chaining and open addressing. It provides examples of operations like insertion and different techniques to resolve collisions.

Uploaded by

reagan oloya
Copyright
© All Rights Reserved

HASH TABLES

NAMES    REG NO.
TUMITHO STEVEN 20/U/2871/GIT
KITARA DANIEL 20/U/0835/GIK/PS
TUKASHABA DICKENS 20/2475/GIT
HASHING

• A hash table is a data structure used to store information.
• OR
• A hash table is an array of some fixed size, usually a prime number.
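INTRODUCTION TO HASH TABLES

• Key: Paul
• Value: Age
• Hash("key") = index
• Hash("Paul") = 3, so the value 9 is stored at index 3
• Hash("Tina") = 0, so the value 20 is stored at index 0
• Hash("Somebody") = 2, so the value 60 is stored at index 2

Index:  0    1    2    3
Value:  20   -    60   9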
Definition cont'd
Hash Table Data Structure: Purpose

• To support insertion, deletion and search in average-case constant time
• Assumption: the order of elements is irrelevant.
• Hash["string key"] ==> integer value
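HASH TABLE OPERATIONS
INSERT OPERATION
• int y = Hash("Dickson"); y is now 2, so Dickson is stored at index 2.
• int x = Hash("Aidah"); x is now 5, so Aidah is stored at index 5.
• int z = Hash("Ben"); z is now 1, so Ben is stored at index 1.

Index:  0    1    2        3    4    5
Key:    -    Ben  Dickson  -    -    Aidah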
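The insert operation above can be sketched in Java as follows. This is a minimal illustration, not the hash function used on the slide: the character-sum `hash` method, the class name `InsertSketch` and the table size of 7 are all assumptions, so the resulting indices will differ from the 2, 5 and 1 shown above. Collisions are deliberately not handled here.

```java
// Sketch: map string keys to array slots with a toy hash function.
class InsertSketch {
    // Hypothetical hash: sum of the character codes, reduced modulo the table size.
    static int hash(String key, int tableSize) {
        int sum = 0;
        for (char c : key.toCharArray()) {
            sum += c;
        }
        return sum % tableSize;
    }

    // Stores the key at its hashed index and returns that index.
    // A colliding key would simply overwrite the slot (no collision handling).
    static int insert(String[] table, String key) {
        int index = hash(key, table.length);
        table[index] = key;
        return index;
    }

    public static void main(String[] args) {
        String[] table = new String[7]; // assumed table size
        for (String name : new String[] {"Dickson", "Aidah", "Ben"}) {
            System.out.println(name + " -> index " + insert(table, name));
        }
    }
}
```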
FACTORS AFFECTING HASH TABLE DESIGN
• Hashing functions
• Table size: it is usually fixed at the start
• Collision handling scheme: a collision occurs when two or more keys map
to the same array index.
HASH FUNCTIONS

• Hash functions: these are mathematical operations through which data is
run in order to be inserted into or retrieved from the hash table.
• How to define a hash function:
• Use only the data being hashed
• Use all the data being hashed
• Uniformly distribute the data
HASH FUNCTIONS

Division-Based Hash Functions

The division method uses the modulus operation (%), which returns the
remainder of an integer division. In the division method, the hash
function is of the form
• h(x) = x % M
where M is the table size.
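A minimal Java sketch of the division method is shown below. The class name `DivisionHash` and the table size 11 are assumptions made for illustration. `Math.floorMod` is used instead of the bare `%` operator because Java's `%` can return a negative result for a negative key, which would not be a valid array index.

```java
// Sketch of the division method h(x) = x % M.
class DivisionHash {
    // Math.floorMod keeps the result in the range [0, m) even when x is negative.
    static int h(int x, int m) {
        return Math.floorMod(x, m);
    }

    public static void main(String[] args) {
        int m = 11; // assumed table size
        for (int x : new int[] {5, 15, 27, 100}) {
            System.out.println("h(" + x + ") = " + h(x, m));
        }
    }
}
```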
Collision
• Collision: when two keys map to the same location in the hash table.
• Question: how does a collision happen in a hash table?
• Answer: a collision occurs when two pieces of data, run through the hash
function, yield the same hash code.
• int y = Hash("Dickson"); y is now 2, so Dickson maps to index 2.
• int x = Hash("Ayda"); x is now 2, so Ayda also maps to index 2 and
collides with Dickson.
TECHNIQUES FOR RESOLVING COLLISIONS
Separate Chaining
• In separate chaining, the hash table is an array of linked lists,
with all keys that hash to the same location in the same list.
New keys are inserted in the front of the list. In other words,
each hash table entry is a pointer to a list of keys and their
associated data items.
• To insert a key into the table, the hash table index is
computed, and then the list is searched to see if the key is
already in the table. If it is not, it is inserted at the head of the
list.
Chaining cont'd

• In the worst case, an insertion requires a search of the entire list,
which happens whenever the key is not already in the table. When the key
is present, about half of the list is searched on average. It is not
worth keeping the list in sorted order if it is short.
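Separate chaining as described above can be sketched in Java as follows. The class name `ChainedTable`, the `Entry` helper and the use of `String.hashCode()` as the hash function are assumptions for illustration; the slides do not prescribe them. Each bucket is a linked list, the list is searched before inserting, and new keys go at the head.

```java
import java.util.LinkedList;

// Sketch of a hash table with separate chaining (String keys, int values).
class ChainedTable {
    // One key/value pair stored in a chain.
    static final class Entry {
        final String key;
        int value;

        Entry(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedTable(int size) {
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // Assumed hash function: the key's hashCode reduced to a valid index.
    private int index(String key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    // Searches the chain first; updates in place if the key exists,
    // otherwise inserts the new entry at the head of the list.
    void put(String key, int value) {
        LinkedList<Entry> chain = buckets[index(key)];
        for (Entry e : chain) {
            if (e.key.equals(key)) {
                e.value = value;
                return;
            }
        }
        chain.addFirst(new Entry(key, value));
    }

    // Returns the value stored for the key, or null if the key is absent.
    Integer get(String key) {
        for (Entry e : buckets[index(key)]) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }
}
```

Creating the table with a size of 1 forces every key into the same chain, which is a simple way to see that colliding keys are still stored and found.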
Open Addressing

• In open addressing, there are no separate lists attached to the
table. All values are in the table itself. When a collision occurs,
the cells of the hash table itself are searched until an empty one is
found. Which cells are searched depends upon the specific
method of open addressing. All variations can be described
generically by a sequence of functions
• h0(x) = (h(x) + f(0; x)) % M
• h1(x) = (h(x) + f(1; x)) % M
• ...
• hk(x) = (h(x) + f(k; x)) % M
Open addressing cont’d

• where hi(x) is the i^th location tested and f(i; x) is a function that
returns some integer value based on the values of i and x. The idea
is that the hash function h(x) is first used to find a location in the
hash table for x. If we are trying to insert x into the table, and the
index h(x) is empty, we insert it there. Otherwise we need to
search for another place in the table into which we can store x. The
function f(i; x), called the collision resolution function, serves that
purpose. We search the locations until either an empty cell is found
or the search returns to a cell previously visited in the sequence.
The function f(i; x) need not depend on both i and x.
Open addressing cont’d
• (h(x) + f(0; x)) % M
• (h(x) + f(1; x)) % M
• (h(x) + f(2; x)) % M
• ...
• (h(x) + f(k; x)) % M
• To search for an item, the same collision resolution function is used. The
hash function is applied to find the first index. If the key is there, the
search stops. Otherwise, the table is searched until either the item is
found or an empty cell is reached. If an empty cell is reached, it implies
that the item is not in the table. This raises a question about how to delete
items. If an item is deleted and its cell is simply emptied, then there will
be no way to tell whether a search should stop when it reaches an empty cell
or just jump over the hole. The way around this problem is lazy deletion.
In lazy deletion, the cell is marked DELETED. Only when it is needed again
is it re-used. Every cell is marked as one of the following:
• ACTIVE: it has an item in it
• EMPTY: it has no item in it
• DELETED: it had an item that was deleted; it can be re-used
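The three cell states and the search rule above can be sketched in Java as follows. This is an illustration under stated assumptions: the class name `LazyDeletionTable`, integer keys, the hash function h(x) = x % M and the simplest collision resolution function f(i; x) = i are all choices made for this sketch.

```java
import java.util.Arrays;

// Sketch of open addressing with lazy deletion (integer keys).
class LazyDeletionTable {
    enum State { EMPTY, ACTIVE, DELETED }

    private final int[] keys;
    private final State[] states;

    LazyDeletionTable(int size) {
        keys = new int[size];
        states = new State[size];
        Arrays.fill(states, State.EMPTY);
    }

    // i-th probe location: (h(x) + f(i; x)) % M with h(x) = x % M and f(i; x) = i.
    private int probe(int x, int i) {
        int m = keys.length;
        return Math.floorMod(Math.floorMod(x, m) + i, m);
    }

    // Returns the slot holding x, or -1 if x is absent.
    // The search stops at an EMPTY cell but jumps over DELETED cells.
    int indexOf(int x) {
        for (int i = 0; i < keys.length; i++) {
            int p = probe(x, i);
            if (states[p] == State.EMPTY) {
                return -1;
            }
            if (states[p] == State.ACTIVE && keys[p] == x) {
                return p;
            }
        }
        return -1;
    }

    // Inserts x into the first EMPTY or DELETED cell on its probe sequence.
    // Returns false if x is already present or the table is full.
    boolean insert(int x) {
        if (indexOf(x) >= 0) {
            return false;
        }
        for (int i = 0; i < keys.length; i++) {
            int p = probe(x, i);
            if (states[p] != State.ACTIVE) {
                keys[p] = x;
                states[p] = State.ACTIVE;
                return true;
            }
        }
        return false;
    }

    // Lazy deletion: the cell is only marked DELETED, not emptied.
    boolean remove(int x) {
        int p = indexOf(x);
        if (p < 0) {
            return false;
        }
        states[p] = State.DELETED;
        return true;
    }
}
```

With a table of size 10, inserting 5 and 15 and then removing 5 leaves a DELETED cell at index 5. A search for 15 jumps over that cell and still finds the key at index 6, and a later insertion of 25 re-uses index 5.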
Different collision control methods using open addressing
• Linear probing
• Quadratic probing
• Double hashing
Linear Probing

• In linear probing, the collision resolution function, f(i; x), is a
linear function of i that ignores the value of x, i.e., f(i) = a*i + b.
In the simplest case, a = 1 and b = 0, so that f(i; x) = i and
hi(x) = (h(x) + i) % M. In other words, consecutive locations in
the hash table are probed, treating the table like a circular list.
• Example 1. Consider a hash table of size 10 with the simple
division hash function h(x) = x % 10 and suppose we insert
the sequence of keys 5, 15, 6, 3, 27, 8. In principle, only one
collision should occur: 5 and 15 collide because they both map to
location 5.
Linear probing cont'd

• But linear probing causes many more collisions.
After inserting 5, 15 causes a collision. It is placed
in H[6]. Then 6 has a collision at H[6] and is
placed in H[7]. 3 gets placed without a collision,
but 27 collides with 6 at H[7] and is placed in H[8]. This
causes 8 to collide at H[8], and it is placed in H[9]. The figure
shows the state of the hash table after each
insertion; the final state is 3 in H[3], 5 in H[5], 15 in H[6],
6 in H[7], 27 in H[8] and 8 in H[9].
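The insertions in Example 1 can be traced with the short Java sketch below. The class name `LinearProbingExample` and the use of an `Integer[]` array with `null` for empty cells are assumptions for illustration; the hash function h(x) = x % 10 and the key sequence come from the example.

```java
// Sketch: trace Example 1 with linear probing, hi(x) = (h(x) + i) % M.
class LinearProbingExample {
    // Inserts x and returns the slot used, or -1 if the table is full.
    // Assumes non-negative keys, as in the example.
    static int insert(Integer[] table, int x) {
        int m = table.length;
        for (int i = 0; i < m; i++) {
            int slot = (x % m + i) % m;
            if (table[slot] == null) {
                table[slot] = x;
                return slot;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[10];
        // Prints: 5 -> H[5], 15 -> H[6], 6 -> H[7], 3 -> H[3], 27 -> H[8], 8 -> H[9]
        for (int x : new int[] {5, 15, 6, 3, 27, 8}) {
            System.out.println(x + " -> H[" + insert(table, x) + "]");
        }
    }
}
```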
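LINEAR PROBING HASH TABLE AFTER EACH INSERTION

• Hash(89,10) = 9
• Hash(18,10) = 8
• Hash(49,10) = 9
• Hash(58,10) = 8
• Hash(9,10) = 9

TABLE

Index | After 89 | After 18 | After 49 | After 58 | After 9
0     |          |          | 49       | 49       | 49
1     |          |          |          | 58       | 58
2     |          |          |          |          | 9
3     |          |          |          |          |
4     |          |          |          |          |
5     |          |          |          |          |
6     |          |          |          |          |
7     |          |          |          |          |
8     |          | 18       | 18       | 18       | 18
9     | 89       | 89       | 89       | 89       | 89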
Quadratic Probing

• Quadratic probing eliminates the primary clustering caused by linear
probing. In quadratic probing the collision resolution function is a
quadratic function of i and does not depend on x, namely f(i; x) = i^2.
In other words, when a collision occurs, the
successive locations to be probed are at a distance
(modulo table size) of 1, 4, 9, 16, 25, 36, 49, and so on
from the original location.
The sequence of successive locations is denoted by the
equations
• h0 = h(x)
• hi = (h0 + i^2) % M
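The probe sequence defined by these equations can be computed with a few lines of Java. This is a sketch only: the class name `QuadraticProbing`, the division hash h(x) = x % M and the sample key 58 with table size 10 are assumptions for illustration.

```java
import java.util.Arrays;

// Sketch: probe locations for quadratic probing, hi = (h0 + i^2) % M.
class QuadraticProbing {
    // Returns the first `count` probe locations for key x in a table of size m.
    static int[] probes(int x, int m, int count) {
        int h0 = Math.floorMod(x, m); // assumed hash function h(x) = x % M
        int[] locations = new int[count];
        for (int i = 0; i < count; i++) {
            locations[i] = (h0 + i * i) % m;
        }
        return locations;
    }

    public static void main(String[] args) {
        // Key 58 in a table of size 10 starts at 8, then probes 8+1, 8+4, 8+9 (mod 10).
        System.out.println(Arrays.toString(probes(58, 10, 4))); // prints [8, 9, 2, 7]
    }
}
```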
Double Hashing

• In double hashing, the sequence of probes is a linear sequence with
an increment obtained by applying a second hash function to the
key:
• f(i) = i * hash2(x)
• We search locations (hash(x) + i*hash2(x)) % M for i = 1, 2, 3, ...
• A poor choice of the second hash function can be disastrous. It should
never evaluate to a factor of the table size; it should be
relatively prime to the table size. It should never evaluate to 0 either.
Choosing
• hash2(x) = R - (x % R)
• works well if R is a prime number smaller than the table size.
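A minimal Java sketch of this probe sequence follows. The class name `DoubleHashing`, the table size M = 11, the value R = 7 and the first hash function hash(x) = x % M are assumptions chosen for illustration. Because x % R lies between 0 and R - 1 for non-negative keys, hash2(x) always lies between 1 and R and so never evaluates to 0.

```java
// Sketch: double hashing with hash2(x) = R - (x % R).
class DoubleHashing {
    // Second hash function; assumes a non-negative key x.
    static int hash2(int x, int r) {
        return r - (x % r);
    }

    // i-th probe location: (hash(x) + i * hash2(x)) % M with hash(x) = x % M.
    static int probe(int x, int i, int m, int r) {
        return (x % m + i * hash2(x, r)) % m;
    }

    public static void main(String[] args) {
        int m = 11; // assumed table size (prime)
        int r = 7;  // assumed prime smaller than the table size
        for (int i = 0; i < 3; i++) {
            System.out.println("probe " + i + " for key 49: " + probe(49, i, m, r));
        }
    }
}
```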
Collision cont'd

Advantages of open addressing over chaining
- No need for linked list structures
Disadvantages of open addressing compared with chaining
- Slower insertion: it may need several attempts to find
an empty slot
- The table needs to be bigger (than a chaining-based table)
to achieve average-case constant time performance
Rehashing

• If the hash table gets too full it should be resized. The best
way to resize it is to create a new hash table about twice as
large and hash all of the elements of the old table into the
new table using the new table's hash function.
• Rehashing is expensive, so it should only be done when
necessary:
• 1. When an insertion fails, or
• 2. When the table gets half full, or
• 3. When the table load factor reaches some predefined value.
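Rehashing can be sketched in Java as follows. Everything specific here is an assumption for illustration: the class name `RehashingTable`, the initial size of 5, linear probing as the collision scheme, and the choice to rehash when an insertion would make the table more than half full (trigger 2 above). The new table is exactly twice as large, and every key is re-inserted because its index depends on the table size.

```java
// Sketch: a linear-probing table that rehashes before it becomes more than half full.
class RehashingTable {
    private Integer[] slots = new Integer[5]; // assumed initial table size
    private int count = 0;

    int capacity() { return slots.length; }

    int size() { return count; }

    boolean contains(int x) {
        int m = slots.length;
        for (int i = 0; i < m; i++) {
            Integer s = slots[Math.floorMod(x + i, m)];
            if (s == null) {
                return false;
            }
            if (s == x) {
                return true;
            }
        }
        return false;
    }

    void insert(int x) {
        if (contains(x)) {
            return;
        }
        // Rehash first if this insertion would push the load factor above 1/2.
        if (2 * (count + 1) > slots.length) {
            rehash();
        }
        place(slots, x);
        count++;
    }

    // Puts x into the first free slot on its linear probe sequence.
    private static void place(Integer[] table, int x) {
        int m = table.length;
        for (int i = 0; i < m; i++) {
            int slot = Math.floorMod(x + i, m);
            if (table[slot] == null) {
                table[slot] = x;
                return;
            }
        }
        throw new IllegalStateException("table is full");
    }

    // Creates a table twice as large and re-inserts every key using the new size.
    private void rehash() {
        Integer[] bigger = new Integer[2 * slots.length];
        for (Integer key : slots) {
            if (key != null) {
                place(bigger, key);
            }
        }
        slots = bigger;
    }
}
```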
Load factor
• The load factor λ of a hash table T is defined as follows:
• N = number of elements in T ("current size")
• M = size of T ("table size")
• λ = N/M ("load factor")
• i.e., with separate chaining, λ is the average length of a chain
• Unsuccessful search time: O(λ)
• Insert time is the same
• Successful search time: O(λ/2)
• Ideally, we want λ ≤ 1 (a constant, not a function of N)
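As a small worked example of the definition, the Java sketch below computes λ = N/M. The class and method names are hypothetical, and the sample values (5 elements, table size 10) are chosen only for illustration.

```java
// Sketch: load factor λ = N / M.
class LoadFactor {
    // n = number of elements stored, m = table size.
    static double loadFactor(int n, int m) {
        return (double) n / m; // cast avoids integer division
    }

    public static void main(String[] args) {
        // 5 elements in a table of size 10 give a load factor of 0.5.
        System.out.println(loadFactor(5, 10)); // prints 0.5
    }
}
```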
IMPLEMENTATION

The Java class java.util.Hashtable provides the following methods.

void clear( )
Resets and empties the hash table.

Object clone( )
Returns a duplicate of the invoking object.

boolean contains(Object value)
Returns true if some value equal to the value exists within the hash table.
Returns false if the value isn't found.

boolean containsKey(Object key)
Returns true if some key equal to the key exists within the hash table. Returns
false if the key isn't found.

boolean containsValue(Object value)
Returns true if some value equal to the value exists within the hash table.
Returns false if the value isn't found.

Enumeration elements( )
Returns an enumeration of the values contained in the hash table.

Object get(Object key)
Returns the value associated with the key. If the key
is not in the hash table, null is returned.

boolean isEmpty( )
Returns true if the hash table is empty; returns false if it contains at least one
key.

Enumeration keys( )
Returns an enumeration of the keys contained in the hash table.

Object put(Object key, Object value)
Inserts a key and a value into the hash table. Returns null if the key isn't
already in the hash table; returns the previous value associated with the key if
the key is already in the hash table.

void rehash( )
Increases the size of the hash table and rehashes all of its keys.

Object remove(Object key)
Removes the key and its value. Returns the value associated with the key. If
the key is not in the hash table, null is returned.

int size( )
Returns the number of entries in the hash table.

String toString( )
Returns the string equivalent of a hash table.
IMPLEMENTATION
import java.util.*;

public class HashTable {

    public static void main(String[] args) {
        // Create a hash table that maps account names to balances
        Hashtable<String, Double> balance = new Hashtable<>();
        Enumeration<String> names;
        String str;
        double bal;

        balance.put("Dickson", 1.5);
        balance.put("Aidah", 2.5);
        balance.put("Mary", 3.5);
        balance.put("Ben", 4.5);
        balance.put("Hatimah", 5.5);

        // Show all balances in the hash table.
        names = balance.keys();
        while (names.hasMoreElements()) {
            str = names.nextElement();
            System.out.println(str + ": " + balance.get(str));
        }

        // Deposit 1,000 into Dickson's account
        bal = balance.get("Dickson");
        balance.put("Dickson", bal + 1000);
        System.out.println(
            "Dickson's new balance: " + balance.get("Dickson"));
    }
}
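Several of the other java.util.Hashtable methods listed earlier can be exercised in the same way. The sketch below is illustrative only: the class name `HashtableMethodsDemo` and the helper `sample()` are hypothetical, while the methods called on `Hashtable` (put, get, containsKey, remove, size, clear, isEmpty) are part of the standard class.

```java
import java.util.Hashtable;

// Sketch: exercising a few java.util.Hashtable methods.
class HashtableMethodsDemo {
    // Hypothetical helper that builds a small table of balances.
    static Hashtable<String, Double> sample() {
        Hashtable<String, Double> balance = new Hashtable<>();
        balance.put("Dickson", 1.5);
        balance.put("Aidah", 2.5);
        return balance;
    }

    public static void main(String[] args) {
        Hashtable<String, Double> balance = sample();

        // containsKey reports whether a key is present.
        System.out.println("Has Aidah? " + balance.containsKey("Aidah"));

        // get returns null for a key that is not in the table.
        System.out.println("Mary: " + balance.get("Mary"));

        // put returns the previous value when the key already exists.
        Double previous = balance.put("Aidah", 9.0);
        System.out.println("Aidah's previous balance: " + previous);

        // remove returns the value that was associated with the key.
        System.out.println("Removed: " + balance.remove("Dickson"));
        System.out.println("Entries left: " + balance.size());

        // clear empties the table.
        balance.clear();
        System.out.println("Empty? " + balance.isEmpty());
    }
}
```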
