
Data Structures: Hash Tables

Hash tables provide a way to store key-value pairs in an array-like data structure. A hash function is used to map keys to array indices, addressing the table via hashes rather than keys directly. Collisions, where two keys hash to the same index, are resolved through chaining or open addressing techniques. Chaining stores collided keys in linked lists, while open addressing resolves collisions by probing to alternate indices until an empty slot is found. Insertion, search, and deletion operations can then be performed in constant average time using hashing and collision resolution.


Data Structures

Hash Tables

FUIEMS Malik Imran Daud


What are Tables?
• A table is an abstract storage device that contains table entries.

• Each table entry contains a unique key k.

• Each table entry may also contain some information I associated with its key.

• A table entry is an ordered pair (k, I).




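Direct Addressing
[Figure: a direct-address table T with one slot for each key in the universe 0 - 99. A slot whose key is in use (for example 1 or 23) holds that key; slots whose keys are not in use (for example 0, 2 and 46) hold NULL.]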



Tables: rows & columns of information
• A table has several fields (types of information)
– A telephone book may have fields name, address, phone number
– A user account table may have fields user id, password, home folder
• To find an entry in the table, you only need to know the contents of one of the fields (not all of them). This field is the key
– In a telephone book, the key is usually name
– In a user account table, the key is usually user id
• Ideally, a key uniquely identifies an entry
– If the key is name and no two entries in the telephone book have the same name, the key uniquely identifies the entries



The Table ADT: operations
• insert: given a key and an entry, inserts the entry into
the table
• find: given a key, finds the entry associated with the
key
• remove: given a key, finds the entry associated with
the key, and removes it
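
The three operations above can be written down as an interface. The following is a minimal sketch, not the slides' own implementation: the names `Table` and `ListTable` are hypothetical, and the unsorted-list implementation is chosen only because it is the simplest thing that satisfies the interface (every operation scans the list).

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional, Tuple


class Table(ABC):
    """Hypothetical interface for the Table ADT: unique keys mapped to entries."""

    @abstractmethod
    def insert(self, key: Any, entry: Any) -> None:
        """Store entry under key."""

    @abstractmethod
    def find(self, key: Any) -> Optional[Any]:
        """Return the entry stored under key, or None if absent."""

    @abstractmethod
    def remove(self, key: Any) -> Optional[Any]:
        """Remove and return the entry stored under key, or None if absent."""


class ListTable(Table):
    """Simplest implementation: an unsorted list of (key, entry) pairs."""

    def __init__(self) -> None:
        self._pairs: List[Tuple[Any, Any]] = []

    def insert(self, key: Any, entry: Any) -> None:
        self.remove(key)                  # keep keys unique
        self._pairs.append((key, entry))

    def find(self, key: Any) -> Optional[Any]:
        for k, entry in self._pairs:      # linear search over all pairs
            if k == key:
                return entry
        return None

    def remove(self, key: Any) -> Optional[Any]:
        for i, (k, entry) in enumerate(self._pairs):
            if k == key:
                del self._pairs[i]
                return entry
        return None


phone_book = ListTable()
phone_book.insert("Smith", "9675846")
```

A hash table offers the same three operations but avoids the linear scan.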



TableNode: a key and its entry
• For searching purposes, it is best to store the key and the entry separately (even though the key’s value may be inside the entry)

TableNode
key        entry
“Smith”    “Smith”, “124 Hawkers Lane”, “9675846”
“Yeo”      “Yeo”, “1 Apple Crescent”, “0044 1970 622455”
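
As an illustration, a TableNode could be modelled as a small record type. This is a sketch under the assumption that the entry is a tuple of strings; the field names `key` and `entry` follow the slide, everything else is illustrative.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TableNode:
    key: str                  # the field used for searching
    entry: Tuple[str, ...]    # the full record; it may repeat the key's value


nodes = [
    TableNode("Smith", ("Smith", "124 Hawkers Lane", "9675846")),
    TableNode("Yeo", ("Yeo", "1 Apple Crescent", "0044 1970 622455")),
]

# Searching compares only the stored key and never looks inside the entry.
found = next((node for node in nodes if node.key == "Yeo"), None)
```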



Advantages with Direct Addressing
• Direct addressing is the most efficient way to access the data, since any operation on a direct-address table takes only a single step.

• It works well when the universe U of keys is reasonably small.
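
A direct-address table can be sketched in a few lines. The class name `DirectAddressTable` and the choice of `None` for the NULL marker are assumptions made for this example; keys are assumed to be integers in the range 0 to `universe_size - 1`.

```python
from typing import Any, List, Optional


class DirectAddressTable:
    """One slot per possible key; the key itself is the array index."""

    def __init__(self, universe_size: int) -> None:
        # Every slot is initialised up front, so time and memory grow with |U|.
        self.slots: List[Optional[Any]] = [None] * universe_size

    def insert(self, key: int, entry: Any) -> None:
        self.slots[key] = entry       # single step: index by the key

    def find(self, key: int) -> Optional[Any]:
        return self.slots[key]        # None means "no entry for this key"

    def remove(self, key: int) -> None:
        self.slots[key] = None


table = DirectAddressTable(100)       # universe of keys 0 - 99
table.insert(23, "entry for key 23")
table.insert(45, "entry for key 45")
```

Each operation is a single array access, which is why the approach is fast but only practical when the universe is small.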



Difficulty with Direct Addressing
When the universe U is very large…

• Storing a table T of size |U| may be impractical, given the memory available on a typical computer.

• The set K of keys actually stored may be so small relative to U that most of the space allocated for T would be wasted.

• Even if memory is not an issue, the time to initialize all |U| elements to NULL may also be large.



An ideal table needed!

• The table should be of small fixed size.

• It should be possible to map any key in the universe to a slot in the table, using some mapping function.



Hash Tables
• Definition: the ideal table data structure is merely an array of some fixed size, containing the elements.

• Consists of an array and a mapping function (known as the hash function). A hash table is an array of size Tsize
– has index positions 0 .. Tsize-1

• Used for performing insertion, deletion and lookup in constant time (on average).



What is a Hash function?
• A hash function is a mapping between a set of input values (keys) and a set of integers, known as hash values.

[Figure: Keys -> Hash function -> Hash values]



Hash Table: A Simple Example
Define the hash function as h(k) = k % 10
Universe of Keys: 90 - 99

[Figure: the keys 90, 92, 95, 96 and 98 mapped into the slots of table T, for example h(90) = 90 % 10 = 0 and h(95) = 5]
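
The example can be reproduced in code. This is a small sketch assuming a table of ten slots and the keys shown in the figure; `None` stands in for an empty slot.

```python
def h(k: int, m: int = 10) -> int:
    """Hash function from the example: the remainder of k divided by m."""
    return k % m


keys = [90, 92, 95, 96, 98]
table = [None] * 10

for k in keys:
    table[h(k)] = k   # no two of these keys share a slot, so nothing is overwritten

# table == [90, None, 92, None, None, 95, 96, None, 98, None]
```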



Implementation: hashing
• An array in which TableNodes are not stored consecutively - their place of storage is calculated using the key and a hash function

• Hashed key: the result of applying a hash function to a key
• Keys and entries are scattered throughout the array

[Figure: Key -> hash function -> array index; key/entry pairs sit at scattered positions in the array]



Hash Table Collisions
• A collision occurs when two or more keys hash to the same slot
– Can happen when there are more possible keys than slots (|U| > m, where m is the number of slots).
– For a given set K of keys with |K| ≤ m, may or may not happen.
– Definitely happens if |K| > m.
– Therefore, must be prepared to handle collisions in all cases





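Hash Table Collisions
Universe of Keys: 0 - 99

[Figure: the keys 90, 92, 95, 98 and 85 mapped into table T with h(k) = k % 10. h(90) = 90 % 10 = 0; h(95) = 5 and h(85) = 5, so 95 and 85 collide in slot 5]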
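
The collision in this example can be detected by grouping the keys by slot. The sketch below assumes the hash function h(k) = k % 10 and the five keys from the figure.

```python
from collections import defaultdict


def h(k: int, m: int = 10) -> int:
    return k % m


keys = [90, 92, 95, 98, 85]

# Group the keys by the slot they hash to.
slots = defaultdict(list)
for k in keys:
    slots[h(k)].append(k)

# Any slot that received more than one key is a collision.
collisions = {slot: ks for slot, ks in slots.items() if len(ks) > 1}
# collisions == {5: [95, 85]}
```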



Resolving Collisions
• How can we solve the problem of collisions?

– Solution 1: Chaining
– Solution 2: Open addressing



Chaining
• Put all the elements that hash to the same slot in a linked list.

• Worst case: all n keys hash to the same slot, resulting in a linked list of length n, so a search takes Θ(n) time.





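Chaining
Define the hash function as h(k) = k % 10
Universe of Keys: 0 - 99

[Figure: the keys 90, 92, 95, 98 and 85 mapped into table T. h(90) = 90 % 10 = 0; h(95) = 5 and h(85) = 5, so 95 and 85 are stored in the same linked list at slot 5]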
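
A chained hash table might look like the following sketch. Two assumptions are made for brevity: Python lists stand in for the linked lists, and keys are integers hashed with h(k) = k % m. The class and method names are illustrative.

```python
from typing import Any, List, Optional, Tuple


class ChainedHashTable:
    def __init__(self, m: int = 10) -> None:
        self.m = m
        # One chain per slot; a Python list stands in for a linked list.
        self.chains: List[List[Tuple[int, Any]]] = [[] for _ in range(m)]

    def _h(self, key: int) -> int:
        return key % self.m

    def insert(self, key: int, entry: Any) -> None:
        chain = self.chains[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, entry)   # key already present: replace its entry
                return
        chain.append((key, entry))

    def find(self, key: int) -> Optional[Any]:
        for k, entry in self.chains[self._h(key)]:   # scan only one chain
            if k == key:
                return entry
        return None

    def remove(self, key: int) -> Optional[Any]:
        chain = self.chains[self._h(key)]
        for i, (k, entry) in enumerate(chain):
            if k == key:
                del chain[i]
                return entry
        return None


table = ChainedHashTable(10)
for key in (90, 92, 95, 98, 85):
    table.insert(key, f"entry {key}")
# Slot 5 now holds a chain with keys 95 and 85.
```

If every key hashed to the same slot, `find` would scan one chain holding all n keys, which is the worst case described earlier.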



Open addressing

• Another approach for collision resolution.

• All elements are stored in the hash table itself (so no pointers are involved, unlike in chaining).

• To insert: if the slot is full, try another slot, and another, until an open slot is found (probing)

• To search, follow the same sequence of probes as would be used when inserting the element



Open addressing: Collisions
• Linear Probing
– Whenever there is a collision, one strategy is to look for the
next unused slot and use it.



Open addressing: Collisions
When searching for an empty slot, one has to remember to
wrap around (like a circular array)
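
The order in which linear probing visits the slots, including the wrap-around, can be sketched as follows. The formula h(k, i) = (k % m + i) % m is an assumption for this example: it is the usual linear-probing form, with k % m as the starting slot.

```python
from typing import List


def probe_sequence(k: int, m: int) -> List[int]:
    """Slots examined for key k in a table of size m, in probing order."""
    start = k % m
    # The outer "% m" is what wraps the probe back to slot 0 after slot m - 1.
    return [(start + i) % m for i in range(m)]


# probe_sequence(98, 10) == [8, 9, 0, 1, 2, 3, 4, 5, 6, 7]
```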



Open addressing: Insertion
HASH_INSERT(T, k)
    i ← 0
    repeat
        j ← h(k, i)
        if T[j] = NIL
            then T[j] ← k
                 return j
            else i ← i + 1
    until i = m
    error “hash table overflow”
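
A runnable version of the insertion procedure might look like this. It is a sketch under two assumptions that the pseudocode leaves open: the probe function h(k, i) is linear probing, and `None` plays the role of NIL.

```python
from typing import List, Optional


def h(k: int, i: int, m: int) -> int:
    """Assumed probe function: linear probing from slot k % m."""
    return (k % m + i) % m


def hash_insert(T: List[Optional[int]], k: int) -> int:
    """Insert key k into table T and return the slot used."""
    m = len(T)
    for i in range(m):            # at most m probes
        j = h(k, i, m)
        if T[j] is None:          # empty slot found
            T[j] = k
            return j
    raise OverflowError("hash table overflow")


T = [None] * 10
positions = [hash_insert(T, k) for k in (95, 85, 96)]
# positions == [5, 6, 7]: 85 collides with 95, and 96 finds slot 6 already taken
```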



Open addressing: Search
• To retrieve an item, apply the hash function to its key and, starting at that location, follow the probe sequence until the key is found or an empty slot is reached.

• If an empty slot is reached and the key has not been found, the item is not in the table.



Open addressing: Search
HASH_SEARCH(T, k)
    i ← 0
    repeat
        j ← h(k, i)
        if T[j] = k
            then return j
        i ← i + 1
    until T[j] = NIL or i = m
    return NIL
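
The search procedure translates to Python in the same way. As before, linear probing is assumed for h(k, i) and `None` stands for NIL; the small table is filled by hand so that the example is self-contained.

```python
from typing import List, Optional


def h(k: int, i: int, m: int) -> int:
    """Assumed probe function: linear probing from slot k % m."""
    return (k % m + i) % m


def hash_search(T: List[Optional[int]], k: int) -> Optional[int]:
    """Return the slot holding key k, or None if k is not in the table."""
    m = len(T)
    for i in range(m):
        j = h(k, i, m)
        if T[j] == k:
            return j
        if T[j] is None:          # empty slot: k would have been placed here
            return None
    return None                   # every slot probed without finding k


T = [None] * 10
T[5], T[6] = 95, 85               # 85 was pushed to slot 6 by a collision with 95

slot_of_85 = hash_search(T, 85)   # probes slot 5, then finds 85 in slot 6
slot_of_75 = hash_search(T, 75)   # probes 5 and 6, then stops at empty slot 7
```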



Deletion
• The search algorithm above assumes that keys are not deleted once they are inserted
• Deleting a key from an open addressing table is difficult: simply emptying its slot would cut the probe sequence of any key stored beyond it. Instead, we can mark the slot as removed, so that each slot is full, empty or removed
• Need a special placeholder, deleted, to distinguish a slot that was never used from one that once held a value
• May need to reorganize the table after many deletions
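
The placeholder idea can be sketched as follows. The sentinel name `DELETED`, the linear-probing function and the use of `None` for a never-used slot are all assumptions of this example. Like the pseudocode, the insert does not check whether the key is already present.

```python
from typing import Any, List, Optional

DELETED = object()    # placeholder for a slot that once held a key


def h(k: int, i: int, m: int) -> int:
    """Assumed probe function: linear probing from slot k % m."""
    return (k % m + i) % m


def hash_search(T: List[Any], k: int) -> Optional[int]:
    m = len(T)
    for i in range(m):
        j = h(k, i, m)
        if T[j] is None:                      # never-used slot: stop searching
            return None
        if T[j] is not DELETED and T[j] == k:
            return j                          # DELETED slots are skipped, not stops
    return None


def hash_insert(T: List[Any], k: int) -> int:
    m = len(T)
    for i in range(m):
        j = h(k, i, m)
        if T[j] is None or T[j] is DELETED:   # removed slots can be reused
            T[j] = k
            return j
    raise OverflowError("hash table overflow")


def hash_delete(T: List[Any], k: int) -> bool:
    j = hash_search(T, k)
    if j is None:
        return False
    T[j] = DELETED                            # mark the slot instead of emptying it
    return True


T = [None] * 10
hash_insert(T, 95)    # goes to slot 5
hash_insert(T, 85)    # collides with 95, goes to slot 6
hash_delete(T, 95)    # slot 5 becomes DELETED, so 85 stays reachable
```

Had slot 5 been reset to `None` instead, a search for 85 would stop at slot 5 and wrongly report that the key is missing.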



Hash Function
• Maps keys to positions in the hash table.
• A good hash function should:
– Be easy to calculate.
– Use all of the key.
– Spread the keys uniformly.
– Not depend on patterns in the data.



Nature of keys
• Most hash functions assume that the universe of keys is the set N = {0, 1, 2, …} of natural numbers
• If the keys are not natural numbers, a way must be found to interpret them as natural numbers
• A character key can be interpreted as an integer expressed in a suitable radix notation.



Radix Notation
– Interpret a character string as an integer expressed in some radix notation.
– Example: suppose the string is CLRS:

• ASCII values: C = 67, L = 76, R = 82, S = 83.
• There are 128 basic ASCII values.
• So interpret CLRS as
  (67 · 128³) + (76 · 128²) + (82 · 128¹) + (83 · 128⁰) = 141,764,947.
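
The conversion can be checked with a short function. This is a sketch: the name `radix_key` is hypothetical, and the string is assumed to contain only basic ASCII characters so that every character code is below the radix 128.

```python
def radix_key(s: str, radix: int = 128) -> int:
    """Interpret the string s as an integer written in the given radix."""
    value = 0
    for ch in s:
        # Horner's rule: shift the digits so far one place left, then add the next.
        value = value * radix + ord(ch)
    return value


key = radix_key("CLRS")
# key == 141764947, matching the hand calculation above
```

The resulting integer can then be passed to an integer hash function such as h(k) = k % m.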

