
Algo Cha 8

Uploaded by

mesfinleul873
Copyright
© © All Rights Reserved

CHAPTER 8 HASHING

–Hash table
–Hash function
–Collision resolution
A HASH TABLE
A hash table is a data structure which stores data in an
associative manner. In a hash table, data is stored in an array
format, where each data value has its own unique index
value. Access to data becomes very fast if we know the index
of the desired data.
A hash table is a collection of items which are stored in such a way as to
make it easy to find them later. Each position of the hash
table, often called a slot, can hold an item and is named by an
integer value starting at 0. For example, we will have a slot
named 0, a slot named 1, a slot named 2, and so on. Initially,
the hash table contains no items so every slot is empty.
Hash function: The mapping between an item and the slot where that
item belongs in the hash table is called the hash function.
• The hash function will take any item in the collection and return an
integer in the range of slot names, between 0 and m-1, where m is the table size.
• Some common hashing methods are:
1. Modulo division
2. Mid square method
3. Folding
Modulo division
This method simply takes an item and divides it by the table size, returning the
remainder as its hash value.
Assume that we have the set of integer items 54, 26, 93, 17, 77, and 31.
Our first hash function, sometimes referred to as the “remainder
method,” is hash value = item % 11, where 11 is the table size.
The table below gives all of the hash values for our example items. Note
that this remainder method (modulo arithmetic) will typically be
present in some form in all hash functions, since the result must be in
the range of slot names.

Item        54  26  93  17  77  31
Hash value  10   4   5   6   0   9

Once the hash values have been computed, we can insert each item into
the hash table at the designated position, as shown below.

Slot   0   1   2   3   4   5   6   7   8   9  10
Item  77   -   -   -  26  93  17   -   -  31  54
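The remainder method and the insertion step can be sketched in Python as follows. This is a minimal illustration, not a full hash table: the names remainder_hash and build_table are made up for this example, and the code assumes that no two items hash to the same slot.

```python
def remainder_hash(item, table_size):
    """Remainder method: the slot is the remainder after dividing by the table size."""
    return item % table_size


def build_table(items, table_size):
    """Place each item in its hash slot (assumes no collisions occur)."""
    slots = [None] * table_size  # every slot starts out empty
    for item in items:
        slots[remainder_hash(item, table_size)] = item
    return slots


table = build_table([54, 26, 93, 17, 77, 31], 11)
print(table)  # [77, None, None, None, 26, 93, 17, None, None, 31, 54]
```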
Mid square method
First square the item, and then extract some portion of the resulting
digits. The key is squared, the middle digits of the square are selected,
and then the remainder method is applied to those digits.
Unlike plain modulo division, the item is squared first. E.g. with table size 11:
54*54 = 2916 -> middle is 91, so 91 % 11 = 3
26*26 = 676 -> middle is 7, so 7 % 11 = 7
93*93 = 8649 -> middle is 64, so 64 % 11 = 9
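A possible Python sketch of the mid square method is shown here. The rule for choosing the "middle" is an assumption inferred from the three worked examples: two middle digits when the square has an even number of digits, one middle digit when it has an odd number. The function name mid_square_hash is illustrative only.

```python
def mid_square_hash(item, table_size):
    """Square the item, take the middle digit(s), then apply the remainder method."""
    digits = str(item * item)
    n = len(digits)
    mid = n // 2
    if n % 2 == 0:
        middle = digits[mid - 1:mid + 1]  # even length: two middle digits
    else:
        middle = digits[mid]              # odd length: single middle digit
    return int(middle) % table_size


results = {item: mid_square_hash(item, 11) for item in (54, 26, 93)}
print(results)  # {54: 3, 26: 7, 93: 9}
```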
FOLDING METHOD
Used for alphanumeric keys or for large numbers.
Fold shift hashing: the key value is divided into equal parts whose size
matches the size of the required address, and then the parts are added together.
E.g. telephone number key = 014528345654, divided into parts of size 2:
h(k) = 01+45+28+34+56+54 = 218. If the table size is 6, the index is 218 % 6 = 2.
Depending on the size of the table, we may then divide by some constant and
take the remainder as the hash table index.
Fold boundary hashing: the leftmost and rightmost parts are reversed (folded on
the boundary between them and the center part) before the parts are added.
E.g. key = 123456789, where the center is 456:
Left 123 folds to 321 and right 789 folds to 987; then add them:
h(k) = 321+456+987 = 1764
Or, folding the center part instead: 123, 456, 789 gives h(k) = 123+654+789 = 1566
E.g. key = 4365554601 => 43, 65, 55, 46, 01 => h(k) = 34+65+55+46+10 = 210
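Both folding variants can be sketched in Python. The sketch assumes the key is given as a string of digits (so that leading zeros such as "01" survive) and implements the fold boundary variant in which the leftmost and rightmost parts are reversed. The names split_parts, fold_shift and fold_boundary are illustrative only.

```python
def split_parts(key, size):
    """Split a digit string into consecutive parts of `size` digits."""
    return [key[i:i + size] for i in range(0, len(key), size)]


def fold_shift(key, size):
    """Fold shift: add the parts as they are."""
    return sum(int(part) for part in split_parts(key, size))


def fold_boundary(key, size):
    """Fold boundary: reverse the leftmost and rightmost parts, then add."""
    parts = split_parts(key, size)
    parts[0] = parts[0][::-1]
    if len(parts) > 1:
        parts[-1] = parts[-1][::-1]
    return sum(int(part) for part in parts)


print(fold_shift("014528345654", 2))      # 218
print(fold_shift("014528345654", 2) % 6)  # 2
print(fold_boundary("123456789", 3))      # 1764
print(fold_boundary("4365554601", 2))     # 210
```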
Note: The number of digits in a group should correspond to the size of
the array.
If the array size is 1,000, you would divide the nine-digit number into
three groups of three digits. If a particular SSN was 123-456-789, you
would calculate a key value of 123+456+789 = 1368.
You can use the % operator to trim such sums so the highest index is
999. In this case, 1368%1000 = 368. If the array size is 100, you would
need to break the nine-digit key into four two-digit numbers and a
one-digit number: 12+34+56+78+9 = 189, and 189%100 = 89.
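The link between array size and group size can be sketched as follows. The rule used here, that the group size equals the number of digits in the highest index (table size minus 1), is an assumption that matches the two examples above; fold_for_table is a made-up name.

```python
def fold_for_table(key_digits, table_size):
    """Fold a digit string into groups sized for the table, then trim with %."""
    group = len(str(table_size - 1))  # 1000 -> 3 digits, 100 -> 2 digits
    parts = [key_digits[i:i + group] for i in range(0, len(key_digits), group)]
    return sum(int(part) for part in parts) % table_size


print(fold_for_table("123456789", 1000))  # 368
print(fold_for_table("123456789", 100))   # 89
```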
You can probably already see that this technique is going to work only if each
item maps to a unique location in the hash table. For example, if the item 44
had been the next item in our collection,
it would have a hash value of 0 (44 % 11 == 0). Since 77 also
had a hash value of 0, we would have a problem. According to the hash
function, two or more items would need to be in the same slot.
This is referred to as a collision (it may also be called a “clash”). Clearly,
collisions create a problem for the hashing technique.
The load factor is the number of keys stored in the hash table divided by
the capacity. The size should be chosen so that the load factor is less than 1.
With the six example items inserted, 6 of the 11 slots are occupied. The load
factor is commonly denoted by λ = number of items / table size.
For this example, λ = 6/11 ≈ 0.55.
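A short Python sketch of the load factor and of detecting the collision for item 44 follows. It assumes empty slots are marked with None; load_factor is an illustrative name.

```python
def load_factor(slots):
    """Number of occupied slots divided by the table size."""
    filled = sum(1 for slot in slots if slot is not None)
    return filled / len(slots)


slots = [None] * 11
for item in [54, 26, 93, 17, 77, 31]:
    slots[item % 11] = item

lam = load_factor(slots)
print(round(lam, 2))  # 0.55

# 44 hashes to slot 0, which already holds 77, so inserting it would collide.
collides = slots[44 % 11] is not None
print(collides)  # True
```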
Collision Resolution
We now return to the problem of collisions. When two items hash to the
same slot, we must have a systematic method for placing the second item
in the hash table. This process is called collision resolution.
If the hash function is perfect, so that every item maps to a unique slot,
collisions will never occur. However, since this is often not possible,
collision resolution becomes a very important part of hashing.

Some collision resolution techniques are:


• A. Linear Probing
• B. Chaining
• C. Quadratic Probing
Linear probing.
One method for resolving collisions looks into the hash table and tries
to find another open slot to hold the item that caused the collision.
A simple way to do this is to start at the original hash value position
and then move in a sequential manner through the slots until we
encounter the first slot that is empty.
Note that we may need to go back to the first slot (circularly) to cover
the entire hash table.
This collision resolution process is referred to as open addressing in
that it tries to find the next open slot or address in the hash table. By
systematically visiting each slot one at a time, we are performing an
open addressing technique called linear probing.
H(x) = x % tablesize. After a collision is detected, the new hash value is
H’(x) = [H(x) + f(i)] % tablesize, where f(i) = i for i = 0, 1, 2, 3, 4, 5, ...
Consider an extended set of integer items under the simple remainder method
hash function (54, 26, 93, 17, 77, 31, 44, 55, 20).
When we attempt to place 44 into slot 0, a collision occurs.
h(44) = 44 % 11 = 0, which is occupied, so linear probing starts at i = 0:
i = 0: h’ = (0 + 0) % 11 = 0, which is occupied
i = 1: h’ = (0 + 1) % 11 = 1, which is free
Under linear probing, we look sequentially, slot by slot, until we find
an open position. In this case, we find slot 1.
Again, 55 should go in slot 0 but must be placed in slot 2 since it is the
next open position.
i = 0: h’ = (0 + 0) % 11 = 0, which is occupied
i = 1: h’ = (0 + 1) % 11 = 1, which is also occupied
i = 2: h’ = (0 + 2) % 11 = 2, which is free
The final value of 20 hashes to slot 9. Since slot 9 is full, we begin to
do linear probing. We visit slots 10, 0, 1, and 2, and finally find an
empty slot at position 3, which is reached at i = 5:

h’(20) = (9 + 5) % 11 = 3, and slot 3 is empty, so the final hash table will be

Slot   0   1   2   3   4   5   6   7   8   9  10
Item  77  44  55  20  26  93  17   -   -  31  54
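Insertion with linear probing can be sketched in Python as below. This is a minimal illustration assuming integer items, the remainder method as the hash function, and None as the marker for an empty slot; linear_probe_insert is a made-up name.

```python
def linear_probe_insert(slots, item):
    """Insert item using H'(x) = (H(x) + i) % tablesize for i = 0, 1, 2, ..."""
    size = len(slots)
    home = item % size  # H(x), the original hash value
    for i in range(size):
        pos = (home + i) % size  # wraps around to slot 0 when needed
        if slots[pos] is None:
            slots[pos] = item
            return pos
    raise RuntimeError("hash table is full")


slots = [None] * 11
for item in [54, 26, 93, 17, 77, 31, 44, 55, 20]:
    linear_probe_insert(slots, item)

print(slots)  # [77, 44, 55, 20, 26, 93, 17, None, None, 31, 54]
```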


Once we have built a hash table using open addressing and linear
probing, it is essential that we utilize the same methods to search for
items. Assume we want to look up the item 93. When we compute the
hash value, we get 5. Looking in slot 5 reveals 93, and we can return
True. What if we are looking for 20?

Now the hash value is 9, and slot 9 is currently holding 31. We cannot
simply return False since we know that there could have been collisions.
We are now forced to do a sequential search, starting at position 10,
looking until either we find the item 20 or we find an empty slot.
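The search procedure just described can be sketched in Python. The table contents are written out by hand here, taken from the linear probing example with items 54, 26, 93, 17, 77, 31, 44, 55, 20; the sketch assumes None marks an empty slot and that no items have been deleted. The name linear_probe_search is illustrative.

```python
def linear_probe_search(slots, item):
    """Follow the same probe sequence used for insertion until found or empty."""
    size = len(slots)
    home = item % size
    for i in range(size):
        pos = (home + i) % size
        if slots[pos] is None:
            return False  # an empty slot means the item was never inserted
        if slots[pos] == item:
            return True
    return False  # every slot was checked without success


slots = [77, 44, 55, 20, 26, 93, 17, None, None, 31, 54]
print(linear_probe_search(slots, 93))  # True, found directly in slot 5
print(linear_probe_search(slots, 20))  # True, found after probing 9, 10, 0, 1, 2, 3
print(linear_probe_search(slots, 66))  # False, probing stops at empty slot 7
```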
A disadvantage to linear probing is the tendency for clustering; items
become clustered in the table. This means that if many collisions occur
at the same hash value, a number of surrounding slots will be filled by
the linear probing resolution.

This will have an impact on other items that are being inserted, as we
saw when we tried to add the item 20 above. A cluster of values hashing
to 0 had to be skipped to finally find an open position. This cluster is
shown in the figure below.
EXAMPLE
Quadratic probing
A variation of the linear probing idea is called quadratic probing.
The formula starts out like linear probing: h(x) = x % tablesize. If
there is a collision, h’(x) = [h(x) + f(i)] % tablesize, but with f(i) = i^2,
where i = 0, 1, 2, 3, ... is the number of the probe. Each probe adds i^2 to
the original hash value h(x), not to the previous probe position.
E.g. an extended set of integer items under the simple remainder method
hash function with table size 11 (hash values in parentheses):
54 (10), 26 (4), 93 (5), 17 (6), 77 (0), 31 (9), 44 (0), 55 (0), 20 (9).
The collisions begin with 44, whose hash value is 0:
h’ = (0 + 0^2) % 11 = 0, which is occupied
h’ = (0 + 1^2) % 11 = 1, which is free, so place 44 at index 1
Second, 55 has hash value 0, so it collides:
h’ = (0 + 0^2) % 11 = 0, which is occupied
h’ = (0 + 1^2) % 11 = 1, which is occupied
h’ = (0 + 2^2) % 11 = 4, which is occupied
h’ = (0 + 3^2) % 11 = 9, which is occupied
h’ = (0 + 4^2) % 11 = 5, which is occupied
h’ = (0 + 5^2) % 11 = 3, which is free, so place 55 at index 3
Third, 20 has hash value 9, which is occupied, so it collides:
h’ = (9 + 0^2) % 11 = 9, which is occupied
h’ = (9 + 1^2) % 11 = 10, which is occupied
h’ = (9 + 2^2) % 11 = 2, which is free, so place 20 at index 2
Quadratic probing reduces the clustering problem of linear probing, because
the probes for colliding items spread out across the table instead of filling
adjacent slots.
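Quadratic probing, using the formula h’(x) = [h(x) + i^2] % tablesize directly, can be sketched in Python as below. The sketch assumes integer items, the remainder method, and None for an empty slot; quadratic_probe_insert is a made-up name. Note that, unlike linear probing, this probe sequence is not guaranteed to reach every slot.

```python
def quadratic_probe_insert(slots, item):
    """Insert item using h'(x) = (h(x) + i*i) % tablesize for i = 0, 1, 2, ..."""
    size = len(slots)
    home = item % size  # h(x), the original hash value
    for i in range(size):
        pos = (home + i * i) % size
        if slots[pos] is None:
            slots[pos] = item
            return pos
    # The quadratic sequence may miss free slots, so this is not "table full".
    raise RuntimeError("no free slot found on the probe sequence")


slots = [None] * 11
for item in [54, 26, 93, 17, 77, 31, 44, 55, 20]:
    quadratic_probe_insert(slots, item)

print(slots)  # [77, 44, 20, 55, 26, 93, 17, None, None, 31, 54]
```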
Chaining
An alternative method for handling the collision problem is to allow
each slot to hold a reference to a collection (or chain) of items.
Chaining allows many items to exist at the same location in the hash table. When
collisions happen, the item is still placed in the proper slot of the hash
table. As more and more items hash to the same location, the difficulty
of searching for the item in the collection increases.

The table below shows items as they are added to a hash table that uses
chaining to resolve collisions, for the extended item set
54, 26, 93, 17, 77, 31, 44, 55, 20 with table size 11.

Slot  Chain
0     77, 44, 55
4     26
5     93
6     17
9     31, 20
10    54
(all other slots are empty)
When we want to search for an item, we use the hash function
to generate the slot where it should reside. Since each slot
holds a collection, we use a searching technique to decide
whether the item is present.
The advantage is that on the average there are likely to be
many fewer items in each slot, so the search is perhaps more
efficient.
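Chaining can be sketched in Python with a list of lists, one list per slot. This is a minimal illustration assuming integer items and the remainder method; a Python list stands in for the chain, and the names chain_insert and chain_search are made up for the example.

```python
def chain_insert(table, item):
    """Append the item to the chain stored in its hash slot."""
    table[item % len(table)].append(item)


def chain_search(table, item):
    """Hash to the slot, then search only that slot's chain."""
    return item in table[item % len(table)]


table = [[] for _ in range(11)]  # one empty chain per slot
for item in [54, 26, 93, 17, 77, 31, 44, 55, 20]:
    chain_insert(table, item)

print(table[0])                 # [77, 44, 55]
print(table[9])                 # [31, 20]
print(chain_search(table, 20))  # True
print(chain_search(table, 66))  # False
```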
Exercise
Q-11: In a hash table of size 13 which index positions would
the following two keys map to? 27, 130
String/Character indexing
To hash a character string, first change it into a number: find the
ASCII code of each character (A = 65, a = 97) and add the codes together,
then apply the remainder method as before.
E.g. Mia => 77+105+97 = 279, then 279 % 11 = 4
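The string case can be sketched in Python with the built-in ord function, which returns the character code (equal to the ASCII code for plain English letters). The name string_hash is illustrative only.

```python
def string_hash(text, table_size):
    """Add the character codes of the string, then apply the remainder method."""
    total = sum(ord(ch) for ch in text)
    return total % table_size


print(sum(ord(ch) for ch in "Mia"))  # 279
print(string_hash("Mia", 11))        # 4
```

Because the codes are simply added, strings made of the same letters in a different order (anagrams) get the same hash value and will collide.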
